<li>Peter noticed that there were still some old CRP names on CGSpace, because I hadn’t forced the Discovery index to be updated after I fixed the others last week</li>
<li><p>Elizabeth from CIAT emailed to ask if I could help her by adding ORCID identifiers to all of Joseph Tohme’s items</p></li>
<li><p>I used my <a href="https://gist.githubusercontent.com/alanorth/a49d85cd9c5dea89cddbe809813a7050/raw/f67b6e45a9a940732882ae4bb26897a9b245ef31/add-orcid-identifiers-csv.py">add-orcid-identifiers-csv.py</a> script:</p>
<li><p>I was prepared to skip some commits that I had cherry picked from the upstream <code>dspace-5_x</code> branch when we did the DSpace 5.5 upgrade (see notes on 2016-10-19 and 2017-12-17):</p>
<li><p>… but somehow git knew, and didn’t include them in my interactive rebase!</p></li>
<li><p>I need to send this branch to Atmire and also arrange payment (see <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=560">ticket #560</a> in their tracker)</p></li>
<li><p>Fix Sisay’s SSH access to the new DSpace Test server (linode19)</p></li>
<li><p>I ran all system updates on DSpace Test and rebooted it</p></li>
<li><p>Proof some records on DSpace Test for Udana from IWMI</p></li>
<li><p>He has done better with the small syntax and consistency issues but then there are larger concerns with not linking to DOIs, copying titles incorrectly, etc</p></li>
<li><p>In Tomcat 8.5 the <code>removeAbandoned</code> property has been split into two: <code>removeAbandonedOnBorrow</code> and <code>removeAbandonedOnMaintenance</code></p></li>
<li><p>I assume we want <code>removeAbandonedOnBorrow</code>, so I made the corresponding updates to the Tomcat 8 templates in Ansible</p></li>
<li><p>After reading more documentation I see that Tomcat 8.5’s default DBCP seems to now be Commons DBCP2 instead of Tomcat DBCP</p></li>
<li><p>It can be overridden in Tomcat’s <em>server.xml</em> by setting <code>factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"</code> in the <code>&lt;Resource&gt;</code> element</p></li>
<li><p>I think we should use this default, so we’ll need to remove some other settings that are specific to Tomcat’s DBCP like <code>jdbcInterceptors</code> and <code>abandonWhenPercentageFull</code></p></li>
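<li><p>To make the attribute changes concrete, here is a minimal Python sketch of the migration described above. It is not taken from the Ansible templates: the helper name <code>migrate_resource_attrs</code> and the sample attribute values are hypothetical, and it only handles the settings mentioned in these notes:</p>

```python
# Hypothetical helper: adapt <Resource> attributes that were written for
# Tomcat's own JDBC pool so they suit the Commons DBCP2 default in Tomcat 8.5.
# Only the settings mentioned in the notes are handled here.

TOMCAT_JDBC_FACTORY = "org.apache.tomcat.jdbc.pool.DataSourceFactory"

# Settings that are specific to Tomcat's own pool and must be removed.
TOMCAT_JDBC_ONLY = {"jdbcInterceptors", "abandonWhenPercentageFull"}


def migrate_resource_attrs(attrs):
    """Return a copy of attrs adjusted for the default pool in Tomcat 8.5."""
    migrated = {}
    for name, value in attrs.items():
        if name in TOMCAT_JDBC_ONLY:
            continue  # drop settings the default pool does not use
        if name == "factory" and value == TOMCAT_JDBC_FACTORY:
            continue  # drop the override so Tomcat falls back to its default
        if name == "removeAbandoned":
            # The property was split in two; this sketch assumes OnBorrow.
            name = "removeAbandonedOnBorrow"
        migrated[name] = value
    return migrated


# Illustrative values only, not the real CGSpace configuration.
old_attrs = {
    "factory": TOMCAT_JDBC_FACTORY,
    "removeAbandoned": "true",
    "jdbcInterceptors": "ConnectionState",
    "abandonWhenPercentageFull": "50",
    "testOnBorrow": "true",
}
new_attrs = migrate_resource_attrs(old_attrs)
print(new_attrs)
```

</li>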
<li><p>Merge the changes adding ORCID identifier to advanced search and Atmire Listings and Reports (<a href="https://github.com/ilri/DSpace/pull/371">#371</a>)</p></li>
<li><p>Fix one more issue of missing XMLUI strings (for CRP subject when clicking “view more” in the Discovery sidebar)</p></li>
<li><p>I told Udana to fix the citation and abstract of the one item, and to correct the <code>dc.language.iso</code> for the five Spanish items in his Book Chapters collection</p></li>
<li><p>Then we can import the records to CGSpace</p></li>
<li>I caught wind of an interesting XMLUI performance optimization coming in DSpace 6.3: <a href="https://jira.duraspace.org/browse/DS-3883">https://jira.duraspace.org/browse/DS-3883</a></li>
<li>I asked for it to be ported to DSpace 5.x</li>
<li><p>While testing an XMLUI patch for <a href="https://jira.duraspace.org/browse/DS-3883">DS-3883</a> I noticed that there is still some remaining Authority / Solr configuration left that we need to remove:</p>
<pre><code>2018-04-14 18:55:25,841 ERROR org.dspace.authority.AuthoritySolrServiceImpl @ Authority solr is not correctly configured, check "solr.authority.server" property in the dspace.cfg
</code></pre></li>
<li><p>I see the same error on DSpace Test so this is definitely a problem</p></li>
<li><p>After disabling the authority consumer I no longer see the error</p></li>
<li><p>I merged a pull request to the <code>5_x-prod</code> branch to clean that up (<a href="https://github.com/ilri/DSpace/pull/372">#372</a>)</p></li>
<li><p>File a ticket on DSpace’s Jira for the <code>target="_blank"</code> security and performance issue (<a href="https://jira.duraspace.org/browse/DS-3891">DS-3891</a>)</p></li>
<li><p>I re-deployed DSpace Test (linode19) and was surprised by how long it took the ant update to complete:</p>
<li>IWMI people are asking about building a search query that outputs RSS for their reports</li>
<li>They want the same results as this Discovery query: <a href="https://cgspace.cgiar.org/discover?filtertype_1=dateAccessioned&filter_relational_operator_1=contains&filter_1=2018&submit_apply_filter=&query=&scope=10568%2F16814&rpp=100&sort_by=dc.date.issued_dt&order=desc">https://cgspace.cgiar.org/discover?filtertype_1=dateAccessioned&filter_relational_operator_1=contains&filter_1=2018&submit_apply_filter=&query=&scope=10568%2F16814&rpp=100&sort_by=dc.date.issued_dt&order=desc</a></li>
<li>They will need to use OpenSearch, but I can’t remember all the parameters</li>
<li><p>They want items by issue date, so we need to use sort option 2</p></li>
<li><p>According to the DSpace Manual there are only the following parameters to OpenSearch: format, scope, rpp, start, and sort_by</p></li>
<li><p>The OpenSearch <code>query</code> parameter expects a Discovery search filter that is defined in <code>dspace/config/spring/api/discovery.xml</code></p></li>
<li><p>So for IWMI they should be able to use something like this: <a href="https://cgspace.cgiar.org/open-search/discover?query=dateIssued:2018&scope=10568/16814&sort_by=2&order=DESC&format=rss">https://cgspace.cgiar.org/open-search/discover?query=dateIssued:2018&scope=10568/16814&sort_by=2&order=DESC&format=rss</a></p></li>
<li><p>There are also <code>rpp</code> (results per page) and <code>start</code> parameters but in my testing now on DSpace 5.5 they behave very strangely</p></li>
<li><p>For example, set <code>rpp=1</code> and then check the results for <code>start</code> values of 0, 1, and 2 and they are all the same!</p></li>
<li><p>If I have time I will check if this behavior persists on DSpace 6.x on the official DSpace demo and file a bug</p></li>
<li><p>Also, the DSpace Manual as of 5.x has very poor documentation for OpenSearch</p></li>
<li><p>They don’t tell you to use Discovery search filters in the <code>query</code> (with format <code>query=dateIssued:2018</code>)</p></li>
<li><p>They don’t tell you that the sort options are actually defined in <code>dspace.cfg</code> (ie, you need to use <code>2</code> instead of <code>dc.date.issued_dt</code>)</p></li>
<li><p>They are missing the <code>order</code> parameter (ASC vs DESC)</p></li>
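<li><p>Since the parameters are so poorly documented, here is a small Python sketch that assembles an OpenSearch URL from the pieces described above. The helper name <code>build_opensearch_url</code> is hypothetical, and the meaning of sort option <code>2</code> is specific to our <code>dspace.cfg</code>, so treat this as an illustration rather than a general client:</p>

```python
from urllib.parse import urlencode


def build_opensearch_url(base, query, scope, sort_by=2, order="DESC",
                         fmt="rss", rpp=None, start=None):
    """Assemble a DSpace OpenSearch URL (hypothetical helper).

    query   -- a Discovery search filter, for example "dateIssued:2018"
    scope   -- handle of the community or collection to search in
    sort_by -- index of a sort option from dspace.cfg (2 is issue date on
               CGSpace; this is an assumption about the local configuration)
    """
    params = {
        "query": query,
        "scope": scope,
        "sort_by": sort_by,
        "order": order,
        "format": fmt,
    }
    # rpp and start are optional; only add them when explicitly requested.
    if rpp is not None:
        params["rpp"] = rpp
    if start is not None:
        params["start"] = start
    # Keep ":" and "/" unescaped so the URL stays readable.
    return base + "/open-search/discover?" + urlencode(params, safe=":/")


url = build_opensearch_url("https://cgspace.cgiar.org", "dateIssued:2018",
                           "10568/16814")
print(url)

# URLs for checking the odd paging behavior: one result per page, three offsets.
paging_urls = [
    build_opensearch_url("https://cgspace.cgiar.org", "dateIssued:2018",
                         "10568/16814", rpp=1, start=offset)
    for offset in (0, 1, 2)
]
for paging_url in paging_urls:
    print(paging_url)
```

<p>Fetching the three paging URLs and comparing the first entry of each feed would be one way to reproduce the <code>rpp</code> and <code>start</code> behavior when filing a bug.</p></li>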
<li><p>I notice that DSpace Test has crashed again, due to memory:</p>
<li><p>I will increase the JVM heap size from 5120M to 6144M, though we don’t have much room left to grow as DSpace Test (linode19) is using a smaller instance size than CGSpace</p></li>
<li><p>Gabriela from CIP asked if I could send her a list of all CIP authors so she can do some replacements on the name formats</p></li>
<li><p>I got a list of all the CIP collections manually and used the same query that I used in <a href="/cgspace-notes/2017-08">August, 2017</a>:</p>
<pre><code>dspace#= \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/89347', '10568/88229', '10568/53086', '10568/53085', '10568/69069', '10568/53087', '10568/53088', '10568/53089', '10568/53090', '10568/53091', '10568/53092', '10568/70150', '10568/53093', '10568/64874', '10568/53094'))) group by text_value order by count desc) to /tmp/cip-authors.csv with csv;
</code></pre></li>
<li><p>I also re-deployed CGSpace (linode18) to make the ORCID search, authority cleanup, and CCAFS project tag <code>PII-LAM_CSAGender</code> live</p></li>
<li><p>When re-deploying I also updated the GeoLite databases so I hope the country stats become more accurate…</p></li>
<li><p>After re-deployment I ran all system updates on the server and rebooted it</p></li>
<li><p>After the reboot I forced a reïndexing of the Discovery to populate the new ORCID index:</p>
<pre><code>org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-715] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:250; busy:18; idle:0; lastwait:5000].
</code></pre></li>
<li><p>I tried to restart Tomcat but <code>systemctl</code> hangs</p></li>
<li><p>I tried to reboot the server from the command line but after a few minutes it didn’t come back up</p></li>
<li><p>Looking at the Linode console I see that it is stuck trying to shut down</p></li>
<li><p>Even “Reboot” via Linode console doesn’t work!</p></li>
<li><p>After shutting it down a few times via the Linode console it finally rebooted</p></li>
<li><p>Everything is back but I have no idea what caused this—I suspect something with the hosting provider</p></li>
<li><p>Also super weird, the last entry in the DSpace log file is from <code>2018-04-20 16:35:09</code>, and then immediately it goes to <code>2018-04-20 19:15:04</code> (three hours later!):</p>
<pre><code>2018-04-20 16:35:09,144 ERROR org.dspace.app.util.AbstractDSpaceWebapp @ Failed to record shutdown in Webapp table.
org.apache.tomcat.jdbc.pool.PoolExhaustedException: [localhost-startStop-2] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:250; busy:18; idle
</code></pre></li>
<li>I tested my Ansible playbooks with a clean and updated installation of Ubuntu 18.04 and fixed some issues that I hadn’t run into a few weeks ago</li>
<li>There seems to be a new issue with Java dependencies, though</li>
<li>The <code>default-jre</code> package is going to be Java 10 on Ubuntu 18.04, but I want to use <code>openjdk-8-jre-headless</code> (well, the JDK actually, but it uses this JRE)</li>
<li>Tomcat and Ant are fine with Java 8, but the <code>maven</code> package wants to pull in Java 10 for some reason</li>
<li>Looking closer, I see that <code>maven</code> depends on <code>java7-runtime-headless</code>, which is indeed provided by <code>openjdk-8-jre-headless</code></li>
<li>So it must be one of Maven’s dependencies…</li>
<li>I will watch it for a few days because it could be an issue that will be resolved before Ubuntu 18.04’s release</li>
<li>Otherwise I will post a bug to the ubuntu-release mailing list</li>
<li>It looks like the only way to fix this is to install <code>openjdk-8-jdk-headless</code> beforehand in a separate transaction (so it pulls in the JRE), or to manually install <code>openjdk-8-jre-headless</code> in the same apt transaction as <code>maven</code></li>
<li>Also, I started porting PostgreSQL 9.6 into the Ansible infrastructure scripts</li>
<li>I believe this should be a drop-in replacement, though I will definitely test it more locally as well as on DSpace Test once we move to DSpace 5.8 and Ubuntu 18.04 in the coming months</li>
<li>Still testing the <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure playbooks</a> for Ubuntu 18.04, Tomcat 8.5, and PostgreSQL 9.6</li>
<li>One other new thing I notice is that PostgreSQL 9.6 no longer uses <code>createuser</code> and <code>nocreateuser</code>, as those have actually meant <code>superuser</code> and <code>nosuperuser</code> and have been deprecated for <em>ten years</em></li>
<li><p>So when I’m importing a CGSpace database dump I need to amend my notes to give the user the <code>superuser</code> permission, rather than <code>createuser</code>:</p>
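<li><p>As a rough aid for updating old notes, here is a Python sketch that rewrites the legacy role options in SQL statements. The function name <code>modernize_role_options</code> and the <code>dspacetest</code> role in the example are hypothetical, and it is only meant for SQL text, not for invocations of the <code>createuser</code> shell command:</p>

```python
import re

# Legacy role options and their modern equivalents, as described in the notes.
LEGACY_ROLE_OPTIONS = {
    "createuser": "superuser",
    "nocreateuser": "nosuperuser",
}

# Match either option as a whole word, in any letter case.
_LEGACY_PATTERN = re.compile(r"\b(nocreateuser|createuser)\b", re.IGNORECASE)


def modernize_role_options(sql):
    """Replace deprecated role options in a SQL statement (hypothetical helper).

    Only intended for SQL such as ALTER USER / CREATE ROLE statements; it
    would also rewrite the name of the createuser shell command, so do not
    feed it shell command lines.
    """
    def swap(match):
        found = match.group(1)
        replacement = LEGACY_ROLE_OPTIONS[found.lower()]
        # Preserve an all-uppercase keyword style when the input uses it.
        return replacement.upper() if found.isupper() else replacement

    return _LEGACY_PATTERN.sub(swap, sql)


# The role name dspacetest is only an example.
print(modernize_role_options("alter user dspacetest createuser;"))
print(modernize_role_options("ALTER USER dspacetest NOCREATEUSER;"))
```

</li>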
<li>DSpace Test crashed again, looks like memory issues again</li>
<li>JVM heap size was last increased to 6144m but the system only has 8GB total so there’s not much we can do here other than get a bigger Linode instance or remove the massive Solr Statistics data</li>
<li>I will email the CGSpace team to ask them whether or not we want to commit to having a public test server that accurately mirrors CGSpace (ie, to upgrade to the next largest Linode)</li>