<li>Peter noticed that there were still some old CRP names on CGSpace, because I hadn’t forced the Discovery index to be updated after I fixed the others last week</li>
<li>For completeness I re-ran the CRP corrections on CGSpace:</li>
<li>Elizabeth from CIAT emailed to ask if I could help her by adding ORCID identifiers to all of Joseph Tohme’s items</li>
<li>I used my <a href="https://gist.githubusercontent.com/alanorth/a49d85cd9c5dea89cddbe809813a7050/raw/f67b6e45a9a940732882ae4bb26897a9b245ef31/add-orcid-identifiers-csv.py">add-orcid-identifiers-csv.py</a> script:</li>
<li>I was prepared to skip some commits that I had cherry-picked from the upstream <code>dspace-5_x</code> branch when we did the DSpace 5.5 upgrade (see notes on 2016-10-19 and 2017-12-17):
<ul>
<li>[DS-3246] Improve cleanup in recyclable components (upstream commit on dspace-5_x: 9f0f5940e7921765c6a22e85337331656b18a403)</li>
<li>[DS-3250] applying patch provided by Atmire (upstream commit on dspace-5_x: c6fda557f731dbc200d7d58b8b61563f86fe6d06)</li>
<li>bump up to latest minor pdfbox version (upstream commit on dspace-5_x: b5330b78153b2052ed3dc2fd65917ccdbfcc0439)</li>
<li>DS-3583 Usage of correct Collection Array (#1731) (upstream commit on dspace-5_x: c8f62e6f496fa86846bfa6bcf2d16811087d9761)</li>
</ul></li>
<li>… but somehow git knew, and didn’t include them in my interactive rebase!</li>
<li>I need to send this branch to Atmire and also arrange payment (see <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=560">ticket #560</a> in their tracker)</li>
<li>I ran all system updates on DSpace Test and rebooted it</li>
<li>Proof some records on DSpace Test for Udana from IWMI</li>
<li>He has improved on the small syntax and consistency issues, but there are still larger concerns such as not linking to DOIs and copying titles incorrectly</li>
</ul>
<h2 id="2018-04-10">2018-04-10</h2>
<ul>
<li>I got a notice that CGSpace CPU usage was very high this morning</li>
<li>Looking at the nginx logs, here are the top users today so far:</li>
<li>In Tomcat 8.5 the <code>removeAbandoned</code> property has been split into two: <code>removeAbandonedOnBorrow</code> and <code>removeAbandonedOnMaintenance</code></li>
<li>I assume we want <code>removeAbandonedOnBorrow</code> and make updates to the Tomcat 8 templates in Ansible</li>
<li>After reading more documentation I see that Tomcat 8.5’s default DBCP seems to now be Commons DBCP2 instead of Tomcat DBCP</li>
<li>It can be overridden in Tomcat’s <em>server.xml</em> by setting <code>factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"</code> in the <code>&lt;Resource&gt;</code></li>
<li>I think we should use this default, so we’ll need to remove some other settings that are specific to Tomcat’s DBCP like <code>jdbcInterceptors</code> and <code>abandonWhenPercentageFull</code></li>
<li>Merge the changes adding ORCID identifier to advanced search and Atmire Listings and Reports (<a href="https://github.com/ilri/DSpace/pull/371">#371</a>)</li>
<li>Fix one more issue of missing XMLUI strings (for CRP subject when clicking “view more” in the Discovery sidebar)</li>
<li>I told Udana to fix the citation and abstract of the one item, and to correct the <code>dc.language.iso</code> for the five Spanish items in his Book Chapters collection</li>
<li>Then we can import the records to CGSpace</li>
</ul>
<h2 id="2018-04-11">2018-04-11</h2>
<ul>
<li>DSpace Test (linode19) crashed again some time since yesterday:</li>
<li>I ran all system updates and rebooted the server</li>
</ul>
<h2 id="2018-04-12">2018-04-12</h2>
<ul>
<li>I caught wind of an interesting XMLUI performance optimization coming in DSpace 6.3: <a href="https://jira.duraspace.org/browse/DS-3883">https://jira.duraspace.org/browse/DS-3883</a></li>
<li>I asked for it to be ported to DSpace 5.x</li>
<li>While testing an XMLUI patch for <a href="https://jira.duraspace.org/browse/DS-3883">DS-3883</a> I noticed that there is still some remaining Authority / Solr configuration left that we need to remove:</li>
</ul>
<pre><code>2018-04-14 18:55:25,841 ERROR org.dspace.authority.AuthoritySolrServiceImpl @ Authority solr is not correctly configured, check "solr.authority.server" property in the dspace.cfg
java.lang.NullPointerException
</code></pre>
<ul>
<li>I assume we need to remove <code>authority</code> from the consumers in <code>dspace/config/dspace.cfg</code>:</li>
<li>I see the same error on DSpace Test so this is definitely a problem</li>
<li>After disabling the authority consumer I no longer see the error</li>
<li>I merged a pull request to the <code>5_x-prod</code> branch to clean that up (<a href="https://github.com/ilri/DSpace/pull/372">#372</a>)</li>
<li>File a ticket on DSpace’s Jira for the <code>target="_blank"</code> security and performance issue (<a href="https://jira.duraspace.org/browse/DS-3891">DS-3891</a>)</li>
<li>IWMI people are asking about building a search query that outputs RSS for their reports</li>
<li>They want the same results as this Discovery query: <a href="https://cgspace.cgiar.org/discover?filtertype_1=dateAccessioned&filter_relational_operator_1=contains&filter_1=2018&submit_apply_filter=&query=&scope=10568%2F16814&rpp=100&sort_by=dc.date.issued_dt&order=desc">https://cgspace.cgiar.org/discover?filtertype_1=dateAccessioned&filter_relational_operator_1=contains&filter_1=2018&submit_apply_filter=&query=&scope=10568%2F16814&rpp=100&sort_by=dc.date.issued_dt&order=desc</a></li>
<li>They will need to use OpenSearch, but I can’t remember all the parameters</li>
<li>Apparently search sort options for OpenSearch are in <code>dspace.cfg</code>:</li>
<li>They want items by issue date, so we need to use sort option 2</li>
<li>According to the DSpace Manual there are only the following parameters to OpenSearch: format, scope, rpp, start, and sort_by</li>
<li>The OpenSearch <code>query</code> parameter expects a Discovery search filter that is defined in <code>dspace/config/spring/api/discovery.xml</code></li>
<li>So for IWMI they should be able to use something like this: <a href="https://cgspace.cgiar.org/open-search/discover?query=dateIssued:2018&scope=10568/16814&sort_by=2&order=DESC&format=rss">https://cgspace.cgiar.org/open-search/discover?query=dateIssued:2018&scope=10568/16814&sort_by=2&order=DESC&format=rss</a></li>
<li>There are also <code>rpp</code> (results per page) and <code>start</code> parameters but in my testing now on DSpace 5.5 they behave very strangely</li>
<li>For example, set <code>rpp=1</code> and then check the results for <code>start</code> values of 0, 1, and 2 and they are all the same!</li>
<li>If I have time I will check if this behavior persists on DSpace 6.x on the official DSpace demo and file a bug</li>
<li>Also, the DSpace Manual as of 5.x has very poor documentation for OpenSearch</li>
<li>They don’t tell you to use Discovery search filters in the <code>query</code> (with format <code>query=dateIssued:2018</code>)</li>
<li>They don’t tell you that the sort options are actually defined in <code>dspace.cfg</code> (ie, you need to use <code>2</code> instead of <code>dc.date.issued_dt</code>)</li>
<li>They are missing the <code>order</code> parameter (ASC vs DESC)</li>
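<li>A minimal sketch of how such an OpenSearch URL could be assembled, assuming Python's standard <code>urllib.parse</code>; the helper name <code>build_opensearch_url</code> is hypothetical (it is not part of DSpace), and it only uses the parameters that worked in my testing above:

```python
from urllib.parse import urlencode

# Endpoint taken from the IWMI example URL above
OPENSEARCH_BASE = "https://cgspace.cgiar.org/open-search/discover"


def build_opensearch_url(query, scope, sort_by=2, order="DESC", fmt="rss"):
    """Build an OpenSearch URL (hypothetical helper, not a DSpace API).

    query   -- a Discovery search filter, for example "dateIssued:2018"
    scope   -- handle of the community or collection to search in
    sort_by -- number of a sort option from dspace.cfg (2 was issue date here)
    order   -- "ASC" or "DESC"
    fmt     -- output format, for example "rss"
    """
    params = [
        ("query", query),
        ("scope", scope),
        ("sort_by", sort_by),
        ("order", order),
        ("format", fmt),
    ]
    # safe=":/" leaves the filter separator and the handle slash unescaped
    return OPENSEARCH_BASE + "?" + urlencode(params, safe=":/")


print(build_opensearch_url("dateIssued:2018", "10568/16814"))
```

The <code>rpp</code> and <code>start</code> parameters are left out on purpose because of the strange paging behavior noted above.</li>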
<li>I notice that DSpace Test has crashed again, due to memory:</li>
<li>I will increase the JVM heap size from 5120M to 6144M, though we don’t have much room left to grow as DSpace Test (linode19) is using a smaller instance size than CGSpace</li>
<pre><code>dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/89347', '10568/88229', '10568/53086', '10568/53085', '10568/69069', '10568/53087', '10568/53088', '10568/53089', '10568/53090', '10568/53091', '10568/53092', '10568/70150', '10568/53093', '10568/64874', '10568/53094'))) group by text_value order by count desc) to /tmp/cip-authors.csv with csv;