<meta property="og:title" content="March, 2019" />
<meta property="og:description" content="2019-03-01
I checked IITA’s 259 Feb 14 records from last month for duplicates using Atmire’s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good
I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc…
Looking at the other half of Udana’s WLE records from 2018-11
I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)
I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items
<meta name="twitter:title" content="March, 2019"/>
<meta name="twitter:description" content="2019-03-01
I checked IITA’s 259 Feb 14 records from last month for duplicates using Atmire’s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good
I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc…
Looking at the other half of Udana’s WLE records from 2018-11
I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)
I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items
I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs
"/>
<meta name="generator" content="Hugo 0.63.1" />
<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.23e2c3298bcc8c1136c19aba330c211ec94c36f7c4454ea15cf4d3548370042a.css" rel="stylesheet" integrity="sha256-I+LDKYvMjBE2wZq6MwwhHslMNvfERU6hXPTTVINwBCo=" crossorigin="anonymous">
<!-- RSS 2.0 feed -->
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-03/">March, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-03-01T12:16:30+01:00">Fri Mar 01, 2019</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-03-01">2019-03-01</h2>
<ul>
<li>I checked IITA’s 259 Feb 14 records from last month for duplicates using Atmire’s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good</li>
<li>I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc…</li>
<li>Looking at the other half of Udana’s WLE records from 2018-11
<ul>
<li>I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)</li>
<li>I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items</li>
</ul>
<h2 id="2019-03-03">2019-03-03</h2>
<ul>
<li>Trying to finally upload IITA’s 259 Feb 14 items to CGSpace so I exported them from DSpace Test:</li>
</ul>
<pre><code>$ mkdir 2019-03-03-IITA-Feb14
$ dspace export -i 10568/108684 -t COLLECTION -m -n 0 -d 2019-03-03-IITA-Feb14
</code></pre><ul>
<li>As I was inspecting the archive I noticed that there were some problems with the bitstreams:
<ul>
<li>First, Sisay didn’t include the bitstream descriptions</li>
<li>Second, only five items had bitstreams and I remember in the discussion with IITA that there should have been nine!</li>
<li>I had to refer to the original CSV from January to find the file names, then download and add them to the export contents manually!</li>
</ul>
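</li>
<li>A rough sketch of what that manual fix looked like for one item directory (the item directory, file name, and URL here are hypothetical, and the <code>contents</code> line assumes the Simple Archive Format syntax for bundle and description):</li>
</ul>
<pre><code>$ cd 2019-03-03-IITA-Feb14/ITEM_0005
# the real file names came from the January CSV; the description is optional
$ wget https://example.org/some-report.pdf
$ echo -e "some-report.pdf\tbundle:ORIGINAL\tdescription:Report" >> contents
</code></pre><ul>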
</ul>
<pre><code>$ dspace import -a -c 10568/99832 -e aorth@stfu.com -m 2019-03-03-IITA-Feb14.map -s /tmp/2019-03-03-IITA-Feb14
</code></pre><ul>
<li>DSpace’s export function doesn’t include the collections for some reason, so you need to import them somewhere first, then export the collection metadata and re-map the items to proper owning collections based on their types using OpenRefine or something</li>
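<li>A sketch of how that could look with the batch metadata editing tools (the CSV names are mine, and this assumes that editing the <code>collection</code> column and re-importing is what updates the owning collections):</li>
</ul>
<pre><code>$ dspace metadata-export -i 10568/99832 -f /tmp/2019-03-03-IITA-Feb14.csv
# edit the collection column in OpenRefine based on each item's type, then:
$ dspace metadata-import -f /tmp/2019-03-03-IITA-Feb14-remapped.csv -e aorth@stfu.com
</code></pre><ul>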
<li>After re-importing to CGSpace to apply the mappings, I deleted the collection on DSpace Test and ran the <code>dspace cleanup</code> script</li>
<li>Merge the IITA research theme changes from last month to the <code>5_x-prod</code> branch (<a href="https://github.com/ilri/DSpace/pull/413">#413</a>)
<ul>
<li>I will deploy to CGSpace soon and then think about how to batch tag all IITA’s existing items with this metadata</li>
</ul>
</li>
<li>Deploy Tomcat 7.0.93 on CGSpace (linode18) after having tested it on DSpace Test (linode19) for a week</li>
</ul>
<h2 id="2019-03-06">2019-03-06</h2>
<ul>
<li>Abenet was having problems with a CIP user account, I think that the user could not register</li>
<li>I suspect it’s related to the email issue that ICT hasn’t responded about since last week</li>
<li>As I thought, I still cannot send emails from CGSpace:</li>
</ul>
<pre><code>$ dspace test-email
</code></pre>
</ul>
<h2 id="2019-03-08">2019-03-08</h2>
<ul>
<li>There’s an issue with CGSpace right now where all items are giving a blank page in the XMLUI
<ul>
<li><del>Interestingly, if I check an item in the REST API it is also mostly blank: only the title and the ID!</del> On second thought I realize I probably was just seeing the default view without any “expands”</li>
<li>I don’t see anything unusual in the Tomcat logs, though there are thousands of those <code>solr_update_time_stamp</code> errors:</li>
</ul>
</li>
</ul>
<pre><code># journalctl -u tomcat7 | grep -c 'Multiple update components target the same field:solr_update_time_stamp'
1076
</code></pre><ul>
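<li>Going back to the REST API check above, the default item response really does only include a handful of fields, so something like this (with a hypothetical item ID) would show the difference once “expands” are requested:</li>
</ul>
<pre><code>$ curl -s -H "Accept: application/json" 'https://cgspace.cgiar.org/rest/items/12345' | python -m json.tool
$ curl -s -H "Accept: application/json" 'https://cgspace.cgiar.org/rest/items/12345?expand=metadata,bitstreams' | python -m json.tool
</code></pre><ul>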
<li>I restarted Tomcat and it’s OK now…</li>
<li>Skype meeting with Peter and Abenet and Sisay
<ul>
<li>We want to try to crowd source the correction of invalid AGROVOC terms starting with the ~313 invalid ones from our top 1500</li>
</ul>
<h2 id="2019-03-10">2019-03-10</h2>
<ul>
<li>Working on tagging IITA’s items with their new research theme (<code>cg.identifier.iitatheme</code>) based on their existing IITA subjects (see <a href="/cgspace-notes/2019-02/">notes from 2019-02</a>)</li>
<li>I exported the entire IITA community from CGSpace and then used <code>csvcut</code> to extract only the needed fields:</li>
</ul>
<pre><code>$ csvcut -c 'id,cg.subject.iita,cg.subject.iita[],cg.subject.iita[en],cg.subject.iita[en_US]' ~/Downloads/10568-68616.csv > /tmp/iita.csv
</code></pre>
</ul>
<pre><code>if(isBlank(value), 'PLANT PRODUCTION & HEALTH', value + '||PLANT PRODUCTION & HEALTH')
</code></pre><ul>
<li>Then it’s more annoying because there are four IITA subject columns…</li>
<li>In total this would add research themes to 1,755 items</li>
<li>I want to double check one last time with Bosede that they would like to do this, because I also see that this will tag a few hundred items from the 1970s and 1980s</li>
</ul>
</ul>
<h2 id="2019-03-12">2019-03-12</h2>
<ul>
<li>I imported the changes to 256 of IITA’s records on CGSpace</li>
</ul>
<h2 id="2019-03-14">2019-03-14</h2>
<ul>
<pre><code>done
</code></pre><ul>
<li>Then I couldn’t figure out a clever way to join all the CSVs, so I just grepped them to find the IDs with dates from 2018 and 2019 and there are apparently only three:</li>
</ul>
<pre><code>$ grep -oE '201[89]' /tmp/*.csv | sort -u
/tmp/94834.csv:2018
/tmp/95615.csv:2018
/tmp/96747.csv:2018
</code></pre><ul>
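<li>A csvkit alternative I could have used instead of grepping each file separately (assuming <code>csvstack</code>’s <code>--filenames</code> flag, which prefixes every row with its source file):</li>
</ul>
<pre><code>$ csvstack --filenames /tmp/*.csv > /tmp/2019-03-14-combined.csv
# rows containing 2018 or 2019, reduced to the file they came from
$ grep -E '201[89]' /tmp/2019-03-14-combined.csv | cut -d, -f1 | sort -u
</code></pre><ul>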
<li>And looking at those items more closely, only one of them has an <em>issue date</em> after 2018-04, so I will only update that one (as the country’s name only changed in 2018-04)</li>
<li>Run all system updates and reboot linode20</li>
<li>Follow up with Felix from Earlham to see if he’s done testing DSpace Test with COPO so I can re-sync the server from CGSpace</li>
</ul>
<h2 id="2019-03-15">2019-03-15</h2>
<ul>
<li>CGSpace (linode18) has the blank page error again</li>
<li>I’m not sure if it’s related, but I see the following error in DSpace’s log:</li>
</ul>
<pre><code>2019-03-15 14:09:32,685 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL QueryTable Error -
java.sql.SQLException: Connection org.postgresql.jdbc.PgConnection@55ba10b5 is closed.
10 dspaceCli
15 dspaceWeb
</code></pre><ul>
<li>I didn’t see anything interesting in the PostgreSQL logs, though this stack trace from the Tomcat logs (in the systemd journal) from earlier today <em>might</em> be related?</li>
</ul>
<pre><code>SEVERE: Servlet.service() for servlet [spring] in context with path [] threw exception [org.springframework.web.util.NestedServletException: Request processing failed; nested exception is java.util.EmptyStackException] with root cause
java.util.EmptyStackException
<li>Last week Felix from Earlham said that they finished testing on DSpace Test (linode19) so I made backups of some things there and re-deployed the system on Ubuntu 18.04
<ul>
<li>During re-deployment I hit a few issues with the <a href="https://github.com/ilri/rmg-ansible-public">Ansible playbooks</a> and made some minor improvements</li>
<li>There seems to be an <a href="https://bugs.launchpad.net/ubuntu/+source/nodejs/+bug/1794589">issue with nodejs’s dependencies now</a>, which causes npm to get uninstalled when installing the certbot dependencies (due to a conflict in libssl dependencies)</li>
<li>I re-worked the playbooks to use Node.js from the upstream official repository for now</li>
</ul>
</li>
<ul>
<li>After restarting Tomcat, Solr was giving the “Error opening new searcher” error for all cores</li>
<li>I stopped Tomcat, added <code>ulimit -v unlimited</code> to the <code>catalina.sh</code> script and deleted all old locks in the DSpace <code>solr</code> directory and then DSpace started up normally</li>
<li>I’m still not exactly sure why I see this error and if the <code>ulimit</code> trick actually helps, as the <code>tomcat7.service</code> has <code>LimitAS=infinity</code> anyways (and from checking the PID’s limits file in <code>/proc</code> it seems to be applied)</li>
<li>Then I noticed that the item displays were blank… so I checked the database info and saw there were some unfinished migrations</li>
<li>I’m not entirely sure if it’s related, but I tried to delete the old migrations and then force running the ignored ones like when we upgraded to <a href="/cgspace-notes/2018-06/">DSpace 5.8 in 2018-06</a> and then after restarting Tomcat I could see the item displays again</li>
</ul>
</li>
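<li>For the record, the lock cleanup was roughly this (the Solr home path is the one that shows up in the lock error below, and Tomcat should be stopped first so no live lock gets removed):</li>
</ul>
<pre><code># systemctl stop tomcat7
# find /home/cgspace.cgiar.org/solr -name write.lock -delete
# systemctl start tomcat7
</code></pre><ul>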
<li>I copied the 2019 Solr statistics core from CGSpace to DSpace Test and it works (and is only 5.5GB currently), so now we have some useful stats on DSpace Test for the CUA module and the dspace-statistics-api</li>
<li>I ran DSpace’s cleanup task on CGSpace (linode18) and there were errors:</li>
</ul>
<pre><code>$ dspace cleanup -v
Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
72 dspace.log.2019-03-17
8 dspace.log.2019-03-18
</code></pre><ul>
<li>It seems to be something with grep doing binary matching on some log files for some reason, so I guess I need to always use <code>-I</code> to say binary files don’t match</li>
<li>Anyways, the full error in DSpace’s log is:</li>
</ul>
<pre><code>2019-03-18 12:26:23,331 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL QueryTable Error -
java.sql.SQLException: Connection org.postgresql.jdbc.PgConnection@75eaa668 is closed.
</code></pre>
<pre><code>2019-01-13 06:25:13.062 CET [9157] postgres@template1 ERROR: column "waiting" does not exist at character 217
</code></pre><ul>
<li>This is unrelated and apparently due to <a href="https://github.com/munin-monitoring/munin/issues/746">Munin checking a column that was changed in PostgreSQL 9.6</a></li>
<li>I suspect that this issue with the blank pages might not be PostgreSQL after all, perhaps it’s a Cocoon thing?</li>
<li>Looking in the cocoon logs I see a large number of warnings about “Can not load requested doc” around 11AM and 12PM:</li>
</ul>
<pre><code>$ grep 'Can not load requested doc' cocoon.log.2019-03-18 | grep -oE '2019-03-18 [0-9]{2}:' | sort | uniq -c
717 2019-03-08 11:
59 2019-03-08 12:
</code></pre><ul>
<li>I’m not sure if it’s cocoon or that’s just a symptom of something else</li>
</ul>
<h2 id="2019-03-19">2019-03-19</h2>
<ul>
<pre><code>(1 row)
</code></pre><ul>
<li>Perhaps my <code>agrovoc-lookup.py</code> script could notify if it finds these because they potentially give false negatives</li>
<li>CGSpace (linode18) is having problems with Solr again, I’m seeing “Error opening new searcher” in the Solr logs and there are no stats for previous years</li>
<li>Apparently the Solr statistics shards didn’t load properly when we restarted Tomcat <em>yesterday</em>:</li>
</ul>
<pre><code>2019-03-18 12:32:39,799 ERROR org.apache.solr.core.CoreContainer @ Error creating core [statistics-2018]: Error opening new searcher
...
... 31 more
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/home/cgspace.cgiar.org/solr/statistics-2018/data/index/write.lock
</code></pre><ul>
<li>For reference, I don’t see the <code>ulimit -v unlimited</code> in the <code>catalina.sh</code> script, though the <code>tomcat7</code> systemd service has <code>LimitAS=infinity</code></li>
<li>The limits of the current Tomcat java process are:</li>
</ul>
<pre><code># cat /proc/27182/limits
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
</code></pre><ul>
<li>I will try to add <code>ulimit -v unlimited</code> to the Catalina startup script and check the output of the limits to see if it’s different in practice, as some wisdom on Stack Overflow says this solves the Solr core issues and I’ve superstitiously tried it various times in the past
<ul>
<li>The result is the same before and after, so <em>adding the ulimit directly is unnecessary</em> (whether or not unlimited address space is useful is another question)</li>
</ul>
</li>
</ul>
<pre><code># systemctl start tomcat7
</code></pre><ul>
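<li>One way to double check whether the running JVM actually has the new limit (assuming a single Tomcat java process) is to read its limits from <code>/proc</code> directly:</li>
</ul>
<pre><code># grep 'Max address space' /proc/$(pgrep -f org.apache.catalina.startup.Bootstrap)/limits
</code></pre><ul>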
<li>After restarting I confirmed that all Solr statistics cores were loaded successfully…</li>
<li>Another avenue might be to look at point releases in Solr 4.10.x, as we’re running 4.10.2 and they released 4.10.3 and 4.10.4 back in 2014 or 2015
<ul>
<li>I see several issues regarding locks and IndexWriter that were fixed in Solr and Lucene 4.10.3 and 4.10.4…</li>
</ul>
</ul>
<h2 id="2019-03-21">2019-03-21</h2>
<ul>
<li>It’s been two days since we had the blank page issue on CGSpace, and looking in the Cocoon logs I see very low numbers of the errors that we were seeing the last time the issue occurred:</li>
</ul>
<pre><code>$ grep 'Can not load requested doc' cocoon.log.2019-03-20 | grep -oE '2019-03-20 [0-9]{2}:' | sort | uniq -c
3 2019-03-20 00:
440 2019-03-23 08:
260 2019-03-23 09:
</code></pre><ul>
<li>I was curious to see if clearing the Cocoon cache in the XMLUI control panel would fix it, but it didn’t</li>
<li>Trying to drill down more, I see that the bulk of the errors started around 21:20:</li>
</ul>
<pre><code>$ grep 'Can not load requested doc' cocoon.log.2019-03-22 | grep -oE '2019-03-22 21:[0-9]' | sort | uniq -c
</code></pre>
<p>I restarted Tomcat and now the item displays are working again for now</p>
</li>
<li>
<p>I am wondering if this is an issue with removing abandoned connections in Tomcat’s JDBC pooling?</p>
<ul>
<li>It’s hard to tell because we have <code>logAbandoned</code> enabled, but I don’t see anything in the <code>tomcat7</code> service logs in the systemd journal</li>
</ul>
</li>
<li>
<p>I sent another mail to the dspace-tech mailing list with my observations</p>
</li>
<li>
<p>I spent some time trying to test and debug the Tomcat connection pool’s settings, but for some reason our logs are either messed up or no connections are actually getting abandoned</p>
</li>
<li>
<p>I compiled this <a href="https://github.com/gnosly/TomcatJdbcConnectionTest">TomcatJdbcConnectionTest</a> and created a bunch of database connections and waited a few minutes but they never got abandoned until I created over <code>maxActive</code> (75), after which almost all were purged at once</p>
<pre><code>$ jconsole -J-DsocksProxyHost=localhost -J-DsocksProxyPort=3000 service:jmx:rmi:///jndi/rmi://localhost:5400/jmxrmi -J-DsocksNonProxyHosts=
</code></pre><ul>
<li>I need to remember to check the active connections next time we have issues with blank item pages on CGSpace</li>
<li>In other news, I’ve been running G1GC on DSpace Test (linode19) since 2018-11-08 without realizing it, which is probably a good thing</li>
<li>I deployed the latest <code>5_x-prod</code> branch on CGSpace (linode18) and added more validation to the JDBC pool in our Tomcat config
<ul>
<li>This includes the new <code>testWhileIdle</code> and <code>testOnConnect</code> pool settings as well as the two new JDBC interceptors: <code>StatementFinalizer</code> and <code>ConnectionState</code> that should hopefully make sure our connections in the pool are valid</li>
</li>
<li>I spent one hour looking at the invalid AGROVOC terms from last week
<ul>
<li>It doesn’t seem like any of the editors did any work on this so I did most of them</li>
</ul>
</li>
</ul>
<li>Looking at the DBCP status on CGSpace via jconsole and everything looks good, though I wonder why <code>timeBetweenEvictionRunsMillis</code> is -1, because the <a href="https://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html">Tomcat 7.0 JDBC docs</a> say the default is 5000…
<ul>
<li>Could be an error in the docs, as I see the <a href="https://commons.apache.org/proper/commons-dbcp/configuration.html">Apache Commons DBCP</a> has -1 as the default</li>
<li>Maybe I need to re-evaluate the “defaults” of Tomcat 7’s DBCP and set them explicitly in our config</li>
<li>From Tomcat 8 they seem to default to Apache Commons’ DBCP 2.x</li>
</ul>
</li>
<li>Also, CGSpace doesn’t have many Cocoon errors yet this morning:</li>
</ul>
<pre><code>$ grep 'Can not load requested doc' cocoon.log.2019-03-25 | grep -oE '2019-03-25 [0-9]{2}:' | sort | uniq -c
4 2019-03-25 00:
1 2019-03-25 01:
</code></pre><ul>
<li>Holy shit I just realized we’ve been using the wrong DBCP pool in Tomcat
<ul>
<li>By default you get the Commons DBCP one unless you specify factory <code>org.apache.tomcat.jdbc.pool.DataSourceFactory</code></li>
<li>Now I see all my interceptor settings etc in jconsole, where I didn’t see them before (also a new <code>tomcat.jdbc</code> mbean)!</li>
<li>No wonder our settings didn’t quite match the ones in the <a href="https://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html">Tomcat DBCP Pool docs</a></li>
</ul>
</li>
<li>Uptime Robot reported that CGSpace went down and I see the load is very high</li>
</ul>
<pre><code>1222 35.174.184.209
1720 2a01:4f8:13b:1296::2
</code></pre><ul>
<li>The IPs look pretty normal except we’ve never seen <code>93.179.69.74</code> before, and it uses the following user agent:</li>
</ul>
<pre><code>Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.20 Safari/535.1
</code></pre><ul>
<pre><code>$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=93.179.69.74' dspace.log.2019-03-25 | sort | uniq | wc -l
1
</code></pre><ul>
<li>That’s weird because the total number of sessions today seems low compared to recent days:</li>
</ul>
<pre><code>$ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-03-25 | sort -u | wc -l
5657
</code></pre><ul>
<li>I restarted Tomcat and deployed the new Tomcat JDBC settings on CGSpace since I had to restart the server anyways
<ul>
<li>I need to watch this carefully though because I’ve read some places that Tomcat’s DBCP doesn’t track statements and might create memory leaks if an application doesn’t close statements before a connection gets returned back to the pool</li>
</ul>
</li>
<li>According to Uptime Robot, the server was up and down a few more times over the next hour, so I restarted Tomcat again</li>
<li><code>216.244.66.198</code> is DotBot</li>
<li><code>93.179.69.74</code> is some IP in Ukraine, which I will add to the list of bot IPs in nginx</li>
<li>I can only hope that this helps the load go down because all this traffic is disrupting the service for normal users and well-behaved bots (and interrupting my dinner and breakfast)</li>
<li>Looking at the database usage I’m wondering why there are so many connections from the DSpace CLI:</li>
</ul>
<pre><code>$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
5 dspaceApi
10 dspaceCli
13 dspaceWeb
</code></pre><ul>
<li>Looking closer I see they are all idle… so at least I know the load isn’t coming from some background nightly task or something</li>
<li>Make a minor edit to my <code>agrovoc-lookup.py</code> script to match subject terms with parentheses like <code>COCOA (PLANT)</code></li>
<li>Test 89 corrections and 79 deletions for AGROVOC subject terms from the ones I cleaned up in the last week</li>
</ul>
<pre><code>
$ ./delete-metadata-values.py -i /tmp/2019-03-26-AGROVOC-79-deletions.csv -db dspace -u dspace -p 'fuuu' -m 57 -f dc.subject -d -n
</code></pre><ul>
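<li>Related to the <code>agrovoc-lookup.py</code> tweak above, a quick way to see how many subject terms contain parentheses (assuming field 57 is <code>dc.subject</code> as in the commands above, and that resource type 2 is an item):</li>
</ul>
<pre><code>$ psql -d dspace -c "SELECT COUNT(*) FROM metadatavalue WHERE metadata_field_id=57 AND resource_type_id=2 AND text_value ~ '\(.+\)';"
</code></pre><ul>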
<li>UptimeRobot says CGSpace is down again, but it seems to just be slow, as the load is over 10.0</li>
<li>Looking at the nginx logs I don’t see anything terribly abusive, but SemrushBot has made ~3,000 requests to Discovery and Browse pages today:</li>
</ul>
<pre><code># grep SemrushBot /var/log/nginx/access.log | grep -E "26/Mar/2019" | grep -E '(discover|browse)' | wc -l
2931
</code></pre><ul>
<li>So I’m adding it to the badbot rate limiting in nginx, and actually, I kinda feel like just blocking all user agents with “bot” in the name for a few days to see if things calm down… maybe not just yet</li>
<li>Otherwise, these are the top users in the web and API logs the last hour (18–19):</li>
</ul>
<pre><code># zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "26/Mar/2019:(18|19)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
</code></pre>
<pre><code>$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=(18.195.78.144|18.196.196.108)' dspace.log.2019-03-26 | sort | uniq | wc -l
937
</code></pre><ul>
<li>I will add their IPs to the list of bot IPs in nginx so I can tag them as bots to let Tomcat’s Crawler Session Manager Valve to force them to re-use their session</li>
<li>Another user agent behaving badly in Colombia is “GuzzleHttp/6.3.3 curl/7.47.0 PHP/7.0.30-0ubuntu0.16.04.1”</li>
<li>I will add curl to the Tomcat Crawler Session Manager because anyone using curl is most likely an automated read-only request</li>
<li>I will add GuzzleHttp to the nginx badbots rate limiting, because it is making requests to dynamic Discovery pages</li>
</ul>
<pre><code># zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep 45.5.184.72 | grep -E "26/Mar/2019:" | grep -E '(discover|browse)' | wc -l
119
</code></pre><ul>
<li>What’s strange is that I can’t see any of their requests in the DSpace log…</li>
</ul>
<pre><code>$ grep -I -c 45.5.184.72 dspace.log.2019-03-26
0
</code></pre><ul>
<li>None of these 18.x.x.x IPs specify a user agent and they are all on Amazon!</li>
<li>Shortly after I started the re-indexing UptimeRobot began to complain that CGSpace was down, then up, then down, then up…</li>
<li>I see the load on the server is about 10.0 again for some reason though I don’t know WHAT is causing that load
<ul>
<li>It could be the CPU steal metric, as if Linode has oversold the CPU resources on this VM host…</li>
</ul>
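</li>
<li>A quick way to eyeball that from inside the VM is to watch the <code>st</code> (steal) column at the end of <code>vmstat</code> output:</li>
</ul>
<pre><code># the last column (st) is CPU time stolen by the hypervisor
$ vmstat 5 3
</code></pre>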
<p><img src="/cgspace-notes/2019/03/cpu-week-fs8.png" alt="CPU week"></p>
<p><img src="/cgspace-notes/2019/03/cpu-year-fs8.png" alt="CPU year"></p>
<ul>
<li>What’s clear from this is that some other VM on our host has heavy usage for about four hours at 6AM and 6PM and that during that time the load on our server spikes
<ul>
<li>CPU steal has drastically increased since March 25th</li>
<li>It might be time to move to dedicated CPU VM instances, or even real servers</li>
<li>For now I just sent a support ticket to bring this to Linode’s attention</li>
</ul>
</li>
<li>In other news, I see that it’s not even the end of the month yet and we have 3.6 million hits already:</li>
</ul>
<pre><code># zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Mar/2019"
3654911
</code></pre><ul>
<li>It has 64GB of ECC RAM, six core Xeon processor from 2018, and 2x960GB NVMe storage</li>
<li>The alternative of staying with Linode and using dedicated CPU instances with added block storage gets expensive quickly if we want to keep more than 16GB of RAM (do we?)
<ul>
<li>Regarding RAM, our JVM heap is 8GB and we leave the rest of the system’s 32GB of RAM to PostgreSQL and Solr buffers</li>
<li>Seeing as we have 56GB of Solr data it might be better to have more RAM in order to keep more of it in memory</li>
<li>Also, I know that the Linode block storage is a major bottleneck for Solr indexing</li>
</ul>
</ul>
</li>
<li>Looking at the weird issue with shitloads of downloads on the <a href="https://cgspace.cgiar.org/handle/10568/100289">CTA item</a> again</li>
<li>The item was added on 2019-03-13 and these three IPs have attempted to download the item’s bitstream 43,000 times since it was added eighteen days ago:</li>
</ul>
<pre><code># zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/access.log.{2..17}.gz | grep 'Spore-192-EN-web.pdf' | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 5
42 196.43.180.134
</code></pre>
</ul>
<pre><code>2019-03-29 09:10:07,311 ERROR org.dspace.rest.Resource @ Could not delete collection(id=1451), AuthorizeException. Message: org.dspace.authorize.AuthorizeException: Authorization denied for action ADMIN on COLLECTION:1451 by user 9492
</code></pre><ul>
<li>IWMI people emailed to ask why two items with the same DOI don’t have the same Altmetric score:
<ul>
<li><a href="https://cgspace.cgiar.org/handle/10568/89846">https://cgspace.cgiar.org/handle/10568/89846</a> (Bioversity)</li>
<li><a href="https://cgspace.cgiar.org/handle/10568/89975">https://cgspace.cgiar.org/handle/10568/89975</a> (CIAT)</li>
<li><a href="https://www.altmetric.com/explorer/highlights?identifier=10568%2F89975">https://www.altmetric.com/explorer/highlights?identifier=10568%2F89975</a></li>
</ul>
</li>
<li>So it’s likely the DSpace Altmetric badge code that is deciding not to show the badge</li>
</ul>
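<ul>
<li>One way to compare what Altmetric itself knows about the two items (assuming the v1 API accepts Handle lookups) would be:</li>
</ul>
<pre><code>$ curl -s 'https://api.altmetric.com/v1/handle/10568/89846' | python -m json.tool | grep -E '"(score|doi)"'
$ curl -s 'https://api.altmetric.com/v1/handle/10568/89975' | python -m json.tool | grep -E '"(score|doi)"'
</code></pre>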
<!-- raw HTML omitted -->