mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2020-01-27
@ -9,9 +9,9 @@
|
||||
<meta property="og:description" content="2018-02-01
|
||||
|
||||
Peter gave feedback on the dc.rights proof of concept that I had sent him last week
|
||||
We don't need to distinguish between internal and external works, so that makes it just a simple list
|
||||
We don’t need to distinguish between internal and external works, so that makes it just a simple list
|
||||
Yesterday I figured out how to monitor DSpace sessions using JMX
|
||||
I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu's munin-plugins-java package and used the stuff I discovered about JMX in 2018-01
|
||||
I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu’s munin-plugins-java package and used the stuff I discovered about JMX in 2018-01
|
||||
" />
|
||||
<meta property="og:type" content="article" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2018-02/" />
|
||||
@ -23,11 +23,11 @@ I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu's munin-plug
|
||||
<meta name="twitter:description" content="2018-02-01
|
||||
|
||||
Peter gave feedback on the dc.rights proof of concept that I had sent him last week
|
||||
We don't need to distinguish between internal and external works, so that makes it just a simple list
|
||||
We don’t need to distinguish between internal and external works, so that makes it just a simple list
|
||||
Yesterday I figured out how to monitor DSpace sessions using JMX
|
||||
I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu's munin-plugins-java package and used the stuff I discovered about JMX in 2018-01
|
||||
I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu’s munin-plugins-java package and used the stuff I discovered about JMX in 2018-01
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.62.2" />
|
||||
<meta name="generator" content="Hugo 0.63.1" />
|
||||
|
||||
|
||||
|
||||
@ -57,7 +57,7 @@ I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu's munin-plug
|
||||
|
||||
<!-- combined, minified CSS -->
|
||||
|
||||
<link href="https://alanorth.github.io/cgspace-notes/css/style.a20c1a4367639632cdb341d23c27ca44fedcc75b0f8b3cbea6203010da153d3c.css" rel="stylesheet" integrity="sha256-ogwaQ2djljLNs0HSPCfKRP7cx1sPizy+piAwENoVPTw=" crossorigin="anonymous">
|
||||
<link href="https://alanorth.github.io/cgspace-notes/css/style.23e2c3298bcc8c1136c19aba330c211ec94c36f7c4454ea15cf4d3548370042a.css" rel="stylesheet" integrity="sha256-I+LDKYvMjBE2wZq6MwwhHslMNvfERU6hXPTTVINwBCo=" crossorigin="anonymous">
|
||||
|
||||
|
||||
<!-- RSS 2.0 feed -->
|
||||
@ -104,7 +104,7 @@ I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu's munin-plug
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2018-02/">February, 2018</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2018-02-01T16:28:54+02:00">Thu Feb 01, 2018</time> by Alan Orth in
|
||||
<i class="fa fa-folder" aria-hidden="true"></i> <a href="/cgspace-notes/categories/notes" rel="category tag">Notes</a>
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
@ -112,9 +112,9 @@ I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu's munin-plug
|
||||
<h2 id="2018-02-01">2018-02-01</h2>
|
||||
<ul>
|
||||
<li>Peter gave feedback on the <code>dc.rights</code> proof of concept that I had sent him last week</li>
|
||||
<li>We don't need to distinguish between internal and external works, so that makes it just a simple list</li>
|
||||
<li>We don’t need to distinguish between internal and external works, so that makes it just a simple list</li>
|
||||
<li>Yesterday I figured out how to monitor DSpace sessions using JMX</li>
|
||||
<li>I copied the logic in the <code>jmx_tomcat_dbpools</code> provided by Ubuntu's <code>munin-plugins-java</code> package and used the stuff I discovered about JMX <a href="/cgspace-notes/2018-01/">in 2018-01</a></li>
|
||||
<li>I copied the logic in the <code>jmx_tomcat_dbpools</code> provided by Ubuntu’s <code>munin-plugins-java</code> package and used the stuff I discovered about JMX <a href="/cgspace-notes/2018-01/">in 2018-01</a></li>
|
||||
</ul>
|
||||
<p><img src="/cgspace-notes/2018/02/jmx_dspace_sessions-day.png" alt="DSpace Sessions"></p>
|
||||
<ul>
|
||||
@ -163,7 +163,7 @@ sys 0m1.905s
|
||||
<pre><code>dspace=# update metadatavalue set text_value=REGEXP_REPLACE(text_value, '\s+$' , '') where resource_type_id=2 and metadata_field_id=3 and text_value ~ '^.*?\s+$';
|
||||
UPDATE 20
|
||||
</code></pre><ul>
|
||||
<li>I tried the <code>TRIM(TRAILING from text_value)</code> function and it said it changed 20 items but the spaces didn't go away</li>
|
||||
<li>I tried the <code>TRIM(TRAILING from text_value)</code> function and it said it changed 20 items but the spaces didn’t go away</li>
|
||||
<li>This is on a fresh import of the CGSpace database, but when I tried to apply it on CGSpace there were no changes detected. Weird.</li>
|
||||
<li>Anyways, Peter wants a new list of authors to clean up, so I exported another CSV:</li>
|
||||
</ul>
|
||||
@ -200,10 +200,10 @@ Tue Feb 6 09:30:32 UTC 2018
|
||||
295 197.210.168.174
|
||||
752 144.76.64.79
|
||||
</code></pre><ul>
|
||||
<li>I did notice in <code>/var/log/tomcat7/catalina.out</code> that Atmire's update thing was running though</li>
|
||||
<li>I did notice in <code>/var/log/tomcat7/catalina.out</code> that Atmire’s update thing was running though</li>
|
||||
<li>So I restarted Tomcat and now everything is fine</li>
|
||||
<li>Next time I see that many database connections I need to save the output so I can analyze it later</li>
|
||||
<li>I'm going to re-schedule the taskUpdateSolrStatsMetadata task as <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=566">Bram detailed in ticket 566</a> to see if it makes CGSpace stop crashing every morning</li>
|
||||
<li>I’m going to re-schedule the taskUpdateSolrStatsMetadata task as <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=566">Bram detailed in ticket 566</a> to see if it makes CGSpace stop crashing every morning</li>
|
||||
<li>If I move the task from 3AM to 3PM, ideally CGSpace will stop crashing in the morning, or start crashing ~12 hours later</li>
|
||||
<li>Atmire has said that there will eventually be a fix for this high load caused by their script, but it will come with the 5.8 compatibility they are already working on</li>
|
||||
<li>I re-deployed CGSpace with the new task time of 3PM, ran all system updates, and restarted the server</li>
|
||||
@ -211,16 +211,16 @@ Tue Feb 6 09:30:32 UTC 2018
|
||||
<li>I implemented some changes to the pooling in the <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure scripts</a> so that each DSpace web application can use its own pool (web, api, and solr)</li>
|
||||
<li>Each pool uses its own name and hopefully this should help me figure out which one is using too many connections next time CGSpace goes down</li>
|
||||
<li>Also, this will mean that when a search bot comes along and hammers the XMLUI, the REST and OAI applications will be fine</li>
|
||||
<li>I'm not actually sure if the Solr web application uses the database though, so I'll have to check later and remove it if necessary</li>
|
||||
<li>I’m not actually sure if the Solr web application uses the database though, so I’ll have to check later and remove it if necessary</li>
|
||||
<li>I deployed the changes on DSpace Test only for now, so I will monitor and make them on CGSpace later this week</li>
|
||||
</ul>
|
||||
<h2 id="2018-02-07">2018-02-07</h2>
|
||||
<ul>
|
||||
<li>Abenet wrote to ask a question about the ORCiD lookup not working for one CIAT user on CGSpace</li>
|
||||
<li>I tried on DSpace Test and indeed the lookup just doesn't work!</li>
|
||||
<li>I tried on DSpace Test and indeed the lookup just doesn’t work!</li>
|
||||
<li>The ORCiD code in DSpace appears to be using <code>http://pub.orcid.org/</code>, but when I go there in the browser it redirects me to <code>https://pub.orcid.org/v2.0/</code></li>
|
||||
<li>According to <a href="https://groups.google.com/forum/#!topic/orcid-api-users/qfg-HwAB1bk">the announcement</a> the v1 API was moved from <code>http://pub.orcid.org/</code> to <code>https://pub.orcid.org/v1.2</code> until March 1st when it will be discontinued for good</li>
|
||||
<li>But the old URL is hard coded in DSpace and it doesn't work anyways, because it currently redirects you to <code>https://pub.orcid.org/v2.0/v1.2</code></li>
|
||||
<li>But the old URL is hard coded in DSpace and it doesn’t work anyways, because it currently redirects you to <code>https://pub.orcid.org/v2.0/v1.2</code></li>
|
||||
<li>So I guess we have to disable that shit once and for all and switch to a controlled vocabulary</li>
|
||||
<li>CGSpace crashed again, this time around <code>Wed Feb 7 11:20:28 UTC 2018</code></li>
|
||||
<li>I took a few snapshots of the PostgreSQL activity at the time; the connections were very high at first but reduced on their own as the minutes went on:</li>
|
||||
@ -249,7 +249,7 @@ $ grep -c 'PostgreSQL JDBC' /tmp/pg_stat_activity*
|
||||
1828
|
||||
</code></pre><ul>
|
||||
<li>CGSpace went down again a few hours later, and now the connections to the dspaceWeb pool are maxed at 250 (the new limit I imposed with the new separate pool scheme)</li>
|
||||
<li>What's interesting is that the DSpace log says the connections are all busy:</li>
|
||||
<li>What’s interesting is that the DSpace log says the connections are all busy:</li>
|
||||
</ul>
|
||||
<pre><code>org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-328] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:250; busy:250; idle:0; lastwait:5000].
|
||||
</code></pre><ul>
|
||||
@ -263,14 +263,14 @@ $ psql -c 'select * from pg_stat_activity' | grep dspaceWeb | grep -c "idle
|
||||
187
|
||||
</code></pre><ul>
|
||||
<li>What the fuck, does DSpace think all connections are busy?</li>
|
||||
<li>I suspect these are issues with abandoned connections or maybe a leak, so I'm going to try adding the <code>removeAbandoned='true'</code> parameter which is apparently off by default</li>
|
||||
<li>I will try <code>testOnReturn='true'</code> too, just to add more validation, because I'm fucking grasping at straws</li>
|
||||
<li>I suspect these are issues with abandoned connections or maybe a leak, so I’m going to try adding the <code>removeAbandoned='true'</code> parameter which is apparently off by default</li>
|
||||
<li>I will try <code>testOnReturn='true'</code> too, just to add more validation, because I’m fucking grasping at straws</li>
|
||||
<li>Also, WTF, there was a heap space error randomly in catalina.out:</li>
|
||||
</ul>
|
||||
<pre><code>Wed Feb 07 15:01:54 UTC 2018 | Query:containerItem:91917 AND type:2
|
||||
Exception in thread "http-bio-127.0.0.1-8081-exec-58" java.lang.OutOfMemoryError: Java heap space
|
||||
</code></pre><ul>
|
||||
<li>I'm trying to find a way to determine what was using all those Tomcat sessions, but parsing the DSpace log is hard because some IPs are IPv6, which contain colons!</li>
|
||||
<li>I’m trying to find a way to determine what was using all those Tomcat sessions, but parsing the DSpace log is hard because some IPs are IPv6, which contain colons!</li>
|
||||
<li>Looking at the first crash this morning around 11, I see these IPv4 addresses making requests around 10 and 11AM:</li>
|
||||
</ul>
|
||||
<pre><code>$ grep -E '^2018-02-07 (10|11)' dspace.log.2018-02-07 | grep -o -E 'ip_addr=[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' | sort -n | uniq -c | sort -n | tail -n 20
|
||||
@ -319,20 +319,20 @@ $ grep 46.229.168 dspace.log.2018-02-07 | grep -o -E 'session_id=[A-Z0-9]{32}' |
|
||||
992
|
||||
|
||||
</code></pre><ul>
|
||||
<li>Let's investigate who these IPs belong to:
|
||||
<li>Let’s investigate who these IPs belong to:
|
||||
<ul>
|
||||
<li>104.196.152.243 is CIAT, which is already marked as a bot via nginx!</li>
|
||||
<li>207.46.13.71 is Bing, which is already marked as a bot in Tomcat's Crawler Session Manager Valve!</li>
|
||||
<li>40.77.167.62 is Bing, which is already marked as a bot in Tomcat's Crawler Session Manager Valve!</li>
|
||||
<li>207.46.13.135 is Bing, which is already marked as a bot in Tomcat's Crawler Session Manager Valve!</li>
|
||||
<li>68.180.228.157 is Yahoo, which is already marked as a bot in Tomcat's Crawler Session Manager Valve!</li>
|
||||
<li>40.77.167.36 is Bing, which is already marked as a bot in Tomcat's Crawler Session Manager Valve!</li>
|
||||
<li>207.46.13.54 is Bing, which is already marked as a bot in Tomcat's Crawler Session Manager Valve!</li>
|
||||
<li>46.229.168.x is Semrush, which is already marked as a bot in Tomcat's Crawler Session Manager Valve!</li>
|
||||
<li>207.46.13.71 is Bing, which is already marked as a bot in Tomcat’s Crawler Session Manager Valve!</li>
|
||||
<li>40.77.167.62 is Bing, which is already marked as a bot in Tomcat’s Crawler Session Manager Valve!</li>
|
||||
<li>207.46.13.135 is Bing, which is already marked as a bot in Tomcat’s Crawler Session Manager Valve!</li>
|
||||
<li>68.180.228.157 is Yahoo, which is already marked as a bot in Tomcat’s Crawler Session Manager Valve!</li>
|
||||
<li>40.77.167.36 is Bing, which is already marked as a bot in Tomcat’s Crawler Session Manager Valve!</li>
|
||||
<li>207.46.13.54 is Bing, which is already marked as a bot in Tomcat’s Crawler Session Manager Valve!</li>
|
||||
<li>46.229.168.x is Semrush, which is already marked as a bot in Tomcat’s Crawler Session Manager Valve!</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>Nice, so these are all known bots that are already crammed into one session by Tomcat's Crawler Session Manager Valve.</li>
|
||||
<li>What in the actual fuck, why is our load doing this? It's gotta be something fucked up with the database pool being “busy” but everything is fucking idle</li>
|
||||
<li>Nice, so these are all known bots that are already crammed into one session by Tomcat’s Crawler Session Manager Valve.</li>
|
||||
<li>What in the actual fuck, why is our load doing this? It’s gotta be something fucked up with the database pool being “busy” but everything is fucking idle</li>
|
||||
<li>One that I should probably add in nginx is 54.83.138.123, which is apparently the following user agent:</li>
|
||||
</ul>
|
||||
<pre><code>BUbiNG (+http://law.di.unimi.it/BUbiNG.html)
|
||||
@ -343,7 +343,7 @@ $ grep 46.229.168 dspace.log.2018-02-07 | grep -o -E 'session_id=[A-Z0-9]{32}' |
|
||||
/var/log/nginx/access.log:1925
|
||||
/var/log/nginx/access.log.1:2029
|
||||
</code></pre><ul>
|
||||
<li>And they have 30 IPs, so fuck that shit I'm going to add them to the Tomcat Crawler Session Manager Valve nowwww</li>
|
||||
<li>And they have 30 IPs, so fuck that shit I’m going to add them to the Tomcat Crawler Session Manager Valve nowwww</li>
|
||||
<li>Lots of discussions on the dspace-tech mailing list over the last few years about leaky transactions being a known problem with DSpace</li>
|
||||
<li>Helix84 recommends restarting PostgreSQL instead of Tomcat because it restarts quicker</li>
|
||||
<li>This is how the connections looked when it crashed this afternoon:</li>
|
||||
@ -359,16 +359,16 @@ $ grep 46.229.168 dspace.log.2018-02-07 | grep -o -E 'session_id=[A-Z0-9]{32}' |
|
||||
5 dspaceWeb
|
||||
</code></pre><ul>
|
||||
<li>So is this just some fucked up XMLUI database leaking?</li>
|
||||
<li>I notice there is an issue (that I've probably noticed before) on the Jira tracker about this that was fixed in DSpace 5.7: <a href="https://jira.duraspace.org/browse/DS-3551">https://jira.duraspace.org/browse/DS-3551</a></li>
|
||||
<li>I seriously doubt this leaking shit is fixed for sure, but I'm gonna cherry-pick all those commits and try them on DSpace Test and probably even CGSpace because I'm fed up with this shit</li>
|
||||
<li>I cherry-picked all the commits for DS-3551 but it won't build on our current DSpace 5.5!</li>
|
||||
<li>I notice there is an issue (that I’ve probably noticed before) on the Jira tracker about this that was fixed in DSpace 5.7: <a href="https://jira.duraspace.org/browse/DS-3551">https://jira.duraspace.org/browse/DS-3551</a></li>
|
||||
<li>I seriously doubt this leaking shit is fixed for sure, but I’m gonna cherry-pick all those commits and try them on DSpace Test and probably even CGSpace because I’m fed up with this shit</li>
|
||||
<li>I cherry-picked all the commits for DS-3551 but it won’t build on our current DSpace 5.5!</li>
|
||||
<li>I sent a message to the dspace-tech mailing list asking why DSpace thinks these connections are busy when PostgreSQL says they are idle</li>
|
||||
</ul>
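<p>A minimal sketch of the busy-versus-idle comparison above, assuming the <code>pg_stat_activity</code> snapshots were saved as plain text with one connection per line. The helper name <code>summarize_pools</code> is hypothetical; only the pool names <code>dspaceWeb</code> and <code>dspaceApi</code> come from these notes. Like the <code>grep</code> pipeline, it treats any row containing "idle" as idle, so "idle in transaction" rows are counted as idle too.</p>

```python
from collections import Counter


def summarize_pools(lines, pools=("dspaceWeb", "dspaceApi")):
    """Count connections per pool, split into 'idle' and 'other'.

    `lines` is any iterable of text rows from a saved pg_stat_activity
    snapshot (an assumption about how the snapshots were stored).
    """
    counts = Counter()
    for line in lines:
        for pool in pools:
            if pool in line:
                # Same coarse test as `grep idle`: this also matches
                # rows whose state is "idle in transaction".
                state = "idle" if "idle" in line else "other"
                counts[(pool, state)] += 1
                break  # a row belongs to at most one pool
    return counts


if __name__ == "__main__":
    # Made-up rows for illustration, not real snapshot data.
    snapshot = [
        "dspace | dspaceWeb | idle",
        "dspace | dspaceWeb | idle in transaction",
        "dspace | dspaceApi | active",
    ]
    for (pool, state), total in sorted(summarize_pools(snapshot).items()):
        print(pool, state, total)
```

<p>Comparing these totals with the <code>busy</code> and <code>idle</code> figures in the pool exception would show how far Tomcat's view differs from PostgreSQL's.</p>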
|
||||
<h2 id="2018-02-10">2018-02-10</h2>
|
||||
<ul>
|
||||
<li>I tried to disable ORCID lookups but keep the existing authorities</li>
|
||||
<li>This item has an ORCID for Ralf Kiese: http://localhost:8080/handle/10568/89897</li>
|
||||
<li>Switch authority.controlled off and change authorLookup to lookup, and the ORCID badge doesn't show up on the item</li>
|
||||
<li>Switch authority.controlled off and change authorLookup to lookup, and the ORCID badge doesn’t show up on the item</li>
|
||||
<li>Leave all settings but change choices.presentation to lookup and ORCID badge is there and item submission uses LC Name Authority and it breaks with this error:</li>
|
||||
</ul>
|
||||
<pre><code>Field dc_contributor_author has choice presentation of type "select", it may NOT be authority-controlled.
|
||||
@ -377,7 +377,7 @@ $ grep 46.229.168 dspace.log.2018-02-07 | grep -o -E 'session_id=[A-Z0-9]{32}' |
|
||||
</ul>
|
||||
<pre><code>xmlui.mirage2.forms.instancedCompositeFields.noSuggestionError
|
||||
</code></pre><ul>
|
||||
<li>So I don't think we can disable the ORCID lookup function and keep the ORCID badges</li>
|
||||
<li>So I don’t think we can disable the ORCID lookup function and keep the ORCID badges</li>
|
||||
</ul>
|
||||
<h2 id="2018-02-11">2018-02-11</h2>
|
||||
<ul>
|
||||
@ -409,7 +409,7 @@ authors-2018-02-05.csv: line 100, char 18, byte 4179: After a first byte between
|
||||
<pre><code>$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"
|
||||
$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
|
||||
</code></pre><ul>
|
||||
<li>That reminds me that Bizu had asked me to fix some of Alan Duncan's names in December</li>
|
||||
<li>That reminds me that Bizu had asked me to fix some of Alan Duncan’s names in December</li>
|
||||
<li>I see he actually has some variations with “Duncan, Alan J.": <a href="https://cgspace.cgiar.org/discover?filtertype_1=author&filter_relational_operator_1=contains&filter_1=Duncan%2C+Alan&submit_apply_filter=&query=">https://cgspace.cgiar.org/discover?filtertype_1=author&filter_relational_operator_1=contains&filter_1=Duncan%2C+Alan&submit_apply_filter=&query=</a></li>
|
||||
<li>I will just update those for her too and then restart the indexing:</li>
|
||||
</ul>
|
||||
@ -440,7 +440,7 @@ dspace=# commit;
|
||||
<li>I wrote a Python script (<a href="https://gist.github.com/alanorth/57a88379126d844563c1410bd7b8d12b"><code>resolve-orcids-from-solr.py</code></a>) using SolrClient to parse the Solr authority cache for ORCID IDs</li>
|
||||
<li>We currently have 1562 authority records with ORCID IDs, and 624 unique IDs</li>
|
||||
<li>We can use this to build a controlled vocabulary of ORCID IDs for new item submissions</li>
|
||||
<li>I don't know how to add ORCID IDs to existing items yet… some more querying of PostgreSQL for authority values perhaps?</li>
|
||||
<li>I don’t know how to add ORCID IDs to existing items yet… some more querying of PostgreSQL for authority values perhaps?</li>
|
||||
<li>I added the script to the <a href="https://github.com/ilri/DSpace/wiki/Scripts">ILRI DSpace wiki on GitHub</a></li>
|
||||
</ul>
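<p>A minimal sketch of the de-duplication step, not the script linked above. It assumes the authority records have already been fetched from Solr and are available as plain strings; the helper name <code>unique_orcids</code> is hypothetical, and the pattern is the same loose one used with <code>grep -oE</code> elsewhere in these notes.</p>

```python
import re

# Four groups of four characters separated by hyphens. This is a loose
# match: it does not validate the ORCID checksum.
ORCID_PATTERN = re.compile(r"[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}")


def unique_orcids(records):
    """Return the sorted, de-duplicated identifiers found in `records`."""
    found = set()
    for record in records:
        found.update(ORCID_PATTERN.findall(record))
    return sorted(found)


if __name__ == "__main__":
    # Illustrative input only; the identifier is the example used in these notes.
    sample = [
        "Alan S. Orth (0000-0002-1735-7458)",
        "Orth, Alan: 0000-0002-1735-7458",
        "record without an identifier",
    ]
    for orcid in unique_orcids(sample):
        print(orcid)
```

<p>Sorting the result keeps the output stable between runs, which makes it easier to diff against a controlled vocabulary file.</p>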
|
||||
<h2 id="2018-02-12">2018-02-12</h2>
|
||||
@ -448,21 +448,21 @@ dspace=# commit;
|
||||
<li>Follow up with Atmire on the <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=560">DSpace 5.8 Compatibility ticket</a> to ask again if they want me to send them a DSpace 5.8 branch to work on</li>
|
||||
<li>Abenet asked if there was a way to get the number of submissions she and Bizuwork did</li>
|
||||
<li>I said that the Atmire Workflow Statistics module was supposed to be able to do that</li>
|
||||
<li>We had tried it in <a href="/cgspace-notes/2017-06/">June, 2017</a> and found that it didn't work</li>
|
||||
<li>Atmire sent us some fixes but they didn't work either</li>
|
||||
<li>We had tried it in <a href="/cgspace-notes/2017-06/">June, 2017</a> and found that it didn’t work</li>
|
||||
<li>Atmire sent us some fixes but they didn’t work either</li>
|
||||
<li>I just tried the branch with the fixes again and it indeed does not work:</li>
|
||||
</ul>
|
||||
<p><img src="/cgspace-notes/2018/02/atmire-workflow-statistics.png" alt="Atmire Workflow Statistics No Data Available"></p>
|
||||
<ul>
|
||||
<li>I see that in <a href="/cgspace-notes/2017-04/">April, 2017</a> I just used a SQL query to get a user's submissions by checking the <code>dc.description.provenance</code> field</li>
|
||||
<li>I see that in <a href="/cgspace-notes/2017-04/">April, 2017</a> I just used a SQL query to get a user’s submissions by checking the <code>dc.description.provenance</code> field</li>
|
||||
<li>So for Abenet, I can check her submissions in December, 2017 with:</li>
|
||||
</ul>
|
||||
<pre><code>dspace=# select * from metadatavalue where resource_type_id=2 and metadata_field_id=28 and text_value ~ '^Submitted.*yabowork.*2017-12.*';
|
||||
</code></pre><ul>
|
||||
<li>I emailed Peter to ask whether we can move DSpace Test to a new Linode server and attach 300 GB of disk space to it</li>
|
||||
<li>This would be using <a href="https://www.linode.com/blockstorage">Linode's new block storage volumes</a></li>
|
||||
<li>This would be using <a href="https://www.linode.com/blockstorage">Linode’s new block storage volumes</a></li>
|
||||
<li>I think our current $40/month Linode has enough CPU and memory capacity, but we need more disk space</li>
|
||||
<li>I think I'd probably just attach the block storage volume and mount it on /home/dspace</li>
|
||||
<li>I think I’d probably just attach the block storage volume and mount it on /home/dspace</li>
|
||||
<li>Ask Peter about <code>dc.rights</code> on DSpace Test again, if he likes it then we should move it to CGSpace soon</li>
|
||||
</ul>
|
||||
<h2 id="2018-02-13">2018-02-13</h2>
|
||||
@ -492,16 +492,16 @@ dspace.log.2018-02-11:3
|
||||
dspace.log.2018-02-12:0
|
||||
dspace.log.2018-02-13:4
|
||||
</code></pre><ul>
|
||||
<li>I apparently added that on 2018-02-07 so it could be, as I don't see any of those socket closed errors in 2018-01's logs!</li>
|
||||
<li>I apparently added that on 2018-02-07 so it could be, as I don’t see any of those socket closed errors in 2018-01’s logs!</li>
|
||||
<li>I will increase the removeAbandonedTimeout from its default of 60 seconds to 90 and enable logAbandoned</li>
|
||||
<li>Peter hit this issue one more time, and this is apparently what Tomcat's catalina.out log says when an abandoned connection is removed:</li>
|
||||
<li>Peter hit this issue one more time, and this is apparently what Tomcat’s catalina.out log says when an abandoned connection is removed:</li>
|
||||
</ul>
|
||||
<pre><code>Feb 13, 2018 2:05:42 PM org.apache.tomcat.jdbc.pool.ConnectionPool abandon
|
||||
WARNING: Connection has been abandoned PooledConnection[org.postgresql.jdbc.PgConnection@22e107be]:java.lang.Exception
|
||||
</code></pre><h2 id="2018-02-14">2018-02-14</h2>
|
||||
<ul>
|
||||
<li>Skype with Peter and the Addis team to discuss what we need to do for the ORCIDs in the immediate future</li>
|
||||
<li>We said we'd start with a controlled vocabulary for <code>cg.creator.id</code> on the DSpace Test submission form, where we store the author name and the ORCID in some format like: Alan S. Orth (0000-0002-1735-7458)</li>
|
||||
<li>We said we’d start with a controlled vocabulary for <code>cg.creator.id</code> on the DSpace Test submission form, where we store the author name and the ORCID in some format like: Alan S. Orth (0000-0002-1735-7458)</li>
|
||||
<li>Eventually we need to find a way to print the author names with links to their ORCID profiles</li>
|
||||
<li>Abenet will send an email to the partners to give us ORCID IDs for their authors and to stress that they update their name format on ORCID.org if they want it in a special way</li>
|
||||
<li>I sent the Codeobia guys a question to ask how they prefer that we store the IDs, ie one of:
|
||||
@ -539,14 +539,14 @@ $ grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' CGcenter_ORCID_ID_c
|
||||
<pre><code>$ cat CGcenter_ORCID_ID_combined.csv ciat-orcids.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq | wc -l
|
||||
1227
|
||||
</code></pre><ul>
|
||||
<li>There are some formatting issues with names in Peter's list, so I should remember to re-generate the list of names from ORCID's API once we're done</li>
|
||||
<li>There are some formatting issues with names in Peter’s list, so I should remember to re-generate the list of names from ORCID’s API once we’re done</li>
|
||||
<li>The <code>dspace cleanup -v</code> currently fails on CGSpace with the following:</li>
|
||||
</ul>
|
||||
<pre><code> - Deleting bitstream record from database (ID: 149473)
|
||||
Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
|
||||
Detail: Key (bitstream_id)=(149473) is still referenced from table "bundle".
|
||||
</code></pre><ul>
|
||||
<li>The solution is to update the bundle table, as I've discovered several other times in 2016 and 2017:</li>
|
||||
<li>The solution is to update the bundle table, as I’ve discovered several other times in 2016 and 2017:</li>
|
||||
</ul>
|
||||
<pre><code>$ psql dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (149473);'
|
||||
UPDATE 1
|
||||
@ -561,7 +561,7 @@ UPDATE 1
|
||||
<li>See the corresponding page on Altmetric: <a href="https://www.altmetric.com/details/handle/10568/78450">https://www.altmetric.com/details/handle/10568/78450</a></li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>And this item doesn't even exist on CGSpace!</li>
|
||||
<li>And this item doesn’t even exist on CGSpace!</li>
|
||||
<li>Start working on XMLUI item display code for ORCIDs</li>
|
||||
<li>Send emails to Macaroni Bros and Usman at CIFOR about ORCID metadata</li>
|
||||
<li>CGSpace crashed while I was driving to Tel Aviv, and was down for four hours!</li>
|
||||
@ -573,7 +573,7 @@ UPDATE 1
|
||||
1 dspaceWeb
|
||||
3 dspaceApi
|
||||
</code></pre><ul>
|
||||
<li>I see shitloads of memory errors in Tomcat's logs:</li>
|
||||
<li>I see shitloads of memory errors in Tomcat’s logs:</li>
|
||||
</ul>
|
||||
<pre><code># grep -c "Java heap space" /var/log/tomcat7/catalina.out
|
||||
56
|
||||
@ -607,13 +607,13 @@ UPDATE 1
|
||||
UPDATE 2
|
||||
</code></pre><h2 id="2018-02-18">2018-02-18</h2>
|
||||
<ul>
|
||||
<li>ICARDA's Mohamed Salem pointed out that it would be easiest to format the <code>cg.creator.id</code> field like “Alan Orth: 0000-0002-1735-7458” because no name will have a “:” so it's easier to split on</li>
|
||||
<li>ICARDA’s Mohamed Salem pointed out that it would be easiest to format the <code>cg.creator.id</code> field like “Alan Orth: 0000-0002-1735-7458” because no name will have a “:” so it’s easier to split on</li>
|
||||
<li>I finally figured out a few ways to extract ORCID iDs from metadata using XSLT and display them in the XMLUI:</li>
|
||||
</ul>
|
||||
<p><img src="/cgspace-notes/2018/02/xmlui-orcid-display.png" alt="Displaying ORCID iDs in XMLUI"></p>
|
||||
<ul>
|
||||
<li>The one on the bottom left uses a similar format to our author display, and the one in the middle uses the format <a href="https://orcid.org/trademark-and-id-display-guidelines">recommended by ORCID's branding guidelines</a></li>
|
||||
<li>Also, I realized that the Academicons font icon set we're using includes an ORCID badge so we don't need to use the PNG image anymore</li>
|
||||
<li>The one on the bottom left uses a similar format to our author display, and the one in the middle uses the format <a href="https://orcid.org/trademark-and-id-display-guidelines">recommended by ORCID’s branding guidelines</a></li>
|
||||
<li>Also, I realized that the Academicons font icon set we’re using includes an ORCID badge so we don’t need to use the PNG image anymore</li>
|
||||
<li>Run system updates on DSpace Test (linode02) and reboot the server</li>
|
||||
<li>Looking back at the system errors on 2018-02-15, I wonder what the fuck caused this:</li>
|
||||
</ul>
|
||||
@ -629,13 +629,13 @@ UPDATE 2
|
||||
167432 dspace.log.2018-02-18
|
||||
</code></pre><ul>
|
||||
<li>From an average of a few hundred thousand to over four million lines in DSpace log?</li>
|
||||
<li>Using grep's <code>-B1</code> I can see the line before the heap space error, which has the time, ie:</li>
|
||||
<li>Using grep’s <code>-B1</code> I can see the line before the heap space error, which has the time, ie:</li>
|
||||
</ul>
|
||||
<pre><code>2018-02-15 16:02:12,748 ERROR org.dspace.app.xmlui.cocoon.DSpaceCocoonServletFilter @ Serious Error Occurred Processing Request!
|
||||
org.springframework.web.util.NestedServletException: Handler processing failed; nested exception is java.lang.OutOfMemoryError: Java heap space
|
||||
</code></pre><ul>
|
||||
<li>So these errors happened at hours 16, 18, 19, and 20</li>
|
||||
<li>Let's see what was going on in nginx then:</li>
|
||||
<li>Let’s see what was going on in nginx then:</li>
|
||||
</ul>
|
||||
<pre><code># zcat --force /var/log/nginx/*.log.{3,4}.gz | wc -l
|
||||
168571
|
||||
@ -693,7 +693,7 @@ Traceback (most recent call last):
|
||||
family_name = data['name']['family-name']['value']
|
||||
TypeError: 'NoneType' object is not subscriptable
|
||||
</code></pre><ul>
|
||||
<li>According to ORCID that identifier's family-name is null so that sucks</li>
|
||||
<li>According to ORCID that identifier’s family-name is null so that sucks</li>
|
||||
<li>I fixed the script so that it checks if the family name is null</li>
|
||||
<li>Now another:</li>
|
||||
</ul>
|
||||
@ -707,19 +707,19 @@ Traceback (most recent call last):
|
||||
if data['name']['given-names']:
|
||||
TypeError: 'NoneType' object is not subscriptable
|
||||
</code></pre><ul>
|
||||
<li>According to ORCID that identifier's entire name block is null!</li>
|
||||
<li>According to ORCID that identifier’s entire name block is null!</li>
|
||||
</ul>
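<p>A minimal sketch of the null handling that both tracebacks call for, not the actual fix in the script. It assumes the record is a parsed JSON dictionary whose <code>name</code> block, <code>given-names</code> and <code>family-name</code> entries may each be null, and that non-null entries carry a <code>value</code> key as in the first traceback. The helper name <code>extract_name</code> is hypothetical.</p>

```python
def extract_name(data):
    """Build a display name from an ORCID-style record, or return None.

    Every level is checked because the whole name block, or any single
    part of it, can be null in the response.
    """
    name = data.get("name")
    if not name:
        return None  # entire name block is null

    parts = []
    for key in ("given-names", "family-name"):
        entry = name.get(key)
        # Skip entries that are null or that carry an empty value.
        if entry and entry.get("value"):
            parts.append(entry["value"])

    return " ".join(parts) if parts else None


if __name__ == "__main__":
    # Illustrative records only, shaped like the cases described above.
    print(extract_name({"name": None}))
    print(extract_name({"name": {"given-names": {"value": "Alan"},
                                 "family-name": None}}))
```

<p>Returning <code>None</code> instead of raising lets the caller decide whether to skip the identifier or log it for manual review.</p>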
|
||||
<h2 id="2018-02-20">2018-02-20</h2>
|
||||
<ul>
|
||||
<li>Send Abenet an email about getting a purchase requisition for a new DSpace Test server on Linode</li>
|
||||
<li>Discuss some of the issues with null values and poor-quality names in some ORCID identifiers with Abenet and I think we'll now only use ORCID iDs that have been sent to us by partners, not those extracted via keyword searches on orcid.org</li>
|
||||
<li>This should be the version we use (the existing controlled vocabulary generated from CGSpace's Solr authority core plus the IDs sent to us so far by partners):</li>
|
||||
<li>Discuss some of the issues with null values and poor-quality names in some ORCID identifiers with Abenet and I think we’ll now only use ORCID iDs that have been sent to us by partners, not those extracted via keyword searches on orcid.org</li>
|
||||
<li>This should be the version we use (the existing controlled vocabulary generated from CGSpace’s Solr authority core plus the IDs sent to us so far by partners):</li>
|
||||
</ul>
|
||||
<pre><code>$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-id.xml ORCID_ID_CIAT_IITA_IWMI.csv | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > 2018-02-20-combined.txt
|
||||
</code></pre><ul>
|
||||
<li>I updated the <code>resolve-orcids.py</code> to use the “credit-name” if it exists in a profile, falling back to “given-names” + “family-name”</li>
|
||||
<li>Also, I added color-coded output to the debug messages and added a “quiet” mode that suppresses the normal behavior of printing results to the screen</li>
|
||||
<li>I'm using this as the test input for <code>resolve-orcids.py</code>:</li>
|
||||
<li>I’m using this as the test input for <code>resolve-orcids.py</code>:</li>
|
||||
</ul>
|
||||
<pre><code>$ cat orcid-test-values.txt
|
||||
# valid identifier with 'given-names' and 'family-name'
|
||||
@ -753,13 +753,13 @@ TypeError: 'NoneType' object is not subscriptable
|
||||
<li>The Altmetric JavaScript builds the following API call: <a href="https://api.altmetric.com/v1/handle/10568/83320?callback=_altmetric.embed_callback&domain=cgspace.cgiar.org&key=3c130976ca2b8f2e88f8377633751ba1&cache_until=13-20">https://api.altmetric.com/v1/handle/10568/83320?callback=_altmetric.embed_callback&domain=cgspace.cgiar.org&key=3c130976ca2b8f2e88f8377633751ba1&cache_until=13-20</a></li>
|
||||
<li>The response body is <em>not</em> JSON</li>
|
||||
<li>To contrast, the following bare API call without query parameters is valid JSON: <a href="https://api.altmetric.com/v1/handle/10568/83320">https://api.altmetric.com/v1/handle/10568/83320</a></li>
|
||||
<li>I told them that it's their JavaScript that is fucked up</li>
|
||||
<li>I told them that it’s their JavaScript that is fucked up</li>
|
||||
<li>Remove CPWF project number and Humidtropics subject from submission form (<a href="https://github.com/alanorth/DSpace/pull/3">#3</a>)</li>
|
||||
<li>I accidentally merged it into my own repository, oops</li>
|
||||
</ul>
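<p>The name-selection rule described earlier in this entry for <code>resolve-orcids.py</code> (use “credit-name” if present, otherwise “given-names” + “family-name”) could be sketched roughly as below. This is an illustration, not the script's real code: the function name <code>display_name</code> is hypothetical, and it assumes that <code>credit-name</code> carries its text under a <code>value</code> key in the same way the other name fields do.</p>

```python
from typing import Optional


def display_name(person: dict) -> Optional[str]:
    """Prefer credit-name; otherwise join given-names and family-name."""
    # Any of these levels may be null in a real profile, hence the "or {}" guards
    name = person.get("name") or {}

    credit = (name.get("credit-name") or {}).get("value")
    if credit:
        return credit

    given = (name.get("given-names") or {}).get("value")
    family = (name.get("family-name") or {}).get("value")
    parts = [part for part in (given, family) if part]
    return " ".join(parts) if parts else None


# Hypothetical profiles used only to exercise the three branches
with_credit = {"name": {"credit-name": {"value": "J. Doe"},
                        "given-names": {"value": "Jane"},
                        "family-name": {"value": "Doe"}}}
without_credit = {"name": {"credit-name": None,
                           "given-names": {"value": "Jane"},
                           "family-name": {"value": "Doe"}}}
empty = {"name": None}

for profile in (with_credit, without_credit, empty):
    print(display_name(profile))
```

<p>Returning <code>None</code> when no usable name exists leaves it to the caller to decide whether to skip the identifier or log it.</p>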
|
||||
<h2 id="2018-02-22">2018-02-22</h2>
|
||||
<ul>
|
||||
<li>CGSpace was apparently down today around 13:00 server time and I didn't get any emails on my phone, but saw them later on the computer</li>
|
||||
<li>CGSpace was apparently down today around 13:00 server time and I didn’t get any emails on my phone, but saw them later on the computer</li>
|
||||
<li>It looks like Sisay restarted Tomcat because I was offline</li>
|
||||
<li>There was absolutely nothing interesting going on at 13:00 on the server, WTF?</li>
|
||||
</ul>
|
||||
@ -789,7 +789,7 @@ TypeError: 'NoneType' object is not subscriptable
|
||||
5208 5.9.6.51
|
||||
8686 45.5.184.196
|
||||
</code></pre><ul>
|
||||
<li>So I don't see any definite cause for this crash, I see a shit ton of abandoned PostgreSQL connections today around 1PM!</li>
|
||||
<li>So I don’t see any definite cause for this crash, I see a shit ton of abandoned PostgreSQL connections today around 1PM!</li>
|
||||
</ul>
|
||||
<pre><code># grep -c 'org.apache.tomcat.jdbc.pool.ConnectionPool abandon' /var/log/tomcat7/catalina.out
|
||||
729
|
||||
@ -821,14 +821,14 @@ TypeError: 'NoneType' object is not subscriptable
|
||||
<pre><code>$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml /tmp/ccafs | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq | wc -l
|
||||
1004
|
||||
</code></pre><ul>
|
||||
<li>I will add them to DSpace Test but Abenet says she's still waiting to send us ILRI's list</li>
|
||||
<li>I will add them to DSpace Test but Abenet says she’s still waiting to send us ILRI’s list</li>
|
||||
<li>I will tell her that we should proceed on sharing our work on DSpace Test with the partners this week anyways and we can update the list later</li>
|
||||
<li>While regenerating the names for these ORCID identifiers I saw <a href="https://pub.orcid.org/v2.1/0000-0002-2614-426X/person">one that has a weird value for its names</a>:</li>
|
||||
</ul>
|
||||
<pre><code>Looking up the names associated with ORCID iD: 0000-0002-2614-426X
|
||||
Given Names Deactivated Family Name Deactivated: 0000-0002-2614-426X
|
||||
</code></pre><ul>
|
||||
<li>I don't know if the user accidentally entered this as their name or if that's how ORCID behaves when the name is private?</li>
|
||||
<li>I don’t know if the user accidentally entered this as their name or if that’s how ORCID behaves when the name is private?</li>
|
||||
<li>I will remove that one from our list for now</li>
|
||||
<li>Remove Dryland Systems subject from submission form because that CRP closed two years ago (<a href="https://github.com/ilri/DSpace/pull/355">#355</a>)</li>
|
||||
<li>Run all system updates on DSpace Test</li>
|
||||
@ -842,7 +842,7 @@ Given Names Deactivated Family Name Deactivated: 0000-0002-2614-426X
|
||||
62464
|
||||
(1 row)
|
||||
</code></pre><ul>
|
||||
<li>I know from earlier this month that there are only 624 unique ORCID identifiers in the Solr authority core, so it's way easier to just fetch the unique ORCID iDs from Solr and then go back to PostgreSQL and do the metadata mapping that way</li>
|
||||
<li>I know from earlier this month that there are only 624 unique ORCID identifiers in the Solr authority core, so it’s way easier to just fetch the unique ORCID iDs from Solr and then go back to PostgreSQL and do the metadata mapping that way</li>
|
||||
<li>The query in Solr would simply be <code>orcid_id:*</code></li>
|
||||
<li>Assuming I know that authority record with <code>id:d7ef744b-bbd4-4171-b449-00e37e1b776f</code>, then I could query PostgreSQL for all metadata records using that authority:</li>
|
||||
</ul>
|
||||
@ -877,14 +877,14 @@ Nor Azwadi: 0000-0001-9634-1958
|
||||
<ul>
|
||||
<li>Peter is having problems with “Socket closed” on his submissions page again</li>
|
||||
<li>He says his personal account loads much faster than his CGIAR account, which could be because the CGIAR account has potentially thousands of submissions over the last few years</li>
|
||||
<li>I don't know why it would take so long, but this logic kinda makes sense</li>
|
||||
<li>I don’t know why it would take so long, but this logic kinda makes sense</li>
|
||||
<li>I think I should increase the <code>removeAbandonedTimeout</code> from 90 to something like 180 and continue observing</li>
|
||||
<li>I also reduced the timeout for the API pool back to 60 because those interfaces are only used by bots</li>
|
||||
</ul>
|
||||
<h2 id="2018-02-27">2018-02-27</h2>
|
||||
<ul>
|
||||
<li>Peter is still having problems with “Socket closed” on his submissions page</li>
|
||||
<li>I have disabled <code>removeAbandoned</code> for now because that's the only thing I changed in the last few weeks since he started having issues</li>
|
||||
<li>I have disabled <code>removeAbandoned</code> for now because that’s the only thing I changed in the last few weeks since he started having issues</li>
|
||||
<li>I think the real line of logic to follow here is why the submissions page is so slow for him (presumably because of loading all his submissions?)</li>
|
||||
<li>I need to see which SQL queries are run during that time</li>
|
||||
<li>And only a few hours after I disabled the <code>removeAbandoned</code> thing CGSpace went down and lo and behold, there were 264 connections, most of which were idle:</li>
|
||||
@ -895,7 +895,7 @@ Nor Azwadi: 0000-0001-9634-1958
|
||||
$ psql -c 'select * from pg_stat_activity' | grep dspaceWeb | grep -c "idle in transaction"
|
||||
218
|
||||
</code></pre><ul>
|
||||
<li>So I'm re-enabling the <code>removeAbandoned</code> setting</li>
|
||||
<li>So I’m re-enabling the <code>removeAbandoned</code> setting</li>
|
||||
<li>I grabbed a snapshot of the active connections in <code>pg_stat_activity</code> for all queries running longer than 2 minutes:</li>
|
||||
</ul>
|
||||
<pre><code>dspace=# \copy (SELECT now() - query_start as "runtime", application_name, usename, datname, waiting, state, query
|
||||
@ -926,8 +926,8 @@ COPY 263
|
||||
</ul>
|
||||
<h2 id="2018-02-28">2018-02-28</h2>
|
||||
<ul>
|
||||
<li>CGSpace crashed today, the first HTTP 499 in nginx's access.log was around 09:12</li>
|
||||
<li>There's nothing interesting going on in nginx's logs around that time:</li>
|
||||
<li>CGSpace crashed today, the first HTTP 499 in nginx’s access.log was around 09:12</li>
|
||||
<li>There’s nothing interesting going on in nginx’s logs around that time:</li>
|
||||
</ul>
|
||||
<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "28/Feb/2018:09:" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||||
65 197.210.168.174
|
||||
@ -995,8 +995,8 @@ dspace.log.2018-02-28:1
|
||||
<li>According to the log 01D9932D6E85E90C2BA9FF5563A76D03 is an ILRI editor, doing lots of updating and editing of items</li>
|
||||
<li>8100883DAD00666A655AE8EC571C95AE is some Indian IP address</li>
|
||||
<li>1E9834E918A550C5CD480076BC1B73A4 looks to be a session shared by the bots</li>
|
||||
<li>So maybe it was due to the editor's uploading of files, perhaps something that was too big or?</li>
|
||||
<li>I think I'll increase the JVM heap size on CGSpace from 6144m to 8192m because I'm sick of this random crashing shit and the server has memory and I'd rather eliminate this so I can get back to solving PostgreSQL issues and doing other real work</li>
|
||||
<li>So maybe it was due to the editor’s uploading of files, perhaps something that was too big or?</li>
|
||||
<li>I think I’ll increase the JVM heap size on CGSpace from 6144m to 8192m because I’m sick of this random crashing shit and the server has memory and I’d rather eliminate this so I can get back to solving PostgreSQL issues and doing other real work</li>
|
||||
<li>Run the few corrections from earlier this month for sponsor on CGSpace:</li>
|
||||
</ul>
|
||||
<pre><code>cgspace=# update metadatavalue set text_value='United States Agency for International Development' where resource_type_id=2 and metadata_field_id=29 and text_value like '%U.S. Agency for International Development%';
|
||||
|