cgspace-notes/public/2018-02/index.html

<!DOCTYPE html>
<html lang="en">

  <head>
    <meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">

<meta property="og:title" content="February, 2018" />
<meta property="og:description" content="2018-02-01


Peter gave feedback on the dc.rights proof of concept that I had sent him last week
We don&rsquo;t need to distinguish between internal and external works, so that makes it just a simple list
Yesterday I figured out how to monitor DSpace sessions using JMX
I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu&rsquo;s munin-plugins-java package and used the stuff I discovered about JMX in 2018-01


" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2018-02/" />


<meta property="article:published_time" content="2018-02-01T16:28:54&#43;02:00"/>

<meta property="article:modified_time" content="2018-02-07T12:34:36&#43;02:00"/>


<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="February, 2018"/>
<meta name="twitter:description" content="2018-02-01


Peter gave feedback on the dc.rights proof of concept that I had sent him last week
We don&rsquo;t need to distinguish between internal and external works, so that makes it just a simple list
Yesterday I figured out how to monitor DSpace sessions using JMX
I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu&rsquo;s munin-plugins-java package and used the stuff I discovered about JMX in 2018-01


"/>
<meta name="generator" content="Hugo 0.36" />


<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "BlogPosting",
  "headline": "February, 2018",
  "url": "https://alanorth.github.io/cgspace-notes/2018-02/",
  "wordCount": "1334",
  "datePublished": "2018-02-01T16:28:54&#43;02:00",
  "dateModified": "2018-02-07T12:34:36&#43;02:00",
  "author": {
    "@type": "Person",
    "name": "Alan Orth"
  },
  "keywords": "Notes"
}
</script>


    <link rel="canonical" href="https://alanorth.github.io/cgspace-notes/2018-02/">

    <title>February, 2018 | CGSpace Notes</title>

    <!-- combined, minified CSS -->
    <link href="https://alanorth.github.io/cgspace-notes/css/style.css" rel="stylesheet" integrity="sha384-HjEPigLMLBzVQsUi6JWp9tmxJtBimdClDBxwZrwZR&#43;VE3s11/PtFYOrLClxIv2SG" crossorigin="anonymous">

    
  </head>

  <body>

    
    <div class="blog-masthead">
      <div class="container">
        <nav class="nav blog-nav">
          <a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
          
          
        </nav>
      </div>
    </div>
    

    <header class="blog-header">
      <div class="container">
        <h1 class="blog-title"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
        <p class="lead blog-description">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
      </div>
    </header>
    

    <div class="container">
      <div class="row">
        <div class="col-sm-8 blog-main">

          
<article class="blog-post">
  <header>
    <h2 class="blog-post-title"><a href="https://alanorth.github.io/cgspace-notes/2018-02/">February, 2018</a></h2>
    <p class="blog-post-meta"><time datetime="2018-02-01T16:28:54&#43;02:00">Thu Feb 01, 2018</time> by Alan Orth in 

<i class="fa fa-tag" aria-hidden="true"></i>&nbsp;<a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>

</p>
  </header>
  <h2 id="2018-02-01">2018-02-01</h2>

<ul>
<li>Peter gave feedback on the <code>dc.rights</code> proof of concept that I had sent him last week</li>
<li>We don&rsquo;t need to distinguish between internal and external works, so that makes it just a simple list</li>
<li>Yesterday I figured out how to monitor DSpace sessions using JMX</li>
<li>I copied the logic in the <code>jmx_tomcat_dbpools</code> provided by Ubuntu&rsquo;s <code>munin-plugins-java</code> package and used the stuff I discovered about JMX <a href="/cgspace-notes/2018-01/">in 2018-01</a></li>
</ul>

<p></p>

<p><img src="/cgspace-notes/2018/02/jmx_dspace_sessions-day.png" alt="DSpace Sessions" /></p>

<ul>
<li>Run all system updates and reboot DSpace Test</li>
<li>Wow, I packaged up the <code>jmx_dspace_sessions</code> stuff in the <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure scripts</a> and deployed it on CGSpace and it totally works:</li>
</ul>

<pre><code># munin-run jmx_dspace_sessions
v_.value 223
v_jspui.value 1
v_oai.value 0
</code></pre>

<h2 id="2018-02-03">2018-02-03</h2>

<ul>
<li>Bram from Atmire responded about the high load caused by the Solr updater script and said it will be fixed with the updates to DSpace 5.8 compatibility: <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=566">https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=566</a></li>
<li>We will close that ticket for now and wait for the 5.8 stuff: <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=560">https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=560</a></li>
<li>I finally took a look at the second round of cleanups Peter had sent me for author affiliations in mid January</li>
<li>After trimming whitespace and quickly scanning for encoding errors I applied them on CGSpace:</li>
</ul>

<pre><code>$ ./delete-metadata-values.py -i /tmp/2018-02-03-Affiliations-12-deletions.csv -f cg.contributor.affiliation -m 211 -d dspace -u dspace -p 'fuuu'
$ ./fix-metadata-values.py -i /tmp/2018-02-03-Affiliations-1116-corrections.csv -f cg.contributor.affiliation -t correct -m 211 -d dspace -u dspace -p 'fuuu'
</code></pre>

<ul>
<li>Then I started a full Discovery reindex:</li>
</ul>

<pre><code>$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b

real    96m39.823s
user    14m10.975s
sys     2m29.088s
</code></pre>

<ul>
<li>Generate a new list of affiliations for Peter to sort through:</li>
</ul>

<pre><code>dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'affiliation') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/affiliations.csv with csv;
COPY 3723
</code></pre>

<ul>
<li>Oh, and it looks like we processed over 3.1 million requests in January, up from 2.9 million in <a href="/cgspace-notes/2017-12/">December</a>:</li>
</ul>

<pre><code># time zcat --force /var/log/nginx/* | grep -cE &quot;[0-9]{1,2}/Jan/2018&quot;
3126109

real    0m23.839s
user    0m27.225s
sys     0m1.905s
</code></pre>

<h2 id="2018-02-05">2018-02-05</h2>

<ul>
<li>Toying with correcting authors with trailing spaces via PostgreSQL:</li>
</ul>

<pre><code>dspace=# update metadatavalue set text_value=REGEXP_REPLACE(text_value, '\s+$' , '') where resource_type_id=2 and metadata_field_id=3 and text_value ~ '^.*?\s+$';
UPDATE 20
</code></pre>

<ul>
<li>I tried the <code>TRIM(TRAILING from text_value)</code> function and it said it changed 20 items but the spaces didn&rsquo;t go away</li>
<li>This is on a fresh import of the CGSpace database, but when I tried to apply it on CGSpace there were no changes detected. Weird.</li>
<li>Anyways, Peter wants a new list of authors to clean up, so I exported another CSV:</li>
</ul>

<pre><code>dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors-2018-02-05.csv with csv;
COPY 55630
</code></pre>

<h2 id="2018-02-06">2018-02-06</h2>

<ul>
<li>UptimeRobot says CGSpace is down this morning around 9:15</li>
<li>I see 308 PostgreSQL connections in <code>pg_stat_activity</code></li>
<li>The usage otherwise seemed low for REST/OAI as well as XMLUI in the last hour:</li>
</ul>

<pre><code># date
Tue Feb  6 09:30:32 UTC 2018
# cat /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E &quot;6/Feb/2018:(08|09)&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
      2 223.185.41.40
      2 66.249.64.14
      2 77.246.52.40
      4 157.55.39.82
      4 193.205.105.8
      5 207.46.13.63
      5 207.46.13.64
      6 154.68.16.34
      7 207.46.13.66
   1548 50.116.102.77
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 /var/log/nginx/error.log /var/log/nginx/error.log.1 | grep -E &quot;6/Feb/2018:(08|09)&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
     77 213.55.99.121
     86 66.249.64.14
    101 104.196.152.243
    103 207.46.13.64
    118 157.55.39.82
    133 207.46.13.66
    136 207.46.13.63
    156 68.180.228.157
    295 197.210.168.174
    752 144.76.64.79
</code></pre>

<ul>
<li>I did notice in <code>/var/log/tomcat7/catalina.out</code> that Atmire&rsquo;s update thing was running though</li>
<li>So I restarted Tomcat and now everything is fine</li>
<li>Next time I see that many database connections I need to save the output so I can analyze it later</li>
<li>I&rsquo;m going to re-schedule the taskUpdateSolrStatsMetadata task as <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=566">Bram detailed in ticket 566</a> to see if it makes CGSpace stop crashing every morning</li>
<li>If I move the task from 3AM to 3PM, deally CGSpace will stop crashing in the morning, or start crashing ~12 hours later</li>
<li>Eventually Atmire has said that there will be a fix for this high load caused by their script, but it will come with the 5.8 compatability they are already working on</li>
<li>I re-deployed CGSpace with the new task time of 3PM, ran all system updates, and restarted the server</li>
<li>Also, I changed the name of the DSpace fallback pool on DSpace Test and CGSpace to be called &lsquo;dspaceCli&rsquo; so that I can distinguish it in <code>pg_stat_activity</code></li>
<li>I implemented some changes to the pooling in the <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure scripts</a> so that each DSpace web application can use its own pool (web, api, and solr)</li>
<li>Each pool uses its own name and hopefully this should help me figure out which one is using too many connections next time CGSpace goes down</li>
<li>Also, this will mean that when a search bot comes along and hammers the XMLUI, the REST and OAI applications will be fine</li>
<li>I&rsquo;m not actually sure if the Solr web application uses the database though, so I&rsquo;ll have to check later and remove it if necessary</li>
<li>I deployed the changes on DSpace Test only for now, so I will monitor and make them on CGSpace later this week</li>
</ul>

<h2 id="2018-02-07">2018-02-07</h2>

<ul>
<li>Abenet wrote to ask a question about the ORCiD lookup not working for one CIAT user on CGSpace</li>
<li>I tried on DSpace Test and indeed the lookup just doesn&rsquo;t work!</li>
<li>The ORCiD code in DSpace appears to be using <code>http://pub.orcid.org/</code>, but when I go there in the browser it redirects me to <code>https://pub.orcid.org/v2.0/</code></li>
<li>According to <a href="https://groups.google.com/forum/#!topic/orcid-api-users/qfg-HwAB1bk">the announcement</a> the v1 API was moved from <code>http://pub.orcid.org/</code> to <code>https://pub.orcid.org/v1.2</code> until March 1st when it will be discontinued for good</li>
<li>But the old URL is hard coded in DSpace and it doesn&rsquo;t work anyways, because it currently redirects you to <code>https://pub.orcid.org/v2.0/v1.2</code></li>
<li>So I guess we have to disable that shit once and for all and switch to a controlled vocabulary</li>
<li>CGSpace crashed again, this time around <code>Wed Feb  7 11:20:28 UTC 2018</code></li>
<li>I took a few snapshots of the PostgreSQL activity at the time and as the minutes went on and the connections were very high at first but reduced on their own:</li>
</ul>

<pre><code>$ psql -c 'select * from pg_stat_activity' &gt; /tmp/pg_stat_activity.txt
$ grep -c 'PostgreSQL JDBC' /tmp/pg_stat_activity*
/tmp/pg_stat_activity1.txt:300
/tmp/pg_stat_activity2.txt:272
/tmp/pg_stat_activity3.txt:168
/tmp/pg_stat_activity4.txt:5
/tmp/pg_stat_activity5.txt:6
</code></pre>

<ul>
<li>Interestingly, all of those 751 connections were idle!</li>
</ul>

<pre><code>$ grep &quot;PostgreSQL JDBC&quot; /tmp/pg_stat_activity* | grep -c idle
751
</code></pre>

<ul>
<li>Since I was restarting Tomcat anyways, I decided to deploy the changes to create two different pools for web and API apps</li>
<li>Looking the Munin graphs, I can see that there were almost double the normal number of DSpace sessions at the time of the crash (and also yesterday!):</li>
</ul>

<p><img src="/cgspace-notes/2018/02/jmx_dspace-sessions-day.png" alt="DSpace Sessions" /></p>

<ul>
<li>Indeed it seems like there were over 1800 sessions today around the hours of 10 and 11 AM:</li>
</ul>

<pre><code>$ grep -E '^2018-02-07 (10|11)' dspace.log.2018-02-07 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
1828
</code></pre>

<ul>
<li>CGSpace went down again a few hours later, and now the connections to the dspaceWeb pool are maxed at 250 (the new limit I imposed with the new separate pool scheme)</li>
<li>What&rsquo;s interesting is that the DSpace log says the connections are all busy:</li>
</ul>

<pre><code>org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-328] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:250; busy:250; idle:0; lastwait:5000].
</code></pre>

<ul>
<li>&hellip; but in PostgreSQL I see them <code>idle</code> or <code>idle in transaction</code>:</li>
</ul>

<pre><code>$ psql -c 'select * from pg_stat_activity' | grep -c dspaceWeb
250
$ psql -c 'select * from pg_stat_activity' | grep dspaceWeb | grep -c idle
250
$ psql -c 'select * from pg_stat_activity' | grep dspaceWeb | grep -c &quot;idle in transaction&quot;
187
</code></pre>

<ul>
<li>What the fuck, does DSpace think all connections are busy?</li>
<li>I suspect these are issues with abandoned connections or maybe a leak, so I&rsquo;m going to try adding the <code>removeAbandoned='true'</code> parameter which is apparently off by default</li>
<li>I will try <code>testOnReturn='true'</code> too, just to add more validation, because I&rsquo;m fucking grasping at straws</li>
</ul>

  
</article> 


        </div> <!-- /.blog-main -->

        <aside class="col-sm-3 ml-auto blog-sidebar">
  

        <section class="sidebar-module">
    <h4>Recent Posts</h4>
    <ol class="list-unstyled">


<li><a href="/cgspace-notes/2018-02/">February, 2018</a></li>

<li><a href="/cgspace-notes/2018-01/">January, 2018</a></li>

<li><a href="/cgspace-notes/2017-12/">December, 2017</a></li>

<li><a href="/cgspace-notes/2017-11/">November, 2017</a></li>

<li><a href="/cgspace-notes/2017-10/">October, 2017</a></li>

    </ol>
  </section>

  
  <section class="sidebar-module">
    <h4>Links</h4>
    <ol class="list-unstyled">
      
      <li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
      
      <li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
      
      <li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
      
    </ol>
  </section>
  
</aside>


      </div> <!-- /.row -->
    </div> <!-- /.container -->
    

    <footer class="blog-footer">
      <p>
      
      Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
      
      </p>
      <p>
      <a href="#">Back to top</a>
      </p>
    </footer>
    

  </body>

</html>
Add notes for 2018-02-01 2018-02-01 16:30:28 +02:00			`<!DOCTYPE html>`
			`<html lang="en">`

			`<head>`
			`<meta charset="utf-8">`
			`<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">`

			`<meta property="og:title" content="February, 2018" />`
			`<meta property="og:description" content="2018-02-01`


			`Peter gave feedback on the dc.rights proof of concept that I had sent him last week`
			`We don’t need to distinguish between internal and external works, so that makes it just a simple list`
Update notes 2018-02-01 19:04:07 +02:00			`Yesterday I figured out how to monitor DSpace sessions using JMX`
			`I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu’s munin-plugins-java package and used the stuff I discovered about JMX in 2018-01`
Add notes for 2018-02-01 2018-02-01 16:30:28 +02:00

			`" />`
			`<meta property="og:type" content="article" />`
			`<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2018-02/" />`



			`<meta property="article:published_time" content="2018-02-01T16:28:54+02:00"/>`

Update notes for 2018-02-07 2018-02-07 18:33:26 +02:00			`<meta property="article:modified_time" content="2018-02-07T12:34:36+02:00"/>`
Add notes for 2018-02-01 2018-02-01 16:30:28 +02:00










			`<meta name="twitter:card" content="summary"/>`
			`<meta name="twitter:title" content="February, 2018"/>`
			`<meta name="twitter:description" content="2018-02-01`


			`Peter gave feedback on the dc.rights proof of concept that I had sent him last week`
			`We don’t need to distinguish between internal and external works, so that makes it just a simple list`
Update notes 2018-02-01 19:04:07 +02:00			`Yesterday I figured out how to monitor DSpace sessions using JMX`
			`I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu’s munin-plugins-java package and used the stuff I discovered about JMX in 2018-01`
Add notes for 2018-02-01 2018-02-01 16:30:28 +02:00

			`"/>`
Add notes for 2018-02-07 2018-02-07 12:34:36 +02:00			`<meta name="generator" content="Hugo 0.36" />`
Add notes for 2018-02-01 2018-02-01 16:30:28 +02:00


			`<script type="application/ld+json">`
			`{`
			`"@context": "http://schema.org",`
			`"@type": "BlogPosting",`
			`"headline": "February, 2018",`
			`"url": "https://alanorth.github.io/cgspace-notes/2018-02/",`
Update notes for 2018-02-07 2018-02-07 18:33:26 +02:00			`"wordCount": "1334",`
Add notes for 2018-02-01 2018-02-01 16:30:28 +02:00			`"datePublished": "2018-02-01T16:28:54+02:00",`
Update notes for 2018-02-07 2018-02-07 18:33:26 +02:00			`"dateModified": "2018-02-07T12:34:36+02:00",`
Add notes for 2018-02-01 2018-02-01 16:30:28 +02:00			`"author": {`
			`"@type": "Person",`
			`"name": "Alan Orth"`
			`},`
			`"keywords": "Notes"`
			`}`
			`</script>`



			`<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/2018-02/">`

			`<title>February, 2018 \| CGSpace Notes</title>`

			`<!-- combined, minified CSS -->`
			`<link href="https://alanorth.github.io/cgspace-notes/css/style.css" rel="stylesheet" integrity="sha384-HjEPigLMLBzVQsUi6JWp9tmxJtBimdClDBxwZrwZR+VE3s11/PtFYOrLClxIv2SG" crossorigin="anonymous">`









			`</head>`

			`<body>`


			`<div class="blog-masthead">`
			`<div class="container">`
			`<nav class="nav blog-nav">`
			`<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>`


			`</nav>`
			`</div>`
			`</div>`



			`<header class="blog-header">`
			`<div class="container">`
			`<h1 class="blog-title"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>`
			`<p class="lead blog-description">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>`
			`</div>`
			`</header>`



			`<div class="container">`
			`<div class="row">`
			`<div class="col-sm-8 blog-main">`




			`<article class="blog-post">`
			`<header>`
			`<h2 class="blog-post-title"><a href="https://alanorth.github.io/cgspace-notes/2018-02/">February, 2018</a></h2>`
			`<p class="blog-post-meta"><time datetime="2018-02-01T16:28:54+02:00">Thu Feb 01, 2018</time> by Alan Orth in`

			`<i class="fa fa-tag" aria-hidden="true"></i> <a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>`

			`</p>`
			`</header>`
			`<h2 id="2018-02-01">2018-02-01</h2>`

			`<ul>`
			`<li>Peter gave feedback on the <code>dc.rights</code> proof of concept that I had sent him last week</li>`
			`<li>We don’t need to distinguish between internal and external works, so that makes it just a simple list</li>`
Update notes 2018-02-01 19:04:07 +02:00			`<li>Yesterday I figured out how to monitor DSpace sessions using JMX</li>`
			`<li>I copied the logic in the <code>jmx_tomcat_dbpools</code> provided by Ubuntu’s <code>munin-plugins-java</code> package and used the stuff I discovered about JMX <a href="/cgspace-notes/2018-01/">in 2018-01</a></li>`
Add notes for 2018-02-01 2018-02-01 16:30:28 +02:00			`</ul>`

			`<p></p>`

Update notes 2018-02-01 19:04:07 +02:00			`<p><img src="/cgspace-notes/2018/02/jmx_dspace_sessions-day.png" alt="DSpace Sessions" /></p>`

			`<ul>`
			`<li>Run all system updates and reboot DSpace Test</li>`
			`<li>Wow, I packaged up the <code>jmx_dspace_sessions</code> stuff in the <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure scripts</a> and deployed it on CGSpace and it totally works:</li>`
			`</ul>`

			`<pre><code># munin-run jmx_dspace_sessions`
			`v_.value 223`
			`v_jspui.value 1`
			`v_oai.value 0`
			`</code></pre>`

Updates 2018-02-03 09:44:31 +02:00			`<h2 id="2018-02-03">2018-02-03</h2>`

			`<ul>`
			`<li>Bram from Atmire responded about the high load caused by the Solr updater script and said it will be fixed with the updates to DSpace 5.8 compatibility: <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=566">https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=566</a></li>`
			`<li>We will close that ticket for now and wait for the 5.8 stuff: <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=560">https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=560</a></li>`
Update notes 2018-02-04 00:00:51 +02:00			`<li>I finally took a look at the second round of cleanups Peter had sent me for author affiliations in mid January</li>`
			`<li>After trimming whitespace and quickly scanning for encoding errors I applied them on CGSpace:</li>`
Updates 2018-02-03 09:44:31 +02:00			`</ul>`

Update notes 2018-02-04 00:00:51 +02:00			`<pre><code>$ ./delete-metadata-values.py -i /tmp/2018-02-03-Affiliations-12-deletions.csv -f cg.contributor.affiliation -m 211 -d dspace -u dspace -p 'fuuu'`
			`$ ./fix-metadata-values.py -i /tmp/2018-02-03-Affiliations-1116-corrections.csv -f cg.contributor.affiliation -t correct -m 211 -d dspace -u dspace -p 'fuuu'`
			`</code></pre>`

			`<ul>`
			`<li>Then I started a full Discovery reindex:</li>`
			`</ul>`

			`<pre><code>$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b`
Update notes 2018-02-04 11:26:07 +02:00
			`real 96m39.823s`
			`user 14m10.975s`
			`sys 2m29.088s`
Update notes 2018-02-04 00:00:51 +02:00			`</code></pre>`

			`<ul>`
			`<li>Generate a new list of affiliations for Peter to sort through:</li>`
			`</ul>`

			`<pre><code>dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'affiliation') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/affiliations.csv with csv;`
			`COPY 3723`
			`</code></pre>`

			`<ul>`
			`<li>Oh, and it looks like we processed over 3.1 million requests in January, up from 2.9 million in <a href="/cgspace-notes/2017-12/">December</a>:</li>`
			`</ul>`

			`<pre><code># time zcat --force /var/log/nginx/* \| grep -cE "[0-9]{1,2}/Jan/2018"`
			`3126109`

			`real 0m23.839s`
			`user 0m27.225s`
			`sys 0m1.905s`
			`</code></pre>`

Update notes 2018-02-05 19:08:05 +02:00			`<h2 id="2018-02-05">2018-02-05</h2>`

			`<ul>`
			`<li>Toying with correcting authors with trailing spaces via PostgreSQL:</li>`
			`</ul>`

			`<pre><code>dspace=# update metadatavalue set text_value=REGEXP_REPLACE(text_value, '\s+$' , '') where resource_type_id=2 and metadata_field_id=3 and text_value ~ '^.*?\s+$';`
			`UPDATE 20`
			`</code></pre>`

			`<ul>`
			`<li>I tried the <code>TRIM(TRAILING from text_value)</code> function and it said it changed 20 items but the spaces didn’t go away</li>`
			`<li>This is on a fresh import of the CGSpace database, but when I tried to apply it on CGSpace there were no changes detected. Weird.</li>`
			`<li>Anyways, Peter wants a new list of authors to clean up, so I exported another CSV:</li>`
			`</ul>`

			`<pre><code>dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors-2018-02-05.csv with csv;`
			`COPY 55630`
			`</code></pre>`

Add notes for 2018-02-06 2018-02-06 14:03:07 +02:00			`<h2 id="2018-02-06">2018-02-06</h2>`

			`<ul>`
			`<li>UptimeRobot says CGSpace is down this morning around 9:15</li>`
			`<li>I see 308 PostgreSQL connections in <code>pg_stat_activity</code></li>`
			`<li>The usage otherwise seemed low for REST/OAI as well as XMLUI in the last hour:</li>`
			`</ul>`

			`<pre><code># date`
			`Tue Feb 6 09:30:32 UTC 2018`
			`# cat /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 \| grep -E "6/Feb/2018:(08\|09)" \| awk '{print $1}' \| sort \| uniq -c \| sort -n \| tail -n 10`
			`2 223.185.41.40`
			`2 66.249.64.14`
			`2 77.246.52.40`
			`4 157.55.39.82`
			`4 193.205.105.8`
			`5 207.46.13.63`
			`5 207.46.13.64`
			`6 154.68.16.34`
			`7 207.46.13.66`
			`1548 50.116.102.77`
			`# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 /var/log/nginx/error.log /var/log/nginx/error.log.1 \| grep -E "6/Feb/2018:(08\|09)" \| awk '{print $1}' \| sort \| uniq -c \| sort -n \| tail -n 10`
			`77 213.55.99.121`
			`86 66.249.64.14`
			`101 104.196.152.243`
			`103 207.46.13.64`
			`118 157.55.39.82`
			`133 207.46.13.66`
			`136 207.46.13.63`
			`156 68.180.228.157`
			`295 197.210.168.174`
			`752 144.76.64.79`
			`</code></pre>`

			`<ul>`
			`<li>I did notice in <code>/var/log/tomcat7/catalina.out</code> that Atmire’s update thing was running though</li>`
			`<li>So I restarted Tomcat and now everything is fine</li>`
			`<li>Next time I see that many database connections I need to save the output so I can analyze it later</li>`
			`<li>I’m going to re-schedule the taskUpdateSolrStatsMetadata task as <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=566">Bram detailed in ticket 566</a> to see if it makes CGSpace stop crashing every morning</li>`
			`<li>If I move the task from 3AM to 3PM, deally CGSpace will stop crashing in the morning, or start crashing ~12 hours later</li>`
			`<li>Eventually Atmire has said that there will be a fix for this high load caused by their script, but it will come with the 5.8 compatability they are already working on</li>`
			`<li>I re-deployed CGSpace with the new task time of 3PM, ran all system updates, and restarted the server</li>`
			`<li>Also, I changed the name of the DSpace fallback pool on DSpace Test and CGSpace to be called ‘dspaceCli’ so that I can distinguish it in <code>pg_stat_activity</code></li>`
			`<li>I implemented some changes to the pooling in the <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure scripts</a> so that each DSpace web application can use its own pool (web, api, and solr)</li>`
			`<li>Each pool uses its own name and hopefully this should help me figure out which one is using too many connections next time CGSpace goes down</li>`
			`<li>Also, this will mean that when a search bot comes along and hammers the XMLUI, the REST and OAI applications will be fine</li>`
			`<li>I’m not actually sure if the Solr web application uses the database though, so I’ll have to check later and remove it if necessary</li>`
Update notes for 2018-02-06 2018-02-06 14:03:46 +02:00			`<li>I deployed the changes on DSpace Test only for now, so I will monitor and make them on CGSpace later this week</li>`
Add notes for 2018-02-06 2018-02-06 14:03:07 +02:00			`</ul>`

Add notes for 2018-02-07 2018-02-07 12:34:36 +02:00			`<h2 id="2018-02-07">2018-02-07</h2>`

			`<ul>`
			`<li>Abenet wrote to ask a question about the ORCiD lookup not working for one CIAT user on CGSpace</li>`
			`<li>I tried on DSpace Test and indeed the lookup just doesn’t work!</li>`
			`<li>The ORCiD code in DSpace appears to be using <code>http://pub.orcid.org/</code>, but when I go there in the browser it redirects me to <code>https://pub.orcid.org/v2.0/</code></li>`
			`<li>According to <a href="https://groups.google.com/forum/#!topic/orcid-api-users/qfg-HwAB1bk">the announcement</a> the v1 API was moved from <code>http://pub.orcid.org/</code> to <code>https://pub.orcid.org/v1.2</code> until March 1st when it will be discontinued for good</li>`
			`<li>But the old URL is hard coded in DSpace and it doesn’t work anyways, because it currently redirects you to <code>https://pub.orcid.org/v2.0/v1.2</code></li>`
			`<li>So I guess we have to disable that shit once and for all and switch to a controlled vocabulary</li>`
Update notes for 2018-02-07 2018-02-07 18:33:26 +02:00			`<li>CGSpace crashed again, this time around <code>Wed Feb 7 11:20:28 UTC 2018</code></li>`
			`<li>I took a few snapshots of the PostgreSQL activity at the time and as the minutes went on and the connections were very high at first but reduced on their own:</li>`
			`</ul>`

			`<pre><code>$ psql -c 'select * from pg_stat_activity' > /tmp/pg_stat_activity.txt`
			`$ grep -c 'PostgreSQL JDBC' /tmp/pg_stat_activity*`
			`/tmp/pg_stat_activity1.txt:300`
			`/tmp/pg_stat_activity2.txt:272`
			`/tmp/pg_stat_activity3.txt:168`
			`/tmp/pg_stat_activity4.txt:5`
			`/tmp/pg_stat_activity5.txt:6`
			`</code></pre>`

			`<ul>`
			`<li>Interestingly, all of those 751 connections were idle!</li>`
			`</ul>`

			`<pre><code>$ grep "PostgreSQL JDBC" /tmp/pg_stat_activity* \| grep -c idle`
			`751`
			`</code></pre>`

			`<ul>`
			`<li>Since I was restarting Tomcat anyways, I decided to deploy the changes to create two different pools for web and API apps</li>`
			`<li>Looking the Munin graphs, I can see that there were almost double the normal number of DSpace sessions at the time of the crash (and also yesterday!):</li>`
			`</ul>`

			`<p><img src="/cgspace-notes/2018/02/jmx_dspace-sessions-day.png" alt="DSpace Sessions" /></p>`

			`<ul>`
			`<li>Indeed it seems like there were over 1800 sessions today around the hours of 10 and 11 AM:</li>`
			`</ul>`

			`<pre><code>$ grep -E '^2018-02-07 (10\|11)' dspace.log.2018-02-07 \| grep -o -E 'session_id=[A-Z0-9]{32}' \| sort -n \| uniq \| wc -l`
			`1828`
			`</code></pre>`

			`<ul>`
			`<li>CGSpace went down again a few hours later, and now the connections to the dspaceWeb pool are maxed at 250 (the new limit I imposed with the new separate pool scheme)</li>`
			`<li>What’s interesting is that the DSpace log says the connections are all busy:</li>`
			`</ul>`

			`<pre><code>org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-328] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:250; busy:250; idle:0; lastwait:5000].`
			`</code></pre>`

			`<ul>`
			`<li>… but in PostgreSQL I see them <code>idle</code> or <code>idle in transaction</code>:</li>`
			`</ul>`

			`<pre><code>$ psql -c 'select * from pg_stat_activity' \| grep -c dspaceWeb`
			`250`
			`$ psql -c 'select * from pg_stat_activity' \| grep dspaceWeb \| grep -c idle`
			`250`
			`$ psql -c 'select * from pg_stat_activity' \| grep dspaceWeb \| grep -c "idle in transaction"`
			`187`
			`</code></pre>`

			`<ul>`
			`<li>What the fuck, does DSpace think all connections are busy?</li>`
			`<li>I suspect these are issues with abandoned connections or maybe a leak, so I’m going to try adding the <code>removeAbandoned='true'</code> parameter which is apparently off by default</li>`
			`<li>I will try <code>testOnReturn='true'</code> too, just to add more validation, because I’m fucking grasping at straws</li>`
Add notes for 2018-02-07 2018-02-07 12:34:36 +02:00			`</ul>`

Add notes for 2018-02-01 2018-02-01 16:30:28 +02:00



			`</article>`



			`</div> <!-- /.blog-main -->`

			`<aside class="col-sm-3 ml-auto blog-sidebar">`



			`<section class="sidebar-module">`
			`<h4>Recent Posts</h4>`
			`<ol class="list-unstyled">`


			`<li><a href="/cgspace-notes/2018-02/">February, 2018</a></li>`

			`<li><a href="/cgspace-notes/2018-01/">January, 2018</a></li>`

			`<li><a href="/cgspace-notes/2017-12/">December, 2017</a></li>`

			`<li><a href="/cgspace-notes/2017-11/">November, 2017</a></li>`

			`<li><a href="/cgspace-notes/2017-10/">October, 2017</a></li>`

			`</ol>`
			`</section>`




			`<section class="sidebar-module">`
			`<h4>Links</h4>`
			`<ol class="list-unstyled">`

			`<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>`

			`<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>`

			`<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>`

			`</ol>`
			`</section>`

			`</aside>`


			`</div> <!-- /.row -->`
			`</div> <!-- /.container -->`



			`<footer class="blog-footer">`
			`<p>`

			`Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.`

			`</p>`
			`<p>`
			`<a href="#">Back to top</a>`
			`</p>`
			`</footer>`


			`</body>`

			`</html>`