<!DOCTYPE html>
<html lang="en-us">
<head prefix="og: http://ogp.me/ns#">
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1" />
  <meta property="og:title" content=" December, 2015 &middot;  CGSpace Notes" />
  
  <meta property="og:site_name" content="CGSpace Notes" />
  <meta property="og:url" content="/cgspace-notes/2015-12/" />
  
  
  <meta property="og:type" content="article" />
  
  <meta property="og:article:published_time" content="2015-12-02T13:18:00&#43;03:00" />
  
  <meta property="og:article:tag" content="notes" />
  
  

  <title>
     December, 2015 &middot;  CGSpace Notes
  </title>

  <link rel="stylesheet" href="/cgspace-notes/css/bootstrap.min.css" />
  <link rel="stylesheet" href="/cgspace-notes/css/main.css" />
  <link rel="stylesheet" href="/cgspace-notes/css/font-awesome.min.css" />
  <link rel="stylesheet" href="/cgspace-notes/css/github.css" />
  <link rel="stylesheet" href="//fonts.googleapis.com/css?family=Source+Sans+Pro:200,300,400" type="text/css">
  <link rel="shortcut icon" href="/cgspace-notes/images/favicon.ico" />
  <link rel="apple-touch-icon" href="/cgspace-notes/images/apple-touch-icon.png" />
  
</head>
<body>
    <header class="global-header"  style="background-image:url(../images/bg.jpg )">
    <section class="header-text">
      <h1><a href="/cgspace-notes/">CGSpace Notes</a></h1>
      
      <div class="sns-links hidden-print">
  
  
  
  
  
  
  
  
  
</div>

      
      <a href="/cgspace-notes/" class="btn-header btn-back hidden-xs">
        <i class="fa fa-angle-left" aria-hidden="true"></i>
        &nbsp;Home
      </a>
      
      
    </section>
  </header>
  <main class="container">


<article>
  <header>
    <h1 class="text-primary">December, 2015</h1>
    <div class="post-meta clearfix">
      <div class="post-date pull-left">
        Posted on
        <time datetime="2015-12-02T13:18:00&#43;03:00">
          Dec 2, 2015
        </time>
      </div>
      <div class="pull-right">
        
        <span class="post-tag small"><a href="/cgspace-notes//tags/notes">#notes</a></span>
        
      </div>
    </div>
  </header>
  <section>
    

<h2 id="2015-12-02:012a628feed6d64ae1151cbd6151ccd6">2015-12-02</h2>

<ul>
<li>Replace <code>lzop</code> with <code>xz</code> in log compression cron jobs on DSpace Test—it uses less space:</li>
</ul>

<pre><code># cd /home/dspacetest.cgiar.org/log
# ls -lh dspace.log.2015-11-18*
-rw-rw-r-- 1 tomcat7 tomcat7 2.0M Nov 18 23:59 dspace.log.2015-11-18
-rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo
-rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz
</code></pre>

<ul>
<li>I had used lrzip once, but it needs more memory and is harder to use as it requires the lrztar wrapper</li>
<li>Need to remember to go check if everything is ok in a few days and then change CGSpace</li>
<li>CGSpace went down again (due to PostgreSQL idle connections of course)</li>
<li>Current database settings for DSpace are <code>db.maxconnections = 30</code> and <code>db.maxidle = 8</code>, yet idle connections are exceeding this:</li>
</ul>

<pre><code>$ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
39
</code></pre>

<ul>
<li>I restarted PostgreSQL and Tomcat and it&rsquo;s back</li>
<li>On a related note of why CGSpace is so slow, I decided to finally try the <code>pgtune</code> script to tune the postgres settings:</li>
</ul>

<pre><code># apt-get install pgtune
# pgtune -i /etc/postgresql/9.3/main/postgresql.conf -o postgresql.conf-pgtune
# mv /etc/postgresql/9.3/main/postgresql.conf /etc/postgresql/9.3/main/postgresql.conf.orig 
# mv postgresql.conf-pgtune /etc/postgresql/9.3/main/postgresql.conf
</code></pre>

<ul>
<li>It introduced the following new settings:</li>
</ul>

<pre><code>default_statistics_target = 50
maintenance_work_mem = 480MB
constraint_exclusion = on
checkpoint_completion_target = 0.9
effective_cache_size = 5632MB
work_mem = 48MB
wal_buffers = 8MB
checkpoint_segments = 16
shared_buffers = 1920MB
max_connections = 80
</code></pre>

<ul>
<li>Now I need to go read PostgreSQL docs about these options, and watch memory settings in munin etc</li>
<li>For what it&rsquo;s worth, now the REST API should be faster (because of these PostgreSQL tweaks):</li>
</ul>

<pre><code>$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.474
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
2.141
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.685
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.995
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.786
</code></pre>

<ul>
<li>Last week it was an average of 8 seconds&hellip; now this is <sup>1</sup>&frasl;<sub>4</sub> of that</li>
<li>CCAFS noticed that one of their items displays only the Atmire statlets: <a href="https://cgspace.cgiar.org/handle/10568/42445">https://cgspace.cgiar.org/handle/10568/42445</a></li>
</ul>

<p><img src="../images/2015/12/ccafs-item-no-metadata.png" alt="CCAFS item" /></p>

<ul>
<li>The authorizations for the item are all public READ, and I don&rsquo;t see any errors in dspace.log when browsing that item</li>
<li>I filed a ticket on Atmire&rsquo;s issue tracker</li>
<li>I also filed a ticket on Atmire&rsquo;s issue tracker for the PostgreSQL stuff</li>
</ul>

<h2 id="2015-12-03:012a628feed6d64ae1151cbd6151ccd6">2015-12-03</h2>

<ul>
<li>CGSpace very slow, and monitoring emailing me to say its down, even though I can load the page (very slowly)</li>
<li>Idle postgres connections look like this (with no change in DSpace db settings lately):</li>
</ul>

<pre><code>$ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
29
</code></pre>

<ul>
<li>I restarted Tomcat and postgres&hellip;</li>
<li>Atmire commented that we should raise the JVM heap size by ~500M, so it is now <code>-Xms3584m -Xmx3584m</code></li>
<li>We weren&rsquo;t out of heap yet, but it&rsquo;s probably fair enough that the DSpace 5 upgrade (and new Atmire modules) requires more memory so it&rsquo;s ok</li>
<li>A possible side effect is that I see that the REST API is twice as fast for the request above now:</li>
</ul>

<pre><code>$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.368
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.968
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.006
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.849
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.806
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.854
</code></pre>

<h2 id="2015-12-05:012a628feed6d64ae1151cbd6151ccd6">2015-12-05</h2>

<ul>
<li>CGSpace has been up and down all day and REST API is completely unresponsive</li>
<li>PostgreSQL idle connections are currently:</li>
</ul>

<pre><code>postgres@linode01:~$ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
28
</code></pre>

<ul>
<li>I have reverted all the pgtune tweaks from the other day, as they didn&rsquo;t fix the stability issues, so I&rsquo;d rather not have them introducing more variables into the equation</li>
<li>The PostgreSQL stats from Munin all point to something database-related with the DSpace 5 upgrade around mid–late November</li>
</ul>

<p><img src="../images/2015/12/postgres_bgwriter-year.png" alt="PostgreSQL bgwriter (year)" />
<img src="../images/2015/12/postgres_cache_cgspace-year.png" alt="PostgreSQL cache (year)" />
<img src="../images/2015/12/postgres_locks_cgspace-year.png" alt="PostgreSQL locks (year)" />
<img src="../images/2015/12/postgres_scans_cgspace-year.png" alt="PostgreSQL scans (year)" /></p>

<h2 id="2015-12-07:012a628feed6d64ae1151cbd6151ccd6">2015-12-07</h2>

<ul>
<li>Atmire sent <a href="https://github.com/ilri/DSpace/pull/161">some fixes</a> to DSpace&rsquo;s REST API code that was leaving contexts open (causing the slow performance and database issues)</li>
<li>After deploying the fix to CGSpace the REST API is consistently faster:</li>
</ul>

<pre><code>$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.675
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.599
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.588
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.566
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.497
</code></pre>

<h2 id="2015-12-08:012a628feed6d64ae1151cbd6151ccd6">2015-12-08</h2>

<ul>
<li>Switch CGSpace log compression cron jobs from using lzop to xz—the compression isn&rsquo;t as good, but it&rsquo;s much faster and causes less IO/CPU load</li>
<li>Since we figured out (and fixed) the cause of the performance issue, I reverted Google Bot&rsquo;s crawl rate to the &ldquo;Let Google optimize&rdquo; setting</li>
</ul>

  </section>
  <footer>
    
    <section class="author-info row">
      <div class="author-avatar col-md-2">
        
      </div>
      <div class="author-meta col-md-6">
        
        <h1 class="author-name text-primary">Alan Orth</h1>
        
        
      </div>
      
    </section>
    <ul class="pager">
      
      <li class="previous"><a href="/cgspace-notes/2015-11/"><span aria-hidden="true">&larr;</span> Older</a></li>
      
      
      <li class="next"><a href="/cgspace-notes/2016-01/">Newer <span aria-hidden="true">&rarr;</span></a></li>
      
    </ul>
  </footer>
</article>

  </main>
  <footer class="container global-footer">
    <div class="copyright-note pull-left">
      
    </div>
    <div class="sns-links hidden-print">
  
  
  
  
  
  
  
  
  
</div>

  </footer>

  <script src="/cgspace-notes/js/highlight.pack.js"></script>
  <script>
    hljs.initHighlightingOnLoad();
  </script>
  
  
</body>
</html>