<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>CGSpace Notes</title>
<link>/cgspace-notes/</link>
<description>Recent content on CGSpace Notes</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<lastBuildDate>Fri, 05 Feb 2016 13:18:00 +0300</lastBuildDate>
<atom:link href="/cgspace-notes/index.xml" rel="self" type="application/rss+xml" />

<item>
<title>February, 2016</title>
<link>/cgspace-notes/2016-02/</link>
<pubDate>Fri, 05 Feb 2016 13:18:00 +0300</pubDate>
<guid>/cgspace-notes/2016-02/</guid>
<description>
<h2 id="2016-02-05:124a59adbaa8ef13e1518d003fc03981">2016-02-05</h2>
|
||
|
||
<ul>
|
||
<li>Looking at some DAGRIS data for Abenet Yabowork</li>
|
||
<li>Lots of issues with spaces, newlines, etc causing the import to fail</li>
|
||
<li>I noticed we have a very <em>interesting</em> list of countries on CGSpace:</li>
|
||
</ul>
|
||
|
||
<p><img src="../images/2016/02/cgspace-countries.png" alt="CGSpace country list" /></p>
|
||
|
||
<ul>
|
||
<li>Not only are there 49,000 countries, we have some blanks (25)&hellip;</li>
|
||
<li>Also, lots of things like &ldquo;COTE D`LVOIRE&rdquo; and &ldquo;COTE D IVOIRE&rdquo;</li>
|
||
</ul>
|
||
|
||
<h2 id="2016-02-06:124a59adbaa8ef13e1518d003fc03981">2016-02-06</h2>
|
||
|
||
<ul>
|
||
<li>Found a way to get items with null/empty metadata values from SQL</li>
|
||
<li>First, find the <code>metadata_field_id</code> for the field you want from the <code>metadatafieldregistry</code> table:</li>
|
||
</ul>
|
||
|
||
<pre><code>dspacetest=# select * from metadatafieldregistry;
|
||
</code></pre>
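
<ul>
<li>Rather than eyeballing the whole registry, you can also filter it; here I&rsquo;m assuming the field&rsquo;s qualifier is simply <code>country</code> (an assumption to verify against the registry):</li>
</ul>

<pre><code>dspacetest=# select metadata_field_id, element, qualifier from metadatafieldregistry where qualifier='country';
</code></pre>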

<ul>
<li>In this case our country field is 78</li>
<li>Now find all resources with type 2 (item) that have null/empty values for that field:</li>
</ul>

<pre><code>dspacetest=# select resource_id from metadatavalue where resource_type_id=2 and metadata_field_id=78 and (text_value='' OR text_value IS NULL);
</code></pre>

<ul>
<li>Then you can find the handle that owns it from its <code>resource_id</code>:</li>
</ul>

<pre><code>dspacetest=# select handle from item, handle where handle.resource_id = item.item_id AND item.item_id = '22678';
</code></pre>

<ul>
<li>It&rsquo;s 25 items, so editing in the web UI is annoying; let&rsquo;s try SQL!</li>
</ul>
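
<ul>
<li>A quick dry run first, to confirm the delete will touch exactly those 25 rows:</li>
</ul>

<pre><code>dspacetest=# select count(*) from metadatavalue where metadata_field_id=78 and text_value='';
</code></pre>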

<pre><code>dspacetest=# delete from metadatavalue where metadata_field_id=78 and text_value='';
DELETE 25
</code></pre>

<ul>
<li>After that perhaps a regular <code>dspace index-discovery</code> (no -b) <em>should</em> suffice&hellip;</li>
<li>Hmm, I indexed, cleared the Cocoon cache, and restarted Tomcat, but the 25 &ldquo;|||&rdquo; countries are still there</li>
<li>Maybe I need to do a full re-index&hellip;</li>
<li>Yep! The full re-index (shown below) seems to work.</li>
<li>Process the empty countries on CGSpace</li>
</ul>
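
<ul>
<li>For reference, the full re-index is the same command with the <code>-b</code> flag (path as on my local setup):</li>
</ul>

<pre><code>$ ~/dspace/bin/dspace index-discovery -b
</code></pre>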

<h2 id="2016-02-07:124a59adbaa8ef13e1518d003fc03981">2016-02-07</h2>

<ul>
<li>Working on cleaning up Abenet&rsquo;s DAGRIS data with OpenRefine</li>
<li>I discovered two really nice functions in OpenRefine: <code>value.trim()</code> and <code>value.escape(&quot;javascript&quot;)</code>, which shows whitespace characters like <code>\r\n</code>!</li>
<li>For some reason, when you import an Excel file into OpenRefine, it exports dates like 1949 as 1949.0 in the CSV</li>
<li>I re-import the resulting CSV and run a GREL transform on the date issued column: <code>value.replace(&quot;\.0&quot;, &quot;&quot;)</code></li>
<li>I need to start running DSpace in Mac OS X instead of a Linux VM</li>
<li>Install PostgreSQL from Homebrew, then configure and import a CGSpace database dump:</li>
</ul>

<pre><code>$ postgres -D /opt/brew/var/postgres
$ createuser --superuser postgres
$ createuser --pwprompt dspacetest
$ createdb -O dspacetest --encoding=UNICODE dspacetest
$ psql postgres
postgres=# alter user dspacetest createuser;
postgres=# \q
$ pg_restore -O -U dspacetest -d dspacetest ~/Downloads/cgspace_2016-02-07.backup
$ psql postgres
postgres=# alter user dspacetest nocreateuser;
postgres=# \q
$ vacuumdb dspacetest
$ psql -U dspacetest -f ~/src/git/DSpace/dspace/etc/postgres/update-sequences.sql dspacetest -h localhost
</code></pre>
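
<ul>
<li>A quick sanity check that the restore worked, counting rows in a table we know exists:</li>
</ul>

<pre><code>$ psql -U dspacetest -h localhost dspacetest -c 'select count(*) from item;'
</code></pre>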

<ul>
<li>After building and running a <code>fresh_install</code> I symlinked the webapps into Tomcat&rsquo;s webapps folder:</li>
</ul>

<pre><code>$ mv /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/ROOT /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/ROOT.orig
$ ln -sfv ~/dspace/webapps/xmlui /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/ROOT
$ ln -sfv ~/dspace/webapps/rest /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/rest
$ ln -sfv ~/dspace/webapps/jspui /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/jspui
$ ln -sfv ~/dspace/webapps/oai /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/oai
$ ln -sfv ~/dspace/webapps/solr /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/solr
$ /opt/brew/Cellar/tomcat/8.0.30/bin/catalina start
</code></pre>
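
<ul>
<li>Then check that Tomcat is answering (assuming the default connector port of 8080):</li>
</ul>

<pre><code>$ curl -I http://localhost:8080/
</code></pre>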

<ul>
<li>Add CATALINA_OPTS in <code>/opt/brew/Cellar/tomcat/8.0.30/libexec/bin/setenv.sh</code>, as this script is sourced by the <code>catalina</code> startup script</li>
<li>For example:</li>
</ul>

<pre><code>CATALINA_OPTS=&quot;-Djava.awt.headless=true -Xms2048m -Xmx2048m -XX:MaxPermSize=256m -XX:+UseConcMarkSweepGC -Dfile.encoding=UTF-8&quot;
</code></pre>

<ul>
<li>After verifying that the site is working, start a full index:</li>
</ul>

<pre><code>$ ~/dspace/bin/dspace index-discovery -b
</code></pre>

<h2 id="2016-02-08:124a59adbaa8ef13e1518d003fc03981">2016-02-08</h2>

<ul>
<li>Finish cleaning up and importing ~400 DAGRIS items into CGSpace</li>
<li>Whip up some quick CSS to make the buttons in the submission workflow use the XMLUI theme&rsquo;s brand colors (<a href="https://github.com/ilri/DSpace/issues/154">#154</a>)</li>
</ul>

<p><img src="../images/2016/02/submit-button-ilri.png" alt="ILRI submission buttons" />
<img src="../images/2016/02/submit-button-drylands.png" alt="Drylands submission buttons" /></p>

<h2 id="2016-02-09:124a59adbaa8ef13e1518d003fc03981">2016-02-09</h2>

<ul>
<li>Re-sync DSpace Test with CGSpace</li>
<li>Help Sisay with OpenRefine</li>
<li>Enable HTTPS on DSpace Test using Let&rsquo;s Encrypt:</li>
</ul>

<pre><code>$ cd ~/src/git
$ git clone https://github.com/letsencrypt/letsencrypt
$ cd letsencrypt
$ sudo service nginx stop
# add port 443 to firewall rules
$ ./letsencrypt-auto certonly --standalone -d dspacetest.cgiar.org
$ sudo service nginx start
$ ansible-playbook dspace.yml -l linode02 -t nginx,firewall -u aorth --ask-become-pass
</code></pre>

<ul>
<li>We should install it in /opt/letsencrypt and then script the renewal, but first we have to wire up some variables and template stuff based on the script here: <a href="https://letsencrypt.org/howitworks/">https://letsencrypt.org/howitworks/</a></li>
</ul>
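
<ul>
<li>A rough sketch of what that renewal script might look like, just replaying the manual steps above (the <code>--renew-by-default</code> flag to force re-issuance is an assumption to verify against the client&rsquo;s help):</li>
</ul>

<pre><code>#!/bin/sh
# hypothetical /opt/letsencrypt/renew.sh
service nginx stop
/opt/letsencrypt/letsencrypt-auto certonly --standalone --renew-by-default -d dspacetest.cgiar.org
service nginx start
</code></pre>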

<ul>
<li>I had to export some CIAT items that were being cleaned up on the test server and I noticed their <code>dc.contributor.author</code> fields have DSpace 5 authority index UUIDs&hellip;</li>
<li>To clean those up in OpenRefine I used this GREL expression: <code>value.replace(/::\w{8}-\w{4}-\w{4}-\w{4}-\w{12}::600/,&quot;&quot;)</code></li>
<li>Getting more and more hangs on DSpace Test, seemingly random but also during CSV import</li>
<li>Logs don&rsquo;t always show anything right when it fails, but eventually one of these appears:</li>
</ul>

<pre><code>org.dspace.discovery.SearchServiceException: Error while processing facet fields: java.lang.OutOfMemoryError: Java heap space
</code></pre>

<ul>
<li>or</li>
</ul>

<pre><code>Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
</code></pre>

<ul>
<li>Right now DSpace Test&rsquo;s Tomcat heap is set to 1536m and we have quite a bit of free RAM:</li>
</ul>

<pre><code># free -m
             total       used       free     shared    buffers     cached
Mem:          3950       3902         48          9         37       1311
-/+ buffers/cache:       2552       1397
Swap:          255         57        198
</code></pre>

<ul>
<li>So I&rsquo;ll bump up the Tomcat heap to 2048m (the CGSpace production server is using 3GB)</li>
</ul>
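
<ul>
<li>On Ubuntu that means bumping <code>-Xmx</code> in the Tomcat defaults file&hellip; a sketch (the other options are carried over from my local setup and may differ on the server):</li>
</ul>

<pre><code># /etc/default/tomcat7
JAVA_OPTS=&quot;-Djava.awt.headless=true -Xms2048m -Xmx2048m -XX:MaxPermSize=256m -Dfile.encoding=UTF-8&quot;
</code></pre>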

<h2 id="2016-02-11:124a59adbaa8ef13e1518d003fc03981">2016-02-11</h2>

<ul>
<li>Massaging some CIAT data in OpenRefine</li>
<li>There are 1200 records that have PDFs and will need to be imported into CGSpace</li>
<li>I created a <code>filename</code> column based on the <code>dc.identifier.url</code> column using the following transform:</li>
</ul>

<pre><code>value.split('/')[-1]
</code></pre>

<ul>
<li>Then I wrote a tool called <a href="https://gist.github.com/alanorth/2206f24483fe5f0454fc"><code>generate-thumbnails.py</code></a> to download the PDFs and generate thumbnails for them, for example:</li>
</ul>

<pre><code>$ ./generate-thumbnails.py ciat-reports.csv
Processing 64661.pdf
&gt; Downloading 64661.pdf
&gt; Creating thumbnail for 64661.pdf
Processing 64195.pdf
&gt; Downloading 64195.pdf
&gt; Creating thumbnail for 64195.pdf
</code></pre>

<h2 id="2016-02-12:124a59adbaa8ef13e1518d003fc03981">2016-02-12</h2>

<ul>
<li>Looking at CIAT&rsquo;s records again, there are some problems with a dozen or so files (out of 1200)</li>
<li>A few items are using the same exact PDF</li>
<li>A few items are using HTM or DOC files</li>
<li>A few items link to PDFs on IFPRI&rsquo;s e-Library or Research Gate</li>
<li>A few items have no files at all</li>
</ul>
</description>
</item>

<item>
<title>January, 2016</title>
<link>/cgspace-notes/2016-01/</link>
<pubDate>Wed, 13 Jan 2016 13:18:00 +0300</pubDate>
<guid>/cgspace-notes/2016-01/</guid>
<description>
<h2 id="2016-01-13:3846b7fcbca60cdedafd373cb39cd76d">2016-01-13</h2>
|
||
|
||
<ul>
|
||
<li>Move ILRI collection <code>10568/12503</code> from <code>10568/27869</code> to <code>10568/27629</code> using the <a href="https://gist.github.com/alanorth/392c4660e8b022d99dfa">move_collections.sh</a> script I wrote last year.</li>
|
||
<li>I realized it is only necessary to clear the Cocoon cache after moving collections—rather than reindexing—as no metadata has changed, and therefore no search or browse indexes need to be updated.</li>
|
||
<li>Update GitHub wiki for documentation of <a href="https://github.com/ilri/DSpace/wiki/Maintenance-Tasks">maintenance tasks</a>.</li>
|
||
</ul>
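
<ul>
<li>For the record, the invocation was along these lines (the argument order here is hypothetical):</li>
</ul>

<pre><code># hypothetical argument order: collection, source community, target community
$ ./move_collections.sh 10568/12503 10568/27869 10568/27629
</code></pre>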

<h2 id="2016-01-14:3846b7fcbca60cdedafd373cb39cd76d">2016-01-14</h2>

<ul>
<li>Update CCAFS project identifiers in input-forms.xml</li>
<li>Run system updates and restart the server</li>
</ul>

<h2 id="2016-01-18:3846b7fcbca60cdedafd373cb39cd76d">2016-01-18</h2>

<ul>
<li>Change &ldquo;Extension material&rdquo; to &ldquo;Extension Material&rdquo; in input-forms.xml (a mistake that fell through the cracks when we fixed the others in the DSpace 4 era)</li>
</ul>

<h2 id="2016-01-19:3846b7fcbca60cdedafd373cb39cd76d">2016-01-19</h2>

<ul>
<li>Work on tweaks and updates for the social sharing icons on item pages: add Delicious and Mendeley (from Academicons), make links open in new windows, and set the icon color to the theme&rsquo;s primary color (<a href="https://github.com/ilri/DSpace/issues/157">#157</a>)</li>
<li>Tweak date-based facets to show more values in drill-down ranges (<a href="https://github.com/ilri/DSpace/issues/162">#162</a>)</li>
<li>Need to remember to clear the Cocoon cache after deployment or else you don&rsquo;t see the new ranges immediately</li>
<li>Set up a recipe on IFTTT to tweet new items from the CGSpace Atom feed to my Twitter account</li>
<li>Altmetric&rsquo;s support for Handles is kinda weak, so they can&rsquo;t associate our items with DOIs until they are tweeted or blogged about first</li>
</ul>

<h2 id="2016-01-21:3846b7fcbca60cdedafd373cb39cd76d">2016-01-21</h2>

<ul>
<li>Still waiting for my IFTTT recipe to fire, two days later</li>
<li>It looks like the Atom feed on CGSpace hasn&rsquo;t changed in two days, but there have definitely been new items</li>
<li>The RSS feed is nearly as stale, though it has a different set of old items</li>
<li>On a hunch I cleared the Cocoon cache and now the feeds are fresh</li>
<li>Looks like there is a configuration option related to this, <code>webui.feed.cache.age</code>, which defaults to 48 hours, though I&rsquo;m not sure what relation it has to the Cocoon cache</li>
<li>In any case, we should change this cache to be something more like 6 hours, as we publish new items several times per day</li>
</ul>
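
<ul>
<li>That should be a one-line change in <code>dspace.cfg</code>:</li>
</ul>

<pre><code># default is 48 (hours)
webui.feed.cache.age = 6
</code></pre>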

<ul>
<li>Work around a CSS issue with long URLs in the item view (<a href="https://github.com/ilri/DSpace/issues/172">#172</a>)</li>
</ul>

<h2 id="2016-01-25:3846b7fcbca60cdedafd373cb39cd76d">2016-01-25</h2>

<ul>
<li>Re-deploy CGSpace and DSpace Test with latest <code>5_x-prod</code> branch</li>
<li>This included the social icon fixes/updates, date-based facet tweaks, reducing the feed cache age, and fixing a layout issue in the XMLUI item view when an item had long URLs</li>
</ul>

<h2 id="2016-01-26:3846b7fcbca60cdedafd373cb39cd76d">2016-01-26</h2>

<ul>
<li>Run nginx updates on CGSpace and DSpace Test (<a href="http://mailman.nginx.org/pipermail/nginx/2016-January/049700.html">1.8.1 and 1.9.10, respectively</a>)</li>
<li>Run updates on DSpace Test and reboot for new Linode kernel <code>Linux 4.4.0-x86_64-linode63</code> (first update in months)</li>
</ul>

<h2 id="2016-01-28:3846b7fcbca60cdedafd373cb39cd76d">2016-01-28</h2>

<ul>
<li>Start looking at importing some Bioversity data that had been prepared earlier this week</li>
<li>While checking the data I noticed something strange: there are 79 items but only 8 unique PDFs:</li>
</ul>

<pre><code>$ ls SimpleArchiveForBio/ | wc -l
79
$ find SimpleArchiveForBio/ -iname &quot;*.pdf&quot; -exec basename {} \; | sort -u | wc -l
8
</code></pre>

<h2 id="2016-01-29:3846b7fcbca60cdedafd373cb39cd76d">2016-01-29</h2>

<ul>
<li>Add five missing center-specific subjects to the XMLUI item view (<a href="https://github.com/ilri/DSpace/issues/174">#174</a>)</li>
<li>This <a href="https://cgspace.cgiar.org/handle/10568/67062">CCAFS item</a> before:</li>
</ul>

<p><img src="../images/2016/01/xmlui-subjects-before.png" alt="XMLUI subjects before" /></p>

<ul>
<li>And after:</li>
</ul>

<p><img src="../images/2016/01/xmlui-subjects-after.png" alt="XMLUI subjects after" /></p>
</description>
</item>

<item>
<title>December, 2015</title>
<link>/cgspace-notes/2015-12/</link>
<pubDate>Wed, 02 Dec 2015 13:18:00 +0300</pubDate>
<guid>/cgspace-notes/2015-12/</guid>
<description>
<h2 id="2015-12-02:012a628feed6d64ae1151cbd6151ccd6">2015-12-02</h2>
|
||
|
||
<ul>
|
||
<li>Replace <code>lzop</code> with <code>xz</code> in log compression cron jobs on DSpace Test—it uses less space:</li>
|
||
</ul>
|
||
|
||
<pre><code># cd /home/dspacetest.cgiar.org/log
|
||
# ls -lh dspace.log.2015-11-18*
|
||
-rw-rw-r-- 1 tomcat7 tomcat7 2.0M Nov 18 23:59 dspace.log.2015-11-18
|
||
-rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo
|
||
-rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz
|
||
</code></pre>
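
<ul>
<li>The cron job change itself is tiny&hellip; something like this sketch (the schedule is hypothetical, the log path is as above, and note the <code>\%</code> escaping cron requires):</li>
</ul>

<pre><code>0 0 * * * xz /home/dspacetest.cgiar.org/log/dspace.log.$(date --date='1 day ago' +\%Y-\%m-\%d)
</code></pre>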

<ul>
<li>I had used lrzip once, but it needs more memory and is harder to use, as it requires the lrztar wrapper</li>
<li>Need to remember to check that everything is OK in a few days and then make the same change on CGSpace</li>
<li>CGSpace went down again (due to PostgreSQL idle connections, of course)</li>
<li>Current database settings for DSpace are <code>db.maxconnections = 30</code> and <code>db.maxidle = 8</code>, yet idle connections are exceeding this:</li>
</ul>

<pre><code>$ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
39
</code></pre>

<ul>
<li>I restarted PostgreSQL and Tomcat and it&rsquo;s back</li>
<li>On a related note of why CGSpace is so slow, I decided to finally try the <code>pgtune</code> script to tune the postgres settings:</li>
</ul>

<pre><code># apt-get install pgtune
# pgtune -i /etc/postgresql/9.3/main/postgresql.conf -o postgresql.conf-pgtune
# mv /etc/postgresql/9.3/main/postgresql.conf /etc/postgresql/9.3/main/postgresql.conf.orig
# mv postgresql.conf-pgtune /etc/postgresql/9.3/main/postgresql.conf
</code></pre>

<ul>
<li>It introduced the following new settings:</li>
</ul>

<pre><code>default_statistics_target = 50
maintenance_work_mem = 480MB
constraint_exclusion = on
checkpoint_completion_target = 0.9
effective_cache_size = 5632MB
work_mem = 48MB
wal_buffers = 8MB
checkpoint_segments = 16
shared_buffers = 1920MB
max_connections = 80
</code></pre>

<ul>
<li>Now I need to go read the PostgreSQL docs about these options, and watch the memory settings in Munin, etc.</li>
<li>For what it&rsquo;s worth, the REST API should be faster now (because of these PostgreSQL tweaks):</li>
</ul>

<pre><code>$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.474
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
2.141
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.685
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.995
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.786
</code></pre>

<ul>
<li>Last week it was an average of 8 seconds&hellip; now this is <sup>1</sup>&frasl;<sub>4</sub> of that</li>
<li>CCAFS noticed that one of their items displays only the Atmire statlets: <a href="https://cgspace.cgiar.org/handle/10568/42445">https://cgspace.cgiar.org/handle/10568/42445</a></li>
</ul>

<p><img src="../images/2015/12/ccafs-item-no-metadata.png" alt="CCAFS item" /></p>

<ul>
<li>The authorizations for the item are all public READ, and I don&rsquo;t see any errors in dspace.log when browsing that item</li>
<li>I filed a ticket on Atmire&rsquo;s issue tracker</li>
<li>I also filed a ticket on Atmire&rsquo;s issue tracker for the PostgreSQL stuff</li>
</ul>

<h2 id="2015-12-03:012a628feed6d64ae1151cbd6151ccd6">2015-12-03</h2>

<ul>
<li>CGSpace very slow, and monitoring is emailing me to say it&rsquo;s down, even though I can load the page (very slowly)</li>
<li>Idle postgres connections look like this (with no change in DSpace db settings lately):</li>
</ul>

<pre><code>$ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
29
</code></pre>

<ul>
<li>I restarted Tomcat and postgres&hellip;</li>
<li>Atmire commented that we should raise the JVM heap size by ~500M, so it is now <code>-Xms3584m -Xmx3584m</code></li>
<li>We weren&rsquo;t out of heap yet, but it&rsquo;s probably fair enough that the DSpace 5 upgrade (and new Atmire modules) requires more memory, so it&rsquo;s OK</li>
<li>A possible side effect is that I see that the REST API is twice as fast for the request above now:</li>
</ul>

<pre><code>$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.368
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.968
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.006
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.849
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.806
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.854
</code></pre>

<h2 id="2015-12-05:012a628feed6d64ae1151cbd6151ccd6">2015-12-05</h2>

<ul>
<li>CGSpace has been up and down all day and the REST API is completely unresponsive</li>
<li>PostgreSQL idle connections are currently:</li>
</ul>

<pre><code>postgres@linode01:~$ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
28
</code></pre>

<ul>
<li>I have reverted all the pgtune tweaks from the other day, as they didn&rsquo;t fix the stability issues, so I&rsquo;d rather not have them introducing more variables into the equation</li>
<li>The PostgreSQL stats from Munin all point to something database-related with the DSpace 5 upgrade around mid–late November</li>
</ul>

<p><img src="../images/2015/12/postgres_bgwriter-year.png" alt="PostgreSQL bgwriter (year)" />
<img src="../images/2015/12/postgres_cache_cgspace-year.png" alt="PostgreSQL cache (year)" />
<img src="../images/2015/12/postgres_locks_cgspace-year.png" alt="PostgreSQL locks (year)" />
<img src="../images/2015/12/postgres_scans_cgspace-year.png" alt="PostgreSQL scans (year)" /></p>

<h2 id="2015-12-07:012a628feed6d64ae1151cbd6151ccd6">2015-12-07</h2>

<ul>
<li>Atmire sent <a href="https://github.com/ilri/DSpace/pull/161">some fixes</a> to DSpace&rsquo;s REST API code that was leaving contexts open (causing the slow performance and database issues)</li>
<li>After deploying the fix to CGSpace the REST API is consistently faster:</li>
</ul>

<pre><code>$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.675
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.599
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.588
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.566
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.497
</code></pre>

<h2 id="2015-12-08:012a628feed6d64ae1151cbd6151ccd6">2015-12-08</h2>

<ul>
<li>Switch CGSpace log compression cron jobs from using lzop to xz—the compression is much better, though it is slower and uses more CPU</li>
<li>Since we figured out (and fixed) the cause of the performance issue, I reverted Google Bot&rsquo;s crawl rate to the &ldquo;Let Google optimize&rdquo; setting</li>
</ul>
</description>
</item>

<item>
<title>November, 2015</title>
<link>/cgspace-notes/2015-11/</link>
<pubDate>Mon, 23 Nov 2015 17:00:57 +0300</pubDate>
<guid>/cgspace-notes/2015-11/</guid>
<description>

<h2 id="2015-11-22:3d03b850f8126f80d8144c2e17ea0ae7">2015-11-22</h2>

<ul>
<li>CGSpace went down</li>
<li>Looks like DSpace exhausted its PostgreSQL connection pool</li>
<li>Last week I had increased the limit from 30 to 60, which seemed to help, but now there are many more idle connections:</li>
</ul>

<pre><code>$ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspace
78
</code></pre>

<ul>
<li>For now I have increased the limit from 60 to 90, run updates, and rebooted the server</li>
</ul>

<h2 id="2015-11-24:3d03b850f8126f80d8144c2e17ea0ae7">2015-11-24</h2>

<ul>
<li>CGSpace went down again</li>
<li>Getting emails from uptimeRobot and uptimeButler that it&rsquo;s down, and Google Webmaster Tools is sending emails that there is an increase in crawl errors</li>
<li>Looks like there are still a bunch of idle PostgreSQL connections:</li>
</ul>

<pre><code>$ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspace
96
</code></pre>

<ul>
<li>For some reason the number of idle connections is very high since we upgraded to DSpace 5</li>
</ul>

<h2 id="2015-11-25:3d03b850f8126f80d8144c2e17ea0ae7">2015-11-25</h2>

<ul>
<li>Troubleshoot the DSpace 5 OAI breakage caused by nginx routing config</li>
<li>The OAI application requests stylesheets and javascript files with the path <code>/oai/static/css</code>, which gets matched here:</li>
</ul>

<pre><code># static assets we can load from the file system directly with nginx
location ~ /(themes|static|aspects/ReportingSuite) {
    try_files $uri @tomcat;
...
</code></pre>

<ul>
<li>The document root is relative to the xmlui app, so this gets a 404—I&rsquo;m not sure why it doesn&rsquo;t pass to <code>@tomcat</code></li>
<li>Anyways, I can&rsquo;t find any URIs with path <code>/static</code>, and the more important point is to handle all the static theme assets, so we can just remove <code>static</code> from the regex for now (who cares if we can&rsquo;t use nginx to send Etags for OAI CSS!)</li>
<li>Also, I noticed we aren&rsquo;t setting CSP headers on the static assets; in nginx, headers are inherited in child blocks, but if you use <code>add_header</code> in a child block it stops inheriting the others</li>
<li>We simply need to add <code>include extra-security.conf;</code> to the above location block (but research and test first)</li>
<li>We should add WOFF assets to the list of things to set expires for:</li>
</ul>

<pre><code>location ~* \.(?:ico|css|js|gif|jpe?g|png|woff)$ {
</code></pre>

<ul>
<li>We should also add <code>aspects/Statistics</code> to the location block for static assets (minus <code>static</code> from above):</li>
</ul>

<pre><code>location ~ /(themes|aspects/ReportingSuite|aspects/Statistics) {
</code></pre>
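
<ul>
<li>Putting those pieces together, the static-assets block might eventually look something like this (a sketch to research and test first, per the note above):</li>
</ul>

<pre><code>location ~ /(themes|aspects/ReportingSuite|aspects/Statistics) {
    try_files $uri @tomcat;
    # re-add the security headers that add_header in this block would otherwise stop inheriting
    include extra-security.conf;
}
</code></pre>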

<ul>
<li>Need to check <code>/about</code> on CGSpace, as it&rsquo;s blank on my local test server and we might need to add something there</li>
<li>CGSpace has been up and down all day due to PostgreSQL idle connections (current DSpace pool is 90):</li>
</ul>

<pre><code>$ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspace
93
</code></pre>

<ul>
<li>I looked closer at the idle connections and saw that many have been idle for hours (current time on server is <code>2015-11-25T20:20:42+0000</code>):</li>
</ul>

<pre><code>$ psql -c 'SELECT * from pg_stat_activity;' | less -S
 datid | datname  |  pid  | usesysid | usename  | application_name | client_addr | client_hostname | client_port |         backend_start         |          xact_start           |
-------+----------+-------+----------+----------+------------------+-------------+-----------------+-------------+-------------------------------+-------------------------------+---
 20951 | cgspace  | 10966 |    18205 | cgspace  |                  | 127.0.0.1   |                 |       37731 | 2015-11-25 13:13:02.837624+00 |                               | 20
 20951 | cgspace  | 10967 |    18205 | cgspace  |                  | 127.0.0.1   |                 |       37737 | 2015-11-25 13:13:03.069421+00 |                               | 20
...
</code></pre>

<ul>
<li>There is a relevant Jira issue about this: <a href="https://jira.duraspace.org/browse/DS-1458">https://jira.duraspace.org/browse/DS-1458</a></li>
<li>It seems there is some sense in changing DSpace&rsquo;s default <code>db.maxidle</code> from unlimited (-1) to something like 8 (Tomcat default) or 10 (Confluence default)</li>
<li>Change <code>db.maxidle</code> from -1 to 10 and reduce <code>db.maxconnections</code> from 90 to 50 (snippet below), then restart postgres and tomcat7</li>
<li>Also redeploy DSpace Test with a clean sync of CGSpace and mirror these database settings there as well</li>
<li>Also deploy the nginx fixes for the <code>try_files</code> location block as well as the expires block</li>
</ul>
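
<ul>
<li>In <code>dspace.cfg</code> terms, that is:</li>
</ul>

<pre><code>db.maxconnections = 50
db.maxidle = 10
</code></pre>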

<h2 id="2015-11-26:3d03b850f8126f80d8144c2e17ea0ae7">2015-11-26</h2>

<ul>
<li>CGSpace behaving much better since changing <code>db.maxidle</code> yesterday, but still two up/down notices from monitoring this morning (better than 50!)</li>
<li>CCAFS colleagues mentioned that the REST API is very slow, 24 seconds for one item</li>
<li>Not as bad for me, but still unsustainable if you have to get many:</li>
</ul>

<pre><code>$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
8.415
</code></pre>

<ul>
<li>Monitoring e-mailed in the evening to say CGSpace was down</li>
<li>Idle connections in PostgreSQL again:</li>
</ul>

<pre><code>$ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
66
</code></pre>

<ul>
<li>At the time, the current DSpace pool size was 50&hellip;</li>
<li>I reduced the pool back to the default of 30, and reduced the <code>db.maxidle</code> setting from 10 to 8</li>
</ul>

<h2 id="2015-11-29:3d03b850f8126f80d8144c2e17ea0ae7">2015-11-29</h2>

<ul>
<li>Still more alerts that CGSpace has been up and down all day</li>
<li>Current database settings for DSpace:</li>
</ul>

<pre><code>db.maxconnections = 30
db.maxwait = 5000
db.maxidle = 8
db.statementpool = true
</code></pre>

<ul>
<li>And idle connections:</li>
</ul>

<pre><code>$ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
49
</code></pre>

<ul>
<li>Perhaps I need to start drastically increasing the connection limits—like to 300—to see if DSpace&rsquo;s thirst can ever be quenched</li>
<li>On another note, SUNScholar&rsquo;s notes suggest adjusting some other postgres variables: <a href="http://wiki.lib.sun.ac.za/index.php/SUNScholar/Optimisations/Database">http://wiki.lib.sun.ac.za/index.php/SUNScholar/Optimisations/Database</a></li>
<li>This might help with REST API speed (which I mentioned above and still need to test properly)</li>
</ul>
</description>
</item>

</channel>
</rss>