<!DOCTYPE html>
< html lang = "en" >
< head >
< meta charset = "utf-8" >
< meta http-equiv = "X-UA-Compatible" content = "IE=edge" >
< meta name = "viewport" content = "width=device-width, initial-scale=1, shrink-to-fit=no" >
<!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags -->
< meta name = "description" content = "" >
< meta name = "author" content = "Alan Orth" >
<!-- OpenGraph Metadata: http://ogp.me/ -->
< meta property = "og:title" content = "December, 2015" >
< meta property = "og:description" content = "" >
< meta property = "og:type" content = "article" >
< meta property = "article:published_time" content = "2015-12-02T13:18:00+03:00" >
< meta property = "article:author" content = "Alan Orth" >
< meta property = "og:url" content = "https://alanorth.github.io/cgspace-notes/2015-12/" >
<!-- Metadata for Twitter: https://dev.twitter.com/cards/markup -->
< meta property = "twitter:card" content = "summary" >
< meta property = "twitter:title" content = "December, 2015" >
< meta property = "twitter:description" content = "" >
< meta name = "generator" content = "Hugo 0.16" / >
< base href = "https://alanorth.github.io/cgspace-notes/" >
< link rel = "canonical" href = "https://alanorth.github.io/cgspace-notes/2015-12/" >
< title > December, 2015 | CGSpace Notes< / title >
<!-- combined, minified CSS -->
< link href = "https://alanorth.github.io/cgspace-notes/css/style.css" rel = "stylesheet" >
<!-- RSS 2.0 feed of posts -->
< link href = "https://alanorth.github.io/cgspace-notes/post/index.xml" type = "application/rss+xml" rel = "alternate" >
< / head >
< body >
< div class = "blog-masthead" >
< div class = "container" >
< nav class = "nav blog-nav" >
< a class = "nav-link " href = "https://alanorth.github.io/cgspace-notes/" > Home< / a >
< / nav >
< / div >
< / div >
< header class = "blog-header" >
< div class = "container" >
< h1 class = "blog-title" > < a href = "https://alanorth.github.io/cgspace-notes/" rel = "home" > CGSpace Notes< / a > < / h1 >
< / div >
< / header >
< div class = "container" >
< div class = "row" >
< div class = "col-sm-8 blog-main" >
< article class = "blog-post" >
< header >
< h2 class = "blog-post-title" > < a href = "https://alanorth.github.io/cgspace-notes/2015-12/" title = "December, 2015" > December, 2015< / a > < / h2 >
< p class = "blog-post-meta" > < time datetime = "2015-12-02T13:18:00+03:00" > Wed Dec 02, 2015< / time > by Alan Orth in
< i class = "fa fa-tag" aria-hidden = "true" > < / i > < a href = "/cgspace-notes/tags/notes" rel = "tag" > notes< / a >
< / p >
< / header >
< h2 id = "2015-12-02" > 2015-12-02< / h2 >
< ul >
< li > Replace < code > lzop< / code > with < code > xz< / code > in log compression cron jobs on DSpace Test, because xz produces smaller files:< / li >
< / ul >
< pre > < code > # cd /home/dspacetest.cgiar.org/log
# ls -lh dspace.log.2015-11-18*
-rw-rw-r-- 1 tomcat7 tomcat7 2.0M Nov 18 23:59 dspace.log.2015-11-18
-rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo
-rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz
< / code > < / pre >
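< p > To put numbers on the listing above, here is a minimal Python sketch that turns the three sizes into compression ratios. The helper names < code > parse_size< / code > and < code > ratio< / code > are made up for illustration, and the sizes are the rounded values printed by < code > ls -lh< / code > , so the results are approximate.< / p >

```python
# Sizes as printed by `ls -lh` above; ls -h uses binary units (1M = 1024K).
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert an `ls -h` style size such as '387K' or '2.0M' to bytes."""
    suffix = text[-1].upper()
    if suffix in UNITS:
        return float(text[:-1]) * UNITS[suffix]
    return float(text)  # plain byte count with no suffix


def ratio(compressed, original):
    """Compressed size as a fraction of the original size."""
    return parse_size(compressed) / parse_size(original)


if __name__ == "__main__":
    original = "2.0M"
    for name, size in (("lzo", "387K"), ("xz", "169K")):
        print(f"{name}: {ratio(size, original):.1%} of the original")
```

< p > With these inputs the lzo file is roughly 19% of the original log and the xz file roughly 8%, so the xz output is less than half the size of the lzo output.< / p >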
< ul >
< li > I had used lrzip once, but it needs more memory and is harder to use as it requires the lrztar wrapper< / li >
< li > Need to remember to check in a few days that everything is OK, and then make the same change on CGSpace< / li >
< li > CGSpace went down again (due to PostgreSQL idle connections of course)< / li >
< li > Current database settings for DSpace are < code > db.maxconnections = 30< / code > and < code > db.maxidle = 8< / code > , yet idle connections are exceeding this:< / li >
< / ul >
< pre > < code > $ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
39
< / code > < / pre >
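< p > The check above can be sketched in Python as follows. This is only an illustration: < code > count_idle< / code > and < code > exceeds_limit< / code > are hypothetical helpers, and the default limits are the < code > db.maxidle< / code > and < code > db.maxconnections< / code > values quoted above. Note that < code > grep -c idle< / code > also counts sessions in the "idle in transaction" state; on PostgreSQL 9.2 and later, filtering on the < code > state< / code > column (as in < code > IDLE_QUERY< / code > ) separates the two.< / p >

```python
# A stricter alternative to grepping: count only sessions whose state is
# exactly 'idle' (the state column exists in PostgreSQL 9.2 and later).
IDLE_QUERY = (
    "SELECT count(*) FROM pg_stat_activity "
    "WHERE datname = 'cgspace' AND state = 'idle';"
)


def count_idle(psql_output, database="cgspace"):
    """Mimic `grep cgspace | grep -c idle` on text output from psql."""
    return sum(
        1
        for line in psql_output.splitlines()
        if database in line and "idle" in line
    )


def exceeds_limit(idle_count, max_idle=8, max_connections=30):
    """Report which of the DSpace pool limits the idle count exceeds."""
    return {
        "over db.maxidle": idle_count > max_idle,
        "over db.maxconnections": idle_count > max_connections,
    }
```

< p > For the count of 39 reported above, < code > exceeds_limit(39)< / code > flags both limits, which is why the pool settings alone do not explain the number of connections.< / p >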
< ul >
< li > I restarted PostgreSQL and Tomcat and it’s back< / li >
< li > On the related question of why CGSpace is so slow, I finally decided to try the < code > pgtune< / code > script to tune the PostgreSQL settings:< / li >
< / ul >
< pre > < code > # apt-get install pgtune
# pgtune -i /etc/postgresql/9.3/main/postgresql.conf -o postgresql.conf-pgtune
# mv /etc/postgresql/9.3/main/postgresql.conf /etc/postgresql/9.3/main/postgresql.conf.orig
# mv postgresql.conf-pgtune /etc/postgresql/9.3/main/postgresql.conf
< / code > < / pre >
< ul >
< li > It introduced the following new settings:< / li >
< / ul >
< pre > < code > default_statistics_target = 50
maintenance_work_mem = 480MB
constraint_exclusion = on
checkpoint_completion_target = 0.9
effective_cache_size = 5632MB
work_mem = 48MB
wal_buffers = 8MB
checkpoint_segments = 16
shared_buffers = 1920MB
max_connections = 80
< / code > < / pre >
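< p > As a rough sanity check on the memory these settings could claim, here is a small Python sketch that adds < code > shared_buffers< / code > to one < code > work_mem< / code > allocation per allowed connection. The helper names are hypothetical, and the estimate is deliberately crude: < code > work_mem< / code > applies per sort or hash operation, so a single query can use several multiples of it, while most connections will use far less.< / p >

```python
def parse_mb(value):
    """Convert a postgresql.conf memory value such as '1920MB' to megabytes."""
    value = value.strip()
    if value.endswith("GB"):
        return int(value[:-2]) * 1024
    if value.endswith("MB"):
        return int(value[:-2])
    if value.endswith("kB"):
        return int(value[:-2]) / 1024
    raise ValueError(f"unsupported unit in {value!r}")


def rough_memory_mb(settings):
    """shared_buffers plus one work_mem allocation per allowed connection."""
    shared = parse_mb(settings["shared_buffers"])
    per_connection = parse_mb(settings["work_mem"])
    return shared + int(settings["max_connections"]) * per_connection


# Values taken from the pgtune output above.
PGTUNE = {"shared_buffers": "1920MB", "work_mem": "48MB", "max_connections": "80"}

if __name__ == "__main__":
    print(rough_memory_mb(PGTUNE))  # 1920 + 80 * 48 = 5760
```

< p > The result, 5760 MB, is the kind of figure worth comparing against the memory graphs before trusting the new configuration.< / p >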
< ul >
< li > Now I need to read the PostgreSQL docs about these options, and watch memory usage in Munin etc.< / li >
< li > For what it’s worth, the REST API should now be faster (because of these PostgreSQL tweaks):< / li >
< / ul >
< pre > < code > $ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.474
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
2.141
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.685
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.995
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.786
< / code > < / pre >
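< p > The five timings above can be averaged and compared with the roughly 8 second average that these notes report for the previous week. A minimal Python sketch (the variable names are only for illustration):< / p >

```python
from statistics import mean

# Response times in seconds, copied from the curl runs above.
timings = [1.474, 2.141, 1.685, 1.995, 1.786]
previous_average = 8.0  # approximate figure reported for the week before

average = mean(timings)
print(f"average: {average:.3f}s")
print(f"fraction of previous average: {average / previous_average:.2f}")
```

< p > The average comes to about 1.8 seconds, or a little under a quarter of the earlier figure.< / p >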
< ul >
< li > Last week it was an average of 8 seconds… now this is < sup > 1< / sup > ⁄ < sub > 4< / sub > of that< / li >
< li > CCAFS noticed that one of their items displays only the Atmire statlets: < a href = "https://cgspace.cgiar.org/handle/10568/42445" > https://cgspace.cgiar.org/handle/10568/42445< / a > < / li >
< / ul >
< p > < img src = "2015/12/ccafs-item-no-metadata.png" alt = "CCAFS item" / > < / p >
< ul >
< li > The authorizations for the item are all public READ, and I don’t see any errors in dspace.log when browsing that item< / li >
< li > I filed a ticket on Atmire’s issue tracker< / li >
< li > I also filed a ticket on Atmire’s issue tracker for the PostgreSQL stuff< / li >
< / ul >
< h2 id = "2015-12-03" > 2015-12-03< / h2 >
< ul >
< li > CGSpace is very slow, and monitoring is emailing me to say it’s down, even though I can load the page (very slowly)< / li >
< li > Idle postgres connections look like this (with no change in DSpace db settings lately):< / li >
< / ul >
< pre > < code > $ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
29
< / code > < / pre >
< ul >
< li > I restarted Tomcat and postgres… < / li >
< li > Atmire commented that we should raise the JVM heap size by ~500M, so it is now < code > -Xms3584m -Xmx3584m< / code > < / li >
< li > We weren’t out of heap yet, but it’s probably fair to assume that the DSpace 5 upgrade (and the new Atmire modules) requires more memory, so it’s OK< / li >
< li > A possible side effect is that the REST API is now roughly twice as fast for the request above:< / li >
< / ul >
< pre > < code > $ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.368
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.968
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.006
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.849
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.806
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.854
< / code > < / pre >
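< p > Repeating the same curl command by hand can also be scripted. The sketch below uses only the Python standard library; < code > time_request< / code > and < code > sample< / code > are hypothetical helper names, and the < code > opener< / code > parameter exists only so the network call can be swapped out. It approximates curl's < code > time_total< / code > by timing the whole fetch, including reading the body, and is not guaranteed to match curl's figure exactly.< / p >

```python
import time
import urllib.request


def time_request(url, opener=urllib.request.urlopen):
    """Return the seconds taken to fetch url and read the full response body."""
    start = time.perf_counter()
    with opener(url) as response:
        response.read()
    return time.perf_counter() - start


def sample(url, runs=5, opener=urllib.request.urlopen):
    """Time the same request several times, like the repeated curl calls."""
    return [time_request(url, opener) for _ in range(runs)]
```

< p > Calling < code > sample< / code > with the REST URL used above and < code > runs=6< / code > would collect a list of timings comparable to this listing.< / p >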
< h2 id = "2015-12-05" > 2015-12-05< / h2 >
< ul >
< li > CGSpace has been up and down all day and the REST API is completely unresponsive< / li >
< li > PostgreSQL idle connections are currently:< / li >
< / ul >
< pre > < code > postgres@linode01:~$ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
28
< / code > < / pre >
< ul >
< li > I have reverted all the pgtune tweaks from the other day, as they didn’t fix the stability issues, and I’d rather not have them introducing more variables into the equation< / li >
< li > The PostgreSQL stats from Munin all point to something database-related changing with the DSpace 5 upgrade around mid-to-late November< / li >
< / ul >
< p > < img src = "2015/12/postgres_bgwriter-year.png" alt = "PostgreSQL bgwriter (year)" / >
< img src = "2015/12/postgres_cache_cgspace-year.png" alt = "PostgreSQL cache (year)" / >
< img src = "2015/12/postgres_locks_cgspace-year.png" alt = "PostgreSQL locks (year)" / >
< img src = "2015/12/postgres_scans_cgspace-year.png" alt = "PostgreSQL scans (year)" / > < / p >
< h2 id = "2015-12-07" > 2015-12-07< / h2 >
< ul >
< li > Atmire sent < a href = "https://github.com/ilri/DSpace/pull/161" > some fixes< / a > for DSpace’s REST API code, which was leaving contexts open (causing the slow performance and database issues)< / li >
< li > After deploying the fix to CGSpace the REST API is consistently faster:< / li >
< / ul >
< pre > < code > $ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.675
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.599
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.588
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.566
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.497
< / code > < / pre >
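< p > To compare this run with the earlier ones in these notes, the three sets of timings can be averaged side by side. A minimal Python sketch, with the values copied from the curl listings for each date (the dictionary layout is just for illustration):< / p >

```python
from statistics import mean

# Response times in seconds for the same REST request, grouped by date.
runs = {
    "2015-12-02": [1.474, 2.141, 1.685, 1.995, 1.786],
    "2015-12-03": [1.368, 0.968, 1.006, 0.849, 0.806, 0.854],
    "2015-12-07": [0.675, 0.599, 0.588, 0.566, 0.497],
}

averages = {date: mean(values) for date, values in runs.items()}
baseline = averages["2015-12-02"]

for date, avg in averages.items():
    print(f"{date}: {avg:.3f}s ({baseline / avg:.1f}x vs 2015-12-02)")
```

< p > The averages come to roughly 1.8, 1.0, and 0.6 seconds, so the request is about three times faster after the REST API fix than it was on 2015-12-02.< / p >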
< h2 id = "2015-12-08" > 2015-12-08< / h2 >
< ul >
< li > Switch CGSpace log compression cron jobs from lzop to xz; in the DSpace Test trial on 2015-12-02 the xz output was less than half the size of the lzop output< / li >
< li > Since we figured out (and fixed) the cause of the performance issue, I reverted Googlebot’s crawl rate to the “Let Google optimize” setting< / li >
< / ul >
< / article >
< / div > <!-- /.blog-main -->
< aside class = "col-sm-3 offset-sm-1 blog-sidebar" >
< section class = "sidebar-module" >
< h4 > Recent Posts< / h4 >
< ol class = "list-unstyled" >
< li > < a href = "/cgspace-notes/2016-09/" > September, 2016< / a > < / li >
< li > < a href = "/cgspace-notes/2016-08/" > August, 2016< / a > < / li >
< li > < a href = "/cgspace-notes/2016-07/" > July, 2016< / a > < / li >
< li > < a href = "/cgspace-notes/2016-06/" > June, 2016< / a > < / li >
< li > < a href = "/cgspace-notes/2016-05/" > May, 2016< / a > < / li >
< / ol >
< / section >
< section class = "sidebar-module" >
< h4 > Links< / h4 >
< ol class = "list-unstyled" >
< li > < a href = "https://cgspace.cgiar.org" > CGSpace< / a > < / li >
< li > < a href = "https://dspacetest.cgiar.org" > DSpace Test< / a > < / li >
< li > < a href = "https://github.com/ilri/DSpace" > CGSpace @ GitHub< / a > < / li >
< / ol >
< / section >
< / aside >
< / div > <!-- /.row -->
< / div > <!-- /.container -->
< footer class = "blog-footer" >
< p >
Blog template built by < a href = 'https://twitter.com/mralanorth' > @mralanorth< / a > .
< / p >
< / footer >
< / body >
< / html >