<!DOCTYPE html>
< html lang = "en-us" >
< head prefix = "og: http://ogp.me/ns#" >
< meta charset = "utf-8" / >
< meta name = "viewport" content = "width=device-width, initial-scale=1.0, maximum-scale=1" / >
< meta property = "og:title" content = " December, 2015 · CGSpace Notes" / >
< meta property = "og:site_name" content = "CGSpace Notes" / >
< meta property = "og:url" content = "/cgspace-notes/2015-12/" / >
< meta property = "og:type" content = "article" / >
< meta property = "og:article:published_time" content = "2015-12-02T13:18:00+03:00" / >
< meta property = "og:article:tag" content = "notes" / >
< title >
December, 2015 · CGSpace Notes
< / title >
< link rel = "stylesheet" href = "/cgspace-notes/css/bootstrap.min.css" / >
< link rel = "stylesheet" href = "/cgspace-notes/css/main.css" / >
< link rel = "stylesheet" href = "/cgspace-notes/css/font-awesome.min.css" / >
< link rel = "stylesheet" href = "/cgspace-notes/css/github.css" / >
< link rel = "stylesheet" href = "//fonts.googleapis.com/css?family=Source+Sans+Pro:200,300,400" type = "text/css" >
< link rel = "shortcut icon" href = "/cgspace-notes/images/favicon.ico" / >
< link rel = "apple-touch-icon" href = "/cgspace-notes/images/apple-touch-icon.png" / >
< / head >
< body >
< header class = "global-header" style = "background-image:url(../images/bg.jpg )" >
< section class = "header-text" >
< h1 > < a href = "/cgspace-notes/" > CGSpace Notes< / a > < / h1 >
< div class = "sns-links hidden-print" >
< / div >
< a href = "/cgspace-notes/" class = "btn-header btn-back hidden-xs" >
< i class = "fa fa-angle-left" aria-hidden = "true" > < / i >
Home
< / a >
< / section >
< / header >
< main class = "container" >
< article >
< header >
< h1 class = "text-primary" > December, 2015< / h1 >
< div class = "post-meta clearfix" >
< div class = "post-date pull-left" >
Posted on
< time datetime = "2015-12-02T13:18:00+03:00" >
Dec 2, 2015
< / time >
< / div >
< div class = "pull-right" >
< span class = "post-tag small" > < a href = "/cgspace-notes//tags/notes" > #notes< / a > < / span >
< / div >
< / div >
< / header >
< section >
< h2 id = "2015-12-02:012a628feed6d64ae1151cbd6151ccd6" > 2015-12-02< / h2 >
< ul >
< li > Replace < code > lzop< / code > with < code > xz< / code > in log compression cron jobs on DSpace Test, because the compressed files are smaller:< / li >
< / ul >
< pre > < code > # cd /home/dspacetest.cgiar.org/log
# ls -lh dspace.log.2015-11-18*
-rw-rw-r-- 1 tomcat7 tomcat7 2.0M Nov 18 23:59 dspace.log.2015-11-18
-rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo
-rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz
< / code > < / pre >
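< p > The comparison above can be repeated for any log file. The following is a minimal sketch rather than the actual cron job: < code > size_report< / code > is a hypothetical helper name, and the example commands in the comments assume that < code > xz< / code > and < code > lzop< / code > are installed.< / p >

```shell
#!/bin/sh
# size_report ORIGINAL COMPRESSED...
# Print each compressed file's size in bytes and as a percentage of the
# original file's size. (Hypothetical helper, not part of DSpace.)
size_report() {
    orig="$1"
    shift
    orig_bytes=$(wc -c < "$orig")
    for f in "$@"; do
        bytes=$(wc -c < "$f")
        awk -v name="$f" -v b="$bytes" -v o="$orig_bytes" \
            'BEGIN { printf "%s %d bytes (%.1f%% of original)\n", name, b, 100 * b / o }'
    done
}

# Example (assumes xz and lzop are installed; xz -k keeps the input file,
# lzop keeps it by default):
#   xz -k dspace.log.2015-11-18
#   lzop dspace.log.2015-11-18
#   size_report dspace.log.2015-11-18 dspace.log.2015-11-18.lzo dspace.log.2015-11-18.xz
```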
< ul >
< li > I had used lrzip once, but it needs more memory and is harder to use as it requires the lrztar wrapper< / li >
< li > Need to remember to check that everything is OK in a few days and then make the same change on CGSpace< / li >
< li > CGSpace went down again (due to PostgreSQL idle connections of course)< / li >
< li > Current database settings for DSpace are < code > db.maxconnections = 30< / code > and < code > db.maxidle = 8< / code > , yet idle connections are exceeding this:< / li >
< / ul >
< pre > < code > $ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
39
< / code > < / pre >
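< p > The idle-connection check above could be wrapped in a small script for repeated use. This is a sketch under assumptions: < code > count_idle< / code > and < code > check_idle< / code > are hypothetical helper names, and the limit passed in the example is the < code > db.maxconnections< / code > value quoted above.< / p >

```shell
#!/bin/sh
# count_idle DATABASE
# Read pg_stat_activity rows on stdin and print how many mention both the
# database name and "idle". Like the grep pipeline, this also counts
# "idle in transaction" sessions. (Hypothetical helper name.)
count_idle() {
    awk -v db="$1" 'index($0, db) && index($0, "idle") { n++ } END { print n + 0 }'
}

# check_idle COUNT LIMIT
# Print a warning and return 1 when COUNT exceeds LIMIT.
check_idle() {
    if [ "$1" -gt "$2" ]; then
        echo "WARNING: $1 idle connections (limit $2)"
        return 1
    fi
    echo "OK: $1 idle connections (limit $2)"
}

# Example, assuming psql can connect as in the command above:
#   idle=$(psql -c 'SELECT * from pg_stat_activity;' | count_idle cgspace)
#   check_idle "$idle" 30    # 30 is db.maxconnections from the DSpace settings
```

< p > Unlike < code > grep -c< / code > , which exits with a non-zero status when nothing matches, the awk version always prints a number, which makes it easier to use in a cron job.< / p >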
< ul >
< li > I restarted PostgreSQL and Tomcat and it’s back up< / li >
< li > On the related question of why CGSpace is so slow, I finally decided to try the < code > pgtune< / code > script to tune the PostgreSQL settings:< / li >
< / ul >
< pre > < code > # apt-get install pgtune
# pgtune -i /etc/postgresql/9.3/main/postgresql.conf -o postgresql.conf-pgtune
# mv /etc/postgresql/9.3/main/postgresql.conf /etc/postgresql/9.3/main/postgresql.conf.orig
# mv postgresql.conf-pgtune /etc/postgresql/9.3/main/postgresql.conf
< / code > < / pre >
< ul >
< li > It introduced the following new settings:< / li >
< / ul >
< pre > < code > default_statistics_target = 50
maintenance_work_mem = 480MB
constraint_exclusion = on
checkpoint_completion_target = 0.9
effective_cache_size = 5632MB
work_mem = 48MB
wal_buffers = 8MB
checkpoint_segments = 16
shared_buffers = 1920MB
max_connections = 80
< / code > < / pre >
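< p > One way to list what a tuned configuration adds is to compare the active (uncommented) settings of the two files. The following is a sketch, not necessarily how the list above was produced; < code > active_settings< / code > and < code > new_settings< / code > are hypothetical helper names, and the comment stripping is naive (it would truncate a value containing a quoted < code > #< / code > ).< / p >

```shell
#!/bin/sh
# active_settings FILE
# Print the uncommented lines of a PostgreSQL config file, with trailing
# comments and whitespace removed, sorted. (Hypothetical helper name.)
active_settings() {
    sed -e 's/#.*$//' -e 's/[[:space:]]*$//' -e '/^$/d' "$1" | sort
}

# new_settings ORIGINAL TUNED
# Print settings that are active in TUNED but not in ORIGINAL.
new_settings() {
    before=$(mktemp)
    after=$(mktemp)
    active_settings "$1" > "$before"
    active_settings "$2" > "$after"
    # comm -13 keeps only the lines unique to the second sorted file
    comm -13 "$before" "$after"
    rm -f "$before" "$after"
}

# Example, using the paths from the commands above:
#   new_settings /etc/postgresql/9.3/main/postgresql.conf.orig \
#                /etc/postgresql/9.3/main/postgresql.conf
```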
< ul >
< li > Now I need to read the PostgreSQL docs about these options and watch memory usage in Munin< / li >
< li > For what it’s worth, the REST API should now be faster (because of these PostgreSQL tweaks):< / li >
< / ul >
< pre > < code > $ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.474
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
2.141
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.685
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.995
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.786
< / code > < / pre >
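< p > Repeating the request and averaging by hand can be scripted. This is a minimal sketch; < code > average< / code > and < code > time_requests< / code > are hypothetical helper names, and the request count is an arbitrary choice.< / p >

```shell
#!/bin/sh
# average
# Read one number per line on stdin and print the mean to three decimals.
average() {
    awk '{ sum += $1; n++ } END { if (n > 0) printf "%.3f\n", sum / n }'
}

# time_requests URL COUNT
# Print curl's total time for COUNT sequential requests, one per line.
time_requests() {
    i=0
    while [ "$i" -lt "$2" ]; do
        curl -o /dev/null -s -w '%{time_total}\n' "$1"
        i=$((i + 1))
    done
}

# Example (performs real HTTP requests):
#   time_requests 'https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all' 5 | average
```

< p > Fed the five timings above, < code > average< / code > prints 1.816.< / p >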
< ul >
< li > Last week the average was 8 seconds; now it is about < sup > 1< / sup > ⁄ < sub > 4< / sub > of that< / li >
< li > CCAFS noticed that one of their items displays only the Atmire statlets: < a href = "https://cgspace.cgiar.org/handle/10568/42445" > https://cgspace.cgiar.org/handle/10568/42445< / a > < / li >
< / ul >
< p > < img src = "../images/2015/12/ccafs-item-no-metadata.png" alt = "CCAFS item" / > < / p >
< ul >
< li > The authorizations for the item are all public READ, and I don’t see any errors in dspace.log when browsing that item< / li >
< li > I filed a ticket on Atmire’s issue tracker< / li >
< li > I also filed a ticket on Atmire’s issue tracker for the PostgreSQL stuff< / li >
< / ul >
< h2 id = "2015-12-03:012a628feed6d64ae1151cbd6151ccd6" > 2015-12-03< / h2 >
< ul >
< li > CGSpace is very slow, and monitoring is emailing me to say it’s down, even though I can load the page (very slowly)< / li >
< li > Idle postgres connections look like this (with no change in DSpace db settings lately):< / li >
< / ul >
< pre > < code > $ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
29
< / code > < / pre >
< ul >
< li > I restarted Tomcat and postgres… < / li >
< li > Atmire commented that we should raise the JVM heap size by ~500M, so it is now < code > -Xms3584m -Xmx3584m< / code > < / li >
< li > We weren’t out of heap yet, but it’s fair to expect that the DSpace 5 upgrade (and the new Atmire modules) requires more memory, so it’s OK< / li >
< li > A possible side effect is that the REST API is now about twice as fast for the same request as above:< / li >
< / ul >
< pre > < code > $ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.368
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.968
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
1.006
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.849
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.806
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.854
< / code > < / pre >
< h2 id = "2015-12-05:012a628feed6d64ae1151cbd6151ccd6" > 2015-12-05< / h2 >
< ul >
< li > CGSpace has been up and down all day and REST API is completely unresponsive< / li >
< li > PostgreSQL idle connections are currently:< / li >
< / ul >
< pre > < code > postgres@linode01:~$ psql -c 'SELECT * from pg_stat_activity;' | grep cgspace | grep -c idle
28
< / code > < / pre >
< ul >
< li > I have reverted all the pgtune tweaks from the other day, as they didn’t fix the stability issues, so I’d rather not have them introducing more variables into the equation< / li >
< li > The PostgreSQL stats from Munin all point to something database-related that changed with the DSpace 5 upgrade around mid- to late November< / li >
< / ul >
< p > < img src = "../images/2015/12/postgres_bgwriter-year.png" alt = "PostgreSQL bgwriter (year)" / >
< img src = "../images/2015/12/postgres_cache_cgspace-year.png" alt = "PostgreSQL cache (year)" / >
< img src = "../images/2015/12/postgres_locks_cgspace-year.png" alt = "PostgreSQL locks (year)" / >
< img src = "../images/2015/12/postgres_scans_cgspace-year.png" alt = "PostgreSQL scans (year)" / > < / p >
< h2 id = "2015-12-07:012a628feed6d64ae1151cbd6151ccd6" > 2015-12-07< / h2 >
< ul >
< li > Atmire sent < a href = "https://github.com/ilri/DSpace/pull/161" > some fixes< / a > for DSpace’s REST API code, which was leaving contexts open (causing the slow performance and database issues)< / li >
< li > After deploying the fix to CGSpace the REST API is consistently faster:< / li >
< / ul >
< pre > < code > $ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.675
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.599
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.588
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.566
$ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle/10568/32802?expand=all
0.497
< / code > < / pre >
< h2 id = "2015-12-08:012a628feed6d64ae1151cbd6151ccd6" > 2015-12-08< / h2 >
< ul >
< li > Switch CGSpace log compression cron jobs from lzop to xz; xz is slower and uses more CPU than lzop, but the compressed logs are much smaller (169K versus 387K in the DSpace Test comparison on 2015-12-02)< / li >
< li > Since we figured out (and fixed) the cause of the performance issue, I reverted Googlebot’s crawl rate to the “Let Google optimize” setting< / li >
< / ul >
< / section >
< footer >
< section class = "author-info row" >
< div class = "author-avatar col-md-2" >
< / div >
< div class = "author-meta col-md-6" >
< h1 class = "author-name text-primary" > Alan Orth< / h1 >
< / div >
< / section >
< ul class = "pager" >
< li class = "previous" > < a href = "/cgspace-notes/2015-11/" > < span aria-hidden = "true" > ← < / span > Older< / a > < / li >
< li class = "next" > < a href = "/cgspace-notes/2016-01/" > Newer < span aria-hidden = "true" > → < / span > < / a > < / li >
< / ul >
< / footer >
< / article >
< / main >
< footer class = "container global-footer" >
< div class = "copyright-note pull-left" >
< / div >
< div class = "sns-links hidden-print" >
< / div >
< / footer >
< script src = "/cgspace-notes/js/highlight.pack.js" > < / script >
< script >
hljs.initHighlightingOnLoad();
< / script >
< / body >
< / html >