<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta property="og:title" content="January, 2018" />
<meta property="og:description" content="2018-01-02

Uptime Robot noticed that CGSpace went down and up a few times last night, for a few minutes each time
I didn’t get any load alerts from Linode and the REST and XMLUI logs don’t show anything out of the ordinary
The nginx logs show HTTP 200s until 02/Jan/2018:11:27:17 +0000 when Uptime Robot got an HTTP 500
In dspace.log around that time I see many errors like “Client closed the connection before file download was complete”
And just before that I see this:

Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-980] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:50; busy:50; idle:0; lastwait:5000].

Ah hah! So the pool was actually empty!
I need to increase that, let’s try to bump it up from 50 to 75
After that one client got an HTTP 499 but then the rest were HTTP 200, so I don’t know what the hell Uptime Robot saw
I notice this error quite a few times in dspace.log:

2018-01-02 01:21:19,137 ERROR org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer @ Error while searching for sidebar facets
org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse 'dateIssued_keyword:[1976+TO+1979]': Encountered " "]" "] "" at line 1, column 32.

And there are many of these errors every day for the past month:

$ grep -c "Error while searching for sidebar facets" dspace.log.*
dspace.log.2017-11-21:4
dspace.log.2017-11-22:1
dspace.log.2017-11-23:4
dspace.log.2017-11-24:11
dspace.log.2017-11-25:0
dspace.log.2017-11-26:1
dspace.log.2017-11-27:7
dspace.log.2017-11-28:21
dspace.log.2017-11-29:31
dspace.log.2017-11-30:15
dspace.log.2017-12-01:15
dspace.log.2017-12-02:20
dspace.log.2017-12-03:38
dspace.log.2017-12-04:65
dspace.log.2017-12-05:43
dspace.log.2017-12-06:72
dspace.log.2017-12-07:27
dspace.log.2017-12-08:15
dspace.log.2017-12-09:29
dspace.log.2017-12-10:35
dspace.log.2017-12-11:20
dspace.log.2017-12-12:44
dspace.log.2017-12-13:36
dspace.log.2017-12-14:59
dspace.log.2017-12-15:104
dspace.log.2017-12-16:53
dspace.log.2017-12-17:66
dspace.log.2017-12-18:83
dspace.log.2017-12-19:101
dspace.log.2017-12-20:74
dspace.log.2017-12-21:55
dspace.log.2017-12-22:66
dspace.log.2017-12-23:50
dspace.log.2017-12-24:85
dspace.log.2017-12-25:62
dspace.log.2017-12-26:49
dspace.log.2017-12-27:30
dspace.log.2017-12-28:54
dspace.log.2017-12-29:68
dspace.log.2017-12-30:89
dspace.log.2017-12-31:53
dspace.log.2018-01-01:45
dspace.log.2018-01-02:34

Danny wrote to ask for help renewing the wildcard ilri.org certificate and I advised that we should probably use Let’s Encrypt if it’s just a handful of domains

" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2018-01/" />

<meta property="article:published_time" content="2018-01-02T08:35:54-08:00"/>
<meta property="article:modified_time" content="2018-01-02T09:30:34-08:00"/>

<meta name="twitter:card" content="summary"/><meta name="twitter:title" content="January, 2018"/>
<meta name="twitter:description" content="2018-01-02

Uptime Robot noticed that CGSpace went down and up a few times last night, for a few minutes each time
I didn’t get any load alerts from Linode and the REST and XMLUI logs don’t show anything out of the ordinary
The nginx logs show HTTP 200s until 02/Jan/2018:11:27:17 +0000 when Uptime Robot got an HTTP 500
In dspace.log around that time I see many errors like “Client closed the connection before file download was complete”
And just before that I see this:

Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-980] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:50; busy:50; idle:0; lastwait:5000].

Ah hah! So the pool was actually empty!
I need to increase that, let’s try to bump it up from 50 to 75
After that one client got an HTTP 499 but then the rest were HTTP 200, so I don’t know what the hell Uptime Robot saw
I notice this error quite a few times in dspace.log:

2018-01-02 01:21:19,137 ERROR org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer @ Error while searching for sidebar facets
org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse 'dateIssued_keyword:[1976+TO+1979]': Encountered " "]" "] "" at line 1, column 32.

And there are many of these errors every day for the past month:

$ grep -c "Error while searching for sidebar facets" dspace.log.*
dspace.log.2017-11-21:4
dspace.log.2017-11-22:1
dspace.log.2017-11-23:4
dspace.log.2017-11-24:11
dspace.log.2017-11-25:0
dspace.log.2017-11-26:1
dspace.log.2017-11-27:7
dspace.log.2017-11-28:21
dspace.log.2017-11-29:31
dspace.log.2017-11-30:15
dspace.log.2017-12-01:15
dspace.log.2017-12-02:20
dspace.log.2017-12-03:38
dspace.log.2017-12-04:65
dspace.log.2017-12-05:43
dspace.log.2017-12-06:72
dspace.log.2017-12-07:27
dspace.log.2017-12-08:15
dspace.log.2017-12-09:29
dspace.log.2017-12-10:35
dspace.log.2017-12-11:20
dspace.log.2017-12-12:44
dspace.log.2017-12-13:36
dspace.log.2017-12-14:59
dspace.log.2017-12-15:104
dspace.log.2017-12-16:53
dspace.log.2017-12-17:66
dspace.log.2017-12-18:83
dspace.log.2017-12-19:101
dspace.log.2017-12-20:74
dspace.log.2017-12-21:55
dspace.log.2017-12-22:66
dspace.log.2017-12-23:50
dspace.log.2017-12-24:85
dspace.log.2017-12-25:62
dspace.log.2017-12-26:49
dspace.log.2017-12-27:30
dspace.log.2017-12-28:54
dspace.log.2017-12-29:68
dspace.log.2017-12-30:89
dspace.log.2017-12-31:53
dspace.log.2018-01-01:45
dspace.log.2018-01-02:34

Danny wrote to ask for help renewing the wildcard ilri.org certificate and I advised that we should probably use Let’s Encrypt if it’s just a handful of domains

"/>
<meta name="generator" content="Hugo 0.32.2" />

<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "BlogPosting",
"headline": "January, 2018",
"url": "https://alanorth.github.io/cgspace-notes/2018-01/",
"wordCount": "731",
"datePublished": "2018-01-02T08:35:54-08:00",
"dateModified": "2018-01-02T09:30:34-08:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"keywords": "Notes"
}
</script>

<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/2018-01/">

<title>January, 2018 | CGSpace Notes</title>

<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.css" rel="stylesheet" integrity="sha384-O8wjsnz02XiyrPxnhfF6AVOv6YLBaEGRCnVF+DL3gCPBy9cieyHcpixIrVyD2JS5" crossorigin="anonymous">

</head>
<body>

<div class="blog-masthead">
<div class="container">
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

<header class="blog-header">
<div class="container">
<h1 class="blog-title"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
<p class="lead blog-description">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
</div>
</header>

<div class="container">
<div class="row">
<div class="col-sm-8 blog-main">

<article class="blog-post">
<header>
<h2 class="blog-post-title"><a href="https://alanorth.github.io/cgspace-notes/2018-01/">January, 2018</a></h2>
<p class="blog-post-meta"><time datetime="2018-01-02T08:35:54-08:00">Tue Jan 02, 2018</time> by Alan Orth in
<i class="fa fa-tag" aria-hidden="true"></i> <a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>
</p>
</header>
<h2 id="2018-01-02">2018-01-02</h2>

<ul>
<li>Uptime Robot noticed that CGSpace went down and up a few times last night, for a few minutes each time</li>
<li>I didn’t get any load alerts from Linode and the REST and XMLUI logs don’t show anything out of the ordinary</li>
<li>The nginx logs show HTTP 200s until <code>02/Jan/2018:11:27:17 +0000</code> when Uptime Robot got an HTTP 500</li>
<li>In dspace.log around that time I see many errors like “Client closed the connection before file download was complete”</li>
<li>And just before that I see this:</li>
</ul>

<pre><code>Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-980] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:50; busy:50; idle:0; lastwait:5000].
</code></pre>
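
<p>The exception says the pool maxed out at 50 (size:50; busy:50; idle:0). A quick way to cross-check that against the configured maximum and the live PostgreSQL connection count might look like the following; this is only a sketch, and the Tomcat server.xml path and the postgres user are assumptions about this server:</p>

<pre><code>$ grep -o 'maxActive="[0-9]*"' /etc/tomcat7/server.xml
$ psql -h localhost -U postgres -c 'SELECT count(*) FROM pg_stat_activity;'
</code></pre>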

<ul>
<li>Ah hah! So the pool was actually empty!</li>
<li>I need to increase that, let’s try to bump it up from 50 to 75</li>
<li>After that one client got an HTTP 499 but then the rest were HTTP 200, so I don’t know what the hell Uptime Robot saw</li>
<li>I notice this error quite a few times in dspace.log:</li>
</ul>

<pre><code>2018-01-02 01:21:19,137 ERROR org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer @ Error while searching for sidebar facets
org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse 'dateIssued_keyword:[1976+TO+1979]': Encountered " "]" "] "" at line 1, column 32.
</code></pre>

<ul>
<li>And there are many of these errors every day for the past month:</li>
</ul>

<pre><code>$ grep -c "Error while searching for sidebar facets" dspace.log.*
dspace.log.2017-11-21:4
dspace.log.2017-11-22:1
dspace.log.2017-11-23:4
dspace.log.2017-11-24:11
dspace.log.2017-11-25:0
dspace.log.2017-11-26:1
dspace.log.2017-11-27:7
dspace.log.2017-11-28:21
dspace.log.2017-11-29:31
dspace.log.2017-11-30:15
dspace.log.2017-12-01:15
dspace.log.2017-12-02:20
dspace.log.2017-12-03:38
dspace.log.2017-12-04:65
dspace.log.2017-12-05:43
dspace.log.2017-12-06:72
dspace.log.2017-12-07:27
dspace.log.2017-12-08:15
dspace.log.2017-12-09:29
dspace.log.2017-12-10:35
dspace.log.2017-12-11:20
dspace.log.2017-12-12:44
dspace.log.2017-12-13:36
dspace.log.2017-12-14:59
dspace.log.2017-12-15:104
dspace.log.2017-12-16:53
dspace.log.2017-12-17:66
dspace.log.2017-12-18:83
dspace.log.2017-12-19:101
dspace.log.2017-12-20:74
dspace.log.2017-12-21:55
dspace.log.2017-12-22:66
dspace.log.2017-12-23:50
dspace.log.2017-12-24:85
dspace.log.2017-12-25:62
dspace.log.2017-12-26:49
dspace.log.2017-12-27:30
dspace.log.2017-12-28:54
dspace.log.2017-12-29:68
dspace.log.2017-12-30:89
dspace.log.2017-12-31:53
dspace.log.2018-01-01:45
dspace.log.2018-01-02:34
</code></pre>
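
<p>A quick way to get one total for the whole period is to sum the per-file counts from the same grep (a small sketch using the file:count output format above):</p>

<pre><code>$ grep -c "Error while searching for sidebar facets" dspace.log.* | awk -F: '{sum += $NF} END {print sum}'
</code></pre>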

<ul>
<li>Danny wrote to ask for help renewing the wildcard ilri.org certificate and I advised that we should probably use Let’s Encrypt if it’s just a handful of domains</li>
</ul>
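
<p>For reference, requesting a certificate for a handful of names with certbot would look something like the following; the standalone authenticator and the exact domain list are assumptions here, not something we actually ran:</p>

<pre><code># certbot certonly --standalone -d ilri.org -d www.ilri.org
</code></pre>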

<p></p>

<h2 id="2018-01-03">2018-01-03</h2>

<ul>
<li>I woke up to more up and down of CGSpace, this time UptimeRobot noticed a few rounds of up and down of a few minutes each and Linode also notified of high CPU load from 12 to 2 PM</li>
<li>Looks like I need to increase the database pool size again:</li>
</ul>

<pre><code>$ grep -c "Timeout: Pool empty." dspace.log.2018-01-*
dspace.log.2018-01-01:0
dspace.log.2018-01-02:1972
dspace.log.2018-01-03:1909
</code></pre>
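
<p>It might also be worth seeing which hours the timeouts cluster in, to check whether they line up with the 12 to 2 PM CPU alert from Linode; a rough sketch against one day’s log, relying on the time being the second field of each dspace.log line:</p>

<pre><code>$ grep "Timeout: Pool empty." dspace.log.2018-01-03 | awk '{print $2}' | cut -c1-2 | sort | uniq -c
</code></pre>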

<ul>
<li>For some reason there were a lot of “active” connections last night:</li>
</ul>

<p><img src="/cgspace-notes/2018/01/postgres_connections-day.png" alt="CGSpace PostgreSQL connections" /></p>
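
<p>PostgreSQL can break the connections down by state, which helps distinguish genuinely active queries from connections sitting idle in the pool (a sketch; the database name and user are assumptions):</p>

<pre><code>$ psql -h localhost -U postgres dspace -c 'SELECT state, count(*) FROM pg_stat_activity GROUP BY state ORDER BY 2 DESC;'
</code></pre>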

<ul>
<li>The active IPs in XMLUI are:</li>
</ul>

<pre><code># cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep -E "3/Jan/2018" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
607 40.77.167.141
611 2a00:23c3:8c94:7800:392c:a491:e796:9c50
663 188.226.169.37
759 157.55.39.245
887 68.180.229.254
1037 157.55.39.175
1068 216.244.66.245
1495 66.249.64.91
1934 104.196.152.243
2219 134.155.96.78
</code></pre>

<ul>
<li>134.155.96.78 appears to be at the University of Mannheim in Germany</li>
<li>They identify as: Mozilla/5.0 (compatible; heritrix/3.2.0 +<a href="http://ifm.uni-mannheim.de">http://ifm.uni-mannheim.de</a>)</li>
<li>This appears to be the <a href="https://github.com/internetarchive/heritrix3">Internet Archive’s open source bot</a></li>
<li>They seem to be re-using their Tomcat session so I don’t need to do anything to them just yet:</li>
</ul>

<pre><code>$ grep 134.155.96.78 dspace.log.2018-01-03 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | uniq | wc -l
2
</code></pre>
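
<p>The same session check could be looped over some of the other top XMLUI IPs from the list above to see whether any of them are creating lots of Tomcat sessions (just a sketch):</p>

<pre><code>$ for ip in 104.196.152.243 66.249.64.91 216.244.66.245; do echo $ip; grep $ip dspace.log.2018-01-03 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l; done
</code></pre>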

<ul>
<li>The API logs show the normal users:</li>
</ul>

<pre><code># cat /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E "3/Jan/2018" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
32 207.46.13.182
38 40.77.167.132
38 68.180.229.254
43 66.249.64.91
46 40.77.167.141
49 157.55.39.245
79 157.55.39.175
1533 50.116.102.77
4069 70.32.83.92
9355 45.5.184.196
</code></pre>

<ul>
<li>In other related news I see a sizeable amount of requests coming from python-requests</li>
<li>For example, just in the last day there were 1700!</li>
</ul>

<pre><code># cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep -c python-requests
1773
</code></pre>

<ul>
<li>But they come from hundreds of IPs, many of which are 54.x.x.x:</li>
</ul>

<pre><code># cat /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/library-access.log /var/log/nginx/library-access.log.1 | grep python-requests | awk '{print $1}' | sort -n | uniq -c | sort -h | tail -n 30
9 54.144.87.92
9 54.146.222.143
9 54.146.249.249
9 54.158.139.206
9 54.161.235.224
9 54.163.41.19
9 54.163.4.51
9 54.196.195.107
9 54.198.89.134
9 54.80.158.113
10 54.198.171.98
10 54.224.53.185
10 54.226.55.207
10 54.227.8.195
10 54.242.234.189
10 54.242.238.209
10 54.80.100.66
11 54.161.243.121
11 54.205.154.178
11 54.234.225.84
11 54.87.23.173
11 54.90.206.30
12 54.196.127.62
12 54.224.242.208
12 54.226.199.163
13 54.162.149.249
13 54.211.182.255
19 50.17.61.150
21 54.211.119.107
139 164.39.7.62
</code></pre>

<ul>
<li>I have no idea what these are but they seem to be coming from Amazon…</li>
<li>I guess for now I just have to increase the database connection pool’s max active</li>
<li>It’s currently 75 and normally I’d just bump it by 25 but let me be a bit daring and push it by 50 to 125, because I used to see at least 121 connections in pg_stat_activity before when we were using the shitty default pooling</li>
</ul>
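
<p>Assuming the pool is defined as a JNDI Resource in Tomcat’s server.xml with maxActive="75", the bump to 125 could be as simple as the following sketch; the file path, service name and current value are assumptions, and Tomcat needs a restart to pick it up:</p>

<pre><code>$ sudo sed -i 's/maxActive="75"/maxActive="125"/' /etc/tomcat7/server.xml
$ sudo systemctl restart tomcat7
</code></pre>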

</article>

</div> <!-- /.blog-main -->

<aside class="col-sm-3 ml-auto blog-sidebar">

<section class="sidebar-module">
<h4>Recent Posts</h4>
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2018-01/">January, 2018</a></li>
<li><a href="/cgspace-notes/2017-12/">December, 2017</a></li>
<li><a href="/cgspace-notes/2017-11/">November, 2017</a></li>
<li><a href="/cgspace-notes/2017-10/">October, 2017</a></li>
<li><a href="/cgspace-notes/cgiar-library-migration/">CGIAR Library Migration</a></li>
</ol>
</section>

<section class="sidebar-module">
<h4>Links</h4>
<ol class="list-unstyled">
<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
</ol>
</section>

</aside>

</div> <!-- /.row -->
</div> <!-- /.container -->

<footer class="blog-footer">
<p>
Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
</p>
<p>
<a href="#">Back to top</a>
</p>
</footer>

</body>

</html>