mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-22 11:43:00 +01:00
791 lines
29 KiB
HTML
791 lines
29 KiB
HTML
<!DOCTYPE html>
|
||
<html lang="en">
|
||
|
||
<head>
|
||
<meta charset="utf-8">
|
||
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
|
||
|
||
<meta property="og:title" content="February, 2019" />
|
||
<meta property="og:description" content="2019-02-01
|
||
|
||
|
||
Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high despite me increasing the alert threshold last week from 250% to 275%—I might need to increase it again!
|
||
The top IPs before, during, and after this latest alert tonight were:
|
||
|
||
|
||
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "01/Feb/2019:(17|18|19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||
245 207.46.13.5
|
||
332 54.70.40.11
|
||
385 5.143.231.38
|
||
405 207.46.13.173
|
||
405 207.46.13.75
|
||
1117 66.249.66.219
|
||
1121 35.237.175.180
|
||
1546 5.9.6.51
|
||
2474 45.5.186.2
|
||
5490 85.25.237.71
|
||
|
||
|
||
|
||
85.25.237.71 is the “Linguee Bot” that I first saw last month
|
||
The Solr statistics the past few months have been very high and I was wondering if the web server logs also showed an increase
|
||
There were just over 3 million accesses in the nginx logs last month:
|
||
|
||
|
||
# time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Jan/2019"
|
||
3018243
|
||
|
||
real 0m19.873s
|
||
user 0m22.203s
|
||
sys 0m1.979s
|
||
" />
|
||
<meta property="og:type" content="article" />
|
||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2019-02/" />
|
||
<meta property="article:published_time" content="2019-02-01T21:37:30+02:00"/>
|
||
<meta property="article:modified_time" content="2019-02-10T13:32:53+02:00"/>
|
||
|
||
<meta name="twitter:card" content="summary"/>
|
||
<meta name="twitter:title" content="February, 2019"/>
|
||
<meta name="twitter:description" content="2019-02-01
|
||
|
||
|
||
Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high despite me increasing the alert threshold last week from 250% to 275%—I might need to increase it again!
|
||
The top IPs before, during, and after this latest alert tonight were:
|
||
|
||
|
||
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "01/Feb/2019:(17|18|19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||
245 207.46.13.5
|
||
332 54.70.40.11
|
||
385 5.143.231.38
|
||
405 207.46.13.173
|
||
405 207.46.13.75
|
||
1117 66.249.66.219
|
||
1121 35.237.175.180
|
||
1546 5.9.6.51
|
||
2474 45.5.186.2
|
||
5490 85.25.237.71
|
||
|
||
|
||
|
||
85.25.237.71 is the “Linguee Bot” that I first saw last month
|
||
The Solr statistics the past few months have been very high and I was wondering if the web server logs also showed an increase
|
||
There were just over 3 million accesses in the nginx logs last month:
|
||
|
||
|
||
# time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Jan/2019"
|
||
3018243
|
||
|
||
real 0m19.873s
|
||
user 0m22.203s
|
||
sys 0m1.979s
|
||
"/>
|
||
<meta name="generator" content="Hugo 0.54.0" />
|
||
|
||
|
||
|
||
<script type="application/ld+json">
|
||
{
|
||
"@context": "http://schema.org",
|
||
"@type": "BlogPosting",
|
||
"headline": "February, 2019",
|
||
"url": "https://alanorth.github.io/cgspace-notes/2019-02/",
|
||
"wordCount": "2587",
|
||
"datePublished": "2019-02-01T21:37:30+02:00",
|
||
"dateModified": "2019-02-10T13:32:53+02:00",
|
||
"author": {
|
||
"@type": "Person",
|
||
"name": "Alan Orth"
|
||
},
|
||
"keywords": "Notes"
|
||
}
|
||
</script>
|
||
|
||
|
||
|
||
<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/2019-02/">
|
||
|
||
<title>February, 2019 | CGSpace Notes</title>
|
||
|
||
<!-- combined, minified CSS -->
|
||
<link href="https://alanorth.github.io/cgspace-notes/css/style.css" rel="stylesheet" integrity="sha384-6+EGfPoOzk/n2DVJSlglKT8TV1TgIMvVcKI73IZgBswLasPBn94KommV6ilJqCXE" crossorigin="anonymous">
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
</head>
|
||
|
||
<body>
|
||
|
||
|
||
<div class="blog-masthead">
|
||
<div class="container">
|
||
<nav class="nav blog-nav">
|
||
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
|
||
</nav>
|
||
</div>
|
||
</div>
|
||
|
||
|
||
|
||
|
||
<header class="blog-header">
|
||
<div class="container">
|
||
<h1 class="blog-title"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
|
||
<p class="lead blog-description">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
|
||
</div>
|
||
</header>
|
||
|
||
|
||
|
||
|
||
<div class="container">
|
||
<div class="row">
|
||
<div class="col-sm-8 blog-main">
|
||
|
||
|
||
|
||
|
||
<article class="blog-post">
|
||
<header>
|
||
<h2 class="blog-post-title"><a href="https://alanorth.github.io/cgspace-notes/2019-02/">February, 2019</a></h2>
|
||
<p class="blog-post-meta"><time datetime="2019-02-01T21:37:30+02:00">Fri Feb 01, 2019</time> by Alan Orth in
|
||
|
||
<i class="fa fa-tag" aria-hidden="true"></i> <a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>
|
||
|
||
</p>
|
||
</header>
|
||
<h2 id="2019-02-01">2019-02-01</h2>
|
||
|
||
<ul>
|
||
<li>Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high despite me increasing the alert threshold last week from 250% to 275%—I might need to increase it again!</li>
|
||
<li>The top IPs before, during, and after this latest alert tonight were:</li>
|
||
</ul>
|
||
|
||
<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "01/Feb/2019:(17|18|19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||
245 207.46.13.5
|
||
332 54.70.40.11
|
||
385 5.143.231.38
|
||
405 207.46.13.173
|
||
405 207.46.13.75
|
||
1117 66.249.66.219
|
||
1121 35.237.175.180
|
||
1546 5.9.6.51
|
||
2474 45.5.186.2
|
||
5490 85.25.237.71
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li><code>85.25.237.71</code> is the “Linguee Bot” that I first saw last month</li>
|
||
<li>The Solr statistics the past few months have been very high and I was wondering if the web server logs also showed an increase</li>
|
||
<li>There were just over 3 million accesses in the nginx logs last month:</li>
|
||
</ul>
|
||
|
||
<pre><code># time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Jan/2019"
|
||
3018243
|
||
|
||
real 0m19.873s
|
||
user 0m22.203s
|
||
sys 0m1.979s
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>Normally I’d say this was very high, but <a href="/cgspace-notes/2018-02/">about this time last year</a> I remember thinking the same thing when we had 3.1 million…</li>
|
||
<li>I will have to keep an eye on this to see if there is some error in Solr…</li>
|
||
<li>Atmire sent their <a href="https://github.com/ilri/DSpace/pull/407">pull request to re-enable the Metadata Quality Module (MQM) on our <code>5_x-dev</code> branch</a> today
|
||
|
||
<ul>
|
||
<li>I will test it next week and send them feedback</li>
|
||
</ul></li>
|
||
</ul>
|
||
|
||
<h2 id="2019-02-02">2019-02-02</h2>
|
||
|
||
<ul>
|
||
<li>Another alert from Linode about CGSpace (linode18) this morning, here are the top IPs in the web server logs before, during, and after that time:</li>
|
||
</ul>
|
||
|
||
<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Feb/2019:0(1|2|3|4|5)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||
284 18.195.78.144
|
||
329 207.46.13.32
|
||
417 35.237.175.180
|
||
448 34.218.226.147
|
||
694 2a01:4f8:13b:1296::2
|
||
718 2a01:4f8:140:3192::2
|
||
786 137.108.70.14
|
||
1002 5.9.6.51
|
||
6077 85.25.237.71
|
||
8726 45.5.184.2
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li><code>45.5.184.2</code> is CIAT and <code>85.25.237.71</code> is the new Linguee bot that I first noticed a few days ago</li>
|
||
<li>I will increase the Linode alert threshold from 275 to 300% because this is becoming too much!</li>
|
||
<li>I tested the Atmire Metadata Quality Module (MQM)’s duplicate checked on the some <a href="https://dspacetest.cgiar.org/handle/10568/81268">WLE items</a> that I helped Udana with a few months ago on DSpace Test (linode19) and indeed it found many duplicates!</li>
|
||
</ul>
|
||
|
||
<h2 id="2019-02-03">2019-02-03</h2>
|
||
|
||
<ul>
|
||
<li>This is seriously getting annoying, Linode sent another alert this morning that CGSpace (linode18) load was 377%!</li>
|
||
<li>Here are the top IPs before, during, and after that time:</li>
|
||
</ul>
|
||
|
||
<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "03/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||
325 85.25.237.71
|
||
340 45.5.184.72
|
||
431 5.143.231.8
|
||
756 5.9.6.51
|
||
1048 34.218.226.147
|
||
1203 66.249.66.219
|
||
1496 195.201.104.240
|
||
4658 205.186.128.185
|
||
4658 70.32.83.92
|
||
4852 45.5.184.2
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li><code>45.5.184.2</code> is CIAT, <code>70.32.83.92</code> and <code>205.186.128.185</code> are Macaroni Bros harvesters for CCAFS I think</li>
|
||
<li><code>195.201.104.240</code> is a new IP address in Germany with the following user agent:</li>
|
||
</ul>
|
||
|
||
<pre><code>Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:62.0) Gecko/20100101 Firefox/62.0
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>This user was making 20–60 requests per minute this morning… seems like I should try to block this type of behavior heuristically, regardless of user agent!</li>
|
||
</ul>
|
||
|
||
<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "03/Feb/2019" | grep 195.201.104.240 | grep -o -E '03/Feb/2019:0[0-9]:[0-9][0-9]' | uniq -c | sort -n | tail -n 20
|
||
19 03/Feb/2019:07:42
|
||
20 03/Feb/2019:07:12
|
||
21 03/Feb/2019:07:27
|
||
21 03/Feb/2019:07:28
|
||
25 03/Feb/2019:07:23
|
||
25 03/Feb/2019:07:29
|
||
26 03/Feb/2019:07:33
|
||
28 03/Feb/2019:07:38
|
||
30 03/Feb/2019:07:31
|
||
33 03/Feb/2019:07:35
|
||
33 03/Feb/2019:07:37
|
||
38 03/Feb/2019:07:40
|
||
43 03/Feb/2019:07:24
|
||
43 03/Feb/2019:07:32
|
||
46 03/Feb/2019:07:36
|
||
47 03/Feb/2019:07:34
|
||
47 03/Feb/2019:07:39
|
||
47 03/Feb/2019:07:41
|
||
51 03/Feb/2019:07:26
|
||
59 03/Feb/2019:07:25
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>At least they re-used their Tomcat session!</li>
|
||
</ul>
|
||
|
||
<pre><code>$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=195.201.104.240' dspace.log.2019-02-03 | sort | uniq | wc -l
|
||
1
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>This user was making requests to <code>/browse</code>, which is not currently under the existing rate limiting of dynamic pages in our nginx config
|
||
|
||
<ul>
|
||
<li>I <a href="https://github.com/ilri/rmg-ansible-public/commit/36dfb072d6724fb5cdc81ef79cab08ed9ce427ad">extended the existing <code>dynamicpages</code> (12/m) rate limit to <code>/browse</code> and <code>/discover</code></a> with an allowance for bursting of up to five requests for “real” users</li>
|
||
</ul></li>
|
||
<li>Run all system updates on linode20 and reboot it
|
||
|
||
<ul>
|
||
<li>This will be the new AReS repository explorer server soon</li>
|
||
</ul></li>
|
||
</ul>
|
||
|
||
<h2 id="2019-02-04">2019-02-04</h2>
|
||
|
||
<ul>
|
||
<li>Generate a list of CTA subjects from CGSpace for Peter:</li>
|
||
</ul>
|
||
|
||
<pre><code>dspace=# \copy (SELECT DISTINCT text_value, count(*) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=124 GROUP BY text_value ORDER BY COUNT DESC) to /tmp/cta-subjects.csv with csv header;
|
||
COPY 321
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>Skype with Michael Victor about CKM and CGSpace</li>
|
||
<li>Discuss the new IITA research theme field with Abenet and decide that we should use <code>cg.identifier.iitatheme</code></li>
|
||
<li>This morning there was another alert from Linode about the high load on CGSpace (linode18), here are the top IPs in the web server logs before, during, and after that time:</li>
|
||
</ul>
|
||
|
||
<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "04/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||
589 2a01:4f8:140:3192::2
|
||
762 66.249.66.219
|
||
889 35.237.175.180
|
||
1332 34.218.226.147
|
||
1393 5.9.6.51
|
||
1940 50.116.102.77
|
||
3578 85.25.237.71
|
||
4311 45.5.184.2
|
||
4658 205.186.128.185
|
||
4658 70.32.83.92
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>At this rate I think I just need to stop paying attention to these alerts—DSpace gets thrashed when people use the APIs properly and there’s nothing we can do to improve REST API performance!</li>
|
||
<li>Perhaps I just need to keep increasing the Linode alert threshold (currently 300%) for this host?</li>
|
||
</ul>
|
||
|
||
<h2 id="2019-02-05">2019-02-05</h2>
|
||
|
||
<ul>
|
||
<li>Peter sent me corrections and deletions for the CTA subjects and as usual, there were encoding errors with some accentsÁ in his file</li>
|
||
<li>In other news, it seems that the GREL syntax regarding booleans changed in OpenRefine recently, so I need to update some expressions like the one I use to detect encoding errors to use <code>toString()</code>:</li>
|
||
</ul>
|
||
|
||
<pre><code>or(
|
||
isNotNull(value.match(/.*\uFFFD.*/)),
|
||
isNotNull(value.match(/.*\u00A0.*/)),
|
||
isNotNull(value.match(/.*\u200A.*/)),
|
||
isNotNull(value.match(/.*\u2019.*/)),
|
||
isNotNull(value.match(/.*\u00b4.*/)),
|
||
isNotNull(value.match(/.*\u007e.*/))
|
||
).toString()
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>Testing the corrections for sixty-five items and sixteen deletions using my <a href="https://gist.github.com/alanorth/df92cbfb54d762ba21b28f7cd83b6897">fix-metadata-values.py</a> and <a href="https://gist.github.com/alanorth/bd7d58c947f686401a2b1fadc78736be">delete-metadata-values.py</a> scripts:</li>
|
||
</ul>
|
||
|
||
<pre><code>$ ./fix-metadata-values.py -i 2019-02-04-Correct-65-CTA-Subjects.csv -f cg.subject.cta -t CORRECT -m 124 -db dspace -u dspace -p 'fuu' -d
|
||
$ ./delete-metadata-values.py -i 2019-02-04-Delete-16-CTA-Subjects.csv -f cg.subject.cta -m 124 -db dspace -u dspace -p 'fuu' -d
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>I applied them on DSpace Test and CGSpace and started a full Discovery re-index:</li>
|
||
</ul>
|
||
|
||
<pre><code>$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"
|
||
$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>Peter had marked several terms with <code>||</code> to indicate multiple values in his corrections so I will have to go back and do those manually:</li>
|
||
</ul>
|
||
|
||
<pre><code>EMPODERAMENTO DE JOVENS,EMPODERAMENTO||JOVENS
|
||
ENVIRONMENTAL PROTECTION AND NATURAL RESOURCES MANAGEMENT,NATURAL RESOURCES MANAGEMENT||ENVIRONMENT
|
||
FISHERIES AND AQUACULTURE,FISHERIES||AQUACULTURE
|
||
MARKETING AND TRADE,MARKETING||TRADE
|
||
MARKETING ET COMMERCE,MARKETING||COMMERCE
|
||
NATURAL RESOURCES AND ENVIRONMENT,NATURAL RESOURCES MANAGEMENT||ENVIRONMENT
|
||
PÊCHES ET AQUACULTURE,PÊCHES||AQUACULTURE
|
||
PESCAS E AQUACULTURE,PISCICULTURA||AQUACULTURE
|
||
</code></pre>
|
||
|
||
<h2 id="2019-02-06">2019-02-06</h2>
|
||
|
||
<ul>
|
||
<li>I dumped the CTA community so I can try to fix the subjects with multiple subjects that Peter indicated in his corrections:</li>
|
||
</ul>
|
||
|
||
<pre><code>$ dspace metadata-export -i 10568/42211 -f /tmp/cta.csv
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>Then I used <code>csvcut</code> to get only the CTA subject columns:</li>
|
||
</ul>
|
||
|
||
<pre><code>$ csvcut -c "id,collection,cg.subject.cta,cg.subject.cta[],cg.subject.cta[en_US]" /tmp/cta.csv > /tmp/cta-subjects.csv
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>After that I imported the CSV into OpenRefine where I could properly identify and edit the subjects as multiple values</li>
|
||
<li>Then I imported it back into CGSpace:</li>
|
||
</ul>
|
||
|
||
<pre><code>$ dspace metadata-import -f /tmp/2019-02-06-CTA-multiple-subjects.csv
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>Another day, another alert about high load on CGSpace (linode18) from Linode</li>
|
||
<li>This time the load average was 370% and the top ten IPs before, during, and after that time were:</li>
|
||
</ul>
|
||
|
||
<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "06/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||
689 35.237.175.180
|
||
1236 5.9.6.51
|
||
1305 34.218.226.147
|
||
1580 66.249.66.219
|
||
1939 50.116.102.77
|
||
2313 108.212.105.35
|
||
4666 205.186.128.185
|
||
4666 70.32.83.92
|
||
4950 85.25.237.71
|
||
5158 45.5.186.2
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>Looking closer at the top users, I see <code>45.5.186.2</code> is in Brazil and was making over 100 requests per minute to the REST API:</li>
|
||
</ul>
|
||
|
||
<pre><code># zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 | grep 45.5.186.2 | grep -o -E '06/Feb/2019:0[0-9]:[0-9][0-9]' | uniq -c | sort -n | tail -n 10
|
||
118 06/Feb/2019:05:46
|
||
119 06/Feb/2019:05:37
|
||
119 06/Feb/2019:05:47
|
||
120 06/Feb/2019:05:43
|
||
120 06/Feb/2019:05:44
|
||
121 06/Feb/2019:05:38
|
||
122 06/Feb/2019:05:39
|
||
125 06/Feb/2019:05:42
|
||
126 06/Feb/2019:05:40
|
||
126 06/Feb/2019:05:41
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>I was thinking of rate limiting those because I assumed most of them would be errors, but actually most are HTTP 200 OK!</li>
|
||
</ul>
|
||
|
||
<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E '06/Feb/2019' | grep 45.5.186.2 | awk '{print $9}' | sort | uniq -c
|
||
10411 200
|
||
1 301
|
||
7 302
|
||
3 404
|
||
18 499
|
||
2 500
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>I should probably start looking at the top IPs for web (XMLUI) and for API (REST and OAI) separately:</li>
|
||
</ul>
|
||
|
||
<pre><code># zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "06/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||
328 220.247.212.35
|
||
372 66.249.66.221
|
||
380 207.46.13.2
|
||
519 2a01:4f8:140:3192::2
|
||
572 5.143.231.8
|
||
689 35.237.175.180
|
||
771 108.212.105.35
|
||
1236 5.9.6.51
|
||
1554 66.249.66.219
|
||
4942 85.25.237.71
|
||
# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "06/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||
10 66.249.66.221
|
||
26 66.249.66.219
|
||
69 5.143.231.8
|
||
340 45.5.184.72
|
||
1040 34.218.226.147
|
||
1542 108.212.105.35
|
||
1937 50.116.102.77
|
||
4661 205.186.128.185
|
||
4661 70.32.83.92
|
||
5102 45.5.186.2
|
||
</code></pre>
|
||
|
||
<h2 id="2019-02-07">2019-02-07</h2>
|
||
|
||
<ul>
|
||
<li>Linode sent an alert last night that the load on CGSpace (linode18) was over 300%</li>
|
||
<li>Here are the top IPs in the web server and API logs before, during, and after that time, respectively:</li>
|
||
</ul>
|
||
|
||
<pre><code># zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "06/Feb/2019:(17|18|19|20|23)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||
5 66.249.66.209
|
||
6 2a01:4f8:210:51ef::2
|
||
6 40.77.167.75
|
||
9 104.198.9.108
|
||
9 157.55.39.192
|
||
10 157.55.39.244
|
||
12 66.249.66.221
|
||
20 95.108.181.88
|
||
27 66.249.66.219
|
||
2381 45.5.186.2
|
||
# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "06/Feb/2019:(17|18|19|20|23)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||
455 45.5.186.2
|
||
506 40.77.167.75
|
||
559 54.70.40.11
|
||
825 157.55.39.244
|
||
871 2a01:4f8:140:3192::2
|
||
938 157.55.39.192
|
||
1058 85.25.237.71
|
||
1416 5.9.6.51
|
||
1606 66.249.66.219
|
||
1718 35.237.175.180
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>Then again this morning another alert:</li>
|
||
</ul>
|
||
|
||
<pre><code># zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "07/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||
5 66.249.66.223
|
||
8 104.198.9.108
|
||
13 110.54.160.222
|
||
24 66.249.66.219
|
||
25 175.158.217.98
|
||
214 34.218.226.147
|
||
346 45.5.184.72
|
||
4529 45.5.186.2
|
||
4661 205.186.128.185
|
||
4661 70.32.83.92
|
||
# zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "07/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||
145 157.55.39.237
|
||
154 66.249.66.221
|
||
214 34.218.226.147
|
||
261 35.237.175.180
|
||
273 2a01:4f8:140:3192::2
|
||
300 169.48.66.92
|
||
487 5.143.231.39
|
||
766 5.9.6.51
|
||
771 85.25.237.71
|
||
848 66.249.66.219
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>So it seems that the load issue comes from the REST API, not the XMLUI</li>
|
||
<li>I could probably rate limit the REST API, or maybe just keep increasing the alert threshold so I don’t get alert spam (this is probably the correct approach because it seems like the REST API can keep up with the requests and is returning HTTP 200 status as far as I can tell)</li>
|
||
<li>Bosede from IITA sent a message that a colleague is having problems submitting to some collections in their community:</li>
|
||
</ul>
|
||
|
||
<pre><code>Authorization denied for action WORKFLOW_STEP_1 on COLLECTION:1056 by user 1759
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>Collection 1056 appears to be <a href="https://cgspace.cgiar.org/handle/10568/68741">IITA Posters and Presentations</a> and I see that its workflow step 1 (Accept/Reject) is empty:</li>
|
||
</ul>
|
||
|
||
<p><img src="/cgspace-notes/2019/02/iita-workflow-step1-empty.png" alt="IITA Posters and Presentations workflow step 1 empty" /></p>
|
||
|
||
<ul>
|
||
<li>IITA editors or approvers should be added to that step (though I’m curious why nobody is in that group currently)</li>
|
||
<li>Abenet says we are not using the “Accept/Reject” step so this group should be deleted</li>
|
||
<li>Bizuwork asked about the “DSpace Submission Approved and Archived” emails that stopped working last month</li>
|
||
<li>I tried the <code>test-email</code> command on DSpace and it indeed is not working:</li>
|
||
</ul>
|
||
|
||
<pre><code>$ dspace test-email
|
||
|
||
About to send test email:
|
||
- To: aorth@mjanja.ch
|
||
- Subject: DSpace test email
|
||
- Server: smtp.serv.cgnet.com
|
||
|
||
Error sending email:
|
||
- Error: javax.mail.MessagingException: Could not connect to SMTP host: smtp.serv.cgnet.com, port: 25;
|
||
nested exception is:
|
||
java.net.ConnectException: Connection refused (Connection refused)
|
||
|
||
Please see the DSpace documentation for assistance.
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>I can’t connect to TCP port 25 on that server so I sent a mail to CGNET support to ask what’s up</li>
|
||
<li>CGNET said these servers were discontinued in 2018-01 and that I should use <a href="https://docs.microsoft.com/en-us/exchange/mail-flow-best-practices/how-to-set-up-a-multifunction-device-or-application-to-send-email-using-office-3">Office 365</a></li>
|
||
</ul>
|
||
|
||
<h2 id="2019-02-08">2019-02-08</h2>
|
||
|
||
<ul>
|
||
<li>I re-configured CGSpace to use the email/password for cgspace-support, but I get this error when I try the <code>test-email</code> script:</li>
|
||
</ul>
|
||
|
||
<pre><code>Error sending email:
|
||
- Error: com.sun.mail.smtp.SMTPSendFailedException: 530 5.7.57 SMTP; Client was not authenticated to send anonymous mail during MAIL FROM [AM6PR10CA0028.EURPRD10.PROD.OUTLOOK.COM]
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>I tried to log into Outlook 365 with the credentials but I think the ones I have must be wrong, so I will ask ICT to reset the password</li>
|
||
</ul>
|
||
|
||
<h2 id="2019-02-09">2019-02-09</h2>
|
||
|
||
<ul>
|
||
<li>Linode sent alerts about CPU load yesterday morning, yesterday night, and this morning! All over 300% CPU load!</li>
|
||
<li>This is just for this morning:</li>
|
||
</ul>
|
||
|
||
<pre><code># zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "09/Feb/2019:(07|08|09|10|11)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||
289 35.237.175.180
|
||
290 66.249.66.221
|
||
296 18.195.78.144
|
||
312 207.46.13.201
|
||
393 207.46.13.64
|
||
526 2a01:4f8:140:3192::2
|
||
580 151.80.203.180
|
||
742 5.143.231.38
|
||
1046 5.9.6.51
|
||
1331 66.249.66.219
|
||
# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "09/Feb/2019:(07|08|09|10|11)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||
4 66.249.83.30
|
||
5 49.149.10.16
|
||
8 207.46.13.64
|
||
9 207.46.13.201
|
||
11 105.63.86.154
|
||
11 66.249.66.221
|
||
31 66.249.66.219
|
||
297 2001:41d0:d:1990::
|
||
908 34.218.226.147
|
||
1947 50.116.102.77
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>I know 66.249.66.219 is Google, 5.9.6.51 is MegaIndex, and 5.143.231.38 is SputnikBot</li>
|
||
<li>Ooh, but 151.80.203.180 is some malicious bot making requests for <code>/etc/passwd</code> like this:</li>
|
||
</ul>
|
||
|
||
<pre><code>/bitstream/handle/10568/68981/Identifying%20benefit%20flows%20studies%20on%20the%20potential%20monetary%20and%20non%20monetary%20benefits%20arising%20from%20the%20International%20Treaty%20on%20Plant%20Genetic_1671.pdf?sequence=1&amp;isAllowed=../etc/passwd
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>151.80.203.180 is on OVH so I sent a message to their abuse email…</li>
|
||
</ul>
|
||
|
||
<h2 id="2019-02-10">2019-02-10</h2>
|
||
|
||
<ul>
|
||
<li>Linode sent another alert about CGSpace (linode18) CPU load this morning, here are the top IPs in the web server XMLUI and API logs before, during, and after that time:</li>
|
||
</ul>
|
||
|
||
<pre><code># zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "10/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||
232 18.195.78.144
|
||
238 35.237.175.180
|
||
281 66.249.66.221
|
||
314 151.80.203.180
|
||
319 34.218.226.147
|
||
326 40.77.167.178
|
||
352 157.55.39.149
|
||
444 2a01:4f8:140:3192::2
|
||
1171 5.9.6.51
|
||
1196 66.249.66.219
|
||
# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "10/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
||
6 112.203.241.69
|
||
7 157.55.39.149
|
||
9 40.77.167.178
|
||
15 66.249.66.219
|
||
368 45.5.184.72
|
||
432 50.116.102.77
|
||
971 34.218.226.147
|
||
4403 45.5.186.2
|
||
4668 205.186.128.185
|
||
4668 70.32.83.92
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>Another interesting thing might be the total number of requests for web and API services during that time:</li>
|
||
</ul>
|
||
|
||
<pre><code># zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -cE "10/Feb/2019:0(5|6|7|8|9)"
|
||
16333
|
||
# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -cE "10/Feb/2019:0(5|6|7|8|9)"
|
||
15964
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>Also, the number of unique IPs served during that time:</li>
|
||
</ul>
|
||
|
||
<pre><code># zcat --force /var/log/nginx/{access,error,library-access}.log /var/log/nginx/{access,error,library-access}.log.1 | grep -E "10/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq | wc -l
|
||
1622
|
||
# zcat --force /var/log/nginx/{oai,rest,statistics}.log /var/log/nginx/{oai,rest,statistics}.log.1 | grep -E "10/Feb/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq | wc -l
|
||
95
|
||
</code></pre>
|
||
|
||
<ul>
|
||
<li>It’s very clear to me now that the API requests are the heaviest!</li>
|
||
<li>I think I need to increase the Linode alert threshold from 300 to 350% now so I stop getting some of these alerts—it’s becoming a bit of <em>the boy who cried wolf</em> because it alerts like clockwork twice per day!</li>
|
||
<li>Add my Python- and shell-based metadata workflow helper scripts as well as the environment settings for pipenv to our DSpace repository (<a href="https://github.com/ilri/DSpace/pull/408">#408</a>) so I can track changes and distribute them more formally instead of just keeping them <a href="https://github.com/ilri/DSpace/wiki/Scripts">collected on the wiki</a></li>
|
||
<li>Started adding IITA research theme (<code>cg.identifier.iitatheme</code>) to CGSpace
|
||
|
||
<ul>
|
||
<li>I’m still waiting for feedback from IITA whether they actually want to use “SOCIAL SCIENCE & AGRIC BUSINESS” because it is listed as <a href="http://www.iita.org/project-discipline/social-science-and-agribusiness/">“Social Science and Agribusiness”</a> on their website</li>
|
||
<li>Also, I think they want to do some mappings of items with existing subjects to these new themes</li>
|
||
</ul></li>
|
||
<li>Update ILRI author name style in the controlled vocabulary (Domelevo Entfellner, Jean-Baka) (<a href="https://github.com/ilri/DSpace/pull/409">#409</a>)
|
||
|
||
<ul>
|
||
<li>I’m still waiting to hear from Bizuwork whether we’ll batch update all existing items with the old name style</li>
|
||
</ul></li>
|
||
<li>Last week Hector Tobon from CCAFS asked me about the Creative Commons 3.0 Intergovernmental Organizations (IGO) license because it is not in the list of SPDX licenses
|
||
|
||
<ul>
|
||
<li>Today I made a request to the <a href="https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md">SPDX using their web form</a> to include this class of Creative Commons licenses](<a href="https://wiki.creativecommons.org/wiki/Intergovernmental_Organizations">https://wiki.creativecommons.org/wiki/Intergovernmental_Organizations</a>)</li>
|
||
</ul></li>
|
||
</ul>
|
||
|
||
<!-- vim: set sw=2 ts=2: -->
|
||
|
||
|
||
|
||
|
||
|
||
</article>
|
||
|
||
|
||
|
||
</div> <!-- /.blog-main -->
|
||
|
||
<aside class="col-sm-3 ml-auto blog-sidebar">
|
||
|
||
|
||
|
||
<section class="sidebar-module">
|
||
<h4>Recent Posts</h4>
|
||
<ol class="list-unstyled">
|
||
|
||
|
||
<li><a href="/cgspace-notes/2019-02/">February, 2019</a></li>
|
||
|
||
<li><a href="/cgspace-notes/2019-01/">January, 2019</a></li>
|
||
|
||
<li><a href="/cgspace-notes/2018-12/">December, 2018</a></li>
|
||
|
||
<li><a href="/cgspace-notes/2018-11/">November, 2018</a></li>
|
||
|
||
<li><a href="/cgspace-notes/2018-10/">October, 2018</a></li>
|
||
|
||
</ol>
|
||
</section>
|
||
|
||
|
||
|
||
|
||
<section class="sidebar-module">
|
||
<h4>Links</h4>
|
||
<ol class="list-unstyled">
|
||
|
||
<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
|
||
|
||
<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
|
||
|
||
<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
|
||
|
||
</ol>
|
||
</section>
|
||
|
||
</aside>
|
||
|
||
|
||
</div> <!-- /.row -->
|
||
</div> <!-- /.container -->
|
||
|
||
|
||
|
||
<footer class="blog-footer">
|
||
<p>
|
||
|
||
Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
|
||
|
||
</p>
|
||
<p>
|
||
<a href="#">Back to top</a>
|
||
</p>
|
||
</footer>
|
||
|
||
|
||
</body>
|
||
|
||
</html>
|