<!DOCTYPE html>
|
|
<html lang="en">
|
|
|
|
<head>
|
|
<meta charset="utf-8">
|
|
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
|
|
|
|
<meta property="og:title" content="February, 2019" />
|
|
<meta property="og:description" content="2019-02-01
|
|
|
|
|
|
Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high despite me increasing the alert threshold last week from 250% to 275%—I might need to increase it again!
|
|
The top IPs before, during, and after this latest alert tonight were:
|
|
|
|
|
|
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "01/Feb/2019:(17|18|19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
|
245 207.46.13.5
|
|
332 54.70.40.11
|
|
385 5.143.231.38
|
|
405 207.46.13.173
|
|
405 207.46.13.75
|
|
1117 66.249.66.219
|
|
1121 35.237.175.180
|
|
1546 5.9.6.51
|
|
2474 45.5.186.2
|
|
5490 85.25.237.71
|
|
|
|
|
|
|
|
85.25.237.71 is the “Linguee Bot” that I first saw last month
|
|
The Solr statistics the past few months have been very high and I was wondering if the web server logs also showed an increase
|
|
There were just over 3 million accesses in the nginx logs last month:
|
|
|
|
|
|
# time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Jan/2019"
|
|
3018243
|
|
|
|
real 0m19.873s
|
|
user 0m22.203s
|
|
sys 0m1.979s
|
|
" />
|
|
<meta property="og:type" content="article" />
|
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2019-02/" /><meta property="article:published_time" content="2019-02-01T21:37:30+02:00"/>
|
|
<meta property="article:modified_time" content="2019-02-02T00:01:39+02:00"/>
|
|
|
|
<meta name="twitter:card" content="summary"/>
|
|
<meta name="twitter:title" content="February, 2019"/>
|
|
<meta name="twitter:description" content="2019-02-01
|
|
|
|
|
|
Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high despite me increasing the alert threshold last week from 250% to 275%—I might need to increase it again!
|
|
The top IPs before, during, and after this latest alert tonight were:
|
|
|
|
|
|
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "01/Feb/2019:(17|18|19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
|
|
245 207.46.13.5
|
|
332 54.70.40.11
|
|
385 5.143.231.38
|
|
405 207.46.13.173
|
|
405 207.46.13.75
|
|
1117 66.249.66.219
|
|
1121 35.237.175.180
|
|
1546 5.9.6.51
|
|
2474 45.5.186.2
|
|
5490 85.25.237.71
|
|
|
|
|
|
|
|
85.25.237.71 is the “Linguee Bot” that I first saw last month
|
|
The Solr statistics the past few months have been very high and I was wondering if the web server logs also showed an increase
|
|
There were just over 3 million accesses in the nginx logs last month:
|
|
|
|
|
|
# time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Jan/2019"
|
|
3018243
|
|
|
|
real 0m19.873s
|
|
user 0m22.203s
|
|
sys 0m1.979s
|
|
"/>
|
|
<meta name="generator" content="Hugo 0.53" />
|
|
|
|
|
|
|
|
<script type="application/ld+json">
|
|
{
|
|
"@context": "http://schema.org",
|
|
"@type": "BlogPosting",
|
|
"headline": "February, 2019",
|
|
"url": "https://alanorth.github.io/cgspace-notes/2019-02/",
|
|
"wordCount": "367",
|
|
"datePublished": "2019-02-01T21:37:30+02:00",
|
|
"dateModified": "2019-02-02T00:01:39+02:00",
|
|
"author": {
|
|
"@type": "Person",
|
|
"name": "Alan Orth"
|
|
},
|
|
"keywords": "Notes"
|
|
}
|
|
</script>
|
|
|
|
|
|
|
|
<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/2019-02/">
|
|
|
|
<title>February, 2019 | CGSpace Notes</title>
|
|
|
|
<!-- combined, minified CSS -->
|
|
<link href="https://alanorth.github.io/cgspace-notes/css/style.css" rel="stylesheet" integrity="sha384-6+EGfPoOzk/n2DVJSlglKT8TV1TgIMvVcKI73IZgBswLasPBn94KommV6ilJqCXE" crossorigin="anonymous">
</head>
|
|
|
|
<body>
|
|
|
|
|
|
<div class="blog-masthead">
|
|
<div class="container">
|
|
<nav class="nav blog-nav">
|
|
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
|
|
</nav>
|
|
</div>
|
|
</div>
<header class="blog-header">
|
|
<div class="container">
|
|
<h1 class="blog-title"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
|
|
<p class="lead blog-description">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
|
|
</div>
|
|
</header>
<div class="container">
|
|
<div class="row">
|
|
<div class="col-sm-8 blog-main">
<article class="blog-post">
|
|
<header>
|
|
<h2 class="blog-post-title"><a href="https://alanorth.github.io/cgspace-notes/2019-02/">February, 2019</a></h2>
|
|
<p class="blog-post-meta"><time datetime="2019-02-01T21:37:30+02:00">Fri Feb 01, 2019</time> by Alan Orth in
|
|
|
|
<i class="fa fa-tag" aria-hidden="true"></i> <a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>
|
|
|
|
</p>
|
|
</header>
|
|
<h2 id="2019-02-01">2019-02-01</h2>

<ul>
<li>Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high, despite my increasing the alert threshold last week from 250% to 275%. I might need to increase it again!</li>
<li>The top IPs before, during, and after this latest alert tonight were:</li>
</ul>

<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "01/Feb/2019:(17|18|19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
245 207.46.13.5
332 54.70.40.11
385 5.143.231.38
405 207.46.13.173
405 207.46.13.75
1117 66.249.66.219
1121 35.237.175.180
1546 5.9.6.51
2474 45.5.186.2
5490 85.25.237.71
</code></pre>
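
<p>For reference, the same tally can be sketched in Python. This is only an illustration, assuming nginx's default "combined" log format (client IP as the first field, timestamp like <code>[01/Feb/2019:17:05:12 +0100]</code>). The function name <code>top_ips</code> and the sample log lines are made up for the example and are not taken from the real logs.</p>

```python
import re
from collections import Counter


def top_ips(lines, day, hours, n=10):
    """Count requests per client IP for the given day and hours.

    Assumes the client IP is the first whitespace-separated field and the
    timestamp contains "<day>:<HH>:", as in nginx's combined log format.
    """
    hour_alternatives = "|".join(f"{h:02d}" for h in hours)
    window = re.compile(rf"{re.escape(day)}:({hour_alternatives}):")
    counts = Counter(
        line.split(maxsplit=1)[0]
        for line in lines
        if line.strip() and window.search(line)
    )
    # most_common() sorts descending, the reverse of `sort -n | tail`
    return counts.most_common(n)


# Hypothetical sample lines, only to show the shape of the input
sample = [
    '85.25.237.71 - - [01/Feb/2019:17:00:01 +0100] "GET / HTTP/1.1" 200 512',
    '85.25.237.71 - - [01/Feb/2019:18:30:00 +0100] "GET /a HTTP/1.1" 200 512',
    '45.5.186.2 - - [01/Feb/2019:19:10:45 +0100] "GET /b HTTP/1.1" 200 512',
    '45.5.186.2 - - [01/Feb/2019:23:10:45 +0100] "GET /c HTTP/1.1" 200 512',
]
print(top_ips(sample, "01/Feb/2019", range(17, 22)))
```

<p>The last sample line falls outside the 17:00 to 21:59 window, so it is not counted.</p>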

<ul>
<li><code>85.25.237.71</code> is the “Linguee Bot” that I first saw last month</li>
<li>The Solr statistics for the past few months have been very high, and I was wondering whether the web server logs also showed an increase</li>
<li>There were just over 3 million accesses in the nginx logs last month:</li>
</ul>

<pre><code># time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Jan/2019"
3018243

real 0m19.873s
user 0m22.203s
sys 0m1.979s
</code></pre>
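
<p>A rough Python equivalent of this monthly count, in case it needs to be repeated for other months. It is a sketch under assumptions: the helper names <code>open_maybe_gzip</code> and <code>count_month_hits</code> are invented here, and compressed files are detected by the gzip magic number so that plain and rotated <code>.gz</code> logs can be mixed, much as <code>zcat --force</code> allows.</p>

```python
import gzip
import re


def open_maybe_gzip(path):
    """Open a log as text whether or not it is gzip-compressed."""
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"  # gzip magic number
    opener = gzip.open if is_gzip else open
    return opener(path, "rt", encoding="utf-8", errors="replace")


def count_month_hits(paths, month, year):
    """Count log lines with a timestamp in the given month, like grep -c."""
    pattern = re.compile(rf"[0-9]{{1,2}}/{month}/{year}")
    total = 0
    for path in paths:
        with open_maybe_gzip(path) as log:
            total += sum(1 for line in log if pattern.search(line))
    return total


# Example call (assumes every entry in the directory is a readable log file):
#   from pathlib import Path
#   count_month_hits(sorted(Path("/var/log/nginx").glob("*")), "Jan", 2019)
```

<p>Like the <code>grep</code> pattern, this matches the date anywhere in the line, so a request URL that happens to contain the same string would also be counted.</p>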

<ul>
<li>Normally I’d say this was very high, but <a href="/cgspace-notes/2018-02/">about this time last year</a> I remember thinking the same thing when we had 3.1 million…</li>
<li>I will have to keep an eye on this to see if there is some error in Solr…</li>
<li>Atmire sent their <a href="https://github.com/ilri/DSpace/pull/407">pull request to re-enable the Metadata Quality Module (MQM) on our <code>5_x-dev</code> branch</a> today

<ul>
<li>I will test it next week and send them feedback</li>
</ul></li>
</ul>

<h2 id="2019-02-02">2019-02-02</h2>

<ul>
<li>Another alert from Linode about CGSpace (linode18) this morning; here are the top IPs in the web server logs before, during, and after that time:</li>
</ul>

<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Feb/2019:0(1|2|3|4|5)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
284 18.195.78.144
329 207.46.13.32
417 35.237.175.180
448 34.218.226.147
694 2a01:4f8:13b:1296::2
718 2a01:4f8:140:3192::2
786 137.108.70.14
1002 5.9.6.51
6077 85.25.237.71
8726 45.5.184.2
</code></pre>
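
<p>Several addresses in the output above share a prefix (for example the two <code>2a01:4f8:</code> hosts), so it can help to sum the counts per network. The sketch below does that with Python's <code>ipaddress</code> module. The function name <code>hits_by_network</code> and the /24 and /32 prefix lengths are assumptions chosen for illustration, not the real allocation sizes of any of these operators.</p>

```python
import ipaddress
from collections import Counter


def hits_by_network(counts, v4_prefix=24, v6_prefix=32):
    """Sum per-IP hit counts by enclosing network.

    The default prefix lengths are illustrative assumptions only.
    """
    totals = Counter()
    for ip, hits in counts:
        address = ipaddress.ip_address(ip)
        prefix = v4_prefix if address.version == 4 else v6_prefix
        # strict=False masks the host bits instead of raising an error
        network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        totals[str(network)] += hits
    return totals.most_common()


# A few counts copied from the 02/Feb/2019 output above
observed = [
    ("207.46.13.32", 329),
    ("2a01:4f8:13b:1296::2", 694),
    ("2a01:4f8:140:3192::2", 718),
    ("85.25.237.71", 6077),
    ("45.5.184.2", 8726),
]
for network, hits in hits_by_network(observed):
    print(f"{hits:>6} {network}")
```

<p>With these prefix lengths the two IPv6 hosts collapse into a single <code>2a01:4f8::/32</code> entry with 1412 requests.</p>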

<ul>
<li><code>45.5.184.2</code> is CIAT and <code>85.25.237.71</code> is the new Linguee bot that I first noticed a few days ago</li>
<li>I will increase the Linode alert threshold from 275% to 300% because this is becoming too much!</li>
<li>I tested the Atmire Metadata Quality Module (MQM)’s duplicate checker on some <a href="https://dspacetest.cgiar.org/handle/10568/81268">WLE items</a> that I helped Udana with a few months ago on DSpace Test (linode19), and indeed it found many duplicates!</li>
</ul>

<!-- vim: set sw=2 ts=2: -->
</article>
|
|
|
|
|
|
|
|
</div> <!-- /.blog-main -->
|
|
|
|
<aside class="col-sm-3 ml-auto blog-sidebar">
|
|
|
|
|
|
|
|
<section class="sidebar-module">
|
|
<h4>Recent Posts</h4>
|
|
<ol class="list-unstyled">
|
|
|
|
|
|
<li><a href="/cgspace-notes/2019-02/">February, 2019</a></li>
|
|
|
|
<li><a href="/cgspace-notes/2019-01/">January, 2019</a></li>
|
|
|
|
<li><a href="/cgspace-notes/2018-12/">December, 2018</a></li>
|
|
|
|
<li><a href="/cgspace-notes/2018-11/">November, 2018</a></li>
|
|
|
|
<li><a href="/cgspace-notes/2018-10/">October, 2018</a></li>
|
|
|
|
</ol>
|
|
</section>
|
|
|
|
|
|
|
|
|
|
<section class="sidebar-module">
|
|
<h4>Links</h4>
|
|
<ol class="list-unstyled">
|
|
|
|
<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
|
|
|
|
<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
|
|
|
|
<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
|
|
|
|
</ol>
|
|
</section>
|
|
|
|
</aside>
|
|
|
|
|
|
</div> <!-- /.row -->
|
|
</div> <!-- /.container -->
|
|
|
|
|
|
|
|
<footer class="blog-footer">
|
|
<p>
|
|
|
|
Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
|
|
|
|
</p>
|
|
<p>
|
|
<a href="#">Back to top</a>
|
|
</p>
|
|
</footer>
|
|
|
|
|
|
</body>
|
|
|
|
</html>
|