<!DOCTYPE html>
< html lang = "en" >
< head >
< meta charset = "utf-8" >
< meta name = "viewport" content = "width=device-width, initial-scale=1, shrink-to-fit=no" >
< meta property = "og:title" content = "February, 2019" / >
< meta property = "og:description" content = "2019-02-01
Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high despite me increasing the alert threshold last week from 250% to 275%—I might need to increase it again!
The top IPs before, during, and after this latest alert tonight were:
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;01/Feb/2019:(17|18|19|20|21)&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
245 207.46.13.5
332 54.70.40.11
385 5.143.231.38
405 207.46.13.173
405 207.46.13.75
1117 66.249.66.219
1121 35.237.175.180
1546 5.9.6.51
2474 45.5.186.2
5490 85.25.237.71
85.25.237.71 is the “Linguee Bot” that I first saw last month
The Solr statistics the past few months have been very high and I was wondering if the web server logs also showed an increase
There were just over 3 million accesses in the nginx logs last month:
# time zcat --force /var/log/nginx/* | grep -cE &quot;[0-9]{1,2}/Jan/2019&quot;
3018243
real 0m19.873s
user 0m22.203s
sys 0m1.979s
" />
< meta property = "og:type" content = "article" / >
< meta property = "og:url" content = "https://alanorth.github.io/cgspace-notes/2019-02/" / > < meta property = "article:published_time" content = "2019-02-01T21:37:30+02:00" / >
< meta property = "article:modified_time" content = "2019-02-02T00:01:39+02:00" / >
< meta name = "twitter:card" content = "summary" / >
< meta name = "twitter:title" content = "February, 2019" / >
< meta name = "twitter:description" content = "2019-02-01
Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high despite me increasing the alert threshold last week from 250% to 275%—I might need to increase it again!
The top IPs before, during, and after this latest alert tonight were:
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;01/Feb/2019:(17|18|19|20|21)&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
245 207.46.13.5
332 54.70.40.11
385 5.143.231.38
405 207.46.13.173
405 207.46.13.75
1117 66.249.66.219
1121 35.237.175.180
1546 5.9.6.51
2474 45.5.186.2
5490 85.25.237.71
85.25.237.71 is the “Linguee Bot” that I first saw last month
The Solr statistics the past few months have been very high and I was wondering if the web server logs also showed an increase
There were just over 3 million accesses in the nginx logs last month:
# time zcat --force /var/log/nginx/* | grep -cE &quot;[0-9]{1,2}/Jan/2019&quot;
3018243
real 0m19.873s
user 0m22.203s
sys 0m1.979s
"/>
< meta name = "generator" content = "Hugo 0.53" / >
< script type = "application/ld+json" >
{
"@context": "http://schema.org",
"@type": "BlogPosting",
"headline": "February, 2019",
"url": "https://alanorth.github.io/cgspace-notes/2019-02/",
"wordCount": "367",
"datePublished": "2019-02-01T21:37:30+02:00",
"dateModified": "2019-02-02T00:01:39+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"keywords": "Notes"
}
< / script >
< link rel = "canonical" href = "https://alanorth.github.io/cgspace-notes/2019-02/" >
< title > February, 2019 | CGSpace Notes< / title >
<!-- combined, minified CSS -->
< link href = "https://alanorth.github.io/cgspace-notes/css/style.css" rel = "stylesheet" integrity = "sha384-6+EGfPoOzk/n2DVJSlglKT8TV1TgIMvVcKI73IZgBswLasPBn94KommV6ilJqCXE" crossorigin = "anonymous" >
< / head >
< body >
< div class = "blog-masthead" >
< div class = "container" >
< nav class = "nav blog-nav" >
< a class = "nav-link " href = "https://alanorth.github.io/cgspace-notes/" > Home< / a >
< / nav >
< / div >
< / div >
< header class = "blog-header" >
< div class = "container" >
< h1 class = "blog-title" > < a href = "https://alanorth.github.io/cgspace-notes/" rel = "home" > CGSpace Notes< / a > < / h1 >
< p class = "lead blog-description" > Documenting day-to-day work on the < a href = "https://cgspace.cgiar.org" > CGSpace< / a > repository.< / p >
< / div >
< / header >
< div class = "container" >
< div class = "row" >
< div class = "col-sm-8 blog-main" >
< article class = "blog-post" >
< header >
< h2 class = "blog-post-title" > < a href = "https://alanorth.github.io/cgspace-notes/2019-02/" > February, 2019< / a > < / h2 >
< p class = "blog-post-meta" > < time datetime = "2019-02-01T21:37:30+02:00" > Fri Feb 01, 2019< / time > by Alan Orth in
< i class = "fa fa-tag" aria-hidden = "true" > < / i > < a href = "/cgspace-notes/tags/notes" rel = "tag" > Notes< / a >
< / p >
< / header >
< h2 id = "2019-02-01" > 2019-02-01< / h2 >
< ul >
< li > Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high despite me increasing the alert threshold last week from 250% to 275%—I might need to increase it again!< / li >
< li > The top IPs before, during, and after this latest alert tonight were:< / li >
< / ul >
< pre > < code > # zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "01/Feb/2019:(17|18|19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
245 207.46.13.5
332 54.70.40.11
385 5.143.231.38
405 207.46.13.173
405 207.46.13.75
1117 66.249.66.219
1121 35.237.175.180
1546 5.9.6.51
2474 45.5.186.2
5490 85.25.237.71
< / code > < / pre >
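< p > A minimal Python sketch of the same tally as the shell pipeline above, in case the filtering needs to be reused. It assumes the client IP is the first whitespace-separated field of each log line; the function name < code > top_clients< / code > and the sample lines (documentation-only addresses) are hypothetical, not taken from the server.< / p >

```python
import re
from collections import Counter


def top_clients(lines, day, hours, n=10):
    """Return the n busiest client IPs for the given day and hours."""
    # Same filter as the grep above: the date, a colon, then one of the hours
    pattern = re.compile(re.escape(day) + ":(" + "|".join(hours) + ")")
    counts = Counter()
    for line in lines:
        if pattern.search(line):
            fields = line.split()
            if fields:
                # First whitespace-separated field, like awk '{print $1}'
                counts[fields[0]] += 1
    # Busiest first (the shell pipeline lists the busiest last)
    return counts.most_common(n)


# Hypothetical log lines using documentation-only addresses
sample = [
    '192.0.2.1 - - [01/Feb/2019:17:05:01 +0100] "GET / HTTP/1.1" 200 512',
    '192.0.2.2 - - [01/Feb/2019:18:10:42 +0100] "GET /a HTTP/1.1" 200 128',
    '192.0.2.1 - - [01/Feb/2019:21:59:59 +0100] "GET /b HTTP/1.1" 200 256',
    '192.0.2.3 - - [01/Feb/2019:22:00:00 +0100] "GET /c HTTP/1.1" 200 64',
]

for ip, count in top_clients(sample, "01/Feb/2019", ["17", "18", "19", "20", "21"]):
    print(count, ip)
```

< p > The function takes any iterable of lines, so rotated logs could be fed in by opening them with < code > gzip.open(path, "rt")< / code > where needed.< / p >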
< ul >
< li > < code > 85.25.237.71< / code > is the “Linguee Bot” that I first saw last month< / li >
< li > The Solr statistics the past few months have been very high and I was wondering if the web server logs also showed an increase< / li >
< li > There were just over 3 million accesses in the nginx logs last month:< / li >
< / ul >
< pre > < code > # time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Jan/2019"
3018243
real 0m19.873s
user 0m22.203s
sys 0m1.979s
< / code > < / pre >
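< p > To compare several months in one pass instead of running one grep per month, something like the following Python sketch could work. It assumes timestamps of the form < code > 31/Jan/2019< / code > as in the grep pattern above; < code > requests_per_month< / code > and the sample lines are hypothetical.< / p >

```python
import re
from collections import Counter

# Day, month abbreviation, and year as they appear in the log timestamps
MONTH_RE = re.compile(r"[0-9]{1,2}/([A-Z][a-z]{2}/[0-9]{4})")


def requests_per_month(lines):
    """Count log lines per month, keyed like 'Jan/2019'."""
    counts = Counter()
    for line in lines:
        match = MONTH_RE.search(line)
        if match:
            # One count per line, like grep -c
            counts[match.group(1)] += 1
    return counts


# Hypothetical log lines using documentation-only addresses
sample = [
    '192.0.2.1 - - [30/Jan/2019:08:00:00 +0100] "GET / HTTP/1.1" 200 512',
    '192.0.2.2 - - [31/Jan/2019:23:59:59 +0100] "GET /a HTTP/1.1" 200 128',
    '192.0.2.1 - - [01/Feb/2019:00:00:01 +0100] "GET /b HTTP/1.1" 200 256',
]

for month, count in requests_per_month(sample).items():
    print(month, count)
```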
< ul >
< li > Normally I’d say this was very high, but < a href = "/cgspace-notes/2018-02/" > about this time last year< / a > I remember thinking the same thing when we had 3.1 million…< / li >
< li > I will have to keep an eye on this to see if there is some error in Solr… < / li >
< li > Atmire sent their < a href = "https://github.com/ilri/DSpace/pull/407" > pull request to re-enable the Metadata Quality Module (MQM) on our < code > 5_x-dev< / code > branch< / a > today
< ul >
< li > I will test it next week and send them feedback< / li >
< / ul > < / li >
< / ul >
< h2 id = "2019-02-02" > 2019-02-02< / h2 >
< ul >
< li > Another alert from Linode about CGSpace (linode18) this morning, here are the top IPs in the web server logs before, during, and after that time:< / li >
< / ul >
< pre > < code > # zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Feb/2019:0(1|2|3|4|5)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
284 18.195.78.144
329 207.46.13.32
417 35.237.175.180
448 34.218.226.147
694 2a01:4f8:13b:1296::2
718 2a01:4f8:140:3192::2
786 137.108.70.14
1002 5.9.6.51
6077 85.25.237.71
8726 45.5.184.2
< / code > < / pre >
< ul >
< li > < code > 45.5.184.2< / code > is CIAT and < code > 85.25.237.71< / code > is the new Linguee bot that I first noticed a few days ago< / li >
< li > I will increase the Linode alert threshold from 275% to 300% because this is becoming too much!< / li >
< li > I tested the Atmire Metadata Quality Module (MQM)’s duplicate checker on some < a href = "https://dspacetest.cgiar.org/handle/10568/81268" > WLE items< / a > that I helped Udana with a few months ago on DSpace Test (linode19) and indeed it found many duplicates!< / li >
< / ul >
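< p > Since the same addresses keep showing up, a small lookup of the clients identified so far can label the top-IP rows. This is only a sketch: the < code > KNOWN_CLIENTS< / code > table and the < code > annotate< / code > helper are hypothetical names, and the labels are just the two identifications noted above.< / p >

```python
# Clients identified in these notes so far (an illustrative table, not a complete list)
KNOWN_CLIENTS = {
    "45.5.184.2": "CIAT",
    "85.25.237.71": "Linguee bot",
}


def annotate(rows, known=KNOWN_CLIENTS):
    """Attach a label to each (count, ip) row, or 'unknown' if unrecognized."""
    return [(count, ip, known.get(ip, "unknown")) for count, ip in rows]


# The three busiest rows from the 2019-02-02 listing
rows = [(1002, "5.9.6.51"), (6077, "85.25.237.71"), (8726, "45.5.184.2")]

for count, ip, label in annotate(rows):
    print(f"{count:>7} {ip:<15} {label}")
```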
<!-- vim: set sw=2 ts=2: -->
< / article >
< / div > <!-- /.blog-main -->
< aside class = "col-sm-3 ml-auto blog-sidebar" >
< section class = "sidebar-module" >
< h4 > Recent Posts< / h4 >
< ol class = "list-unstyled" >
< li > < a href = "/cgspace-notes/2019-02/" > February, 2019< / a > < / li >
< li > < a href = "/cgspace-notes/2019-01/" > January, 2019< / a > < / li >
< li > < a href = "/cgspace-notes/2018-12/" > December, 2018< / a > < / li >
< li > < a href = "/cgspace-notes/2018-11/" > November, 2018< / a > < / li >
< li > < a href = "/cgspace-notes/2018-10/" > October, 2018< / a > < / li >
< / ol >
< / section >
< section class = "sidebar-module" >
< h4 > Links< / h4 >
< ol class = "list-unstyled" >
< li > < a href = "https://cgspace.cgiar.org" > CGSpace< / a > < / li >
< li > < a href = "https://dspacetest.cgiar.org" > DSpace Test< / a > < / li >
< li > < a href = "https://github.com/ilri/DSpace" > CGSpace @ GitHub< / a > < / li >
< / ol >
< / section >
< / aside >
< / div > <!-- /.row -->
< / div > <!-- /.container -->
< footer class = "blog-footer" >
< p >
Blog template created by < a href = "https://twitter.com/mdo" > @mdo< / a > , ported to Hugo by < a href = 'https://twitter.com/mralanorth' > @mralanorth< / a > .
< / p >
< p >
< a href = "#" > Back to top< / a >
< / p >
< / footer >
< / body >
< / html >