diff --git a/content/posts/2018-11.md b/content/posts/2018-11.md
index 69b0ec2c0..c7d273bd3 100644
--- a/content/posts/2018-11.md
+++ b/content/posts/2018-11.md
@@ -10,6 +10,58 @@ tags: ["Notes"]
 - Finalize AReS Phase I and Phase II ToRs
 - Send a note about my [dspace-statistics-api](https://github.com/ilri/dspace-statistics-api) to the dspace-tech mailing list
+## 2018-11-03
+
+- Linode has been sending mails a few times a day recently that CGSpace (linode18) has had high CPU usage
+- Today these are the top 10 IPs:
+
+```
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "03/Nov/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+ 1300 66.249.64.63
+ 1384 35.237.175.180
+ 1430 138.201.52.218
+ 1455 207.46.13.156
+ 1500 40.77.167.175
+ 1979 50.116.102.77
+ 2790 66.249.64.61
+ 3367 84.38.130.177
+ 4537 70.32.83.92
+ 22508 66.249.64.59
+```
+
+- The `66.249.64.x` are definitely Google
+- `70.32.83.92` is well known, probably CCAFS or something, as it's only a few thousand requests and always to REST API
+- `84.38.130.177` is some new IP in Latvia that is only hitting the XMLUI, using the following user agent:
+
+```
+Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.792.0 Safari/535.1
+```
+
+- They at least seem to be re-using their Tomcat sessions:
+
+```
+$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=84.38.130.177' dspace.log.2018-11-03 | sort | uniq
+342
+```
+
+- `50.116.102.77` is also a regular REST API user
+- `40.77.167.175` and `207.46.13.156` seem to be Bing
+- `138.201.52.218` seems to be on Hetzner in Germany, but is using this user agent:
+
+```
+Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:62.0) Gecko/20100101 Firefox/62.0
+```
+
+- And it doesn't seem they are re-using their Tomcat sessions:
+
+```
+$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=138.201.52.218' dspace.log.2018-11-03 | sort | uniq
+1243
+```
+
+- Ah, we've apparently seen this server exactly a year ago in 2017-11, making 40,000 requests in one day...
+- I wonder if it's worth adding them to the list of bots in the nginx config?
+
diff --git a/docs/2018-11/index.html b/docs/2018-11/index.html
index 373bea5b4..b628d3387 100644
--- a/docs/2018-11/index.html
+++ b/docs/2018-11/index.html
@@ -13,10 +13,69 @@ Finalize AReS Phase I and Phase II ToRs
 Send a note about my dspace-statistics-api to the dspace-tech mailing list
+2018-11-03
+
+
+Linode has been sending mails a few times a day recently that CGSpace (linode18) has had high CPU usage
+Today these are the top 10 IPs:
+
+
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "03/Nov/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+ 1300 66.249.64.63
+ 1384 35.237.175.180
+ 1430 138.201.52.218
+ 1455 207.46.13.156
+ 1500 40.77.167.175
+ 1979 50.116.102.77
+ 2790 66.249.64.61
+ 3367 84.38.130.177
+ 4537 70.32.83.92
+ 22508 66.249.64.59
+
+
+
+The 66.249.64.x are definitely Google
+70.32.83.92 is well known, probably CCAFS or something, as it’s only a few thousand requests and always to REST API
+84.38.130.177 is some new IP in Latvia that is only hitting the XMLUI, using the following user agent:
+
+
+Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.792.0 Safari/535.1
+
+
+
+They at least seem to be re-using their Tomcat sessions:
+
+
+$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=84.38.130.177' dspace.log.2018-11-03 | sort | uniq
+342
+
+
+
+50.116.102.77 is also a regular REST API user
+40.77.167.175 and 207.46.13.156 seem to be Bing
+138.201.52.218 seems to be on Hetzner in Germany, but is using this user agent:
+
+
+Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:62.0) Gecko/20100101 Firefox/62.0
+
+
+
+And it doesn’t seem they are re-using their Tomcat sessions:
+
+
+$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=138.201.52.218' dspace.log.2018-11-03 | sort | uniq
+1243
+
+
+
+Ah, we’ve apparently seen this server exactly a year ago in 2017-11, making 40,000 requests in one day…
+I wonder if it’s worth adding them to the list of bots in the nginx config?
+
+
 " />
-
+
@@ -27,6 +86,65 @@ Finalize AReS Phase I and Phase II ToRs
 Send a note about my dspace-statistics-api to the dspace-tech mailing list
+2018-11-03
+
+
+Linode has been sending mails a few times a day recently that CGSpace (linode18) has had high CPU usage
+Today these are the top 10 IPs:
+
+
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "03/Nov/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+ 1300 66.249.64.63
+ 1384 35.237.175.180
+ 1430 138.201.52.218
+ 1455 207.46.13.156
+ 1500 40.77.167.175
+ 1979 50.116.102.77
+ 2790 66.249.64.61
+ 3367 84.38.130.177
+ 4537 70.32.83.92
+ 22508 66.249.64.59
+
+
+
+The 66.249.64.x are definitely Google
+70.32.83.92 is well known, probably CCAFS or something, as it’s only a few thousand requests and always to REST API
+84.38.130.177 is some new IP in Latvia that is only hitting the XMLUI, using the following user agent:
+
+
+Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.792.0 Safari/535.1
+
+
+
+They at least seem to be re-using their Tomcat sessions:
+
+
+$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=84.38.130.177' dspace.log.2018-11-03 | sort | uniq
+342
+
+
+
+50.116.102.77 is also a regular REST API user
+40.77.167.175 and 207.46.13.156 seem to be Bing
+138.201.52.218 seems to be on Hetzner in Germany, but is using this user agent:
+
+
+Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:62.0) Gecko/20100101 Firefox/62.0
+
+
+
+And it doesn’t seem they are re-using their Tomcat sessions:
+
+
+$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=138.201.52.218' dspace.log.2018-11-03 | sort | uniq
+1243
+
+
+
+Ah, we’ve apparently seen this server exactly a year ago in 2017-11, making 40,000 requests in one day…
+I wonder if it’s worth adding them to the list of bots in the nginx config?
+
+
 "/>
@@ -38,9 +156,9 @@ Send a note about my dspace-statistics-api to the dspace-tech mailing list
   "@type": "BlogPosting",
   "headline": "November, 2018",
   "url": "https://alanorth.github.io/cgspace-notes/2018-11/",
-  "wordCount": "20",
+  "wordCount": "260",
   "datePublished": "2018-11-01T16:41:30+02:00",
-  "dateModified": "2018-11-01T16:41:30+02:00",
+  "dateModified": "2018-11-01T16:43:37+02:00",
   "author": {
     "@type": "Person",
     "name": "Alan Orth"
@@ -113,6 +231,65 @@ Send a note about my dspace-statistics-api to the dspace-tech mailing list
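+2018-11-03
+
+Linode has been sending mails a few times a day recently that CGSpace (linode18) has had high CPU usage
+Today these are the top 10 IPs:
+
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "03/Nov/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10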
+ 1300 66.249.64.63
+ 1384 35.237.175.180
+ 1430 138.201.52.218
+ 1455 207.46.13.156
+ 1500 40.77.167.175
+ 1979 50.116.102.77
+ 2790 66.249.64.61
+ 3367 84.38.130.177
+ 4537 70.32.83.92
+ 22508 66.249.64.59
+
+
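A small wrapper around the pipeline above may make it easier to repeat for other days. This is only a sketch: `top_ips` is a hypothetical helper name that does not appear in the notes, and it assumes the client IP is the first field of each nginx log line, as the `awk '{print $1}'` step already does. It uses `grep -F` because the date string contains no regex metacharacters that need interpreting.

```shell
# top_ips DAY [N]: read nginx access log lines on stdin and print the N
# busiest client IPs for DAY (e.g. "03/Nov/2018"), busiest last.
# Hypothetical helper; assumes the client IP is the first field of each line.
top_ips() (
    day="$1"
    n="${2:-10}"   # default to the top 10, as in the notes
    grep -F "$day" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n "$n"
)

# Usage, with the same log files as above:
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | top_ips "03/Nov/2018" 10
```

The body runs in a subshell (parentheses rather than braces) so `day` and `n` do not leak into the calling shell.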
+The 66.249.64.x are definitely Google
+70.32.83.92 is well known, probably CCAFS or something, as it’s only a few thousand requests and always to REST API
+84.38.130.177 is some new IP in Latvia that is only hitting the XMLUI, using the following user agent:
+
+Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.792.0 Safari/535.1
+
+They at least seem to be re-using their Tomcat sessions:
+
+$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=84.38.130.177' dspace.log.2018-11-03 | sort | uniq
+342
+
+
+50.116.102.77 is also a regular REST API user
+40.77.167.175 and 207.46.13.156 seem to be Bing
+138.201.52.218 seems to be on Hetzner in Germany, but is using this user agent:
+
+Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:62.0) Gecko/20100101 Firefox/62.0
+
+And it doesn’t seem they are re-using their Tomcat sessions:
+
+$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=138.201.52.218' dspace.log.2018-11-03 | sort | uniq
+1243
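+
+Ah, we’ve apparently seen this server exactly a year ago in 2017-11, making 40,000 requests in one day…
+I wonder if it’s worth adding them to the list of bots in the nginx config?
+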
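One caveat about the two `grep -c` commands above: `grep -c` prints a single count of matching lines, so piping it through `sort | uniq` leaves that number unchanged. The figures 342 and 1243 are therefore probably matching log lines rather than distinct sessions. The sketch below shows one way to count distinct session IDs instead. `count_unique_sessions` is a hypothetical helper, not something from the notes, and the trailing `([^0-9]|$)` guard is an addition here so that, for example, `1.2.3.4` does not also match `1.2.3.45`.

```shell
# count_unique_sessions IP LOGFILE: print the number of distinct Tomcat
# session IDs logged for IP in a DSpace log file.
# Hypothetical helper; assumes the session_id=...:ip_addr=... format
# used in the grep commands above.
count_unique_sessions() (
    # Escape the dots so they match literally inside the regex
    ip_re=$(printf '%s' "$1" | sed 's/\./\\./g')
    # -o prints only the matched text; cut keeps just "session_id=<id>"
    grep -o -E "session_id=[A-Z0-9]{32}:ip_addr=${ip_re}([^0-9]|\$)" "$2" \
        | cut -d: -f1 | sort -u | wc -l
)

# Usage:
# count_unique_sessions 138.201.52.218 dspace.log.2018-11-03
```

Comparing this count with the request count for the same IP would give a rough idea of whether a client is re-using sessions.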
+