diff --git a/content/posts/2018-07.md b/content/posts/2018-07.md
index b2c4d062f..58fc80914 100644
--- a/content/posts/2018-07.md
+++ b/content/posts/2018-07.md
@@ -179,5 +179,40 @@ org.apache.solr.client.solrj.SolrServerException: IOException occured when talki
 ```
 - But not sure what caused that...
+- I got a message from Linode tonight that CPU usage was high on CGSpace for the past few hours around 8PM GMT
+- Looking in the nginx logs I see the top ten IP addresses active today:
+
+```
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "09/Jul/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+   1691 40.77.167.84
+   1701 40.77.167.69
+   1718 50.116.102.77
+   1872 137.108.70.6
+   2172 157.55.39.234
+   2190 207.46.13.47
+   2848 178.154.200.38
+   4367 35.227.26.162
+   4387 70.32.83.92
+   4738 95.108.181.88
+```
+
+- Of those, *all* except `70.32.83.92` and `50.116.102.77` are *NOT* re-using their Tomcat sessions, for example from the XMLUI logs:
+
+```
+$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=95.108.181.88' dspace.log.2018-07-09
+4435
+```
+
+- `95.108.181.88` appears to be Yandex, so I dunno why it's creating so many sessions, as its user agent should match Tomcat's Crawler Session Manager Valve
+- `70.32.83.92` is on MediaTemple but I'm not sure who it is. They are mostly hitting REST so I guess that's fine
+- `35.227.26.162` doesn't declare a user agent and is on Google Cloud, so I should probably mark them as a bot in nginx
+- `178.154.200.38` is Yandex again
+- `207.46.13.47` is Bing
+- `157.55.39.234` is Bing
+- `137.108.70.6` is our old friend CORE bot
+- `50.116.102.77` doesn't declare a user agent and lives on HostGator, but mostly just hits the REST API so I guess that's fine
+- `40.77.167.84` is Bing again
+- Interestingly, the first time that I see `35.227.26.162` was on 2018-06-08
+- I've added `35.227.26.162` to the bot tagging logic in the nginx vhost
diff --git a/docs/2018-07/index.html b/docs/2018-07/index.html
index 9d635387c..44a2e0888 100644
--- a/docs/2018-07/index.html
+++ b/docs/2018-07/index.html
@@ -30,7 +30,7 @@ There is insufficient memory for the Java Runtime Environment to continue.
-
+
@@ -71,9 +71,9 @@ There is insufficient memory for the Java Runtime Environment to continue.
 "@type": "BlogPosting",
 "headline": "July, 2018",
 "url": "https://alanorth.github.io/cgspace-notes/2018-07/",
- "wordCount": "1213",
+ "wordCount": "1454",
 "datePublished": "2018-07-01T12:56:54+03:00",
- "dateModified": "2018-07-09T07:51:04+03:00",
+ "dateModified": "2018-07-09T16:45:50+03:00",
 "author": {
 "@type": "Person",
 "name": "Alan Orth"
@@ -342,6 +342,43 @@ org.apache.solr.client.solrj.SolrServerException: IOException occured when talki
+
+
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "09/Jul/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+   1691 40.77.167.84
+   1701 40.77.167.69
+   1718 50.116.102.77
+   1872 137.108.70.6
+   2172 157.55.39.234
+   2190 207.46.13.47
+   2848 178.154.200.38
+   4367 35.227.26.162
+   4387 70.32.83.92
+   4738 95.108.181.88
+
+ + + +
$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=95.108.181.88' dspace.log.2018-07-09
+4435
+
+
+
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index 43062b165..fff810de3 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -4,7 +4,7 @@
 https://alanorth.github.io/cgspace-notes/2018-07/
- 2018-07-09T07:51:04+03:00
+ 2018-07-09T16:45:50+03:00
@@ -174,7 +174,7 @@
 https://alanorth.github.io/cgspace-notes/
- 2018-07-09T07:51:04+03:00
+ 2018-07-09T16:45:50+03:00
 0
@@ -185,7 +185,7 @@
 https://alanorth.github.io/cgspace-notes/tags/notes/
- 2018-07-09T07:51:04+03:00
+ 2018-07-09T16:45:50+03:00
 0
@@ -197,13 +197,13 @@
 https://alanorth.github.io/cgspace-notes/posts/
- 2018-07-09T07:51:04+03:00
+ 2018-07-09T16:45:50+03:00
 0
 https://alanorth.github.io/cgspace-notes/tags/
- 2018-07-09T07:51:04+03:00
+ 2018-07-09T16:45:50+03:00
 0
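
The session check described in the notes can be sketched in Python. This is only a rough sketch, not the commands actually run on the server: it assumes DSpace log lines contain `session_id=<32 characters>:ip_addr=<address>` as in the grep pattern from the notes, and the function name `sessions_per_ip` and the sample lines are hypothetical. It reports both the number of matching log lines and the number of distinct session IDs per IP, because a client that re-uses its session should show many requests but few distinct sessions.

```python
import re
from collections import defaultdict

# Same shape as the grep pattern in the notes: a 32-character session ID
# followed by the client IP address. The IPv4-only matching is an assumption.
SESSION_RE = re.compile(
    r"session_id=([A-Z0-9]{32}):ip_addr=(\d{1,3}(?:\.\d{1,3}){3})"
)


def sessions_per_ip(lines):
    """Return {ip: (matching_line_count, distinct_session_count)}."""
    requests = defaultdict(int)
    sessions = defaultdict(set)
    for line in lines:
        match = SESSION_RE.search(line)
        if match is None:
            continue  # line has no session/IP pair, skip it
        session_id, ip = match.groups()
        requests[ip] += 1
        sessions[ip].add(session_id)
    return {ip: (requests[ip], len(sessions[ip])) for ip in requests}


if __name__ == "__main__":
    # Hypothetical log lines using documentation IPs, not real CGSpace data.
    sample = [
        "INFO session_id=" + "A" * 32 + ":ip_addr=192.0.2.10:view_item",
        "INFO session_id=" + "B" * 32 + ":ip_addr=192.0.2.10:view_item",
        "INFO session_id=" + "C" * 32 + ":ip_addr=192.0.2.20:view_item",
        "INFO session_id=" + "C" * 32 + ":ip_addr=192.0.2.20:view_item",
    ]
    for ip, (hits, distinct) in sorted(sessions_per_ip(sample).items()):
        print(f"{ip}: {hits} lines, {distinct} distinct sessions")
```

In a real run the `lines` argument could be an open file handle for something like `dspace.log.2018-07-09`. An IP whose distinct session count is close to its line count is probably creating a new session on every request.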