diff --git a/content/posts/2019-01.md b/content/posts/2019-01.md
index 004c1cf48..bc8fec0fe 100644
--- a/content/posts/2019-01.md
+++ b/content/posts/2019-01.md
@@ -930,4 +930,133 @@ identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/
- 70.32.83.92 is CCAFS
- 205.186.128.185 is CCAFS or perhaps another Macaroni Bros harvester (new ILRI website?)
+## 2019-01-28
+
+- Udana from WLE asked me about the interaction between their publication website and their items on CGSpace
+ - There is an item that is mapped into their collection from IWMI and is missing their `cg.identifier.wletheme` metadata
+  - I told him that, as far as I remember, when WLE introduced Phase II research themes in 2017 we decided to infer theme ownership from the collection hierarchy, and we created a [WLE Phase II Research Themes](https://cgspace.cgiar.org/handle/10568/81268) sub-community
+ - Perhaps they need to ask Macaroni Bros about the mapping
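+- If it comes up again, a quick way to find mapped items that are missing the theme metadata would be a query like this against DSpace's `metadatavalue` table (the collection ID and the field ID for `cg.identifier.wletheme` are hypothetical placeholders I'd have to look up first):
+
+```
+dspace=# SELECT item_id FROM collection2item WHERE collection_id = 1234 AND item_id NOT IN (SELECT resource_id FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 250);
+```
+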
+- Linode alerted that CGSpace (linode18) was using too much CPU again this morning, here are the active IPs from the web server log at the time:
+
+```
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "28/Jan/2019:0(6|7|8)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+ 67 207.46.13.50
+ 105 41.204.190.40
+ 117 34.218.226.147
+ 126 35.237.175.180
+ 203 213.55.99.121
+ 332 45.5.184.72
+ 377 5.9.6.51
+ 512 45.5.184.2
+ 4644 205.186.128.185
+ 4644 70.32.83.92
+```
+
+- There seems to be a pattern with `70.32.83.92` and `205.186.128.185` lately!
+- Every morning at 8AM they are the top users... I should tell them to stagger their requests...
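+- If they can't stagger them, one option would be an nginx `limit_req` zone to smooth out the bursts; a minimal sketch, with a made-up zone name and rate (not our live config):
+
+```
+# in the http context: one shared zone keyed on client IP
+limit_req_zone $binary_remote_addr zone=harvesters:10m rate=2r/s;
+
+server {
+    location / {
+        # allow short bursts, then throttle the excess
+        limit_req zone=harvesters burst=20;
+    }
+}
+```
+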
+- I signed up for a [VisualPing](https://visualping.io/) of the [PostgreSQL JDBC driver download page](https://jdbc.postgresql.org/download.html) to my CGIAR email address
+ - Hopefully this will one day alert me that a new driver is released!
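+- In the meantime a crude version of the same check is possible locally with curl and a daily cron job; a sketch, where the state file path and email address are placeholders:
+
+```
+#!/usr/bin/env bash
+# Naive change detector for the JDBC download page
+new=$(curl -s https://jdbc.postgresql.org/download.html | sha256sum | awk '{print $1}')
+old=$(cat /var/tmp/jdbc-page.hash 2>/dev/null)
+if [ "$new" != "$old" ]; then
+    echo "PostgreSQL JDBC driver download page changed" | mail -s "JDBC page change" me@example.org
+    echo "$new" > /var/tmp/jdbc-page.hash
+fi
+```
+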
+- Last night Linode sent an alert that CGSpace (linode18) was using high CPU, here are the most active IPs in the hours just before, during, and after the alert:
+
+```
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "28/Jan/2019:(17|18|19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+ 310 45.5.184.2
+ 425 5.143.231.39
+ 526 54.70.40.11
+ 1003 199.47.87.141
+ 1374 35.237.175.180
+ 1455 5.9.6.51
+ 1501 66.249.66.223
+ 1771 66.249.66.219
+ 2107 199.47.87.140
+ 2540 45.5.186.2
+```
+
+- Of course there is CIAT's `45.5.186.2`, but also `45.5.184.2` appears to be CIAT... I wonder why they have two harvesters?
+- `199.47.87.140` and `199.47.87.141` are Turnitin with the following user agent:
+
+```
+TurnitinBot (https://turnitin.com/robot/crawlerinfo.html)
+```
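+
+- One quick way to check the user agent for an IP like this is to split the nginx access log on the double quotes (in the default combined log format the user agent is the sixth quoted field):
+
+```
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep 199.47.87.140 | awk -F'"' '{print $6}' | sort | uniq -c
+```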
+
+## 2019-01-29
+
+- Linode sent an alert about CGSpace (linode18) CPU usage this morning, here are the top IPs in the web server logs just before, during, and after the alert:
+
+```
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "29/Jan/2019:0(3|4|5|6|7)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+ 334 45.5.184.72
+ 429 66.249.66.223
+ 522 35.237.175.180
+ 555 34.218.226.147
+ 655 66.249.66.221
+ 844 5.9.6.51
+ 2507 66.249.66.219
+ 4645 70.32.83.92
+ 4646 205.186.128.185
+ 9329 45.5.186.2
+```
+
+- `45.5.186.2` is CIAT as usual...
+- `70.32.83.92` and `205.186.128.185` are CCAFS as usual...
+- `66.249.66.219` is Google...
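+- A reverse DNS lookup is a quick way to confirm the Googlebot ranges, since Google documents forward-confirmed reverse DNS as the way to verify; the PTR record should look something like this:
+
+```
+$ host 66.249.66.219
+219.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-219.googlebot.com.
+```
+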
+- I'm thinking it might finally be time to increase the threshold of the Linode CPU alerts
+ - I adjusted the alert threshold from 250% to 275%
+
+## 2019-01-30
+
+- Got another alert from Linode about CGSpace (linode18) this morning, here are the top IPs before, during, and after the alert:
+
+```
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "30/Jan/2019:0(5|6|7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+ 273 46.101.86.248
+ 301 35.237.175.180
+ 334 45.5.184.72
+ 387 5.9.6.51
+ 527 2a01:4f8:13b:1296::2
+ 1021 34.218.226.147
+ 1448 66.249.66.219
+ 4649 205.186.128.185
+ 4649 70.32.83.92
+ 5163 45.5.184.2
+```
+
+- I might need to adjust the threshold again, because the CPU usage this morning averaged 296% and the activity looks pretty normal (as always recently)
+
+## 2019-01-31
+
+- Linode sent alerts about CGSpace (linode18) last night and this morning, here are the top IPs before, during, and after those times:
+
+```
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "30/Jan/2019:(16|17|18|19|20)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+ 436 18.196.196.108
+ 460 157.55.39.168
+ 460 207.46.13.96
+ 500 197.156.105.116
+ 728 54.70.40.11
+ 1560 5.9.6.51
+ 1562 35.237.175.180
+ 1601 85.25.237.71
+ 1894 66.249.66.219
+ 2610 45.5.184.2
+# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "31/Jan/2019:0(2|3|4|5|6)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+ 318 207.46.13.242
+ 334 45.5.184.72
+ 486 35.237.175.180
+ 609 34.218.226.147
+ 620 66.249.66.219
+ 1054 5.9.6.51
+ 4391 70.32.83.92
+ 4428 205.186.128.185
+ 6758 85.25.237.71
+ 9239 45.5.186.2
+```
+
+- `45.5.186.2` and `45.5.184.2` are CIAT as always
+- `85.25.237.71` is some new server in Germany that I've never seen before, with the following user agent:
+
+```
+Linguee Bot (http://www.linguee.com/bot; bot@linguee.com)
+```
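+
+- If it keeps hammering us, a hypothetical nginx `map` on the user agent could tag it for rate limiting or for the bot session pool (the variable name here is illustrative, not our live config):
+
+```
+map $http_user_agent $ua_is_bot {
+    default      0;
+    ~*linguee    1;
+}
+```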
+