diff --git a/content/posts/2022-07.md b/content/posts/2022-07.md
index 14e7568eb..f6803327f 100644
--- a/content/posts/2022-07.md
+++ b/content/posts/2022-07.md
@@ -32,4 +32,53 @@ Time: 399.751 ms
 - Start a harvest on AReS
+## 2022-07-04
+
+- Linode told me that CGSpace had high load yesterday
+  - I also got some up and down notices from UptimeRobot
+  - Looking now, I see there was a very high CPU and database pool load, but a mostly normal DSpace session count
+
+![CPU load day](/cgspace-notes/2022/07/cpu-day.png)
+![JDBC pool day](/cgspace-notes/2022/07/jmx_tomcat_dbpools-day.png)
+
+- Seems we have some old database transactions since 2022-06-27:
+
+![PostgreSQL locks week](/cgspace-notes/2022/07/postgres_locks_ALL-week.png)
+![PostgreSQL query length week](/cgspace-notes/2022/07/postgres_querylength_ALL-week.png)
+
+- Looking at the top connections to nginx yesterday:
+
+```console
+# awk '{print $1}' /var/log/nginx/{access,library-access,oai,rest}.log.1 | sort | uniq -c | sort -h | tail
+   1132 64.124.8.34
+   1146 2a01:4f8:1c17:5550::1
+   1380 137.184.159.211
+   1533 64.124.8.59
+   4013 80.248.237.167
+   4776 54.195.118.125
+  10482 45.5.186.2
+  11177 172.104.229.92
+  15855 2a01:7e00::f03c:91ff:fe9a:3a37
+  22179 64.39.98.251
+```
+
+- And the total number of unique IPs:
+
+```console
+# awk '{print $1}' /var/log/nginx/{access,library-access,oai,rest}.log.1 | sort -u | wc -l
+6952
+```
+
+- This seems low, so it must have been from the request patterns by certain visitors
+  - 64.39.98.251 is Qualys, and I'm debating blocking [all their IPs](https://pci.qualys.com/static/help/merchant/getting_started/check_scanner_ip_addresses.htm) using a geo block in nginx (need to test)
+  - The top few are known ILRI and other CGIAR scrapers, but 80.248.237.167 is on InternetVikings in Sweden, using a normal user agent and scraping Discover
+  - 64.124.8.59 is making requests with a normal user agent and belongs to Castle Global or Zayo
+- I ran all system updates and rebooted the server (could have just restarted PostgreSQL but I thought I might as well do everything)
+- I implemented a geo mapping for the user agent mapping AND the nginx `limit_req_zone` by extracting the networks into an external file and including it in two different geo mapping blocks
+  - This is clever and relies on the fact that we can use defaults in both cases
+  - First, we map the user agent of requests from these networks to "bot" so that Tomcat and Solr handle them accordingly
+  - Second, we use this as a key in a `limit_req_zone`, which relies on a default mapping of '' (and nginx doesn't evaluate empty cache keys)
+- I noticed that CIP uploaded a number of Georgian presentations with `dcterms.language` set to English and Other so I changed them to "ka"
+  - Perhaps we need to update our list of languages to include all instead of the most common ones
+
diff --git a/docs/2022-06/index.html b/docs/2022-06/index.html
index ebdc6e2d2..de34877ec 100644
--- a/docs/2022-06/index.html
+++ b/docs/2022-06/index.html
@@ -26,7 +26,7 @@ There seem to be many more of these:
-
+
@@ -60,7 +60,7 @@ There seem to be many more of these:
   "url": "https://alanorth.github.io/cgspace-notes/2022-06/",
   "wordCount": "1786",
   "datePublished": "2022-06-06T09:01:36+03:00",
-  "dateModified": "2022-06-30T16:48:03+03:00",
+  "dateModified": "2022-07-04T09:25:14+03:00",
   "author": {
     "@type": "Person",
     "name": "Alan Orth"
diff --git a/docs/2022-07/index.html b/docs/2022-07/index.html
index 80e7262e7..b19f9153d 100644
--- a/docs/2022-07/index.html
+++ b/docs/2022-07/index.html
@@ -19,7 +19,7 @@ Also, the trgm functions I’ve used before are case insensitive, but Levens
-
+
@@ -44,9 +44,9 @@ Also, the trgm functions I’ve used before are case insensitive, but Levens
   "@type": "BlogPosting",
   "headline": "July, 2022",
   "url": "https://alanorth.github.io/cgspace-notes/2022-07/",
-  "wordCount": "164",
+  "wordCount": "507",
   "datePublished": "2022-07-02T14:07:36+03:00",
-  "dateModified": "2022-07-02T14:07:36+03:00",
+  "dateModified": "2022-07-04T09:25:14+03:00",
   "author": {
     "@type": "Person",
     "name": "Alan Orth"
@@ -147,6 +147,63 @@ Also, the trgm functions I’ve used before are case insensitive, but Levens
+

2022-07-04

+ +

CPU load day +JDBC pool day

+ +

PostgreSQL locks week +PostgreSQL query length week

+ +
# awk '{print $1}' /var/log/nginx/{access,library-access,oai,rest}.log.1 | sort | uniq -c | sort -h | tail
+   1132 64.124.8.34
+   1146 2a01:4f8:1c17:5550::1
+   1380 137.184.159.211
+   1533 64.124.8.59
+   4013 80.248.237.167
+   4776 54.195.118.125
+  10482 45.5.186.2
+  11177 172.104.229.92
+  15855 2a01:7e00::f03c:91ff:fe9a:3a37
+  22179 64.39.98.251
+
+
# awk '{print $1}' /var/log/nginx/{access,library-access,oai,rest}.log.1 | sort -u | wc -l
+6952
+
diff --git a/docs/2022/07/cpu-day.png b/docs/2022/07/cpu-day.png new file mode 100644 index 000000000..09c6ea9f8 Binary files /dev/null and b/docs/2022/07/cpu-day.png differ diff --git a/docs/2022/07/jmx_tomcat_dbpools-day.png b/docs/2022/07/jmx_tomcat_dbpools-day.png new file mode 100644 index 000000000..e1d251489 Binary files /dev/null and b/docs/2022/07/jmx_tomcat_dbpools-day.png differ diff --git a/docs/2022/07/postgres_locks_ALL-week.png b/docs/2022/07/postgres_locks_ALL-week.png new file mode 100644 index 000000000..9079e8f19 Binary files /dev/null and b/docs/2022/07/postgres_locks_ALL-week.png differ diff --git a/docs/2022/07/postgres_querylength_ALL-week.png b/docs/2022/07/postgres_querylength_ALL-week.png new file mode 100644 index 000000000..5cb704313 Binary files /dev/null and b/docs/2022/07/postgres_querylength_ALL-week.png differ diff --git a/docs/categories/index.html b/docs/categories/index.html index bec37df3f..964e5efbc 100644 --- a/docs/categories/index.html +++ b/docs/categories/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/index.html b/docs/categories/notes/index.html index b5cb216a6..e3e5c7ef5 100644 --- a/docs/categories/notes/index.html +++ b/docs/categories/notes/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/2/index.html b/docs/categories/notes/page/2/index.html index 03ec0a28e..70bff6411 100644 --- a/docs/categories/notes/page/2/index.html +++ b/docs/categories/notes/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/3/index.html b/docs/categories/notes/page/3/index.html index f9604243b..bb084bde7 100644 --- a/docs/categories/notes/page/3/index.html +++ b/docs/categories/notes/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/4/index.html b/docs/categories/notes/page/4/index.html index eaf0201da..b8e801f24 100644 --- a/docs/categories/notes/page/4/index.html +++ b/docs/categories/notes/page/4/index.html @@ -10,7 +10,7 @@ - + diff 
--git a/docs/categories/notes/page/5/index.html b/docs/categories/notes/page/5/index.html index ec41ab99a..5122a29bd 100644 --- a/docs/categories/notes/page/5/index.html +++ b/docs/categories/notes/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/6/index.html b/docs/categories/notes/page/6/index.html index 371701750..2ab773135 100644 --- a/docs/categories/notes/page/6/index.html +++ b/docs/categories/notes/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/7/index.html b/docs/categories/notes/page/7/index.html index 625bc4856..f639854c9 100644 --- a/docs/categories/notes/page/7/index.html +++ b/docs/categories/notes/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/index.html b/docs/index.html index 1ada86612..7c6ce7763 100644 --- a/docs/index.html +++ b/docs/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/2/index.html b/docs/page/2/index.html index 1379df2db..46d9584ce 100644 --- a/docs/page/2/index.html +++ b/docs/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/3/index.html b/docs/page/3/index.html index 2a3d5d815..6feecf006 100644 --- a/docs/page/3/index.html +++ b/docs/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/4/index.html b/docs/page/4/index.html index 7df44be06..756b85335 100644 --- a/docs/page/4/index.html +++ b/docs/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/5/index.html b/docs/page/5/index.html index aec3dc29e..db3b1b94e 100644 --- a/docs/page/5/index.html +++ b/docs/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/6/index.html b/docs/page/6/index.html index 6b7062bae..a290bc42e 100644 --- a/docs/page/6/index.html +++ b/docs/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/7/index.html b/docs/page/7/index.html index 414151a35..d219c86d6 100644 --- a/docs/page/7/index.html +++ b/docs/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/8/index.html b/docs/page/8/index.html index 
11add4848..28a425edc 100644 --- a/docs/page/8/index.html +++ b/docs/page/8/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/9/index.html b/docs/page/9/index.html index 5fcda1e0b..669afee0f 100644 --- a/docs/page/9/index.html +++ b/docs/page/9/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/index.html b/docs/posts/index.html index 0ef2bd156..ab51cddd6 100644 --- a/docs/posts/index.html +++ b/docs/posts/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/2/index.html b/docs/posts/page/2/index.html index 181b2e39f..81bc3d3c8 100644 --- a/docs/posts/page/2/index.html +++ b/docs/posts/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/3/index.html b/docs/posts/page/3/index.html index 0d684fe07..98155ba0d 100644 --- a/docs/posts/page/3/index.html +++ b/docs/posts/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/4/index.html b/docs/posts/page/4/index.html index c7d92dcd6..5d3a4f3b8 100644 --- a/docs/posts/page/4/index.html +++ b/docs/posts/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/5/index.html b/docs/posts/page/5/index.html index 6068f0792..5156ed3a1 100644 --- a/docs/posts/page/5/index.html +++ b/docs/posts/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/6/index.html b/docs/posts/page/6/index.html index 3762b19c9..f6be904f0 100644 --- a/docs/posts/page/6/index.html +++ b/docs/posts/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/7/index.html b/docs/posts/page/7/index.html index 7d05bf0be..27d177e48 100644 --- a/docs/posts/page/7/index.html +++ b/docs/posts/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/8/index.html b/docs/posts/page/8/index.html index b5f85c9ad..554e4d9f3 100644 --- a/docs/posts/page/8/index.html +++ b/docs/posts/page/8/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/9/index.html b/docs/posts/page/9/index.html index e73e57d5a..951e8d624 100644 --- a/docs/posts/page/9/index.html +++ 
b/docs/posts/page/9/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/sitemap.xml b/docs/sitemap.xml index d5a2ee0d8..7dd053305 100644 --- a/docs/sitemap.xml +++ b/docs/sitemap.xml @@ -3,22 +3,22 @@ xmlns:xhtml="http://www.w3.org/1999/xhtml"> https://alanorth.github.io/cgspace-notes/categories/ - 2022-07-02T14:07:36+03:00 + 2022-07-04T09:25:14+03:00 https://alanorth.github.io/cgspace-notes/ - 2022-07-02T14:07:36+03:00 + 2022-07-04T09:25:14+03:00 https://alanorth.github.io/cgspace-notes/2022-07/ - 2022-07-02T14:07:36+03:00 + 2022-07-04T09:25:14+03:00 https://alanorth.github.io/cgspace-notes/categories/notes/ - 2022-07-02T14:07:36+03:00 + 2022-07-04T09:25:14+03:00 https://alanorth.github.io/cgspace-notes/posts/ - 2022-07-02T14:07:36+03:00 + 2022-07-04T09:25:14+03:00 https://alanorth.github.io/cgspace-notes/2022-06/ - 2022-06-30T16:48:03+03:00 + 2022-07-04T09:25:14+03:00 https://alanorth.github.io/cgspace-notes/2022-05/ 2022-05-30T16:00:02+03:00 diff --git a/static/2022/07/cpu-day.png b/static/2022/07/cpu-day.png new file mode 100644 index 000000000..09c6ea9f8 Binary files /dev/null and b/static/2022/07/cpu-day.png differ diff --git a/static/2022/07/jmx_tomcat_dbpools-day.png b/static/2022/07/jmx_tomcat_dbpools-day.png new file mode 100644 index 000000000..e1d251489 Binary files /dev/null and b/static/2022/07/jmx_tomcat_dbpools-day.png differ diff --git a/static/2022/07/postgres_locks_ALL-week.png b/static/2022/07/postgres_locks_ALL-week.png new file mode 100644 index 000000000..9079e8f19 Binary files /dev/null and b/static/2022/07/postgres_locks_ALL-week.png differ diff --git a/static/2022/07/postgres_querylength_ALL-week.png b/static/2022/07/postgres_querylength_ALL-week.png new file mode 100644 index 000000000..5cb704313 Binary files /dev/null and b/static/2022/07/postgres_querylength_ALL-week.png differ