From 19715c3295fed8decc9a6fa463392a8ac36766c1 Mon Sep 17 00:00:00 2001 From: Alan Orth Date: Thu, 7 Jul 2022 10:02:04 +0300 Subject: [PATCH] Add notes for 2022-07-06 --- content/posts/2022-07.md | 36 +++++++++++++++++++ docs/2022-07/index.html | 47 +++++++++++++++++++++++-- docs/categories/index.html | 2 +- docs/categories/notes/index.html | 2 +- docs/categories/notes/page/2/index.html | 2 +- docs/categories/notes/page/3/index.html | 2 +- docs/categories/notes/page/4/index.html | 2 +- docs/categories/notes/page/5/index.html | 2 +- docs/categories/notes/page/6/index.html | 2 +- docs/categories/notes/page/7/index.html | 2 +- docs/index.html | 2 +- docs/page/2/index.html | 2 +- docs/page/3/index.html | 2 +- docs/page/4/index.html | 2 +- docs/page/5/index.html | 2 +- docs/page/6/index.html | 2 +- docs/page/7/index.html | 2 +- docs/page/8/index.html | 2 +- docs/page/9/index.html | 2 +- docs/posts/index.html | 2 +- docs/posts/page/2/index.html | 2 +- docs/posts/page/3/index.html | 2 +- docs/posts/page/4/index.html | 2 +- docs/posts/page/5/index.html | 2 +- docs/posts/page/6/index.html | 2 +- docs/posts/page/7/index.html | 2 +- docs/posts/page/8/index.html | 2 +- docs/posts/page/9/index.html | 2 +- docs/sitemap.xml | 10 +++--- 29 files changed, 111 insertions(+), 34 deletions(-) diff --git a/content/posts/2022-07.md b/content/posts/2022-07.md index b173260de..c1c0b391c 100644 --- a/content/posts/2022-07.md +++ b/content/posts/2022-07.md @@ -82,4 +82,40 @@ Time: 399.751 ms - Perhaps we need to update our list of languages to include all instead of the most common ones - I wrote a script `ilri/iso-639-value-pairs.py` to extract the names and Alpha 2 codes for all ISO 639-1 languages from pycountry and added them to `input-forms.xml` +## 2022-07-06 + +- CGSpace went down and up a few times due to high load + - I found one host in Romania making very high speed requests with a normal user agent (`Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.2; WOW64; Trident/7.0; .NET4.0E; 
.NET4.0C`): + +```console +# awk '{print $1}' /var/log/nginx/{access,library-access,oai,rest}.log | sort | uniq -c | sort -h | tail -n 10 + 516 142.132.248.90 + 525 157.55.39.234 + 587 66.249.66.21 + 593 95.108.213.59 + 1372 137.184.159.211 + 4776 54.195.118.125 + 5441 205.186.128.185 + 6267 45.5.186.2 + 15839 2a01:7e00::f03c:91ff:fe9a:3a37 + 36114 146.19.75.141 +``` + +- I added 146.19.75.141 to the list of bot networks in nginx +- While looking at the logs I started thinking about Bing again + - They apparently [publish a list of all their networks](https://www.bing.com/toolbox/bingbot.json) + - I wrote a script to use `prips` to [print the IPs for each network](https://stackoverflow.com/a/52501093/1996540) + - The script is `bing-networks-to-ips.sh` + - From Bing's IPs alone I purged 145,403 hits... sheesh +- Delete two items on CGSpace for Margarita because she was getting the "Authorization denied for action OBSOLETE (DELETE) on BITSTREAM:0b26875a-..." error + - This is the same DSpace 6 bug I noticed in 2021-03, 2021-04, and 2021-05 +- Update some `cg.audience` metadata to use "Academics" instead of "Academicians": + +```console +dspace=# UPDATE metadatavalue SET text_value='Academics' WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=144 AND text_value='Academicians'; +UPDATE 104 +``` + +- I will also have to remove "Academicians" from input-forms.xml + diff --git a/docs/2022-07/index.html b/docs/2022-07/index.html index 08323abe3..69f808265 100644 --- a/docs/2022-07/index.html +++ b/docs/2022-07/index.html @@ -19,7 +19,7 @@ Also, the trgm functions I’ve used before are case insensitive, but Levens - + @@ -44,9 +44,9 @@ Also, the trgm functions I’ve used before are case insensitive, but Levens "@type": "BlogPosting", "headline": "July, 2022", "url": "https://alanorth.github.io/cgspace-notes/2022-07/", - "wordCount": "532", + "wordCount": "739", "datePublished": "2022-07-02T14:07:36+03:00", - "dateModified": "2022-07-04T17:20:01+03:00", 
+ "dateModified": "2022-07-04T22:10:02+03:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -205,6 +205,47 @@ Also, the trgm functions I’ve used before are case insensitive, but Levens
  • I wrote a script ilri/iso-639-value-pairs.py to extract the names and Alpha 2 codes for all ISO 639-1 languages from pycountry and added them to input-forms.xml

    2022-07-06

    + +
    # awk '{print $1}' /var/log/nginx/{access,library-access,oai,rest}.log | sort | uniq -c | sort -h | tail -n 10
    +    516 142.132.248.90
    +    525 157.55.39.234
    +    587 66.249.66.21
    +    593 95.108.213.59
    +   1372 137.184.159.211
    +   4776 54.195.118.125
    +   5441 205.186.128.185
    +   6267 45.5.186.2
    +  15839 2a01:7e00::f03c:91ff:fe9a:3a37
    +  36114 146.19.75.141
    +
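The `awk | sort | uniq -c | sort -h | tail` pipeline above counts requests per client address. A rough Python equivalent is sketched below, assuming (as the pipeline does) that the client address is the first whitespace-separated field of each log line. The function name `top_clients` and the sample lines are hypothetical and not part of the notes.

```python
from collections import Counter


def top_clients(lines, n=10):
    """Return the n most frequent first fields (client addresses) with counts.

    Assumes the client address is the first whitespace-separated field,
    which is what `awk '{print $1}'` relies on in the pipeline above.
    """
    counts = Counter(
        line.split(maxsplit=1)[0]  # first field only
        for line in lines
        if line.strip()  # skip blank lines
    )
    # most_common() sorts by descending count, whereas the shell
    # pipeline prints ascending counts and keeps the last n lines.
    return counts.most_common(n)


if __name__ == "__main__":
    # Made-up sample lines for illustration only, not real log data.
    sample = [
        "192.0.2.10 - - [06/Jul/2022:10:00:00 +0300] \"GET / HTTP/1.1\" 200",
        "192.0.2.20 - - [06/Jul/2022:10:00:01 +0300] \"GET / HTTP/1.1\" 200",
        "192.0.2.10 - - [06/Jul/2022:10:00:02 +0300] \"GET / HTTP/1.1\" 200",
    ]
    for address, hits in top_clients(sample):
        print(f"{hits:7d} {address}")
```

To run it against real logs, the same function could be fed the lines of each nginx log file in turn, for example by chaining the open file objects with `itertools.chain`.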
    +
    dspace=# UPDATE metadatavalue SET text_value='Academics' WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=144 AND text_value='Academicians';
    +UPDATE 104
    +
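The notes for this day mention expanding Bing's published networks into individual IPs with `prips` in `bing-networks-to-ips.sh`. That script is not shown in this patch, so the following is only a sketch of the same idea using Python's standard `ipaddress` module instead of `prips`. It assumes the JSON document has a top-level `prefixes` list whose entries carry an `ipv4Prefix` key; that layout is an assumption and should be checked against the actual file. The function name `expand_ipv4_networks` and the sample data are hypothetical.

```python
import ipaddress


def expand_ipv4_networks(document):
    """Yield every IPv4 address in each network listed in the document.

    `document` is the parsed JSON. The "prefixes" / "ipv4Prefix" keys are
    an assumed layout, not confirmed by the notes. IPv6 prefixes are
    skipped because expanding them address by address is impractical.
    """
    for entry in document.get("prefixes", []):
        cidr = entry.get("ipv4Prefix")
        if cidr is None:
            continue
        network = ipaddress.ip_network(cidr, strict=False)
        # Iterating a network yields all addresses, including the network
        # and broadcast addresses (unlike network.hosts()).
        for address in network:
            yield str(address)


if __name__ == "__main__":
    # Hypothetical sample data in the assumed layout, not Bing's real list.
    sample = {
        "prefixes": [
            {"ipv4Prefix": "192.0.2.0/30"},
            {"ipv6Prefix": "2001:db8::/32"},
        ]
    }
    for ip in expand_ipv4_networks(sample):
        print(ip)
```

The printed addresses could then be used one per line to find and purge matching hits, which is the role the `prips` output plays in the notes.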
    diff --git a/docs/categories/index.html b/docs/categories/index.html index b327ac106..ae2e983da 100644 --- a/docs/categories/index.html +++ b/docs/categories/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/index.html b/docs/categories/notes/index.html index 445cdd726..c3803b567 100644 --- a/docs/categories/notes/index.html +++ b/docs/categories/notes/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/2/index.html b/docs/categories/notes/page/2/index.html index b43d48859..cc27d02ac 100644 --- a/docs/categories/notes/page/2/index.html +++ b/docs/categories/notes/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/3/index.html b/docs/categories/notes/page/3/index.html index 8832a4bd3..916de1585 100644 --- a/docs/categories/notes/page/3/index.html +++ b/docs/categories/notes/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/4/index.html b/docs/categories/notes/page/4/index.html index ea8e52c3d..6475a5951 100644 --- a/docs/categories/notes/page/4/index.html +++ b/docs/categories/notes/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/5/index.html b/docs/categories/notes/page/5/index.html index acaec1231..454f20c04 100644 --- a/docs/categories/notes/page/5/index.html +++ b/docs/categories/notes/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/6/index.html b/docs/categories/notes/page/6/index.html index 265e95e51..42a77601d 100644 --- a/docs/categories/notes/page/6/index.html +++ b/docs/categories/notes/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/7/index.html b/docs/categories/notes/page/7/index.html index eb6ff6aff..934f5d07b 100644 --- a/docs/categories/notes/page/7/index.html +++ b/docs/categories/notes/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/index.html b/docs/index.html index e0c776a98..8c47206f4 100644 --- a/docs/index.html +++ b/docs/index.html @@ 
-10,7 +10,7 @@ - + diff --git a/docs/page/2/index.html b/docs/page/2/index.html index dc0e8c34e..4fce30841 100644 --- a/docs/page/2/index.html +++ b/docs/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/3/index.html b/docs/page/3/index.html index f40452dab..420e5d61d 100644 --- a/docs/page/3/index.html +++ b/docs/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/4/index.html b/docs/page/4/index.html index fa3bc8013..74acd354a 100644 --- a/docs/page/4/index.html +++ b/docs/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/5/index.html b/docs/page/5/index.html index 0c0b12b41..7cde28874 100644 --- a/docs/page/5/index.html +++ b/docs/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/6/index.html b/docs/page/6/index.html index 5b7bc20a0..2dddc557a 100644 --- a/docs/page/6/index.html +++ b/docs/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/7/index.html b/docs/page/7/index.html index 24a305595..931acb758 100644 --- a/docs/page/7/index.html +++ b/docs/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/8/index.html b/docs/page/8/index.html index 922ab60db..d46da45a9 100644 --- a/docs/page/8/index.html +++ b/docs/page/8/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/9/index.html b/docs/page/9/index.html index ebe596cca..8bf78a895 100644 --- a/docs/page/9/index.html +++ b/docs/page/9/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/index.html b/docs/posts/index.html index a2ec20140..b7f586dd9 100644 --- a/docs/posts/index.html +++ b/docs/posts/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/2/index.html b/docs/posts/page/2/index.html index 6ccb04e6e..02ceab965 100644 --- a/docs/posts/page/2/index.html +++ b/docs/posts/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/3/index.html b/docs/posts/page/3/index.html index b84f71f55..faa0306bd 100644 --- a/docs/posts/page/3/index.html +++ b/docs/posts/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git 
a/docs/posts/page/4/index.html b/docs/posts/page/4/index.html index d25e70025..6c25befd8 100644 --- a/docs/posts/page/4/index.html +++ b/docs/posts/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/5/index.html b/docs/posts/page/5/index.html index 9ea89d8d3..a3063c5c5 100644 --- a/docs/posts/page/5/index.html +++ b/docs/posts/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/6/index.html b/docs/posts/page/6/index.html index b2a6d33ce..0f68729bf 100644 --- a/docs/posts/page/6/index.html +++ b/docs/posts/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/7/index.html b/docs/posts/page/7/index.html index 8043fb2a4..55f7751b3 100644 --- a/docs/posts/page/7/index.html +++ b/docs/posts/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/8/index.html b/docs/posts/page/8/index.html index 875b2095d..c17fbf9dc 100644 --- a/docs/posts/page/8/index.html +++ b/docs/posts/page/8/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/9/index.html b/docs/posts/page/9/index.html index 7042f7b14..7ab398b3c 100644 --- a/docs/posts/page/9/index.html +++ b/docs/posts/page/9/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/sitemap.xml b/docs/sitemap.xml index 2e9f0a3bc..f7a49eba9 100644 --- a/docs/sitemap.xml +++ b/docs/sitemap.xml @@ -3,19 +3,19 @@ xmlns:xhtml="http://www.w3.org/1999/xhtml"> https://alanorth.github.io/cgspace-notes/categories/ - 2022-07-04T17:20:01+03:00 + 2022-07-04T22:10:02+03:00 https://alanorth.github.io/cgspace-notes/ - 2022-07-04T17:20:01+03:00 + 2022-07-04T22:10:02+03:00 https://alanorth.github.io/cgspace-notes/2022-07/ - 2022-07-04T17:20:01+03:00 + 2022-07-04T22:10:02+03:00 https://alanorth.github.io/cgspace-notes/categories/notes/ - 2022-07-04T17:20:01+03:00 + 2022-07-04T22:10:02+03:00 https://alanorth.github.io/cgspace-notes/posts/ - 2022-07-04T17:20:01+03:00 + 2022-07-04T22:10:02+03:00 https://alanorth.github.io/cgspace-notes/2022-06/ 2022-07-04T09:25:14+03:00