diff --git a/content/posts/2021-05.md b/content/posts/2021-05.md new file mode 100644 index 000000000..13f28274a --- /dev/null +++ b/content/posts/2021-05.md @@ -0,0 +1,95 @@
+---
+title: "May, 2021"
+date: 2021-05-02T09:50:54+03:00
+author: "Alan Orth"
+categories: ["Notes"]
+---
+
+## 2021-05-01
+
+- I looked at the top user agents and IPs in the Solr statistics for last month and I see these user agents:
+  - "RI/1.0", 1337
+  - "Microsoft Office Word 2014", 941
+- I will add the RI/1.0 pattern to our DSpace agents overload and purge them from Solr (we had previously seen this agent with 9,000 hits or so in 2020-09), but I think I will leave the Microsoft Word one... as that's an actual user...
+
+
+
+- I should probably add the `RI/1.0` pattern to the COUNTER-Robots project
+- As well as these IPs:
+  - 193.169.254.178, 21648
+  - 181.62.166.177, 20323
+  - 45.146.166.180, 19376
+- The first IP seems to be in Estonia and their requests to the REST API change user agents from curl to Mac OS X to Windows and more
+  - Also, they seem to be trying to exploit something:
+
+```console
+193.169.254.178 - - [21/Apr/2021:01:59:01 +0200] "GET /rest/collections/1179/items?limit=812&expand=metadata\x22%20and%20\x2221\x22=\x2221 HTTP/1.1" 400 5 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
+193.169.254.178 - - [21/Apr/2021:02:00:36 +0200] "GET /rest/collections/1179/items?limit=812&expand=metadata-21%2B21*01 HTTP/1.1" 200 458201 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
+193.169.254.178 - - [21/Apr/2021:02:00:36 +0200] "GET /rest/collections/1179/items?limit=812&expand=metadata'||lower('')||' HTTP/1.1" 400 5 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
+193.169.254.178 - - [21/Apr/2021:02:02:10 +0200] "GET /rest/collections/1179/items?limit=812&expand=metadata'%2Brtrim('')%2B' HTTP/1.1" 200 458209 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
+```
+
+- I will report the IP on abuseipdb.com and purge their hits from Solr
+- The second IP is in Colombia and is making thousands of requests for what looks like some test site:
+
+```console
+181.62.166.177 - - [20/Apr/2021:22:48:42 +0200] "GET /rest/collections/d1e11546-c62a-4aee-af91-fd482b3e7653/items?expand=metadata HTTP/2.0" 200 123613 "http://cassavalighthousetest.org/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36"
+181.62.166.177 - - [20/Apr/2021:22:55:39 +0200] "GET /rest/collections/d1e11546-c62a-4aee-af91-fd482b3e7653/items?expand=metadata HTTP/2.0" 200 123613 "http://cassavalighthousetest.org/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36"
+```
+
+- But this site does not exist (yet?)
+  - I will purge them from Solr
+- The third IP is in Russia apparently, and the user agent has the `pl-PL` locale with thousands of requests like this:
+
+```console
+45.146.166.180 - - [18/Apr/2021:16:28:44 +0200] "GET /bitstream/handle/10947/4153/.AAS%202014%20Annual%20Report.pdf?sequence=1%22%29%29%20AND%201691%3DUTL_INADDR.GET_HOST_ADDRESS%28CHR%28113%29%7C%7CCHR%28118%29%7C%7CCHR%28113%29%7C%7CCHR%28106%29%7C%7CCHR%28113%29%7C%7C%28SELECT%20%28CASE%20WHEN%20%281691%3D1691%29%20THEN%201%20ELSE%200%20END%29%20FROM%20DUAL%29%7C%7CCHR%28113%29%7C%7CCHR%2898%29%7C%7CCHR%28122%29%7C%7CCHR%28120%29%7C%7CCHR%28113%29%29%20AND%20%28%28%22RKbp%22%3D%22RKbp&isAllowed=y HTTP/1.1" 200 918998 "http://cgspace.cgiar.org:80/bitstream/handle/10947/4153/.AAS 2014 Annual Report.pdf" "Mozilla/5.0 (Windows; U; Windows NT 5.1; pl-PL) AppleWebKit/523.15 (KHTML, like Gecko) Version/3.0 Safari/523.15"
+```
+
+- I will purge these all with my `check-spider-ip-hits.sh` script:
+
+```console
+$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips.txt -p
+Purging 21648 hits from 193.169.254.178 in statistics
+Purging 20323 hits from 181.62.166.177 in statistics
+Purging 19376 hits from 45.146.166.180 in statistics
+
+Total number of bot hits purged: 61347
+```
+
+## 2021-05-02
+
+- Check the AReS Harvester indexes:
+
+```console
+$ curl -s http://localhost:9200/_cat/indices | grep openrxv-items
+yellow open openrxv-items-temp H-CGsyyLTaqAj6-nKXZ-7w 1 1 0 0 283b 283b
+yellow open openrxv-items-final ul3SKsa7Q9Cd_K7qokBY_w 1 1 103951 0 254mb 254mb
+$ curl -s 'http://localhost:9200/_alias/' | python -m json.tool
+...
+    "openrxv-items-temp": {
+        "aliases": {}
+    },
+    "openrxv-items-final": {
+        "aliases": {
+            "openrxv-items": {}
+        }
+    },
+```
+
+- I think they look OK (`openrxv-items` is an alias of `openrxv-items-final`), but I took a backup just in case:
+
+```console
+$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_mapping.json --type=mapping
+$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_data.json --type=data --limit=1000
+```
+
+- Then I started an indexing in the AReS Explorer admin dashboard
+- The indexing finished, but it looks like the aliases are messed up again:
+
+```console
+$ curl -s http://localhost:9200/_cat/indices | grep openrxv-items
+yellow open openrxv-items-temp H-CGsyyLTaqAj6-nKXZ-7w 1 1 104165 105024 487.7mb 487.7mb
+yellow open openrxv-items-final d0tbMM_SRWimirxr_gm9YA 1 1 937 0 2.2mb 2.2mb
+```
+
+
diff --git a/docs/2015-11/index.html b/docs/2015-11/index.html index 0649b801d..1538f6de4 100644 --- a/docs/2015-11/index.html +++ b/docs/2015-11/index.html @@ -242,6 +242,8 @@ db.statementpool = true
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -250,8 +252,6 @@ db.statementpool = true
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2015-12/index.html b/docs/2015-12/index.html index f5ffbcf73..7f559c718 100644 --- a/docs/2015-12/index.html +++ b/docs/2015-12/index.html @@ -264,6 +264,8 @@ $ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -272,8 +274,6 @@ $ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2016-01/index.html b/docs/2016-01/index.html index 91a2bd63a..f6a809aa4 100644 --- a/docs/2016-01/index.html +++ b/docs/2016-01/index.html @@ -200,6 +200,8 @@ $ find SimpleArchiveForBio/ -iname “*.pdf” -exec basename {} ; | sor
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -208,8 +210,6 @@ $ find SimpleArchiveForBio/ -iname “*.pdf” -exec basename {} ; | sor
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2016-02/index.html b/docs/2016-02/index.html index dd380da1b..617e26b53 100644 --- a/docs/2016-02/index.html +++ b/docs/2016-02/index.html @@ -378,6 +378,8 @@ Bitstream: tést señora alimentación.pdf
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -386,8 +388,6 @@ Bitstream: tést señora alimentación.pdf
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2016-03/index.html b/docs/2016-03/index.html index 4b0de9771..5132a5c79 100644 --- a/docs/2016-03/index.html +++ b/docs/2016-03/index.html @@ -316,6 +316,8 @@ Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Ja
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -324,8 +326,6 @@ Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Ja
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2016-04/index.html b/docs/2016-04/index.html index 76ff53a7f..eeffc8b2d 100644 --- a/docs/2016-04/index.html +++ b/docs/2016-04/index.html @@ -495,6 +495,8 @@ dspace.log.2016-04-27:7271
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -503,8 +505,6 @@ dspace.log.2016-04-27:7271
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2016-05/index.html b/docs/2016-05/index.html index 18268ee56..9d6d4b9cf 100644 --- a/docs/2016-05/index.html +++ b/docs/2016-05/index.html @@ -371,6 +371,8 @@ sys 0m20.540s
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -379,8 +381,6 @@ sys 0m20.540s
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2016-06/index.html b/docs/2016-06/index.html index a5416361a..ec5d0179c 100644 --- a/docs/2016-06/index.html +++ b/docs/2016-06/index.html @@ -409,6 +409,8 @@ $ ./delete-metadata-values.py -f dc.contributor.corporate -i Corporate-Authors-D
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -417,8 +419,6 @@ $ ./delete-metadata-values.py -f dc.contributor.corporate -i Corporate-Authors-D
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2016-07/index.html b/docs/2016-07/index.html index ac7f1f933..9e53dbda4 100644 --- a/docs/2016-07/index.html +++ b/docs/2016-07/index.html @@ -325,6 +325,8 @@ discovery.index.authority.ignore-variants=true
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -333,8 +335,6 @@ discovery.index.authority.ignore-variants=true
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2016-08/index.html b/docs/2016-08/index.html index 2a8bb0893..d538031aa 100644 --- a/docs/2016-08/index.html +++ b/docs/2016-08/index.html @@ -389,6 +389,8 @@ $ JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx512m" /home/cgspace.cgiar.org/b
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -397,8 +399,6 @@ $ JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx512m" /home/cgspace.cgiar.org/b
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2016-09/index.html b/docs/2016-09/index.html index dffc60bb8..5e24ecdea 100644 --- a/docs/2016-09/index.html +++ b/docs/2016-09/index.html @@ -606,6 +606,8 @@ $ ./delete-metadata-values.py -i ilrisubjects-delete-13.csv -f cg.subject.ilri -
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -614,8 +616,6 @@ $ ./delete-metadata-values.py -i ilrisubjects-delete-13.csv -f cg.subject.ilri -
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2016-10/index.html b/docs/2016-10/index.html index 8896352f6..7bd14152d 100644 --- a/docs/2016-10/index.html +++ b/docs/2016-10/index.html @@ -372,6 +372,8 @@ dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http:
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -380,8 +382,6 @@ dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http:
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2016-11/index.html b/docs/2016-11/index.html index cd2d8154f..5c659eaf9 100644 --- a/docs/2016-11/index.html +++ b/docs/2016-11/index.html @@ -548,6 +548,8 @@ org.dspace.discovery.SearchServiceException: Error executing query
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -556,8 +558,6 @@ org.dspace.discovery.SearchServiceException: Error executing query
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2016-12/index.html b/docs/2016-12/index.html index 732be5390..4dbc174cc 100644 --- a/docs/2016-12/index.html +++ b/docs/2016-12/index.html @@ -784,6 +784,8 @@ $ exit
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -792,8 +794,6 @@ $ exit
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2017-01/index.html b/docs/2017-01/index.html index 401084b03..e1d3b97fb 100644 --- a/docs/2017-01/index.html +++ b/docs/2017-01/index.html @@ -369,6 +369,8 @@ $ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -377,8 +379,6 @@ $ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2017-02/index.html b/docs/2017-02/index.html index 1143b73aa..d2833bea1 100644 --- a/docs/2017-02/index.html +++ b/docs/2017-02/index.html @@ -424,6 +424,8 @@ COPY 1968
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -432,8 +434,6 @@ COPY 1968
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2017-03/index.html b/docs/2017-03/index.html index 1ba032b35..babf9590e 100644 --- a/docs/2017-03/index.html +++ b/docs/2017-03/index.html @@ -355,6 +355,8 @@ $ ./delete-metadata-values.py -i Investors-Delete-121.csv -f dc.description.spon
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -363,8 +365,6 @@ $ ./delete-metadata-values.py -i Investors-Delete-121.csv -f dc.description.spon
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2017-04/index.html b/docs/2017-04/index.html index 74bd8ac0f..c5fb21ca8 100644 --- a/docs/2017-04/index.html +++ b/docs/2017-04/index.html @@ -585,6 +585,8 @@ $ gem install compass -v 1.0.3
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -593,8 +595,6 @@ $ gem install compass -v 1.0.3
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2017-05/index.html b/docs/2017-05/index.html index 6e6dfd872..d95c389db 100644 --- a/docs/2017-05/index.html +++ b/docs/2017-05/index.html @@ -391,6 +391,8 @@ UPDATE 187
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -399,8 +401,6 @@ UPDATE 187
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2017-06/index.html b/docs/2017-06/index.html index 044384496..2a69d6302 100644 --- a/docs/2017-06/index.html +++ b/docs/2017-06/index.html @@ -270,6 +270,8 @@ $ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" [dspace]/bin/dspace impo
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -278,8 +280,6 @@ $ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" [dspace]/bin/dspace impo
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2017-07/index.html b/docs/2017-07/index.html index b1c3cae4e..0d32efe87 100644 --- a/docs/2017-07/index.html +++ b/docs/2017-07/index.html @@ -275,6 +275,8 @@ delete from metadatavalue where resource_type_id=2 and metadata_field_id=235 and
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -283,8 +285,6 @@ delete from metadatavalue where resource_type_id=2 and metadata_field_id=235 and
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2017-08/index.html b/docs/2017-08/index.html index e45690f23..f4c1dc717 100644 --- a/docs/2017-08/index.html +++ b/docs/2017-08/index.html @@ -517,6 +517,8 @@ org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -525,8 +527,6 @@ org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2017-09/index.html b/docs/2017-09/index.html index fd60a5147..75bbada10 100644 --- a/docs/2017-09/index.html +++ b/docs/2017-09/index.html @@ -659,6 +659,8 @@ Cert Status: good
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -667,8 +669,6 @@ Cert Status: good
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2017-10/index.html b/docs/2017-10/index.html index e68ab6db5..b2344c657 100644 --- a/docs/2017-10/index.html +++ b/docs/2017-10/index.html @@ -443,6 +443,8 @@ session_id=6C30F10B4351A4ED83EC6ED50AFD6B6A
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -451,8 +453,6 @@ session_id=6C30F10B4351A4ED83EC6ED50AFD6B6A
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2017-11/index.html b/docs/2017-11/index.html index b0845e2ec..c99304045 100644 --- a/docs/2017-11/index.html +++ b/docs/2017-11/index.html @@ -944,6 +944,8 @@ $ cat dspace.log.2017-11-28 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | u
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -952,8 +954,6 @@ $ cat dspace.log.2017-11-28 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | u
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2017-12/index.html b/docs/2017-12/index.html index 6048450de..49427e3a2 100644 --- a/docs/2017-12/index.html +++ b/docs/2017-12/index.html @@ -783,6 +783,8 @@ DELETE 20
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -791,8 +793,6 @@ DELETE 20
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2018-01/index.html b/docs/2018-01/index.html index c5f0f74b0..2c00c4ec7 100644 --- a/docs/2018-01/index.html +++ b/docs/2018-01/index.html @@ -1452,6 +1452,8 @@ Catalina:type=Manager,context=/,host=localhost activeSessions 8
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -1460,8 +1462,6 @@ Catalina:type=Manager,context=/,host=localhost activeSessions 8
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2018-02/index.html b/docs/2018-02/index.html index 04d94c766..0c9366dfa 100644 --- a/docs/2018-02/index.html +++ b/docs/2018-02/index.html @@ -1039,6 +1039,8 @@ UPDATE 3
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -1047,8 +1049,6 @@ UPDATE 3
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2018-03/index.html b/docs/2018-03/index.html index 745d8236e..6734f952f 100644 --- a/docs/2018-03/index.html +++ b/docs/2018-03/index.html @@ -585,6 +585,8 @@ Fixed 5 occurences of: GENEBANKS
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -593,8 +595,6 @@ Fixed 5 occurences of: GENEBANKS
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2018-04/index.html b/docs/2018-04/index.html index ce68b338b..2a6122479 100644 --- a/docs/2018-04/index.html +++ b/docs/2018-04/index.html @@ -594,6 +594,8 @@ $ pg_restore -O -U dspacetest -d dspacetest -W -h localhost /tmp/dspace_2018-04-
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -602,8 +604,6 @@ $ pg_restore -O -U dspacetest -d dspacetest -W -h localhost /tmp/dspace_2018-04-
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2018-05/index.html b/docs/2018-05/index.html index f4fcf32a7..b244c9f71 100644 --- a/docs/2018-05/index.html +++ b/docs/2018-05/index.html @@ -523,6 +523,8 @@ $ psql -h localhost -U postgres dspacetest
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -531,8 +533,6 @@ $ psql -h localhost -U postgres dspacetest
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2018-06/index.html b/docs/2018-06/index.html index be74b197a..fb7c37b97 100644 --- a/docs/2018-06/index.html +++ b/docs/2018-06/index.html @@ -517,6 +517,8 @@ $ sed '/^id/d' 10568-*.csv | csvcut -c 1,2 > map-to-cifor-archive.csv
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -525,8 +527,6 @@ $ sed '/^id/d' 10568-*.csv | csvcut -c 1,2 > map-to-cifor-archive.csv
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2018-07/index.html b/docs/2018-07/index.html index 84eb91ca7..9c8144797 100644 --- a/docs/2018-07/index.html +++ b/docs/2018-07/index.html @@ -569,6 +569,8 @@ dspace=# select count(text_value) from metadatavalue where resource_type_id=2 an
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -577,8 +579,6 @@ dspace=# select count(text_value) from metadatavalue where resource_type_id=2 an
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2018-08/index.html b/docs/2018-08/index.html index 3a77b961d..dec7ee15c 100644 --- a/docs/2018-08/index.html +++ b/docs/2018-08/index.html @@ -442,6 +442,8 @@ $ dspace database migrate ignored
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -450,8 +452,6 @@ $ dspace database migrate ignored
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2018-09/index.html b/docs/2018-09/index.html index 1a51b5f5b..8ac1c899e 100644 --- a/docs/2018-09/index.html +++ b/docs/2018-09/index.html @@ -748,6 +748,8 @@ UPDATE metadatavalue SET text_value='ja' WHERE resource_type_id=2 AND metadata_f
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -756,8 +758,6 @@ UPDATE metadatavalue SET text_value='ja' WHERE resource_type_id=2 AND metadata_f
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2018-10/index.html b/docs/2018-10/index.html index e5726f758..c1121e5af 100644 --- a/docs/2018-10/index.html +++ b/docs/2018-10/index.html @@ -656,6 +656,8 @@ $ curl -X GET -H "Content-Type: application/json" -H "Accept: app
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -664,8 +666,6 @@ $ curl -X GET -H "Content-Type: application/json" -H "Accept: app
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2018-11/index.html b/docs/2018-11/index.html index 0133f9e09..48373d52d 100644 --- a/docs/2018-11/index.html +++ b/docs/2018-11/index.html @@ -553,6 +553,8 @@ $ dspace dsrun org.dspace.eperson.Groomer -a -b 11/27/2016 -d
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -561,8 +563,6 @@ $ dspace dsrun org.dspace.eperson.Groomer -a -b 11/27/2016 -d
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2018-12/index.html b/docs/2018-12/index.html index 57799f919..c19c828bf 100644 --- a/docs/2018-12/index.html +++ b/docs/2018-12/index.html @@ -594,6 +594,8 @@ UPDATE 1
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -602,8 +604,6 @@ UPDATE 1
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2019-01/index.html b/docs/2019-01/index.html index cd052a075..76c4ff3a4 100644 --- a/docs/2019-01/index.html +++ b/docs/2019-01/index.html @@ -1264,6 +1264,8 @@ identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -1272,8 +1274,6 @@ identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2019-02/index.html b/docs/2019-02/index.html index 868b8b611..9c998304f 100644 --- a/docs/2019-02/index.html +++ b/docs/2019-02/index.html @@ -1344,6 +1344,8 @@ Please see the DSpace documentation for assistance.
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -1352,8 +1354,6 @@ Please see the DSpace documentation for assistance.
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2019-03/index.html b/docs/2019-03/index.html index e5d869ce7..38d48590c 100644 --- a/docs/2019-03/index.html +++ b/docs/2019-03/index.html @@ -1208,6 +1208,8 @@ sys 0m2.551s
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -1216,8 +1218,6 @@ sys 0m2.551s
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2019-04/index.html b/docs/2019-04/index.html index 4cd081d76..809440a7d 100644 --- a/docs/2019-04/index.html +++ b/docs/2019-04/index.html @@ -1299,6 +1299,8 @@ UPDATE 14
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -1307,8 +1309,6 @@ UPDATE 14
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2019-05/index.html b/docs/2019-05/index.html index dc97039b8..7730965ca 100644 --- a/docs/2019-05/index.html +++ b/docs/2019-05/index.html @@ -631,6 +631,8 @@ COPY 64871
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -639,8 +641,6 @@ COPY 64871
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2019-06/index.html b/docs/2019-06/index.html index 3e2267f06..c2b516429 100644 --- a/docs/2019-06/index.html +++ b/docs/2019-06/index.html @@ -317,6 +317,8 @@ UPDATE 2
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -325,8 +327,6 @@ UPDATE 2
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2019-07/index.html b/docs/2019-07/index.html index 3c5d0428c..72e44b004 100644 --- a/docs/2019-07/index.html +++ b/docs/2019-07/index.html @@ -554,6 +554,8 @@ issn.validate('1020-3362')
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -562,8 +564,6 @@ issn.validate('1020-3362')
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2019-08/index.html b/docs/2019-08/index.html index 9f15c2daa..7032656d9 100644 --- a/docs/2019-08/index.html +++ b/docs/2019-08/index.html @@ -573,6 +573,8 @@ sys 2m27.496s
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -581,8 +583,6 @@ sys 2m27.496s
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2019-09/index.html b/docs/2019-09/index.html index 0106d6d73..f8d96ce6a 100644 --- a/docs/2019-09/index.html +++ b/docs/2019-09/index.html @@ -581,6 +581,8 @@ $ csv-metadata-quality -i /tmp/clarisa-institutions.csv -o /tmp/clarisa-institut
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -589,8 +591,6 @@ $ csv-metadata-quality -i /tmp/clarisa-institutions.csv -o /tmp/clarisa-institut
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2019-10/index.html b/docs/2019-10/index.html index 30031574a..cbe4e8cfe 100644 --- a/docs/2019-10/index.html +++ b/docs/2019-10/index.html @@ -385,6 +385,8 @@ $ dspace import -a -c 10568/104057 -e fuu@cgiar.org -m 2019-10-15-Bioversity.map
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -393,8 +395,6 @@ $ dspace import -a -c 10568/104057 -e fuu@cgiar.org -m 2019-10-15-Bioversity.map
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2019-11/index.html b/docs/2019-11/index.html index 1abc9fdeb..0fd59116c 100644 --- a/docs/2019-11/index.html +++ b/docs/2019-11/index.html @@ -692,6 +692,8 @@ $ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-i
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -700,8 +702,6 @@ $ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-i
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2019-12/index.html b/docs/2019-12/index.html index 7bb29f8c2..680d604f4 100644 --- a/docs/2019-12/index.html +++ b/docs/2019-12/index.html @@ -404,6 +404,8 @@ UPDATE 1
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -412,8 +414,6 @@ UPDATE 1
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2020-01/index.html b/docs/2020-01/index.html index 2c7362fe7..6f97b7c74 100644 --- a/docs/2020-01/index.html +++ b/docs/2020-01/index.html @@ -604,6 +604,8 @@ COPY 2900
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -612,8 +614,6 @@ COPY 2900
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2020-02/index.html b/docs/2020-02/index.html index 31714f92f..b477dfb18 100644 --- a/docs/2020-02/index.html +++ b/docs/2020-02/index.html @@ -1275,6 +1275,8 @@ Moving: 21993 into core statistics-2019
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -1283,8 +1285,6 @@ Moving: 21993 into core statistics-2019
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2020-03/index.html b/docs/2020-03/index.html index dbbb0631a..43d5b39eb 100644 --- a/docs/2020-03/index.html +++ b/docs/2020-03/index.html @@ -484,6 +484,8 @@ $ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-i
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -492,8 +494,6 @@ $ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-i
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2020-04/index.html b/docs/2020-04/index.html index 2c2e76e7a..5cb5e6f86 100644 --- a/docs/2020-04/index.html +++ b/docs/2020-04/index.html @@ -658,6 +658,8 @@ $ psql -c 'select * from pg_stat_activity' | wc -l
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -666,8 +668,6 @@ $ psql -c 'select * from pg_stat_activity' | wc -l
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2020-05/index.html b/docs/2020-05/index.html index 979b7079c..266bd0afd 100644 --- a/docs/2020-05/index.html +++ b/docs/2020-05/index.html @@ -477,6 +477,8 @@ Caused by: java.lang.NullPointerException
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -485,8 +487,6 @@ Caused by: java.lang.NullPointerException
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2020-06/index.html b/docs/2020-06/index.html index 0ce7d10b0..3c55c1489 100644 --- a/docs/2020-06/index.html +++ b/docs/2020-06/index.html @@ -811,6 +811,8 @@ $ csvcut -c 'id,cg.subject.ilri[],cg.subject.ilri[en_US],dc.subject[en_US]' /tmp
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -819,8 +821,6 @@ $ csvcut -c 'id,cg.subject.ilri[],cg.subject.ilri[en_US],dc.subject[en_US]' /tmp
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2020-07/index.html b/docs/2020-07/index.html index fb4d4252c..2fcc9712e 100644 --- a/docs/2020-07/index.html +++ b/docs/2020-07/index.html @@ -1142,6 +1142,8 @@ Fixed 4 occurences of: Muloi, D.M.
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -1150,8 +1152,6 @@ Fixed 4 occurences of: Muloi, D.M.
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2020-08/index.html b/docs/2020-08/index.html index 2964a723f..cdd19a350 100644 --- a/docs/2020-08/index.html +++ b/docs/2020-08/index.html @@ -798,6 +798,8 @@ $ grep -c added /tmp/2020-08-27-countrycodetagger.log
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -806,8 +808,6 @@ $ grep -c added /tmp/2020-08-27-countrycodetagger.log
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2020-09/index.html b/docs/2020-09/index.html index 7004fe0ad..61198443e 100644 --- a/docs/2020-09/index.html +++ b/docs/2020-09/index.html @@ -717,6 +717,8 @@ solr_query_params = {
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -725,8 +727,6 @@ solr_query_params = {
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2020-10/index.html b/docs/2020-10/index.html index ba16248a7..27c3ae295 100644 --- a/docs/2020-10/index.html +++ b/docs/2020-10/index.html @@ -1241,6 +1241,8 @@ $ ./delete-metadata-values.py -i 2020-10-31-delete-74-sponsors.csv -db dspace -u
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -1249,8 +1251,6 @@ $ ./delete-metadata-values.py -i 2020-10-31-delete-74-sponsors.csv -db dspace -u
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2020-11/index.html b/docs/2020-11/index.html index 9ab46534f..85e4a194d 100644 --- a/docs/2020-11/index.html +++ b/docs/2020-11/index.html @@ -731,6 +731,8 @@ $ ./fix-metadata-values.py -i 2020-11-30-fix-hung-orcid.csv -db dspace63 -u dspa
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -739,8 +741,6 @@ $ ./fix-metadata-values.py -i 2020-11-30-fix-hung-orcid.csv -db dspace63 -u dspa
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2020-12/index.html b/docs/2020-12/index.html index 25678bde2..ed398bfa3 100644 --- a/docs/2020-12/index.html +++ b/docs/2020-12/index.html @@ -869,6 +869,8 @@ $ curl -XDELETE 'http://localhost:9200/openrxv-items-2020-12-29?pretty'
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -877,8 +879,6 @@ $ curl -XDELETE 'http://localhost:9200/openrxv-items-2020-12-29?pretty'
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2021-01/index.html b/docs/2021-01/index.html index 71226db8c..dd6f15f25 100644 --- a/docs/2021-01/index.html +++ b/docs/2021-01/index.html @@ -688,6 +688,8 @@ java.lang.IllegalArgumentException: Invalid character found in the request targe
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -696,8 +698,6 @@ java.lang.IllegalArgumentException: Invalid character found in the request targe
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2021-02/index.html b/docs/2021-02/index.html index 7f73518ad..cbabbc00a 100644 --- a/docs/2021-02/index.html +++ b/docs/2021-02/index.html @@ -899,6 +899,8 @@ dspace.log.2021-02-28:0
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -907,8 +909,6 @@ dspace.log.2021-02-28:0
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2021-03/index.html b/docs/2021-03/index.html index 1dacaf784..391b91070 100644 --- a/docs/2021-03/index.html +++ b/docs/2021-03/index.html @@ -875,6 +875,8 @@ COPY 3081
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -883,8 +885,6 @@ COPY 3081
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2021-04/index.html b/docs/2021-04/index.html index 9394b8eeb..bd81ce568 100644 --- a/docs/2021-04/index.html +++ b/docs/2021-04/index.html @@ -24,7 +24,7 @@ Perhaps one of the containers crashed, I should have looked closer but I was in - + @@ -56,7 +56,7 @@ Perhaps one of the containers crashed, I should have looked closer but I was in "url": "https://alanorth.github.io/cgspace-notes/2021-04/", "wordCount": "4669", "datePublished": "2021-04-01T09:50:54+03:00", - "dateModified": "2021-04-26T15:58:48+03:00", + "dateModified": "2021-04-28T18:57:48+03:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -1042,6 +1042,8 @@ $ ./ilri/doi-to-handle.py -i /tmp/dois.txt -o /tmp/handles.csv -db dspace63 -u d
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -1050,8 +1052,6 @@ $ ./ilri/doi-to-handle.py -i /tmp/dois.txt -o /tmp/handles.csv -db dspace63 -u d
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/2021-05/index.html b/docs/2021-05/index.html new file mode 100644 index 000000000..664592fdd --- /dev/null +++ b/docs/2021-05/index.html @@ -0,0 +1,279 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + May, 2021 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

May, 2021

+ +
+

2021-05-01

+
    +
  • I looked at the top user agents and IPs in the Solr statistics for last month and I see these user agents: +
      +
    • “RI/1.0”, 1337
    • +
    • “Microsoft Office Word 2014”, 941
    • +
    +
  • +
  • I will add the RI/1.0 pattern to our DSpace agents overload and purge them from Solr (we had previously seen this agent with 9,000 hits or so in 2020-09), but I think I will leave the Microsoft Word one… as that’s an actual user…
  • +
+
    +
  • I should probably add the RI/1.0 pattern to the COUNTER-Robots project
  • +
  • As well as these IPs: +
      +
    • 193.169.254.178, 21648
    • +
    • 181.62.166.177, 20323
    • +
    • 45.146.166.180, 19376
    • +
    +
  • +
  • The first IP seems to be in Estonia, and its requests to the REST API rotate user agents from curl to Mac OS X to Windows and more +
      +
    • Also, they seem to be trying to exploit something, most likely probing for SQL injection:
    • +
    +
  • +
+
193.169.254.178 - - [21/Apr/2021:01:59:01 +0200] "GET /rest/collections/1179/items?limit=812&expand=metadata\x22%20and%20\x2221\x22=\x2221 HTTP/1.1" 400 5 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
+193.169.254.178 - - [21/Apr/2021:02:00:36 +0200] "GET /rest/collections/1179/items?limit=812&expand=metadata-21%2B21*01 HTTP/1.1" 200 458201 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
+193.169.254.178 - - [21/Apr/2021:02:00:36 +0200] "GET /rest/collections/1179/items?limit=812&expand=metadata'||lower('')||' HTTP/1.1" 400 5 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
+193.169.254.178 - - [21/Apr/2021:02:02:10 +0200] "GET /rest/collections/1179/items?limit=812&expand=metadata'%2Brtrim('')%2B' HTTP/1.1" 200 458209 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
+
    +
  • I will report the IP on abuseipdb.com and purge their hits from Solr
  • +
  • The second IP is in Colombia and is making thousands of requests with a referrer that looks like some test site:
  • +
+
181.62.166.177 - - [20/Apr/2021:22:48:42 +0200] "GET /rest/collections/d1e11546-c62a-4aee-af91-fd482b3e7653/items?expand=metadata HTTP/2.0" 200 123613 "http://cassavalighthousetest.org/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36"
+181.62.166.177 - - [20/Apr/2021:22:55:39 +0200] "GET /rest/collections/d1e11546-c62a-4aee-af91-fd482b3e7653/items?expand=metadata HTTP/2.0" 200 123613 "http://cassavalighthousetest.org/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36"
+
    +
  • But this site does not exist (yet?) +
      +
    • I will purge them from Solr
    • +
    +
  • +
  • The third IP is apparently in Russia, and its user agent has the pl-PL locale; it made thousands of requests like this:
  • +
+
45.146.166.180 - - [18/Apr/2021:16:28:44 +0200] "GET /bitstream/handle/10947/4153/.AAS%202014%20Annual%20Report.pdf?sequence=1%22%29%29%20AND%201691%3DUTL_INADDR.GET_HOST_ADDRESS%28CHR%28113%29%7C%7CCHR%28118%29%7C%7CCHR%28113%29%7C%7CCHR%28106%29%7C%7CCHR%28113%29%7C%7C%28SELECT%20%28CASE%20WHEN%20%281691%3D1691%29%20THEN%201%20ELSE%200%20END%29%20FROM%20DUAL%29%7C%7CCHR%28113%29%7C%7CCHR%2898%29%7C%7CCHR%28122%29%7C%7CCHR%28120%29%7C%7CCHR%28113%29%29%20AND%20%28%28%22RKbp%22%3D%22RKbp&isAllowed=y HTTP/1.1" 200 918998 "http://cgspace.cgiar.org:80/bitstream/handle/10947/4153/.AAS 2014 Annual Report.pdf" "Mozilla/5.0 (Windows; U; Windows NT 5.1; pl-PL) AppleWebKit/523.15 (KHTML, like Gecko) Version/3.0 Safari/523.15"
+
    +
  • I will purge all of these with my check-spider-ip-hits.sh script:
  • +
+
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips.txt -p
+Purging 21648 hits from 193.169.254.178 in statistics
+Purging 20323 hits from 181.62.166.177 in statistics
+Purging 19376 hits from 45.146.166.180 in statistics
+
+Total number of bot hits purged: 61347
+
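As a quick sanity check on the purge output above, the per-IP counts should add up to the reported total. This is a minimal sketch in Python, assuming the script prints one `Purging N hits from IP in statistics` line per address as shown; the function name `sum_purged_hits` is hypothetical and not part of `check-spider-ip-hits.sh`.

```python
import re

# Matches lines like "Purging 21648 hits from 193.169.254.178 in statistics"
PURGE_LINE = re.compile(r"^Purging (\d+) hits from (\S+) in statistics$")


def sum_purged_hits(output):
    """Add up the per-IP hit counts found in the purge output (hypothetical helper)."""
    total = 0
    for line in output.splitlines():
        match = PURGE_LINE.match(line.strip())
        if match:
            total += int(match.group(1))
    return total


# Lines copied from the purge output above
output = """\
Purging 21648 hits from 193.169.254.178 in statistics
Purging 20323 hits from 181.62.166.177 in statistics
Purging 19376 hits from 45.146.166.180 in statistics
"""

print(f"Total number of bot hits purged: {sum_purged_hits(output)}")
```

The sum agrees with the total the script reported.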

2021-05-02

+
    +
  • Check the AReS Harvester indexes:
  • +
+
$ curl -s http://localhost:9200/_cat/indices | grep openrxv-items
+yellow open openrxv-items-temp       H-CGsyyLTaqAj6-nKXZ-7w 1 1      0 0    283b    283b
+yellow open openrxv-items-final      ul3SKsa7Q9Cd_K7qokBY_w 1 1 103951 0   254mb   254mb
+$ curl -s 'http://localhost:9200/_alias/' | python -m json.tool
+...
+    "openrxv-items-temp": {
+        "aliases": {}
+    },
+    "openrxv-items-final": {
+        "aliases": {
+            "openrxv-items": {}
+        }
+    },
+
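The alias listing above can also be checked programmatically instead of by eye. This is a rough sketch, assuming the `_alias` response has the shape shown (index name mapped to an `aliases` object); the helper `indices_for_alias` is a hypothetical name, and the sample JSON is trimmed to the two indexes from the output.

```python
import json


def indices_for_alias(alias_response, alias):
    """Return the sorted index names that carry the given alias (hypothetical helper)."""
    return sorted(
        index
        for index, info in alias_response.items()
        if alias in info.get("aliases", {})
    )


# Trimmed copy of the _alias response shown above
sample = json.loads("""
{
    "openrxv-items-temp": {"aliases": {}},
    "openrxv-items-final": {"aliases": {"openrxv-items": {}}}
}
""")

print(indices_for_alias(sample, "openrxv-items"))
```

For this sample the only index carrying the `openrxv-items` alias is `openrxv-items-final`.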
    +
  • I think they look OK (openrxv-items is an alias of openrxv-items-final), but I took a backup just in case:
  • +
+
$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_mapping.json --type=mapping
+$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_data.json --type=data --limit=1000
+
    +
  • Then I started an indexing in the AReS Explorer admin dashboard
  • +
  • The indexing finished, but it looks like the aliases are messed up again:
  • +
+
$ curl -s http://localhost:9200/_cat/indices | grep openrxv-items
+yellow open openrxv-items-temp       H-CGsyyLTaqAj6-nKXZ-7w 1 1 104165 105024 487.7mb 487.7mb
+yellow open openrxv-items-final      d0tbMM_SRWimirxr_gm9YA 1 1    937      0   2.2mb   2.2mb
+
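The symptom in the listing above is that nearly all documents sit in `openrxv-items-temp` while `openrxv-items-final` holds only a few hundred. The sketch below flags that situation from the `_cat/indices` output, assuming the default column order (health, status, index, uuid, pri, rep, docs.count, ...). The helper name `doc_counts` is hypothetical, and reading "messed up" as "temp has more documents than final" is my assumption. It only detects the symptom and does not repair anything.

```python
def doc_counts(cat_indices_output):
    """Map index name to docs.count from default _cat/indices output (hypothetical helper)."""
    counts = {}
    for line in cat_indices_output.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or truncated lines
        # fields[2] is the index name, fields[6] is docs.count
        counts[fields[2]] = int(fields[6])
    return counts


# Lines copied from the listing above, taken after the indexing finished
after_indexing = """\
yellow open openrxv-items-temp       H-CGsyyLTaqAj6-nKXZ-7w 1 1 104165 105024 487.7mb 487.7mb
yellow open openrxv-items-final      d0tbMM_SRWimirxr_gm9YA 1 1    937      0   2.2mb   2.2mb
"""

counts = doc_counts(after_indexing)
# Assumption: a healthy state has the harvested documents in -final, not -temp
looks_swapped = counts["openrxv-items-temp"] > counts["openrxv-items-final"]
print(counts, looks_swapped)
```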
+ + + + + +
+ + + +
+ + + + +
+
+ + + + + + + + + diff --git a/docs/404.html b/docs/404.html index 5af201eaa..36348cfc4 100644 --- a/docs/404.html +++ b/docs/404.html @@ -95,6 +95,8 @@
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -103,8 +105,6 @@
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/categories/index.html b/docs/categories/index.html index d2fd04638..18d6d3486 100644 --- a/docs/categories/index.html +++ b/docs/categories/index.html @@ -10,7 +10,7 @@ - + @@ -84,7 +84,7 @@

Notes

- +
Read more → @@ -108,6 +108,8 @@
    +
  1. May, 2021
  2. +
  3. April, 2021
  4. March, 2021
  5. @@ -116,8 +118,6 @@
  6. February, 2021
  7. -
  8. January, 2021
  9. -
diff --git a/docs/categories/index.xml b/docs/categories/index.xml index b4df36b76..112e88429 100644 --- a/docs/categories/index.xml +++ b/docs/categories/index.xml @@ -6,11 +6,11 @@ Recent content in Categories on CGSpace Notes Hugo -- gohugo.io en-us - Thu, 01 Apr 2021 09:50:54 +0300 + Sun, 02 May 2021 09:50:54 +0300 Notes https://alanorth.github.io/cgspace-notes/categories/notes/ - Thu, 01 Apr 2021 09:50:54 +0300 + Sun, 02 May 2021 09:50:54 +0300 https://alanorth.github.io/cgspace-notes/categories/notes/ diff --git a/docs/categories/notes/index.html b/docs/categories/notes/index.html index ecf83694e..c61be4db0 100644 --- a/docs/categories/notes/index.html +++ b/docs/categories/notes/index.html @@ -10,7 +10,7 @@ - + @@ -81,6 +81,33 @@ +
+
+

May, 2021

+ +
+

2021-05-01

+
    +
  • I looked at the top user agents and IPs in the Solr statistics for last month and I see these user agents: +
      +
    • “RI/1.0”, 1337
    • +
    • “Microsoft Office Word 2014”, 941
    • +
    +
  • +
  • I will add the RI/1.0 pattern to our DSpace agents overload and purge them from Solr (we had previously seen this agent with 9,000 hits or so in 2020-09), but I think I will leave the Microsoft Word one… as that’s an actual user…
  • +
+ Read more → +
+ + + + + +

April, 2021

@@ -334,39 +361,6 @@ - -
-
-

September, 2020

- -
-

2020-09-02

-
    -
  • Replace Marissa van Epp for Rhys Bucknall in the CCAFS groups on CGSpace because Marissa no longer works at CCAFS
  • -
  • The AReS Explorer hasn’t updated its index since 2020-08-22 when I last forced it -
      -
    • I restarted it again now and told Moayad that the automatic indexing isn’t working
    • -
    -
  • -
  • Add Alliance of Bioversity International and CIAT to affiliations on CGSpace
  • -
  • Abenet told me that the general search text on AReS doesn’t get reset when you use the “Reset Filters” button - -
  • -
  • I filed an issue on OpenRXV to make some minor edits to the admin UI: https://github.com/ilri/OpenRXV/issues/40
  • -
- Read more → -
- - - - -