Add notes for 2021-06-22

2021-06-22 15:22:15 +03:00
parent b787c427ab
commit b3577743e0
25 changed files with 113 additions and 30 deletions

@ -194,5 +194,43 @@ $ curl -s -H "Accept: application/json" "https://demo.dspace.org/rest/items?offs
- I tested with filter "farmer managed irrigation systems" on DSpace Test
- Before the patch I got 293 results, and the few I checked didn't have the expected metadata value
- After the patch I got 162 results, and all the items I checked had the exact metadata value I was expecting
- I tested a fresh harvest from my local AReS on DSpace Test with the DS-4065 REST API patch and here are my results:
- 90459 in final from last harvesting
- 90307 in temp after new harvest
- 90327 in temp after start plugins
- The 90327 number seems closer to the "real" number of items on CGSpace...
- Seems close, but not entirely correct yet, because ten of the handles in the export are duplicates:
```console
$ grep -oE '"handle":"[[:digit:]]+/[[:digit:]]+"' openrxv-items_data-local-ds-4065.json | wc -l
90327
$ grep -oE '"handle":"[[:digit:]]+/[[:digit:]]+"' openrxv-items_data-local-ds-4065.json | sort -u | wc -l
90317
```
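To see which handles are behind the difference between the two counts, the same `grep` can be fed through `uniq -d`. This is only a sketch: the `duplicate_handles` function name is made up for illustration, and it assumes the export keeps the same `"handle":"prefix/suffix"` shape that the commands above match.

```shell
# Print each handle that appears more than once in a harvest export.
# $1 is the path to the JSON export (hypothetical helper, not part of AReS).
duplicate_handles() {
  grep -oE '"handle":"[[:digit:]]+/[[:digit:]]+"' "$1" \
    | sort \
    | uniq -d   # -d prints one copy of each line that is repeated
}
```

Running `duplicate_handles openrxv-items_data-local-ds-4065.json` should then list each repeated handle once. `uniq -d` does not say how many times each one repeats, so the number of lines printed is not necessarily the same as the difference between the two totals.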
## 2021-06-22
- Make a [pull request](https://github.com/atmire/COUNTER-Robots/pull/43) to the COUNTER-Robots project to add two new user agents: crusty and newspaper
- These two bots have made ~3,000 requests on CGSpace
- Then I added them to our local bot override in CGSpace (until the above pull request is merged) and ran my bot checking script:
```console
$ ./ilri/check-spider-hits.sh -f dspace/config/spiders/agents/ilri -p
Purging 1339 hits from RI\/1\.0 in statistics
Purging 447 hits from crusty in statistics
Purging 3736 hits from newspaper in statistics
Total number of bot hits purged: 5522
```
- Surprised to see RI/1.0 in there because it's been in the override file for a while
- Looking at the 2021 statistics in Solr I see a few more suspicious user agents:
- `PostmanRuntime/7.26.8`
- `node-fetch/1.0 (+https://github.com/bitinn/node-fetch)`
- `Photon/1.0`
- `StatusCake_Pagespeed_indev`
- `node-superagent/3.8.3`
- `cortex/1.0`
- These bots account for ~42,000 hits in our statistics... I will just purge them and add them to our local override, but I can't be bothered to submit them to COUNTER-Robots since I'd have to look up the information for each one
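Before purging, it can help to check how many hits each of these agents has for the year. The following is a rough sketch, not the query I actually ran: the Solr URL, the core name, and the helper function names are assumptions, and it assumes the statistics core has `userAgent` and `time` fields that can be queried this way.

```shell
# ASSUMPTION: Solr location and core name are illustrative, adjust for the real deployment.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr/statistics}"

# Build a phrase query for one user agent (hypothetical helper).
agent_query() {
  printf 'userAgent:"%s"' "$1"
}

# Ask Solr for the number of 2021 hits for one agent without returning documents.
# rows=0 keeps the response small; the count is in response.numFound.
count_agent_hits() {
  curl -s -G "$SOLR_URL/select" \
    --data-urlencode "q=$(agent_query "$1")" \
    --data-urlencode 'fq=time:[2021-01-01T00:00:00Z TO 2021-12-31T23:59:59Z]' \
    --data-urlencode 'rows=0' \
    --data-urlencode 'wt=json'
}
```

For example, `count_agent_hits 'Photon/1.0'` would return a JSON response whose `numFound` is the hit count for that agent, assuming the field is matched as a phrase. Agents that contain a double quote would need extra escaping that this sketch does not handle.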
<!-- vim: set sw=2 ts=2: -->