Add notes for 2021-05-10

2021-05-10 17:16:32 +03:00
parent 51c6db6ebd
commit bf80328223
23 changed files with 102 additions and 29 deletions

@ -145,4 +145,43 @@ $ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items-
$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
```
## 2021-05-10
- Amazing, the harvesting on AReS finished but it messed up all the indexes: `openrxv-items-temp` and `openrxv-items-final` are now both empty, and only `openrxv-items-temp-backup` still has any items!
```console
$ curl -s http://localhost:9200/_cat/indices | grep openrxv-items
yellow open openrxv-items-temp 8thRX0WVRUeAzmd2hkG6TA 1 1 0 0 283b 283b
yellow open openrxv-items-temp-backup _0tyvctBTg2pjOlcoVP1LA 1 1 104165 20134 305.5mb 305.5mb
yellow open openrxv-items-final BtvV9kwVQ3yBYCZvJS1QyQ 1 1 0 0 283b 283b
```
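- A minimal sketch for spotting the empty indexes without eyeballing the table, assuming the default `_cat/indices` column order (index name in column 3, `docs.count` in column 7); the `empty_indexes` name is made up for illustration:

```shell
# Print the name of every openrxv-items* index whose docs.count is zero.
# Reads `_cat/indices` output on stdin; assumes the default column order:
# health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
empty_indexes() {
  awk '$3 ~ /^openrxv-items/ && $7 == 0 { print $3 }'
}

# Usage against a live cluster (assumes Elasticsearch on localhost:9200):
# curl -s http://localhost:9200/_cat/indices | empty_indexes
```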
- I fixed the indexes manually by re-creating them and cloning from the backup:
```console
$ curl -XDELETE 'http://localhost:9200/openrxv-items-final'
$ curl -X PUT "localhost:9200/openrxv-items-temp-backup/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
$ curl -s -X POST http://localhost:9200/openrxv-items-temp-backup/_clone/openrxv-items-final
$ curl -s -X POST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d'{"actions" : [{"add" : { "index" : "openrxv-items-final", "alias" : "openrxv-items"}}]}'
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp-backup'
```
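- Before deleting the backup in the last step it would probably be safer to confirm that the clone holds the same number of documents. A rough sketch, assuming the `_count` API returns its usual `{"count":N,...}` body; `doc_count` is a hypothetical helper, not part of any tool:

```shell
# Extract the integer "count" field from an Elasticsearch _count response on stdin.
doc_count() {
  sed -n 's/.*"count":[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Usage (assumes Elasticsearch on localhost:9200):
# backup=$(curl -s http://localhost:9200/openrxv-items-temp-backup/_count | doc_count)
# final=$(curl -s http://localhost:9200/openrxv-items-final/_count | doc_count)
# [ "$backup" = "$final" ] && echo "counts match, safe to delete the backup"
```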
- Also I ran all updates on the server and updated all Docker images, then rebooted the server (linode20):
```console
$ docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | xargs -L1 docker pull
```
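- The same image list could also be built with awk, which skips dangling `<none>` images and keeps the tag when the repository name itself contains a colon (for example a registry with a port). A sketch, assuming the default `docker images` columns; `image_refs` is a name made up here:

```shell
# Turn `docker images` output on stdin into repo:tag references.
# Skips the header row and any image whose repository or tag is <none>.
image_refs() {
  awk 'NR > 1 && $1 != "<none>" && $2 != "<none>" { print $1 ":" $2 }'
}

# Usage:
# docker images | image_refs | xargs -L1 docker pull
```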
- I backed up the AReS Elasticsearch data using elasticdump, then started a new harvest:
```console
$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_mapping.json --type=mapping
$ elasticdump --input=http://localhost:9200/openrxv-items --output=/home/aorth/openrxv-items_data.json --type=data --limit=1000
```
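- A quick sanity check on the data dump is to count its lines, assuming elasticdump wrote one JSON document per line; `dump_doc_count` is a hypothetical helper:

```shell
# Count the documents in an elasticdump data file.
# Assumes one JSON document per line; prints the number of lines in the file.
dump_doc_count() {
  awk 'END { print NR }' "$1"
}

# Usage:
# dump_doc_count /home/aorth/openrxv-items_data.json
```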
- Discuss CGSpace statistics with the CIP team
  - They were wondering why their numbers for 2020 were so low
  - I checked their community using the DSpace Statistics API and found very accurate numbers for 2020 and 2019 for them
  - I think they had been using AReS, which actually doesn't even give stats for a time period...
<!-- vim: set sw=2 ts=2: -->