Add notes for 2020-10-30

2020-10-30 15:24:23 +02:00
parent 272cc0cf58
commit 1e04a7da72
22 changed files with 97 additions and 28 deletions

@@ -935,4 +935,36 @@ $ dspace community-filiator --set --parent 10568/83389 --child 10568/1208
$ dspace community-filiator --set --parent 10568/83389 --child 10568/56924
```
## 2020-10-30
- The `AtomicStatisticsUpdateCLI` process finished on the current DSpace Test statistics core after about 32 hours
- I started it on the statistics-2019 core
- Atmire responded about the duplicate values in Solr that I had asked about a few days ago
- They said it could be due to the schema and asked if I see it only on old records or even on new ones created in the new CUA with DSpace 6
- I did a test and found that I got duplicate data after browsing for a minute on DSpace Test (version 6) and sent them a screenshot
- Looking over Peter's corrections to journal titles (dc.source) and publishers (dc.publisher)
- I had to check the corrections for strange Unicode errors and replacements with "|" and ";" in OpenRefine using this GREL:
```
or(
isNotNull(value.match(/.*\uFFFD.*/)),
isNotNull(value.match(/.*\u00A0.*/)),
isNotNull(value.match(/.*\u200A.*/)),
isNotNull(value.match(/.*\u2019.*/)),
isNotNull(value.match(/.*\u00b4.*/)),
isNotNull(value.match(/.*\u007e.*/))
).toString()
```
- Then I did a test to apply the corrections and deletions on my local DSpace:
```
$ ./fix-metadata-values.py -i 2020-10-30-fix-854-journals.csv -db dspace -u dspace -p 'fuuu' -f dc.source -t 'correct' -m 55
$ ./delete-metadata-values.py -i 2020-10-30-delete-90-journals.csv -db dspace -u dspace -p 'fuuu' -f dc.source -m 55
$ ./fix-metadata-values.py -i 2020-10-30-fix-386-publishers.csv -db dspace -u dspace -p 'fuuu' -f dc.publisher -t correct -m 39
$ ./delete-metadata-values.py -i 2020-10-30-delete-10-publishers.csv -db dspace -u dspace -p 'fuuu' -f dc.publisher -m 39
```
- I will wait to apply them on CGSpace when I have all the other corrections from Peter processed
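
To run the same character check outside OpenRefine, a rough Python sketch of the GREL test above could look like this. The function names `suspicious_chars` and `is_suspicious` are hypothetical and not part of the scripts used in these notes. The set of code points is copied from the GREL expression, so it does not look for "|" or ";" either.

```python
# Code points flagged by the GREL expression above
SUSPICIOUS = {
    "\uFFFD",  # REPLACEMENT CHARACTER
    "\u00A0",  # NO-BREAK SPACE
    "\u200A",  # HAIR SPACE
    "\u2019",  # RIGHT SINGLE QUOTATION MARK
    "\u00B4",  # ACUTE ACCENT
    "\u007E",  # TILDE
}


def suspicious_chars(value):
    """Return sorted labels such as 'U+00A0' for each flagged character in value."""
    return sorted({f"U+{ord(ch):04X}" for ch in value if ch in SUSPICIOUS})


def is_suspicious(value):
    """Return True if value contains any flagged character (hypothetical helper)."""
    return bool(suspicious_chars(value))
```

For example, `is_suspicious("Farmer\u2019s Weekly")` should be true because of the right single quotation mark, while a plain ASCII title such as `Nature` should pass.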
<!-- vim: set sw=2 ts=2: -->