Add notes for 2020-08-14

2020-08-14 11:22:16 +03:00
parent eafe422984
commit 3252567208
20 changed files with 127 additions and 25 deletions

@@ -398,4 +398,56 @@ dspace=# SELECT count(text_value) FROM metadatavalue WHERE metadata_field_id = 2
- I purged 150,000 hits from 2020 and 2020 from these user agents and hosts
## 2020-08-14
- Last night I started the processing of the statistics-2016 core with the Atmire stats util and I see some errors like this:
```
Record uid: f6b288d7-d60d-4df9-b311-1696b88552a0 couldn't be processed
com.atmire.statistics.util.update.atomic.ProcessingException: something went wrong while processing record uid: f6b288d7-d60d-4df9-b311-1696b88552a0, an error occured in the com.atmire.statistics.util.update.atomic.processor.ContainerOwnerDBProcessor
at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.applyProcessors(SourceFile:304)
at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.processRecords(SourceFile:176)
at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.performRun(SourceFile:161)
at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.update(SourceFile:128)
at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI.main(SourceFile:78)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
Caused by: java.lang.NullPointerException
```
- I see it has `id: 980-unmigrated` and `type: 0`...
- The 2016 core has 629,983 unmigrated docs, mostly:
  - `type: 5`: 620311
  - `type: 0`: 7255
  - `type: 3`: 1333
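- A per-type breakdown like the one above can be produced with Solr's field faceting. This is a minimal sketch, assuming the Solr instance at `localhost:8081` used elsewhere in these notes; the helper name `count_by_type` and the `CURL` override (useful for a dry run) are hypothetical and not part of any tool mentioned here:

```shell
# Count documents per `type` for a Solr query without fetching any rows.
# CURL may be overridden (for example with a function that prints its
# arguments) to inspect the request instead of sending it.
count_by_type() {
  local core="$1" query="$2"
  "${CURL:-curl}" -s -G "http://localhost:8081/solr/${core}/select" \
    --data-urlencode "q=${query}" \
    --data-urlencode 'rows=0' \
    --data-urlencode 'facet=true' \
    --data-urlencode 'facet.field=type' \
    --data-urlencode 'wt=json'
}
```

- Called as `count_by_type statistics-2016 'id:/.*unmigrated.*/'`, the counts should appear under `facet_counts.facet_fields.type` in the JSON response.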
- I purged the unmigrated docs and continued processing:
```
$ curl -s "http://localhost:8081/solr/statistics-2016/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>id:/.*unmigrated.*/</query></delete>'
$ export JAVA_OPTS='-Dfile.encoding=UTF-8 -Xmx2048m'
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2016
```
- Then I see there are 849,000 docs with `id: -1` and `type: 5`, so I should probably purge those too:
```
$ curl -s "http://localhost:8081/solr/statistics-2017/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>id:\-1</query></delete>'
```
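- The delete bodies above are written inline, which works because these queries contain no XML special characters. A query containing `&`, `<` or `>` would need escaping first. A minimal sketch of a helper for that, where `delete_xml` is a hypothetical name:

```shell
# Print a Solr delete-by-query XML body for the query given as $1.
# XML special characters are escaped; `&` is handled first so that the
# entities produced for `<` and `>` are not escaped a second time.
delete_xml() {
  local escaped
  escaped="$(printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g')"
  printf '<delete><query>%s</query></delete>\n' "$escaped"
}
```

- The output can be passed to the same `curl` call via `--data-binary "$(delete_xml 'id:\-1')"`.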
- Altmetric asked for a dump of CGSpace's OAI "sets" so they can update their affiliation mappings
- I did it in a quick and dirty way:
```
$ http 'https://cgspace.cgiar.org/oai/request?verb=ListSets' > /tmp/0.xml
$ for num in {100..1300..100}; do http "https://cgspace.cgiar.org/oai/request?verb=ListSets&resumptionToken=////$num" > /tmp/$num.xml; sleep 2; done
$ for num in {0..1300..100}; do cat /tmp/$num.xml >> /tmp/cgspace-oai-sets.xml; done
```
- This produces one file that has all the sets, albeit with 14 pages of responses concatenated into one document, but that's how theirs was in the first place...
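- Because the concatenated file is not a single well-formed XML document, an XML parser will likely reject it, but plain text matching is enough for a sanity check. A minimal sketch, assuming the responses use unprefixed `<setSpec>` elements as in standard OAI-PMH output; `list_setspecs` is a hypothetical helper name:

```shell
# Print every setSpec value found in a file of concatenated ListSets responses.
# Text matching is used on purpose: the file holds several XML documents
# back to back, so it cannot be parsed as one document.
list_setspecs() {
  grep -o '<setSpec>[^<]*</setSpec>' "$1" | sed -e 's/<[^>]*>//g'
}
```

- For example, `list_setspecs /tmp/cgspace-oai-sets.xml | sort -u | wc -l` counts the distinct sets that were collected.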
- Help Bizu with a restricted item for CIAT
<!-- vim: set sw=2 ts=2: -->