Update notes for 2020-05-31

This commit is contained in:
2020-05-31 20:49:41 +03:00
parent b96ed23f9f
commit 9da6ffb4d2
26 changed files with 119 additions and 36 deletions

View File

@ -216,10 +216,53 @@ $ ant update
- Database migrations take 10:18.287s during the first startup...
- perhaps when we do the production CGSpace migration I can do this in advance and tell users not to make any submissions?
- I had a mistake in my Solr internal URL parameter so DSpace couldn't find it, but once I fixed that DSpace starts up OK!
- Once the initial Discovery reindexing is completed I started the Solr statistics UUID migration:
- Once the initial Discovery reindexing was completed (after three hours or so!) I started the Solr statistics UUID migration:
```
$ export JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8"
$ dspace solr-upgrade-statistics-6x -i statistics -n 250000
$ dspace solr-upgrade-statistics-6x -i statistics -n 1000000
$ dspace solr-upgrade-statistics-6x -i statistics -n 1000000
...
```
- It's taking about 35 minutes for 1,000,000 records...
- Some issues towards the end of this core:
```
Exception: Error while creating field 'p_group_id{type=uuid,properties=indexed,stored,multiValued}' from value '10'
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error while creating field 'p_group_id{type=uuid,properties=indexed,stored,multiValued}' from value '10'
at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:552)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
at org.dspace.util.SolrUpgradePre6xStatistics.batchUpdateStats(SolrUpgradePre6xStatistics.java:161)
at org.dspace.util.SolrUpgradePre6xStatistics.run(SolrUpgradePre6xStatistics.java:456)
at org.dspace.util.SolrUpgradePre6xStatistics.main(SolrUpgradePre6xStatistics.java:365)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
```
- So basically there are some documents that have IDs that have *not* been converted to UUID, and have *not* been labeled as "unmigrated" either...
- Of these 101,257 documents, 90,000 are of type 5 (search), 9,000 are type storage, and 800 are type view, but it's weird because if I look at their type/statistics_type using a facet the storage ones disappear...
- For now I will export these documents from the statistics core and then delete them:
```
$ ./run.sh -s http://localhost:8081/solr/statistics -a export -o statistics-unmigrated.json -k uid -f '(*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)'
$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>(*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)</query></delete>"
```
- Now the UUID conversion script says there is nothing left to convert, so I can try to run the Atmire CUA conversion utility:
```
$ export JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8"
$ dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 1
```
- Experiment a bit with the Python [country-converter](https://pypi.org/project/country-converter/) library as it can convert between different formats (like ISO 3166 and UN m49)