diff --git a/content/posts/2020-11.md b/content/posts/2020-11.md
index 18819ecac..46fb9a1b9 100644
--- a/content/posts/2020-11.md
+++ b/content/posts/2020-11.md
@@ -352,4 +352,49 @@ $ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid =
2071
```
+## 2020-11-18
+
+- I decided to enable the `rollbackOnReturn=true` option in [Tomcat's JDBC connection pool parameters](https://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html) because I noticed that all of the "idle in transaction" connections waiting for locks were SELECT queries
+ - There are many posts on the Internet about people having this issue with Hibernate
+ - There are fewer locks now, but Peter and Abenet are still having issues approving items, and Tezira forwarded one strange case where an item was "approved" and assigned a handle, but it doesn't exist...
+ - I sent another mail to the dspace-tech mailing list to ask for help
+ - I reverted the `rollbackOnReturn` change in Tomcat...
+ - I sent a message to Atmire to ask for urgent help
+- Call with IWMI and Abenet about them potentially moving from InMagic to CGSpace
+ - They have questions about the reporting on AReS
+ - We told them that we can use collections to infer Strategic Priorities and Research Groups and WLE Flagships
+ - It sounds like we will create this structure under the top-level IWMI community:
+ - IWMI Strategic Priorities (sub-community)
+ - Water, Food and Ecosystems (sub-community)
+ - Sustainable and Resilient Food Production Systems (collection)
+ - Sustainable Water Infrastructure and Ecosystems (collection)
+ - Integrated Basin and Aquifer Management (collection)
+ - Water, Climate Change and Resilience (sub-community)
+ - Climate Change Adaptation and Resilience (collection)
+ - etc...
+ - They will submit items to their normal output type collections and map to these
+- In other news I finally finished processing the Solr statistics for UUIDs and re-indexed the stats with the dspace-statistics-api
+ - I started the Atmire stats processing, notes in the dedicated [CGSpace DSpace 6 Upgrade section]({{< relref "cgspace-dspace6-upgrade.md" >}})
+- Peter got a strange message this evening when trying to update metadata:
+
+```
+2020-11-18 16:57:33,309 ERROR org.hibernate.engine.jdbc.batch.internal.BatchingBatch @ HHH000315: Exception executing batch [Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1]
+2020-11-18 16:57:33,316 ERROR org.hibernate.engine.jdbc.batch.internal.BatchingBatch @ HHH000315: Exception executing batch [Batch update returned unexpected row count from update [13]; actual row count: 0; expected: 1]
+2020-11-18 16:57:33,385 INFO org.hibernate.engine.jdbc.batch.internal.AbstractBatchImpl @ HHH000010: On release of batch it still contained JDBC statements
+```
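
To see which sessions are stuck in the "idle in transaction" state mentioned earlier in this entry, `pg_stat_activity` can be queried directly. This is a minimal sketch, assuming PostgreSQL 9.2 or newer (where the `state` column exists) and a `psql` that connects without extra arguments; the function name `idle_in_transaction_sql` is made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a query listing "idle in transaction" sessions,
# oldest first. Only standard pg_stat_activity columns are used.
idle_in_transaction_sql() {
    cat <<'SQL'
SELECT pid, state, query_start, left(query, 80) AS query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY query_start;
SQL
}

# Usage (needs a reachable database, so it is left commented out):
# psql -c "$(idle_in_transaction_sql)"
```

Sorting by `query_start` puts the longest-idle sessions first, which is usually where to start looking for whatever is holding the locks.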
+
+- Minor bug fixes to limit parameter in DSpace Statistics API
+ - Release [version 1.3.2](https://github.com/ilri/dspace-statistics-api/releases/tag/v1.3.2)
+- Send a list of potential ToRs for a next phase of OpenRXV development to Michael Victor for feedback:
+ - Enable advanced reporting templates using "Angular expressions" in Docxtemplater (would be used immediately for IWMI and Bioversity–CIAT)
+ - Enable embedding of charts like world map and word cloud in reports
+ - Enable embedding of item thumbnails in reports, similar to the "list of information products"
+ - Enable something like the "Statistics" Excel report Peter wanted in 2019 so we can get community and collection statistics reports
+ - Add a new "metrics" block with statistics about top authors and items by number of views and downloads for the current search terms
+ - Add ability to change the explorer UI to "Usage Statistics" mode where lists of authors, affiliations, sponsors, CRPs, communities, collections, etc are sorted according to the number of views or downloads for the current search results, rather than by number of occurrences of metadata values
+ - Add ability to "drill down" or modify search filter terms by clicking on countries in the map
+ - Enable date-based usage statistics (currently only "all time" statistics are available)
+ - Fix minor bugs for all issues filed on GitHub
+- I also added GitHub issues for each of them
+
diff --git a/content/posts/cgspace-dspace6-upgrade.md b/content/posts/cgspace-dspace6-upgrade.md
index 01072400f..3ab523a37 100644
--- a/content/posts/cgspace-dspace6-upgrade.md
+++ b/content/posts/cgspace-dspace6-upgrade.md
@@ -11,7 +11,8 @@ Notes about the DSpace 6 upgrade on CGSpace in 2020-11.
-- [Processing Solr Statistics With solr-upgrade-statistics-6x](#processing-solr-statistics-with-solr-upgrade-statistics-6x)
+- [Re-import OAI with clean index](#re-import-oai-with-clean-index)
+- [Processing Solr statistics with solr-upgrade-statistics-6x](#processing-solr-statistics-with-solr-upgrade-statistics-6x)
- [Current year's statistics core](#statistics)
- [statistics-2019 core](#statistics-2019)
- [statistics-2018 core](#statistics-2018)
@@ -20,11 +21,28 @@ Notes about the DSpace 6 upgrade on CGSpace in 2020-11.
- [statistics-2015 core](#statistics-2015)
- [statistics-2014 core](#statistics-2014)
- [statistics-2013 core](#statistics-2013)
+ - [statistics-2012 core](#statistics-2012)
+ - [statistics-2011 core](#statistics-2011)
+ - [statistics-2010 core](#statistics-2010)
+- [Processing Solr statistics with AtomicStatisticsUpdateCLI](#processing-solr-statistics-with-atomicstatisticsupdatecli)
-## Processing Solr Statistics With solr-upgrade-statistics-6x
+
+### Re-import OAI with clean index
+
+After the upgrade is complete, re-index all items into OAI with a clean index:
+
+```console
+$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx2048m"
+$ dspace oai -c import
+```
+
+The process ran out of memory several times so I had to keep trying again with more JVM heap memory.
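
The manual retrying described above could be scripted roughly as follows. This is only a sketch: the heap sizes are arbitrary, the function names are hypothetical, and it assumes that `dspace oai -c import` exits with a non-zero status when it runs out of memory, which I have not verified.

```shell
#!/usr/bin/env bash
# Sketch only: heap sizes and function names are assumptions, as is the idea
# that `dspace oai -c import` exits non-zero on an out-of-memory failure.

# Print JAVA_OPTS for a given heap size in megabytes.
java_opts_for_heap() {
    printf -- '-Dfile.encoding=UTF-8 -Xmx%sm' "$1"
}

# Try the clean OAI import with each heap size in turn, stopping at the
# first attempt that succeeds.
import_oai_with_retries() {
    local heap
    for heap in "$@"; do
        echo "Trying OAI import with ${heap}m heap" >&2
        if JAVA_OPTS="$(java_opts_for_heap "$heap")" dspace oai -c import; then
            return 0
        fi
    done
    return 1
}

# Usage: import_oai_with_retries 2048 3072 4096
```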
+
+
+### Processing Solr Statistics With solr-upgrade-statistics-6x
After the main upgrade process was finished and DSpace was running I started processing the Solr statistics with `solr-upgrade-statistics-6x` to migrate all IDs to UUIDs.
-### statistics
+## statistics
First process the current year's statistics core:
```console
@@ -57,7 +75,7 @@ After several rounds of processing it finished. Here are some statistics about u
$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "
Notes about the DSpace 6 upgrade on CGSpace in 2020-11.
After the upgrade is complete, re-index all items into OAI with a clean index:
+$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx2048m"
+$ dspace oai -c import
+
The process ran out of memory several times so I had to keep trying again with more JVM heap memory.
+After the main upgrade process was finished and DSpace was running I started processing the Solr statistics with `solr-upgrade-statistics-6x` to migrate all IDs to UUIDs.
First process the current year’s statistics core:
$ export JAVA_OPTS='-Dfile.encoding=UTF-8 -Xmx2048m'
$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics
@@ -147,7 +157,7 @@ $ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics
Majority are `type: 5` (aka SITE, according to `Constants.java`) so we can purge them:
$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
-
statistics-2019
+
Processing the statistics-2019 core:
$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2019
...
@@ -172,7 +182,7 @@ $ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics
4,172,929 are `type: 5` (aka SITE) so we can purge them:
$ curl -s "http://localhost:8081/solr/statistics-2019/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
-
statistics-2018
+
Processing the statistics-2018 core:
$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2018
...
@@ -225,7 +235,7 @@ $ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2018
1,660,524 are `type: 5` (SITE) so we can purge them:
$ curl -s "http://localhost:8081/solr/statistics-2017/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
-
statistics-2016
+
Processing the statistics-2016 core:
$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2016
...
@@ -249,7 +259,7 @@ $ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2018
1,469,706 are `type: 5` (SITE) so we can purge them:
$ curl -s "http://localhost:8081/solr/statistics-2016/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
-
statistics-2015
+
Processing the statistics-2015 core:
$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2015
...
@@ -326,6 +336,75 @@ $ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2018
15,691 are `type: 5` (SITE) so we can purge them:
$ curl -s "http://localhost:8081/solr/statistics-2013/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
+
statistics-2012
+Processing the statistics-2012 core:
+$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2012
+...
+=================================================================
+ *** Statistics Records with Legacy Id ***
+
+ 2,229,332 Item View
+ 913,577 Bistream View
+ 215,577 Collection View
+ 104,734 Community View
+ --------------------------------------
+ 3,463,220 TOTAL
+=================================================================
+
Summary of unmigrated docs after processing:
+
+- 0: `(*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)`
+- 33,161: `id:/.+-unmigrated/`
+- 33,161: `*:* NOT id:/.{36}/`
+- 33,161 are `type: 3` (COLLECTION), which is different than I’ve seen previously… but I suppose I still have to purge them because there will be errors in the Atmire modules otherwise:
+
+$ curl -s "http://localhost:8081/solr/statistics-2012/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
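
The same count-then-purge steps recur for every yearly core, so they could be wrapped in small helpers. This is a sketch under assumptions: `SOLR_BASE` and the function names are mine, and the count helper relies on Solr's standard `select` handler with `rows=0` to return only `numFound`. The delete payload is the one used in the commands above.

```shell
#!/usr/bin/env bash
# Sketch only: SOLR_BASE and all function names here are assumptions.
SOLR_BASE="${SOLR_BASE:-http://localhost:8081/solr}"

# Build the delete-by-query XML payload for a given Solr query.
delete_payload() {
    printf '<delete><query>%s</query></delete>' "$1"
}

# Ask one core how many documents match a query; rows=0 skips the documents
# themselves, so the JSON response carries just the numFound count.
count_docs() {
    local core="$1" query="$2"
    curl -s -G "$SOLR_BASE/$core/select" \
        --data-urlencode "q=$query" \
        --data-urlencode 'rows=0' \
        --data-urlencode 'wt=json'
}

# Purge every record in one core whose id is not a 36-character UUID.
purge_unmigrated() {
    local core="$1"
    curl -s "$SOLR_BASE/$core/update?softCommit=true" \
        -H 'Content-Type: text/xml' \
        --data-binary "$(delete_payload '*:* NOT id:/.{36}/')"
}

# Usage (talks to a live Solr, so it is left commented out):
# count_docs statistics-2012 'id:/.+-unmigrated/'
# purge_unmigrated statistics-2012
```

Running `count_docs` before and after `purge_unmigrated` gives a quick confirmation that the unmigrated records are gone.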
+
statistics-2011
+Processing the statistics-2011 core:
+$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2011
+...
+=================================================================
+ *** Statistics Records with Legacy Id ***
+
+ 904,896 Item View
+ 385,789 Bistream View
+ 154,356 Collection View
+ 62,978 Community View
+ --------------------------------------
+ 1,508,019 TOTAL
+=================================================================
+
Summary of unmigrated docs after processing:
+
+- 0: `(*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)`
+- 17,551: `id:/.+-unmigrated/`
+- 17,551: `*:* NOT id:/.{36}/`
+- 12,116 are `type: 3` (COLLECTION), which is different than I’ve seen previously… but I suppose I still have to purge them because there will be errors in the Atmire modules otherwise:
+
+$ curl -s "http://localhost:8081/solr/statistics-2011/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
+
statistics-2010
+Processing the statistics-2010 core:
+$ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2010
+...
+=================================================================
+ *** Statistics Records with Legacy Id ***
+
+ 26,067 Item View
+ 15,615 Bistream View
+ 4,116 Collection View
+ 1,094 Community View
+ --------------------------------------
+ 46,892 TOTAL
+=================================================================
+
Summary of unmigrated docs after processing:
+
+- 0: `(*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)`
+- 1,012: `id:/.+-unmigrated/`
+- 1,012: `*:* NOT id:/.{36}/`
+- 654 are `type: 3` (COLLECTION), which is different than I’ve seen previously… but I suppose I still have to purge them because there will be errors in the Atmire modules otherwise:
+
+$ curl -s "http://localhost:8081/solr/statistics-2010/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>*:* NOT id:/.{36}/</query></delete>"
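
As a sanity check on the "Statistics Records with Legacy Id" summaries above, the per-type counts should add up to the reported TOTAL. This is a small sketch with a hypothetical `sum_counts` helper that strips the thousands separators and adds the values; the numbers are copied from the tool output for the three cores.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum comma-formatted integers such as 2,229,332.
sum_counts() {
    local total=0 n
    for n in "$@"; do
        # Remove thousands separators before doing arithmetic.
        total=$(( total + ${n//,/} ))
    done
    echo "$total"
}

sum_counts 2,229,332 913,577 215,577 104,734   # statistics-2012 → 3463220
sum_counts 904,896 385,789 154,356 62,978      # statistics-2011 → 1508019
sum_counts 26,067 15,615 4,116 1,094           # statistics-2010 → 46892
```

All three sums match the TOTAL lines printed by `solr-upgrade-statistics-6x`.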
+
Processing Solr statistics with AtomicStatisticsUpdateCLI
+On 2020-11-18 I finished processing the Solr statistics with solr-upgrade-statistics-6x and I started processing them with AtomicStatisticsUpdateCLI:
+$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics
diff --git a/docs/index.html b/docs/index.html
index 6407a5767..150898746 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -9,7 +9,7 @@
-
+
diff --git a/docs/page/2/index.html b/docs/page/2/index.html
index 1ace90bc4..03f14e1a3 100644
--- a/docs/page/2/index.html
+++ b/docs/page/2/index.html
@@ -9,7 +9,7 @@
-
+
diff --git a/docs/page/3/index.html b/docs/page/3/index.html
index c0f5b31f4..ee0d791cf 100644
--- a/docs/page/3/index.html
+++ b/docs/page/3/index.html
@@ -9,7 +9,7 @@
-
+
diff --git a/docs/page/4/index.html b/docs/page/4/index.html
index bb189f3c3..b528125dd 100644
--- a/docs/page/4/index.html
+++ b/docs/page/4/index.html
@@ -9,7 +9,7 @@
-
+
diff --git a/docs/page/5/index.html b/docs/page/5/index.html
index 340fe63d9..80a5f5b0f 100644
--- a/docs/page/5/index.html
+++ b/docs/page/5/index.html
@@ -9,7 +9,7 @@
-
+
diff --git a/docs/page/6/index.html b/docs/page/6/index.html
index 60aeabab1..3ef8b1db2 100644
--- a/docs/page/6/index.html
+++ b/docs/page/6/index.html
@@ -9,7 +9,7 @@
-
+
diff --git a/docs/page/7/index.html b/docs/page/7/index.html
index b189f9256..dfa564c75 100644
--- a/docs/page/7/index.html
+++ b/docs/page/7/index.html
@@ -9,7 +9,7 @@
-
+
diff --git a/docs/posts/index.html b/docs/posts/index.html
index 6bdb20575..3366834b6 100644
--- a/docs/posts/index.html
+++ b/docs/posts/index.html
@@ -9,7 +9,7 @@
-
+
diff --git a/docs/posts/page/2/index.html b/docs/posts/page/2/index.html
index d4ecf712c..d5ecca2a5 100644
--- a/docs/posts/page/2/index.html
+++ b/docs/posts/page/2/index.html
@@ -9,7 +9,7 @@
-
+
diff --git a/docs/posts/page/3/index.html b/docs/posts/page/3/index.html
index 01dcbc99e..b8dd603bd 100644
--- a/docs/posts/page/3/index.html
+++ b/docs/posts/page/3/index.html
@@ -9,7 +9,7 @@
-
+
diff --git a/docs/posts/page/4/index.html b/docs/posts/page/4/index.html
index 8d7500f5a..f16e57c99 100644
--- a/docs/posts/page/4/index.html
+++ b/docs/posts/page/4/index.html
@@ -9,7 +9,7 @@
-
+
diff --git a/docs/posts/page/5/index.html b/docs/posts/page/5/index.html
index 713e7c356..d8800163d 100644
--- a/docs/posts/page/5/index.html
+++ b/docs/posts/page/5/index.html
@@ -9,7 +9,7 @@
-
+
diff --git a/docs/posts/page/6/index.html b/docs/posts/page/6/index.html
index ee3d63e4d..7ccdf8bdc 100644
--- a/docs/posts/page/6/index.html
+++ b/docs/posts/page/6/index.html
@@ -9,7 +9,7 @@
-
+
diff --git a/docs/posts/page/7/index.html b/docs/posts/page/7/index.html
index 51e223d0e..bcaa3c23f 100644
--- a/docs/posts/page/7/index.html
+++ b/docs/posts/page/7/index.html
@@ -9,7 +9,7 @@
-
+
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index 724c318b7..57f08a1a3 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -4,42 +4,42 @@
https://alanorth.github.io/cgspace-notes/categories/
- 2020-11-16T10:53:45+02:00
+ 2020-11-18T17:15:23+02:00
https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/
- 2020-11-15T13:27:35+02:00
+ 2020-11-17T22:14:56+02:00
https://alanorth.github.io/cgspace-notes/
- 2020-11-16T10:53:45+02:00
+ 2020-11-18T17:15:23+02:00
https://alanorth.github.io/cgspace-notes/tags/migration/
- 2020-11-15T13:27:35+02:00
+ 2020-11-17T22:14:56+02:00
https://alanorth.github.io/cgspace-notes/categories/notes/
- 2020-11-16T10:53:45+02:00
+ 2020-11-18T17:15:23+02:00
https://alanorth.github.io/cgspace-notes/posts/
- 2020-11-16T10:53:45+02:00
+ 2020-11-18T17:15:23+02:00
https://alanorth.github.io/cgspace-notes/tags/
- 2020-11-15T13:27:35+02:00
+ 2020-11-17T22:14:56+02:00
https://alanorth.github.io/cgspace-notes/2020-11/
- 2020-11-16T10:53:45+02:00
+ 2020-11-17T22:14:56+02:00
@@ -209,7 +209,7 @@
https://alanorth.github.io/cgspace-notes/2018-02/
- 2019-10-28T13:39:25+02:00
+ 2020-11-18T17:15:23+02:00
diff --git a/docs/tags/index.html b/docs/tags/index.html
index 6e32df9c8..7071cc6da 100644
--- a/docs/tags/index.html
+++ b/docs/tags/index.html
@@ -9,7 +9,7 @@
-
+
diff --git a/docs/tags/migration/index.html b/docs/tags/migration/index.html
index b1453dd49..3623afc28 100644
--- a/docs/tags/migration/index.html
+++ b/docs/tags/migration/index.html
@@ -9,7 +9,7 @@
-
+