2020-11-17 21:14:56 +01:00
<!DOCTYPE html>
< html lang = "en" >
< head >
< meta charset = "utf-8" >
< meta name = "viewport" content = "width=device-width, initial-scale=1, shrink-to-fit=no" >
2020-12-06 15:53:29 +01:00
2020-11-17 21:14:56 +01:00
< meta property = "og:title" content = "CGSpace DSpace 6 Upgrade" / >
< meta property = "og:description" content = "Documenting the DSpace 6 upgrade." / >
< meta property = "og:type" content = "article" / >
< meta property = "og:url" content = "https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/" / >
< meta property = "article:published_time" content = "2020-11-15T13:27:35+02:00" / >
2020-12-06 15:53:29 +01:00
< meta property = "article:modified_time" content = "2020-12-01T19:15:48+02:00" / >
2020-11-17 21:14:56 +01:00
< meta name = "twitter:card" content = "summary" / >
< meta name = "twitter:title" content = "CGSpace DSpace 6 Upgrade" / >
< meta name = "twitter:description" content = "Documenting the DSpace 6 upgrade." / >
2021-12-28 12:24:23 +01:00
< meta name = "generator" content = "Hugo 0.91.2" / >
2020-11-17 21:14:56 +01:00
< script type = "application/ld+json" >
{
"@context": "http://schema.org",
"@type": "BlogPosting",
"headline": "CGSpace DSpace 6 Upgrade",
"url": "https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/",
2020-11-29 07:40:58 +01:00
"wordCount": "1570",
2020-11-17 21:14:56 +01:00
"datePublished": "2020-11-15T13:27:35+02:00",
2020-12-06 15:53:29 +01:00
"dateModified": "2020-12-01T19:15:48+02:00",
2020-11-17 21:14:56 +01:00
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"keywords": "Notes, Migration",
"description": "Documenting the DSpace 6 upgrade."
}
< / script >
< link rel = "canonical" href = "https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/" >
< title > CGSpace DSpace 6 Upgrade | CGSpace Notes< / title >
<!-- combined, minified CSS -->
2021-01-24 08:46:27 +01:00
< link href = "https://alanorth.github.io/cgspace-notes/css/style.beb8012edc08ba10be012f079d618dc243812267efe62e11f22fe49618f976a4.css" rel = "stylesheet" integrity = "sha256-vrgBLtwIuhC+AS8HnWGNwkOBImfv5i4R8i/klhj5dqQ=" crossorigin = "anonymous" >
2020-11-17 21:14:56 +01:00
<!-- minified Font Awesome for SVG icons -->
2021-09-28 09:32:32 +02:00
< script defer src = "https://alanorth.github.io/cgspace-notes/js/fontawesome.min.f5072c55a0721857184db93a50561d7dc13975b4de2e19db7f81eb5f3fa57270.js" integrity = "sha256-9QcsVaByGFcYTbk6UFYdfcE5dbTeLhnbf4HrXz+lcnA=" crossorigin = "anonymous" > < / script >
2020-11-17 21:14:56 +01:00
<!-- RSS 2.0 feed -->
< / head >
< body >
< div class = "blog-masthead" >
< div class = "container" >
< nav class = "nav blog-nav" >
< a class = "nav-link " href = "https://alanorth.github.io/cgspace-notes/" > Home< / a >
< / nav >
< / div >
< / div >
< header class = "blog-header" >
< div class = "container" >
< h1 class = "blog-title" dir = "auto" > < a href = "https://alanorth.github.io/cgspace-notes/" rel = "home" > CGSpace Notes< / a > < / h1 >
< p class = "lead blog-description" dir = "auto" > Documenting day-to-day work on the < a href = "https://cgspace.cgiar.org" > CGSpace< / a > repository.< / p >
< / div >
< / header >
< div class = "container" >
< div class = "row" >
< div class = "col-sm-8 blog-main" >
< article class = "blog-post" >
< header >
< h2 class = "blog-post-title" dir = "auto" > < a href = "https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/" > CGSpace DSpace 6 Upgrade< / a > < / h2 >
< p class = "blog-post-meta" >
< time datetime = "2020-11-15T13:27:35+02:00" > Sun Nov 15, 2020< / time >
in
< span class = "fas fa-folder" aria-hidden = "true" > < / span > < a href = "/cgspace-notes/categories/notes/" rel = "category tag" > Notes< / a >
< span class = "fas fa-tag" aria-hidden = "true" > < / span > < a href = "/cgspace-notes/tags/migration/" rel = "tag" > Migration< / a >
< / p >
< / header >
< p > Notes about the DSpace 6 upgrade on CGSpace in 2020-11.< / p >
< ul >
2020-11-18 22:15:06 +01:00
< li > < a href = "#re-import-oai-with-clean-index" > Re-import OAI with clean index< / a > < / li >
< li > < a href = "#processing-solr-statistics-with-solr-upgrade-statistics-6x" > Processing Solr statistics with solr-upgrade-statistics-6x< / a >
2020-11-17 21:14:56 +01:00
< ul >
< li > < a href = "#statistics" > Current year’ s statistics core< / a > < / li >
< li > < a href = "#statistics-2019" > statistics-2019 core< / a > < / li >
< li > < a href = "#statistics-2018" > statistics-2018 core< / a > < / li >
< li > < a href = "#statistics-2017" > statistics-2017 core< / a > < / li >
< li > < a href = "#statistics-2016" > statistics-2016 core< / a > < / li >
< li > < a href = "#statistics-2015" > statistics-2015 core< / a > < / li >
< li > < a href = "#statistics-2014" > statistics-2014 core< / a > < / li >
< li > < a href = "#statistics-2013" > statistics-2013 core< / a > < / li >
2020-11-18 22:15:06 +01:00
< li > < a href = "#statistics-2012" > statistics-2013 core< / a > < / li >
< li > < a href = "#statistics-2011" > statistics-2013 core< / a > < / li >
< li > < a href = "#statistics-2010" > statistics-2013 core< / a > < / li >
2020-11-17 21:14:56 +01:00
< / ul >
< / li >
2020-11-18 22:15:06 +01:00
< li > < a href = "processing-solr-statistics-with-atomicstatisticsupdatecli" > Processing Solr statistics with AtomicStatisticsUpdateCLI< / a > < / li >
2020-11-17 21:14:56 +01:00
< / ul >
2020-11-18 22:15:06 +01:00
< h3 id = "re-import-oai-with-clean-index" > Re-import OAI with clean index< / h3 >
< p > After the upgrade is complete, re-index all items into OAI with a clean index:< / p >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ export JAVA_OPTS< span style = "color:#f92672" > =< / span > < span style = "color:#e6db74" > " -Dfile.encoding=UTF-8 -Xmx2048m" < / span >
2020-11-18 22:15:06 +01:00
$ dspace oai -c import
2021-11-09 05:29:52 +01:00
< / code > < / pre > < / div > < p > The process ran out of memory several times so I had to keep trying again with more JVM heap memory.< / p >
2020-11-18 22:15:06 +01:00
< h3 id = "processing-solr-statistics-with-solr-upgrade-statistics-6x" > Processing Solr Statistics With solr-upgrade-statistics-6x< / h3 >
2020-11-17 21:14:56 +01:00
< p > After the main upgrade process was finished and DSpace was running I started processing the Solr statistics with < code > solr-upgrade-statistics-6x< / code > to migrate all IDs to UUIDs.< / p >
2020-11-18 22:15:06 +01:00
< h2 id = "statistics" > statistics< / h2 >
2020-11-17 21:14:56 +01:00
< p > First process the current year’ s statistics core:< / p >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ export JAVA_OPTS< span style = "color:#f92672" > =< / span > < span style = "color:#e6db74" > ' -Dfile.encoding=UTF-8 -Xmx2048m' < / span >
$ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics
2020-11-17 21:14:56 +01:00
...
=================================================================
*** Statistics Records with Legacy Id ***
2021-11-09 05:29:52 +01:00
< span style = "color:#960050;background-color:#1e0010" >
< / span > < span style = "color:#960050;background-color:#1e0010" > < / span > 3,817,407 Bistream View
2020-11-17 21:14:56 +01:00
1,693,443 Item View
105,974 Collection View
62,383 Community View
163,192 Community Search
162,581 Collection Search
470,288 Unexpected Type & Full Site
--------------------------------------
6,475,268 TOTAL
=================================================================
2021-11-09 05:29:52 +01:00
< / code > < / pre > < / div > < p > After several rounds of processing it finished. Here are some statistics about unmigrated documents:< / p >
2020-11-17 21:14:56 +01:00
< ul >
< li > 227,000: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 471,000: < code > id:/.+-unmigrated/< / code > < / li >
< li > 698,000: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > Majority are < code > type: 5< / code > (aka SITE, according to < code > Constants.java< / code > ) so we can purge them:< / li >
< / ul >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / code > < / pre > < / div > < h2 id = "statistics-2019" > statistics-2019< / h2 >
2020-11-17 21:14:56 +01:00
< p > Processing the statistics-2019 core:< / p >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics
2020-11-17 21:14:56 +01:00
...
=================================================================
*** Statistics Records with Legacy Id ***
2021-11-09 05:29:52 +01:00
< span style = "color:#960050;background-color:#1e0010" >
< / span > < span style = "color:#960050;background-color:#1e0010" > < / span > 5,569,344 Bistream View
2020-11-17 21:14:56 +01:00
2,179,105 Item View
117,194 Community View
104,091 Collection View
774,138 Community Search
568,347 Collection Search
1,482,620 Unexpected Type & Full Site
--------------------------------------
10,794,839 TOTAL
=================================================================
2021-11-09 05:29:52 +01:00
< / code > < / pre > < / div > < p > After several rounds of processing it finished. Here are some statistics about unmigrated documents:< / p >
2020-11-17 21:14:56 +01:00
< ul >
< li > 2,690,309: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 1,494,587: < code > id:/.+-unmigrated/< / code > < / li >
< li > 4,184,896: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 4,172,929 are < code > type: 5< / code > (aka SITE) so we can purge them:< / li >
< / ul >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics-2019/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / code > < / pre > < / div > < h2 id = "statistics-2018" > statistics-2018< / h2 >
2020-11-17 21:14:56 +01:00
< p > Processing the statistics-2018 core:< / p >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2018
2020-11-17 21:14:56 +01:00
...
=================================================================
*** Statistics Records with Legacy Id ***
2021-11-09 05:29:52 +01:00
< span style = "color:#960050;background-color:#1e0010" >
< / span > < span style = "color:#960050;background-color:#1e0010" > < / span > 3,561,532 Bistream View
2020-11-17 21:14:56 +01:00
1,129,326 Item View
97,401 Community View
63,508 Collection View
207,827 Community Search
43,752 Collection Search
457,820 Unexpected Type & Full Site
--------------------------------------
5,561,166 TOTAL
=================================================================
2021-11-09 05:29:52 +01:00
< / code > < / pre > < / div > < p > After some time I got an error about Java heap space so I increased the JVM memory and restarted processing:< / p >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ export JAVA_OPTS< span style = "color:#f92672" > =< / span > < span style = "color:#e6db74" > ' -Dfile.encoding=UTF-8 -Xmx4096m' < / span >
$ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2018
< / code > < / pre > < / div > < p > Eventually the processing finished. Here are some statistics about unmigrated documents:< / p >
2020-11-17 21:14:56 +01:00
< ul >
< li > 365,473: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 546,955: < code > id:/.+-unmigrated/< / code > < / li >
< li > 923,158: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 823,293: are < code > type: 5< / code > so we can purge them:< / li >
< / ul >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics-2018/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / code > < / pre > < / div > < h2 id = "statistics-2017" > statistics-2017< / h2 >
2020-11-17 21:14:56 +01:00
< p > Processing the statistics-2017 core:< / p >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2017
2020-11-17 21:14:56 +01:00
...
=================================================================
*** Statistics Records with Legacy Id ***
2021-11-09 05:29:52 +01:00
< span style = "color:#960050;background-color:#1e0010" >
< / span > < span style = "color:#960050;background-color:#1e0010" > < / span > 2,529,208 Bistream View
2020-11-17 21:14:56 +01:00
1,618,717 Item View
144,945 Community View
74,249 Collection View
479,647 Community Search
114,658 Collection Search
852,215 Unexpected Type & Full Site
--------------------------------------
5,813,639 TOTAL
=================================================================
2021-11-09 05:29:52 +01:00
< / code > < / pre > < / div > < p > Eventually the processing finished. Here are some statistics about unmigrated documents:< / p >
2020-11-17 21:14:56 +01:00
< ul >
< li > 808,309: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 893,868: < code > id:/.+-unmigrated/< / code > < / li >
< li > 1,702,177: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 1,660,524 are < code > type: 5< / code > (SITE) so we can purge them:< / li >
< / ul >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics-2017/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / code > < / pre > < / div > < h2 id = "statistics-2016" > statistics-2016< / h2 >
2020-11-17 21:14:56 +01:00
< p > Processing the statistics-2016 core:< / p >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2016
2020-11-17 21:14:56 +01:00
...
=================================================================
*** Statistics Records with Legacy Id ***
2021-11-09 05:29:52 +01:00
< span style = "color:#960050;background-color:#1e0010" >
< / span > < span style = "color:#960050;background-color:#1e0010" > < / span > 1,765,924 Bistream View
2020-11-17 21:14:56 +01:00
1,151,575 Item View
187,110 Community View
51,204 Collection View
347,382 Community Search
66,605 Collection Search
620,298 Unexpected Type & Full Site
--------------------------------------
4,190,098 TOTAL
=================================================================
2021-11-09 05:29:52 +01:00
< / code > < / pre > < / div > < ul >
2020-11-17 21:14:56 +01:00
< li > 849,408: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 627,747: < code > id:/.+-unmigrated/< / code > < / li >
< li > 1,477,155: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 1,469,706 are < code > type: 5< / code > (SITE) so we can purge them:< / li >
< / ul >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics-2016/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / code > < / pre > < / div > < h2 id = "statistics-2015" > statistics-2015< / h2 >
2020-11-17 21:14:56 +01:00
< p > Processing the statistics-2015 core:< / p >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2015
2020-11-17 21:14:56 +01:00
...
=================================================================
*** Statistics Records with Legacy Id ***
2021-11-09 05:29:52 +01:00
< span style = "color:#960050;background-color:#1e0010" >
< / span > < span style = "color:#960050;background-color:#1e0010" > < / span > 990,916 Bistream View
2020-11-17 21:14:56 +01:00
506,070 Item View
116,153 Community View
33,282 Collection View
21,062 Community Search
10,788 Collection Search
52,107 Unexpected Type & Full Site
--------------------------------------
1,730,378 TOTAL
=================================================================
2021-11-09 05:29:52 +01:00
< / code > < / pre > < / div > < p > Summary of stats after processing:< / p >
2020-11-17 21:14:56 +01:00
< ul >
< li > 195,293: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 67,146: < code > id:/.+-unmigrated/< / code > < / li >
< li > 262,439: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 247,400 are < code > type: 5< / code > (SITE) so we can purge them:< / li >
< / ul >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics-2015/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / code > < / pre > < / div > < h2 id = "statistics-2014" > statistics-2014< / h2 >
2020-11-17 21:14:56 +01:00
< p > Processing the statistics-2014 core:< / p >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2014
2020-11-17 21:14:56 +01:00
...
=================================================================
*** Statistics Records with Legacy Id ***
2021-11-09 05:29:52 +01:00
< span style = "color:#960050;background-color:#1e0010" >
< / span > < span style = "color:#960050;background-color:#1e0010" > < / span > 2,381,603 Item View
2020-11-17 21:14:56 +01:00
1,323,357 Bistream View
501,545 Community View
247,805 Collection View
250 Collection Search
188 Community Search
50 Item Search
10,918 Unexpected Type & Full Site
--------------------------------------
4,465,716 TOTAL
=================================================================
2021-11-09 05:29:52 +01:00
< / code > < / pre > < / div > < p > Summary of unmigrated documents after processing:< / p >
2020-11-17 21:14:56 +01:00
< ul >
< li > 182,131: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 39,947: < code > id:/.+-unmigrated/< / code > < / li >
< li > 222,078: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 188,791 are < code > type: 5< / code > (SITE) so we can purge them:< / li >
< / ul >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics-2014/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / code > < / pre > < / div > < h2 id = "statistics-2013" > statistics-2013< / h2 >
2020-11-17 21:14:56 +01:00
< p > Processing the statistics-2013 core:< / p >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2013
2020-11-17 21:14:56 +01:00
...
=================================================================
*** Statistics Records with Legacy Id ***
2021-11-09 05:29:52 +01:00
< span style = "color:#960050;background-color:#1e0010" >
< / span > < span style = "color:#960050;background-color:#1e0010" > < / span > 2,352,124 Item View
2020-11-17 21:14:56 +01:00
1,117,676 Bistream View
575,711 Community View
171,639 Collection View
248 Item Search
7 Collection Search
5 Community Search
1,452 Unexpected Type & Full Site
--------------------------------------
4,218,862 TOTAL
=================================================================
2021-11-09 05:29:52 +01:00
< / code > < / pre > < / div > < p > Summary of unmigrated docs after processing:< / p >
2020-11-17 21:14:56 +01:00
< ul >
< li > 2,548 : < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 29,772: < code > id:/.+-unmigrated/< / code > < / li >
< li > 32,320: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 15,691 are < code > type: 5< / code > (SITE) so we can purge them:< / li >
< / ul >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics-2013/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / code > < / pre > < / div > < h2 id = "statistics-2012" > statistics-2012< / h2 >
2020-11-18 22:15:06 +01:00
< p > Processing the statistics-2012 core:< / p >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2012
2020-11-18 22:15:06 +01:00
...
=================================================================
*** Statistics Records with Legacy Id ***
2021-11-09 05:29:52 +01:00
< span style = "color:#960050;background-color:#1e0010" >
< / span > < span style = "color:#960050;background-color:#1e0010" > < / span > 2,229,332 Item View
2020-11-18 22:15:06 +01:00
913,577 Bistream View
215,577 Collection View
104,734 Community View
--------------------------------------
3,463,220 TOTAL
=================================================================
2021-11-09 05:29:52 +01:00
< / code > < / pre > < / div > < p > Summary of unmigrated docs after processing:< / p >
2020-11-18 22:15:06 +01:00
< ul >
< li > 0: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 33,161: < code > id:/.+-unmigrated/< / code > < / li >
< li > 33,161: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 33,161 are < code > type: 3< / code > (COLLECTION), which is different than I’ ve seen previously… but I suppose I still have to purge them because there will be errors in the Atmire modules otherwise:< / li >
< / ul >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics-2012/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / code > < / pre > < / div > < h2 id = "statistics-2011" > statistics-2011< / h2 >
2020-11-18 22:15:06 +01:00
< p > Processing the statistics-2011 core:< / p >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2011
2020-11-18 22:15:06 +01:00
...
=================================================================
*** Statistics Records with Legacy Id ***
2021-11-09 05:29:52 +01:00
< span style = "color:#960050;background-color:#1e0010" >
< / span > < span style = "color:#960050;background-color:#1e0010" > < / span > 904,896 Item View
2020-11-18 22:15:06 +01:00
385,789 Bistream View
154,356 Collection View
62,978 Community View
--------------------------------------
1,508,019 TOTAL
=================================================================
2021-11-09 05:29:52 +01:00
< / code > < / pre > < / div > < p > Summary of unmigrated docs after processing:< / p >
2020-11-18 22:15:06 +01:00
< ul >
< li > 0: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 17,551: < code > id:/.+-unmigrated/< / code > < / li >
< li > 17,551: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 12,116 are < code > type: 3< / code > (COLLECTION), which is different than I’ ve seen previously… but I suppose I still have to purge them because there will be errors in the Atmire modules otherwise:< / li >
< / ul >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics-2011/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / code > < / pre > < / div > < h2 id = "statistics-2010" > statistics-2010< / h2 >
2020-11-18 22:15:06 +01:00
< p > Processing the statistics-2010 core:< / p >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2010
2020-11-18 22:15:06 +01:00
...
=================================================================
*** Statistics Records with Legacy Id ***
2021-11-09 05:29:52 +01:00
< span style = "color:#960050;background-color:#1e0010" >
< / span > < span style = "color:#960050;background-color:#1e0010" > < / span > 26,067 Item View
2020-11-18 22:15:06 +01:00
15,615 Bistream View
4,116 Collection View
1,094 Community View
--------------------------------------
46,892 TOTAL
=================================================================
2021-11-09 05:29:52 +01:00
< / code > < / pre > < / div > < p > Summary of unmigrated docs after processing:< / p >
2020-11-18 22:15:06 +01:00
< ul >
< li > 0: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 1,012: < code > id:/.+-unmigrated/< / code > < / li >
< li > 1,012: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 654 are < code > type: 3< / code > (COLLECTION), which is different than I’ ve seen previously… but I suppose I still have to purge them because there will be errors in the Atmire modules otherwise:< / li >
< / ul >
2021-11-09 05:29:52 +01:00
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4" > < code class = "language-console" data-lang = "console" > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics-2010/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / code > < / pre > < / div > < h3 id = "processing-solr-statistics-with-atomicstatisticsupdatecli" > Processing Solr statistics with AtomicStatisticsUpdateCLI< / h3 >
2020-11-22 22:08:49 +01:00
< p > On 2020-11-18 I finished processing the Solr statistics with solr-upgrade-statistics-6x and I started processing them with AtomicStatisticsUpdateCLI.< / p >
< h2 id = "statistics-1" > statistics< / h2 >
< p > First the current year’ s statistics core, in 12-hour batches:< / p >
2021-09-13 15:21:16 +02:00
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics
2020-11-22 22:08:49 +01:00
< / code > < / pre > < p > It took ~38 hours to finish processing this core.< / p >
2020-11-23 21:44:29 +01:00
< h2 id = "statistics-2019-1" > statistics-2019< / h2 >
< p > The statistics-2019 core, in 12-hour batches:< / p >
2021-09-13 15:21:16 +02:00
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2019
2020-11-23 21:44:29 +01:00
< / code > < / pre > < p > It took ~32 hours to finish processing this core.< / p >
2020-11-24 20:47:24 +01:00
< h2 id = "statistics-2018-1" > statistics-2018< / h2 >
< p > The statistics-2018 core, in 12-hour batches:< / p >
2021-09-13 15:21:16 +02:00
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2018
2020-11-24 20:47:24 +01:00
< / code > < / pre > < p > It took ~28 hours to finish processing this core.< / p >
2020-11-25 21:32:41 +01:00
< h2 id = "statistics-2017-1" > statistics-2017< / h2 >
< p > The statistics-2017 core, in 12-hour batches:< / p >
2021-09-13 15:21:16 +02:00
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2017
2020-11-25 21:32:41 +01:00
< / code > < / pre > < p > It took ~24 hours to finish processing this core.< / p >
2020-11-29 07:40:58 +01:00
< h2 id = "statistics-2016-1" > statistics-2016< / h2 >
< p > The statistics-2016 core, in 12-hour batches:< / p >
2021-09-13 15:21:16 +02:00
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2016
2020-11-29 07:40:58 +01:00
< / code > < / pre > < p > It took ~20 hours to finish processing this core.< / p >
< h2 id = "statistics-2015-1" > statistics-2015< / h2 >
< p > The statistics-2015 core, in 12-hour batches:< / p >
2021-09-13 15:21:16 +02:00
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2015
2020-12-01 18:15:48 +01:00
< / code > < / pre > < p > It took ~21 hours to finish processing this core.< / p >
2020-11-29 07:40:58 +01:00
< h2 id = "statistics-2014-1" > statistics-2014< / h2 >
< p > The statistics-2014 core, in 12-hour batches:< / p >
2021-09-13 15:21:16 +02:00
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2014
2020-11-29 07:40:58 +01:00
< / code > < / pre > < p > It took ~12 hours to finish processing this core.< / p >
< h2 id = "statistics-2013-1" > statistics-2013< / h2 >
< p > The statistics-2013 core, in 12-hour batches:< / p >
2021-09-13 15:21:16 +02:00
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2013
2020-11-29 07:40:58 +01:00
< / code > < / pre > < p > It took ~3 hours to finish processing this core.< / p >
< h2 id = "statistics-2012-1" > statistics-2012< / h2 >
< p > The statistics-2012 core, in 12-hour batches:< / p >
2021-09-13 15:21:16 +02:00
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2012
2020-11-29 07:40:58 +01:00
< / code > < / pre > < p > It took ~2 hours to finish processing this core.< / p >
< h2 id = "statistics-2011-1" > statistics-2011< / h2 >
< p > The statistics-2011 core, in 12-hour batches:< / p >
2021-09-13 15:21:16 +02:00
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2011
2020-11-29 07:40:58 +01:00
< / code > < / pre > < p > It took 1 hour to finish processing this core.< / p >
< h2 id = "statistics-2010-1" > statistics-2010< / h2 >
< p > The statistics-2010 core, in 12-hour batches:< / p >
2021-09-13 15:21:16 +02:00
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2010
2020-11-29 07:40:58 +01:00
< / code > < / pre > < p > It took five minutes to finish processing this core.< / p >
2020-11-17 21:14:56 +01:00
< / article >
< / div > <!-- /.blog - main -->
< aside class = "col-sm-3 ml-auto blog-sidebar" >
< section class = "sidebar-module" >
< h4 > Recent Posts< / h4 >
< ol class = "list-unstyled" >
2022-01-01 14:21:47 +01:00
< li > < a href = "/cgspace-notes/2022-01/" > January, 2022< / a > < / li >
2021-12-03 11:58:43 +01:00
< li > < a href = "/cgspace-notes/2021-12/" > December, 2021< / a > < / li >
2021-11-01 09:49:21 +01:00
< li > < a href = "/cgspace-notes/2021-11/" > November, 2021< / a > < / li >
< li > < a href = "/cgspace-notes/2021-10/" > October, 2021< / a > < / li >
2021-09-02 16:21:48 +02:00
< li > < a href = "/cgspace-notes/2021-09/" > September, 2021< / a > < / li >
2020-11-17 21:14:56 +01:00
< / ol >
< / section >
< section class = "sidebar-module" >
< h4 > Links< / h4 >
< ol class = "list-unstyled" >
< li > < a href = "https://cgspace.cgiar.org" > CGSpace< / a > < / li >
< li > < a href = "https://dspacetest.cgiar.org" > DSpace Test< / a > < / li >
< li > < a href = "https://github.com/ilri/DSpace" > CGSpace @ GitHub< / a > < / li >
< / ol >
< / section >
< / aside >
< / div > <!-- /.row -->
< / div > <!-- /.container -->
< footer class = "blog-footer" >
< p dir = "auto" >
Blog template created by < a href = "https://twitter.com/mdo" > @mdo< / a > , ported to Hugo by < a href = 'https://twitter.com/mralanorth' > @mralanorth< / a > .
< / p >
< p >
< a href = "#" > Back to top< / a >
< / p >
< / footer >
< / body >
< / html >