<!DOCTYPE html>
< html lang = "en" >
< head >
< meta charset = "utf-8" >
< meta name = "viewport" content = "width=device-width, initial-scale=1, shrink-to-fit=no" >
< meta property = "og:title" content = "CGSpace DSpace 6 Upgrade" / >
< meta property = "og:description" content = "Documenting the DSpace 6 upgrade." / >
< meta property = "og:type" content = "article" / >
< meta property = "og:url" content = "https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/" / >
< meta property = "article:published_time" content = "2020-11-15T13:27:35+02:00" / >
< meta property = "article:modified_time" content = "2020-12-01T19:15:48+02:00" / >
< meta name = "twitter:card" content = "summary" / >
< meta name = "twitter:title" content = "CGSpace DSpace 6 Upgrade" / >
< meta name = "twitter:description" content = "Documenting the DSpace 6 upgrade." / >
< meta name = "generator" content = "Hugo 0.118.2" >
< script type = "application/ld+json" >
{
"@context": "http://schema.org",
"@type": "BlogPosting",
"headline": "CGSpace DSpace 6 Upgrade",
"url": "https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/",
"wordCount": "1570",
"datePublished": "2020-11-15T13:27:35+02:00",
"dateModified": "2020-12-01T19:15:48+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"keywords": "Notes, Migration",
"description": "Documenting the DSpace 6 upgrade."
}
< / script >
< link rel = "canonical" href = "https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/" >
< title > CGSpace DSpace 6 Upgrade | CGSpace Notes< / title >
<!-- combined, minified CSS -->
< link href = "https://alanorth.github.io/cgspace-notes/css/style.c6ba80bc50669557645abe05f86b73cc5af84408ed20f1551a267bc19ece8228.css" rel = "stylesheet" integrity = "sha256-xrqAvFBmlVdkWr4F+GtzzFr4RAjtIPFVGiZ7wZ7Ogig=" crossorigin = "anonymous" >
<!-- minified Font Awesome for SVG icons -->
< script defer src = "https://alanorth.github.io/cgspace-notes/js/fontawesome.min.f5072c55a0721857184db93a50561d7dc13975b4de2e19db7f81eb5f3fa57270.js" integrity = "sha256-9QcsVaByGFcYTbk6UFYdfcE5dbTeLhnbf4HrXz+lcnA=" crossorigin = "anonymous" > < / script >
<!-- RSS 2.0 feed -->
< / head >
< body >
< div class = "blog-masthead" >
< div class = "container" >
< nav class = "nav blog-nav" >
< a class = "nav-link " href = "https://alanorth.github.io/cgspace-notes/" > Home< / a >
< / nav >
< / div >
< / div >
< header class = "blog-header" >
< div class = "container" >
< h1 class = "blog-title" dir = "auto" > < a href = "https://alanorth.github.io/cgspace-notes/" rel = "home" > CGSpace Notes< / a > < / h1 >
< p class = "lead blog-description" dir = "auto" > Documenting day-to-day work on the < a href = "https://cgspace.cgiar.org" > CGSpace< / a > repository.< / p >
< / div >
< / header >
< div class = "container" >
< div class = "row" >
< div class = "col-sm-8 blog-main" >
< article class = "blog-post" >
< header >
< h2 class = "blog-post-title" dir = "auto" > < a href = "https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/" > CGSpace DSpace 6 Upgrade< / a > < / h2 >
< p class = "blog-post-meta" >
< time datetime = "2020-11-15T13:27:35+02:00" > Sun Nov 15, 2020< / time >
in
< span class = "fas fa-folder" aria-hidden = "true" > < / span > < a href = "/categories/notes/" rel = "category tag" > Notes< / a >
< span class = "fas fa-tag" aria-hidden = "true" > < / span > < a href = "/tags/migration/" rel = "tag" > Migration< / a >
< / p >
< / header >
< p > Notes about the DSpace 6 upgrade on CGSpace in 2020-11.< / p >
< ul >
< li > < a href = "#re-import-oai-with-clean-index" > Re-import OAI with clean index< / a > < / li >
< li > < a href = "#processing-solr-statistics-with-solr-upgrade-statistics-6x" > Processing Solr statistics with solr-upgrade-statistics-6x< / a >
< ul >
< li > < a href = "#statistics" > Current year’s statistics core< / a > < / li >
< li > < a href = "#statistics-2019" > statistics-2019 core< / a > < / li >
< li > < a href = "#statistics-2018" > statistics-2018 core< / a > < / li >
< li > < a href = "#statistics-2017" > statistics-2017 core< / a > < / li >
< li > < a href = "#statistics-2016" > statistics-2016 core< / a > < / li >
< li > < a href = "#statistics-2015" > statistics-2015 core< / a > < / li >
< li > < a href = "#statistics-2014" > statistics-2014 core< / a > < / li >
< li > < a href = "#statistics-2013" > statistics-2013 core< / a > < / li >
< li > < a href = "#statistics-2012" > statistics-2012 core< / a > < / li >
< li > < a href = "#statistics-2011" > statistics-2011 core< / a > < / li >
< li > < a href = "#statistics-2010" > statistics-2010 core< / a > < / li >
< / ul >
< / li >
< li > < a href = "#processing-solr-statistics-with-atomicstatisticsupdatecli" > Processing Solr statistics with AtomicStatisticsUpdateCLI< / a > < / li >
< / ul >
< h3 id = "re-import-oai-with-clean-index" > Re-import OAI with clean index< / h3 >
< p > After the upgrade is complete, re-index all items into OAI with a clean index:< / p >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ export JAVA_OPTS< span style = "color:#f92672" > =< / span > < span style = "color:#e6db74" > " -Dfile.encoding=UTF-8 -Xmx2048m" < / span >
< / span > < / span > < span style = "display:flex;" > < span > $ dspace oai -c import
< / span > < / span > < / code > < / pre > < / div > < p > The process ran out of memory several times, so I had to keep retrying with a larger JVM heap.< / p >
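< p > The retry-with-more-heap routine can be scripted. The following is only a sketch: the helper name < code > retry_with_more_heap< / code > and the heap sizes are illustrative assumptions, and it assumes the command exits with a non-zero status when it runs out of memory.< / p >

```shell
#!/usr/bin/env bash
# Sketch: re-run a command with a larger JVM heap each time it fails.
# The function name and the list of heap sizes (in MB) are assumptions
# made for illustration; adjust them to the memory available on the host.

retry_with_more_heap() {
    local heap
    for heap in 2048 3072 4096; do
        # Same JAVA_OPTS shape as used elsewhere in these notes.
        export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx${heap}m"
        echo "Trying with -Xmx${heap}m"
        # Run the command given as arguments; stop at the first success.
        if "$@"; then
            return 0
        fi
    done
    # Every heap size failed.
    return 1
}
```

< p > Usage would look like < code > retry_with_more_heap dspace oai -c import< / code > . The function returns non-zero if every attempt fails, so the caller can tell that even the largest heap was not enough.< / p >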
< h3 id = "processing-solr-statistics-with-solr-upgrade-statistics-6x" > Processing Solr Statistics With solr-upgrade-statistics-6x< / h3 >
< p > After the main upgrade process was finished and DSpace was running I started processing the Solr statistics with < code > solr-upgrade-statistics-6x< / code > to migrate all IDs to UUIDs.< / p >
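< p > To check progress before or after a run, it helps to ask Solr how many records in a core still lack a 36-character UUID as their < code > id< / code > . The following is a minimal sketch, not part of the upgrade tooling: the helper names < code > extract_num_found< / code > and < code > count_docs< / code > and the < code > SOLR_URL< / code > variable are hypothetical, and it assumes Solr answers on < code > http://localhost:8081/solr< / code > and returns JSON when asked with < code > wt=json< / code > .< / p >

```shell
#!/usr/bin/env bash
# Sketch: count the documents in a Solr core that match a query.
# Helper names and the SOLR_URL variable are illustrative assumptions.

# Pull the numFound value out of a Solr JSON response read from stdin.
extract_num_found() {
    grep -o '"numFound":[0-9]*' | cut -d: -f2
}

# count_docs CORE QUERY
# rows=0 asks Solr for the match count only, without returning documents.
count_docs() {
    local core="$1" query="$2"
    curl -s "${SOLR_URL:-http://localhost:8081/solr}/${core}/select" \
        --data-urlencode "q=${query}" \
        --data-urlencode "rows=0" \
        --data-urlencode "wt=json" \
      | extract_num_found
}
```

< p > For example, < code > count_docs statistics '*:* NOT id:/.{36}/'< / code > would report how many records in the current year’s core do not yet have a UUID-length id.< / p >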
< h2 id = "statistics" > statistics< / h2 >
< p > First process the current year’s statistics core:< / p >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ export JAVA_OPTS< span style = "color:#f92672" > =< / span > < span style = "color:#e6db74" > ' -Dfile.encoding=UTF-8 -Xmx2048m' < / span >
< / span > < / span > < span style = "display:flex;" > < span > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics
< / span > < / span > < span style = "display:flex;" > < span > ...
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < span style = "display:flex;" > < span > *** Statistics Records with Legacy Id ***
< / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" >
< / span > < / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" > < / span > 3,817,407 Bistream View
< / span > < / span > < span style = "display:flex;" > < span > 1,693,443 Item View
< / span > < / span > < span style = "display:flex;" > < span > 105,974 Collection View
< / span > < / span > < span style = "display:flex;" > < span > 62,383 Community View
< / span > < / span > < span style = "display:flex;" > < span > 163,192 Community Search
< / span > < / span > < span style = "display:flex;" > < span > 162,581 Collection Search
< / span > < / span > < span style = "display:flex;" > < span > 470,288 Unexpected Type & Full Site
< / span > < / span > < span style = "display:flex;" > < span > --------------------------------------
< / span > < / span > < span style = "display:flex;" > < span > 6,475,268 TOTAL
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < / code > < / pre > < / div > < p > After several rounds of processing it finished. Here are some statistics about unmigrated documents:< / p >
< ul >
< li > 227,000: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 471,000: < code > id:/.+-unmigrated/< / code > < / li >
< li > 698,000: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > The majority are < code > type: 5< / code > (aka SITE, according to < code > Constants.java< / code > ) so we can purge them:< / li >
< / ul >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / span > < / span > < / code > < / pre > < / div > < h2 id = "statistics-2019" > statistics-2019< / h2 >
< p > Processing the statistics-2019 core:< / p >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2019
< / span > < / span > < span style = "display:flex;" > < span > ...
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < span style = "display:flex;" > < span > *** Statistics Records with Legacy Id ***
< / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" >
< / span > < / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" > < / span > 5,569,344 Bistream View
< / span > < / span > < span style = "display:flex;" > < span > 2,179,105 Item View
< / span > < / span > < span style = "display:flex;" > < span > 117,194 Community View
< / span > < / span > < span style = "display:flex;" > < span > 104,091 Collection View
< / span > < / span > < span style = "display:flex;" > < span > 774,138 Community Search
< / span > < / span > < span style = "display:flex;" > < span > 568,347 Collection Search
< / span > < / span > < span style = "display:flex;" > < span > 1,482,620 Unexpected Type & Full Site
< / span > < / span > < span style = "display:flex;" > < span > --------------------------------------
< / span > < / span > < span style = "display:flex;" > < span > 10,794,839 TOTAL
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < / code > < / pre > < / div > < p > After several rounds of processing it finished. Here are some statistics about unmigrated documents:< / p >
< ul >
< li > 2,690,309: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 1,494,587: < code > id:/.+-unmigrated/< / code > < / li >
< li > 4,184,896: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 4,172,929 are < code > type: 5< / code > (aka SITE) so we can purge them:< / li >
< / ul >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics-2019/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / span > < / span > < / code > < / pre > < / div > < h2 id = "statistics-2018" > statistics-2018< / h2 >
< p > Processing the statistics-2018 core:< / p >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2018
< / span > < / span > < span style = "display:flex;" > < span > ...
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < span style = "display:flex;" > < span > *** Statistics Records with Legacy Id ***
< / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" >
< / span > < / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" > < / span > 3,561,532 Bistream View
< / span > < / span > < span style = "display:flex;" > < span > 1,129,326 Item View
< / span > < / span > < span style = "display:flex;" > < span > 97,401 Community View
< / span > < / span > < span style = "display:flex;" > < span > 63,508 Collection View
< / span > < / span > < span style = "display:flex;" > < span > 207,827 Community Search
< / span > < / span > < span style = "display:flex;" > < span > 43,752 Collection Search
< / span > < / span > < span style = "display:flex;" > < span > 457,820 Unexpected Type & Full Site
< / span > < / span > < span style = "display:flex;" > < span > --------------------------------------
< / span > < / span > < span style = "display:flex;" > < span > 5,561,166 TOTAL
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < / code > < / pre > < / div > < p > After some time I got an error about Java heap space so I increased the JVM memory and restarted processing:< / p >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ export JAVA_OPTS< span style = "color:#f92672" > =< / span > < span style = "color:#e6db74" > ' -Dfile.encoding=UTF-8 -Xmx4096m' < / span >
< / span > < / span > < span style = "display:flex;" > < span > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2018
< / span > < / span > < / code > < / pre > < / div > < p > Eventually the processing finished. Here are some statistics about unmigrated documents:< / p >
< ul >
< li > 365,473: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 546,955: < code > id:/.+-unmigrated/< / code > < / li >
< li > 923,158: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 823,293 are < code > type: 5< / code > so we can purge them:< / li >
< / ul >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics-2018/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / span > < / span > < / code > < / pre > < / div > < h2 id = "statistics-2017" > statistics-2017< / h2 >
< p > Processing the statistics-2017 core:< / p >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2017
< / span > < / span > < span style = "display:flex;" > < span > ...
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < span style = "display:flex;" > < span > *** Statistics Records with Legacy Id ***
< / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" >
< / span > < / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" > < / span > 2,529,208 Bistream View
< / span > < / span > < span style = "display:flex;" > < span > 1,618,717 Item View
< / span > < / span > < span style = "display:flex;" > < span > 144,945 Community View
< / span > < / span > < span style = "display:flex;" > < span > 74,249 Collection View
< / span > < / span > < span style = "display:flex;" > < span > 479,647 Community Search
< / span > < / span > < span style = "display:flex;" > < span > 114,658 Collection Search
< / span > < / span > < span style = "display:flex;" > < span > 852,215 Unexpected Type & Full Site
< / span > < / span > < span style = "display:flex;" > < span > --------------------------------------
< / span > < / span > < span style = "display:flex;" > < span > 5,813,639 TOTAL
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < / code > < / pre > < / div > < p > Eventually the processing finished. Here are some statistics about unmigrated documents:< / p >
< ul >
< li > 808,309: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 893,868: < code > id:/.+-unmigrated/< / code > < / li >
< li > 1,702,177: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 1,660,524 are < code > type: 5< / code > (SITE) so we can purge them:< / li >
< / ul >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics-2017/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / span > < / span > < / code > < / pre > < / div > < h2 id = "statistics-2016" > statistics-2016< / h2 >
< p > Processing the statistics-2016 core:< / p >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2016
< / span > < / span > < span style = "display:flex;" > < span > ...
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < span style = "display:flex;" > < span > *** Statistics Records with Legacy Id ***
< / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" >
< / span > < / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" > < / span > 1,765,924 Bistream View
< / span > < / span > < span style = "display:flex;" > < span > 1,151,575 Item View
< / span > < / span > < span style = "display:flex;" > < span > 187,110 Community View
< / span > < / span > < span style = "display:flex;" > < span > 51,204 Collection View
< / span > < / span > < span style = "display:flex;" > < span > 347,382 Community Search
< / span > < / span > < span style = "display:flex;" > < span > 66,605 Collection Search
< / span > < / span > < span style = "display:flex;" > < span > 620,298 Unexpected Type & Full Site
< / span > < / span > < span style = "display:flex;" > < span > --------------------------------------
< / span > < / span > < span style = "display:flex;" > < span > 4,190,098 TOTAL
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < / code > < / pre > < / div > < ul >
< li > 849,408: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 627,747: < code > id:/.+-unmigrated/< / code > < / li >
< li > 1,477,155: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 1,469,706 are < code > type: 5< / code > (SITE) so we can purge them:< / li >
< / ul >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics-2016/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / span > < / span > < / code > < / pre > < / div > < h2 id = "statistics-2015" > statistics-2015< / h2 >
< p > Processing the statistics-2015 core:< / p >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2015
< / span > < / span > < span style = "display:flex;" > < span > ...
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < span style = "display:flex;" > < span > *** Statistics Records with Legacy Id ***
< / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" >
< / span > < / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" > < / span > 990,916 Bistream View
< / span > < / span > < span style = "display:flex;" > < span > 506,070 Item View
< / span > < / span > < span style = "display:flex;" > < span > 116,153 Community View
< / span > < / span > < span style = "display:flex;" > < span > 33,282 Collection View
< / span > < / span > < span style = "display:flex;" > < span > 21,062 Community Search
< / span > < / span > < span style = "display:flex;" > < span > 10,788 Collection Search
< / span > < / span > < span style = "display:flex;" > < span > 52,107 Unexpected Type & Full Site
< / span > < / span > < span style = "display:flex;" > < span > --------------------------------------
< / span > < / span > < span style = "display:flex;" > < span > 1,730,378 TOTAL
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < / code > < / pre > < / div > < p > Summary of stats after processing:< / p >
< ul >
< li > 195,293: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 67,146: < code > id:/.+-unmigrated/< / code > < / li >
< li > 262,439: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 247,400 are < code > type: 5< / code > (SITE) so we can purge them:< / li >
< / ul >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics-2015/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / span > < / span > < / code > < / pre > < / div > < h2 id = "statistics-2014" > statistics-2014< / h2 >
< p > Processing the statistics-2014 core:< / p >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2014
< / span > < / span > < span style = "display:flex;" > < span > ...
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < span style = "display:flex;" > < span > *** Statistics Records with Legacy Id ***
< / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" >
< / span > < / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" > < / span > 2,381,603 Item View
< / span > < / span > < span style = "display:flex;" > < span > 1,323,357 Bistream View
< / span > < / span > < span style = "display:flex;" > < span > 501,545 Community View
< / span > < / span > < span style = "display:flex;" > < span > 247,805 Collection View
< / span > < / span > < span style = "display:flex;" > < span > 250 Collection Search
< / span > < / span > < span style = "display:flex;" > < span > 188 Community Search
< / span > < / span > < span style = "display:flex;" > < span > 50 Item Search
< / span > < / span > < span style = "display:flex;" > < span > 10,918 Unexpected Type & Full Site
< / span > < / span > < span style = "display:flex;" > < span > --------------------------------------
< / span > < / span > < span style = "display:flex;" > < span > 4,465,716 TOTAL
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < / code > < / pre > < / div > < p > Summary of unmigrated documents after processing:< / p >
< ul >
< li > 182,131: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 39,947: < code > id:/.+-unmigrated/< / code > < / li >
< li > 222,078: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 188,791 are < code > type: 5< / code > (SITE) so we can purge them:< / li >
< / ul >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics-2014/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / span > < / span > < / code > < / pre > < / div > < h2 id = "statistics-2013" > statistics-2013< / h2 >
< p > Processing the statistics-2013 core:< / p >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2013
< / span > < / span > < span style = "display:flex;" > < span > ...
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < span style = "display:flex;" > < span > *** Statistics Records with Legacy Id ***
< / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" >
< / span > < / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" > < / span > 2,352,124 Item View
< / span > < / span > < span style = "display:flex;" > < span > 1,117,676 Bistream View
< / span > < / span > < span style = "display:flex;" > < span > 575,711 Community View
< / span > < / span > < span style = "display:flex;" > < span > 171,639 Collection View
< / span > < / span > < span style = "display:flex;" > < span > 248 Item Search
< / span > < / span > < span style = "display:flex;" > < span > 7 Collection Search
< / span > < / span > < span style = "display:flex;" > < span > 5 Community Search
< / span > < / span > < span style = "display:flex;" > < span > 1,452 Unexpected Type & Full Site
< / span > < / span > < span style = "display:flex;" > < span > --------------------------------------
< / span > < / span > < span style = "display:flex;" > < span > 4,218,862 TOTAL
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < / code > < / pre > < / div > < p > Summary of unmigrated docs after processing:< / p >
< ul >
< li > 2,548: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 29,772: < code > id:/.+-unmigrated/< / code > < / li >
< li > 32,320: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 15,691 are < code > type: 5< / code > (SITE) so we can purge them:< / li >
< / ul >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ curl -s < span style = "color:#e6db74" > " http://localhost:8081/solr/statistics-2013/update?softCommit=true" < / span > -H < span style = "color:#e6db74" > " Content-Type: text/xml" < / span > --data-binary < span style = "color:#e6db74" > " < delete> < query> *:* NOT id:/.{36}/< /query> < /delete> " < / span >
< / span > < / span > < / code > < / pre > < / div > < h2 id = "statistics-2012" > statistics-2012< / h2 >
< p > Processing the statistics-2012 core:< / p >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2012
< / span > < / span > < span style = "display:flex;" > < span > ...
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < span style = "display:flex;" > < span > *** Statistics Records with Legacy Id ***
< / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" >
< / span > < / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" > < / span > 2,229,332 Item View
< / span > < / span > < span style = "display:flex;" > < span > 913,577 Bistream View
< / span > < / span > < span style = "display:flex;" > < span > 215,577 Collection View
< / span > < / span > < span style = "display:flex;" > < span > 104,734 Community View
< / span > < / span > < span style = "display:flex;" > < span > --------------------------------------
< / span > < / span > < span style = "display:flex;" > < span > 3,463,220 TOTAL
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < / code > < / pre > < / div > < p > Summary of unmigrated docs after processing:< / p >
< ul >
< li > 0: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 33,161: < code > id:/.+-unmigrated/< / code > < / li >
< li > 33,161: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 33,161 are < code > type: 3< / code > (COLLECTION), which is different from what I’ve seen previously, but I suppose I still have to purge them because there will otherwise be errors in the Atmire modules:< / li >
< / ul >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ curl -s < span style = "color:#e6db74" > "http://localhost:8081/solr/statistics-2012/update?softCommit=true"< / span > -H < span style = "color:#e6db74" > "Content-Type: text/xml"< / span > --data-binary < span style = "color:#e6db74" > "&lt;delete&gt;&lt;query&gt;*:* NOT id:/.{36}/&lt;/query&gt;&lt;/delete&gt;"< / span >
< / span > < / span > < / code > < / pre > < / div > < h2 id = "statistics-2011" > statistics-2011< / h2 >
< p > Processing the statistics-2011 core:< / p >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2011
< / span > < / span > < span style = "display:flex;" > < span > ...
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < span style = "display:flex;" > < span > *** Statistics Records with Legacy Id ***
< / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" >
< / span > < / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" > < / span > 904,896 Item View
< / span > < / span > < span style = "display:flex;" > < span > 385,789 Bistream View
< / span > < / span > < span style = "display:flex;" > < span > 154,356 Collection View
< / span > < / span > < span style = "display:flex;" > < span > 62,978 Community View
< / span > < / span > < span style = "display:flex;" > < span > --------------------------------------
< / span > < / span > < span style = "display:flex;" > < span > 1,508,019 TOTAL
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < / code > < / pre > < / div > < p > Summary of unmigrated docs after processing:< / p >
< ul >
< li > 0: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 17,551: < code > id:/.+-unmigrated/< / code > < / li >
< li > 17,551: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 12,116 are < code > type: 3< / code > (COLLECTION), which is different from what I’ve seen previously, but I suppose I still have to purge them because there will otherwise be errors in the Atmire modules:< / li >
< / ul >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ curl -s < span style = "color:#e6db74" > "http://localhost:8081/solr/statistics-2011/update?softCommit=true"< / span > -H < span style = "color:#e6db74" > "Content-Type: text/xml"< / span > --data-binary < span style = "color:#e6db74" > "&lt;delete&gt;&lt;query&gt;*:* NOT id:/.{36}/&lt;/query&gt;&lt;/delete&gt;"< / span >
< / span > < / span > < / code > < / pre > < / div > < h2 id = "statistics-2010" > statistics-2010< / h2 >
< p > Processing the statistics-2010 core:< / p >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ chrt -b < span style = "color:#ae81ff" > 0< / span > dspace solr-upgrade-statistics-6x -n < span style = "color:#ae81ff" > 2500000< / span > -i statistics-2010
< / span > < / span > < span style = "display:flex;" > < span > ...
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < span style = "display:flex;" > < span > *** Statistics Records with Legacy Id ***
< / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" >
< / span > < / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#960050;background-color:#1e0010" > < / span > 26,067 Item View
< / span > < / span > < span style = "display:flex;" > < span > 15,615 Bistream View
< / span > < / span > < span style = "display:flex;" > < span > 4,116 Collection View
< / span > < / span > < span style = "display:flex;" > < span > 1,094 Community View
< / span > < / span > < span style = "display:flex;" > < span > --------------------------------------
< / span > < / span > < span style = "display:flex;" > < span > 46,892 TOTAL
< / span > < / span > < span style = "display:flex;" > < span > =================================================================
< / span > < / span > < / code > < / pre > < / div > < p > Summary of unmigrated docs after processing:< / p >
< ul >
< li > 0: < code > (*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)< / code > < / li >
< li > 1,012: < code > id:/.+-unmigrated/< / code > < / li >
< li > 1,012: < code > *:* NOT id:/.{36}/< / code > < / li >
< li > 654 are < code > type: 3< / code > (COLLECTION), which is different from what I’ve seen previously, but I suppose I still have to purge them because there will otherwise be errors in the Atmire modules:< / li >
< / ul >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ curl -s < span style = "color:#e6db74" > "http://localhost:8081/solr/statistics-2010/update?softCommit=true"< / span > -H < span style = "color:#e6db74" > "Content-Type: text/xml"< / span > --data-binary < span style = "color:#e6db74" > "&lt;delete&gt;&lt;query&gt;*:* NOT id:/.{36}/&lt;/query&gt;&lt;/delete&gt;"< / span >
< / span > < / span > < / code > < / pre > < / div > < h3 id = "processing-solr-statistics-with-atomicstatisticsupdatecli" > Processing Solr statistics with AtomicStatisticsUpdateCLI< / h3 >
< p > On 2020-11-18 I finished processing the Solr statistics with solr-upgrade-statistics-6x and I started processing them with AtomicStatisticsUpdateCLI.< / p >
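< p > Before switching tools, it may be worth confirming that a core no longer holds documents without a 36-character UUID id. The sketch below was not part of the original run. It assumes Solr is still reachable at < code > http://localhost:8081/solr< / code > as in the purge commands above, and it queries the standard < code > /select< / code > handler with < code > rows=0< / code > so only the match count comes back. The helper names < code > num_found< / code > and < code > count_docs< / code > are made up for illustration.< / p >

```shell
#!/bin/sh
# Sketch: count documents in a Solr core that still lack a 36-character id.
# SOLR_URL is an assumption taken from the curl commands in this post.
SOLR_URL="${SOLR_URL:-http://localhost:8081/solr}"

# Read a Solr JSON response on stdin and print the value of numFound.
num_found() {
    grep -o '"numFound": *[0-9]*' | head -n 1 | sed 's/[^0-9]//g'
}

# $1 = core name, $2 = Solr query; prints the number of matching documents.
count_docs() {
    curl -s -G "$SOLR_URL/$1/select" \
        --data-urlencode "q=$2" \
        --data-urlencode "rows=0" \
        --data-urlencode "wt=json" | num_found
}

# Only contact Solr for the cores named on the command line.
for core in "$@"; do
    printf '%s: %s\n' "$core" "$(count_docs "$core" '*:* NOT id:/.{36}/')"
done
```

< p > Saved under a hypothetical name such as < code > check-unmigrated.sh< / code > , it could be run as < code > sh check-unmigrated.sh statistics-2012 statistics-2011< / code > . A count of 0 for a core would suggest the purge left nothing behind.< / p >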
< h2 id = "statistics-1" > statistics< / h2 >
< p > First, the current year’s statistics core, in 12-hour batches:< / p >
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics
< / code > < / pre > < p > It took ~38 hours to finish processing this core.< / p >
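< p > The same invocation is repeated for every yearly core in the sections that follow, so it could also be wrapped in a loop. This is only a sketch: the post does not say the runs were scripted, and the < code > DRY_RUN< / code > switch and the < code > run_update< / code > helper are assumptions added for illustration. The command and its flags are copied from the manual run above.< / p >

```shell
#!/bin/sh
# Sketch: run AtomicStatisticsUpdateCLI over each statistics core in turn.
# DRY_RUN=1 (the default here) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Core names as used in this post: the current core, then 2019 down to 2010.
cores="statistics"
year=2019
while [ "$year" -ge 2010 ]; do
    cores="$cores statistics-$year"
    year=$((year - 1))
done

# $1 = core name; print or execute the update command for that core.
run_update() {
    set -- chrt -b 0 dspace dsrun \
        com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI \
        -t 12 -c "$1"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

for core in $cores; do
    # Stop at the first failure so a problem with one core is not missed.
    run_update "$core" || exit 1
done
```

< p > With the default < code > DRY_RUN=1< / code > the script only lists the eleven commands. Setting < code > DRY_RUN=0< / code > would execute them one after another.< / p >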
< h2 id = "statistics-2019-1" > statistics-2019< / h2 >
< p > The statistics-2019 core, in 12-hour batches:< / p >
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2019
< / code > < / pre > < p > It took ~32 hours to finish processing this core.< / p >
< h2 id = "statistics-2018-1" > statistics-2018< / h2 >
< p > The statistics-2018 core, in 12-hour batches:< / p >
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2018
< / code > < / pre > < p > It took ~28 hours to finish processing this core.< / p >
< h2 id = "statistics-2017-1" > statistics-2017< / h2 >
< p > The statistics-2017 core, in 12-hour batches:< / p >
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2017
< / code > < / pre > < p > It took ~24 hours to finish processing this core.< / p >
< h2 id = "statistics-2016-1" > statistics-2016< / h2 >
< p > The statistics-2016 core, in 12-hour batches:< / p >
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2016
< / code > < / pre > < p > It took ~20 hours to finish processing this core.< / p >
< h2 id = "statistics-2015-1" > statistics-2015< / h2 >
< p > The statistics-2015 core, in 12-hour batches:< / p >
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2015
< / code > < / pre > < p > It took ~21 hours to finish processing this core.< / p >
< h2 id = "statistics-2014-1" > statistics-2014< / h2 >
< p > The statistics-2014 core, in 12-hour batches:< / p >
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2014
< / code > < / pre > < p > It took ~12 hours to finish processing this core.< / p >
< h2 id = "statistics-2013-1" > statistics-2013< / h2 >
< p > The statistics-2013 core, in 12-hour batches:< / p >
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2013
< / code > < / pre > < p > It took ~3 hours to finish processing this core.< / p >
< h2 id = "statistics-2012-1" > statistics-2012< / h2 >
< p > The statistics-2012 core, in 12-hour batches:< / p >
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2012
< / code > < / pre > < p > It took ~2 hours to finish processing this core.< / p >
< h2 id = "statistics-2011-1" > statistics-2011< / h2 >
< p > The statistics-2011 core, in 12-hour batches:< / p >
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2011
< / code > < / pre > < p > It took 1 hour to finish processing this core.< / p >
< h2 id = "statistics-2010-1" > statistics-2010< / h2 >
< p > The statistics-2010 core, in 12-hour batches:< / p >
< pre tabindex = "0" > < code > $ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics-2010
< / code > < / pre > < p > It took five minutes to finish processing this core.< / p >
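< p > To get a feel for the overall cost, the per-core durations reported above can be added up. The figures are the approximate ones from this post, so the result is only a rough estimate, and it assumes the cores were processed one after another rather than in parallel.< / p >

```shell
#!/bin/sh
# Sketch: add up the approximate per-core durations reported in this post.
# Hours for statistics, then statistics-2019 down to statistics-2011.
hours="38 32 28 24 20 21 12 3 2 1"
# statistics-2010 took about five minutes.
total_minutes=5

for h in $hours; do
    total_minutes=$((total_minutes + h * 60))
done

days=$((total_minutes / 1440))
rem_hours=$((total_minutes % 1440 / 60))
rem_minutes=$((total_minutes % 60))
echo "Total: about $((total_minutes / 60)) hours ($days days, $rem_hours hours, $rem_minutes minutes)"
```

< p > That comes to roughly 181 hours, or about seven and a half days of processing.< / p >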
< / article >
< / div > <!-- /.blog - main -->
< aside class = "col-sm-3 ml-auto blog-sidebar" >
< section class = "sidebar-module" >
< h4 > Recent Posts< / h4 >
< ol class = "list-unstyled" >
< li > < a href = "/cgspace-notes/2023-09/" > September, 2023< / a > < / li >
< li > < a href = "/cgspace-notes/2023-08/" > August, 2023< / a > < / li >
< li > < a href = "/cgspace-notes/2023-07/" > July, 2023< / a > < / li >
< li > < a href = "/cgspace-notes/2023-06/" > June, 2023< / a > < / li >
< li > < a href = "/cgspace-notes/2023-05/" > May, 2023< / a > < / li >
< / ol >
< / section >
< section class = "sidebar-module" >
< h4 > Links< / h4 >
< ol class = "list-unstyled" >
< li > < a href = "https://cgspace.cgiar.org" > CGSpace< / a > < / li >
< li > < a href = "https://dspacetest.cgiar.org" > DSpace Test< / a > < / li >
< li > < a href = "https://github.com/ilri/DSpace" > CGSpace @ GitHub< / a > < / li >
< / ol >
< / section >
< / aside >
< / div > <!-- /.row -->
< / div > <!-- /.container -->
< footer class = "blog-footer" >
< p dir = "auto" >
Blog template created by < a href = "https://twitter.com/mdo" > @mdo< / a > , ported to Hugo by < a href = 'https://twitter.com/mralanorth' > @mralanorth< / a > .
< / p >
< p >
< a href = "#" > Back to top< / a >
< / p >
< / footer >
< / body >
< / html >