<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"> <meta property="og:title" content="July, 2019" /> <meta property="og:description" content="2019-07-01 Create an “AfricaRice books and book chapters” collection on CGSpace for AfricaRice Last month Sisay asked why the following “most popular” statistics link for a range of months in 2018 works for the CIAT community on DSpace Test, but not on CGSpace: DSpace Test CGSpace Abenet had another similar issue a few days ago when trying to find the stats for 2018 in the RTB community " /> <meta property="og:type" content="article" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2019-07/" /> <meta property="article:published_time" content="2019-07-01T12:13:51+03:00"/> <meta property="article:modified_time" content="2019-07-02T18:08:14+03:00"/> <meta name="twitter:card" content="summary"/> <meta name="twitter:title" content="July, 2019"/> <meta name="twitter:description" content="2019-07-01 Create an “AfricaRice books and book chapters” collection on CGSpace for AfricaRice Last month Sisay asked why the following “most popular” statistics link for a range of months in 2018 works for the CIAT community on DSpace Test, but not on CGSpace: DSpace Test CGSpace Abenet had another similar issue a few days ago when trying to find the stats for 2018 in the RTB community "/> <meta name="generator" content="Hugo 0.55.6" /> <script type="application/ld+json"> { "@context": "http://schema.org", "@type": "BlogPosting", "headline": "July, 2019", "url": "https:\/\/alanorth.github.io\/cgspace-notes\/2019-07\/", "wordCount": "544", "datePublished": "2019-07-01T12:13:51\x2b03:00", "dateModified": "2019-07-02T18:08:14\x2b03:00", "author": { "@type": "Person", "name": "Alan Orth" }, "keywords": "Notes" } </script> <link rel="canonical" href="https://alanorth.github.io/cgspace-notes/2019-07/"> <title>July, 2019 | 
CGSpace Notes</title> <!-- combined, minified CSS --> <link href="https://alanorth.github.io/cgspace-notes/css/style.css" rel="stylesheet" integrity="sha384-G5B34w7DFTumWTswxYzTX7NWfbvQEg1HbFFEg6ItN03uTAAoS2qkPS/fu3LhuuSA" crossorigin="anonymous"> <!-- RSS 2.0 feed --> </head> <body> <div class="blog-masthead"> <div class="container"> <nav class="nav blog-nav"> <a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a> </nav> </div> </div> <header class="blog-header"> <div class="container"> <h1 class="blog-title"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1> <p class="lead blog-description">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p> </div> </header> <div class="container"> <div class="row"> <div class="col-sm-8 blog-main"> <article class="blog-post"> <header> <h2 class="blog-post-title"><a href="https://alanorth.github.io/cgspace-notes/2019-07/">July, 2019</a></h2> <p class="blog-post-meta"><time datetime="2019-07-01T12:13:51+03:00">Mon Jul 01, 2019</time> by Alan Orth in <i class="fa fa-tag" aria-hidden="true"></i> <a href="/cgspace-notes/tags/notes" rel="tag">Notes</a> </p> </header> <h2 id="2019-07-01">2019-07-01</h2> <ul> <li>Create an “AfricaRice books and book chapters” collection on CGSpace for AfricaRice</li> <li>Last month Sisay asked why the following “most popular” statistics link for a range of months in 2018 works for the CIAT community on DSpace Test, but not on CGSpace: <ul> <li><a href="https://dspacetest.cgiar.org/handle/10568/35697/most-popular/item#simplefilter=custom&time_filter_end_date=01%2F12%2F2018">DSpace Test</a></li> <li><a href="https://cgspace.cgiar.org/handle/10568/35697/most-popular/item#simplefilter=custom&time_filter_end_date=01%2F12%2F2018">CGSpace</a></li> </ul></li> <li>Abenet had another similar issue a few days ago when trying to find the stats for 2018 in the RTB community</li> </ul> <ul> <li>If I change the 
parameters to 2019 I see stats, so I’m really thinking it has something to do with the sharded yearly Solr statistics cores <ul> <li>I checked the Solr admin UI and I see all Solr cores loaded, so I don’t know what it could be</li> <li>When I check the Atmire content and usage module it seems obvious that there is a problem with the old cores because I don’t have anything before 2019-01</li> </ul></li> </ul> <p><img src="/cgspace-notes/2019/07/atmire-cua-2018-missing.png" alt="Atmire CUA 2018 stats missing" /></p> <ul> <li>I don’t see anyone logged in right now so I’m going to try to restart Tomcat and see if the stats are accessible after Solr comes back up</li> <li><p>I decided to run all system updates on the server (linode18) and reboot it</p> <ul> <li>After rebooting, Tomcat came back up, but the Solr statistics cores were not all loaded</li> <li><p>The error is always the same, just with a different core each time:</p> <pre><code>org.apache.solr.common.SolrException: Error CREATEing SolrCore 'statistics-2010': Unable to create core [statistics-2010]
Caused by: Lock obtain timed out: NativeFSLock@/home/cgspace.cgiar.org/solr/statistics-2010/data/index/write.lock
</code></pre></li> </ul></li> <li><p>I restarted Tomcat <em>ten times</em> and it never worked…</p></li> <li><p>I tried to stop Tomcat and delete the write locks:</p> <pre><code># systemctl stop tomcat7
# find /dspace/solr/statistics* -iname "*.lock" -print -delete
/dspace/solr/statistics/data/index/write.lock
/dspace/solr/statistics-2010/data/index/write.lock
/dspace/solr/statistics-2011/data/index/write.lock
/dspace/solr/statistics-2012/data/index/write.lock
/dspace/solr/statistics-2013/data/index/write.lock
/dspace/solr/statistics-2014/data/index/write.lock
/dspace/solr/statistics-2015/data/index/write.lock
/dspace/solr/statistics-2016/data/index/write.lock
/dspace/solr/statistics-2017/data/index/write.lock
/dspace/solr/statistics-2018/data/index/write.lock
# find /dspace/solr/statistics* -iname "*.lock" -print -delete
# systemctl start tomcat7 </code></pre></li> <li><p>But it still didn’t work!</p></li> <li><p>I stopped Tomcat, deleted the old locks, and tried the “simple” lock type in <code>solr/statistics/conf/solrconfig.xml</code>:</p> <pre><code>&lt;lockType&gt;${solr.lock.type:simple}&lt;/lockType&gt; </code></pre></li> <li><p>And after restarting Tomcat it still doesn’t work</p></li> <li><p>Now I’ll try going back to “native” locking with <code>unlockOnStartup</code>:</p> <pre><code>&lt;unlockOnStartup&gt;true&lt;/unlockOnStartup&gt; </code></pre></li> <li><p>Now the cores seem to load, but I still see an error in the Solr Admin UI and I still can’t access any stats before 2018</p></li> <li><p>I filed an <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=685">issue with Atmire</a>, so let’s see if they can help</p></li> <li><p>And since I’m annoyed and it’s been a few months, I’m going to move the JVM heap and garbage collection settings that I’ve been testing on DSpace Test to CGSpace</p></li> <li><p>The old ones were:</p> <pre><code>-Djava.awt.headless=true -Xms8192m -Xmx8192m -XX:+UseConcMarkSweepGC -Dfile.encoding=UTF-8 -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=5400 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false </code></pre></li> <li><p>And the new ones come from Solr 4.10.x’s startup scripts:</p> <pre><code>-Djava.awt.headless=true -Xms8192m -Xmx8192m -Dfile.encoding=UTF-8 -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark -XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000 -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=1337 -Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false </code></pre></li> </ul> <h2 id="2019-07-02">2019-07-02</h2> <ul> <li><p>Help upload twenty-seven posters from the 2019-05 Sharefair to CGSpace</p> <ul> <li><p>Sisay had already done the SAFBundle, so I made some minor corrections and uploaded them to a temporary collection so I could check them in OpenRefine:</p> <pre><code>$ sed -i 's/CC-BY 4.0/CC-BY-4.0/' item_*/dublin_core.xml
$ echo "10568/101992" >> item_*/collections
$ dspace import -a -e me@cgiar.org -m 2019-07-02-Sharefair.map -s /tmp/Sharefair_mapped
</code></pre></li> </ul></li> <li><p>I noticed that all twenty-seven items had double dates like “2019-05||2019-05” so I fixed those, but the rest of the metadata looked good so I unmapped them from the temporary collection</p></li> <li><p>Finish looking at the fifty-six AfricaRice items and upload them to CGSpace:</p> <pre><code>$ dspace import -a -e me@cgiar.org -m 2019-07-02-AfricaRice-11to73.map -s /tmp/SimpleArchiveFormat </code></pre></li> </ul> <!-- vim: set sw=2 ts=2: --> </article> </div> <!-- /.blog-main --> <aside class="col-sm-3 ml-auto blog-sidebar"> <section class="sidebar-module"> <h4>Recent Posts</h4> <ol class="list-unstyled"> <li><a href="/cgspace-notes/2019-07/">July, 2019</a></li> <li><a href="/cgspace-notes/posts/">Posts</a></li> <li><a href="/cgspace-notes/2019-06/">June, 2019</a></li> <li><a href="/cgspace-notes/2019-05/">May, 2019</a></li> <li><a href="/cgspace-notes/2019-04/">April, 2019</a></li> </ol> </section> <section class="sidebar-module"> <h4>Links</h4> <ol class="list-unstyled"> <li><a href="https://cgspace.cgiar.org">CGSpace</a></li> <li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li> <li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li> </ol> </section> </aside> </div> <!-- /.row --> </div> <!-- /.container --> <footer class="blog-footer"> <p> Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a 
href='https://twitter.com/mralanorth'>@mralanorth</a>. </p> <p> <a href="#">Back to top</a> </p> </footer> </body> </html>