<!DOCTYPE html>
< html lang = "en" >
< head >
< meta charset = "utf-8" >
< meta name = "viewport" content = "width=device-width, initial-scale=1, shrink-to-fit=no" >
< meta property = "og:title" content = "March, 2020" / >
< meta property = "og:description" content = "2020-03-02
Update dspace-statistics-api for DSpace 6+ UUIDs
Tag version 1.2.0 on GitHub
Test migrating legacy Solr statistics to UUIDs with the as-of-yet unreleased SolrUpgradePre6xStatistics.java
You need to download this into the DSpace 6.x source and compile it
" />
< meta property = "og:type" content = "article" / >
< meta property = "og:url" content = "https://alanorth.github.io/cgspace-notes/2020-03/" / >
< meta property = "article:published_time" content = "2020-03-02T12:31:30+02:00" / >
< meta property = "article:modified_time" content = "2020-03-08T14:28:39+02:00" / >
< meta name = "twitter:card" content = "summary" / >
< meta name = "twitter:title" content = "March, 2020" / >
< meta name = "twitter:description" content = "2020-03-02
Update dspace-statistics-api for DSpace 6+ UUIDs
Tag version 1.2.0 on GitHub
Test migrating legacy Solr statistics to UUIDs with the as-of-yet unreleased SolrUpgradePre6xStatistics.java
You need to download this into the DSpace 6.x source and compile it
"/>
< meta name = "generator" content = "Hugo 0.66.0" / >
< script type = "application/ld+json" >
{
"@context": "http://schema.org",
"@type": "BlogPosting",
"headline": "March, 2020",
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2020-03\/",
"wordCount": "480",
"datePublished": "2020-03-02T12:31:30+02:00",
"dateModified": "2020-03-08T14:28:39+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"keywords": "Notes"
}
< / script >
< link rel = "canonical" href = "https://alanorth.github.io/cgspace-notes/2020-03/" >
< title > March, 2020 | CGSpace Notes< / title >
<!-- combined, minified CSS -->
< link href = "https://alanorth.github.io/cgspace-notes/css/style.6da5c906cc7a8fbb93f31cd2316c5dbe3f19ac4aa6bfb066f1243045b8f6061e.css" rel = "stylesheet" integrity = "sha256-baXJBsx6j7uT8xzSMWxdvj8ZrEqmv7Bm8SQwRbj2Bh4=" crossorigin = "anonymous" >
<!-- minified Font Awesome for SVG icons -->
< script defer src = "https://alanorth.github.io/cgspace-notes/js/fontawesome.min.90e14c13cee52929ac33e1c21694a3cc95063a194eb22aad9f7976434e1a9125.js" integrity = "sha256-kOFME87lKSmsM+HCFpSjzJUGOhlOsiqtn3l2Q04akSU=" crossorigin = "anonymous" > < / script >
<!-- RSS 2.0 feed -->
< / head >
< body >
< div class = "blog-masthead" >
< div class = "container" >
< nav class = "nav blog-nav" >
< a class = "nav-link " href = "https://alanorth.github.io/cgspace-notes/" > Home< / a >
< / nav >
< / div >
< / div >
< header class = "blog-header" >
< div class = "container" >
< h1 class = "blog-title" dir = "auto" > < a href = "https://alanorth.github.io/cgspace-notes/" rel = "home" > CGSpace Notes< / a > < / h1 >
< p class = "lead blog-description" dir = "auto" > Documenting day-to-day work on the < a href = "https://cgspace.cgiar.org" > CGSpace< / a > repository.< / p >
< / div >
< / header >
< div class = "container" >
< div class = "row" >
< div class = "col-sm-8 blog-main" >
< article class = "blog-post" >
< header >
< h2 class = "blog-post-title" dir = "auto" > < a href = "https://alanorth.github.io/cgspace-notes/2020-03/" > March, 2020< / a > < / h2 >
< p class = "blog-post-meta" > < time datetime = "2020-03-02T12:31:30+02:00" > Mon Mar 02, 2020< / time > by Alan Orth in
< span class = "fas fa-folder" aria-hidden = "true" > < / span > < a href = "/cgspace-notes/categories/notes/" rel = "category tag" > Notes< / a >
< / p >
< / header >
< h2 id = "2020-03-02" > 2020-03-02< / h2 >
< ul >
< li > Update < a href = "https://github.com/ilri/dspace-statistics-api" > dspace-statistics-api< / a > for DSpace 6+ UUIDs
< ul >
< li > Tag version 1.2.0 on GitHub< / li >
< / ul >
< / li >
< li > Test migrating legacy Solr statistics to UUIDs with the as-of-yet unreleased < a href = "https://github.com/DSpace/DSpace/commit/184f2b2153479045fba6239342c63e7f8564b8b6#diff-0350ce2e13b28d5d61252b7a8f50a059" > SolrUpgradePre6xStatistics.java< / a >
< ul >
< li > You need to download this into the DSpace 6.x source and compile it< / li >
< / ul >
< / li >
< / ul >
< pre > < code > $ export JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8"
$ ~/dspace63/bin/dspace solr-upgrade-statistics-6x
< / code > < / pre > < h2 id = "2020-03-03" > 2020-03-03< / h2 >
< ul >
< li > Skype with Peter and Abenet to discuss the CG Core survey
< ul >
< li > We also discussed some other CGSpace issues< / li >
< / ul >
< / li >
< / ul >
< h2 id = "2020-03-04" > 2020-03-04< / h2 >
< ul >
< li > Abenet asked me to add some new ILRI subjects to CGSpace
< ul >
< li > I < a href = "https://github.com/ilri/DSpace/commit/b51a242e773bd8658d3cab4ac883975708b00386" > updated the input-forms.xml< / a > in our < code > 5_x-prod< / code > branch on GitHub< / li >
< li > Abenet said we are changing < code > HEALTH< / code > to < code > HUMAN HEALTH< / code > so I need to fix those using my < code > fix-metadata-values.py< / code > script:< / li >
< / ul >
< / li >
< / ul >
< pre > < code > $ ./fix-metadata-values.py -i 2020-03-04-fix-1-ilri-subject.csv -db dspace -u dspace -p 'fuuu' -f cg.subject.ilri -m 203 -t correct -d
< / code > < / pre > < ul >
< li > But I have not run it on CGSpace yet because we want to ask Peter if he is sure about it… < / li >
< li > Send a message to Macaroni Bros to ask them about their Drupal module and its readiness for DSpace 6 UUIDs< / li >
< / ul >
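< p > For reference, the CSV passed to < code > fix-metadata-values.py< / code > with < code > -i< / code > would presumably look something like the sketch below. This assumes the script reads the current value from a column named after the < code > -f< / code > argument and the replacement from the column named by < code > -t< / code > ; the temporary file is only for illustration and is not the real < code > 2020-03-04-fix-1-ilri-subject.csv< / code > .< / p >

```shell
# Illustrative only: write a one-row correction CSV to a temporary file.
# Column names are assumed to match the -f and -t arguments of the command above.
csv=$(mktemp)
cat > "$csv" <<'EOF'
cg.subject.ilri,correct
HEALTH,HUMAN HEALTH
EOF

# Show what was written so it can be reviewed before running the script
cat "$csv"
```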
< h2 id = "2020-03-05" > 2020-03-05< / h2 >
< ul >
< li > I found a very < a href = "https://lucene.apache.org/solr/guide/8_1/solr-system-requirements.html#lucene-solr-prior-to-7-0" > interesting comment on the Solr 8.1 guide< / a > about Java compatibility:< / li >
< / ul >
< blockquote >
< p > Lucene/Solr 7.0 was the first version that successfully passed our tests using Java 9 and higher. You should avoid Java 9 or later for Lucene/Solr 6.x or earlier.< / p >
< / blockquote >
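< p > Since that advice hinges on the Java major version, a quick check on a server might look like the sketch below. The < code > java_major< / code > helper is hypothetical, not part of DSpace or Solr; it only normalizes the legacy < code > 1.x< / code > version scheme and the newer one to a single number.< / p >

```shell
# Hypothetical helper: reduce a Java version string to its major version.
# Handles the legacy "1.x" scheme (Java 8 and earlier) and the newer scheme (Java 9+).
java_major() {
  case "$1" in
    1.*) v=${1#1.}; echo "${v%%.*}" ;;   # "1.8.0_242" -> 8
    *)   echo "${1%%[.+-]*}" ;;          # "11.0.6" -> 11
  esac
}

# Possible usage, assuming `java` is on the PATH:
#   ver=$(java -version 2>&1 | awk -F '"' '/version/ {print $2}')
#   [ "$(java_major "$ver")" -ge 9 ] && echo "Java 9+ detected: avoid with Solr 6.x or earlier"
```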
< h2 id = "2020-03-08" > 2020-03-08< / h2 >
< ul >
< li > I want to try to consolidate our yearly Solr statistics cores back into one < code > statistics< / code > core using the solr-import-export-json tool< / li >
< li > I will try it on DSpace Test, doing one year at a time:< / li >
< / ul >
< pre > < code > $ ./run.sh -s http://localhost:8081/solr/statistics-2010 -a export -o /tmp/statistics-2010.json -k uid
$ ./run.sh -s http://localhost:8081/solr/statistics -a import -o /tmp/statistics-2010.json -k uid
$ curl -s "http://localhost:8081/solr/statistics-2010/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "&lt;delete&gt;&lt;query&gt;time:2010*&lt;/query&gt;&lt;/delete&gt;"
$ ./run.sh -s http://localhost:8081/solr/statistics-2011 -a export -o /tmp/statistics-2011.json -k uid
$ ./run.sh -s http://localhost:8081/solr/statistics -a import -o /tmp/statistics-2011.json -k uid
$ curl -s "http://localhost:8081/solr/statistics-2011/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "&lt;delete&gt;&lt;query&gt;time:2011*&lt;/query&gt;&lt;/delete&gt;"
$ ./run.sh -s http://localhost:8081/solr/statistics-2012 -a export -o /tmp/statistics-2012.json -k uid
$ ./run.sh -s http://localhost:8081/solr/statistics -a import -o /tmp/statistics-2012.json -k uid
$ curl -s 'http://localhost:8081/solr/statistics/select?q=time:2012*&amp;rows=0&amp;wt=json&amp;indent=true' | grep numFound
"response":{"numFound":3761989,"start":0,"docs":[]
$ curl -s 'http://localhost:8081/solr/statistics-2012/select?q=time:2012*&amp;rows=0&amp;wt=json&amp;indent=true' | grep numFound
"response":{"numFound":3761989,"start":0,"docs":[]
$ curl -s "http://localhost:8081/solr/statistics-2012/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "&lt;delete&gt;&lt;query&gt;time:2012*&lt;/query&gt;&lt;/delete&gt;"
< / code > < / pre > < ul >
< li > I will do this for as many cores as I can (disk space limited) and then monitor the effect on the system and JVM memory usage
< ul >
< li > Exporting half years might work, using a filter query with months as a regular expression:< / li >
< / ul >
< / li >
< / ul >
< pre > < code > $ ./run.sh -s http://localhost:8081/solr/statistics-2014 -a export -o /tmp/statistics-2014-1.json -k uid -f 'time:/2014-0[1-6].*/'
< / code > < / pre > < ul >
< li > Upgrade PostgreSQL from 9.6 to 10 on DSpace Test (linode19)
< ul >
< li > I’ve been running it for one month in my local environment, and others have reported on the dspace-tech mailing list that they are using 10 and 11< / li >
< / ul >
< / li >
< / ul >
< pre > < code > # apt install postgresql-10 postgresql-contrib-10
# systemctl stop tomcat7
# pg_ctlcluster 9.6 main stop
# tar -cvzpf var-lib-postgresql-9.6.tar.gz /var/lib/postgresql/9.6
# tar -cvzpf etc-postgresql-9.6.tar.gz /etc/postgresql/9.6
# pg_ctlcluster 10 main stop
# pg_dropcluster 10 main
# pg_upgradecluster 9.6 main
# pg_dropcluster 9.6 main
# dpkg -l | grep postgresql | grep 9.6 | awk '{print $2}' | xargs dpkg -r
< / code > < / pre >
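< p > Before the final < code > dpkg -r< / code > step above, it may be worth reviewing exactly which packages would be removed. The sketch below uses a hypothetical < code > list_pg_packages< / code > helper that reads < code > dpkg -l< / code > style text on standard input and matches on the package name only, so the dot in < code > 9.6< / code > is treated literally rather than as a regular-expression wildcard. It lists only packages in the installed (< code > ii< / code > ) state, which is an assumption about what should be removed.< / p >

```shell
# Hypothetical helper: print installed package names for one PostgreSQL major version.
# Reads `dpkg -l`-style text on stdin, so it can be tried without changing the system.
list_pg_packages() {
  awk -v ver="$1" '
    # keep rows in state "ii" whose package name mentions postgresql and ends in -<ver>
    $1 == "ii" && index($2, "postgresql") && index($2, "-" ver) { print $2 }
  '
}

# Possible usage (review the list first, then pipe it to `xargs dpkg -r` if it looks right):
#   dpkg -l | list_pg_packages 9.6
```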
< / article >
< / div > <!-- /.blog-main -->
< aside class = "col-sm-3 ml-auto blog-sidebar" >
< section class = "sidebar-module" >
< h4 > Recent Posts< / h4 >
< ol class = "list-unstyled" >
< li > < a href = "/cgspace-notes/2020-03/" > March, 2020< / a > < / li >
< li > < a href = "/cgspace-notes/2020-02/" > February, 2020< / a > < / li >
< li > < a href = "/cgspace-notes/2020-01/" > January, 2020< / a > < / li >
< li > < a href = "/cgspace-notes/2019-12/" > December, 2019< / a > < / li >
< li > < a href = "/cgspace-notes/2019-11/" > November, 2019< / a > < / li >
< / ol >
< / section >
< section class = "sidebar-module" >
< h4 > Links< / h4 >
< ol class = "list-unstyled" >
< li > < a href = "https://cgspace.cgiar.org" > CGSpace< / a > < / li >
< li > < a href = "https://dspacetest.cgiar.org" > DSpace Test< / a > < / li >
< li > < a href = "https://github.com/ilri/DSpace" > CGSpace @ GitHub< / a > < / li >
< / ol >
< / section >
< / aside >
< / div > <!-- /.row -->
< / div > <!-- /.container -->
< footer class = "blog-footer" >
< p dir = "auto" >
Blog template created by < a href = "https://twitter.com/mdo" > @mdo< / a > , ported to Hugo by < a href = 'https://twitter.com/mralanorth' > @mralanorth< / a > .
< / p >
< p >
< a href = "#" > Back to top< / a >
< / p >
< / footer >
< / body >
< / html >