<!DOCTYPE html>
<html lang="en" >
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta property="og:title" content="December, 2019" />
<meta property="og:description" content="2019-12-01
Upgrade CGSpace (linode18) to Ubuntu 18.04:
Check any packages that have residual configs and purge them:
# dpkg -l | grep -E &#39;^rc&#39; | awk &#39;{print $2}&#39; | xargs dpkg -P
Make sure all packages are up to date and the package manager is up to date, then reboot:
# apt update &amp;&amp; apt full-upgrade
# apt-get autoremove &amp;&amp; apt-get autoclean
# dpkg -C
# reboot
" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2019-12/" />
<meta property="article:published_time" content="2019-12-01T11:22:30+02:00" />
<meta property="article:modified_time" content="2019-12-17T14:49:24+02:00" />
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="December, 2019"/>
<meta name="twitter:description" content="2019-12-01
Upgrade CGSpace (linode18) to Ubuntu 18.04:
Check any packages that have residual configs and purge them:
# dpkg -l | grep -E &#39;^rc&#39; | awk &#39;{print $2}&#39; | xargs dpkg -P
Make sure all packages are up to date and the package manager is up to date, then reboot:
# apt update &amp;&amp; apt full-upgrade
# apt-get autoremove &amp;&amp; apt-get autoclean
# dpkg -C
# reboot
"/>
<meta name="generator" content="Hugo 0.61.0" />
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "BlogPosting",
"headline": "December, 2019",
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2019-12\/",
"wordCount": "928",
"datePublished": "2019-12-01T11:22:30+02:00",
"dateModified": "2019-12-17T14:49:24+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"keywords": "Notes"
}
</script>
<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/2019-12/">
<title>December, 2019 | CGSpace Notes</title>
<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.css" rel="stylesheet" integrity="sha384-G5B34w7DFTumWTswxYzTX7NWfbvQEg1HbFFEg6ItN03uTAAoS2qkPS/fu3LhuuSA" crossorigin="anonymous">
<!-- RSS 2.0 feed -->
</head>
<body>
<div class="blog-masthead">
<div class="container">
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>
<header class="blog-header">
<div class="container">
<h1 class="blog-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
<p class="lead blog-description" dir="auto">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
</div>
</header>
<div class="container">
<div class="row">
<div class="col-sm-8 blog-main">
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-12/">December, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-12-01T11:22:30&#43;02:00">Sun Dec 01, 2019</time> by Alan Orth in
<i class="fa fa-folder" aria-hidden="true"></i>&nbsp;<a href="/cgspace-notes/categories/notes" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-12-01">2019-12-01</h2>
<ul>
<li>Upgrade CGSpace (linode18) to Ubuntu 18.04:
<ul>
<li>Check any packages that have residual configs and purge them:</li>
<li><code># dpkg -l | grep -E '^rc' | awk '{print $2}' | xargs dpkg -P</code></li>
<li>Make sure all packages are up to date and the package manager is up to date, then reboot:</li>
</ul>
</li>
</ul>
<pre><code># apt update &amp;&amp; apt full-upgrade
# apt-get autoremove &amp;&amp; apt-get autoclean
# dpkg -C
# reboot
</code></pre><ul>
<li>Take some backups:</li>
</ul>
<pre><code># dpkg -l &gt; 2019-12-01-linode18-dpkg.txt
# tar czf 2019-12-01-linode18-etc.tar.gz /etc
</code></pre><ul>
<li>Then check all third-party repositories in /etc/apt to see if everything using &ldquo;xenial&rdquo; has packages available for &ldquo;bionic&rdquo; and then update the sources:</li>
<li><code># sed -i 's/xenial/bionic/' /etc/apt/sources.list.d/*.list</code></li>
<li>Pause the Uptime Robot monitoring for CGSpace</li>
<li>Make sure the update manager is installed and do the upgrade:</li>
</ul>
<pre><code># apt install update-manager-core
# do-release-upgrade
</code></pre><ul>
<li>After the upgrade finishes, remove Java 11, force the installation of bionic nginx, and reboot the server:</li>
</ul>
<pre><code># apt purge openjdk-11-jre-headless
# apt install 'nginx=1.16.1-1~bionic'
# reboot
</code></pre><ul>
<li>After the server comes back up, remove Python virtualenvs that were created with Python 3.5 and re-run certbot to make sure it's working:</li>
</ul>
<pre><code># rm -rf /opt/eff.org/certbot/venv/bin/letsencrypt
# rm -rf /opt/ilri/dspace-statistics-api/venv
# /opt/certbot-auto
</code></pre><ul>
<li>Clear Ansible's fact cache and re-run the playbooks to update the system's firewalls, SSH config, etc</li>
<li>Altmetric finally responded to my question about Dublin Core fields
<ul>
<li>They shared a <a href="https://help.altmetric.com/support/solutions/articles/6000141419-what-metadata-is-required-to-track-our-content-">list of fields they use for tracking</a>, but it only mentions HTML meta tags, and not fields considered when harvesting via OAI</li>
<li>Anyway, there may be room to improve our HTML meta tags: looking at one <a href="https://hdl.handle.net/10568/101623">item with a DOI, ISSN, etc</a>, I see that we could at least add status (Open Access) and journal title</li>
<li>I merged a <a href="https://github.com/ilri/DSpace/pull/438">pull request</a> into the <code>5_x-prod</code> branch to add status and journal title to the XHTML meta tags</li>
</ul>
</li>
</ul>
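<p>For the Ubuntu 18.04 upgrade steps above, it can help to look before purging or rewriting anything. The following is a minimal sketch, not the procedure that was actually run: the function names <code>list_residual_packages</code> and <code>preview_release_switch</code> are made up for illustration, and the second one applies the same <code>sed</code> expression as above, but without <code>-i</code>, so the sources files stay untouched.</p>

```shell
#!/bin/sh
# Print the names of packages that dpkg reports in state "rc"
# (removed, but configuration files remain).
# Reads `dpkg -l` output on stdin.
list_residual_packages() {
    awk '$1 == "rc" { print $2 }'
}

# Print what the xenial -> bionic substitution would produce for one
# apt sources file, without modifying the file (no -i).
preview_release_switch() {
    sed 's/xenial/bionic/' "$1"
}

# Possible use on the server (commented out so the sketch has no side effects):
# dpkg -l | list_residual_packages
# for f in /etc/apt/sources.list.d/*.list; do
#     preview_release_switch "$f" | diff "$f" -
# done
```

<p>Reviewing the package list and the diff first makes it easier to spot a third-party repository that has no &ldquo;bionic&rdquo; suite before the real commands are run.</p>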
<h2 id="2019-12-02">2019-12-02</h2>
<ul>
<li>Raise the issue of old, low-quality thumbnails with Peter and the CGSpace team
<ul>
<li>I suggested that we move manually uploaded thumbnails from the <code>ORIGINAL</code> bundle to the <code>THUMBNAIL</code> bundle</li>
<li>Also replace old thumbnails where an item is available on Slideshare or YouTube because those are easy to get new, high-quality thumbnails for</li>
</ul>
</li>
<li>Continue testing CG Core v2 implementation on DSpace Test
<ul>
<li>Compare the OAI QDC representation of a few items on CGSpace vs DSpace Test:</li>
</ul>
</li>
</ul>
<pre><code>$ http 'https://cgspace.cgiar.org/oai/request?verb=GetRecord&amp;metadataPrefix=oai_dc&amp;identifier=oai:cgspace.cgiar.org:10568/104030' &gt; /tmp/cgspace-104030.xml
$ http 'https://dspacetest.cgiar.org/oai/request?verb=GetRecord&amp;metadataPrefix=oai_dc&amp;identifier=oai:cgspace.cgiar.org:10568/104030' &gt; /tmp/dspacetest-104030.xml
</code></pre><ul>
<li>The DSpace Test one now captures the DOI, whereas the CGSpace one doesn't&hellip;</li>
<li>And the DSpace Test one doesn't include review status as <code>dc.description</code>, but I don't think that's an important field</li>
</ul>
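<p>One way to compare the two saved OAI responses is to reduce each to a sorted list of its Dublin Core elements and diff the lists. This is only a sketch: the function names are hypothetical, and it assumes the <code>dc:</code> elements carry no attributes and contain no nested markup, which a real XML parser would not need to assume.</p>

```shell
#!/bin/sh
# List the Dublin Core elements in a saved OAI GetRecord response,
# one per line, sorted. Assumes elements look like <dc:name>value</dc:name>.
dc_elements() {
    grep -o '<dc:[a-z]*>[^<]*</dc:[a-z]*>' "$1" | sort
}

# Show the elements that appear in only one of two responses.
# Lines starting with "<" are only in the first file, ">" only in the second.
# Returns diff's exit status (0 when the element lists are identical).
compare_oai_records() {
    cmp_a=$(mktemp)
    cmp_b=$(mktemp)
    dc_elements "$1" > "$cmp_a"
    dc_elements "$2" > "$cmp_b"
    diff "$cmp_a" "$cmp_b"
    cmp_status=$?
    rm -f "$cmp_a" "$cmp_b"
    return $cmp_status
}

# Possible use with the files saved above (commented out):
# compare_oai_records /tmp/cgspace-104030.xml /tmp/dspacetest-104030.xml
```

<p>Sorting first keeps the diff focused on which fields are present rather than on the order in which each server emits them.</p>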
<h2 id="2019-12-04">2019-12-04</h2>
<ul>
<li>Peter noticed that there were about seventy items on CGSpace that were marked as private
<ul>
<li>Some have been withdrawn, but I extracted a list of the forty-eight that were not:</li>
</ul>
</li>
</ul>
<pre><code>dspace=# \COPY (SELECT handle, owning_collection FROM item, handle WHERE item.discoverable='f' AND item.in_archive='t' AND handle.resource_id = item.item_id) to /tmp/2019-12-04-CGSpace-private-items.csv WITH CSV HEADER;
COPY 48
</code></pre><h2 id="2019-12-05">2019-12-05</h2>
<ul>
<li>Give <a href="https://hdl.handle.net/10568/106045">presentation about CG Core v2</a> to the MEL Developers&rsquo; Retreat in Nairobi, Kenya (via Skype)</li>
<li>Send some pull requests to the cg-core schema repository:
<ul>
<li><a href="https://github.com/AgriculturalSemantics/cg-core/pull/16">HTML syntax fixes</a></li>
<li><a href="https://github.com/AgriculturalSemantics/cg-core/pull/17">Add LICENSE file</a></li>
<li><a href="https://github.com/AgriculturalSemantics/cg-core/pull/18">Build main.css using npm build</a></li>
</ul>
</li>
</ul>
<h2 id="2019-12-08">2019-12-08</h2>
<ul>
<li>Enrico noticed that the AReS Explorer on CGSpace (linode18) was down
<ul>
<li>I only see HTTP 502 in the nginx logs on CGSpace&hellip; so I assume it's something wrong with the AReS server</li>
<li>I ran all system updates on the AReS server (linode20) and rebooted it</li>
<li>After the reboot, the Explorer was accessible again</li>
</ul>
</li>
</ul>
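<p>A quick way to confirm that nginx is returning only HTTP 502 for a given path is to tally status codes from the access log. The sketch below assumes nginx's default &ldquo;combined&rdquo; log format; the function name <code>status_counts</code>, the log path, and the <code>/explorer</code> prefix in the usage comment are assumptions, not taken from the CGSpace configuration.</p>

```shell
#!/bin/sh
# Tally HTTP status codes for requests whose path starts with a prefix.
# Reads an nginx access log in the default "combined" format on stdin,
# where field 7 is the request path and field 9 is the status code.
status_counts() {
    awk -v prefix="$1" 'index($7, prefix) == 1 { n[$9]++ }
        END { for (s in n) print s, n[s] }' | sort
}

# Hypothetical use on the server (log path and URL prefix are assumptions):
# status_counts /explorer < /var/log/nginx/access.log
```

<p>If every line of output is a 502, the proxy itself is working and the upstream server is the likely culprit.</p>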
<h2 id="2019-12-09">2019-12-09</h2>
<ul>
<li>Update PostgreSQL JDBC driver to <a href="https://jdbc.postgresql.org/documentation/changelog.html#version_42.2.9">version 42.2.9</a> in <a href="https://github.com/ilri/rmg-ansible-public">Ansible playbooks</a>
<ul>
<li>Deploy on DSpace Test (linode19) to test before deploying on CGSpace in a few days</li>
</ul>
</li>
<li>Altmetric responded to my question about <a href="https://hdl.handle.net/10568/97087">the WLE item</a> that has a lower score than its DOI
<ul>
<li>They say that they will &ldquo;reprocess&rdquo; the item &ldquo;before Christmas&rdquo;</li>
</ul>
</li>
</ul>
<h2 id="2019-12-11">2019-12-11</h2>
<ul>
<li>Post <a href="https://www.yammer.com/dspacedevelopers/#/Threads/show?threadId=454830191804416">message to Yammer about good practices for thumbnails on CGSpace</a>
<ul>
<li>On the topic of thumbnails, I'm thinking we might want to force regenerate all PDF thumbnails on CGSpace since we upgraded it to Ubuntu 18.04 and got a new ghostscript&hellip;</li>
</ul>
</li>
<li>More discussion about report formats for AReS</li>
<li>Peter noticed that the Atmire reports weren't showing any statistics before 2019
<ul>
<li>I checked and indeed Solr had an issue loading some cores the last time it was started</li>
<li>I restarted Tomcat three times before all cores came up successfully</li>
</ul>
</li>
<li>While I was restarting the Tomcat service I upgraded the PostgreSQL JDBC driver to version 42.2.9, which had been deployed on DSpace Test earlier this week</li>
</ul>
<h2 id="2019-12-16">2019-12-16</h2>
<ul>
<li>Visit CodeObia office to discuss next phase of OpenRXV/AReS development
<ul>
<li>We discussed using CSV instead of Excel for tabular reports
<ul>
<li>OpenRXV should only have &ldquo;simple&rdquo; reports with Dublin Core fields</li>
<li>AReS should have this as well as a customized &ldquo;extended&rdquo; report that has CRPs, Subjects, Sponsors, etc from CGSpace</li>
</ul>
</li>
<li>We discussed using RTF instead of Word for graphical reports</li>
</ul>
</li>
</ul>
<h2 id="2019-12-17">2019-12-17</h2>
<ul>
<li>Start filing GitHub issues for the reporting features on OpenRXV and AReS
<ul>
<li>I created an issue for the &ldquo;simple&rdquo; tabular reports on OpenRXV GitHub (<a href="https://github.com/ilri/OpenRXV/issues/29">#29</a>)</li>
<li>I created an issue for the &ldquo;extended&rdquo; tabular reports on AReS GitHub (<a href="https://github.com/ilri/AReS/issues/8">#8</a>)</li>
<li>I created an issue for &ldquo;simple&rdquo; text reports on the OpenRXV GitHub (<a href="https://github.com/ilri/OpenRXV/issues/30">#30</a>)</li>
<li>I created an issue for &ldquo;extended&rdquo; text reports on the AReS GitHub (<a href="https://github.com/ilri/AReS/issues/9">#9</a>)</li>
</ul>
</li>
<li>I looked into creating RTF documents from HTML in Node.js and there is a library called <a href="https://www.npmjs.com/package/html-to-rtf">html-to-rtf</a> that works well, but doesn't support images</li>
<li>Export a list of all investors (<code>dc.description.sponsorship</code>) for Peter to look through and correct:</li>
</ul>
<pre><code>dspace=# \COPY (SELECT DISTINCT text_value as &quot;dc.contributor.sponsor&quot;, count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 29 GROUP BY text_value ORDER BY count DESC LIMIT 1500) to /tmp/2019-12-17-investors.csv WITH CSV HEADER;
COPY 643
</code></pre>
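<p>Before someone reviews the exported investors list by hand, it may be worth flagging names that differ only in letter case or spacing. The following sketch assumes the two-column CSV produced above (a header row, then value and count) and that no value spans several lines; the function name <code>duplicate_sponsors</code> is hypothetical.</p>

```shell
#!/bin/sh
# Print sponsor names that occur more than once after normalizing
# case and whitespace. Reads the exported CSV on stdin.
# Assumes a header row and no value that spans multiple lines;
# only ASCII letters are guaranteed to be case-folded.
duplicate_sponsors() {
    sed -e '1d' \
        -e 's/,[0-9]*$//' \
        -e 's/^"\(.*\)"$/\1/' \
        -e 's/""/"/g' |
    tr '[:upper:]' '[:lower:]' |
    sed -e 's/[[:space:]][[:space:]]*/ /g' -e 's/^ //' -e 's/ $//' |
    sort | uniq -d
}

# Possible use with the exported file (commented out):
# duplicate_sponsors < /tmp/2019-12-17-investors.csv
```

<p>Each line of output is a normalized name with at least two spelling variants in the export, which gives a short list of obvious candidates for correction.</p>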
</article>
</div> <!-- /.blog-main -->
<aside class="col-sm-3 ml-auto blog-sidebar">
<section class="sidebar-module">
<h4>Recent Posts</h4>
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2019-12/">December, 2019</a></li>
<li><a href="/cgspace-notes/2019-11/">November, 2019</a></li>
<li><a href="/cgspace-notes/cgspace-cgcorev2-migration/">CGSpace CG Core v2 Migration</a></li>
<li><a href="/cgspace-notes/2019-10/">October, 2019</a></li>
<li><a href="/cgspace-notes/2019-09/">September, 2019</a></li>
</ol>
</section>
<section class="sidebar-module">
<h4>Links</h4>
<ol class="list-unstyled">
<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
</ol>
</section>
</aside>
</div> <!-- /.row -->
</div> <!-- /.container -->
<footer class="blog-footer">
<p dir="auto">
Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
</p>
<p>
<a href="#">Back to top</a>
</p>
</footer>
</body>
</html>