<!DOCTYPE html>
<html lang="en" >
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta property="og:title" content="Posts" />
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Posts"/>
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
<meta name="generator" content="Hugo 0.73.0" />
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Blog",
"headline": "CGSpace Notes",
"url" : "https://alanorth.github.io/cgspace-notes/posts/",
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"dateModified": "2020-06-01T13:55:39+03:00",
"keywords": "notes,migration",
"description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
}
</script>
<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/posts/">
<title>CGSpace Notes</title>
<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.6da5c906cc7a8fbb93f31cd2316c5dbe3f19ac4aa6bfb066f1243045b8f6061e.css" rel="stylesheet" integrity="sha256-baXJBsx6j7uT8xzSMWxdvj8ZrEqmv7Bm8SQwRbj2Bh4=" crossorigin="anonymous">
<!-- minified Font Awesome for SVG icons -->
<script defer src="https://alanorth.github.io/cgspace-notes/js/fontawesome.min.f3d2a1f5980bab30ddd0d8cadbd496475309fc48e2b1d052c5c09e6facffcb0f.js" integrity="sha256-89Kh9ZgLqzDd0NjK29SWR1MJ/EjisdBSxcCeb6z/yw8=" crossorigin="anonymous"></script>
<!-- RSS 2.0 feed -->
<link rel="alternate" type="application/rss+xml" href="https://alanorth.github.io/cgspace-notes/posts/index.xml" title="CGSpace Notes" />
</head>
<body>
<div class="blog-masthead">
<div class="container">
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>
<header class="blog-header">
<div class="container">
<h1 class="blog-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
<p class="lead blog-description" dir="auto">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
</div>
</header>
<div class="container">
<div class="row">
<div class="col-sm-8 blog-main">
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-06/">June, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-06-01T13:55:39+03:00">Mon Jun 01, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-06-01">2020-06-01</h2>
<ul>
<li>I tried to run the <code>AtomicStatisticsUpdateCLI</code> CUA migration script on DSpace Test (linode26) again and it is still going very slowly and has tons of errors like I noticed yesterday
<ul>
<li>I sent Atmire the dspace.log from today and told them to log into the server to debug the process</li>
</ul>
</li>
<li>In other news, I checked the statistics API on DSpace 6 and it&rsquo;s working</li>
<li>I tried to build the OAI registry on the freshly migrated DSpace 6 on DSpace Test and I get an error:</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-06/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-05/">May, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-05-02T09:52:04+03:00">Sat May 02, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-05-02">2020-05-02</h2>
<ul>
<li>Peter said that CTA is having problems submitting an item to CGSpace
<ul>
<li>Looking at the PostgreSQL stats it seems to be the same issue that Tezira was having last week, as I see the number of connections in &lsquo;idle in transaction&rsquo; and &lsquo;waiting for lock&rsquo; state are increasing again</li>
<li>I see that CGSpace (linode18) is still using PostgreSQL JDBC driver version 42.2.11, and there were some bugs related to transactions fixed in 42.2.12 (which I had updated in the Ansible playbooks, but not deployed yet)</li>
</ul>
</li>
</ul>
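<p>To watch connection states like the ones described above, the per-connection <code>state</code> column of PostgreSQL's <code>pg_stat_activity</code> view can be tallied. The following is only a minimal sketch: the helper name <code>tally_states</code> is hypothetical, and it assumes the states arrive on standard input one per line.</p>

```shell
# Hypothetical helper (not from the original notes): tally connection
# states read from stdin, one state per line, most frequent first.
# Blank lines (connections with no state) are skipped.
tally_states() {
    awk 'NF { n[$0]++ } END { for (s in n) print n[s], s }' | sort -rn
}
```

<p>It could be fed with something like <code>psql -At -c "SELECT state FROM pg_stat_activity" | tally_states</code>, assuming <code>psql</code> can already connect with the right user and database. A growing count next to <code>idle in transaction</code> would match the symptom noted here.</p>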
<a href='https://alanorth.github.io/cgspace-notes/2020-05/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-04/">April, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-04-02T10:53:24+03:00">Thu Apr 02, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-04-02">2020-04-02</h2>
<ul>
<li>Maria asked me to update Charles Staver&rsquo;s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it
<ul>
<li>I updated the fifty-eight existing items on CGSpace</li>
</ul>
</li>
<li>Looking into the items Udana had asked about last week that were missing Altmetric donuts:
<ul>
<li><a href="https://hdl.handle.net/10568/103225">The first</a> is still missing its DOI, so I added it and <a href="https://twitter.com/mralanorth/status/1245632619661766657">tweeted its handle</a> (after a few hours there was a donut with score 222)</li>
<li><a href="https://hdl.handle.net/10568/106899">The second item</a> now has a donut with score 2 since I <a href="https://twitter.com/mralanorth/status/1243158045540134913">tweeted its handle</a> last week</li>
<li><a href="https://hdl.handle.net/10568/107258">The third item</a> now has a donut with score 1 since I <a href="https://twitter.com/mralanorth/status/1243158786392625153">tweeted it</a> last week</li>
</ul>
</li>
<li>On the same note, the <a href="https://hdl.handle.net/10568/106573">one item</a> Abenet pointed out last week now has a donut with score of 104 after I <a href="https://twitter.com/mralanorth/status/1243163710241345536">tweeted it</a> last week</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-04/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-03/">March, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-03-02T12:31:30+02:00">Mon Mar 02, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-03-02">2020-03-02</h2>
<ul>
<li>Update <a href="https://github.com/ilri/dspace-statistics-api">dspace-statistics-api</a> for DSpace 6+ UUIDs
<ul>
<li>Tag version 1.2.0 on GitHub</li>
</ul>
</li>
<li>Test migrating legacy Solr statistics to UUIDs with the as-of-yet unreleased <a href="https://github.com/DSpace/DSpace/commit/184f2b2153479045fba6239342c63e7f8564b8b6#diff-0350ce2e13b28d5d61252b7a8f50a059">SolrUpgradePre6xStatistics.java</a>
<ul>
<li>You need to download this into the DSpace 6.x source and compile it</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-03/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-02/">February, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-02-02T11:56:30+02:00">Sun Feb 02, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-02-02">2020-02-02</h2>
<ul>
<li>Continue working on porting CGSpace&rsquo;s DSpace 5 code to DSpace 6.3 that I started yesterday
<ul>
<li>Sign up for an account with MaxMind so I can get the GeoLite2-City.mmdb database</li>
<li>I still need to wire up the API credentials and cron job into the Ansible infrastructure playbooks</li>
<li>Fix some minor issues in the config and XMLUI themes, like removing Atmire stuff</li>
<li>The code finally builds and runs with a fresh install</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-02/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-01/">January, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-01-06T10:48:30+02:00">Mon Jan 06, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-01-06">2020-01-06</h2>
<ul>
<li>Open <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=706">a ticket</a> with Atmire to request a quote for the upgrade to DSpace 6</li>
<li>Last week Altmetric responded about the <a href="https://hdl.handle.net/10568/97087">item</a> that had a lower score than its DOI
<ul>
<li>The score is now linked to the DOI</li>
<li>Another <a href="https://hdl.handle.net/10568/91278">item</a> that had the same problem in 2019 is now also linked to the score for its DOI</li>
<li>Another <a href="https://hdl.handle.net/10568/81236">item</a> that had the same problem in 2019 has also been fixed</li>
</ul>
</li>
</ul>
<h2 id="2020-01-07">2020-01-07</h2>
<ul>
<li>Peter Ballantyne highlighted one more WLE <a href="https://hdl.handle.net/10568/101286">item</a> that is missing the Altmetric score that its DOI has
<ul>
<li>The DOI has a score of 259, but the Handle has no score at all</li>
<li>I <a href="https://twitter.com/mralanorth/status/1214471427157626881">tweeted</a> the CGSpace repository link</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-01/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-12/">December, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-12-01T11:22:30+02:00">Sun Dec 01, 2019</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-12-01">2019-12-01</h2>
<ul>
<li>Upgrade CGSpace (linode18) to Ubuntu 18.04:
<ul>
<li>Check any packages that have residual configs and purge them:</li>
<li><code># dpkg -l | grep -E '^rc' | awk '{print $2}' | xargs dpkg -P</code></li>
<li>Make sure all packages are up to date and the package manager is up to date, then reboot:</li>
</ul>
</li>
</ul>
<pre><code># apt update &amp;&amp; apt full-upgrade
# apt-get autoremove &amp;&amp; apt-get autoclean
# dpkg -C
# reboot
</code></pre>
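<p>Before purging, it can help to preview which packages are in the <code>rc</code> state (removed, but with configuration files left behind). This is a minimal sketch, not part of the original upgrade notes; the function name <code>list_residual</code> is hypothetical, and it reads <code>dpkg -l</code> style output from standard input so it can be inspected without changing anything.</p>

```shell
# Hypothetical helper: print the names of packages whose dpkg status
# column is exactly "rc". Header lines from `dpkg -l` never match.
list_residual() {
    awk '$1 == "rc" { print $2 }'
}
```

<p>For example, <code>dpkg -l | list_residual</code> only lists the candidates, while <code>dpkg -l | list_residual | xargs -r dpkg -P</code> purges them. The <code>-r</code> flag is a GNU <code>xargs</code> option that skips running <code>dpkg</code> when the list is empty.</p>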
<a href='https://alanorth.github.io/cgspace-notes/2019-12/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-11/">November, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-11-04T12:20:30+02:00">Mon Nov 04, 2019</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-11-04">2019-11-04</h2>
<ul>
<li>Peter noticed that there were 5.2 million hits on CGSpace in 2019-10 according to the Atmire usage statistics
<ul>
<li>I looked in the nginx logs and see 4.6 million in the access logs, and 1.2 million in the API logs:</li>
</ul>
</li>
</ul>
<pre><code># zcat --force /var/log/nginx/*access.log.*.gz | grep -cE &quot;[0-9]{1,2}/Oct/2019&quot;
4671942
# zcat --force /var/log/nginx/{rest,oai,statistics}.log.*.gz | grep -cE &quot;[0-9]{1,2}/Oct/2019&quot;
1277694
</code></pre><ul>
<li>So 4.6 million from XMLUI and another 1.2 million from API requests</li>
<li>Let&rsquo;s see how many of the REST API requests were for bitstreams (because they are counted in Solr stats):</li>
</ul>
<pre><code># zcat --force /var/log/nginx/rest.log.*.gz | grep -c -E &quot;[0-9]{1,2}/Oct/2019&quot;
1183456
# zcat --force /var/log/nginx/rest.log.*.gz | grep -E &quot;[0-9]{1,2}/Oct/2019&quot; | grep -c -E &quot;/rest/bitstreams&quot;
106781
</code></pre>
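<p>The counting steps above repeat the same pipeline with different files and filters, so they can be wrapped in one small function. This is only a sketch: the name <code>count_hits</code> and its argument order are hypothetical, and it assumes GNU <code>zcat</code> and <code>grep</code> as used in the commands above.</p>

```shell
# Hypothetical helper: count log lines for one month, optionally narrowed
# by an extended-regex pattern on the rest of the line.
# Usage: count_hits MONTH YEAR PATTERN FILE...
# An empty PATTERN ("") counts every line for that month.
count_hits() {
    month="$1"; year="$2"; pattern="$3"
    shift 3
    # zcat --force reads gzipped and plain-text logs alike
    zcat --force "$@" \
        | grep -E "[0-9]{1,2}/${month}/${year}" \
        | grep -c -E "${pattern}"
}
```

<p>For instance, <code>count_hits Oct 2019 "/rest/bitstreams" /var/log/nginx/rest.log.*.gz</code> corresponds to the last command above. With the figures shown, 106781 of 1183456 REST requests were for bitstreams, which is roughly 9%.</p>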
<a href='https://alanorth.github.io/cgspace-notes/2019-11/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/">CGSpace CG Core v2 Migration</a></h2>
<p class="blog-post-meta"><time datetime="2019-10-28T13:27:35+02:00">Mon Oct 28, 2019</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
<span class="fas fa-tag" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/tags/migration/" rel="tag">Migration</a>
</p>
</header>
<p>Possible changes to CGSpace metadata fields to align more with DC, QDC, and DCTERMS as well as CG Core v2.</p>
<p>With reference to <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2 draft standard</a> by Marie-Angélique as well as <a href="http://www.dublincore.org/specifications/dublin-core/dcmi-terms/">DCMI DCTERMS</a>.</p>
<a href='https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-10/">October, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-10-01T13:20:51+03:00">Tue Oct 01, 2019</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-10-01">2019-10-01</h2>
<ul>
<li>Udana from IWMI asked me for a CSV export of their community on CGSpace
<ul>
<li>I exported it, but a quick run through the <code>csv-metadata-quality</code> tool shows that there are some low-hanging fruits we can fix before I send him the data</li>
<li>I will limit the scope to the titles, regions, subregions, and river basins for now to manually fix some non-breaking spaces (U+00A0) there that would otherwise be removed by the csv-metadata-quality script&rsquo;s &ldquo;unnecessary Unicode&rdquo; fix:</li>
</ul>
</li>
</ul>
<pre><code>$ csvcut -c 'id,dc.
</code></pre>
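<p>To see where the non-breaking spaces mentioned above occur before fixing them by hand, the raw bytes of U+00A0 in UTF-8 (0xC2 0xA0) can be searched for directly. This is a minimal sketch under that encoding assumption; the function name <code>find_nbsp</code> is hypothetical and not part of csv-metadata-quality.</p>

```shell
# Hypothetical helper: print "line_number:line" for every line that
# contains a UTF-8 encoded non-breaking space (U+00A0).
find_nbsp() {
    nbsp=$(printf '\302\240')   # octal escapes for bytes 0xC2 0xA0
    grep -n "$nbsp" "$@"
}
```

<p>Running <code>find_nbsp export.csv</code> (the file name is an example) lists the affected rows so they can be reviewed in context.</p>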
<a href='https://alanorth.github.io/cgspace-notes/2019-10/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary disabled" href="#" role="button" aria-disabled="true">Previous page</a>
<a class="btn btn-outline-primary" href="/cgspace-notes/posts/page/2/" rel="next" role="button">Next page</a>
</nav>
</div> <!-- /.blog-main -->
<aside class="col-sm-3 ml-auto blog-sidebar">
<section class="sidebar-module">
<h4>Recent Posts</h4>
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2020-06/">June, 2020</a></li>
<li><a href="/cgspace-notes/2020-05/">May, 2020</a></li>
<li><a href="/cgspace-notes/2020-04/">April, 2020</a></li>
<li><a href="/cgspace-notes/2020-03/">March, 2020</a></li>
<li><a href="/cgspace-notes/2020-02/">February, 2020</a></li>
</ol>
</section>
<section class="sidebar-module">
<h4>Links</h4>
<ol class="list-unstyled">
<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
</ol>
</section>
</aside>
</div> <!-- /.row -->
</div> <!-- /.container -->
<footer class="blog-footer">
<p dir="auto">
Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
</p>
<p>
<a href="#">Back to top</a>
</p>
</footer>
</body>
</html>