<!DOCTYPE html>
<html lang="en" >
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta property="og:title" content="CGSpace Notes" />
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-02-14T09:40:59+03:00" />
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="CGSpace Notes"/>
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
<meta name="generator" content="Hugo 0.92.1" />
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Blog",
"headline": "CGSpace Notes",
"url" : "https://alanorth.github.io/cgspace-notes/",
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"dateModified": "2022-02-01T14:06:54+02:00",
"keywords": "notes, migration, notes",
"description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
}
</script>
<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/">
<title>CGSpace Notes</title>
<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.beb8012edc08ba10be012f079d618dc243812267efe62e11f22fe49618f976a4.css" rel="stylesheet" integrity="sha256-vrgBLtwIuhC&#43;AS8HnWGNwkOBImfv5i4R8i/klhj5dqQ=" crossorigin="anonymous">
<!-- minified Font Awesome for SVG icons -->
<script defer src="https://alanorth.github.io/cgspace-notes/js/fontawesome.min.f5072c55a0721857184db93a50561d7dc13975b4de2e19db7f81eb5f3fa57270.js" integrity="sha256-9QcsVaByGFcYTbk6UFYdfcE5dbTeLhnbf4HrXz&#43;lcnA=" crossorigin="anonymous"></script>
<!-- RSS 2.0 feed -->
<link rel="alternate" type="application/rss+xml" href="https://alanorth.github.io/cgspace-notes/index.xml" title="CGSpace Notes" />
</head>
<body>
<div class="blog-masthead">
<div class="container">
<nav class="nav blog-nav">
<a class="nav-link active" href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>
<header class="blog-header">
<div class="container">
<h1 class="blog-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
<p class="lead blog-description" dir="auto">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
</div>
</header>
<div class="container">
<div class="row">
<div class="col-sm-8 blog-main">
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-08/">August, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-08-02T15:35:54+03:00">Sun Aug 02, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-08-02">2020-08-02</h2>
<ul>
<li>I spent a few days working on a Java-based curation task to tag items with ISO 3166-1 alpha-2 country codes based on their <code>cg.coverage.country</code> text values
<ul>
<li>It looks up the names in ISO 3166-1 first, and then in our CGSpace countries mapping (which has five or so of Peter&rsquo;s preferred &ldquo;display&rdquo; country names)</li>
<li>It implements a &ldquo;force&rdquo; mode too that will clear existing country codes and re-tag everything</li>
<li>It is class based so I can easily add support for other vocabularies, and the technique could even be used for organizations with mappings to ROR and Clarisa&hellip;</li>
</ul>
</li>
</ul>
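<p>The lookup order described above can be sketched roughly as follows. This is a minimal Python illustration, not the Java curation task itself: the function names, the two small tables, and the rule that already-tagged items are skipped unless <code>force</code> is set are all assumptions made for the example, and the "display" names shown are hypothetical stand-ins for the real CGSpace mapping.</p>

```python
# Hypothetical excerpt of ISO 3166-1 short names mapped to alpha-2 codes.
ISO_3166_1 = {
    "Kenya": "KE",
    "Viet Nam": "VN",
    "Tanzania, United Republic of": "TZ",
}

# Hypothetical "display" names; the real CGSpace mapping is not shown in these notes.
DISPLAY_NAMES = {
    "Vietnam": "VN",
    "Tanzania": "TZ",
}


def lookup_code(name):
    """Return the alpha-2 code for a country name, or None if unknown."""
    # ISO 3166-1 is consulted first, then the local display-name mapping.
    code = ISO_3166_1.get(name)
    if code is None:
        code = DISPLAY_NAMES.get(name)
    return code


def tag_item(countries, existing_codes, force=False):
    """Return the list of country codes an item should carry."""
    # Assumption: without force, items that already have codes are left untouched.
    if existing_codes and not force:
        return list(existing_codes)
    codes = []
    for name in countries:
        code = lookup_code(name)
        if code is not None and code not in codes:
            codes.append(code)
    return codes
```

<p>For example, <code>tag_item(["Kenya", "Vietnam"], [])</code> resolves the first name through the ISO table and the second through the display mapping, while passing <code>force=True</code> discards whatever codes the item already had and re-tags it from its country names.</p>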
<a href='https://alanorth.github.io/cgspace-notes/2020-08/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-07/">July, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-07-01T10:53:54+03:00">Wed Jul 01, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-07-01">2020-07-01</h2>
<ul>
<li>A few users noticed that CGSpace wasn&rsquo;t loading items today; item pages appear blank
<ul>
<li>I looked at the PostgreSQL locks but they don&rsquo;t seem unusual</li>
<li>I guess this is the same &ldquo;blank item page&rdquo; issue that we had a few times in 2019 that we never solved</li>
<li>I restarted Tomcat and PostgreSQL and the issue was gone</li>
</ul>
</li>
<li>Since I was restarting Tomcat anyways I decided to redeploy the latest changes from the <code>5_x-prod</code> branch and I added a note about COVID-19 items to the CGSpace frontpage at Peter&rsquo;s request</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-07/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-06/">June, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-06-01T13:55:39+03:00">Mon Jun 01, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-06-01">2020-06-01</h2>
<ul>
<li>I tried to run the <code>AtomicStatisticsUpdateCLI</code> CUA migration script on DSpace Test (linode26) again and it is still going very slowly and has tons of errors like I noticed yesterday
<ul>
<li>I sent Atmire the dspace.log from today and told them to log into the server to debug the process</li>
</ul>
</li>
<li>In other news, I checked the statistics API on DSpace 6 and it&rsquo;s working</li>
<li>I tried to build the OAI registry on the freshly migrated DSpace 6 on DSpace Test and I get an error:</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-06/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-05/">May, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-05-02T09:52:04+03:00">Sat May 02, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-05-02">2020-05-02</h2>
<ul>
<li>Peter said that CTA is having problems submitting an item to CGSpace
<ul>
<li>Looking at the PostgreSQL stats it seems to be the same issue that Tezira was having last week, as I see the number of connections in &lsquo;idle in transaction&rsquo; and &lsquo;waiting for lock&rsquo; states is increasing again</li>
<li>I see that CGSpace (linode18) is still using PostgreSQL JDBC driver version 42.2.11, and there were some bugs related to transactions fixed in 42.2.12 (which I had updated in the Ansible playbooks, but not deployed yet)</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-05/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-04/">April, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-04-02T10:53:24+03:00">Thu Apr 02, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-04-02">2020-04-02</h2>
<ul>
<li>Maria asked me to update Charles Staver&rsquo;s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it
<ul>
<li>I updated the fifty-eight existing items on CGSpace</li>
</ul>
</li>
<li>Looking into the items Udana had asked about last week that were missing Altmetric donuts:
<ul>
<li><a href="https://hdl.handle.net/10568/103225">The first</a> is still missing its DOI, so I added it and <a href="https://twitter.com/mralanorth/status/1245632619661766657">tweeted its handle</a> (after a few hours there was a donut with score 222)</li>
<li><a href="https://hdl.handle.net/10568/106899">The second item</a> now has a donut with score 2 since I <a href="https://twitter.com/mralanorth/status/1243158045540134913">tweeted its handle</a> last week</li>
<li><a href="https://hdl.handle.net/10568/107258">The third item</a> now has a donut with score 1 since I <a href="https://twitter.com/mralanorth/status/1243158786392625153">tweeted it</a> last week</li>
</ul>
</li>
<li>On the same note, the <a href="https://hdl.handle.net/10568/106573">one item</a> Abenet pointed out last week now has a donut with score of 104 after I <a href="https://twitter.com/mralanorth/status/1243163710241345536">tweeted it</a> last week</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-04/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-03/">March, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-03-02T12:31:30+02:00">Mon Mar 02, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-03-02">2020-03-02</h2>
<ul>
<li>Update <a href="https://github.com/ilri/dspace-statistics-api">dspace-statistics-api</a> for DSpace 6+ UUIDs
<ul>
<li>Tag version 1.2.0 on GitHub</li>
</ul>
</li>
<li>Test migrating legacy Solr statistics to UUIDs with the as-yet-unreleased <a href="https://github.com/DSpace/DSpace/commit/184f2b2153479045fba6239342c63e7f8564b8b6#diff-0350ce2e13b28d5d61252b7a8f50a059">SolrUpgradePre6xStatistics.java</a>
<ul>
<li>You need to download this into the DSpace 6.x source and compile it</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-03/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-02/">February, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-02-02T11:56:30+02:00">Sun Feb 02, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-02-02">2020-02-02</h2>
<ul>
<li>Continue working on porting CGSpace&rsquo;s DSpace 5 code to DSpace 6.3 that I started yesterday
<ul>
<li>Sign up for an account with MaxMind so I can get the GeoLite2-City.mmdb database</li>
<li>I still need to wire up the API credentials and cron job into the Ansible infrastructure playbooks</li>
<li>Fix some minor issues in the config and XMLUI themes, like removing Atmire stuff</li>
<li>The code finally builds and runs with a fresh install</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-02/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-01/">January, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-01-06T10:48:30+02:00">Mon Jan 06, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-01-06">2020-01-06</h2>
<ul>
<li>Open <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=706">a ticket</a> with Atmire to request a quote for the upgrade to DSpace 6</li>
<li>Last week Altmetric responded about the <a href="https://hdl.handle.net/10568/97087">item</a> that had a lower score than its DOI
<ul>
<li>The score is now linked to the DOI</li>
<li>Another <a href="https://hdl.handle.net/10568/91278">item</a> that had the same problem in 2019 is now also linked to the score for its DOI</li>
<li>Another <a href="https://hdl.handle.net/10568/81236">item</a> that had the same problem in 2019 has also been fixed</li>
</ul>
</li>
</ul>
<h2 id="2020-01-07">2020-01-07</h2>
<ul>
<li>Peter Ballantyne highlighted one more WLE <a href="https://hdl.handle.net/10568/101286">item</a> that is missing the Altmetric score that its DOI has
<ul>
<li>The DOI has a score of 259, but the Handle has no score at all</li>
<li>I <a href="https://twitter.com/mralanorth/status/1214471427157626881">tweeted</a> the CGSpace repository link</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-01/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-12/">December, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-12-01T11:22:30+02:00">Sun Dec 01, 2019</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-12-01">2019-12-01</h2>
<ul>
<li>Upgrade CGSpace (linode18) to Ubuntu 18.04:
<ul>
<li>Check any packages that have residual configs and purge them:</li>
<li><code># dpkg -l | grep -E '^rc' | awk '{print $2}' | xargs dpkg -P</code></li>
<li>Make sure all packages are up to date and the package manager is up to date, then reboot:</li>
</ul>
</li>
</ul>
<pre tabindex="0"><code># apt update &amp;&amp; apt full-upgrade
# apt-get autoremove &amp;&amp; apt-get autoclean
# dpkg -C
# reboot
</code></pre>
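<p>The residual-config check earlier in this entry keys on the <code>rc</code> status that <code>dpkg -l</code> prints for packages that were removed but still have configuration files on disk. As a rough illustration of what that filter selects, here is a small Python sketch that parses captured <code>dpkg -l</code> output; the function name is hypothetical, and it only lists package names without purging anything.</p>

```python
def residual_packages(dpkg_output):
    """Return package names whose dpkg -l status column is 'rc'."""
    names = []
    for line in dpkg_output.splitlines():
        fields = line.split()
        # 'rc' means the package was removed but its config files remain.
        if len(fields) > 1 and fields[0] == "rc":
            names.append(fields[1])
    return names
```

<p>Feeding it the text of <code>dpkg -l</code> gives the same list of names that the <code>grep</code> and <code>awk</code> stages hand to <code>xargs dpkg -P</code>, which is a handy way to review what would be purged before running the real command.</p>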
<a href='https://alanorth.github.io/cgspace-notes/2019-12/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-11/">November, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-11-04T12:20:30+02:00">Mon Nov 04, 2019</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-11-04">2019-11-04</h2>
<ul>
<li>Peter noticed that there were 5.2 million hits on CGSpace in 2019-10 according to the Atmire usage statistics
<ul>
<li>I looked in the nginx logs and see 4.6 million in the access logs, and 1.2 million in the API logs:</li>
</ul>
</li>
</ul>
<pre tabindex="0"><code># zcat --force /var/log/nginx/*access.log.*.gz | grep -cE &quot;[0-9]{1,2}/Oct/2019&quot;
4671942
# zcat --force /var/log/nginx/{rest,oai,statistics}.log.*.gz | grep -cE &quot;[0-9]{1,2}/Oct/2019&quot;
1277694
</code></pre><ul>
<li>So 4.6 million from XMLUI and another 1.2 million from API requests</li>
<li>Let&rsquo;s see how many of the REST API requests were for bitstreams (because they are counted in Solr stats):</li>
</ul>
<pre tabindex="0"><code># zcat --force /var/log/nginx/rest.log.*.gz | grep -c -E &quot;[0-9]{1,2}/Oct/2019&quot;
1183456
# zcat --force /var/log/nginx/rest.log.*.gz | grep -E &quot;[0-9]{1,2}/Oct/2019&quot; | grep -c -E &quot;/rest/bitstreams&quot;
106781
</code></pre>
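<p>The same counts could be reproduced in one pass with a short script. The following is a minimal Python sketch under a few assumptions: <code>count_hits</code> and <code>path_filter</code> are hypothetical names, the date pattern is the one used with <code>grep</code> above, and unlike <code>zcat --force</code> it only reads gzip-compressed files.</p>

```python
import gzip
import re

# Same pattern as the grep commands: a day of the month followed by /Oct/2019.
OCT_2019 = re.compile(r"[0-9]{1,2}/Oct/2019")


def count_hits(paths, path_filter=None):
    """Count lines dated October 2019, optionally only those containing path_filter."""
    total = 0
    for path in paths:
        # Text mode; errors="replace" tolerates odd bytes in request lines.
        with gzip.open(path, "rt", errors="replace") as log:
            for line in log:
                if not OCT_2019.search(line):
                    continue
                if path_filter is None or path_filter in line:
                    total += 1
    return total
```

<p>Calling <code>count_hits(glob.glob("/var/log/nginx/rest.log.*.gz"))</code> would correspond to the first command, and adding <code>"/rest/bitstreams"</code> as the second argument would correspond to the bitstream count, assuming <code>glob</code> is imported and the logs live at those paths.</p>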
<a href='https://alanorth.github.io/cgspace-notes/2019-11/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/page/2/" rel="prev" role="button">Previous page</a>
<a class="btn btn-outline-primary" href="/cgspace-notes/page/4/" rel="next" role="button">Next page</a>
</nav>
</div> <!-- /.blog-main -->
<aside class="col-sm-3 ml-auto blog-sidebar">
<section class="sidebar-module">
<h4>Recent Posts</h4>
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2022-02/">February, 2022</a></li>
<li><a href="/cgspace-notes/2022-01/">January, 2022</a></li>
<li><a href="/cgspace-notes/2021-12/">December, 2021</a></li>
<li><a href="/cgspace-notes/2021-11/">November, 2021</a></li>
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
</ol>
</section>
<section class="sidebar-module">
<h4>Links</h4>
<ol class="list-unstyled">
<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
</ol>
</section>
</aside>
</div> <!-- /.row -->
</div> <!-- /.container -->
<footer class="blog-footer">
<p dir="auto">
Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
</p>
<p>
<a href="#">Back to top</a>
</p>
</footer>
</body>
</html>