mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-11-10 00:55:47 +01:00
<!DOCTYPE html>
|
|
<html lang="en" >
|
|
|
|
<head>
|
|
<meta charset="utf-8">
|
|
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
|
|
|
|
|
|
<meta property="og:title" content="CGSpace Notes" />
|
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
|
<meta property="og:type" content="website" />
|
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
|
<meta property="og:updated_time" content="2024-06-03T14:14:00+03:00" />
|
|
|
|
|
|
|
|
<meta name="twitter:card" content="summary"/>
|
|
<meta name="twitter:title" content="CGSpace Notes"/>
|
|
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
|
<meta name="generator" content="Hugo 0.126.3">
|
|
|
|
|
|
|
|
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "Blog",
  "headline": "CGSpace Notes",
  "url" : "https://alanorth.github.io/cgspace-notes/",
  "author": {
    "@type": "Person",
    "name": "Alan Orth"
  },
  "dateModified": "2024-06-03T14:14:00+03:00",
  "keywords": "notes, migration, notes",
  "description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
}
</script>
|
|
|
|
|
|
<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/">
|
|
|
|
<title>CGSpace Notes</title>
|
|
|
|
|
|
<!-- combined, minified CSS -->
|
|
|
|
<link href="https://alanorth.github.io/cgspace-notes/css/style.c6ba80bc50669557645abe05f86b73cc5af84408ed20f1551a267bc19ece8228.css" rel="stylesheet" integrity="sha256-xrqAvFBmlVdkWr4F+GtzzFr4RAjtIPFVGiZ7wZ7Ogig=" crossorigin="anonymous">
|
|
|
|
|
|
<!-- minified Font Awesome for SVG icons -->
|
|
|
|
<script defer src="https://alanorth.github.io/cgspace-notes/js/fontawesome.min.f5072c55a0721857184db93a50561d7dc13975b4de2e19db7f81eb5f3fa57270.js" integrity="sha256-9QcsVaByGFcYTbk6UFYdfcE5dbTeLhnbf4HrXz+lcnA=" crossorigin="anonymous"></script>
|
|
|
|
<!-- RSS 2.0 feed -->
|
|
<link rel="alternate" type="application/rss+xml" href="https://alanorth.github.io/cgspace-notes/index.xml" title="CGSpace Notes" />
|
|
|
|
|
|
|
|
|
|
</head>
|
|
|
|
<body>
|
|
|
|
|
|
<div class="blog-masthead">
|
|
<div class="container">
|
|
<nav class="nav blog-nav">
|
|
<a class="nav-link active" href="https://alanorth.github.io/cgspace-notes/">Home</a>
|
|
</nav>
|
|
</div>
|
|
</div>
|
|
|
|
|
|
|
|
|
|
<header class="blog-header">
|
|
<div class="container">
|
|
<h1 class="blog-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
|
|
<p class="lead blog-description" dir="auto">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
|
|
</div>
|
|
</header>
|
|
|
|
|
|
|
|
|
|
<div class="container">
|
|
<div class="row">
|
|
<div class="col-sm-8 blog-main">
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<article class="blog-post">
|
|
<header>
|
|
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-06/">June, 2020</a></h2>
|
|
<p class="blog-post-meta"><time datetime="2020-06-01T13:55:39+03:00">Mon Jun 01, 2020</time> by Alan Orth in
|
|
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
|
|
|
|
|
|
</p>
|
|
</header>
|
|
<h2 id="2020-06-01">2020-06-01</h2>
|
|
<ul>
|
|
<li>I tried to run the <code>AtomicStatisticsUpdateCLI</code> CUA migration script on DSpace Test (linode26) again and it is still running very slowly and producing many errors, as I noticed yesterday
|
|
<ul>
|
|
<li>I sent Atmire the dspace.log from today and told them to log into the server to debug the process</li>
|
|
</ul>
|
|
</li>
|
|
<li>In other news, I checked the statistics API on DSpace 6 and it’s working</li>
|
|
<li>I tried to build the OAI registry on the freshly migrated DSpace 6 on DSpace Test and I get an error:</li>
|
|
</ul>
|
|
<a href='https://alanorth.github.io/cgspace-notes/2020-06/'>Read more →</a>
|
|
</article>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<article class="blog-post">
|
|
<header>
|
|
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-05/">May, 2020</a></h2>
|
|
<p class="blog-post-meta"><time datetime="2020-05-02T09:52:04+03:00">Sat May 02, 2020</time> by Alan Orth in
|
|
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
|
|
|
|
|
|
</p>
|
|
</header>
|
|
<h2 id="2020-05-02">2020-05-02</h2>
|
|
<ul>
|
|
<li>Peter said that CTA is having problems submitting an item to CGSpace
|
|
<ul>
|
|
<li>Looking at the PostgreSQL stats it seems to be the same issue that Tezira was having last week, as I see the number of connections in the ‘idle in transaction’ and ‘waiting for lock’ states is increasing again</li>
|
|
<li>I see that CGSpace (linode18) is still using PostgreSQL JDBC driver version 42.2.11, and there were some bugs related to transactions fixed in 42.2.12 (which I had updated in the Ansible playbooks, but not deployed yet)</li>
|
|
</ul>
|
|
</li>
|
|
</ul>
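<p>A minimal sketch of how the connection states mentioned above could be tallied. It assumes the states have been exported one per line, for example with <code>psql -At -c "SELECT state FROM pg_stat_activity WHERE state IS NOT NULL"</code>. The <code>summarize_states</code> helper is hypothetical and not part of the original notes, and it only covers the <code>state</code> column, not lock waits.</p>

```shell
# summarize_states: read one connection state per line on stdin and print
# "count state" pairs, busiest state first (hypothetical helper)
summarize_states() {
    sort | uniq -c | sort -rn |
        awk '{ count = $1; $1 = ""; sub(/^ /, ""); print count, $0 }'
}

# Demo on sample input (made-up data, not taken from CGSpace)
printf '%s\n' \
    'idle' \
    'idle in transaction' \
    'idle in transaction' \
    'active' \
    'idle in transaction' \
    'idle' \
    | summarize_states
```

<p>Running this periodically and comparing the ‘idle in transaction’ count over time would show whether the number is growing.</p>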
|
|
<a href='https://alanorth.github.io/cgspace-notes/2020-05/'>Read more →</a>
|
|
</article>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<article class="blog-post">
|
|
<header>
|
|
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-04/">April, 2020</a></h2>
|
|
<p class="blog-post-meta"><time datetime="2020-04-02T10:53:24+03:00">Thu Apr 02, 2020</time> by Alan Orth in
|
|
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
|
|
|
|
|
|
</p>
|
|
</header>
|
|
<h2 id="2020-04-02">2020-04-02</h2>
|
|
<ul>
|
|
<li>Maria asked me to update Charles Staver’s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it
|
|
<ul>
|
|
<li>I updated the fifty-eight existing items on CGSpace</li>
|
|
</ul>
|
|
</li>
|
|
<li>Looking into the items Udana had asked about last week that were missing Altmetric donuts:
|
|
<ul>
|
|
<li><a href="https://hdl.handle.net/10568/103225">The first</a> is still missing its DOI, so I added it and <a href="https://twitter.com/mralanorth/status/1245632619661766657">tweeted its handle</a> (after a few hours there was a donut with score 222)</li>
|
|
<li><a href="https://hdl.handle.net/10568/106899">The second item</a> now has a donut with score 2 since I <a href="https://twitter.com/mralanorth/status/1243158045540134913">tweeted its handle</a> last week</li>
|
|
<li><a href="https://hdl.handle.net/10568/107258">The third item</a> now has a donut with score 1 since I <a href="https://twitter.com/mralanorth/status/1243158786392625153">tweeted it</a> last week</li>
|
|
</ul>
|
|
</li>
|
|
<li>On the same note, the <a href="https://hdl.handle.net/10568/106573">one item</a> Abenet pointed out last week now has a donut with score of 104 after I <a href="https://twitter.com/mralanorth/status/1243163710241345536">tweeted it</a> last week</li>
|
|
</ul>
|
|
<a href='https://alanorth.github.io/cgspace-notes/2020-04/'>Read more →</a>
|
|
</article>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<article class="blog-post">
|
|
<header>
|
|
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-03/">March, 2020</a></h2>
|
|
<p class="blog-post-meta"><time datetime="2020-03-02T12:31:30+02:00">Mon Mar 02, 2020</time> by Alan Orth in
|
|
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
|
|
|
|
|
|
</p>
|
|
</header>
|
|
<h2 id="2020-03-02">2020-03-02</h2>
|
|
<ul>
|
|
<li>Update <a href="https://github.com/ilri/dspace-statistics-api">dspace-statistics-api</a> for DSpace 6+ UUIDs
|
|
<ul>
|
|
<li>Tag version 1.2.0 on GitHub</li>
|
|
</ul>
|
|
</li>
|
|
<li>Test migrating legacy Solr statistics to UUIDs with the as-yet unreleased <a href="https://github.com/DSpace/DSpace/commit/184f2b2153479045fba6239342c63e7f8564b8b6#diff-0350ce2e13b28d5d61252b7a8f50a059">SolrUpgradePre6xStatistics.java</a>
|
|
<ul>
|
|
<li>You need to download this into the DSpace 6.x source and compile it</li>
|
|
</ul>
|
|
</li>
|
|
</ul>
|
|
<a href='https://alanorth.github.io/cgspace-notes/2020-03/'>Read more →</a>
|
|
</article>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<article class="blog-post">
|
|
<header>
|
|
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-02/">February, 2020</a></h2>
|
|
<p class="blog-post-meta"><time datetime="2020-02-02T11:56:30+02:00">Sun Feb 02, 2020</time> by Alan Orth in
|
|
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
|
|
|
|
|
|
</p>
|
|
</header>
|
|
<h2 id="2020-02-02">2020-02-02</h2>
|
|
<ul>
|
|
<li>Continue working on porting CGSpace’s DSpace 5 code to DSpace 6.3 that I started yesterday
|
|
<ul>
|
|
<li>Sign up for an account with MaxMind so I can get the GeoLite2-City.mmdb database</li>
|
|
<li>I still need to wire up the API credentials and cron job into the Ansible infrastructure playbooks</li>
|
|
<li>Fix some minor issues in the config and XMLUI themes, like removing Atmire stuff</li>
|
|
<li>The code finally builds and runs with a fresh install</li>
|
|
</ul>
|
|
</li>
|
|
</ul>
|
|
<a href='https://alanorth.github.io/cgspace-notes/2020-02/'>Read more →</a>
|
|
</article>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<article class="blog-post">
|
|
<header>
|
|
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-01/">January, 2020</a></h2>
|
|
<p class="blog-post-meta"><time datetime="2020-01-06T10:48:30+02:00">Mon Jan 06, 2020</time> by Alan Orth in
|
|
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
|
|
|
|
|
|
</p>
|
|
</header>
|
|
<h2 id="2020-01-06">2020-01-06</h2>
|
|
<ul>
|
|
<li>Open <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=706">a ticket</a> with Atmire to request a quote for the upgrade to DSpace 6</li>
|
|
<li>Last week Altmetric responded about the <a href="https://hdl.handle.net/10568/97087">item</a> that had a lower score than its DOI
|
|
<ul>
|
|
<li>The score is now linked to the DOI</li>
|
|
<li>Another <a href="https://hdl.handle.net/10568/91278">item</a> that had the same problem in 2019 is now also linked to the score for its DOI</li>
|
|
<li>Another <a href="https://hdl.handle.net/10568/81236">item</a> that had the same problem in 2019 has also been fixed</li>
|
|
</ul>
|
|
</li>
|
|
</ul>
|
|
<h2 id="2020-01-07">2020-01-07</h2>
|
|
<ul>
|
|
<li>Peter Ballantyne highlighted one more WLE <a href="https://hdl.handle.net/10568/101286">item</a> that is missing the Altmetric score that its DOI has
|
|
<ul>
|
|
<li>The DOI has a score of 259, but the Handle has no score at all</li>
|
|
<li>I <a href="https://twitter.com/mralanorth/status/1214471427157626881">tweeted</a> the CGSpace repository link</li>
|
|
</ul>
|
|
</li>
|
|
</ul>
|
|
<a href='https://alanorth.github.io/cgspace-notes/2020-01/'>Read more →</a>
|
|
</article>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<article class="blog-post">
|
|
<header>
|
|
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-12/">December, 2019</a></h2>
|
|
<p class="blog-post-meta"><time datetime="2019-12-01T11:22:30+02:00">Sun Dec 01, 2019</time> by Alan Orth in
|
|
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
|
|
|
|
|
|
</p>
|
|
</header>
|
|
<h2 id="2019-12-01">2019-12-01</h2>
|
|
<ul>
|
|
<li>Upgrade CGSpace (linode18) to Ubuntu 18.04:
|
|
<ul>
|
|
<li>Check any packages that have residual configs and purge them:</li>
|
|
<li><code># dpkg -l | grep -E '^rc' | awk '{print $2}' | xargs dpkg -P</code></li>
|
|
<li>Make sure all packages are up to date and the package manager is up to date, then reboot:</li>
|
|
</ul>
|
|
</li>
|
|
</ul>
|
|
<pre tabindex="0"><code># apt update && apt full-upgrade
# apt-get autoremove && apt-get autoclean
# dpkg -C
# reboot
</code></pre>
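<p>Before purging residual configs as described above, it may be worth reviewing which packages would be affected. The sketch below is one way to do that; <code>residual_packages</code> is a hypothetical helper name, and the sample lines only imitate the column layout of <code>dpkg -l</code>.</p>

```shell
# residual_packages: read `dpkg -l` output on stdin and print the names of
# packages in the "rc" state (removed, but config files remain)
residual_packages() {
    awk '$1 == "rc" { print $2 }'
}

# Demo on a few sample lines (hypothetical package names)
printf '%s\n' \
    'ii  nginx         1.14.0  amd64  web server' \
    'rc  oldpkg        1.0     amd64  removed, config left behind' \
    'rc  libfoo:amd64  2.1     amd64  removed, config left behind' \
    | residual_packages
```

<p>On a real system, <code>dpkg -l | residual_packages</code> would print the list to check before piping it to <code>xargs dpkg -P</code>.</p>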
|
|
<a href='https://alanorth.github.io/cgspace-notes/2019-12/'>Read more →</a>
|
|
</article>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<article class="blog-post">
|
|
<header>
|
|
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-11/">November, 2019</a></h2>
|
|
<p class="blog-post-meta"><time datetime="2019-11-04T12:20:30+02:00">Mon Nov 04, 2019</time> by Alan Orth in
|
|
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
|
|
|
|
|
|
</p>
|
|
</header>
|
|
<h2 id="2019-11-04">2019-11-04</h2>
|
|
<ul>
|
|
<li>Peter noticed that there were 5.2 million hits on CGSpace in 2019-10 according to the Atmire usage statistics
|
|
<ul>
|
|
<li>I looked in the nginx logs and see 4.6 million in the access logs, and 1.2 million in the API logs:</li>
|
|
</ul>
|
|
</li>
|
|
</ul>
|
|
<pre tabindex="0"><code># zcat --force /var/log/nginx/*access.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
4671942
# zcat --force /var/log/nginx/{rest,oai,statistics}.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
1277694
</code></pre><ul>
|
|
<li>So 4.6 million from XMLUI and another 1.2 million from API requests</li>
|
|
<li>Let’s see how many of the REST API requests were for bitstreams (because they are counted in Solr stats):</li>
|
|
</ul>
|
|
<pre tabindex="0"><code># zcat --force /var/log/nginx/rest.log.*.gz | grep -c -E "[0-9]{1,2}/Oct/2019"
1183456
# zcat --force /var/log/nginx/rest.log.*.gz | grep -E "[0-9]{1,2}/Oct/2019" | grep -c -E "/rest/bitstreams"
106781
</code></pre>
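<p>To put the counts above side by side, here is a small worked calculation. The numbers are copied from the grep output; the variable names and the <code>percent</code> helper are only illustrative.</p>

```shell
# Counts copied from the grep output above
xmlui_hits=4671942
api_hits=1277694
rest_hits=1183456
rest_bitstream_hits=106781

# Total requests seen by nginx for 2019-10 (XMLUI plus API logs)
total_hits=$((xmlui_hits + api_hits))
echo "total nginx requests: $total_hits"

# percent PART TOTAL: print PART as a percentage of TOTAL, one decimal place
percent() {
    awk -v part="$1" -v total="$2" 'BEGIN { printf "%.1f\n", 100 * part / total }'
}

echo "REST requests for bitstreams: $(percent "$rest_bitstream_hits" "$rest_hits")%"
```

<p>The total comes to 5,949,636 requests, compared with the 5.2 million hits reported by the Atmire usage statistics, and about 9% of the REST API requests were for bitstreams.</p>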
|
|
<a href='https://alanorth.github.io/cgspace-notes/2019-11/'>Read more →</a>
|
|
</article>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<article class="blog-post">
|
|
<header>
|
|
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-10/">October, 2019</a></h2>
|
|
<p class="blog-post-meta"><time datetime="2019-10-01T13:20:51+03:00">Tue Oct 01, 2019</time> by Alan Orth in
|
|
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
|
|
|
|
|
|
</p>
|
|
</header>
|
|
<h2 id="2019-10-01">2019-10-01</h2>
<ul>
<li>Udana from IWMI asked me for a CSV export of their community on CGSpace
<ul>
<li>I exported it, but a quick run through the <code>csv-metadata-quality</code> tool shows that there is some low-hanging fruit we can fix before I send him the data</li>
<li>For now I will limit the scope to the titles, regions, subregions, and river basins, and manually fix some non-breaking spaces (U+00A0) there that would otherwise be removed by the csv-metadata-quality script’s “unnecessary Unicode” fix: <code>$ csvcut -c 'id,dc.</code></li>
</ul>
</li>
</ul>
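<p>As a rough illustration of how the non-breaking spaces mentioned above could be located before fixing them by hand, the sketch below counts the lines of a CSV file that contain U+00A0. The <code>count_nbsp</code> helper and the sample data are hypothetical, and this is not the command used in the original notes.</p>

```shell
# count_nbsp FILE: print the number of lines in FILE that contain U+00A0,
# which is encoded in UTF-8 as the two bytes 0xC2 0xA0
count_nbsp() {
    nbsp=$(printf '\302\240')
    # grep -c exits non-zero when the count is 0, so ignore its exit status
    LC_ALL=C grep -c "$nbsp" "$1" || true
}

# Demo on a small sample file (made-up data, not the IWMI export)
sample=$(mktemp)
printf 'id,title\n1,Plain title\n2,Title with\302\240NBSP\n' > "$sample"
count_nbsp "$sample"
rm -f "$sample"
```

<p>Replacing <code>-c</code> with <code>-n</code> in the helper would print the matching lines with their line numbers instead of a count.</p>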
|
|
<a href='https://alanorth.github.io/cgspace-notes/2019-10/'>Read more →</a>
|
|
</article>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<article class="blog-post">
|
|
<header>
|
|
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-09/">September, 2019</a></h2>
|
|
<p class="blog-post-meta"><time datetime="2019-09-01T10:17:51+03:00">Sun Sep 01, 2019</time> by Alan Orth in
|
|
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
|
|
|
|
|
|
</p>
|
|
</header>
|
|
<h2 id="2019-09-01">2019-09-01</h2>
|
|
<ul>
|
|
<li>Linode emailed to say that CGSpace (linode18) had a high rate of outbound traffic for several hours this morning</li>
|
|
<li>Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:</li>
|
|
</ul>
|
|
<pre tabindex="0"><code># zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
440 17.58.101.255
441 157.55.39.101
485 207.46.13.43
728 169.60.128.125
730 207.46.13.108
758 157.55.39.9
808 66.160.140.179
814 207.46.13.212
2472 163.172.71.23
6092 3.94.211.189
# zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
33 2a01:7e00::f03c:91ff:fe16:fcb
57 3.83.192.124
57 3.87.77.25
57 54.82.1.8
822 2a01:9cc0:47:1:1a:4:0:2
1223 45.5.184.72
1633 172.104.229.92
5112 205.186.128.185
7249 2a01:7e00::f03c:91ff:fe18:7396
9124 45.5.186.2
</code></pre>
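<p>Since the same pipeline is run against both sets of logs above, it could be wrapped in a small function. This is only a sketch: <code>top_ips</code> is a hypothetical name, and it assumes the client IP is the first whitespace-separated field of each log line, as in the commands above.</p>

```shell
# top_ips [N]: read access-log lines on stdin and print the N busiest
# client IPs with their request counts, busiest last (default N = 10)
top_ips() {
    awk '{ print $1 }' | sort | uniq -c | sort -n | tail -n "${1:-10}"
}

# Demo on sample log lines (documentation-range IPs, not real traffic)
printf '%s\n' \
    '192.0.2.1 - - [01/Sep/2019:01:00:00 +0200] "GET / HTTP/1.1"' \
    '192.0.2.2 - - [01/Sep/2019:01:00:01 +0200] "GET / HTTP/1.1"' \
    '192.0.2.1 - - [01/Sep/2019:01:00:02 +0200] "GET / HTTP/1.1"' \
    | top_ips 2
```

<p>With real logs the function would sit at the end of the existing pipeline, for example after the <code>zcat</code> and <code>grep -E "01/Sep/2019:0"</code> stages.</p>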
|
|
<a href='https://alanorth.github.io/cgspace-notes/2019-09/'>Read more →</a>
|
|
</article>
|
|
|
|
|
|
|
|
|
|
|
|
<nav class="blog-pagination">
|
|
|
|
<a class="btn btn-outline-primary" href="/cgspace-notes/page/5/" rel="prev" role="button">Previous page</a>
|
|
<a class="btn btn-outline-primary" href="/cgspace-notes/page/7/" rel="next" role="button">Next page</a>
|
|
|
|
|
|
|
|
</nav>
|
|
|
|
|
|
|
|
|
|
|
|
</div> <!-- /.blog-main -->
|
|
|
|
<aside class="col-sm-3 ml-auto blog-sidebar">
|
|
|
|
|
|
|
|
<section class="sidebar-module">
|
|
<h4>Recent Posts</h4>
|
|
<ol class="list-unstyled">
|
|
|
|
|
|
<li><a href="/cgspace-notes/2024-06/">June, 2024</a></li>
|
|
|
|
<li><a href="/cgspace-notes/2024-05/">May, 2024</a></li>
|
|
|
|
<li><a href="/cgspace-notes/2024-04/">April, 2024</a></li>
|
|
|
|
<li><a href="/cgspace-notes/2024-03/">March, 2024</a></li>
|
|
|
|
<li><a href="/cgspace-notes/2024-02/">February, 2024</a></li>
|
|
|
|
</ol>
|
|
</section>
|
|
|
|
|
|
|
|
|
|
<section class="sidebar-module">
|
|
<h4>Links</h4>
|
|
<ol class="list-unstyled">
|
|
|
|
<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
|
|
|
|
<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
|
|
|
|
<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
|
|
|
|
</ol>
|
|
</section>
|
|
|
|
</aside>
|
|
|
|
|
|
</div> <!-- /.row -->
|
|
</div> <!-- /.container -->
|
|
|
|
|
|
|
|
<footer class="blog-footer">
|
|
<p dir="auto">
|
|
|
|
Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
|
|
|
|
</p>
|
|
<p>
|
|
<a href="#">Back to top</a>
|
|
</p>
|
|
</footer>
|
|
|
|
|
|
</body>
|
|
|
|
</html>
|