mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-11-22 22:55:04 +01:00
441 lines
17 KiB
HTML
<!DOCTYPE html>
<html lang="en" >

<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">

<meta property="og:title" content="CGSpace Notes" />
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-05-01T17:10:05+03:00" />

<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="CGSpace Notes"/>
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
<meta name="generator" content="Hugo 0.125.5">

<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Blog",
"headline": "CGSpace Notes",
"url" : "https://alanorth.github.io/cgspace-notes/",
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"dateModified": "2024-05-01T10:39:00+03:00",
"keywords": "notes, migration, notes",
"description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
}
</script>

<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/">

<title>CGSpace Notes</title>

<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.c6ba80bc50669557645abe05f86b73cc5af84408ed20f1551a267bc19ece8228.css" rel="stylesheet" integrity="sha256-xrqAvFBmlVdkWr4F+GtzzFr4RAjtIPFVGiZ7wZ7Ogig=" crossorigin="anonymous">

<!-- minified Font Awesome for SVG icons -->
<script defer src="https://alanorth.github.io/cgspace-notes/js/fontawesome.min.f5072c55a0721857184db93a50561d7dc13975b4de2e19db7f81eb5f3fa57270.js" integrity="sha256-9QcsVaByGFcYTbk6UFYdfcE5dbTeLhnbf4HrXz+lcnA=" crossorigin="anonymous"></script>

<!-- RSS 2.0 feed -->
<link rel="alternate" type="application/rss+xml" href="https://alanorth.github.io/cgspace-notes/index.xml" title="CGSpace Notes" />

</head>

<body>

<div class="blog-masthead">
<div class="container">
<nav class="nav blog-nav">
<a class="nav-link active" href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

<header class="blog-header">
<div class="container">
<h1 class="blog-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
<p class="lead blog-description" dir="auto">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
</div>
</header>

<div class="container">
<div class="row">
<div class="col-sm-8 blog-main">

<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2023-07/">July, 2023</a></h2>
<p class="blog-post-meta"><time datetime="2023-07-01T17:14:36+03:00">Sat Jul 01, 2023</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2023-07-01">2023-07-01</h2>
<ul>
<li>Export CGSpace to check for missing Initiative collection mappings</li>
<li>Start harvesting on AReS</li>
</ul>
<h2 id="2023-07-02">2023-07-02</h2>
<ul>
<li>Minor edits to the <code>crossref_doi_lookup.py</code> script while running some checks on 22,000 CGSpace DOIs</li>
</ul>
<h2 id="2023-07-03">2023-07-03</h2>
<ul>
<li>I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect</li>
<li>I took the more accurate ones from Crossref and updated the items on CGSpace</li>
<li>I took a few hundred ISBNs as well for where we were missing them</li>
<li>I also tagged ~4,700 items with missing licenses as “Copyrighted; all rights reserved” based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer</li>
<li>Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it’s usually copyrighted (could still be open access, but we can’t tell via Crossref)</li>
<li>I would be curious to write a script to check the Unpaywall API for open access status…</li>
<li>In the past I found that their license status was not very accurate, but the open access status might be more reliable</li>
<li>More minor work on the DSpace 7 item views</li>
<li>I learned some new Angular template syntax</li>
<li>I created a custom component to show Creative Commons licenses on the simple item page</li>
<li>I also decided that I don’t like the Impact Area icons as a component because they don’t have any visual meaning</li>
</ul>
<h2 id="2023-07-04">2023-07-04</h2>
<ul>
<li>Focus group meeting with CGSpace partners about DSpace 7</li>
<li>I added a themed file selection component to the CGSpace theme</li>
<li>It displays the bitstream description instead of the file name, just like we did in DSpace 6 XMLUI</li>
<li>I added a custom component to show share icons</li>
</ul>
<h2 id="2023-07-05">2023-07-05</h2>
<ul>
<li>I spent some time trying to update OpenRXV from Angular 9 to 10 to 11 to 12 to 13</li>
<li>Most things work, but there seem to be some minor bugs</li>
<li>Mishell from CIP emailed me to say she was having problems approving an item on CGSpace</li>
<li>Looking at PostgreSQL I saw there were a dozen or so locks that were several hours and even over one day old, so I killed those processes and told her to try again</li>
</ul>
<h2 id="2023-07-06">2023-07-06</h2>
<ul>
<li>Types meeting</li>
<li>I wrote a Python script to check Unpaywall for some information about DOIs</li>
</ul>
<h2 id="2023-07-07">2023-07-07</h2>
<ul>
<li>Continue exploring Unpaywall data for some of our DOIs</li>
<li>In the past I’ve found their licensing information to not be very reliable (preferring Crossref), but I think their open access status is more reliable, especially when the provider is listed as being the publisher</li>
<li>Even so, sometimes the version can be “acceptedVersion”, which is presumably the author’s version, as opposed to the “publishedVersion”, which means it’s available as open access on the publisher’s website</li>
<li>I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses</li>
<li>Delete duplicate metadata as described in my DSpace issue from last year: https://github.</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2023-07/'>Read more →</a>
</article>

<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2023-06/">June, 2023</a></h2>
<p class="blog-post-meta"><time datetime="2023-06-02T10:29:36+03:00">Fri Jun 02, 2023</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2023-06-02">2023-06-02</h2>
<ul>
<li>Spend some time testing my <code>post_bitstreams.py</code> script to update thumbnails for items on CGSpace
<ul>
<li>Interestingly I found an item with a JFIF thumbnail and another with a WebP thumbnail…</li>
</ul>
</li>
<li>Meeting with Valentina, Stefano, and Sara about MODS metadata in CGSpace
<ul>
<li>They have experience with improving the MODS interface in MELSpace’s OAI-PMH for use with AGRIS and were curious if we could do the same in CGSpace</li>
<li>From what I can see we need to upgrade the MODS schema from 3.1 to 3.7 and then just add a bunch of our fields to the crosswalk</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2023-06/'>Read more →</a>
</article>

<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2023-05/">May, 2023</a></h2>
<p class="blog-post-meta"><time datetime="2023-05-03T08:53:36+03:00">Wed May 03, 2023</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2023-05-03">2023-05-03</h2>
<ul>
<li>Alliance’s TIP team emailed me to ask about issues authenticating on CGSpace
<ul>
<li>It seems their password expired, which is annoying</li>
</ul>
</li>
<li>I continued looking at the CGSpace subjects for the FAO / AGROVOC exercise that I started last week
<ul>
<li>Many of our subjects would match if they added a “-”, like “high yielding varieties”, or used the singular…</li>
<li>Also I found at least two spelling mistakes, for example “decison support systems”, which would match if it were spelled correctly</li>
</ul>
</li>
<li>Work on cleaning, proofing, and uploading twenty-seven records for IFPRI to CGSpace</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2023-05/'>Read more →</a>
</article>

<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2023-04/">April, 2023</a></h2>
<p class="blog-post-meta"><time datetime="2023-04-02T08:19:36+03:00">Sun Apr 02, 2023</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2023-04-02">2023-04-02</h2>
<ul>
<li>Run all system updates on CGSpace and reboot it</li>
<li>I exported CGSpace to CSV to check for any missing Initiative collection mappings
<ul>
<li>I also did a check for missing country/region mappings with csv-metadata-quality</li>
</ul>
</li>
<li>Start a harvest on AReS</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2023-04/'>Read more →</a>
</article>

<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2023-03/">March, 2023</a></h2>
<p class="blog-post-meta"><time datetime="2023-03-01T07:58:36+03:00">Wed Mar 01, 2023</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2023-03-01">2023-03-01</h2>
<ul>
<li>Remove <code>cg.subject.wle</code> and <code>cg.identifier.wletheme</code> from CGSpace input form after confirming with IWMI colleagues that they no longer need them (WLE closed in 2021)</li>
<li><a href="https://salsa.debian.org/iso-codes-team/iso-codes/-/blob/main/CHANGELOG.md#4130-2023-02-28">iso-codes 4.13.0 was released</a>, which incorporates my changes to the common names for Iran, Laos, and Syria</li>
<li>I finally finished porting the input form from DSpace 6 to DSpace 7</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2023-03/'>Read more →</a>
</article>

<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2023-02/">February, 2023</a></h2>
<p class="blog-post-meta"><time datetime="2023-02-01T10:57:36+03:00">Wed Feb 01, 2023</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2023-02-01">2023-02-01</h2>
<ul>
<li>Export CGSpace to cross check the DOI metadata with Crossref
<ul>
<li>I want to try to expand my use of their data to journals, publishers, volumes, issues, etc…</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2023-02/'>Read more →</a>
</article>

<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2023-01/">January, 2023</a></h2>
<p class="blog-post-meta"><time datetime="2023-01-01T08:44:36+03:00">Sun Jan 01, 2023</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2023-01-01">2023-01-01</h2>
<ul>
<li>Apply some more ORCID identifiers to items on CGSpace using my <code>2022-09-22-add-orcids.csv</code> file
<ul>
<li>I want to update all ORCID names and refresh them in the database</li>
<li>I see we have some new ones that aren’t in our list if I combine with this file:</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2023-01/'>Read more →</a>
</article>

<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2022-12/">December, 2022</a></h2>
<p class="blog-post-meta"><time datetime="2022-12-01T08:52:36+03:00">Thu Dec 01, 2022</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2022-12-01">2022-12-01</h2>
<ul>
<li>Fix some incorrect regions on CGSpace
<ul>
<li>I exported the CCAFS and IITA communities, extracted just the country and region columns, then ran them through csv-metadata-quality to fix the regions</li>
</ul>
</li>
<li>Add a few more authors to my CSV with author names and ORCID identifiers and tag 283 items!</li>
<li>Replace “East Asia” with “Eastern Asia” region on CGSpace (UN M.49 region)</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2022-12/'>Read more →</a>
</article>

<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2022-11/">November, 2022</a></h2>
<p class="blog-post-meta"><time datetime="2022-11-01T09:11:36+03:00">Tue Nov 01, 2022</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2022-11-01">2022-11-01</h2>
<ul>
<li>Last night I re-synced DSpace 7 Test from CGSpace
<ul>
<li>I also updated all my local <code>7_x-dev</code> branches on the latest upstreams</li>
</ul>
</li>
<li>I spent some time updating the authorizations in Alliance collections
<ul>
<li>I want to make sure they use groups instead of individuals where possible!</li>
</ul>
</li>
<li>I reverted the Cocoon autosave change because it was more of a nuisance, since Peter can’t upload CSVs from the web interface, and it is a very low severity security issue</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2022-11/'>Read more →</a>
</article>

<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2022-10/">October, 2022</a></h2>
<p class="blog-post-meta"><time datetime="2022-10-01T19:45:36+03:00">Sat Oct 01, 2022</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2022-10-01">2022-10-01</h2>
<ul>
<li>Started a harvest on AReS last night</li>
<li>Yesterday I realized how to use <a href="https://im4java.sourceforge.net/docs/dev-guide.html">GraphicsMagick with im4java</a> and I want to re-visit some of my thumbnail tests
<ul>
<li>I’m also interested in libvips support via JVips, though last time I checked it was only for Java 8</li>
<li>I filed <a href="https://github.com/criteo/JVips/issues/141">an issue to ask about Java 11+ support</a></li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2022-10/'>Read more →</a>
</article>

<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/" rel="prev" role="button">Previous page</a>
<a class="btn btn-outline-primary" href="/cgspace-notes/page/3/" rel="next" role="button">Next page</a>
</nav>

</div> <!-- /.blog-main -->

<aside class="col-sm-3 ml-auto blog-sidebar">

<section class="sidebar-module">
<h4>Recent Posts</h4>
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2024-05/">May, 2024</a></li>
<li><a href="/cgspace-notes/2024-04/">April, 2024</a></li>
<li><a href="/cgspace-notes/2024-03/">March, 2024</a></li>
<li><a href="/cgspace-notes/2024-02/">February, 2024</a></li>
<li><a href="/cgspace-notes/2024-01/">January, 2024</a></li>
</ol>
</section>

<section class="sidebar-module">
<h4>Links</h4>
<ol class="list-unstyled">
<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
</ol>
</section>

</aside>

</div> <!-- /.row -->
</div> <!-- /.container -->

<footer class="blog-footer">
<p dir="auto">
Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
</p>
<p>
<a href="#">Back to top</a>
</p>
</footer>

</body>

</html>