<!DOCTYPE html>
<html lang="en" >
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta property="og:title" content="CGSpace CG Core v2 Migration" />
<meta property="og:description" content="Possible changes to CGSpace metadata fields to align more with DC, QDC, and DCTERMS as well as CG Core v2." />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/" />
<meta property="article:published_time" content="2021-02-21T13:27:35+02:00" />
<meta property="article:modified_time" content="2021-09-21T12:46:34+03:00" />
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="CGSpace CG Core v2 Migration"/>
<meta name="twitter:description" content="Possible changes to CGSpace metadata fields to align more with DC, QDC, and DCTERMS as well as CG Core v2."/>
<meta name="generator" content="Hugo 0.93.2" />
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "BlogPosting",
"headline": "CGSpace CG Core v2 Migration",
"url": "https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/",
"wordCount": "579",
"datePublished": "2021-02-21T13:27:35+02:00",
"dateModified": "2021-09-21T12:46:34+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"keywords": "Notes, Migration",
"description": "Possible changes to CGSpace metadata fields to align more with DC, QDC, and DCTERMS as well as CG Core v2."
}
</script>
<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/">
<title>CGSpace CG Core v2 Migration | CGSpace Notes</title>
<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.beb8012edc08ba10be012f079d618dc243812267efe62e11f22fe49618f976a4.css" rel="stylesheet" integrity="sha256-vrgBLtwIuhC&#43;AS8HnWGNwkOBImfv5i4R8i/klhj5dqQ=" crossorigin="anonymous">
<!-- minified Font Awesome for SVG icons -->
<script defer src="https://alanorth.github.io/cgspace-notes/js/fontawesome.min.f5072c55a0721857184db93a50561d7dc13975b4de2e19db7f81eb5f3fa57270.js" integrity="sha256-9QcsVaByGFcYTbk6UFYdfcE5dbTeLhnbf4HrXz&#43;lcnA=" crossorigin="anonymous"></script>
<!-- RSS 2.0 feed -->
</head>
<body>
<div class="blog-masthead">
<div class="container">
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>
<header class="blog-header">
<div class="container">
<h1 class="blog-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
<p class="lead blog-description" dir="auto">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
</div>
</header>
<div class="container">
<div class="row">
<div class="col-sm-8 blog-main">
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/">CGSpace CG Core v2 Migration</a></h2>
<p class="blog-post-meta">
<time datetime="2021-02-21T13:27:35+02:00">Sun Feb 21, 2021</time>
in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
<span class="fas fa-tag" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/tags/migration/" rel="tag">Migration</a>
</p>
</header>
<p>Changes to CGSpace metadata fields to align more with DC, QDC, and DCTERMS as well as CG Core v2. Implemented on 2021-02-21.</p>
<p>With reference to the <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2 draft standard</a> by Marie-Angélique as well as <a href="http://www.dublincore.org/specifications/dublin-core/dcmi-terms/">DCMI DCTERMS</a>.</p>
<ul>
<li><a href="#proposed-changes">Proposed Changes</a>
<ul>
<li><a href="#out-of-scope">Out of Scope</a></li>
</ul>
</li>
<li><a href="#fields-to-create">Fields to Create</a></li>
<li><a href="#fields-to-delete">Fields to Delete</a></li>
<li><a href="#implementation-progress">Implementation Progress</a></li>
</ul>
<h2 id="proposed-changes">Proposed Changes</h2>
<p>As of 2021-01-18 the scope of the changes includes the following fields:</p>
<ul>
<li>cg.creator.id→cg.creator.identifier
<ul>
<li>ORCID identifiers</li>
</ul>
</li>
<li>dc.format.extent→dcterms.extent</li>
<li>dc.date.issued→dcterms.issued</li>
<li>dc.description.abstract→dcterms.abstract</li>
<li>dc.description→dcterms.description</li>
<li>dc.description.sponsorship→cg.contributor.donor
<ul>
<li>values from CrossRef or Grid.ac if possible</li>
</ul>
</li>
<li>dc.description.version→cg.reviewStatus</li>
<li>cg.fulltextstatus→cg.howPublished
<ul>
<li>CGSpace uses values like &ldquo;Formally Published&rdquo; or &ldquo;Grey Literature&rdquo;</li>
</ul>
</li>
<li>dc.identifier.citation→dcterms.bibliographicCitation</li>
<li>cg.identifier.status→dcterms.accessRights
<ul>
<li>current values are &ldquo;Open Access&rdquo; and &ldquo;Limited Access&rdquo;</li>
<li>future values are possibly &ldquo;Open&rdquo; and &ldquo;Restricted&rdquo;?</li>
</ul>
</li>
<li>dc.language.iso→dcterms.language
<ul>
<li>current values are ISO 639-1 (aka Alpha 2)</li>
<li>future values are possibly ISO 639-3 (aka Alpha 3)?</li>
</ul>
</li>
<li>cg.link.reference→dcterms.relation</li>
<li>dc.publisher→dcterms.publisher</li>
<li>dc.relation.ispartofseries will be split into:
<ul>
<li>series name: dcterms.isPartOf</li>
<li>series number: cg.number</li>
</ul>
</li>
<li>dc.rights→dcterms.license
<ul>
<li>Using <a href="https://spdx.org/licenses/">SPDX license identifiers</a> if possible</li>
</ul>
</li>
<li>dc.source→cg.journal</li>
<li>dc.subject→dcterms.subject</li>
<li>dc.type→dcterms.type</li>
<li>dc.identifier.isbn→cg.isbn</li>
<li>dc.identifier.issn→cg.issn</li>
<li>cg.targetaudience→dcterms.audience</li>
</ul>
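<p>The one-to-one renames above can be sketched as a small lookup function. This is only an illustration: the function name <code>new_field_name</code> is hypothetical and is not taken from <code>migrate-fields.sh</code>, while the mappings themselves are copied from the list above. The split of dc.relation.ispartofseries is left out because it is not a simple rename.</p>

```shell
#!/bin/sh
# Hypothetical helper (not part of migrate-fields.sh): print the new field
# name for an old one, using the one-to-one renames listed in these notes.
new_field_name() {
  case "$1" in
    cg.creator.id)              echo "cg.creator.identifier" ;;
    dc.format.extent)           echo "dcterms.extent" ;;
    dc.date.issued)             echo "dcterms.issued" ;;
    dc.description.abstract)    echo "dcterms.abstract" ;;
    dc.description)             echo "dcterms.description" ;;
    dc.description.sponsorship) echo "cg.contributor.donor" ;;
    dc.description.version)     echo "cg.reviewStatus" ;;
    cg.fulltextstatus)          echo "cg.howPublished" ;;
    dc.identifier.citation)     echo "dcterms.bibliographicCitation" ;;
    cg.identifier.status)       echo "dcterms.accessRights" ;;
    dc.language.iso)            echo "dcterms.language" ;;
    cg.link.reference)          echo "dcterms.relation" ;;
    dc.publisher)               echo "dcterms.publisher" ;;
    dc.rights)                  echo "dcterms.license" ;;
    dc.source)                  echo "cg.journal" ;;
    dc.subject)                 echo "dcterms.subject" ;;
    dc.type)                    echo "dcterms.type" ;;
    dc.identifier.isbn)         echo "cg.isbn" ;;
    dc.identifier.issn)         echo "cg.issn" ;;
    cg.targetaudience)          echo "dcterms.audience" ;;
    # Anything else (for example dc.title) is printed unchanged.
    *)                          echo "$1" ;;
  esac
}

new_field_name dc.date.issued   # prints "dcterms.issued"
```

<p>Fields that are not in the list, such as dc.title, fall through to the default branch and come back unchanged.</p>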
<h3 id="out-of-scope">Out of Scope</h3>
<p>The following fields are currently out of the scope of this migration because they are used internally by DSpace 5.x/6.x and would be difficult to change without significant modifications to the core of the code:</p>
<ul>
<li>dc.title (<code>IncludePageMeta.java</code> only considers DC when building pageMeta, which we rely on in XMLUI because of XSLT from DRI)</li>
<li>dc.title.alternative</li>
<li>dc.date.available</li>
<li>dc.date.accessioned</li>
<li>dc.identifier.uri (hard coded for Handle assignment upon item submission)</li>
<li>dc.description.provenance</li>
<li>dc.contributor.author (<code>IncludePageMeta.java</code> only considers DC when building pageMeta, which we rely on in XMLUI because of XSLT from DRI)</li>
</ul>
<h2 id="fields-to-create">Fields to Create</h2>
<p>Make sure the following fields exist:</p>
<ul>
<li><input checked="" disabled="" type="checkbox"> cg.creator.identifier (247)</li>
<li><input checked="" disabled="" type="checkbox"> cg.contributor.donor (248)</li>
<li><input checked="" disabled="" type="checkbox"> cg.reviewStatus (249)</li>
<li><input checked="" disabled="" type="checkbox"> cg.howPublished (250)</li>
<li><input checked="" disabled="" type="checkbox"> cg.journal (251)</li>
<li><input checked="" disabled="" type="checkbox"> cg.isbn (252)</li>
<li><input checked="" disabled="" type="checkbox"> cg.issn (253)</li>
<li><input checked="" disabled="" type="checkbox"> cg.volume (254)</li>
<li><input checked="" disabled="" type="checkbox"> cg.number (255)</li>
<li><input checked="" disabled="" type="checkbox"> cg.issue (256)</li>
</ul>
<h2 id="fields-to-delete">Fields to Delete</h2>
<p>Fields to delete after migration:</p>
<ul>
<li><input checked="" disabled="" type="checkbox"> cg.creator.id</li>
<li><input checked="" disabled="" type="checkbox"> cg.fulltextstatus</li>
<li><input checked="" disabled="" type="checkbox"> cg.identifier.status</li>
<li><input checked="" disabled="" type="checkbox"> cg.link.reference</li>
<li><input checked="" disabled="" type="checkbox"> cg.targetaudience</li>
</ul>
<h2 id="implementation-progress">Implementation Progress</h2>
<p>Tally of the status of the implementation of the new fields in the CGSpace <code>6_x-cgcorev2</code> branch.</p>
<table>
<thead>
<tr>
<th>Field Name</th>
<th style="text-align:center">migrate-fields.sh</th>
<th style="text-align:center">Input Forms</th>
<th style="text-align:center">XMLUI Themes¹</th>
<th style="text-align:center">dspace.cfg</th>
<th style="text-align:center">Discovery</th>
<th style="text-align:center">Atmire Modules</th>
<th style="text-align:center">Crosswalks</th>
</tr>
</thead>
<tbody>
<tr>
<td>cg.creator.identifier</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center">-</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>dcterms.extent</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>dcterms.issued</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center">?</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>dcterms.abstract</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center">-</td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>dcterms.description</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>cg.contributor.donor</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>cg.reviewStatus</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center">-</td>
<td style="text-align:center"></td>
<td style="text-align:center">-</td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>cg.howPublished</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>dcterms.bibliographicCitation</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>dcterms.accessRights</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center">-</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>dcterms.language</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center">-</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>dcterms.relation</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>dcterms.publisher</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>dcterms.isPartOf</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center">-</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>dcterms.license</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>cg.journal</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>dcterms.subject</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>dcterms.type</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>cg.isbn</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>cg.issn</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
</tr>
<tr>
<td>dcterms.audience</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center">-</td>
<td style="text-align:center"></td>
<td style="text-align:center"></td>
</tr>
</tbody>
</table>
<p>There are a few things that I need to check once I get a deployment of this code up and running:</p>
<ul>
<li>Assess the XSL changes to see if predicates like <code>[not(@qualifier)]</code> still make sense after we move fields from DC to DCTERMS, as some fields will no longer have qualifiers</li>
<li>Do I need to edit crosswalks that we are not using, like <a href="https://wiki.lyrasis.org/display/DSDOC5x/DSpace+AIP+Format#DSpaceAIPFormat-MODSSchema">MODS</a>?</li>
<li>There is potentially a lot of work in the OAI metadata formats like DIM, METS, and QDC (see <code>dspace/config/crosswalks/oai/*.xsl</code>)</li>
</ul>
<hr>
<p>¹ Not committed yet because I don&rsquo;t want to have to make minor adjustments in multiple commits. Re-apply the gauntlet of fixes with the sed script:</p>
<pre tabindex="0"><code>$ find dspace/modules/xmlui-mirage2/src/main/webapp/themes -iname &#34;*.xsl&#34; -exec sed -i -f ./cgcore-xsl-replacements.sed {} \;
</code></pre>
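<p>Because <code>sed -i</code> edits the theme files in place, it may be worth previewing the changes first. A minimal sketch, assuming a hypothetical helper named <code>preview_sed_changes</code> (not part of the CGSpace repository) that runs each XSL file through the sed script and prints a unified diff without modifying anything:</p>

```shell
#!/bin/sh
# Hypothetical dry-run helper: show what a sed script would change in every
# *.xsl file under a directory, leaving the files themselves untouched.
preview_sed_changes() {
  theme_dir="$1"
  sed_script="$2"
  find "$theme_dir" -iname "*.xsl" | while IFS= read -r xsl_file; do
    # Without -i, sed writes the result to stdout; diff compares it to the
    # original. diff exits non-zero when files differ, which is expected.
    sed -f "$sed_script" "$xsl_file" | diff -u "$xsl_file" - || true
  done
}

# Example, using the paths from the command above:
# preview_sed_changes dspace/modules/xmlui-mirage2/src/main/webapp/themes ./cgcore-xsl-replacements.sed
```

<p>Once the diff looks right, the original <code>find … -exec sed -i</code> command applies the same replacements for real.</p>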
</article>
</div> <!-- /.blog-main -->
<aside class="col-sm-3 ml-auto blog-sidebar">
<section class="sidebar-module">
<h4>Recent Posts</h4>
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2022-03/">March, 2022</a></li>
<li><a href="/cgspace-notes/2022-02/">February, 2022</a></li>
<li><a href="/cgspace-notes/2022-01/">January, 2022</a></li>
<li><a href="/cgspace-notes/2021-12/">December, 2021</a></li>
<li><a href="/cgspace-notes/2021-11/">November, 2021</a></li>
</ol>
</section>
<section class="sidebar-module">
<h4>Links</h4>
<ol class="list-unstyled">
<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
</ol>
</section>
</aside>
</div> <!-- /.row -->
</div> <!-- /.container -->
<footer class="blog-footer">
<p dir="auto">
Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
</p>
<p>
<a href="#">Back to top</a>
</p>
</footer>
</body>
</html>