mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes
@ -14,7 +14,7 @@ Work on CGSpace duplicate DOIs more
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2024-04/" />
<meta property="article:published_time" content="2024-04-04T10:23:00+03:00" />
<meta property="article:modified_time" content="2024-04-18T09:38:02+03:00" />
<meta property="article:modified_time" content="2024-04-18T17:00:25+03:00" />
@ -24,7 +24,7 @@ Work on CGSpace duplicate DOIs more
Work on CGSpace duplicate DOIs more
"/>
<meta name="generator" content="Hugo 0.125.0">
<meta name="generator" content="Hugo 0.125.3">
@ -34,9 +34,9 @@ Work on CGSpace duplicate DOIs more
"@type": "BlogPosting",
"headline": "April, 2024",
"url": "https://alanorth.github.io/cgspace-notes/2024-04/",
"wordCount": "456",
"wordCount": "711",
"datePublished": "2024-04-04T10:23:00+03:00",
"dateModified": "2024-04-18T09:38:02+03:00",
"dateModified": "2024-04-18T17:00:25+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -214,7 +214,52 @@ curl -s -o /dev/null 0.01s user 0.01s system 0% cpu 4.764 total
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span>map $request_uri $new_uri {
</span></span><span style="display:flex;"><span> /handle/10568/112821 /handle/10568/97605;
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><!-- raw HTML omitted -->
</span></span></code></pre></div><h2 id="2024-04-19">2024-04-19</h2>
<ul>
<li>Spend some time looking at duplicate DOIs again…</li>
<li>Refresh ORCID identifiers from the ORCID API and update the CGSpace metadata and controlled vocabulary</li>
</ul>
<h2 id="2024-04-20">2024-04-20</h2>
<ul>
<li>I read an <a href="https://github.com/greenelab/scihub/issues/9">interesting thread about DOI casing</a>
<ul>
<li>Apparently the DOI specification says ASCII characters in DOIs are case-insensitive</li>
<li>Indeed, <a href="https://www.crossref.org/documentation/member-setup/constructing-your-dois/">Crossref recommends lower case</a> for all DOIs</li>
<li>I was curious about the DOIs in our database, so I counted the distinct values before and after lowercasing:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>localhost/dspace7= ☘ \COPY (SELECT DISTINCT(text_value) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=220 AND text_value IS NOT NULL AND text_value !='') TO /tmp/dois-sql-before.txt;
</span></span><span style="display:flex;"><span>COPY 25675
</span></span><span style="display:flex;"><span>localhost/dspace7= ☘ \COPY (SELECT DISTINCT(lower(text_value)) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=220 AND text_value IS NOT NULL AND text_value !='') TO /tmp/dois-sql-after.txt;
</span></span><span style="display:flex;"><span>COPY 25666
</span></span></code></pre></div><ul>
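<li>Lowercasing leaves nine fewer distinct values (25,666 versus 25,675), so some DOIs are stored in more than one casing</li>
<li>I need to investigate options for lowercasing these in the repository, for example in a curation task, and in all workflows around DSpace metadata…</li>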
</ul>
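<p>The before/after comparison only shows how many values collapse, not which ones. A minimal Python sketch for listing them, assuming the DOIs have been exported to a list of strings; the function name <code>find_case_variants</code> and the sample values are hypothetical and not taken from CGSpace:</p>

```python
from collections import defaultdict


def find_case_variants(dois):
    """Group DOI strings that become identical once lowercased."""
    groups = defaultdict(set)
    for doi in dois:
        groups[doi.lower()].add(doi)
    # Keep only the lowercased keys that had more than one original spelling
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Hypothetical sample values for illustration only
sample = [
    "https://doi.org/10.1000/ABC123",
    "https://doi.org/10.1000/abc123",
    "https://doi.org/10.1000/xyz789",
]

print(len(set(sample)), "distinct before")
print(len({doi.lower() for doi in sample}), "distinct after")
print(find_case_variants(sample))
```

<p>Each key in the result is a lowercased DOI and each value is the list of original spellings that map to it, which is a starting point for deciding which items to fix.</p>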
<h2 id="2024-04-23">2024-04-23</h2>
<ul>
<li>Spent some time writing a Java curation task to normalize DOIs in items when they enter the workflow edit step
<ul>
<li>The workflow curation tasks are not well documented, but I got a basic configuration working</li>
<li>I found a bug in DSpace curation tasks and discussed it on Slack</li>
<li>I finalized the <code>NormalizeDOIs</code> curation task and released v7.6.1.1 of the <a href="https://github.com/ilri/cgspace-java-helpers">cgspace-java-helpers</a> project</li>
</ul>
</li>
</ul>
<h2 id="2024-04-24">2024-04-24</h2>
<ul>
<li>A bit more testing of the curation tasks
<ul>
<li>I tested a patch by Mark Wood</li>
</ul>
</li>
<li>I added support for normalizing DOIs to the same format in my <a href="https://github.com/ilri/csv-metadata-quality">csv-metadata-quality</a> project</li>
</ul>
<h2 id="2024-04-25">2024-04-25</h2>
<ul>
<li>I lowercased the remaining 3,900 DOIs on CGSpace that had uppercase ASCII characters</li>
</ul>
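<p>Since only ASCII characters in DOIs are case-insensitive, one way to select and fix such values is to lowercase A-Z alone and leave everything else untouched. This is a hedged sketch, not the method used on CGSpace; the helper names <code>lowercase_ascii</code> and <code>needs_lowercasing</code> are hypothetical:</p>

```python
import string

# Translation table mapping only A-Z to a-z; non-ASCII characters are left untouched
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def lowercase_ascii(doi):
    """Return the DOI with ASCII uppercase letters lowercased."""
    return doi.translate(_ASCII_LOWER)


def needs_lowercasing(doi):
    """True if the DOI contains at least one uppercase ASCII letter."""
    return doi != lowercase_ascii(doi)


# Hypothetical values for illustration only
for value in ["https://doi.org/10.1000/ABC123", "https://doi.org/10.1000/abc123"]:
    print(value, needs_lowercasing(value), lowercase_ascii(value))
```

<p>Using a translation table instead of <code>str.lower()</code> avoids changing non-ASCII letters, whose case may be significant.</p>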
<!-- raw HTML omitted -->