mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2024-04-29
@@ -14,7 +14,7 @@ Work on CGSpace duplicate DOIs more
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2024-04/" />
<meta property="article:published_time" content="2024-04-04T10:23:00+03:00" />
-<meta property="article:modified_time" content="2024-04-25T15:28:20+03:00" />
+<meta property="article:modified_time" content="2024-04-27T11:22:58+03:00" />
@@ -34,9 +34,9 @@ Work on CGSpace duplicate DOIs more
"@type": "BlogPosting",
"headline": "April, 2024",
"url": "https://alanorth.github.io/cgspace-notes/2024-04/",
-"wordCount": "728",
+"wordCount": "852",
"datePublished": "2024-04-04T10:23:00+03:00",
-"dateModified": "2024-04-25T15:28:20+03:00",
+"dateModified": "2024-04-27T11:22:58+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -264,6 +264,24 @@ curl -s -o /dev/null 0.01s user 0.01s system 0% cpu 4.764 total
<ul>
<li>Spend some time looking at duplicate DOIs again…</li>
</ul>
+<h2 id="2024-04-29">2024-04-29</h2>
+<ul>
+<li>Start working on the IFPRI 2020–2021 batch migration
+<ul>
+<li>I modified my <code>check_duplicates.py</code> script to check for DOIs instead of titles, and use a similarity of 1.0 to make sure the match is exact</li>
+</ul>
+</li>
+<li>I noticed something in the Tomcat log:</li>
+</ul>
+<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>tomcat9[690]: WARNING: The HTTP response header [Content-Disposition] with value [attachment; filename="Literature review on Women’s Empowerment and their Resilience2.pdf"] has been removed from the response because it is invalid
+</span></span><span style="display:flex;"><span>tomcat9[690]: java.lang.IllegalArgumentException: The Unicode character [’] at code point [8,217] cannot be encoded as it is outside the permitted range of 0 to 255
+</span></span></code></pre></div><ul>
+<li>I found the bitstream’s ID and then used the <code>ds6_bitstream2itemhandle</code> <a href="https://wiki.lyrasis.org/display/DSPACE/Helper+SQL+functions+for+DSpace+6">SQL helper function</a> to find the item’s handle
+<ul>
+<li>Then I replaced the curly quote with a regular quote in all bitstreams</li>
+</ul>
+</li>
+</ul>
<!-- raw HTML omitted -->
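The Tomcat warning in the added notes says the header was dropped because the filename contains a character above code point 255. A minimal Python sketch of how such filenames could be detected and repaired follows. The helper names `offending_characters` and `straighten_quotes` are hypothetical and are not part of DSpace, Tomcat, or the notes; the notes do not say how the replacement was actually carried out.

```python
# Hypothetical helpers for illustration only; not taken from DSpace or Tomcat.

# Map typographic (curly) quotes to their plain ASCII equivalents.
CURLY_QUOTES = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def offending_characters(filename: str) -> list[str]:
    """Return the characters whose code point is above 255.

    This mirrors the range named in the Tomcat log message (0 to 255).
    """
    return [ch for ch in filename if ord(ch) > 255]


def straighten_quotes(filename: str) -> str:
    """Replace curly quotes with regular quotes; leave everything else alone."""
    return filename.translate(CURLY_QUOTES)


# The filename from the log excerpt, with the curly quote written as an escape.
name = "Literature review on Women\u2019s Empowerment and their Resilience2.pdf"

print([hex(ord(ch)) for ch in offending_characters(name)])
fixed = straighten_quotes(name)
print(fixed)
print(offending_characters(fixed))
```

Only the four common curly quotes are mapped here; any other character above 255 would still be reported by `offending_characters` and would need its own handling.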