Add notes for 2024-04-29

Alan Orth 2024-04-29 17:21:28 +03:00
parent 8f156a0365
commit e323c15e8b
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
39 changed files with 76 additions and 44 deletions

View File

@ -152,4 +152,18 @@ COPY 25666
- Spend some time looking at duplicate DOIs again...
## 2024-04-29
- Start working on the IFPRI 2020–2021 batch migration
- I modified my `check_duplicates.py` script to check for DOIs instead of titles, and use a similarity of 1.0 to make sure the match is exact
- I noticed something in the Tomcat log:
```console
tomcat9[690]: WARNING: The HTTP response header [Content-Disposition] with value [attachment; filename="Literature review on Women’s Empowerment and their Resilience2.pdf"] has been removed from the response because it is invalid
tomcat9[690]: java.lang.IllegalArgumentException: The Unicode character [’] at code point [8,217] cannot be encoded as it is outside the permitted range of 0 to 255
```
- I found the bitstream's ID and then used the `ds6_bitstream2itemhandle` [SQL helper function](https://wiki.lyrasis.org/display/DSPACE/Helper+SQL+functions+for+DSpace+6) to find the item's handle
- Then I replaced the curly quote with a regular quote in all bitstreams
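
A minimal sketch of the check and fix described above, assuming (as the log message states) that the header value may only contain characters with code points from 0 to 255. The function names `non_latin1_chars` and `straighten_quotes` are hypothetical, and this is not how DSpace or Tomcat implement the check; it only shows why code point 8,217 (U+2019, the right single quotation mark) is rejected and how a plain apostrophe avoids the problem:

```python
# Hypothetical helpers for illustration; not part of DSpace or Tomcat.

# Map curly single quotes to the ASCII apostrophe.
CURLY_SINGLE_QUOTES = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark (code point 8217)
}


def non_latin1_chars(filename: str) -> list[tuple[str, int]]:
    """Return (character, code point) pairs that fall outside 0 to 255."""
    return [(ch, ord(ch)) for ch in filename if ord(ch) > 255]


def straighten_quotes(filename: str) -> str:
    """Replace curly single quotes with a regular apostrophe."""
    return filename.translate(str.maketrans(CURLY_SINGLE_QUOTES))


if __name__ == "__main__":
    # Made-up file name containing U+2019
    name = "Women\u2019s report.pdf"
    print(non_latin1_chars(name))
    print(straighten_quotes(name))
```

Running `non_latin1_chars` over the bitstream names before a migration would flag any other file names that could trigger the same warning.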
<!-- vim: set sw=2 ts=2: -->

View File

@ -14,7 +14,7 @@ Work on CGSpace duplicate DOIs more
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2024-04/" />
<meta property="article:published_time" content="2024-04-04T10:23:00+03:00" />
<meta property="article:modified_time" content="2024-04-25T15:28:20+03:00" />
<meta property="article:modified_time" content="2024-04-27T11:22:58+03:00" />
@ -34,9 +34,9 @@ Work on CGSpace duplicate DOIs more
"@type": "BlogPosting",
"headline": "April, 2024",
"url": "https://alanorth.github.io/cgspace-notes/2024-04/",
"wordCount": "728",
"wordCount": "852",
"datePublished": "2024-04-04T10:23:00+03:00",
"dateModified": "2024-04-25T15:28:20+03:00",
"dateModified": "2024-04-27T11:22:58+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -264,6 +264,24 @@ curl -s -o /dev/null 0.01s user 0.01s system 0% cpu 4.764 total
<ul>
<li>Spend some time looking at duplicate DOIs again&hellip;</li>
</ul>
<h2 id="2024-04-29">2024-04-29</h2>
<ul>
<li>Start working on the IFPRI 2020–2021 batch migration
<ul>
<li>I modified my <code>check_duplicates.py</code> script to check for DOIs instead of titles, and use a similarity of 1.0 to make sure the match is exact</li>
</ul>
</li>
<li>I noticed something in the Tomcat log:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>tomcat9[690]: WARNING: The HTTP response header [Content-Disposition] with value [attachment; filename=&#34;Literature review on Women’s Empowerment and their Resilience2.pdf&#34;] has been removed from the response because it is invalid
</span></span><span style="display:flex;"><span>tomcat9[690]: java.lang.IllegalArgumentException: The Unicode character [’] at code point [8,217] cannot be encoded as it is outside the permitted range of 0 to 255
</span></span></code></pre></div><ul>
<li>I found the bitstream&rsquo;s ID and then used the <code>ds6_bitstream2itemhandle</code> <a href="https://wiki.lyrasis.org/display/DSPACE/Helper+SQL+functions+for+DSpace+6">SQL helper function</a> to find the item&rsquo;s handle
<ul>
<li>Then I replaced the curly quote with a regular quote in all bitstreams</li>
</ul>
</li>
</ul>
<!-- raw HTML omitted -->

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -6,7 +6,7 @@
<description>Recent content in Categories on CGSpace Notes</description>
<generator>Hugo</generator>
<language>en-us</language>
<lastBuildDate>Thu, 25 Apr 2024 15:28:20 +0300</lastBuildDate>
<lastBuildDate>Sat, 27 Apr 2024 11:22:58 +0300</lastBuildDate>
<atom:link href="https://alanorth.github.io/cgspace-notes/categories/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>Notes</title>

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -6,7 +6,7 @@
<description>Recent content in Notes on CGSpace Notes</description>
<generator>Hugo</generator>
<language>en-us</language>
<lastBuildDate>Thu, 25 Apr 2024 15:28:20 +0300</lastBuildDate>
<lastBuildDate>Sat, 27 Apr 2024 11:22:58 +0300</lastBuildDate>
<atom:link href="https://alanorth.github.io/cgspace-notes/categories/notes/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>April, 2024</title>

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -6,7 +6,7 @@
<description>Recent content on CGSpace Notes</description>
<generator>Hugo</generator>
<language>en-us</language>
<lastBuildDate>Thu, 25 Apr 2024 15:28:20 +0300</lastBuildDate>
<lastBuildDate>Sat, 27 Apr 2024 11:22:58 +0300</lastBuildDate>
<atom:link href="https://alanorth.github.io/cgspace-notes/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>April, 2024</title>

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -6,7 +6,7 @@
<description>Recent content in Posts on CGSpace Notes</description>
<generator>Hugo</generator>
<language>en-us</language>
<lastBuildDate>Thu, 25 Apr 2024 15:28:20 +0300</lastBuildDate>
<lastBuildDate>Sat, 27 Apr 2024 11:22:58 +0300</lastBuildDate>
<atom:link href="https://alanorth.github.io/cgspace-notes/posts/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>April, 2024</title>

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-25T15:28:20+03:00" />
<meta property="og:updated_time" content="2024-04-27T11:22:58+03:00" />

View File

@ -3,19 +3,19 @@
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<url>
<loc>https://alanorth.github.io/cgspace-notes/2024-04/</loc>
<lastmod>2024-04-25T15:28:20+03:00</lastmod>
<lastmod>2024-04-27T11:22:58+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
<lastmod>2024-04-25T15:28:20+03:00</lastmod>
<lastmod>2024-04-27T11:22:58+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2024-04-25T15:28:20+03:00</lastmod>
<lastmod>2024-04-27T11:22:58+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
<lastmod>2024-04-25T15:28:20+03:00</lastmod>
<lastmod>2024-04-27T11:22:58+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2024-04-25T15:28:20+03:00</lastmod>
<lastmod>2024-04-27T11:22:58+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2024-03/</loc>
<lastmod>2024-04-04T10:23:49+03:00</lastmod>