Add notes

Alan Orth 2024-04-16 09:35:30 +03:00
parent 281827944a
commit efd8eb7f79
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
39 changed files with 105 additions and 44 deletions

@ -25,5 +25,35 @@ categories: ["Notes"]
- Finish working on the 650 IFPRI 2022 records that were not already on CGSpace, then upload them
  - I need to merge the metadata for the remaining 212 that are already on CGSpace
- Spend some time looking at duplicate DOIs again...
## 2024-04-13
- Spend some time looking at duplicate DOIs again...
## 2024-04-14
- Spend some time looking at duplicate DOIs again...
## 2024-04-15
- Spend some time looking at duplicate DOIs again...
- Delete ~260 duplicate metadata values using the elaborate SQL and sort method I documented here: https://github.com/DSpace/DSpace/issues/8253#issuecomment-1331756418
- Tony noticed that the DSpace 7 REST API is very slow when embeds are requested, so I profiled it a bit:
```
$ time curl -s -o /dev/null 'https://cgspace.cgiar.org/server/api/discover/search/objects?query=cg.identifier.project%3AIFPRI*&scope=8f1e9650-fe87-4e6e-889a-1cacfb747408&page=0&size=100&embed=thumbnail,bundles/bitstreams&sort=dcterms.issued,desc'
curl -s -o /dev/null 0.01s user 0.01s system 0% cpu 47.515 total
$ time curl -s -o /dev/null 'https://cgspace.cgiar.org/server/api/discover/search/objects?query=cg.identifier.project%3AIFPRI*&scope=8f1e9650-fe87-4e6e-889a-1cacfb747408&page=0&size=100&sort=dcterms.issued,desc'
curl -s -o /dev/null 0.01s user 0.01s system 0% cpu 4.764 total
```
- Finalize processing the remaining 206 items from the IFPRI 2022 batch set that already existed on CGSpace
  - I merged metadata with the existing items
  - Six items remain that I identified as duplicates (3x2) within the IFPRI set itself
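- To put a number on the difference between the two timings above, here is a minimal sketch. The helper names `time_request` and `slowdown` are my own for illustration and are not part of DSpace or curl; `time_request` relies on curl's `--write-out` variable `%{time_total}` instead of the shell's `time`:

```shell
#!/bin/sh
# Hypothetical helper: print the total time (in seconds) curl spent on one request.
time_request() {
  curl -s -o /dev/null -w '%{time_total}' "$1"
}

# Hypothetical helper: ratio of two durations in seconds, to two decimal places.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
slowdown() {
  LC_ALL=C awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", slow / fast }'
}

# Example with live requests (not run here, the URLs are placeholders):
#   with_embeds=$(time_request "$url_with_embeds")
#   without_embeds=$(time_request "$url_without_embeds")
#   slowdown "$with_embeds" "$without_embeds"

# Using the two totals recorded above (47.515 s with embeds, 4.764 s without):
slowdown 47.515 4.764   # prints 9.97
```

- So in this single measurement the request with embeds took roughly ten times as long; one sample per variant is only indicative, not a benchmark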
## 2024-04-16
- Spend some time looking at duplicate DOIs again...
<!-- vim: set sw=2 ts=2: -->

@ -14,7 +14,7 @@ Work on CGSpace duplicate DOIs more
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2024-04/" />
<meta property="article:published_time" content="2024-04-04T10:23:00+03:00" />
<meta property="article:modified_time" content="2024-04-09T16:50:56+03:00" />
<meta property="article:modified_time" content="2024-04-12T20:40:52+03:00" />
@ -34,9 +34,9 @@ Work on CGSpace duplicate DOIs more
"@type": "BlogPosting",
"headline": "April, 2024",
"url": "https://alanorth.github.io/cgspace-notes/2024-04/",
"wordCount": "77",
"wordCount": "236",
"datePublished": "2024-04-04T10:23:00+03:00",
"dateModified": "2024-04-09T16:50:56+03:00",
"dateModified": "2024-04-12T20:40:52+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -137,6 +137,37 @@ Work on CGSpace duplicate DOIs more
<li>I need to merge the metadata for the remaining 212 that are already on CGSpace</li>
</ul>
</li>
<li>Spend some time looking at duplicate DOIs again&hellip;</li>
</ul>
<h2 id="2024-04-13">2024-04-13</h2>
<ul>
<li>Spend some time looking at duplicate DOIs again&hellip;</li>
</ul>
<h2 id="2024-04-14">2024-04-14</h2>
<ul>
<li>Spend some time looking at duplicate DOIs again&hellip;</li>
</ul>
<h2 id="2024-04-15">2024-04-15</h2>
<ul>
<li>Spend some time looking at duplicate DOIs again&hellip;</li>
<li>Delete ~260 duplicate metadata values using the elaborate SQL and sort method I documented here: <a href="https://github.com/DSpace/DSpace/issues/8253#issuecomment-1331756418">https://github.com/DSpace/DSpace/issues/8253#issuecomment-1331756418</a></li>
<li>Tony noticed that the DSpace 7 REST API is very slow when embeds are requested, so I profiled it a bit:</li>
</ul>
<pre tabindex="0"><code>$ time curl -s -o /dev/null &#39;https://cgspace.cgiar.org/server/api/discover/search/objects?query=cg.identifier.project%3AIFPRI*&amp;scope=8f1e9650-fe87-4e6e-889a-1cacfb747408&amp;page=0&amp;size=100&amp;embed=thumbnail,bundles/bitstreams&amp;sort=dcterms.issued,desc&#39;
curl -s -o /dev/null 0.01s user 0.01s system 0% cpu 47.515 total
$ time curl -s -o /dev/null &#39;https://cgspace.cgiar.org/server/api/discover/search/objects?query=cg.identifier.project%3AIFPRI*&amp;scope=8f1e9650-fe87-4e6e-889a-1cacfb747408&amp;page=0&amp;size=100&amp;sort=dcterms.issued,desc&#39;
curl -s -o /dev/null 0.01s user 0.01s system 0% cpu 4.764 total
</code></pre><ul>
<li>Finalize processing the remaining 206 items from the IFPRI 2022 batch set that already existed on CGSpace
<ul>
<li>I merged metadata with the existing items</li>
<li>Six items remain that I identified as duplicates (3x2) within the IFPRI set itself</li>
</ul>
</li>
</ul>
<h2 id="2024-04-16">2024-04-16</h2>
<ul>
<li>Spend some time looking at duplicate DOIs again&hellip;</li>
</ul>
<!-- raw HTML omitted -->

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -6,7 +6,7 @@
<description>Recent content in Categories on CGSpace Notes</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<lastBuildDate>Tue, 09 Apr 2024 16:50:56 +0300</lastBuildDate>
<lastBuildDate>Fri, 12 Apr 2024 20:40:52 +0300</lastBuildDate>
<atom:link href="https://alanorth.github.io/cgspace-notes/categories/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>Notes</title>

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -6,7 +6,7 @@
<description>Recent content in Notes on CGSpace Notes</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<lastBuildDate>Tue, 09 Apr 2024 16:50:56 +0300</lastBuildDate>
<lastBuildDate>Fri, 12 Apr 2024 20:40:52 +0300</lastBuildDate>
<atom:link href="https://alanorth.github.io/cgspace-notes/categories/notes/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>April, 2024</title>

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -6,7 +6,7 @@
<description>Recent content on CGSpace Notes</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<lastBuildDate>Tue, 09 Apr 2024 16:50:56 +0300</lastBuildDate>
<lastBuildDate>Fri, 12 Apr 2024 20:40:52 +0300</lastBuildDate>
<atom:link href="https://alanorth.github.io/cgspace-notes/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>April, 2024</title>

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -6,7 +6,7 @@
<description>Recent content in Posts on CGSpace Notes</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<lastBuildDate>Tue, 09 Apr 2024 16:50:56 +0300</lastBuildDate>
<lastBuildDate>Fri, 12 Apr 2024 20:40:52 +0300</lastBuildDate>
<atom:link href="https://alanorth.github.io/cgspace-notes/posts/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>April, 2024</title>

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-04-09T16:50:56+03:00" />
<meta property="og:updated_time" content="2024-04-12T20:40:52+03:00" />

@ -3,19 +3,19 @@
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<url>
<loc>https://alanorth.github.io/cgspace-notes/2024-04/</loc>
<lastmod>2024-04-09T16:50:56+03:00</lastmod>
<lastmod>2024-04-12T20:40:52+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
<lastmod>2024-04-09T16:50:56+03:00</lastmod>
<lastmod>2024-04-12T20:40:52+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2024-04-09T16:50:56+03:00</lastmod>
<lastmod>2024-04-12T20:40:52+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
<lastmod>2024-04-09T16:50:56+03:00</lastmod>
<lastmod>2024-04-12T20:40:52+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2024-04-09T16:50:56+03:00</lastmod>
<lastmod>2024-04-12T20:40:52+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2024-03/</loc>
<lastmod>2024-04-04T10:23:49+03:00</lastmod>