Add notes for 2022-07-08

This commit is contained in:
Alan Orth 2022-07-08 15:49:45 +03:00
parent 19715c3295
commit 11ce30438c
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
29 changed files with 154 additions and 36 deletions

View File

@ -118,4 +118,53 @@ UPDATE 104
- I will also have to remove "Academicians" from input-forms.xml
<!-- vim: set sw=2 ts=2: -->
## 2022-07-07
- Finalize lists of non-AGROVOC subjects in CGSpace that I started last week
- I used the [SQL helper functions](https://wiki.lyrasis.org/display/DSPACE/Helper+SQL+functions+for+DSpace+6) to find the collections where each term was used:
```console
localhost/dspace= ☘ SELECT DISTINCT(ds6_item2collectionhandle(dspace_object_id)) AS collection, COUNT(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND LOWER(text_value) = 'water demand' GROUP BY collection ORDER BY count DESC LIMIT 5;
collection │ count
─────────────┼───────
10568/36178 │ 56
10568/36185 │ 46
10568/36181 │ 35
10568/36188 │ 28
10568/36179 │ 21
(5 rows)
```
- For now I only did terms from my list that had 100 or more occurrences in CGSpace
- This leaves us with thirty-six terms that I will send to Sara Jani and Elizabeth Arnaud for evaluating possible inclusion to AGROVOC
- Write to some submitters from CIAT, Bioversity, and CCAFS to ask if they are still uploading new items with their legacy subject fields on CGSpace
- We want to remove them from the submission form to create space for new fields
- Update one term I noticed people using that was close to AGROVOC:
```console
dspace=# UPDATE metadatavalue SET text_value='development policies' WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=187 AND text_value='development policy';
UPDATE 108
```
- After contacting some editors I removed some old metadata fields from the submission form and browse indexes:
- Bioversity subject (`cg.subject.bioversity`)
- CCAFS phase 1 project tag (`cg.identifier.ccafsproject`)
- CIAT project tag (`cg.identifier.ciatproject`)
- CIAT subject (`cg.subject.ciat`)
- Work on cleaning and proofing forty-six AfricaRice items for CGSpace
- Last week we identified some duplicates so I removed those
- The data is of mediocre quality
- I've been fixing citations (nitpick), adding licenses, adding volume/issue/extent, fixing DOIs, and adding some AGROVOC subjects
- I even found titles that have typos, looking something like OCR errors...
## 2022-07-08
- Finalize the cleaning and proofing of AfricaRice records
- I found two suspicious items that claim to have been published but I can't find in the respective journals, so I removed those
- I uploaded the forty-four items to [DSpace Test](https://dspacetest.cgiar.org/handle/10568/119135)
- Margarita from CCAFS said they are no longer using the CCAFS subject or CCAFS phase 2 project tag
- I removed these from the input-form.xml and Discovery facets:
- cg.identifier.ccafsprojectpii
- cg.subject.cifor
- For now we will keep them in the search filters

View File

@ -19,7 +19,7 @@ Also, the trgm functions I&rsquo;ve used before are case insensitive, but Levens
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2022-07/" />
<meta property="article:published_time" content="2022-07-02T14:07:36+03:00" />
<meta property="article:modified_time" content="2022-07-04T22:10:02+03:00" />
<meta property="article:modified_time" content="2022-07-07T10:02:04+03:00" />
@ -44,9 +44,9 @@ Also, the trgm functions I&rsquo;ve used before are case insensitive, but Levens
"@type": "BlogPosting",
"headline": "July, 2022",
"url": "https://alanorth.github.io/cgspace-notes/2022-07/",
"wordCount": "739",
"wordCount": "1095",
"datePublished": "2022-07-02T14:07:36+03:00",
"dateModified": "2022-07-04T22:10:02+03:00",
"dateModified": "2022-07-07T10:02:04+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -246,7 +246,76 @@ Also, the trgm functions I&rsquo;ve used before are case insensitive, but Levens
</span></span></code></pre></div><ul>
<li>I will also have to remove &ldquo;Academicians&rdquo; from input-forms.xml</li>
</ul>
<!-- raw HTML omitted -->
<h2 id="2022-07-07">2022-07-07</h2>
<ul>
<li>Finalize lists of non-AGROVOC subjects in CGSpace that I started last week
<ul>
<li>I used the <a href="https://wiki.lyrasis.org/display/DSPACE/Helper+SQL+functions+for+DSpace+6">SQL helper functions</a> to find the collections where each term was used:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>localhost/dspace= ☘ SELECT DISTINCT(ds6_item2collectionhandle(dspace_object_id)) AS collection, COUNT(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND LOWER(text_value) = &#39;water demand&#39; GROUP BY collection ORDER BY count DESC LIMIT 5;
</span></span><span style="display:flex;"><span> collection │ count
</span></span><span style="display:flex;"><span>─────────────┼───────
</span></span><span style="display:flex;"><span> 10568/36178 │ 56
</span></span><span style="display:flex;"><span> 10568/36185 │ 46
</span></span><span style="display:flex;"><span> 10568/36181 │ 35
</span></span><span style="display:flex;"><span> 10568/36188 │ 28
</span></span><span style="display:flex;"><span> 10568/36179 │ 21
</span></span><span style="display:flex;"><span>(5 rows)
</span></span></code></pre></div><ul>
<li>For now I only did terms from my list that had 100 or more occurrences in CGSpace
<ul>
<li>This leaves us with thirty-six terms that I will send to Sara Jani and Elizabeth Arnaud for evaluating possible inclusion to AGROVOC</li>
</ul>
</li>
<li>Write to some submitters from CIAT, Bioversity, and CCAFS to ask if they are still uploading new items with their legacy subject fields on CGSpace
<ul>
<li>We want to remove them from the submission form to create space for new fields</li>
</ul>
</li>
<li>Update one term I noticed people using that was close to AGROVOC:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>dspace=# UPDATE metadatavalue SET text_value=&#39;development policies&#39; WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=187 AND text_value=&#39;development policy&#39;;
</span></span><span style="display:flex;"><span>UPDATE 108
</span></span></code></pre></div><ul>
<li>After contacting some editors I removed some old metadata fields from the submission form and browse indexes:
<ul>
<li>Bioversity subject (<code>cg.subject.bioversity</code>)</li>
<li>CCAFS phase 1 project tag (<code>cg.identifier.ccafsproject</code>)</li>
<li>CIAT project tag (<code>cg.identifier.ciatproject</code>)</li>
<li>CIAT subject (<code>cg.subject.ciat</code>)</li>
</ul>
</li>
<li>Work on cleaning and proofing forty-six AfricaRice items for CGSpace
<ul>
<li>Last week we identified some duplicates so I removed those</li>
<li>The data is of mediocre quality</li>
<li>I&rsquo;ve been fixing citations (nitpick), adding licenses, adding volume/issue/extent, fixing DOIs, and adding some AGROVOC subjects</li>
<li>I even found titles that have typos, looking something like OCR errors&hellip;</li>
</ul>
</li>
</ul>
<h2 id="2022-07-08">2022-07-08</h2>
<ul>
<li>Finalize the cleaning and proofing of AfricaRice records
<ul>
<li>I found two suspicious items that claim to have been published but I can&rsquo;t find in the respective journals, so I removed those</li>
<li>I uploaded the forty-four items to <a href="https://dspacetest.cgiar.org/handle/10568/119135">DSpace Test</a></li>
</ul>
</li>
<li>Margarita from CCAFS said they are no longer using the CCAFS subject or CCAFS phase 2 project tag
<ul>
<li>I removed these from the input-form.xml and Discovery facets:
<ul>
<li>cg.identifier.ccafsprojectpii</li>
<li>cg.subject.cifor</li>
</ul>
</li>
<li>For now we will keep them in the search filters</li>
</ul>
</li>
</ul>

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-07-04T22:10:02+03:00" />
<meta property="og:updated_time" content="2022-07-07T10:02:04+03:00" />

View File

@ -3,19 +3,19 @@
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<url>
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
<lastmod>2022-07-04T22:10:02+03:00</lastmod>
<lastmod>2022-07-07T10:02:04+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2022-07-04T22:10:02+03:00</lastmod>
<lastmod>2022-07-07T10:02:04+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2022-07/</loc>
<lastmod>2022-07-04T22:10:02+03:00</lastmod>
<lastmod>2022-07-07T10:02:04+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
<lastmod>2022-07-04T22:10:02+03:00</lastmod>
<lastmod>2022-07-07T10:02:04+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2022-07-04T22:10:02+03:00</lastmod>
<lastmod>2022-07-07T10:02:04+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2022-06/</loc>
<lastmod>2022-07-04T09:25:14+03:00</lastmod>