Add notes for 2020-09-17

Alan Orth 2020-09-17 15:33:37 +03:00
parent 079843e677
commit c50061dd09
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
2 changed files with 88 additions and 1 deletions

@ -251,4 +251,58 @@ $ ~/dspace/bin/dspace import -a -e y.arrr@cgiar.org -m /tmp/2020-09-15-cip-annua
- Then I uploaded them to CGSpace
## 2020-09-16
- Looking further into Carlos Tejos's question about integrating LandVoc (the AGROVOC subset) into DSpace
- I see that you can actually get LandVoc concepts directly from AGROVOC's SPARQL, for example with [this query](http://agrovoc.uniroma2.it/sparql#query=PREFIX+rdfs%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0APREFIX+skos%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2004%2F02%2Fskos%2Fcore%23%3E%0A%0ASELECT+%3Fconcept%0AWHERE+%7B%0A++%3Fconcept+a+skos%3AConcept+%3B%0A+++++++++++skos%3AinScheme+%3Chttp%3A%2F%2Flandvoc.org%2Flandvoc%3E+.%0A%0A%7D+ORDER+BY+%3Fconcept&contentTypeConstruct=text%2Fturtle&contentTypeSelect=application%2Fsparql-results%2Bjson&endpoint=http%3A%2F%2Fagrovoc.uniroma2.it%2Fsparql&requestMethod=POST&tabTitle=Query&headers=%7B%7D&outputFormat=table)
![AGROVOC LandVoc SPARQL](/cgspace-notes/2020/09/agrovoc-landvoc-sparql.png)
- So maybe we can query AGROVOC directly using a similar method to [DSpace-CRIS's GettyAuthority](https://github.com/4Science/DSpace/blob/dspace-5_x_x-cris/dspace-api/src/main/java/org/dspace/content/authority/TGNAuthority.java)
- I wired up DSpace-CRIS's VIAFAuthority to see how authorities for auto-suggested names get stored
- After submission you can see the item's VIAF identifier:
![VIAF authority](/cgspace-notes/2020/09/viaf-authority.png)
- And this identifier is the ID on VIAF, pretty cool!
![VIAF entry for Charles Darwin](/cgspace-notes/2020/09/viaf-darwin.png)
- I did a similar test with the Getty Thesaurus of Geographic Names (TGN) and it stores the concept URI in the authority:
![TGNAuthority](/cgspace-notes/2020/09/tgn-concept-uri.png)
- But the authority values are not exposed anywhere as metadata...
- I need to play with it a bit more I guess...
- The nice thing is that the Getty example from DSpace-CRIS uses SPARQL as well, and the TGN authority extends it
- We could use a similar model for AGROVOC/LandVoc very easily
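- As a rough sketch of what such a lookup could look like (this is not taken from DSpace-CRIS; the function names are hypothetical, and the query is the one encoded in the link above, minus its unused `rdfs` prefix), the request can be built with Python's standard library:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint taken from the SPARQL link in the notes above
ENDPOINT = "http://agrovoc.uniroma2.it/sparql"

# All concepts that belong to the LandVoc scheme
LANDVOC_QUERY = """\
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?concept
WHERE {
  ?concept a skos:Concept ;
           skos:inScheme <http://landvoc.org/landvoc> .
} ORDER BY ?concept
"""


def build_landvoc_request(query=LANDVOC_QUERY):
    """Build (but do not send) a form-encoded POST request for the query."""
    body = urlencode({"query": query}).encode("utf-8")
    return Request(
        ENDPOINT,
        data=body,
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def concept_uris(results):
    """Extract concept URIs from a parsed SPARQL JSON results document."""
    return [b["concept"]["value"] for b in results["results"]["bindings"]]
```

- Sending it would be a matter of passing the request to `urllib.request.urlopen()` and feeding the parsed JSON response to `concept_uris()`; I have not checked how the endpoint behaves under load or whether it needs paging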
## 2020-09-17
- Maria from Bioversity asked about the ORCID identifier for one of her colleagues that seems to have been removed from our list
- I re-added it to our controlled vocabulary and added the identifier to fifty-one of his existing items on CGSpace using my script:
```
$ cat 2020-09-17-add-bioversity-orcids.csv
dc.contributor.author,cg.creator.id
"Etten, Jacob van","Jacob van Etten: 0000-0001-7554-2558"
"van Etten, Jacob","Jacob van Etten: 0000-0001-7554-2558"
$ ./add-orcid-identifiers-csv.py -i 2020-09-17-add-bioversity-orcids.csv -db dspace -u dspace -p 'dom@in34sniper'
```
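- A small sanity check that could be run on such a CSV before feeding it to the script (a sketch only; it is not part of `add-orcid-identifiers-csv.py`, and the helper names are made up), verifying the `Name: 0000-0000-0000-0000` shape and the ORCID check digit (ISO 7064 MOD 11-2):

```python
import csv
import io
import re

# "Display Name: 0000-0000-0000-000X", as in the cg.creator.id column above
ORCID_FIELD = re.compile(r"^(?P<name>.+): (?P<orcid>\d{4}-\d{4}-\d{4}-\d{3}[\dX])$")


def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def invalid_rows(csv_text):
    """Return the rows whose cg.creator.id is malformed or fails the checksum."""
    bad = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        match = ORCID_FIELD.match(row["cg.creator.id"])
        if match is None:
            bad.append(row)
            continue
        digits = match.group("orcid").replace("-", "")
        if orcid_check_digit(digits[:15]) != digits[15]:
            bad.append(row)
    return bad
```

- An empty list from `invalid_rows()` means every row at least looks like a well-formed ORCID identifier; it says nothing about whether the identifier belongs to the right person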
- I sent a follow-up message to Atmire to look into the two remaining issues with the DSpace 6 upgrade
- First is the fact that we have zero results in our Listings and Reports, for any search
- Second is the error we get during CSV imports
- Helped Natalia and Cathy from Bioversity-CIAT with their OpenSearch query on "trade offs" again
- They wanted to build a search query with multiple filters (type, crpsubject, status) and the general query "trade offs"
- I found a great [reference for DSpace's OpenSearch syntax](https://www.kiwi.fi/pages/viewpage.action?pageId=45782169) (albeit in Finnish, but the example URLs show the syntax clearly)
- We can use quotes and `AND` and `OR` and even group search parameters with parentheses!
- So now I built a query for Natalia which uses these (showing without URL encoding so you can see the syntax):
```
https://cgspace.cgiar.org/open-search/discover?query=type:"Journal Article" AND status:"Open Access" AND crpsubject:"Water, Land and Ecosystems" AND "tradeoffs"&rpp=100
```
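- Since the URL above is shown without encoding, here is a minimal sketch of how the encoded form could be produced with Python's `urllib.parse` (the `build_opensearch_url` helper is hypothetical and only handles the simple all-`AND` case used here):

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://cgspace.cgiar.org/open-search/discover"


def build_opensearch_url(filters, text, rpp=100):
    """Join field:"value" filters and a quoted free-text term with AND."""
    clauses = [f'{field}:"{value}"' for field, value in filters.items()]
    clauses.append(f'"{text}"')
    query = " AND ".join(clauses)
    # quote_via=quote encodes spaces as %20 rather than "+"
    params = urlencode({"query": query, "rpp": rpp}, quote_via=quote)
    return f"{BASE_URL}?{params}"


url = build_opensearch_url(
    {
        "type": "Journal Article",
        "status": "Open Access",
        "crpsubject": "Water, Land and Ecosystems",
    },
    "tradeoffs",
)
```

- Quotes, colons, commas, and spaces in the query all come out percent-encoded, so the result can be pasted into a browser or passed to a script as-is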
- I noticed that my `move-collections.sh` script didn't work on DSpace 6 because of the change from IDs to UUIDs, so I modified it to quote the collection `resource_id` parameters in the PostgreSQL query
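- To illustrate why the quoting matters: integer IDs can appear bare in SQL, but a UUID has to be written as a quoted literal. A sketch in Python (not the actual contents of `move-collections.sh`; the `community2collection` table and its column names are my assumption about the DSpace 6 schema, and the helper names are hypothetical):

```python
import uuid


def quoted_uuid(value):
    """Validate a UUID string and return it as a single-quoted SQL literal.

    uuid.UUID() raises ValueError for anything that is not a UUID, so only
    the canonical hyphenated form can end up inside the quotes.
    """
    return f"'{uuid.UUID(value)}'"


def build_move_query(collection_id, new_community_id):
    """Build an UPDATE that re-parents a collection (assumed table layout)."""
    return (
        "UPDATE community2collection "
        f"SET community_id={quoted_uuid(new_community_id)} "
        f"WHERE collection_id={quoted_uuid(collection_id)};"
    )
```

- Passing an old-style integer ID such as `1234` raises a `ValueError` instead of producing a query that PostgreSQL would reject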
<!-- vim: set sw=2 ts=2: -->

@ -55,7 +55,7 @@ I filed an issue on OpenRXV to make some minor edits to the admin UI: https://gi
"@type": "BlogPosting",
"headline": "September, 2020",
"url": "https://alanorth.github.io/cgspace-notes/2020-09/",
"wordCount": "1911", "wordCount": "2161",
"datePublished": "2020-09-02T15:35:54+03:00",
"dateModified": "2020-09-15T17:32:29+03:00",
"author": {
@ -464,6 +464,39 @@ Would fix 3 occurences of: SOUTHWEST ASIA
</ul>
</li>
</ul>
<h2 id="2020-09-17">2020-09-17</h2>
<ul>
<li>Maria from Bioversity asked about the ORCID identifier for one of her colleagues that seems to have been removed from our list
<ul>
<li>I re-added it to our controlled vocabulary and added the identifier to fifty-one of his existing items on CGSpace using my script:</li>
</ul>
</li>
</ul>
<pre><code>$ cat 2020-09-17-add-bioversity-orcids.csv
dc.contributor.author,cg.creator.id
&quot;Etten, Jacob van&quot;,&quot;Jacob van Etten: 0000-0001-7554-2558&quot;
&quot;van Etten, Jacob&quot;,&quot;Jacob van Etten: 0000-0001-7554-2558&quot;
$ ./add-orcid-identifiers-csv.py -i 2020-09-17-add-bioversity-orcids.csv -db dspace -u dspace -p 'dom@in34sniper'
</code></pre><ul>
<li>I sent a follow-up message to Atmire to look into the two remaining issues with the DSpace 6 upgrade
<ul>
<li>First is the fact that we have zero results in our Listings and Reports, for any search</li>
<li>Second is the error we get during CSV imports</li>
</ul>
</li>
<li>Helped Natalia and Cathy from Bioversity-CIAT with their OpenSearch query on &ldquo;trade offs&rdquo; again
<ul>
<li>They wanted to build a search query with multiple filters (type, crpsubject, status) and the general query &ldquo;trade offs&rdquo;</li>
<li>I found a great <a href="https://www.kiwi.fi/pages/viewpage.action?pageId=45782169">reference for DSpace&rsquo;s OpenSearch syntax</a> (albeit in Finnish, but the example URLs show the syntax clearly)</li>
<li>We can use quotes and <code>AND</code> and <code>OR</code> and even group search parameters with parentheses!</li>
<li>So now I built a query for Natalia which uses these (showing without URL encoding so you can see the syntax):</li>
</ul>
</li>
</ul>
<pre><code>https://cgspace.cgiar.org/open-search/discover?query=type:&quot;Journal Article&quot; AND status:&quot;Open Access&quot; AND crpsubject:&quot;Water, Land and Ecosystems&quot; AND &quot;tradeoffs&quot;&amp;rpp=100
</code></pre><ul>
<li>I noticed that my <code>move-collections.sh</code> script didn&rsquo;t work on DSpace 6 because of the change from IDs to UUIDs, so I modified it to quote the collection <code>resource_id</code> parameters in the PostgreSQL query</li>
</ul>
<!-- raw HTML omitted -->