From c50061dd0963335563c1639b60dd4e17ba66e688 Mon Sep 17 00:00:00 2001
From: Alan Orth
Date: Thu, 17 Sep 2020 15:33:37 +0300
Subject: [PATCH] Add notes for 2020-09-17

---
 content/posts/2020-09.md | 54 ++++++++++++++++++++++++++++++++++++++++
 docs/2020-09/index.html  | 35 +++++++++++++++++++++++++-
 2 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/content/posts/2020-09.md b/content/posts/2020-09.md
index d5feaf2ca..2383cda9c 100644
--- a/content/posts/2020-09.md
+++ b/content/posts/2020-09.md
@@ -251,4 +251,58 @@ $ ~/dspace/bin/dspace import -a -e y.arrr@cgiar.org -m /tmp/2020-09-15-cip-annua
 - Then I uploaded them to CGSpace
 
+## 2020-09-16
+
+- Looking further into Carlos Tejos's question about integrating LandVoc (the AGROVOC subset) into DSpace
+  - I see that you can actually get LandVoc concepts directly from AGROVOC's SPARQL, for example with [this query](http://agrovoc.uniroma2.it/sparql#query=PREFIX+rdfs%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0APREFIX+skos%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2004%2F02%2Fskos%2Fcore%23%3E%0A%0ASELECT+%3Fconcept%0AWHERE+%7B%0A++%3Fconcept+a+skos%3AConcept+%3B%0A+++++++++++skos%3AinScheme+%3Chttp%3A%2F%2Flandvoc.org%2Flandvoc%3E+.%0A%0A%7D+ORDER+BY+%3Fconcept&contentTypeConstruct=text%2Fturtle&contentTypeSelect=application%2Fsparql-results%2Bjson&endpoint=http%3A%2F%2Fagrovoc.uniroma2.it%2Fsparql&requestMethod=POST&tabTitle=Query&headers=%7B%7D&outputFormat=table)
+
+![AGROVOC LandVoc SPARQL](/cgspace-notes/2020/09/agrovoc-landvoc-sparql.png)
+
+- So maybe we can query AGROVOC directly using a similar method to [DSpace-CRIS's GettyAuthority](https://github.com/4Science/DSpace/blob/dspace-5_x_x-cris/dspace-api/src/main/java/org/dspace/content/authority/TGNAuthority.java)
+- I wired up DSpace-CRIS's VIAFAuthority to see how authorities for auto suggested names get stored
+  - After submission you can see the item's VIAF identifier:
+
+![VIAF authority](/cgspace-notes/2020/09/viaf-authority.png)
+
+- And this identifier is the ID on VIAF, pretty cool!
+
+![VIAF entry for Charles Darwin](/cgspace-notes/2020/09/viaf-darwin.png)
+
+- I did a similar test with the Getty Thesaurus of Geographic Names (TGN) and it stores the concept URI in the authority:
+
+![TGNAuthority](/cgspace-notes/2020/09/tgn-concept-uri.png)
+
+- But the authority values are not exposed anywhere as metadata...
+  - I need to play with it a bit more I guess...
+- The nice thing is that the Getty example from DSpace-CRIS uses SPARQL as well, and the TGN authority extends it
+  - We could use a similar model for AGROVOC/LandVoc very easily
+
+## 2020-09-17
+
+- Maria from Bioversity asked about the ORCID identifier for one of her colleagues that seems to have been removed from our list
+  - I re-added it to our controlled vocabulary and added the identifier to fifty-one of his existing items on CGSpace using my script:
+
+```
+$ cat 2020-09-17-add-bioversity-orcids.csv
+dc.contributor.author,cg.creator.id
+"Etten, Jacob van","Jacob van Etten: 0000-0001-7554-2558"
+"van Etten, Jacob","Jacob van Etten: 0000-0001-7554-2558"
+$ ./add-orcid-identifiers-csv.py -i 2020-09-17-add-bioversity-orcids.csv -db dspace -u dspace -p 'dom@in34sniper'
+```
+
+- I sent a follow-up message to Atmire to look into the two remaining issues with the DSpace 6 upgrade
+  - First is the fact that we have zero results in our Listings and Reports, for any search
+  - Second is the error we get during CSV imports
+- Help Natalia and Cathy from Bioversity-CIAT with their OpenSearch query on "trade offs" again
+  - They wanted to build a search query with multiple filters (type, crpsubject, status) and the general query "trade offs"
+  - I found a great [reference for DSpace's OpenSearch syntax](https://www.kiwi.fi/pages/viewpage.action?pageId=45782169) (albeit in Finnish, but the example URLs show the syntax clearly)
+  - We can use quotes and `AND` and `OR` and even group search parameters with parentheses!
+  - So now I built a query for Natalia which uses these (showing without URL encoding so you can see the syntax):
+
+```
+https://cgspace.cgiar.org/open-search/discover?query=type:"Journal Article" AND status:"Open Access" AND crpsubject:"Water, Land and Ecosystems" AND "tradeoffs"&rpp=100
+```
+
+- I noticed that my `move-collections.sh` script didn't work on DSpace 6 because of the change from IDs to UUIDs, so I modified it to quote the collection `resource_id` parameters in the PostgreSQL query
+
diff --git a/docs/2020-09/index.html b/docs/2020-09/index.html
index 56d78ea6f..f3260df5c 100644
--- a/docs/2020-09/index.html
+++ b/docs/2020-09/index.html
@@ -55,7 +55,7 @@ I filed an issue on OpenRXV to make some minor edits to the admin UI: https://gi
  "@type": "BlogPosting",
  "headline": "September, 2020",
  "url": "https://alanorth.github.io/cgspace-notes/2020-09/",
- "wordCount": "1911",
+ "wordCount": "2161",
  "datePublished": "2020-09-02T15:35:54+03:00",
  "dateModified": "2020-09-15T17:32:29+03:00",
  "author": {
@@ -464,6 +464,39 @@ Would fix 3 occurences of: SOUTHWEST ASIA
+

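The OpenSearch query in the 2020-09-17 notes is shown without URL encoding so the syntax stays readable. A minimal sketch of how the same query could be encoded before it is sent, using only Python's standard library. The helper name `build_opensearch_url` and its parameters are hypothetical and are not part of DSpace or of any script mentioned in these notes; only the base URL, the field names, and the values come from the query above.

```python
from urllib.parse import quote, urlencode

# Base URL taken from the query in the notes
BASE_URL = "https://cgspace.cgiar.org/open-search/discover"


def build_opensearch_url(filters, phrase, rpp=100):
    """Hypothetical helper: join field:"value" filters and a quoted phrase with AND."""
    clauses = [f'{field}:"{value}"' for field, value in filters]
    clauses.append(f'"{phrase}"')
    query = " AND ".join(clauses)
    # quote_via=quote encodes spaces as %20 rather than "+"
    params = urlencode({"query": query, "rpp": rpp}, quote_via=quote)
    return f"{BASE_URL}?{params}"


url = build_opensearch_url(
    [
        ("type", "Journal Article"),
        ("status", "Open Access"),
        ("crpsubject", "Water, Land and Ecosystems"),
    ],
    "tradeoffs",
)
print(url)
```

Decoding the result with `urllib.parse.parse_qs` should give back the readable query string, which is a quick way to check that the quotes and `AND` operators survived the encoding.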