diff --git a/content/posts/2021-08.md b/content/posts/2021-08.md index f6780c973..fd4a00bc9 100644 --- a/content/posts/2021-08.md +++ b/content/posts/2021-08.md @@ -294,5 +294,54 @@ $ dspace community-filiator --set --parent=10568/114644 --child=10568/76451 - I made an initial attempt on the policy statements page on DSpace Test - It is modeled on Sherpa Romeo's OpenDOAR policy statements advice +- Sit with Moayad and discuss the future of AReS + - We specifically discussed formalizing the API and documenting its use as an alternative to harvesting directly from CGSpace + - We also discussed supporting links to search results to enable something like "Explore this collection" links on CGSpace collection pages +- Lower case all AGROVOC metadata, as I had noticed a few in sentence case: + +```console +dspace=# UPDATE metadatavalue SET text_value=LOWER(text_value) WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=187 AND text_value ~ '[[:upper:]]'; +UPDATE 484 +``` + +- Also update some DOIs using the `dx.doi.org` format, just to keep things uniform: + +```console +dspace=# UPDATE metadatavalue SET text_value = regexp_replace(text_value, 'https://dx.doi.org', 'https://doi.org') WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 220 AND text_value LIKE 'https://dx.doi.org%'; +UPDATE 469 +``` + +- Then start a full Discovery re-indexing to update the Feed the Future community item counts that have been stuck at 0 since we moved the three projects into a subcommunity a few days ago: + +```console +$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b + +real 322m16.917s +user 226m43.121s +sys 3m17.469s +``` + +- I learned how to use the OpenRXV API, which is just a thin wrapper around Elasticsearch: + +```console +$ curl -X POST 'https://cgspace.cgiar.org/explorer/api/search?scroll=1d' \ + -H 'Content-Type: application/json' \ + -d '{ + "size": 10, + "query": { + "bool": { + "filter": { + "term": { + 
"repo.keyword": "CGSpace" + } + } + } + } +}' +$ curl -X POST 'https://cgspace.cgiar.org/explorer/api/search/scroll/DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAASekWMTRwZ3lEMkVRYUtKZjgyMno4dV9CUQ==' +``` + +- This uses the Elasticsearch scroll ID to page through results + - The second query doesn't need the request body because the search context is kept alive for 1 day (`scroll=1d`) as part of the first request diff --git a/docs/2021-08/index.html b/docs/2021-08/index.html index 1caf051d0..e86bd65df 100644 --- a/docs/2021-08/index.html +++ b/docs/2021-08/index.html @@ -18,7 +18,7 @@ I decided to upgrade linode20 from Ubuntu 18.04 to 20.04 - + @@ -42,9 +42,9 @@ I decided to upgrade linode20 from Ubuntu 18.04 to 20.04 "@type": "BlogPosting", "headline": "August, 2021", "url": "https://alanorth.github.io/cgspace-notes/2021-08/", - "wordCount": "2255", + "wordCount": "2512", "datePublished": "2021-08-01T09:01:07+03:00", - "dateModified": "2021-08-16T21:35:44+03:00", + "dateModified": "2021-08-17T10:59:14+03:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -438,6 +438,53 @@ $ dspace community-filiator --set --parent=10568/114644 --child=10568/76451
  • It is modeled on Sherpa Romeo’s OpenDOAR policy statements advice
  • Sit with Moayad and discuss the future of AReS + +
  • Lower case all AGROVOC metadata, as I had noticed a few in sentence case:
    dspace=# UPDATE metadatavalue SET text_value=LOWER(text_value) WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=187 AND text_value ~ '[[:upper:]]';
    +UPDATE 484
    +
    +
    dspace=# UPDATE metadatavalue SET text_value = regexp_replace(text_value, 'https://dx.doi.org', 'https://doi.org') WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 220 AND text_value LIKE 'https://dx.doi.org%';
    +UPDATE 469
    +
    +
    $ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
    +
    +real    322m16.917s
    +user    226m43.121s
    +sys     3m17.469s
    +
    +
    $ curl -X POST 'https://cgspace.cgiar.org/explorer/api/search?scroll=1d' \
    +    -H 'Content-Type: application/json' \
    +    -d '{
    +    "size": 10,
    +    "query": {
    +        "bool": {
    +            "filter": {
    +                "term": {
    +                    "repo.keyword": "CGSpace"
    +                }
    +            }
    +        }
    +    }
    +}'
    +$ curl -X POST 'https://cgspace.cgiar.org/explorer/api/search/scroll/DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAASekWMTRwZ3lEMkVRYUtKZjgyMno4dV9CUQ=='
    +
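The two curl requests above can be chained into a loop that pages through every result. What follows is a minimal sketch in Python, not part of the original notes. The URL paths and request body are taken from the curl commands; the response shape (`_scroll_id` and `hits.hits`) is an assumption based on the note that the API is a thin wrapper around Elasticsearch, and the names `build_query`, `http_post`, and `scroll_hits` are hypothetical.

```python
import json
from typing import Callable, Iterator, Optional
from urllib import request

# Endpoint taken from the curl commands in the notes
BASE_URL = "https://cgspace.cgiar.org/explorer/api/search"


def build_query(repo: str = "CGSpace", size: int = 10) -> dict:
    """Build the same request body as the first curl command."""
    return {
        "size": size,
        "query": {"bool": {"filter": {"term": {"repo.keyword": repo}}}},
    }


def http_post(url: str, body: Optional[dict]) -> dict:
    """POST an optional JSON body and decode the JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = request.Request(
        url,
        data=data,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


def scroll_hits(
    post: Callable[[str, Optional[dict]], dict] = http_post,
    query: Optional[dict] = None,
    keep_alive: str = "1d",
) -> Iterator[dict]:
    """Yield every hit, following the scroll ID until a page comes back empty.

    Assumption: responses look like Elasticsearch's own, with a top-level
    "_scroll_id" and the documents under "hits" -> "hits".
    """
    # First request carries the query and opens the scroll context
    page = post(f"{BASE_URL}?scroll={keep_alive}", query or build_query())
    while True:
        hits = page.get("hits", {}).get("hits", [])
        if not hits:
            return
        yield from hits
        scroll_id = page.get("_scroll_id")
        if not scroll_id:
            return
        # Follow-up requests send only the scroll ID, with no request body
        page = post(f"{BASE_URL}/scroll/{scroll_id}", None)
```

Iterating with `for hit in scroll_hits(): ...` would issue the same sequence of requests as the curl commands. Passing the HTTP function in as `post` keeps the paging logic testable without touching the live server.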
    diff --git a/docs/categories/index.html b/docs/categories/index.html index 864414b03..bc7ccd702 100644 --- a/docs/categories/index.html +++ b/docs/categories/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/index.html b/docs/categories/notes/index.html index cff546d80..f16967d52 100644 --- a/docs/categories/notes/index.html +++ b/docs/categories/notes/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/2/index.html b/docs/categories/notes/page/2/index.html index 7a4bf84ce..02623c2c7 100644 --- a/docs/categories/notes/page/2/index.html +++ b/docs/categories/notes/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/3/index.html b/docs/categories/notes/page/3/index.html index c2feac870..c9b4a2ed1 100644 --- a/docs/categories/notes/page/3/index.html +++ b/docs/categories/notes/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/4/index.html b/docs/categories/notes/page/4/index.html index 42bdcf362..822cff75a 100644 --- a/docs/categories/notes/page/4/index.html +++ b/docs/categories/notes/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/5/index.html b/docs/categories/notes/page/5/index.html index b55b46a7b..fc2c3ffa9 100644 --- a/docs/categories/notes/page/5/index.html +++ b/docs/categories/notes/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/index.html b/docs/index.html index 0b4371c10..974e860a7 100644 --- a/docs/index.html +++ b/docs/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/2/index.html b/docs/page/2/index.html index 7ab3a8d9f..0845eacf1 100644 --- a/docs/page/2/index.html +++ b/docs/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/3/index.html b/docs/page/3/index.html index 12865591e..8c51526ff 100644 --- a/docs/page/3/index.html +++ b/docs/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/4/index.html b/docs/page/4/index.html index 99be038df..cf09c8732 100644 --- 
a/docs/page/4/index.html +++ b/docs/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/5/index.html b/docs/page/5/index.html index 988871abf..d144afaf8 100644 --- a/docs/page/5/index.html +++ b/docs/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/6/index.html b/docs/page/6/index.html index 1439c94a6..52b8b3a3c 100644 --- a/docs/page/6/index.html +++ b/docs/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/7/index.html b/docs/page/7/index.html index 5c3e06c03..533521b28 100644 --- a/docs/page/7/index.html +++ b/docs/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/8/index.html b/docs/page/8/index.html index 87869da24..41364f8ef 100644 --- a/docs/page/8/index.html +++ b/docs/page/8/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/index.html b/docs/posts/index.html index 4f4e892f6..27ba0c63a 100644 --- a/docs/posts/index.html +++ b/docs/posts/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/2/index.html b/docs/posts/page/2/index.html index 45069a36f..b1b322367 100644 --- a/docs/posts/page/2/index.html +++ b/docs/posts/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/3/index.html b/docs/posts/page/3/index.html index 0a792f556..3ecfb7945 100644 --- a/docs/posts/page/3/index.html +++ b/docs/posts/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/4/index.html b/docs/posts/page/4/index.html index 4a4e03383..254ca2509 100644 --- a/docs/posts/page/4/index.html +++ b/docs/posts/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/5/index.html b/docs/posts/page/5/index.html index f3004bdd2..36791dd7a 100644 --- a/docs/posts/page/5/index.html +++ b/docs/posts/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/6/index.html b/docs/posts/page/6/index.html index 3fffe4a72..9279b7b32 100644 --- a/docs/posts/page/6/index.html +++ b/docs/posts/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/7/index.html 
b/docs/posts/page/7/index.html index ccc618b93..6675f162c 100644 --- a/docs/posts/page/7/index.html +++ b/docs/posts/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/8/index.html b/docs/posts/page/8/index.html index f3ddcf3fe..ceba23947 100644 --- a/docs/posts/page/8/index.html +++ b/docs/posts/page/8/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/sitemap.xml b/docs/sitemap.xml index 006e78c25..7299f9eda 100644 --- a/docs/sitemap.xml +++ b/docs/sitemap.xml @@ -3,19 +3,19 @@ xmlns:xhtml="http://www.w3.org/1999/xhtml"> https://alanorth.github.io/cgspace-notes/2021-08/ - 2021-08-16T21:35:44+03:00 + 2021-08-17T10:59:14+03:00 https://alanorth.github.io/cgspace-notes/categories/ - 2021-08-16T21:35:44+03:00 + 2021-08-17T10:59:14+03:00 https://alanorth.github.io/cgspace-notes/ - 2021-08-16T21:35:44+03:00 + 2021-08-17T10:59:14+03:00 https://alanorth.github.io/cgspace-notes/categories/notes/ - 2021-08-16T21:35:44+03:00 + 2021-08-17T10:59:14+03:00 https://alanorth.github.io/cgspace-notes/posts/ - 2021-08-16T21:35:44+03:00 + 2021-08-17T10:59:14+03:00 https://alanorth.github.io/cgspace-notes/2021-07/ 2021-08-01T16:19:05+03:00