diff --git a/content/posts/2018-07.md b/content/posts/2018-07.md
index e6023db21..e7aed6a97 100644
--- a/content/posts/2018-07.md
+++ b/content/posts/2018-07.md
@@ -334,4 +334,50 @@ dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue
 COPY 4518
 ```

+## 2018-07-15
+
+- Run all system updates on CGSpace, add latest metadata changes from last week, and start the Linode instance upgrade
+- After the upgrade I see we have more disk space available in the instance's dashboard, so I shut the instance down and resized it from 392GB to 650GB
+- The resize was very quick (less than one minute) and after booting the instance back up I now have 631GB for the root filesystem (with 267GB available)!
+- Peter had asked a question about how mapped items are displayed in the Altmetric dashboard
+- For example, [10568/82810](10568/82810) is mapped to four collections, but only shows up in one "department" in their dashboard
+- Altmetric help said that [according to OAI that item is only in one department](https://cgspace.cgiar.org/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:cgspace.cgiar.org:10568/82810)
+- I noticed that indeed there was only one collection listed, so I forced an OAI re-import on CGSpace:
+
+```
+$ dspace oai import -c
+OAI 2.0 manager action started
+Clearing index
+Index cleared
+Using full import.
+Full import
+100 items imported so far...
+200 items imported so far...
+...
+73900 items imported so far...
+Total: 73925 items
+Purging cached OAI responses.
+OAI 2.0 manager action ended. It took 697 seconds.
+```
+
+- Now I see four collections in OAI for that item!
+- I need to ask the dspace-tech mailing list if the nightly OAI import catches the case of old items that have had metadata or mappings change
+- ICARDA sent me a list of the ORCID iDs they have in the MEL system and it looks like almost 150 are new and unique to us!
+
+```
+$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq | wc -l
+1020
+$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml MEL\ ORCID.json | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq | wc -l
+1158
+```
+
+- I combined the two lists and regenerated the names for all our ORCID iDs using my [resolve-orcids.py](https://gist.github.com/alanorth/57a88379126d844563c1410bd7b8d12b) script:
+
+```
+$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml MEL\ ORCID.json | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2018-07-15-orcid-ids.txt
+$ ./resolve-orcids.py -i /tmp/2018-07-15-orcid-ids.txt -o /tmp/2018-07-15-resolved-orcids.txt -d
+```
+
+- Help Udana from WLE understand some Altmetrics concepts
+
diff --git a/docs/2018-07/index.html b/docs/2018-07/index.html
index e4919e42b..c94ff4b97 100644
--- a/docs/2018-07/index.html
+++ b/docs/2018-07/index.html
@@ -30,7 +30,7 @@ There is insufficient memory for the Java Runtime Environment to continue.
-
+
@@ -71,9 +71,9 @@ There is insufficient memory for the Java Runtime Environment to continue.
 "@type": "BlogPosting",
 "headline": "July, 2018",
 "url": "https://alanorth.github.io/cgspace-notes/2018-07/",
- "wordCount": "2226",
+ "wordCount": "2561",
 "datePublished": "2018-07-01T12:56:54+03:00",
- "dateModified": "2018-07-12T17:07:17+03:00",
+ "dateModified": "2018-07-13T19:45:58+03:00",
 "author": {
 "@type": "Person",
 "name": "Alan Orth"
@@ -516,6 +516,57 @@ $ csvcut -c 1 < /tmp/affiliations.csv > /tmp/affiliations-1.csv
 COPY 4518
+

2018-07-15

+ + + +
$ dspace oai import -c
+OAI 2.0 manager action started
+Clearing index
+Index cleared
+Using full import.
+Full import
+100 items imported so far...
+200 items imported so far...
+...
+73900 items imported so far...
+Total: 73925 items
+Purging cached OAI responses.
+OAI 2.0 manager action ended. It took 697 seconds.
+
+ + + +
$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq | wc -l
+1020
+$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml MEL\ ORCID.json | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq | wc -l
+1158
+
+ + + +
$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml MEL\ ORCID.json | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2018-07-15-orcid-ids.txt
+$ ./resolve-orcids.py -i /tmp/2018-07-15-orcid-ids.txt -o /tmp/2018-07-15-resolved-orcids.txt -d
+
+
+
+
diff --git a/docs/robots.txt b/docs/robots.txt
index 2176f05a2..0f0d86454 100644
--- a/docs/robots.txt
+++ b/docs/robots.txt
@@ -37,7 +37,7 @@ Disallow: /cgspace-notes/2015-12/
 Disallow: /cgspace-notes/2015-11/
 Disallow: /cgspace-notes/
 Disallow: /cgspace-notes/categories/
-Disallow: /cgspace-notes/tags/notes/
 Disallow: /cgspace-notes/categories/notes/
+Disallow: /cgspace-notes/tags/notes/
 Disallow: /cgspace-notes/posts/
 Disallow: /cgspace-notes/tags/
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index 636ecc96c..3e2237782 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -4,7 +4,7 @@
 https://alanorth.github.io/cgspace-notes/2018-07/
- 2018-07-12T17:07:17+03:00
+ 2018-07-13T19:45:58+03:00
@@ -174,7 +174,7 @@
 https://alanorth.github.io/cgspace-notes/
- 2018-07-12T17:07:17+03:00
+ 2018-07-13T19:45:58+03:00
 0
@@ -183,27 +183,27 @@
 0
-
- https://alanorth.github.io/cgspace-notes/tags/notes/
- 2018-07-12T17:07:17+03:00
- 0
-
-
 https://alanorth.github.io/cgspace-notes/categories/notes/
 2018-03-09T22:10:33+02:00
 0
+
+ https://alanorth.github.io/cgspace-notes/tags/notes/
+ 2018-07-13T19:45:58+03:00
+ 0
+
+
 https://alanorth.github.io/cgspace-notes/posts/
- 2018-07-12T17:07:17+03:00
+ 2018-07-13T19:45:58+03:00
 0
 https://alanorth.github.io/cgspace-notes/tags/
- 2018-07-12T17:07:17+03:00
+ 2018-07-13T19:45:58+03:00
 0