diff --git a/content/posts/2018-07.md b/content/posts/2018-07.md
index e6023db21..e7aed6a97 100644
--- a/content/posts/2018-07.md
+++ b/content/posts/2018-07.md
@@ -334,4 +334,50 @@ dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue
 COPY 4518
 ```
+## 2018-07-15
+
+- Run all system updates on CGSpace, add latest metadata changes from last week, and start the Linode instance upgrade
+- After the upgrade I see we have more disk space available in the instance's dashboard, so I shut the instance down and resized it from 392GB to 650GB
+- The resize was very quick (less than one minute) and after booting the instance back up I now have 631GB for the root filesystem (with 267GB available)!
+- Peter had asked a question about how mapped items are displayed in the Altmetric dashboard
+- For example, [10568/82810](10568/82810) is mapped to four collections, but only shows up in one "department" in their dashboard
+- Altmetric help said that [according to OAI that item is only in one department](https://cgspace.cgiar.org/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:cgspace.cgiar.org:10568/82810)
+- I noticed that indeed there was only one collection listed, so I forced an OAI re-import on CGSpace:
+
+```
+$ dspace oai import -c
+OAI 2.0 manager action started
+Clearing index
+Index cleared
+Using full import.
+Full import
+100 items imported so far...
+200 items imported so far...
+...
+73900 items imported so far...
+Total: 73925 items
+Purging cached OAI responses.
+OAI 2.0 manager action ended. It took 697 seconds.
+```
+
+- Now I see four collections in OAI for that item!
+- I need to ask the dspace-tech mailing list if the nightly OAI import catches the case of old items that have had metadata or mappings change
+- ICARDA sent me a list of the ORCID iDs they have in the MEL system and it looks like almost 150 are new and unique to us!
+
+```
+$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq | wc -l
+1020
+$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml MEL\ ORCID.json | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq | wc -l
+1158
+```
+
+- I combined the two lists and regenerated the names for all of the ORCID iDs using my [resolve-orcids.py](https://gist.github.com/alanorth/57a88379126d844563c1410bd7b8d12b) script:
+
+```
+$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml MEL\ ORCID.json | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2018-07-15-orcid-ids.txt
+$ ./resolve-orcids.py -i /tmp/2018-07-15-orcid-ids.txt -o /tmp/2018-07-15-resolved-orcids.txt -d
+```
+
+- Help Udana from WLE understand some Altmetrics concepts
+
diff --git a/docs/2018-07/index.html b/docs/2018-07/index.html
index e4919e42b..c94ff4b97 100644
--- a/docs/2018-07/index.html
+++ b/docs/2018-07/index.html
@@ -30,7 +30,7 @@ There is insufficient memory for the Java Runtime Environment to continue.
-
+
@@ -71,9 +71,9 @@ There is insufficient memory for the Java Runtime Environment to continue.
 "@type": "BlogPosting",
 "headline": "July, 2018",
 "url": "https://alanorth.github.io/cgspace-notes/2018-07/",
- "wordCount": "2226",
+ "wordCount": "2561",
 "datePublished": "2018-07-01T12:56:54+03:00",
- "dateModified": "2018-07-12T17:07:17+03:00",
+ "dateModified": "2018-07-13T19:45:58+03:00",
 "author": {
 "@type": "Person",
 "name": "Alan Orth"
@@ -516,6 +516,57 @@ $ csvcut -c 1 < /tmp/affiliations.csv > /tmp/affiliations-1.csv
 COPY 4518
+
$ dspace oai import -c
+OAI 2.0 manager action started
+Clearing index
+Index cleared
+Using full import.
+Full import
+100 items imported so far...
+200 items imported so far...
+...
+73900 items imported so far...
+Total: 73925 items
+Purging cached OAI responses.
+OAI 2.0 manager action ended. It took 697 seconds.
+
+
+$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq | wc -l
+1020
+$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml MEL\ ORCID.json | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq | wc -l
+1158
+
+
+$ cat dspace/config/controlled-vocabularies/cg-creator-id.xml MEL\ ORCID.json | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2018-07-15-orcid-ids.txt
+$ ./resolve-orcids.py -i /tmp/2018-07-15-orcid-ids.txt -o /tmp/2018-07-15-resolved-orcids.txt -d
+
+
+
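The ORCID comparison in this entry counts 1020 unique identifiers in the existing vocabulary and 1158 once the MEL export is included, which implies 138 identifiers that are new. A minimal Python sketch of that comparison follows. It is not the author's method: the function names `extract_orcids` and `new_orcids` are hypothetical, the regular expression is copied from the `grep -oE` commands in the entry, and the file paths are assumed to match those commands.

```python
import re
from pathlib import Path

# Same pattern as the grep -oE expression used in the entry.
ORCID_PATTERN = re.compile(r"[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}")


def extract_orcids(text):
    """Return the set of unique ORCID-shaped identifiers found in text."""
    return set(ORCID_PATTERN.findall(text))


def new_orcids(existing_text, incoming_text):
    """Return identifiers found in incoming_text but not in existing_text, sorted."""
    return sorted(extract_orcids(incoming_text) - extract_orcids(existing_text))


if __name__ == "__main__":
    # Paths assumed from the commands in the entry; adjust to the local checkout.
    vocabulary = Path("dspace/config/controlled-vocabularies/cg-creator-id.xml")
    mel_export = Path("MEL ORCID.json")
    if vocabulary.exists() and mel_export.exists():
        added = new_orcids(
            vocabulary.read_text(encoding="utf-8"),
            mel_export.read_text(encoding="utf-8"),
        )
        print(f"{len(added)} identifiers are new")
```

Unlike the two `wc -l` counts, the set difference lists the new identifiers directly, so they could be reviewed before being merged into the vocabulary.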