mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2022-12-08
@@ -88,5 +88,40 @@ $ csvgrep -c matched -m true /tmp/cgspace-matches.csv | wc -l
- This means I've added a few thousand UN M.49 regions to the `cg.coverage.subregion` field in the last few days
- I had to extract them from CGSpace and delete them using `delete-metadata-values.py`
- My [DSpace 7.x pull request to tell ImageMagick about the PDF CropBox](https://github.com/DSpace/DSpace/pull/8550) was merged
- Start a harvest on AReS

## 2022-12-08

- While on the plane I decided to fix some ORCID identifiers, as I had seen some poorly formatted ones
- I couldn't remember the XPath syntax, so this was a bit of a hack:

```console
$ xmllint --xpath '//node/isComposedBy/node()' dspace/config/controlled-vocabularies/cg-creator-identifier.xml | grep -oE 'label=".*"' | sed -e 's/label="//' -e 's/"$//' > /tmp/orcid-names.txt
$ ./ilri/update-orcids.py -i /tmp/orcid-names.txt -db dspace -u dspace -p 'fuuu' -m 247
```
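
- For reference, the label extraction above could probably also be done without the `grep` and `sed` steps. This is only a rough Python sketch: the sample XML, its names and identifiers, and the `extract_labels` helper are made up for illustration, and it assumes the vocabulary nests `<node label="...">` elements under `<isComposedBy>` as the XPath above implies:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample that mirrors the layout implied by the XPath
# '//node/isComposedBy/node()'. The names and identifiers are invented.
SAMPLE_VOCABULARY = """\
<node id="0001" label="">
  <isComposedBy>
    <node id="Jane Example: 0000-0001-2345-6789" label="Jane Example: 0000-0001-2345-6789"/>
    <node id="John Sample: 0000-0002-3456-789X" label="John Sample: 0000-0002-3456-789X"/>
  </isComposedBy>
</node>
"""


def extract_labels(xml_text):
    """Return the non-empty label attribute of every <node> under <isComposedBy>."""
    root = ET.fromstring(xml_text)
    labels = []
    for node in root.iterfind(".//isComposedBy/node"):
        label = node.get("label")
        if label:
            labels.append(label)
    return labels


if __name__ == "__main__":
    # One label per line, similar in shape to /tmp/orcid-names.txt
    print("\n".join(extract_labels(SAMPLE_VOCABULARY)))
```

- Reading the `label` attribute directly avoids relying on the greedy `label=".*"` match, which would capture too much if another attribute followed `label` on the same line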

- After that there were still some poorly formatted ones that my script didn't fix, so perhaps these are new ones not in our list
- I dumped them and combined them with the existing ones to resolve later:

```console
localhost/dspace= ☘ \COPY (SELECT dspace_object_id,text_value FROM metadatavalue WHERE metadata_field_id=247 AND text_value LIKE '%http%') to /tmp/orcid-formatting.txt;
COPY 36
```
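
- A quick way to eyeball such a dump might be to pair each object ID with whatever ORCID iD its value contains. This is a sketch under assumptions: it expects the tab-separated rows that `\COPY` writes by default, the `parse_dump` helper and the sample rows are hypothetical, and the pattern assumes an iD is four dash-separated groups of digits whose last character may be an `X` checksum:

```python
import re

# Assumption: four groups of four digits, where the final character may be X
ORCID_RE = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def parse_dump(lines):
    """Return (dspace_object_id, orcid_or_None) for each tab-separated row."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        object_id, _, text_value = line.partition("\t")
        match = ORCID_RE.search(text_value)
        rows.append((object_id, match.group(0) if match else None))
    return rows


if __name__ == "__main__":
    # Made-up rows in the same shape as /tmp/orcid-formatting.txt
    sample = [
        "11111111-2222-3333-4444-555555555555\tJane Example: https://orcid.org/0000-0001-2345-6789\n",
        "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee\thttp://example.org/no-identifier\n",
    ]
    for object_id, orcid in parse_dump(sample):
        print(object_id, orcid)
```

- Rows that come back with `None` contain a URL but no recognizable iD, so they would need a manual look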

- I think there are really just some new ones...

```console
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml /tmp/orcid-formatting.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort -u > /tmp/2022-12-08-orcids.txt
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort -u | wc -l
1907
$ wc -l /tmp/2022-12-08-orcids.txt
1939 /tmp/2022-12-08-orcids.txt
```
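
- Since the combined list is a superset of the vocabulary, the counts above imply 1939 - 1907 = 32 identifiers that are not in the vocabulary yet. To list them rather than just count them, a set difference would do. This is a minimal sketch that reuses the same pattern as the `grep` above; the `unique_orcids` and `new_orcids` helpers and the sample strings are hypothetical:

```python
import re

# Same pattern as the grep above
ORCID_RE = re.compile(r"[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}")


def unique_orcids(text):
    """Return the set of identifier-shaped strings found in text."""
    return set(ORCID_RE.findall(text))


def new_orcids(vocabulary_text, dump_text):
    """Return sorted identifiers that appear in the dump but not in the vocabulary."""
    known = unique_orcids(vocabulary_text)
    combined = known | unique_orcids(dump_text)
    return sorted(combined - known)


if __name__ == "__main__":
    # Made-up inputs standing in for the vocabulary XML and the dump
    vocabulary = 'label="Jane Example: 0000-0001-2345-6789"'
    dump = (
        "id1\thttps://orcid.org/0000-0002-3456-789X\n"
        "id2\tJane Example: 0000-0001-2345-6789\n"
    )
    print(new_orcids(vocabulary, dump))
```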

- Then I applied these updates on CGSpace
- Maria mentioned that she was getting a lot more items in her daily subscription emails
- I had a hunch it was related to me updating the `last_modified` timestamp after updating a bunch of countries, regions, etc. in items
- Then today I noticed this option in `dspace.cfg`: `eperson.subscription.onlynew`
- By default DSpace sends notifications for modified items too! I've disabled that now...

<!-- vim: set sw=2 ts=2: -->