---
title: "February, 2024"
date: 2024-01-05T11:10:00+03:00
author: "Alan Orth"
categories: ["Notes"]
---

## 2024-02-05

- Delete duplicate metadata as described in my DSpace issue from last year: https://github.com/DSpace/DSpace/issues/8253
- Lowercase all the AGROVOC subjects on CGSpace:

<!--more-->

```sql
dspace=# BEGIN;
BEGIN
dspace=*# UPDATE metadatavalue SET text_value=LOWER(text_value) WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=187 AND text_value ~ '[[:upper:]]';
UPDATE 180
dspace=*# COMMIT;
COMMIT
```

## 2024-02-06

- Discuss IWMI using the CGSpace REST API for their new website
- Export the IWMI community to extract their ORCID identifiers:

```console
$ dspace metadata-export -i 10568/16814 -f /tmp/iwmi.csv
$ csvcut -c 'cg.creator.identifier,cg.creator.identifier[en_US]' ~/Downloads/2024-02-06-iwmi.csv \
  | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' \
  | sort -u \
  | tee /tmp/iwmi-orcids.txt \
  | wc -l
353
$ ./ilri/resolve_orcids.py -i /tmp/iwmi-orcids.txt -o /tmp/iwmi-orcids-names.csv -d
```
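
The `grep` pattern in the pipeline accepts any four groups of uppercase letters and digits, so it could in principle match strings that are not ORCID identifiers. As a rough sketch, and not part of the workflow actually used here, the following Python reads the same two columns and keeps only identifiers whose last character passes the ISO 7064 MOD 11-2 check that ORCID uses. The function names, the sample rows, and the `Name: ORCID` value layout with `||` separators are assumptions for illustration.

```python
import csv
import io
import re

# Stricter than the grep pattern: 15 digits plus a final digit or X
ORCID_RE = re.compile(r"\b\d{4}-\d{4}-\d{4}-\d{3}[\dX]\b")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the 16th character matches the computed check character."""
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


def extract_orcids(csv_file, columns):
    """Return the sorted, de-duplicated, checksum-valid ORCID iDs found in columns."""
    found = set()
    for row in csv.DictReader(csv_file):
        for column in columns:
            # Columns missing from the export are treated as empty
            for match in ORCID_RE.findall(row.get(column) or ""):
                if is_valid_orcid(match):
                    found.add(match)
    return sorted(found)


# Hypothetical sample rows standing in for the real export
sample = io.StringIO(
    "id,cg.creator.identifier\n"
    '1,"Josiah Carberry: 0000-0002-1825-0097||Someone Else: 0000-0002-1825-0098"\n'
    '2,"Josiah Carberry: 0000-0002-1825-0097"\n'
)
columns = ["cg.creator.identifier", "cg.creator.identifier[en_US]"]
print(extract_orcids(sample, columns))
```

In this sample the second identifier has a wrong final digit, so only one identifier survives. Whether the extra strictness is worth it depends on how clean the metadata already is.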
- I noticed some similar-looking names in our list, so I clustered them in OpenRefine and manually checked a dozen or so to update our list
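
The notes do not say which OpenRefine clustering method was used. As an illustration only, this sketch approximates the idea behind key-collision "fingerprint" clustering: normalize each name to a key (ASCII-fold, lowercase, strip punctuation, sort the unique tokens) and group names that share a key. The function names and the sample names are hypothetical, and the normalization is a simplification rather than OpenRefine's exact implementation.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(name: str) -> str:
    """Normalize a name to a key so that near-identical spellings collide."""
    # Strip accents by decomposing and dropping non-ASCII code points
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    # Lowercase, drop punctuation, then sort the unique tokens
    tokens = re.sub(r"[^\w\s]", "", ascii_name.lower()).split()
    return " ".join(sorted(set(tokens)))


def cluster_names(names):
    """Group names sharing a fingerprint; keep groups with two or more variants."""
    groups = defaultdict(set)
    for name in names:
        groups[fingerprint(name)].add(name)
    return [sorted(variants) for variants in groups.values() if len(variants) > 1]


# Hypothetical names, not taken from the real list
names = [
    "Doe, Jane",
    "Jane Doe",
    "doe, jane",
    "Pérez, María",
    "Perez, Maria",
    "Smith, J.",
]
for cluster in cluster_names(names):
    print(cluster)
```

Clusters found this way are only candidates: two different people can share a fingerprint, so a manual check like the one described in the notes is still needed before merging anything.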
<!-- vim: set sw=2 ts=2: -->