---
title: "February, 2024"
date: 2024-02-05T11:10:00+03:00
author: "Alan Orth"
categories: ["Notes"]
---
## 2024-02-05
- Delete duplicate metadata as described in my DSpace issue from last year: https://github.com/DSpace/DSpace/issues/8253
- Lower case all the AGROVOC subjects on CGSpace
<!--more-->
```sql
dspace=# BEGIN;
BEGIN
dspace=*# UPDATE metadatavalue SET text_value=LOWER(text_value) WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=187 AND text_value ~ '[[:upper:]]';
UPDATE 180
dspace=*# COMMIT;
COMMIT
```
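Before running an update like this, it can help to preview which values would change. The following is a minimal Python sketch of the same filter and transformation, assuming the subject values have already been fetched into a list. The function name `preview_lowercase` and the sample values are hypothetical, and Python's `str.isupper()` and `str.lower()` only approximate PostgreSQL's `[[:upper:]]` and `LOWER()`, which depend on the database locale.

```python
def preview_lowercase(values):
    """Return (old, new) pairs for values containing at least one uppercase letter."""
    # Mirrors the WHERE clause: only values with an uppercase character are touched
    return [(v, v.lower()) for v in values if any(ch.isupper() for ch in v)]


# Hypothetical sample values, not taken from CGSpace
sample = ["MAIZE", "climate change", "Food Security"]
for old, new in preview_lowercase(sample):
    print(f"{old!r} -> {new!r}")
```

Values that are already lower case are skipped, which is why the `UPDATE` above only reports the rows that actually needed changing.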
## 2024-02-06
- Discuss IWMI using the CGSpace REST API for their new website
- Export the IWMI community to extract their ORCID identifiers:
```console
$ dspace metadata-export -i 10568/16814 -f /tmp/iwmi.csv
$ csvcut -c 'cg.creator.identifier,cg.creator.identifier[en_US]' ~/Downloads/2024-02-06-iwmi.csv \
| grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' \
| sort -u \
| tee /tmp/iwmi-orcids.txt \
| wc -l
353
$ ./ilri/resolve_orcids.py -i /tmp/iwmi-orcids.txt -o /tmp/iwmi-orcids-names.csv -d
```
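The `csvcut | grep | sort -u` pipeline above could also be sketched in Python, which may be handy when the extraction needs to run as part of a larger script. This is only an illustration: the function name `extract_orcids`, the sample rows, and the identifiers in them are made up, and the `||` multi-value separator in the sample is an assumption about the export format.

```python
import csv
import io
import re

# Same pattern as the grep above
ORCID_RE = re.compile(r"[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}")


def extract_orcids(csv_file, columns):
    """Return the sorted unique ORCID identifiers found in the given columns."""
    found = set()
    for row in csv.DictReader(csv_file):
        for column in columns:
            # Missing or empty cells are treated as empty strings
            found.update(ORCID_RE.findall(row.get(column) or ""))
    return sorted(found)


# Hypothetical export with made-up identifiers
sample = io.StringIO(
    "id,cg.creator.identifier,cg.creator.identifier[en_US]\n"
    '1,"Jane Doe: 0000-0001-2345-6789||John Roe: 0000-0002-9876-543X",\n'
    '2,,"Jane Doe: 0000-0001-2345-6789"\n'
)
columns = ["cg.creator.identifier", "cg.creator.identifier[en_US]"]
print(extract_orcids(sample, columns))
```

Collecting into a set plays the role of `sort -u`, so an identifier that appears in both columns or in several items is counted once.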
- I noticed some similar-looking names in our list, so I clustered them in OpenRefine and manually checked a dozen or so to update our list
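
The clustering step can be roughly approximated outside OpenRefine. The sketch below implements a fingerprint-style key collision, similar in spirit to one of OpenRefine's clustering methods. The notes do not say which method was actually used, so treat this as an assumption, and the names `fingerprint` and `cluster` and the sample data are hypothetical.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(name):
    """Approximate a key-collision fingerprint for a name."""
    # Fold accents to ASCII, lowercase, and drop punctuation
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    cleaned = ascii_name.lower().translate(str.maketrans("", "", string.punctuation))
    # Sort and de-duplicate tokens so word order does not matter
    return " ".join(sorted(set(cleaned.split())))


def cluster(names):
    """Group names sharing a fingerprint; keep groups with more than one spelling."""
    groups = defaultdict(set)
    for name in names:
        groups[fingerprint(name)].add(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical names, not taken from the IWMI list
sample = ["Doe, Jane", "Jane Doe", "Jan\u00e9 Doe", "John Roe"]
print(cluster(sample))
```

Names that land in the same group are only candidates for merging; as in the note above, each cluster still needs a manual check, because different people can share a fingerprint.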
<!-- vim: set sw=2 ts=2: -->