mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-11-25 08:00:18 +01:00
119 lines
4.0 KiB
Markdown
119 lines
4.0 KiB
Markdown
---
|
|
title: "February, 2024"
|
|
date: 2024-02-05T11:10:00+03:00
|
|
author: "Alan Orth"
|
|
categories: ["Notes"]
|
|
---
|
|
|
|
## 2024-02-05
|
|
|
|
- Delete duplicate metadata as described in my DSpace issue from last year: https://github.com/DSpace/DSpace/issues/8253
|
|
- Lower case all the AGROVOC subjects on CGSpace
|
|
|
|
<!--more-->
|
|
|
|
```sql
|
|
dspace=# BEGIN;
|
|
BEGIN
|
|
dspace=*# UPDATE metadatavalue SET text_value=LOWER(text_value) WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=187 AND text_value ~ '[[:upper:]]';
|
|
UPDATE 180
|
|
dspace=*# COMMIT;
|
|
COMMIT
|
|
```
|
|
|
|
## 2024-02-06
|
|
|
|
- Discuss IWMI using the CGSpace REST API for their new website
|
|
- Export the IWMI community to extract their ORCID identifiers:
|
|
|
|
```console
|
|
$ dspace metadata-export -i 10568/16814 -f /tmp/iwmi.csv
|
|
$ csvcut -c 'cg.creator.identifier,cg.creator.identifier[en_US]' ~/Downloads/2024-02-06-iwmi.csv \
|
|
| grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' \
|
|
| sort -u \
|
|
| tee /tmp/iwmi-orcids.txt \
|
|
| wc -l
|
|
353
|
|
$ ./ilri/resolve_orcids.py -i /tmp/iwmi-orcids.txt -o /tmp/iwmi-orcids-names.csv -d
|
|
```
|
|
|
|
- I noticed some similar looking names in our list so I clustered them in OpenRefine and manually checked a dozen or so to update our list
|
|
|
|
## 2024-02-07
|
|
|
|
- Maria asked me about the "missing" item from last week again
|
|
- I can see it when I used the Admin search, but not in her workflow
|
|
- It was submitted by TIP so I checked that user's workspace and found it there
|
|
- After depositing, it went into the workflow so Maria should be able to see it now
|
|
|
|
## 2024-02-09
|
|
|
|
- Minor edits to CGSpace submission form
|
|
- Upload 55 ISNAR book chapters to CGSpace from Peter
|
|
|
|
## 2024-02-19
|
|
|
|
- Looking into the collection mapping issue on CGSpace
|
|
- It seems to be by design in DSpace 7: https://github.com/DSpace/dspace-angular/issues/1203
|
|
- This is a massive setback for us...
|
|
|
|
## 2024-02-20
|
|
|
|
- Minor work on OpenRXV to fix a bug in the ng-select drop downs
|
|
- Minor work on the DSpace 7 nginx configuration to allow requesting robots.txt and sitemaps without hitting rate limits
|
|
|
|
## 2024-02-21
|
|
|
|
- Minor updates on OpenRXV, including one bug fix for missing mapped collections
|
|
- Salem had to re-work the harvester for DSpace 7 since the mapped collections and parent collection list are separate!
|
|
|
|
## 2024-02-22
|
|
|
|
- Discuss tagging of datasets and re-work the submission form to encourage use of DOI field for any item that has a DOI, and the normal URL field if not
|
|
- The "cg.identifier.dataurl" field will be used for "related" datasets
|
|
- I still have to check and move some metadata for existing datasets
|
|
|
|
## 2024-02-23
|
|
|
|
- This morning Tomcat died due to an OOM kill from the kernel:
|
|
|
|
```console
|
|
kernel: Out of memory: Killed process 698 (java) total-vm:14151300kB, anon-rss:9665812kB, file-rss:320kB, shmem-rss:0kB, UID:997 pgtables:20436kB oom_score_adj:0
|
|
```
|
|
|
|
- I don't see any abnormal pattern in my Grafana graphs, for JVM or system load... very weird
|
|
- I updated the submission form on CGSpace to include the new changes to URLs for datasets
|
|
- I also updated about 80 datasets to move the URLs to the correct field
|
|
|
|
## 2024-02-25
|
|
|
|
- This morning Tomcat died while I was doing a CSV export, with an OOM kill from the kernel:
|
|
|
|
```console
|
|
kernel: Out of memory: Killed process 720768 (java) total-vm:14079976kB, anon-rss:9301684kB, file-rss:152kB, shmem-rss:0kB, UID:997 pgtables:19488kB oom_score_adj:0
|
|
```
|
|
|
|
- I don't know why this is happening so often recently...
|
|
|
|
## 2024-02-27
|
|
|
|
- IFPRI sent me a list of authors to add to our list for now, until we can find a better way of doing it
|
|
- I extracted the existing authors from our controlled vocabulary and combined them with IFPRI's:
|
|
|
|
```console
|
|
$ xmllint --xpath '//node/isComposedBy/node()' dspace/config/controlled-vocabularies/dc-contributor-author.xml \
|
|
| grep -oE 'label=".*"' \
|
|
| sed -e 's/label="//' -e 's/"$//' > /tmp/authors
|
|
$ cat /tmp/authors /tmp/ifpri-authors | sort -u > /tmp/new-authors
|
|
```
|
|
|
|
## 2024-02-28
|
|
|
|
- I figured out a way to add a new Angular component to handle all our relation fields
|
|
|
|
## 2024-02-29
|
|
|
|
- Clean up a bunch of metadata on CGSpace
|
|
|
|
<!-- vim: set sw=2 ts=2: -->
|