mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-11-22 06:35:03 +01:00
72 lines
2.8 KiB
Markdown
72 lines
2.8 KiB
Markdown
---
|
|
title: "August, 2024"
|
|
date: 2024-08-08T23:07:00-07:00
|
|
author: "Alan Orth"
|
|
categories: ["Notes"]
|
|
---
|
|
|
|
## 2024-08-08
|
|
|
|
- While working on the CGIAR Climate Change Synthesis I learned some new tricks with OpenRefine
|
|
|
|
<!--more-->
|
|
|
|
- The first was to retrieve affiliations from OpenAlex and extract them from JSON with this GREL:
|
|
|
|
```
|
|
forEach(
|
|
value.parseJson()['authorships'],
|
|
a,
|
|
forEach(
|
|
a.parseJson()['institutions'],
|
|
i,
|
|
i['display_name']
|
|
).join("||")
|
|
).join("||")
|
|
```
|
|
|
|
- It is a nested `forEach` to extract all institutions for all authors
|
|
- Second was a better way to deduplicate lists in Jython while preserving list order:
|
|
|
|
```python
|
|
# better dedupe preserves order
|
|
seen = set()
|
|
deduped_list = [x for x in value.split("||") if x not in seen and not seen.add(x)]
|
|
|
|
return "||".join(deduped_list)
|
|
```
|
|
|
|
## 2024-08-20
|
|
|
|
- Delete duplicate metadata values using the method I described in this GitHub issue: https://github.com/DSpace/DSpace/issues/8253#issuecomment-1331756418
|
|
|
|
## 2024-08-22
|
|
|
|
- Help IWMI with some OpenSearch RSS/Atom feeds for search results:
|
|
- https://cgspace.cgiar.org/server/opensearch/search?query=affiliation:"International Water Management Institute" AND initiative:"Climate Resilience" AND subject:flooding
|
|
- https://cgspace.cgiar.org/server/opensearch/search?query=affiliation:"International Water Management Institute" AND initiative:"Climate Resilience" AND subject:drought
|
|
- https://cgspace.cgiar.org/server/opensearch/search?query=affiliation:"International Water Management Institute" AND initiative:"Climate Resilience" AND subject:landslides
|
|
|
|
- Export list of withdrawn handle redirects:
|
|
|
|
```
|
|
dspace=# \COPY (SELECT m.text_value AS handle_from, h.handle AS handle_to FROM metadatavalue m JOIN handle h on m.dspace_object_id = h.resource_id WHERE m.metadata_field_id=181 AND h.resource_type_id=2 AND h.resource_id IN (SELECT uuid FROM item WHERE in_archive AND NOT withdrawn)) to /tmp/handle-redirects.csv CSV HEADER;
|
|
COPY 400
|
|
```
|
|
|
|
- Export list of IFPRI CONTENTdm redirects:
|
|
|
|
```
|
|
dspace-# \COPY (SELECT m.text_value, h.handle FROM metadatavalue m JOIN handle h on m.dspace_object_id = h.resource_id WHERE m.metadata_field_id=28 AND m.text_value LIKE '%URL from IFPRI CONTENTdm%' AND h.resource_type_id=2 AND m.dspace_object_id IN (SELECT uuid FROM item WHERE in_archive AND NOT withdrawn)) to /tmp/ifpri.csv CSV HEADER;
|
|
COPY 10794
|
|
```
|
|
|
|
- I filed [an issue](https://github.com/DSpace/dspace-angular/issues/3258) on DSpace Angular for anonymous users to be able to export search results to CSV
|
|
|
|
## 2024-08-26
|
|
|
|
- Spent some time trying to rebase our DSpace Angular themes on top of the massive header/navbar rework from [DSpace 7.6.2](https://github.com/DSpace/dspace-angular/pull/2858)
|
|
- Spent some time getting missing bibliographic metadata (issue dates, licenses, pages, volume, issue, publisher, etc) from Crossref for CGSpace
|
|
|
|
<!-- vim: set sw=2 ts=2: -->
|