mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-11-24 23:50:17 +01:00
Add content/posts/2024-08.md
parent
64b8957945
commit
7be53639dc
39
content/posts/2024-08.md
Normal file
@@ -0,0 +1,39 @@
---
title: "August, 2024"
date: 2024-08-08T23:07:00-07:00
author: "Alan Orth"
categories: ["Notes"]
---

## 2024-08-08

- While working on the CGIAR Climate Change Synthesis I learned some new tricks with OpenRefine

<!--more-->

- The first was to retrieve affiliations from OpenAlex and extract them from JSON with this GREL:

```
forEach(
  value.parseJson()['authorships'],
  a,
  forEach(
    a.parseJson()['institutions'],
    i,
    i['display_name']
  ).join("||")
).join("||")
```

- This is a nested `forEach` that extracts every institution for every author
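- For checking the logic outside OpenRefine, the same nested loop can be sketched in plain Python; this is only an illustration, assuming the cell holds an OpenAlex work record as a JSON string, and the function name and sample record are hypothetical:

```python
import json


def institution_names(work_json, separator="||"):
    """Collect each institution's display_name, author by author."""
    work = json.loads(work_json)
    names = []
    # Outer loop: one entry per author on the work
    for authorship in work.get("authorships", []):
        # Inner loop: one entry per institution for that author
        for institution in authorship.get("institutions", []):
            names.append(institution["display_name"])
    return separator.join(names)


# Hypothetical, heavily trimmed record for illustration only
sample = json.dumps({
    "authorships": [
        {"institutions": [{"display_name": "Institute A"},
                          {"display_name": "Institute B"}]},
        {"institutions": [{"display_name": "Institute A"}]},
    ]
})

print(institution_names(sample))
# Institute A||Institute B||Institute A
```

- Unlike the GREL, this sketch builds one flat list, so an author with no institutions is simply skipped, whereas the nested `join` may leave an empty entry for that author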
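- The second was a better way to deduplicate lists in Jython while preserving list order:

```python
# better dedupe preserves order
# (OpenRefine supplies `value` and wraps this snippet in a function, so `return` is valid here)
seen = set()
# set.add() returns None, so `not seen.add(x)` is always True and records x as a side effect;
# `x not in seen` is checked first, so only the first occurrence of each value is kept
deduped_list = [x for x in value.split("||") if x not in seen and not seen.add(x)]

return "||".join(deduped_list)
```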
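
- The same idea can be tried outside OpenRefine as a plain function; a minimal sketch, where the function names and sample values are hypothetical and the `dict.fromkeys` variant assumes CPython 3.7 or newer (Jython implements Python 2.7, where dictionaries do not guarantee insertion order):

```python
def dedupe_preserving_order(value, separator="||"):
    """Drop repeated items, keeping the first occurrence of each."""
    seen = set()
    deduped = []
    for item in value.split(separator):
        if item not in seen:
            seen.add(item)
            deduped.append(item)
    return separator.join(deduped)


def dedupe_with_dict(value, separator="||"):
    """Same result on Python 3.7+, relying on ordered dict keys."""
    return separator.join(dict.fromkeys(value.split(separator)))


sample = "CIAT||ILRI||CIAT||IWMI||ILRI"  # hypothetical cell value
print(dedupe_preserving_order(sample))
# CIAT||ILRI||IWMI
print(dedupe_with_dict(sample))
# CIAT||ILRI||IWMI
```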
<!-- vim: set sw=2 ts=2: -->