cgspace-notes/content/posts/2024-08.md

title: August, 2024
date: 2024-08-08T23:07:00-07:00
author: Alan Orth
categories: Notes

2024-08-08

  • While working on the CGIAR Climate Change Synthesis, I learned some new tricks with OpenRefine
  • The first was to retrieve affiliations from OpenAlex and extract them from JSON with this GREL:
forEach(
  value.parseJson()['authorships'],
  a,
  forEach(
    a.parseJson()['institutions'],
    i,
    i['display_name']
  ).join("||")
).join("||")
  • The nested forEach walks every authorship and, within each one, every institution, joining the display names with "||"
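For checking the GREL expression outside OpenRefine, the same nested iteration can be sketched in plain Python. This is only an illustration: the sample record is a made-up, trimmed-down stand-in for an OpenAlex work (only the `authorships`, `institutions` and `display_name` keys from the expression are included), and `extract_institutions` is a hypothetical helper name.

```python
import json

# Hypothetical, trimmed-down work record; the institution names are invented
# and only the keys used by the GREL expression are present.
work_json = json.dumps({
    "authorships": [
        {"institutions": [{"display_name": "Institute A"},
                          {"display_name": "Institute B"}]},
        {"institutions": [{"display_name": "Institute B"}]},
    ]
})


def extract_institutions(raw, sep="||"):
    """Join every institution display_name of every author, in order."""
    work = json.loads(raw)
    # Inner join: the institutions of one author.
    per_author = [
        sep.join(inst["display_name"] for inst in authorship["institutions"])
        for authorship in work["authorships"]
    ]
    # Outer join: all authors.
    return sep.join(per_author)


result = extract_institutions(work_json)
print(result)  # Institute A||Institute B||Institute B
```

An institution shared by several authors appears once per author in the result, so the joined value can contain duplicates.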
  • The second was a better way to deduplicate lists in Jython while preserving list order:
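# Dedupe a "||"-separated cell value while preserving the original order
seen = set()
# set.add() returns None, so "not seen.add(x)" is always True; it only runs
# for items not yet seen and records them as a side effect
deduped_list = [x for x in value.split("||") if x not in seen and not seen.add(x)]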

return "||".join(deduped_list)
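
To try the deduplication outside OpenRefine, the same idea can be written as a standalone function with an explicit loop instead of the side-effect trick in the list comprehension. This is a sketch: `dedupe_preserving_order` is a hypothetical name and the sample value is invented.

```python
def dedupe_preserving_order(cell, sep="||"):
    """Drop repeated items from a separator-joined string, keeping the first occurrence of each."""
    seen = set()
    deduped = []
    for item in cell.split(sep):
        if item not in seen:
            seen.add(item)        # remember the item
            deduped.append(item)  # keep it in its original position
    return sep.join(deduped)


# Hypothetical cell value with repeated institutions
value = "Institute A||Institute B||Institute A||Institute C||Institute B"
result = dedupe_preserving_order(value)
print(result)  # Institute A||Institute B||Institute C
```

On CPython 3.7 or later, `sep.join(dict.fromkeys(cell.split(sep)))` gives the same result because dictionaries keep insertion order. Jython follows Python 2.7, where that guarantee does not hold, so the set-based approach is the safer choice inside OpenRefine.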