Update notes

This commit is contained in:
2018-03-08 22:47:12 +02:00
parent 8c2e314038
commit 2e5c0e3fed
3 changed files with 27 additions and 8 deletions

View File

@ -102,6 +102,15 @@ dspacetest=# select distinct text_lang from metadatavalue where resource_type_id
(9 rows)
```
- On second inspection it looks like `dc.description.provenance` fields use the text_lang "en" so that's probably why there are over 100,000 fields changed...
- If I skip that, there are about 2,000, which seems more reasonably like the amount of fields users have edited manually, or fucked up during CSV import, etc:
```
dspace=# update metadatavalue set text_lang='en_US' where resource_type_id=2 and text_lang in ('EN','En','en_','EN_US','en_U','eng');
UPDATE 2309
```
- I will apply this on CGSpace right now
- In other news, I was playing with adding ORCID identifiers to a dump of CIAT's community via CSV in OpenRefine
- Using a series of filters, flags, and GREL expressions to isolate items for a certain author, I figured out how to add ORCID identifiers to the `cg.creator.id` field
- For example, a GREL expression in a custom text facet to get all items with `dc.contributor.author[en_US]` of a certain author with several name variations (this is how you use a logical OR in OpenRefine):