- Looks like the OAI bug from DSpace 5.1 that caused validation at Base Search to fail is now fixed and DSpace Test passes validation! ([#63](https://github.com/ilri/DSpace/issues/63))
- After re-deploying and re-indexing I didn't see the same issue, and the indexing completed in 85 minutes, which is about how long it is supposed to take
- I noticed some weird CRPs in the database, and they don't show up in Discovery for some reason, perhaps the `:`
- I'll export these and fix them in batch:
```
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id=230 group by text_value order by count desc) to /tmp/crp.csv with csv;
- Add `AMR` to ILRI subjects and remove one duplicate instance of IITA in author affiliations controlled vocabulary ([#288](https://github.com/ilri/DSpace/pull/288))
- Thinking about batch updates for ORCIDs and authors
- Playing with [SolrClient](https://github.com/moonlitesolutions/SolrClient) in Python to query Solr
- All records in the authority core are either `authority_type:orcid` or `authority_type:person`
- There is a `deleted` field and all items seem to be `false`, but might be important sanity check to remember
- The way to go is probably to have a CSV of author names and authority IDs, then to batch update them in PostgreSQL
- Dump of the top ~200 authors in CGSpace:
```
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id=3 group by text_value order by count desc limit 210) to /tmp/210-authors.csv with csv;
dspace=# \copy (select distinct text_lang, count(*) from metadatavalue where resource_type_id=2 group by text_lang order by count desc) to /tmp/text-langs.csv with csv;
COPY 14
```
- Perhaps we need to fix them all in batch, or experiment with fixing only certain metadatavalues:
```
dspace=# update metadatavalue set text_lang='en_US' where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS';
- The `fix-metadata.py` script I have is meant for specific metadata values, so if I want to update some `text_lang` values I should just do it directly in the database
- For example, on a limited set:
```
dspace=# update metadatavalue set text_lang=NULL where resource_type_id=2 and metadata_field_id=203 and text_value='LIVESTOCK' and text_lang='';
UPDATE 420
```
- And assuming I want to do it for all fields:
```
dspacetest=# update metadatavalue set text_lang=NULL where resource_type_id=2 and text_lang='';
UPDATE 183726
```
- After that restarted Tomcat and PostgreSQL (because I'm superstitious about caches) and now I see the following in REST API query: