- Experimenting with IFPRI OAI (we want to harvest their publications)
- After reading the [ContentDM documentation](https://www.oclc.org/support/services/contentdm/help/server-admin-help/oai-support.en.html) I found IFPRI's OAI endpoint: http://ebrary.ifpri.org/oai/oai.php
- After reading the [OAI documentation](https://www.openarchives.org/OAI/openarchivesprotocol.html) and testing with an [OAI validator](http://validator.oaipmh.com/) I found out how to get their publications
- This is their publications set: http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&from=2016-01-01&set=p15738coll2&metadataPrefix=oai_dc
- You can see the others by using the OAI `ListSets` verb: http://ebrary.ifpri.org/oai/oai.php?verb=ListSets
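- A minimal sketch of building these requests in Python, using the endpoint and set from the URLs above. The helper names are made up for illustration, and whether IFPRI's server actually pages its results with a `resumptionToken` is an assumption (the token mechanism itself is part of OAI-PMH 2.0):

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_ENDPOINT = "http://ebrary.ifpri.org/oai/oai.php"
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_list_records_url(set_spec, from_date, metadata_prefix="oai_dc"):
    """Build the ListRecords URL for the first page of a set (hypothetical helper)."""
    params = {
        "verb": "ListRecords",
        "from": from_date,
        "set": set_spec,
        "metadataPrefix": metadata_prefix,
    }
    return OAI_ENDPOINT + "?" + urlencode(params)


def build_resume_url(token):
    """Follow-up requests carry only the verb and the token, per OAI-PMH."""
    params = {"verb": "ListRecords", "resumptionToken": token}
    return OAI_ENDPOINT + "?" + urlencode(params)


def extract_resumption_token(response_xml):
    """Return the resumptionToken text, or None if it is absent or empty."""
    root = ET.fromstring(response_xml)
    token = root.find(".//oai:resumptionToken", OAI_NS)
    if token is None or not (token.text or "").strip():
        return None
    return token.text.strip()
```

- A harvester would fetch the first URL, then keep requesting `build_resume_url(token)` until `extract_resumption_token()` returns `None`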
- Working on the second phase of the metadata migration; it looks like this will work for moving the CPWF-specific data in `dc.identifier.fund` to `cg.identifier.cpwfproject` and then the rest to `dc.description.sponsorship`:
```
dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA');
UPDATE 497
dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75;
UPDATE 14
```
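- The same routing rule as the two updates above, sketched in Python so it can be checked against sample values before touching the database. The mapping of IDs 75, 130, and 29 to field names is inferred from these notes rather than read from the metadata registry, and the sample values in the usage line are hypothetical:

```python
# Field IDs as used in the SQL above; the names are inferred, not verified.
FIELD_FUND = 75            # dc.identifier.fund (source field)
FIELD_CPWF_PROJECT = 130   # cg.identifier.cpwfproject
FIELD_SPONSORSHIP = 29     # dc.description.sponsorship

CPWF_PREFIXES = ("PN", "PHASE")   # mirrors: like 'PN%' or like 'PHASE%'
CPWF_EXACT = {"CBA", "IA"}        # mirrors: = 'CBA' or = 'IA'


def target_field_id(text_value):
    """Return the field a dc.identifier.fund value would be moved to."""
    # Case-sensitive, like the PostgreSQL LIKE and = comparisons.
    if text_value.startswith(CPWF_PREFIXES) or text_value in CPWF_EXACT:
        return FIELD_CPWF_PROJECT
    return FIELD_SPONSORSHIP


# Hypothetical values, only to show the split.
for value in ["PN10", "PHASE 2", "CBA", "Some Donor"]:
    print(value, "->", target_field_id(value))
```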
- Fix a few minor miscellaneous issues in `dspace.cfg` ([#227](https://github.com/ilri/DSpace/pull/227))
- But actually, I think since DSpace 4 or 5 (we are on 5.1) the Browse indexes come from Discovery (defined in `discovery.xml`), so this is really just a parsing error
- I've sent a message to the DSpace mailing list to ask about the Browse index definition
```
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, Alan | | -1
Orth, Alan | | -1
Orth, Alan | | -1
Orth, Alan | | -1
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
(13 rows)
```
- And now an actually relevant example:
```
dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence = 500;
count
-------
707
(1 row)
dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence != 500;
count
-------
253
(1 row)
```
- Trying something experimental:
```
dspacetest=# update metadatavalue set confidence=500 where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security';
UPDATE 960
```
- And then re-indexing authority and Discovery...?
- After Discovery reindex the CCAFS authors are all together in the Authors sidebar facet
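- A small diagnostic sketch for finding other values that are stored with more than one confidence level. It assumes rows exported as `(text_value, authority, confidence)` tuples; the function name is made up, and whether mixed confidence is really what splits the facet entries is only the hypothesis being tested here:

```python
from collections import defaultdict


def mixed_confidence_values(rows):
    """Map each text_value seen with several confidence levels to those levels.

    rows: iterable of (text_value, authority, confidence) tuples.
    """
    seen = defaultdict(set)
    for text_value, _authority, confidence in rows:
        seen[text_value].add(confidence)
    return {
        value: sorted(levels)
        for value, levels in seen.items()
        if len(levels) > 1
    }


# A few rows shaped like the author query output earlier in these notes.
rows = [
    ("Orth, A.", "ab606e3a-2b04-4c7d-9423-14beccf54257", -1),
    ("Orth, Alan", None, -1),
    ("Orth, Alan", "ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9", 600),
]
print(mixed_confidence_values(rows))
```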
- The docs for the ORCiD and Authority stuff for DSpace 5 mention changing the browse indexes to use the Authority as well:
- Figured out how to export a list of the unique values from a metadata field ordered by count:
```
dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=29 group by text_value order by count desc) to /tmp/sponsorship.csv with csv;
```
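- For values that are already outside the database, the same "unique values ordered by count" listing can be sketched in Python. This is an illustration of the grouping only, not a replacement for the `\copy` above, and the function name is made up:

```python
import csv
import io
from collections import Counter


def counts_to_csv(values):
    """Return CSV text of (value, count) rows, most frequent first."""
    counts = Counter(values)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # most_common() sorts by count descending; ties keep first-seen order.
    for text_value, count in counts.most_common():
        writer.writerow([text_value, count])
    return buffer.getvalue()
```

- Values containing commas are quoted by the `csv` module, so the output stays loadable in a spreadsheet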
- Discuss pulling data from IFPRI's ContentDM with Ryan Miller
- Looks like OAI is kinda obtuse for this, and if we use ContentDM's API we'll be able to access their internal field names (rather than trying to figure out how they stuffed them into various, repeated Dublin Core fields)
- Looks like this is all we need: https://wiki.duraspace.org/display/DSDOC5x/Submission+User+Interface#SubmissionUserInterface-ConfiguringControlledVocabularies
- I wrote an XPath expression to extract the ILRI subjects from `input-forms.xml` (uses xmlstarlet):
```
$ xml sel -t -m '//value-pairs[@value-pairs-name="ilrisubject"]/pair/displayed-value/text()' -c '.' -n dspace/config/input-forms.xml
```
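- A rough Python equivalent using the standard library's limited XPath support, in case xmlstarlet isn't available. The function name and the sample XML are made up for illustration; the element and attribute names are the ones from the expression above:

```python
import xml.etree.ElementTree as ET


def displayed_values(xml_text, list_name):
    """Return the displayed-value text of every pair in the named value-pairs list."""
    root = ET.fromstring(xml_text)
    path = ".//value-pairs[@value-pairs-name='{}']/pair/displayed-value".format(list_name)
    return [(element.text or "").strip() for element in root.findall(path)]


# Hypothetical fragment shaped like input-forms.xml, not the real file.
SAMPLE = """
<input-forms>
  <form-value-pairs>
    <value-pairs value-pairs-name="ilrisubject" dc-term="ilrisubject">
      <pair><displayed-value>SUBJECT ONE</displayed-value><stored-value>SUBJECT ONE</stored-value></pair>
      <pair><displayed-value>SUBJECT TWO</displayed-value><stored-value>SUBJECT TWO</stored-value></pair>
    </value-pairs>
    <value-pairs value-pairs-name="other" dc-term="other">
      <pair><displayed-value>IGNORED</displayed-value><stored-value>IGNORED</stored-value></pair>
    </value-pairs>
  </form-value-pairs>
</input-forms>
"""

for subject in displayed_values(SAMPLE, "ilrisubject"):
    print(subject)
```

- For the real file, read `dspace/config/input-forms.xml` and pass its contents to the same function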
- Write to Atmire about the use of `atmire.orcid.id` to see if we can change it
- Seems to be a virtual field that is queried from the authority cache... hmm
- In other news, I found out that the About page that we haven't been using lives in `dspace/config/about.xml`, so now we can update the text
- File bug about `closed="true"` attribute of controlled vocabularies not working: https://jira.duraspace.org/browse/DS-3238