<li>After reading the <a href="https://www.oclc.org/support/services/contentdm/help/server-admin-help/oai-support.en.html">ContentDM documentation</a> I found IFPRI’s OAI endpoint: <a href="http://ebrary.ifpri.org/oai/oai.php">http://ebrary.ifpri.org/oai/oai.php</a></li>
<li>After reading the <a href="https://www.openarchives.org/OAI/openarchivesprotocol.html">OAI documentation</a> and testing with an <a href="http://validator.oaipmh.com/">OAI validator</a> I found out how to get their publications</li>
<li>This is their publications set: <a href="http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&from=2016-01-01&set=p15738coll2&metadataPrefix=oai_dc">http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&from=2016-01-01&set=p15738coll2&metadataPrefix=oai_dc</a></li>
<li>You can see the others by using the OAI <code>ListSets</code> verb: <a href="http://ebrary.ifpri.org/oai/oai.php?verb=ListSets">http://ebrary.ifpri.org/oai/oai.php?verb=ListSets</a></li>
<li>Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in <code>dc.identifier.fund</code> to <code>cg.identifier.cpwfproject</code> and then the rest to <code>dc.description.sponsorship</code></li>
</ul>
<pre><code>dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA');
UPDATE 497
dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75;</code></pre>
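<p>The matching rule in the two updates above can be sketched in Python to preview where a given <code>dc.identifier.fund</code> value would end up before touching the database. This is only an illustration: the field IDs are the ones from the SQL, but the function name, constant names, and sample values are hypothetical.</p>

```python
# Field IDs are taken from the SQL above; all names here are hypothetical
# and exist only for this sketch.
FUND_FIELD_ID = 75            # dc.identifier.fund (source field)
CPWF_PROJECT_FIELD_ID = 130   # cg.identifier.cpwfproject
SPONSORSHIP_FIELD_ID = 29     # dc.description.sponsorship


def target_field_id(text_value: str) -> int:
    """Return the field ID a dc.identifier.fund value would be moved to."""
    is_cpwf = (
        text_value.startswith("PN")         # text_value like 'PN%'
        or text_value.startswith("PHASE")   # text_value like 'PHASE%'
        or text_value in ("CBA", "IA")      # exact matches only
    )
    return CPWF_PROJECT_FIELD_ID if is_cpwf else SPONSORSHIP_FIELD_ID


if __name__ == "__main__":
    # Made-up sample values, not taken from the real database.
    for value in ["PN10", "PHASE 2", "CBA", "IA", "Example Donor"]:
        print(value, "->", target_field_id(value))
```

<p>As in the SQL, the order matters: the CPWF-specific values are matched first, and everything left in field 75 falls through to sponsorship.</p>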
<li>Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with <code>cg.coverage.admin-unit</code></li>
<li>But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error</li>
<li>A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue</li>
<li>I found a thread on the mailing list talking about it and there is a bug report and a patch: <a href="https://jira.duraspace.org/browse/DS-2740">https://jira.duraspace.org/browse/DS-2740</a></li>
<li>The patch applies successfully on DSpace 5.1 so I will try it later</li>
<pre><code>dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence = 500;
dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence != 500;</code></pre>
<pre><code>dspacetest=# update metadatavalue set confidence=500 where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security';</code></pre>
<li>Re-sync DSpace Test with CGSpace and perform test of metadata migration again</li>
<li>Run phase two of metadata migrations on CGSpace (see the <a href="https://gist.github.com/alanorth/1a730bec5ac9457a8fb0e3e72c98d09c">migration notes</a>)</li>
<li>Run all system updates and reboot CGSpace server</li>
<pre><code>dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=29 group by text_value order by count desc) to /tmp/sponsorship.csv with csv;</code></pre>
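<p>For reference, the shape of the CSV that this export produces (one row per distinct value, most frequent first) can be reproduced in plain Python. This is a rough sketch on an in-memory list rather than the real table; the function name is hypothetical, and the alphabetical tie-break is an addition that the SQL does not guarantee.</p>

```python
import csv
import io
from collections import Counter


def value_counts_csv(values):
    """Return CSV text of (text_value, count) rows, highest count first."""
    counts = Counter(values)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Sort by descending count, then alphabetically so ties are stable.
    for text_value, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([text_value, count])
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sponsorship values for illustration only.
    print(value_counts_csv(["Donor A", "Donor B", "Donor A", "Donor, C"]))
```

<p>Values containing commas are quoted by the <code>csv</code> module, which is worth remembering when the file is later edited by hand.</p>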
<p>Looks like OAI is kinda obtuse for this, and if we use ContentDM’s API we’ll be able to access their internal field names (rather than trying to figure out how they stuffed them into various, repeated Dublin Core fields)</p>
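<p>To make the repeated-field problem concrete, here is a minimal sketch that builds the <code>ListRecords</code> request used earlier and groups the Dublin Core elements of one record by name. The endpoint, set, and date come from the notes above; the function names and the sample record are made up for illustration and are not real IFPRI data.</p>

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import urlencode

OAI_ENDPOINT = "http://ebrary.ifpri.org/oai/oai.php"
DC_NS = "{http://purl.org/dc/elements/1.1/}"  # standard Dublin Core namespace


def list_records_url(set_spec, from_date, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords URL for one set."""
    params = {
        "verb": "ListRecords",
        "from": from_date,
        "set": set_spec,
        "metadataPrefix": metadata_prefix,
    }
    return OAI_ENDPOINT + "?" + urlencode(params)


def group_dc_fields(record_xml):
    """Map each Dublin Core element name to the list of its values."""
    fields = defaultdict(list)
    for element in ET.fromstring(record_xml).iter():
        if element.tag.startswith(DC_NS) and element.text:
            fields[element.tag[len(DC_NS):]].append(element.text.strip())
    return dict(fields)


# Hypothetical oai_dc record showing one field repeated for different purposes.
SAMPLE_RECORD = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example title</dc:title>
  <dc:identifier>example-id-1</dc:identifier>
  <dc:identifier>example-id-2</dc:identifier>
</oai_dc:dc>"""

if __name__ == "__main__":
    print(list_records_url("p15738coll2", "2016-01-01"))
    print(group_dc_fields(SAMPLE_RECORD))
```

<p>With only the element name to go on, there is no way to tell which <code>identifier</code> is which, and that is the part an API exposing the internal field names would avoid.</p>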
<li>Looks like this is all we need: <a href="https://wiki.lyrasis.org/display/DSDOC5x/Submission+User+Interface#SubmissionUserInterface-ConfiguringControlledVocabularies">https://wiki.lyrasis.org/display/DSDOC5x/Submission+User+Interface#SubmissionUserInterface-ConfiguringControlledVocabularies</a></li>
<pre><code>$ xml sel -t -m '//value-pairs[@value-pairs-name="ilrisubject"]/pair/displayed-value/text()' -c '.' -n dspace/config/input-forms.xml</code></pre>
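<p>Where <code>xmlstarlet</code> is not available, roughly the same extraction can be done with Python's standard library. This is a sketch, not a drop-in replacement: the function name is hypothetical and the sample XML is a made-up fragment that only mimics the <code>value-pairs</code> structure the XPath above relies on.</p>

```python
import xml.etree.ElementTree as ET


def displayed_values(xml_text, list_name):
    """Return the displayed-value texts of one named value-pairs list."""
    root = ET.fromstring(xml_text)
    xpath = f".//value-pairs[@value-pairs-name='{list_name}']/pair/displayed-value"
    return [element.text for element in root.findall(xpath) if element.text]


# Made-up fragment for illustration; the real file has many more lists.
SAMPLE_FORMS = """<input-forms>
  <form-value-pairs>
    <value-pairs value-pairs-name="ilrisubject" dc-term="subject">
      <pair>
        <displayed-value>EXAMPLE SUBJECT A</displayed-value>
        <stored-value>EXAMPLE SUBJECT A</stored-value>
      </pair>
      <pair>
        <displayed-value>EXAMPLE SUBJECT B</displayed-value>
        <stored-value>EXAMPLE SUBJECT B</stored-value>
      </pair>
    </value-pairs>
    <value-pairs value-pairs-name="otherlist" dc-term="type">
      <pair>
        <displayed-value>Ignored</displayed-value>
        <stored-value>Ignored</stored-value>
      </pair>
    </value-pairs>
  </form-value-pairs>
</input-forms>"""

if __name__ == "__main__":
    for value in displayed_values(SAMPLE_FORMS, "ilrisubject"):
        print(value)
```

<p>To run it against the real configuration, the file contents of <code>dspace/config/input-forms.xml</code> would be read and passed in as <code>xml_text</code>.</p>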
<li>In other news, I found out that the About page that we haven’t been using lives in <code>dspace/config/about.xml</code>, so now we can update the text</li>
<li>File bug about <code>closed="true"</code> attribute of controlled vocabularies not working: <a href="https://jira.duraspace.org/browse/DS-3238">https://jira.duraspace.org/browse/DS-3238</a></li>
<li>Atmire explained that the <code>atmire.orcid.id</code> field doesn’t exist in the schema, as it actually comes from the authority cache during XMLUI run time</li>
<li>This means we don’t see it when harvesting via OAI or REST, for example</li>
<li>They opened a feature ticket on the DSpace tracker to ask for support of this: <a href="https://jira.duraspace.org/browse/DS-3239">https://jira.duraspace.org/browse/DS-3239</a></li>
<pre><code>dspacetest=# SELECT authority, confidence FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=3 AND text_value = 'CGIAR Research Program on Climate Change, Agriculture and Food Security';
dspacetest=# UPDATE metadatavalue set confidence = 500 where resource_type_id=2 AND metadata_field_id=3 AND text_value = 'CGIAR Research Program on Climate Change, Agriculture and Food Security';</code></pre>
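<p>The inspect-then-update pattern above can be rehearsed safely outside the real database. The sketch below uses an in-memory SQLite table as a stand-in for PostgreSQL, with a heavily simplified <code>metadatavalue</code> schema; the function names, the reduced column set, and the sample rows are all assumptions made for illustration.</p>

```python
import sqlite3

VALUE = "CGIAR Research Program on Climate Change, Agriculture and Food Security"


def confidence_counts(conn, value):
    """Return {confidence: row count} for one author value (field 3, items)."""
    rows = conn.execute(
        "SELECT confidence, COUNT(*) FROM metadatavalue "
        "WHERE resource_type_id = 2 AND metadata_field_id = 3 AND text_value = ? "
        "GROUP BY confidence",
        (value,),
    )
    return dict(rows.fetchall())


def set_confidence(conn, value, confidence=500):
    """Set one confidence for every matching row; return rows touched."""
    cursor = conn.execute(
        "UPDATE metadatavalue SET confidence = ? "
        "WHERE resource_type_id = 2 AND metadata_field_id = 3 AND text_value = ?",
        (confidence, value),
    )
    conn.commit()
    return cursor.rowcount


def demo_connection():
    """Build a tiny stand-in table (simplified schema, made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metadatavalue (resource_type_id INTEGER, "
        "metadata_field_id INTEGER, text_value TEXT, confidence INTEGER)"
    )
    conn.executemany(
        "INSERT INTO metadatavalue VALUES (2, 3, ?, ?)",
        [(VALUE, 500), (VALUE, 600), (VALUE, -1), ("Another author", 600)],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print("before:", confidence_counts(conn, VALUE))
    print("updated rows:", set_confidence(conn, VALUE))
    print("after:", confidence_counts(conn, VALUE))
```

<p>Checking the per-confidence counts before and after makes it easy to confirm that only the intended value was touched.</p>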
<li>After the database edit, I did a full Discovery re-index</li>
<li>And now there are exactly 960 items in the authors facet for ‘CGIAR Research Program on Climate Change, Agriculture and Food Security’</li>
<li>Now I ran the same on CGSpace</li>
<li>Merge controlled vocabulary functionality for animal breeds to <code>5_x-prod</code> (<a href="https://github.com/ilri/DSpace/pull/236">#236</a>)</li>
<li>Write Python script to update metadata values in batch via PostgreSQL: <a href="https://gist.github.com/alanorth/df92cbfb54d762ba21b28f7cd83b6897">fix-metadata-values.py</a></li>
<li>We need to use this to correct some pretty ugly values in fields like <code>dc.description.sponsorship</code></li>
<li>Merge item display tweaks from earlier this week (<a href="https://github.com/ilri/DSpace/pull/231">#231</a>)</li>
<li>Merge controlled vocabulary functionality for subregions (<a href="https://github.com/ilri/DSpace/pull/238">#238</a>)</li>
<p>Clean up titles and hints in <code>input-forms.xml</code> to use title/sentence case and a few more consistency things (<a href="https://github.com/ilri/DSpace/pull/241">#241</a>)</p>
</li>
<li>
<p>The final list of fields to migrate in the third phase of metadata migrations is:</p>
<p>Interesting “Sunburst” visualization on a Digital Commons page: <a href="http://www.repository.law.indiana.edu/sunburst.html">http://www.repository.law.indiana.edu/sunburst.html</a></p>
</li>
<li>
<p>Final testing on metadata fix/delete for <code>dc.description.sponsorship</code> cleanup</p>
</li>
<li>
<p>Need to run <code>fix-metadata-values.py</code> and then <code>fix-metadata-values.py</code></p>
<pre><code>dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=126 group by text_value order by count desc) to /tmp/contributors-june28.csv with csv;</code></pre>
<li>Re-evaluate <code>dc.contributor.corporate</code> and it seems we will move it to <code>dc.contributor.author</code> as this is more in line with how editors are actually using it</li>