Add notes for 2016-06-07

Alan Orth 2016-06-08 00:28:43 +03:00
parent f1cfef0582
commit c9c6800dcd
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
9 changed files with 426 additions and 331 deletions

View File

@@ -1,5 +1,5 @@
+++
date = "2016-05-01T10:53:00+03:00"
date = "2016-06-01T10:53:00+03:00"
author = "Alan Orth"
title = "June, 2016"
tags = ["notes"]
@@ -110,3 +110,23 @@ webui.browse.index.2 = author:metadataAuthority:dc.contributor.author:authority
- Re-sync DSpace Test with CGSpace and perform test of metadata migration again
- Run phase two of metadata migrations on CGSpace (see the [migration notes](https://gist.github.com/alanorth/1a730bec5ac9457a8fb0e3e72c98d09c))
- Run all system updates and reboot CGSpace server
## 2016-06-07
- Figured out how to export a list of the unique values from a metadata field ordered by count:
```
dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=29 group by text_value order by count desc) to /tmp/sponsorship.csv with csv;
```
- Identified the next round of fields to migrate:
- dc.title.jtitle → dc.source
- dc.crsubject.crpsubject → cg.contributor.crp
- dc.contributor.affiliation → cg.contributor.affiliation
- dc.Species → cg.species
- dc.contributor.corporate → dc.contributor
- dc.identifier.url → cg.identifier.url
- dc.identifier.doi → cg.identifier.doi
- dc.identifier.googleurl → cg.identifier.googleurl
- dc.identifier.dataurl → cg.identifier.dataurl
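
The count-ordered export above can be sketched outside `psql` as well. This is only a rough illustration, assuming an in-memory SQLite table in place of the real PostgreSQL `metadatavalue` table; the function name `export_value_counts`, the alias `n`, and the sample rows are made up for the example and are not part of DSpace:

```python
import csv
import io
import sqlite3


def export_value_counts(conn, field_id, out):
    """Write each distinct text_value of one metadata field to `out` as CSV,
    most frequent first (ties broken alphabetically for a stable order)."""
    rows = conn.execute(
        """
        SELECT text_value, COUNT(*) AS n
        FROM metadatavalue
        WHERE resource_type_id = 2 AND metadata_field_id = ?
        GROUP BY text_value
        ORDER BY n DESC, text_value ASC
        """,
        (field_id,),
    )
    csv.writer(out).writerows(rows)


# Toy stand-in for the DSpace table; these rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadatavalue ("
    "resource_type_id INTEGER, metadata_field_id INTEGER, text_value TEXT)"
)
conn.executemany(
    "INSERT INTO metadatavalue VALUES (?, ?, ?)",
    [
        (2, 29, "Donor A"),
        (2, 29, "Donor B"),
        (2, 29, "Donor A"),
        (2, 3, "Orth, Alan"),  # different metadata field, filtered out
        (4, 29, "Donor A"),    # different resource type, filtered out
    ],
)

buffer = io.StringIO()
export_value_counts(conn, 29, buffer)
print(buffer.getvalue())
```

Passing an open file instead of the `StringIO` buffer would write the CSV to disk, which is closer to what `\copy ... to /tmp/sponsorship.csv with csv` does in the session above.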

View File

@@ -550,7 +550,7 @@ dspace.log.2016-04-27:7271
<li class="previous"><a href="/cgspace-notes/2016-03/"><span aria-hidden="true">&larr;</span> Older</a></li>
<li class="next"><a href="/cgspace-notes/2016-06/">Newer <span aria-hidden="true">&rarr;</span></a></li>
<li class="next"><a href="/cgspace-notes/2016-05/">Newer <span aria-hidden="true">&rarr;</span></a></li>
</ul>
</footer>

View File

@@ -393,10 +393,10 @@ sys 0m20.540s
</section>
<ul class="pager">
<li class="previous"><a href="/cgspace-notes/2016-06/"><span aria-hidden="true">&larr;</span> Older</a></li>
<li class="previous"><a href="/cgspace-notes/2016-04/"><span aria-hidden="true">&larr;</span> Older</a></li>
<li class="next disabled"><a href="#">Newer <span aria-hidden="true">&rarr;</span></a></li>
<li class="next"><a href="/cgspace-notes/2016-06/">Newer <span aria-hidden="true">&rarr;</span></a></li>
</ul>
</footer>

View File

@@ -11,7 +11,7 @@
<meta property="og:type" content="article" />
<meta property="og:article:published_time" content="2016-05-01T10:53:00&#43;03:00" />
<meta property="og:article:published_time" content="2016-06-01T10:53:00&#43;03:00" />
<meta property="og:article:tag" content="notes" />
@@ -65,8 +65,8 @@
<div class="post-meta clearfix">
<div class="post-date pull-left">
Posted on
<time datetime="2016-05-01T10:53:00&#43;03:00">
May 1, 2016
<time datetime="2016-06-01T10:53:00&#43;03:00">
Jun 1, 2016
</time>
</div>
<div class="pull-right">
@@ -197,6 +197,31 @@ UPDATE 960
<li>Re-sync DSpace Test with CGSpace and perform test of metadata migration again</li>
<li>Run phase two of metadata migrations on CGSpace (see the <a href="https://gist.github.com/alanorth/1a730bec5ac9457a8fb0e3e72c98d09c">migration notes</a>)</li>
<li>Run all system updates and reboot CGSpace server</li>
</ul>
<h2 id="2016-06-07:6783872e82b68b1517e00f494e6b6504">2016-06-07</h2>
<ul>
<li>Figured out how to export a list of the unique values from a metadata field ordered by count:</li>
</ul>
<pre><code>dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=29 group by text_value order by count desc) to /tmp/sponsorship.csv with csv;
</code></pre>
<ul>
<li>Identified the next round of fields to migrate:
<ul>
<li>dc.title.jtitle → dc.source</li>
<li>dc.crsubject.crpsubject → cg.contributor.crp</li>
<li>dc.contributor.affiliation → cg.contributor.affiliation</li>
<li>dc.Species → cg.species</li>
<li>dc.contributor.corporate → dc.contributor</li>
<li>dc.identifier.url → cg.identifier.url</li>
<li>dc.identifier.doi → cg.identifier.doi</li>
<li>dc.identifier.googleurl → cg.identifier.googleurl</li>
<li>dc.identifier.dataurl → cg.identifier.dataurl</li>
</ul></li>
</ul>
</section>
@@ -216,10 +241,10 @@ UPDATE 960
</section>
<ul class="pager">
<li class="previous"><a href="/cgspace-notes/2016-04/"><span aria-hidden="true">&larr;</span> Older</a></li>
<li class="previous"><a href="/cgspace-notes/2016-05/"><span aria-hidden="true">&larr;</span> Older</a></li>
<li class="next"><a href="/cgspace-notes/2016-05/">Newer <span aria-hidden="true">&rarr;</span></a></li>
<li class="next disabled"><a href="#">Newer <span aria-hidden="true">&rarr;</span></a></li>
</ul>
</footer>

View File

@@ -58,6 +58,34 @@
<div class="article-list">
<article>
<header>
<h2><a href="/cgspace-notes/2016-06/">June, 2016</a></h2>
<div class="post-meta clearfix">
<div class="post-date pull-left">
Posted on
<time datetime="2016-06-01T10:53:00&#43;03:00">
Jun 1, 2016
</time>
</div>
</div>
</header>
<div>
2016-06-01 Experimenting with IFPRI OAI (we want to harvest their publications) After reading the ContentDM documentation I found IFPRI&rsquo;s OAI endpoint: http://ebrary.ifpri.org/oai/oai.php After reading the OAI documentation and testing with an OAI validator I found out how to get their publications This is their publications set: http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc You can see the others by using the OAI ListSets verb: http://ebrary.ifpri.org/oai/oai.php?verb=ListSets Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in dc.identifier.fund to cg.identifier.cpwfproject and then the rest to dc.description.sponsorship dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA'); UPDATE 497 dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75; UPDATE 14 Fix a few minor miscellaneous issues in dspace.cfg (#227) 2016-06-02 Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with cg.coverage.admin-unit Seems that the Browse configuration in dspace.cfg can&rsquo;t handle the &lsquo;-&rsquo; in the field name: webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error I&rsquo;ve sent a message to the DSpace mailing list to ask about the Browse index definition A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue I found a thread on the mailing list talking about it and there is bug report and a patch: https://jira.duraspace.org/browse/DS-2740 The patch applies successfully on DSpace 5.1 so I will try it later 
2016-06-03 Investigating the CCAFS authority issue, I exported the metadata for the Videos collection The top two authors are: CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500 CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600 So the only difference is the &ldquo;confidence&rdquo; Ok, well THAT is interesting: dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like '%Orth, %'; text_value | authority | confidence ------------+--------------------------------------+------------ Orth, A.
</div>
<footer>
<ul class="pager">
<li class="next"><a href="/cgspace-notes/2016-06/">Read more <span aria-hidden="true">&raquo;</span></a></li>
</ul>
</footer>
</article>
<hr/>
<article>
<header>
<h2><a href="/cgspace-notes/2016-05/">May, 2016</a></h2>
@@ -84,34 +112,6 @@
<hr/>
<article>
<header>
<h2><a href="/cgspace-notes/2016-06/">June, 2016</a></h2>
<div class="post-meta clearfix">
<div class="post-date pull-left">
Posted on
<time datetime="2016-05-01T10:53:00&#43;03:00">
May 1, 2016
</time>
</div>
</div>
</header>
<div>
2016-06-01 Experimenting with IFPRI OAI (we want to harvest their publications) After reading the ContentDM documentation I found IFPRI&rsquo;s OAI endpoint: http://ebrary.ifpri.org/oai/oai.php After reading the OAI documentation and testing with an OAI validator I found out how to get their publications This is their publications set: http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc You can see the others by using the OAI ListSets verb: http://ebrary.ifpri.org/oai/oai.php?verb=ListSets Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in dc.identifier.fund to cg.identifier.cpwfproject and then the rest to dc.description.sponsorship dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA'); UPDATE 497 dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75; UPDATE 14 Fix a few minor miscellaneous issues in dspace.cfg (#227) 2016-06-02 Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with cg.coverage.admin-unit Seems that the Browse configuration in dspace.cfg can&rsquo;t handle the &lsquo;-&rsquo; in the field name: webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error I&rsquo;ve sent a message to the DSpace mailing list to ask about the Browse index definition A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue I found a thread on the mailing list talking about it and there is bug report and a patch: https://jira.duraspace.org/browse/DS-2740 The patch applies successfully on DSpace 5.1 so I will try it later 
2016-06-03 Investigating the CCAFS authority issue, I exported the metadata for the Videos collection The top two authors are: CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500 CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600 So the only difference is the &ldquo;confidence&rdquo; Ok, well THAT is interesting: dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like '%Orth, %'; text_value | authority | confidence ------------+--------------------------------------+------------ Orth, A.
</div>
<footer>
<ul class="pager">
<li class="next"><a href="/cgspace-notes/2016-06/">Read more <span aria-hidden="true">&raquo;</span></a></li>
</ul>
</footer>
</article>
<hr/>
<article>

View File

@@ -6,9 +6,164 @@
<description>Recent content on CGSpace Notes</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<lastBuildDate>Sun, 01 May 2016 23:06:00 +0300</lastBuildDate>
<lastBuildDate>Wed, 01 Jun 2016 10:53:00 +0300</lastBuildDate>
<atom:link href="/cgspace-notes/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>June, 2016</title>
<link>/cgspace-notes/2016-06/</link>
<pubDate>Wed, 01 Jun 2016 10:53:00 +0300</pubDate>
<guid>/cgspace-notes/2016-06/</guid>
<description>
&lt;h2 id=&#34;2016-06-01:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-01&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Experimenting with IFPRI OAI (we want to harvest their publications)&lt;/li&gt;
&lt;li&gt;After reading the &lt;a href=&#34;https://www.oclc.org/support/services/contentdm/help/server-admin-help/oai-support.en.html&#34;&gt;ContentDM documentation&lt;/a&gt; I found IFPRI&amp;rsquo;s OAI endpoint: &lt;a href=&#34;http://ebrary.ifpri.org/oai/oai.php&#34;&gt;http://ebrary.ifpri.org/oai/oai.php&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;After reading the &lt;a href=&#34;https://www.openarchives.org/OAI/openarchivesprotocol.html&#34;&gt;OAI documentation&lt;/a&gt; and testing with an &lt;a href=&#34;http://validator.oaipmh.com/&#34;&gt;OAI validator&lt;/a&gt; I found out how to get their publications&lt;/li&gt;
&lt;li&gt;This is their publications set: &lt;a href=&#34;http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;amp;from=2016-01-01&amp;amp;set=p15738coll2&amp;amp;metadataPrefix=oai_dc&#34;&gt;http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;amp;from=2016-01-01&amp;amp;set=p15738coll2&amp;amp;metadataPrefix=oai_dc&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;You can see the others by using the OAI &lt;code&gt;ListSets&lt;/code&gt; verb: &lt;a href=&#34;http://ebrary.ifpri.org/oai/oai.php?verb=ListSets&#34;&gt;http://ebrary.ifpri.org/oai/oai.php?verb=ListSets&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in &lt;code&gt;dc.identifier.fund&lt;/code&gt; to &lt;code&gt;cg.identifier.cpwfproject&lt;/code&gt; and then the rest to &lt;code&gt;dc.description.sponsorship&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like &#39;PN%&#39; or text_value like &#39;PHASE%&#39; or text_value = &#39;CBA&#39; or text_value = &#39;IA&#39;);
UPDATE 497
dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75;
UPDATE 14
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;Fix a few minor miscellaneous issues in &lt;code&gt;dspace.cfg&lt;/code&gt; (&lt;a href=&#34;https://github.com/ilri/DSpace/pull/227&#34;&gt;#227&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-06-02:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-02&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with &lt;code&gt;cg.coverage.admin-unit&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Seems that the Browse configuration in &lt;code&gt;dspace.cfg&lt;/code&gt; can&amp;rsquo;t handle the &amp;lsquo;-&amp;rsquo; in the field name:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error&lt;/li&gt;
&lt;li&gt;I&amp;rsquo;ve sent a message to the DSpace mailing list to ask about the Browse index definition&lt;/li&gt;
&lt;li&gt;A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue&lt;/li&gt;
&lt;li&gt;I found a thread on the mailing list talking about it and there is a bug report and a patch: &lt;a href=&#34;https://jira.duraspace.org/browse/DS-2740&#34;&gt;https://jira.duraspace.org/browse/DS-2740&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;The patch applies successfully on DSpace 5.1 so I will try it later&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-06-03:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-03&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Investigating the CCAFS authority issue, I exported the metadata for the Videos collection&lt;/li&gt;
&lt;li&gt;The top two authors are:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500
CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;So the only difference is the &amp;ldquo;confidence&amp;rdquo;&lt;/li&gt;
&lt;li&gt;Ok, well THAT is interesting:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like &#39;%Orth, %&#39;;
text_value | authority | confidence
------------+--------------------------------------+------------
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, Alan | | -1
Orth, Alan | | -1
Orth, Alan | | -1
Orth, Alan | | -1
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
(13 rows)
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;And now an actually relevant example:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like &#39;CGIAR Research Program on Climate Change, Agriculture and Food Security&#39; and confidence = 500;
count
-------
707
(1 row)
dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like &#39;CGIAR Research Program on Climate Change, Agriculture and Food Security&#39; and confidence != 500;
count
-------
253
(1 row)
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;Trying something experimental:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# update metadatavalue set confidence=500 where metadata_field_id=3 and text_value like &#39;CGIAR Research Program on Climate Change, Agriculture and Food Security&#39;;
UPDATE 960
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;And then re-indexing authority and Discovery&amp;hellip;?&lt;/li&gt;
&lt;li&gt;After Discovery reindex the CCAFS authors are all together in the Authors sidebar facet&lt;/li&gt;
&lt;li&gt;The docs for the ORCiD and Authority stuff for DSpace 5 mention changing the browse indexes to use the Authority as well:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;webui.browse.index.2 = author:metadataAuthority:dc.contributor.author:authority
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;That would only be for the &amp;ldquo;Browse by&amp;rdquo; function&amp;hellip; so we&amp;rsquo;ll have to see what effect that has later&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-06-04:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-04&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Re-sync DSpace Test with CGSpace and perform test of metadata migration again&lt;/li&gt;
&lt;li&gt;Run phase two of metadata migrations on CGSpace (see the &lt;a href=&#34;https://gist.github.com/alanorth/1a730bec5ac9457a8fb0e3e72c98d09c&#34;&gt;migration notes&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Run all system updates and reboot CGSpace server&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-06-07:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-07&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Figured out how to export a list of the unique values from a metadata field ordered by count:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=29 group by text_value order by count desc) to /tmp/sponsorship.csv with csv;
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;Identified the next round of fields to migrate:
&lt;ul&gt;
&lt;li&gt;dc.title.jtitle → dc.source&lt;/li&gt;
&lt;li&gt;dc.crsubject.crpsubject → cg.contributor.crp&lt;/li&gt;
&lt;li&gt;dc.contributor.affiliation → cg.contributor.affiliation&lt;/li&gt;
&lt;li&gt;dc.Species → cg.species&lt;/li&gt;
&lt;li&gt;dc.contributor.corporate → dc.contributor&lt;/li&gt;
&lt;li&gt;dc.identifier.url → cg.identifier.url&lt;/li&gt;
&lt;li&gt;dc.identifier.doi → cg.identifier.doi&lt;/li&gt;
&lt;li&gt;dc.identifier.googleurl → cg.identifier.googleurl&lt;/li&gt;
&lt;li&gt;dc.identifier.dataurl → cg.identifier.dataurl&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
</item>
<item>
<title>May, 2016</title>
<link>/cgspace-notes/2016-05/</link>
@@ -316,136 +471,6 @@ sys 0m20.540s
</description>
</item>
<item>
<title>June, 2016</title>
<link>/cgspace-notes/2016-06/</link>
<pubDate>Sun, 01 May 2016 10:53:00 +0300</pubDate>
<guid>/cgspace-notes/2016-06/</guid>
<description>
&lt;h2 id=&#34;2016-06-01:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-01&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Experimenting with IFPRI OAI (we want to harvest their publications)&lt;/li&gt;
&lt;li&gt;After reading the &lt;a href=&#34;https://www.oclc.org/support/services/contentdm/help/server-admin-help/oai-support.en.html&#34;&gt;ContentDM documentation&lt;/a&gt; I found IFPRI&amp;rsquo;s OAI endpoint: &lt;a href=&#34;http://ebrary.ifpri.org/oai/oai.php&#34;&gt;http://ebrary.ifpri.org/oai/oai.php&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;After reading the &lt;a href=&#34;https://www.openarchives.org/OAI/openarchivesprotocol.html&#34;&gt;OAI documentation&lt;/a&gt; and testing with an &lt;a href=&#34;http://validator.oaipmh.com/&#34;&gt;OAI validator&lt;/a&gt; I found out how to get their publications&lt;/li&gt;
&lt;li&gt;This is their publications set: &lt;a href=&#34;http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;amp;from=2016-01-01&amp;amp;set=p15738coll2&amp;amp;metadataPrefix=oai_dc&#34;&gt;http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;amp;from=2016-01-01&amp;amp;set=p15738coll2&amp;amp;metadataPrefix=oai_dc&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;You can see the others by using the OAI &lt;code&gt;ListSets&lt;/code&gt; verb: &lt;a href=&#34;http://ebrary.ifpri.org/oai/oai.php?verb=ListSets&#34;&gt;http://ebrary.ifpri.org/oai/oai.php?verb=ListSets&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in &lt;code&gt;dc.identifier.fund&lt;/code&gt; to &lt;code&gt;cg.identifier.cpwfproject&lt;/code&gt; and then the rest to &lt;code&gt;dc.description.sponsorship&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like &#39;PN%&#39; or text_value like &#39;PHASE%&#39; or text_value = &#39;CBA&#39; or text_value = &#39;IA&#39;);
UPDATE 497
dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75;
UPDATE 14
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;Fix a few minor miscellaneous issues in &lt;code&gt;dspace.cfg&lt;/code&gt; (&lt;a href=&#34;https://github.com/ilri/DSpace/pull/227&#34;&gt;#227&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-06-02:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-02&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with &lt;code&gt;cg.coverage.admin-unit&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Seems that the Browse configuration in &lt;code&gt;dspace.cfg&lt;/code&gt; can&amp;rsquo;t handle the &amp;lsquo;-&amp;rsquo; in the field name:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error&lt;/li&gt;
&lt;li&gt;I&amp;rsquo;ve sent a message to the DSpace mailing list to ask about the Browse index definition&lt;/li&gt;
&lt;li&gt;A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue&lt;/li&gt;
&lt;li&gt;I found a thread on the mailing list talking about it and there is a bug report and a patch: &lt;a href=&#34;https://jira.duraspace.org/browse/DS-2740&#34;&gt;https://jira.duraspace.org/browse/DS-2740&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;The patch applies successfully on DSpace 5.1 so I will try it later&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-06-03:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-03&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Investigating the CCAFS authority issue, I exported the metadata for the Videos collection&lt;/li&gt;
&lt;li&gt;The top two authors are:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500
CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;So the only difference is the &amp;ldquo;confidence&amp;rdquo;&lt;/li&gt;
&lt;li&gt;Ok, well THAT is interesting:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like &#39;%Orth, %&#39;;
text_value | authority | confidence
------------+--------------------------------------+------------
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, Alan | | -1
Orth, Alan | | -1
Orth, Alan | | -1
Orth, Alan | | -1
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
(13 rows)
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;And now an actually relevant example:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like &#39;CGIAR Research Program on Climate Change, Agriculture and Food Security&#39; and confidence = 500;
count
-------
707
(1 row)
dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like &#39;CGIAR Research Program on Climate Change, Agriculture and Food Security&#39; and confidence != 500;
count
-------
253
(1 row)
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;Trying something experimental:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# update metadatavalue set confidence=500 where metadata_field_id=3 and text_value like &#39;CGIAR Research Program on Climate Change, Agriculture and Food Security&#39;;
UPDATE 960
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;And then re-indexing authority and Discovery&amp;hellip;?&lt;/li&gt;
&lt;li&gt;After Discovery reindex the CCAFS authors are all together in the Authors sidebar facet&lt;/li&gt;
&lt;li&gt;The docs for the ORCiD and Authority stuff for DSpace 5 mention changing the browse indexes to use the Authority as well:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;webui.browse.index.2 = author:metadataAuthority:dc.contributor.author:authority
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;That would only be for the &amp;ldquo;Browse by&amp;rdquo; function&amp;hellip; so we&amp;rsquo;ll have to see what effect that has later&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-06-04:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-04&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Re-sync DSpace Test with CGSpace and perform test of metadata migration again&lt;/li&gt;
&lt;li&gt;Run phase two of metadata migrations on CGSpace (see the &lt;a href=&#34;https://gist.github.com/alanorth/1a730bec5ac9457a8fb0e3e72c98d09c&#34;&gt;migration notes&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Run all system updates and reboot CGSpace server&lt;/li&gt;
&lt;/ul&gt;
</description>
</item>
<item>
<title>April, 2016</title>
<link>/cgspace-notes/2016-04/</link>

View File

@@ -3,20 +3,20 @@
<url>
<loc>/cgspace-notes/</loc>
<lastmod>2016-05-01T23:06:00+03:00</lastmod>
<lastmod>2016-06-01T10:53:00+03:00</lastmod>
<priority>0</priority>
</url>
<url>
<loc>/cgspace-notes/2016-06/</loc>
<lastmod>2016-06-01T10:53:00+03:00</lastmod>
</url>
<url>
<loc>/cgspace-notes/2016-05/</loc>
<lastmod>2016-05-01T23:06:00+03:00</lastmod>
</url>
<url>
<loc>/cgspace-notes/2016-06/</loc>
<lastmod>2016-05-01T10:53:00+03:00</lastmod>
</url>
<url>
<loc>/cgspace-notes/2016-04/</loc>
<lastmod>2016-04-04T11:06:00+03:00</lastmod>

View File

@@ -61,6 +61,32 @@
<section class="article-list">
<h1>Notes</h1>
<hr/>
<article>
<header>
<h2><a href="/cgspace-notes/2016-06/">June, 2016</a></h2>
<div class="post-meta clearfix">
<div class="post-date pull-left">
Posted on
<time datetime="2016-06-01T10:53:00&#43;03:00">
Jun 1, 2016
</time>
</div>
</div>
</header>
<div>
2016-06-01 Experimenting with IFPRI OAI (we want to harvest their publications) After reading the ContentDM documentation I found IFPRI&rsquo;s OAI endpoint: http://ebrary.ifpri.org/oai/oai.php After reading the OAI documentation and testing with an OAI validator I found out how to get their publications This is their publications set: http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc You can see the others by using the OAI ListSets verb: http://ebrary.ifpri.org/oai/oai.php?verb=ListSets Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in dc.identifier.fund to cg.identifier.cpwfproject and then the rest to dc.description.sponsorship dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA'); UPDATE 497 dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75; UPDATE 14 Fix a few minor miscellaneous issues in dspace.cfg (#227) 2016-06-02 Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with cg.coverage.admin-unit Seems that the Browse configuration in dspace.cfg can&rsquo;t handle the &lsquo;-&rsquo; in the field name: webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error I&rsquo;ve sent a message to the DSpace mailing list to ask about the Browse index definition A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue I found a thread on the mailing list talking about it and there is bug report and a patch: https://jira.duraspace.org/browse/DS-2740 The patch applies successfully on DSpace 5.1 so I will try it later 
2016-06-03 Investigating the CCAFS authority issue, I exported the metadata for the Videos collection The top two authors are: CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500 CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600 So the only difference is the &ldquo;confidence&rdquo; Ok, well THAT is interesting: dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like '%Orth, %'; text_value | authority | confidence ------------+--------------------------------------+------------ Orth, A.
</div>
<footer>
<ul class="pager">
<li class="next"><a href="/cgspace-notes/2016-06/">Read more <span aria-hidden="true">&raquo;</span></a></li>
</ul>
</footer>
</article>
<hr/>
<article>
<header>
@@ -87,32 +113,6 @@
</article>
<hr/>
<article>
<header>
<h2><a href="/cgspace-notes/2016-06/">June, 2016</a></h2>
<div class="post-meta clearfix">
<div class="post-date pull-left">
Posted on
<time datetime="2016-05-01T10:53:00&#43;03:00">
May 1, 2016
</time>
</div>
</div>
</header>
<div>
2016-06-01 Experimenting with IFPRI OAI (we want to harvest their publications) After reading the ContentDM documentation I found IFPRI&rsquo;s OAI endpoint: http://ebrary.ifpri.org/oai/oai.php After reading the OAI documentation and testing with an OAI validator I found out how to get their publications This is their publications set: http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc You can see the others by using the OAI ListSets verb: http://ebrary.ifpri.org/oai/oai.php?verb=ListSets Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in dc.identifier.fund to cg.identifier.cpwfproject and then the rest to dc.description.sponsorship dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA'); UPDATE 497 dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75; UPDATE 14 Fix a few minor miscellaneous issues in dspace.cfg (#227) 2016-06-02 Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with cg.coverage.admin-unit Seems that the Browse configuration in dspace.cfg can&rsquo;t handle the &lsquo;-&rsquo; in the field name: webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error I&rsquo;ve sent a message to the DSpace mailing list to ask about the Browse index definition A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue I found a thread on the mailing list talking about it and there is bug report and a patch: https://jira.duraspace.org/browse/DS-2740 The patch applies successfully on DSpace 5.1 so I will try it later 
2016-06-03 Investigating the CCAFS authority issue, I exported the metadata for the Videos collection The top two authors are: CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500 CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600 So the only difference is the &ldquo;confidence&rdquo; Ok, well THAT is interesting: dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like '%Orth, %'; text_value | authority | confidence ------------+--------------------------------------+------------ Orth, A.
</div>
<footer>
<ul class="pager">
<li class="next"><a href="/cgspace-notes/2016-06/">Read more <span aria-hidden="true">&raquo;</span></a></li>
</ul>
</footer>
</article>
<hr/>
<article>
<header>

View File

@ -6,9 +6,164 @@
<description>Recent content in Notes on CGSpace Notes</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<lastBuildDate>Sun, 01 May 2016 23:06:00 +0300</lastBuildDate>
<lastBuildDate>Wed, 01 Jun 2016 10:53:00 +0300</lastBuildDate>
<atom:link href="/cgspace-notes/tags/notes/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>June, 2016</title>
<link>/cgspace-notes/2016-06/</link>
<pubDate>Wed, 01 Jun 2016 10:53:00 +0300</pubDate>
<guid>/cgspace-notes/2016-06/</guid>
<description>
&lt;h2 id=&#34;2016-06-01:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-01&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Experimenting with IFPRI OAI (we want to harvest their publications)&lt;/li&gt;
&lt;li&gt;After reading the &lt;a href=&#34;https://www.oclc.org/support/services/contentdm/help/server-admin-help/oai-support.en.html&#34;&gt;ContentDM documentation&lt;/a&gt; I found IFPRI&amp;rsquo;s OAI endpoint: &lt;a href=&#34;http://ebrary.ifpri.org/oai/oai.php&#34;&gt;http://ebrary.ifpri.org/oai/oai.php&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;After reading the &lt;a href=&#34;https://www.openarchives.org/OAI/openarchivesprotocol.html&#34;&gt;OAI documentation&lt;/a&gt; and testing with an &lt;a href=&#34;http://validator.oaipmh.com/&#34;&gt;OAI validator&lt;/a&gt; I found out how to get their publications&lt;/li&gt;
&lt;li&gt;This is their publications set: &lt;a href=&#34;http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;amp;from=2016-01-01&amp;amp;set=p15738coll2&amp;amp;metadataPrefix=oai_dc&#34;&gt;http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;amp;from=2016-01-01&amp;amp;set=p15738coll2&amp;amp;metadataPrefix=oai_dc&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;You can see the others by using the OAI &lt;code&gt;ListSets&lt;/code&gt; verb: &lt;a href=&#34;http://ebrary.ifpri.org/oai/oai.php?verb=ListSets&#34;&gt;http://ebrary.ifpri.org/oai/oai.php?verb=ListSets&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in &lt;code&gt;dc.identifier.fund&lt;/code&gt; to &lt;code&gt;cg.identifier.cpwfproject&lt;/code&gt; and then the rest to &lt;code&gt;dc.description.sponsorship&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like &#39;PN%&#39; or text_value like &#39;PHASE%&#39; or text_value = &#39;CBA&#39; or text_value = &#39;IA&#39;);
UPDATE 497
dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75;
UPDATE 14
&lt;/code&gt;&lt;/pre&gt;
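&lt;p&gt;The order of the two updates above matters: the second statement has no filter on the value, so it only does the right thing once the CPWF-specific rows have already left field 75. A minimal sketch of that sequence, assuming an in-memory SQLite table as a stand-in for the real PostgreSQL &lt;code&gt;metadatavalue&lt;/code&gt; table; the field IDs come from the notes, the sample rows are invented for illustration:&lt;/p&gt;

```python
import sqlite3

# Hypothetical miniature of metadatavalue; only the two columns used here.
# Field IDs 75, 130 and 29 are from the notes, the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadatavalue (metadata_field_id INTEGER, text_value TEXT)")
conn.executemany(
    "INSERT INTO metadatavalue VALUES (?, ?)",
    [
        (75, "PN10"),
        (75, "PHASE 2"),
        (75, "CBA"),
        (75, "IA"),
        (75, "Some Donor Agency"),
    ],
)

# Step 1: move the CPWF-specific values out of field 75 first.
step1 = conn.execute(
    "UPDATE metadatavalue SET metadata_field_id=130 "
    "WHERE metadata_field_id=75 AND (text_value LIKE 'PN%' "
    "OR text_value LIKE 'PHASE%' OR text_value = 'CBA' OR text_value = 'IA')"
)
moved_to_cpwf = step1.rowcount

# Step 2: everything still left in field 75 becomes sponsorship.
step2 = conn.execute(
    "UPDATE metadatavalue SET metadata_field_id=29 WHERE metadata_field_id=75"
)
moved_to_sponsorship = step2.rowcount

remaining = conn.execute(
    "SELECT count(*) FROM metadatavalue WHERE metadata_field_id=75"
).fetchone()[0]
print(moved_to_cpwf, moved_to_sponsorship, remaining)
```

&lt;p&gt;Running the statements in the opposite order would send every row to field 29 and leave nothing for the first update to match.&lt;/p&gt;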
&lt;ul&gt;
&lt;li&gt;Fix a few minor miscellaneous issues in &lt;code&gt;dspace.cfg&lt;/code&gt; (&lt;a href=&#34;https://github.com/ilri/DSpace/pull/227&#34;&gt;#227&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-06-02:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-02&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with &lt;code&gt;cg.coverage.admin-unit&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Seems that the Browse configuration in &lt;code&gt;dspace.cfg&lt;/code&gt; can&amp;rsquo;t handle the &amp;lsquo;-&amp;rsquo; in the field name:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error&lt;/li&gt;
&lt;li&gt;I&amp;rsquo;ve sent a message to the DSpace mailing list to ask about the Browse index definition&lt;/li&gt;
&lt;li&gt;A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue&lt;/li&gt;
&lt;li&gt;I found a thread on the mailing list talking about it and there is a bug report and a patch: &lt;a href=&#34;https://jira.duraspace.org/browse/DS-2740&#34;&gt;https://jira.duraspace.org/browse/DS-2740&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;The patch applies successfully on DSpace 5.1 so I will try it later&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-06-03:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-03&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Investigating the CCAFS authority issue, I exported the metadata for the Videos collection&lt;/li&gt;
&lt;li&gt;The top two authors are:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500
CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;So the only difference is the &amp;ldquo;confidence&amp;rdquo;&lt;/li&gt;
&lt;li&gt;Ok, well THAT is interesting:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like &#39;%Orth, %&#39;;
text_value | authority | confidence
------------+--------------------------------------+------------
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, Alan | | -1
Orth, Alan | | -1
Orth, Alan | | -1
Orth, Alan | | -1
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
(13 rows)
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;And now an actually relevant example:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like &#39;CGIAR Research Program on Climate Change, Agriculture and Food Security&#39; and confidence = 500;
count
-------
707
(1 row)
dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like &#39;CGIAR Research Program on Climate Change, Agriculture and Food Security&#39; and confidence != 500;
count
-------
253
(1 row)
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;Trying something experimental:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# update metadatavalue set confidence=500 where metadata_field_id=3 and text_value like &#39;CGIAR Research Program on Climate Change, Agriculture and Food Security&#39;;
UPDATE 960
&lt;/code&gt;&lt;/pre&gt;
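&lt;p&gt;The count-then-normalize sequence above can be sketched as follows, again assuming an in-memory SQLite table in place of PostgreSQL and invented sample rows. The helper &lt;code&gt;confidence_counts&lt;/code&gt; is hypothetical, and the sketch uses &lt;code&gt;=&lt;/code&gt; instead of &lt;code&gt;like&lt;/code&gt; because the pattern in the notes contains no wildcards:&lt;/p&gt;

```python
import sqlite3

NAME = "CGIAR Research Program on Climate Change, Agriculture and Food Security"

# Hypothetical miniature of metadatavalue; field 3 is the author field in the notes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadatavalue "
    "(metadata_field_id INTEGER, text_value TEXT, confidence INTEGER)"
)
conn.executemany(
    "INSERT INTO metadatavalue VALUES (3, ?, ?)",
    [(NAME, 500), (NAME, 500), (NAME, 600), (NAME, -1), ("Orth, Alan", 600)],
)


def confidence_counts(connection, name):
    """Return {confidence: row count} for one author name in field 3."""
    cursor = connection.execute(
        "SELECT confidence, count(*) FROM metadatavalue "
        "WHERE metadata_field_id=3 AND text_value = ? GROUP BY confidence",
        (name,),
    )
    return dict(cursor.fetchall())


before = confidence_counts(conn, NAME)

# Set one confidence value on every row for this author, whatever it was before.
updated = conn.execute(
    "UPDATE metadatavalue SET confidence=500 "
    "WHERE metadata_field_id=3 AND text_value = ?",
    (NAME,),
).rowcount

after = confidence_counts(conn, NAME)
print(before, updated, after)
```

&lt;p&gt;The update touches rows that already had confidence 500 as well, which is why the notes report UPDATE 960, the sum of the 707 and 253 rows counted earlier.&lt;/p&gt;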
&lt;ul&gt;
&lt;li&gt;And then re-indexing authority and Discovery&amp;hellip;?&lt;/li&gt;
&lt;li&gt;After Discovery reindex the CCAFS authors are all together in the Authors sidebar facet&lt;/li&gt;
&lt;li&gt;The docs for the ORCiD and Authority stuff for DSpace 5 mention changing the browse indexes to use the Authority as well:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;webui.browse.index.2 = author:metadataAuthority:dc.contributor.author:authority
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;That would only be for the &amp;ldquo;Browse by&amp;rdquo; function&amp;hellip; so we&amp;rsquo;ll have to see what effect that has later&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-06-04:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-04&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Re-sync DSpace Test with CGSpace and perform test of metadata migration again&lt;/li&gt;
&lt;li&gt;Run phase two of metadata migrations on CGSpace (see the &lt;a href=&#34;https://gist.github.com/alanorth/1a730bec5ac9457a8fb0e3e72c98d09c&#34;&gt;migration notes&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Run all system updates and reboot CGSpace server&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-06-07:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-07&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Figured out how to export a list of the unique values from a metadata field ordered by count:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=29 group by text_value order by count desc) to /tmp/sponsorship.csv with csv;
&lt;/code&gt;&lt;/pre&gt;
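&lt;p&gt;For readers without a psql session at hand, the same group-and-count export can be sketched with Python's standard library. This is only an illustration: it assumes an in-memory SQLite table with invented rows, and it writes the CSV to a string buffer rather than to &lt;code&gt;/tmp/sponsorship.csv&lt;/code&gt;:&lt;/p&gt;

```python
import csv
import io
import sqlite3

# Hypothetical miniature of metadatavalue; IDs 2 and 29 are from the notes,
# the donor names are placeholders.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadatavalue "
    "(resource_type_id INTEGER, metadata_field_id INTEGER, text_value TEXT)"
)
conn.executemany(
    "INSERT INTO metadatavalue VALUES (2, 29, ?)",
    [("Donor A",), ("Donor B",), ("Donor A",), ("Donor A",), ("Donor B",), ("Donor C",)],
)

# One row per distinct value, most frequent first.
rows = conn.execute(
    "SELECT text_value, count(*) AS n FROM metadatavalue "
    "WHERE resource_type_id=2 AND metadata_field_id=29 "
    "GROUP BY text_value ORDER BY n DESC"
).fetchall()

# Serialize as CSV, the same shape psql's copy ... with csv produces.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

&lt;p&gt;A list like this makes it easy to spot near-duplicate values before migrating a field.&lt;/p&gt;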
&lt;ul&gt;
&lt;li&gt;Identified the next round of fields to migrate:
&lt;ul&gt;
&lt;li&gt;dc.title.jtitle → dc.source&lt;/li&gt;
&lt;li&gt;dc.crsubject.crpsubject → cg.contributor.crp&lt;/li&gt;
&lt;li&gt;dc.contributor.affiliation → cg.contributor.affiliation&lt;/li&gt;
&lt;li&gt;dc.Species → cg.species&lt;/li&gt;
&lt;li&gt;dc.contributor.corporate → dc.contributor&lt;/li&gt;
&lt;li&gt;dc.identifier.url → cg.identifier.url&lt;/li&gt;
&lt;li&gt;dc.identifier.doi → cg.identifier.doi&lt;/li&gt;
&lt;li&gt;dc.identifier.googleurl → cg.identifier.googleurl&lt;/li&gt;
&lt;li&gt;dc.identifier.dataurl → cg.identifier.dataurl&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
</item>
<item>
<title>May, 2016</title>
<link>/cgspace-notes/2016-05/</link>
@ -316,136 +471,6 @@ sys 0m20.540s
</description>
</item>
<item>
<title>June, 2016</title>
<link>/cgspace-notes/2016-06/</link>
<pubDate>Sun, 01 May 2016 10:53:00 +0300</pubDate>
<guid>/cgspace-notes/2016-06/</guid>
<description>
&lt;h2 id=&#34;2016-06-01:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-01&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Experimenting with IFPRI OAI (we want to harvest their publications)&lt;/li&gt;
&lt;li&gt;After reading the &lt;a href=&#34;https://www.oclc.org/support/services/contentdm/help/server-admin-help/oai-support.en.html&#34;&gt;ContentDM documentation&lt;/a&gt; I found IFPRI&amp;rsquo;s OAI endpoint: &lt;a href=&#34;http://ebrary.ifpri.org/oai/oai.php&#34;&gt;http://ebrary.ifpri.org/oai/oai.php&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;After reading the &lt;a href=&#34;https://www.openarchives.org/OAI/openarchivesprotocol.html&#34;&gt;OAI documentation&lt;/a&gt; and testing with an &lt;a href=&#34;http://validator.oaipmh.com/&#34;&gt;OAI validator&lt;/a&gt; I found out how to get their publications&lt;/li&gt;
&lt;li&gt;This is their publications set: &lt;a href=&#34;http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;amp;from=2016-01-01&amp;amp;set=p15738coll2&amp;amp;metadataPrefix=oai_dc&#34;&gt;http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;amp;from=2016-01-01&amp;amp;set=p15738coll2&amp;amp;metadataPrefix=oai_dc&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;You can see the others by using the OAI &lt;code&gt;ListSets&lt;/code&gt; verb: &lt;a href=&#34;http://ebrary.ifpri.org/oai/oai.php?verb=ListSets&#34;&gt;http://ebrary.ifpri.org/oai/oai.php?verb=ListSets&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in &lt;code&gt;dc.identifier.fund&lt;/code&gt; to &lt;code&gt;cg.identifier.cpwfproject&lt;/code&gt; and then the rest to &lt;code&gt;dc.description.sponsorship&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like &#39;PN%&#39; or text_value like &#39;PHASE%&#39; or text_value = &#39;CBA&#39; or text_value = &#39;IA&#39;);
UPDATE 497
dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75;
UPDATE 14
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;Fix a few minor miscellaneous issues in &lt;code&gt;dspace.cfg&lt;/code&gt; (&lt;a href=&#34;https://github.com/ilri/DSpace/pull/227&#34;&gt;#227&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-06-02:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-02&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with &lt;code&gt;cg.coverage.admin-unit&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Seems that the Browse configuration in &lt;code&gt;dspace.cfg&lt;/code&gt; can&amp;rsquo;t handle the &amp;lsquo;-&amp;rsquo; in the field name:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error&lt;/li&gt;
&lt;li&gt;I&amp;rsquo;ve sent a message to the DSpace mailing list to ask about the Browse index definition&lt;/li&gt;
&lt;li&gt;A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue&lt;/li&gt;
&lt;li&gt;I found a thread on the mailing list talking about it and there is a bug report and a patch: &lt;a href=&#34;https://jira.duraspace.org/browse/DS-2740&#34;&gt;https://jira.duraspace.org/browse/DS-2740&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;The patch applies successfully on DSpace 5.1 so I will try it later&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-06-03:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-03&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Investigating the CCAFS authority issue, I exported the metadata for the Videos collection&lt;/li&gt;
&lt;li&gt;The top two authors are:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500
CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;So the only difference is the &amp;ldquo;confidence&amp;rdquo;&lt;/li&gt;
&lt;li&gt;Ok, well THAT is interesting:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like &#39;%Orth, %&#39;;
text_value | authority | confidence
------------+--------------------------------------+------------
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, Alan | | -1
Orth, Alan | | -1
Orth, Alan | | -1
Orth, Alan | | -1
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
(13 rows)
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;And now an actually relevant example:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like &#39;CGIAR Research Program on Climate Change, Agriculture and Food Security&#39; and confidence = 500;
count
-------
707
(1 row)
dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like &#39;CGIAR Research Program on Climate Change, Agriculture and Food Security&#39; and confidence != 500;
count
-------
253
(1 row)
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;Trying something experimental:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# update metadatavalue set confidence=500 where metadata_field_id=3 and text_value like &#39;CGIAR Research Program on Climate Change, Agriculture and Food Security&#39;;
UPDATE 960
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;And then re-indexing authority and Discovery&amp;hellip;?&lt;/li&gt;
&lt;li&gt;After Discovery reindex the CCAFS authors are all together in the Authors sidebar facet&lt;/li&gt;
&lt;li&gt;The docs for the ORCiD and Authority stuff for DSpace 5 mention changing the browse indexes to use the Authority as well:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;webui.browse.index.2 = author:metadataAuthority:dc.contributor.author:authority
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;That would only be for the &amp;ldquo;Browse by&amp;rdquo; function&amp;hellip; so we&amp;rsquo;ll have to see what effect that has later&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-06-04:6783872e82b68b1517e00f494e6b6504&#34;&gt;2016-06-04&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Re-sync DSpace Test with CGSpace and perform test of metadata migration again&lt;/li&gt;
&lt;li&gt;Run phase two of metadata migrations on CGSpace (see the &lt;a href=&#34;https://gist.github.com/alanorth/1a730bec5ac9457a8fb0e3e72c98d09c&#34;&gt;migration notes&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Run all system updates and reboot CGSpace server&lt;/li&gt;
&lt;/ul&gt;
</description>
</item>
<item>
<title>April, 2016</title>
<link>/cgspace-notes/2016-04/</link>