mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-11-26 08:28:18 +01:00
Add notes for 2016-06-07
parent
f1cfef0582
commit
c9c6800dcd
@ -1,5 +1,5 @@
|
|||||||
+++
|
+++
|
||||||
date = "2016-05-01T10:53:00+03:00"
|
date = "2016-06-01T10:53:00+03:00"
|
||||||
author = "Alan Orth"
|
author = "Alan Orth"
|
||||||
title = "June, 2016"
|
title = "June, 2016"
|
||||||
tags = ["notes"]
|
tags = ["notes"]
|
||||||
@ -110,3 +110,23 @@ webui.browse.index.2 = author:metadataAuthority:dc.contributor.author:authority
|
|||||||
- Re-sync DSpace Test with CGSpace and perform test of metadata migration again
|
- Re-sync DSpace Test with CGSpace and perform test of metadata migration again
|
||||||
- Run phase two of metadata migrations on CGSpace (see the [migration notes](https://gist.github.com/alanorth/1a730bec5ac9457a8fb0e3e72c98d09c))
|
- Run phase two of metadata migrations on CGSpace (see the [migration notes](https://gist.github.com/alanorth/1a730bec5ac9457a8fb0e3e72c98d09c))
|
||||||
- Run all system updates and reboot CGSpace server
|
- Run all system updates and reboot CGSpace server
|
||||||
|
|
||||||
|
## 2016-06-07
|
||||||
|
|
||||||
|
- Figured out how to export a list of the unique values from a metadata field ordered by count:
|
||||||
|
|
||||||
|
```
|
||||||
|
dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=29 group by text_value order by count desc) to /tmp/sponsorship.csv with csv;
|
||||||
|
```
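A minimal sketch of the same count-by-value export, for trying the query shape without a DSpace database. It uses Python's built-in `sqlite3` and `csv` modules rather than PostgreSQL, and the table contents, the `export_value_counts` helper, and the sample donor names are invented for illustration; only the column names and the `resource_type_id=2` / `metadata_field_id=29` filter come from the command above.

```python
import csv
import io
import sqlite3


def export_value_counts(conn, field_id, out):
    """Write (text_value, count) rows for one metadata field, most frequent first."""
    rows = conn.execute(
        "SELECT text_value, COUNT(*) AS n FROM metadatavalue "
        "WHERE resource_type_id = 2 AND metadata_field_id = ? "
        "GROUP BY text_value ORDER BY n DESC, text_value",
        (field_id,),
    ).fetchall()
    csv.writer(out).writerows(rows)
    return rows


# Toy in-memory table standing in for DSpace's metadatavalue (sample rows are invented).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadatavalue "
    "(resource_type_id INTEGER, metadata_field_id INTEGER, text_value TEXT)"
)
conn.executemany(
    "INSERT INTO metadatavalue VALUES (?, ?, ?)",
    [
        (2, 29, "Donor A"),
        (2, 29, "Donor B"),
        (2, 29, "Donor A"),
        (2, 30, "Other field"),   # different field, ignored
        (3, 29, "Donor A"),       # different resource type, ignored
    ],
)

buffer = io.StringIO()
counts = export_value_counts(conn, 29, buffer)
print(buffer.getvalue())
```

In the psql version, `\copy` runs on the client side, so `/tmp/sponsorship.csv` is written on the machine where `psql` is running rather than on the database server.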
|
||||||
|
|
||||||
|
- Identified the next round of fields to migrate:
|
||||||
|
- dc.title.jtitle → dc.source
|
||||||
|
- dc.crsubject.crpsubject → cg.contributor.crp
|
||||||
|
- dc.contributor.affiliation → cg.contributor.affiliation
|
||||||
|
- dc.Species → cg.species
|
||||||
|
- dc.contributor.corporate → dc.contributor
|
||||||
|
- dc.identifier.url → cg.identifier.url
|
||||||
|
- dc.identifier.doi → cg.identifier.doi
|
||||||
|
- dc.identifier.googleurl → cg.identifier.googleurl
|
||||||
|
- dc.identifier.dataurl → cg.identifier.dataurl
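The list above can be kept as data and turned into statements of the same shape as the phase-two updates from 2016-06-01. This is only a sketch under assumptions: the `build_updates` helper is hypothetical, the numeric IDs in `example_ids` are made up (real IDs would come from the metadata registry), and the real migration may need extra filters such as `resource_type_id`.

```python
# Old -> new field names, copied from the list above.
FIELD_MOVES = {
    "dc.title.jtitle": "dc.source",
    "dc.crsubject.crpsubject": "cg.contributor.crp",
    "dc.contributor.affiliation": "cg.contributor.affiliation",
    "dc.Species": "cg.species",
    "dc.contributor.corporate": "dc.contributor",
    "dc.identifier.url": "cg.identifier.url",
    "dc.identifier.doi": "cg.identifier.doi",
    "dc.identifier.googleurl": "cg.identifier.googleurl",
    "dc.identifier.dataurl": "cg.identifier.dataurl",
}


def build_updates(moves, field_ids):
    """Return one UPDATE statement per move, given a field-name -> id lookup."""
    statements = []
    for old, new in moves.items():
        statements.append(
            f"UPDATE metadatavalue SET metadata_field_id={field_ids[new]} "
            f"WHERE metadata_field_id={field_ids[old]};"
        )
    return statements


# Hypothetical IDs for illustration only; look up the real ones before running anything.
example_ids = {"dc.title.jtitle": 1001, "dc.source": 1002}
example = build_updates({"dc.title.jtitle": "dc.source"}, example_ids)
print("\n".join(example))
```

Generating the statements first makes it easy to review them on DSpace Test before anything is run against CGSpace.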
|
||||||
|
|
||||||
|
@ -550,7 +550,7 @@ dspace.log.2016-04-27:7271
|
|||||||
<li class="previous"><a href="/cgspace-notes/2016-03/"><span aria-hidden="true">←</span> Older</a></li>
|
<li class="previous"><a href="/cgspace-notes/2016-03/"><span aria-hidden="true">←</span> Older</a></li>
|
||||||
|
|
||||||
|
|
||||||
<li class="next"><a href="/cgspace-notes/2016-06/">Newer <span aria-hidden="true">→</span></a></li>
|
<li class="next"><a href="/cgspace-notes/2016-05/">Newer <span aria-hidden="true">→</span></a></li>
|
||||||
|
|
||||||
</ul>
|
</ul>
|
||||||
</footer>
|
</footer>
|
||||||
|
@ -393,10 +393,10 @@ sys 0m20.540s
|
|||||||
</section>
|
</section>
|
||||||
<ul class="pager">
|
<ul class="pager">
|
||||||
|
|
||||||
<li class="previous"><a href="/cgspace-notes/2016-06/"><span aria-hidden="true">←</span> Older</a></li>
|
<li class="previous"><a href="/cgspace-notes/2016-04/"><span aria-hidden="true">←</span> Older</a></li>
|
||||||
|
|
||||||
|
|
||||||
<li class="next disabled"><a href="#">Newer <span aria-hidden="true">→</span></a></li>
|
<li class="next"><a href="/cgspace-notes/2016-06/">Newer <span aria-hidden="true">→</span></a></li>
|
||||||
|
|
||||||
</ul>
|
</ul>
|
||||||
</footer>
|
</footer>
|
||||||
|
@ -11,7 +11,7 @@
|
|||||||
|
|
||||||
<meta property="og:type" content="article" />
|
<meta property="og:type" content="article" />
|
||||||
|
|
||||||
<meta property="og:article:published_time" content="2016-05-01T10:53:00+03:00" />
|
<meta property="og:article:published_time" content="2016-06-01T10:53:00+03:00" />
|
||||||
|
|
||||||
<meta property="og:article:tag" content="notes" />
|
<meta property="og:article:tag" content="notes" />
|
||||||
|
|
||||||
@ -65,8 +65,8 @@
|
|||||||
<div class="post-meta clearfix">
|
<div class="post-meta clearfix">
|
||||||
<div class="post-date pull-left">
|
<div class="post-date pull-left">
|
||||||
Posted on
|
Posted on
|
||||||
<time datetime="2016-05-01T10:53:00+03:00">
|
<time datetime="2016-06-01T10:53:00+03:00">
|
||||||
May 1, 2016
|
Jun 1, 2016
|
||||||
</time>
|
</time>
|
||||||
</div>
|
</div>
|
||||||
<div class="pull-right">
|
<div class="pull-right">
|
||||||
@ -197,6 +197,31 @@ UPDATE 960
|
|||||||
<li>Re-sync DSpace Test with CGSpace and perform test of metadata migration again</li>
|
<li>Re-sync DSpace Test with CGSpace and perform test of metadata migration again</li>
|
||||||
<li>Run phase two of metadata migrations on CGSpace (see the <a href="https://gist.github.com/alanorth/1a730bec5ac9457a8fb0e3e72c98d09c">migration notes</a>)</li>
|
<li>Run phase two of metadata migrations on CGSpace (see the <a href="https://gist.github.com/alanorth/1a730bec5ac9457a8fb0e3e72c98d09c">migration notes</a>)</li>
|
||||||
<li>Run all system updates and reboot CGSpace server</li>
|
<li>Run all system updates and reboot CGSpace server</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h2 id="2016-06-07:6783872e82b68b1517e00f494e6b6504">2016-06-07</h2>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Figured out how to export a list of the unique values from a metadata field ordered by count:</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=29 group by text_value order by count desc) to /tmp/sponsorship.csv with csv;
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Identified the next round of fields to migrate:
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>dc.title.jtitle → dc.source</li>
|
||||||
|
<li>dc.crsubject.crpsubject → cg.contributor.crp</li>
|
||||||
|
<li>dc.contributor.affiliation → cg.contributor.affiliation</li>
|
||||||
|
<li>dc.Species → cg.species</li>
|
||||||
|
<li>dc.contributor.corporate → dc.contributor</li>
|
||||||
|
<li>dc.identifier.url → cg.identifier.url</li>
|
||||||
|
<li>dc.identifier.doi → cg.identifier.doi</li>
|
||||||
|
<li>dc.identifier.googleurl → cg.identifier.googleurl</li>
|
||||||
|
<li>dc.identifier.dataurl → cg.identifier.dataurl</li>
|
||||||
|
</ul></li>
|
||||||
</ul>
|
</ul>
|
||||||
|
|
||||||
</section>
|
</section>
|
||||||
@ -216,10 +241,10 @@ UPDATE 960
|
|||||||
</section>
|
</section>
|
||||||
<ul class="pager">
|
<ul class="pager">
|
||||||
|
|
||||||
<li class="previous"><a href="/cgspace-notes/2016-04/"><span aria-hidden="true">←</span> Older</a></li>
|
<li class="previous"><a href="/cgspace-notes/2016-05/"><span aria-hidden="true">←</span> Older</a></li>
|
||||||
|
|
||||||
|
|
||||||
<li class="next"><a href="/cgspace-notes/2016-05/">Newer <span aria-hidden="true">→</span></a></li>
|
<li class="next disabled"><a href="#">Newer <span aria-hidden="true">→</span></a></li>
|
||||||
|
|
||||||
</ul>
|
</ul>
|
||||||
</footer>
|
</footer>
|
||||||
|
@ -58,6 +58,34 @@
|
|||||||
<div class="article-list">
|
<div class="article-list">
|
||||||
|
|
||||||
|
|
||||||
|
<article>
|
||||||
|
<header>
|
||||||
|
<h2><a href="/cgspace-notes/2016-06/">June, 2016</a></h2>
|
||||||
|
<div class="post-meta clearfix">
|
||||||
|
<div class="post-date pull-left">
|
||||||
|
Posted on
|
||||||
|
<time datetime="2016-06-01T10:53:00+03:00">
|
||||||
|
Jun 1, 2016
|
||||||
|
</time>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</header>
|
||||||
|
<div>
|
||||||
|
2016-06-01 Experimenting with IFPRI OAI (we want to harvest their publications) After reading the ContentDM documentation I found IFPRI’s OAI endpoint: http://ebrary.ifpri.org/oai/oai.php After reading the OAI documentation and testing with an OAI validator I found out how to get their publications This is their publications set: http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&from=2016-01-01&set=p15738coll2&metadataPrefix=oai_dc You can see the others by using the OAI ListSets verb: http://ebrary.ifpri.org/oai/oai.php?verb=ListSets Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in dc.identifier.fund to cg.identifier.cpwfproject and then the rest to dc.description.sponsorship dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA'); UPDATE 497 dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75; UPDATE 14 Fix a few minor miscellaneous issues in dspace.cfg (#227) 2016-06-02 Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with cg.coverage.admin-unit Seems that the Browse configuration in dspace.cfg can’t handle the ‘-’ in the field name: webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error I’ve sent a message to the DSpace mailing list to ask about the Browse index definition A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue I found a thread on the mailing list talking about it and there is bug report and a patch: https://jira.duraspace.org/browse/DS-2740 The patch applies successfully on DSpace 5.1 so I will try it later 2016-06-03 Investigating the CCAFS authority 
issue, I exported the metadata for the Videos collection The top two authors are: CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500 CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600 So the only difference is the “confidence” Ok, well THAT is interesting: dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like '%Orth, %'; text_value | authority | confidence ------------+--------------------------------------+------------ Orth, A.
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<footer>
|
||||||
|
<ul class="pager">
|
||||||
|
<li class="next"><a href="/cgspace-notes/2016-06/">Read more <span aria-hidden="true">»</span></a></li>
|
||||||
|
</ul>
|
||||||
|
</footer>
|
||||||
|
|
||||||
|
</article>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<hr/>
|
||||||
|
|
||||||
<article>
|
<article>
|
||||||
<header>
|
<header>
|
||||||
<h2><a href="/cgspace-notes/2016-05/">May, 2016</a></h2>
|
<h2><a href="/cgspace-notes/2016-05/">May, 2016</a></h2>
|
||||||
@ -84,34 +112,6 @@
|
|||||||
|
|
||||||
|
|
||||||
|
|
||||||
<hr/>
|
|
||||||
|
|
||||||
<article>
|
|
||||||
<header>
|
|
||||||
<h2><a href="/cgspace-notes/2016-06/">June, 2016</a></h2>
|
|
||||||
<div class="post-meta clearfix">
|
|
||||||
<div class="post-date pull-left">
|
|
||||||
Posted on
|
|
||||||
<time datetime="2016-05-01T10:53:00+03:00">
|
|
||||||
May 1, 2016
|
|
||||||
</time>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
</header>
|
|
||||||
<div>
|
|
||||||
2016-06-01 Experimenting with IFPRI OAI (we want to harvest their publications) After reading the ContentDM documentation I found IFPRI’s OAI endpoint: http://ebrary.ifpri.org/oai/oai.php After reading the OAI documentation and testing with an OAI validator I found out how to get their publications This is their publications set: http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&from=2016-01-01&set=p15738coll2&metadataPrefix=oai_dc You can see the others by using the OAI ListSets verb: http://ebrary.ifpri.org/oai/oai.php?verb=ListSets Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in dc.identifier.fund to cg.identifier.cpwfproject and then the rest to dc.description.sponsorship dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA'); UPDATE 497 dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75; UPDATE 14 Fix a few minor miscellaneous issues in dspace.cfg (#227) 2016-06-02 Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with cg.coverage.admin-unit Seems that the Browse configuration in dspace.cfg can’t handle the ‘-’ in the field name: webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error I’ve sent a message to the DSpace mailing list to ask about the Browse index definition A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue I found a thread on the mailing list talking about it and there is bug report and a patch: https://jira.duraspace.org/browse/DS-2740 The patch applies successfully on DSpace 5.1 so I will try it later 2016-06-03 Investigating the CCAFS authority 
issue, I exported the metadata for the Videos collection The top two authors are: CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500 CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600 So the only difference is the “confidence” Ok, well THAT is interesting: dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like '%Orth, %'; text_value | authority | confidence ------------+--------------------------------------+------------ Orth, A.
|
|
||||||
</div>
|
|
||||||
|
|
||||||
<footer>
|
|
||||||
<ul class="pager">
|
|
||||||
<li class="next"><a href="/cgspace-notes/2016-06/">Read more <span aria-hidden="true">»</span></a></li>
|
|
||||||
</ul>
|
|
||||||
</footer>
|
|
||||||
|
|
||||||
</article>
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
<hr/>
|
<hr/>
|
||||||
|
|
||||||
<article>
|
<article>
|
||||||
|
287
public/index.xml
@ -6,9 +6,164 @@
|
|||||||
<description>Recent content on CGSpace Notes</description>
|
<description>Recent content on CGSpace Notes</description>
|
||||||
<generator>Hugo -- gohugo.io</generator>
|
<generator>Hugo -- gohugo.io</generator>
|
||||||
<language>en-us</language>
|
<language>en-us</language>
|
||||||
<lastBuildDate>Sun, 01 May 2016 23:06:00 +0300</lastBuildDate>
|
<lastBuildDate>Wed, 01 Jun 2016 10:53:00 +0300</lastBuildDate>
|
||||||
<atom:link href="/cgspace-notes/index.xml" rel="self" type="application/rss+xml" />
|
<atom:link href="/cgspace-notes/index.xml" rel="self" type="application/rss+xml" />
|
||||||
|
|
||||||
|
<item>
|
||||||
|
<title>June, 2016</title>
|
||||||
|
<link>/cgspace-notes/2016-06/</link>
|
||||||
|
<pubDate>Wed, 01 Jun 2016 10:53:00 +0300</pubDate>
|
||||||
|
|
||||||
|
<guid>/cgspace-notes/2016-06/</guid>
|
||||||
|
<description>
|
||||||
|
|
||||||
|
<h2 id="2016-06-01:6783872e82b68b1517e00f494e6b6504">2016-06-01</h2>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Experimenting with IFPRI OAI (we want to harvest their publications)</li>
|
||||||
|
<li>After reading the <a href="https://www.oclc.org/support/services/contentdm/help/server-admin-help/oai-support.en.html">ContentDM documentation</a> I found IFPRI&rsquo;s OAI endpoint: <a href="http://ebrary.ifpri.org/oai/oai.php">http://ebrary.ifpri.org/oai/oai.php</a></li>
|
||||||
|
<li>After reading the <a href="https://www.openarchives.org/OAI/openarchivesprotocol.html">OAI documentation</a> and testing with an <a href="http://validator.oaipmh.com/">OAI validator</a> I found out how to get their publications</li>
|
||||||
|
<li>This is their publications set: <a href="http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc">http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc</a></li>
|
||||||
|
<li>You can see the others by using the OAI <code>ListSets</code> verb: <a href="http://ebrary.ifpri.org/oai/oai.php?verb=ListSets">http://ebrary.ifpri.org/oai/oai.php?verb=ListSets</a></li>
|
||||||
|
<li>Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in <code>dc.identifier.fund</code> to <code>cg.identifier.cpwfproject</code> and then the rest to <code>dc.description.sponsorship</code></li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA');
|
||||||
|
UPDATE 497
|
||||||
|
dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75;
|
||||||
|
UPDATE 14
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Fix a few minor miscellaneous issues in <code>dspace.cfg</code> (<a href="https://github.com/ilri/DSpace/pull/227">#227</a>)</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h2 id="2016-06-02:6783872e82b68b1517e00f494e6b6504">2016-06-02</h2>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with <code>cg.coverage.admin-unit</code></li>
|
||||||
|
<li>Seems that the Browse configuration in <code>dspace.cfg</code> can&rsquo;t handle the &lsquo;-&rsquo; in the field name:</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error</li>
|
||||||
|
<li>I&rsquo;ve sent a message to the DSpace mailing list to ask about the Browse index definition</li>
|
||||||
|
<li>A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue</li>
|
||||||
|
<li>I found a thread on the mailing list talking about it and there is a bug report and a patch: <a href="https://jira.duraspace.org/browse/DS-2740">https://jira.duraspace.org/browse/DS-2740</a></li>
|
||||||
|
<li>The patch applies successfully on DSpace 5.1 so I will try it later</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h2 id="2016-06-03:6783872e82b68b1517e00f494e6b6504">2016-06-03</h2>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Investigating the CCAFS authority issue, I exported the metadata for the Videos collection</li>
|
||||||
|
<li>The top two authors are:</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500
|
||||||
|
CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>So the only difference is the &ldquo;confidence&rdquo;</li>
|
||||||
|
<li>Ok, well THAT is interesting:</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like '%Orth, %';
|
||||||
|
text_value | authority | confidence
|
||||||
|
------------+--------------------------------------+------------
|
||||||
|
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
||||||
|
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
||||||
|
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
||||||
|
Orth, Alan | | -1
|
||||||
|
Orth, Alan | | -1
|
||||||
|
Orth, Alan | | -1
|
||||||
|
Orth, Alan | | -1
|
||||||
|
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
|
||||||
|
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
|
||||||
|
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
||||||
|
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
||||||
|
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
|
||||||
|
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
|
||||||
|
(13 rows)
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>And now an actually relevant example:</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence = 500;
|
||||||
|
count
|
||||||
|
-------
|
||||||
|
707
|
||||||
|
(1 row)
|
||||||
|
|
||||||
|
dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence != 500;
|
||||||
|
count
|
||||||
|
-------
|
||||||
|
253
|
||||||
|
(1 row)
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Trying something experimental:</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>dspacetest=# update metadatavalue set confidence=500 where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security';
|
||||||
|
UPDATE 960
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>And then re-indexing authority and Discovery&hellip;?</li>
|
||||||
|
<li>After Discovery reindex the CCAFS authors are all together in the Authors sidebar facet</li>
|
||||||
|
<li>The docs for the ORCiD and Authority stuff for DSpace 5 mention changing the browse indexes to use the Authority as well:</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>webui.browse.index.2 = author:metadataAuthority:dc.contributor.author:authority
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>That would only be for the &ldquo;Browse by&rdquo; function&hellip; so we&rsquo;ll have to see what effect that has later</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h2 id="2016-06-04:6783872e82b68b1517e00f494e6b6504">2016-06-04</h2>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Re-sync DSpace Test with CGSpace and perform test of metadata migration again</li>
|
||||||
|
<li>Run phase two of metadata migrations on CGSpace (see the <a href="https://gist.github.com/alanorth/1a730bec5ac9457a8fb0e3e72c98d09c">migration notes</a>)</li>
|
||||||
|
<li>Run all system updates and reboot CGSpace server</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h2 id="2016-06-07:6783872e82b68b1517e00f494e6b6504">2016-06-07</h2>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Figured out how to export a list of the unique values from a metadata field ordered by count:</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=29 group by text_value order by count desc) to /tmp/sponsorship.csv with csv;
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Identified the next round of fields to migrate:
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>dc.title.jtitle → dc.source</li>
|
||||||
|
<li>dc.crsubject.crpsubject → cg.contributor.crp</li>
|
||||||
|
<li>dc.contributor.affiliation → cg.contributor.affiliation</li>
|
||||||
|
<li>dc.Species → cg.species</li>
|
||||||
|
<li>dc.contributor.corporate → dc.contributor</li>
|
||||||
|
<li>dc.identifier.url → cg.identifier.url</li>
|
||||||
|
<li>dc.identifier.doi → cg.identifier.doi</li>
|
||||||
|
<li>dc.identifier.googleurl → cg.identifier.googleurl</li>
|
||||||
|
<li>dc.identifier.dataurl → cg.identifier.dataurl</li>
|
||||||
|
</ul></li>
|
||||||
|
</ul>
|
||||||
|
</description>
|
||||||
|
</item>
|
||||||
|
|
||||||
<item>
|
<item>
|
||||||
<title>May, 2016</title>
|
<title>May, 2016</title>
|
||||||
<link>/cgspace-notes/2016-05/</link>
|
<link>/cgspace-notes/2016-05/</link>
|
||||||
@ -316,136 +471,6 @@ sys 0m20.540s
|
|||||||
</description>
|
</description>
|
||||||
</item>
|
</item>
|
||||||
|
|
||||||
<item>
|
|
||||||
<title>June, 2016</title>
|
|
||||||
<link>/cgspace-notes/2016-06/</link>
|
|
||||||
<pubDate>Sun, 01 May 2016 10:53:00 +0300</pubDate>
|
|
||||||
|
|
||||||
<guid>/cgspace-notes/2016-06/</guid>
|
|
||||||
<description>
|
|
||||||
|
|
||||||
<h2 id="2016-06-01:6783872e82b68b1517e00f494e6b6504">2016-06-01</h2>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>Experimenting with IFPRI OAI (we want to harvest their publications)</li>
|
|
||||||
<li>After reading the <a href="https://www.oclc.org/support/services/contentdm/help/server-admin-help/oai-support.en.html">ContentDM documentation</a> I found IFPRI&rsquo;s OAI endpoint: <a href="http://ebrary.ifpri.org/oai/oai.php">http://ebrary.ifpri.org/oai/oai.php</a></li>
|
|
||||||
<li>After reading the <a href="https://www.openarchives.org/OAI/openarchivesprotocol.html">OAI documentation</a> and testing with an <a href="http://validator.oaipmh.com/">OAI validator</a> I found out how to get their publications</li>
|
|
||||||
<li>This is their publications set: <a href="http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc">http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc</a></li>
|
|
||||||
<li>You can see the others by using the OAI <code>ListSets</code> verb: <a href="http://ebrary.ifpri.org/oai/oai.php?verb=ListSets">http://ebrary.ifpri.org/oai/oai.php?verb=ListSets</a></li>
|
|
||||||
<li>Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in <code>dc.identifier.fund</code> to <code>cg.identifier.cpwfproject</code> and then the rest to <code>dc.description.sponsorship</code></li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<pre><code>dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA');
|
|
||||||
UPDATE 497
|
|
||||||
dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75;
|
|
||||||
UPDATE 14
|
|
||||||
</code></pre>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>Fix a few minor miscellaneous issues in <code>dspace.cfg</code> (<a href="https://github.com/ilri/DSpace/pull/227">#227</a>)</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<h2 id="2016-06-02:6783872e82b68b1517e00f494e6b6504">2016-06-02</h2>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with <code>cg.coverage.admin-unit</code></li>
|
|
||||||
<li>Seems that the Browse configuration in <code>dspace.cfg</code> can&rsquo;t handle the &lsquo;-&rsquo; in the field name:</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<pre><code>webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text
|
|
||||||
</code></pre>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error</li>
|
|
||||||
<li>I&rsquo;ve sent a message to the DSpace mailing list to ask about the Browse index definition</li>
|
|
||||||
<li>A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue</li>
|
|
||||||
<li>I found a thread on the mailing list talking about it and there is a bug report and a patch: <a href="https://jira.duraspace.org/browse/DS-2740">https://jira.duraspace.org/browse/DS-2740</a></li>
|
|
||||||
<li>The patch applies successfully on DSpace 5.1 so I will try it later</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<h2 id="2016-06-03:6783872e82b68b1517e00f494e6b6504">2016-06-03</h2>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>Investigating the CCAFS authority issue, I exported the metadata for the Videos collection</li>
|
|
||||||
<li>The top two authors are:</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<pre><code>CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500
|
|
||||||
CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600
|
|
||||||
</code></pre>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>So the only difference is the &ldquo;confidence&rdquo;</li>
|
|
||||||
<li>Ok, well THAT is interesting:</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<pre><code>dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like '%Orth, %';
|
|
||||||
text_value | authority | confidence
|
|
||||||
------------+--------------------------------------+------------
|
|
||||||
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
|
||||||
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
|
||||||
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
|
||||||
Orth, Alan | | -1
|
|
||||||
Orth, Alan | | -1
|
|
||||||
Orth, Alan | | -1
|
|
||||||
Orth, Alan | | -1
|
|
||||||
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
|
|
||||||
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
|
|
||||||
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
|
||||||
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
|
||||||
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
|
|
||||||
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
|
|
||||||
(13 rows)
|
|
||||||
</code></pre>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>And now an actually relevant example:</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<pre><code>dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence = 500;
|
|
||||||
count
|
|
||||||
-------
|
|
||||||
707
|
|
||||||
(1 row)
|
|
||||||
|
|
||||||
dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence != 500;
|
|
||||||
count
|
|
||||||
-------
|
|
||||||
253
|
|
||||||
(1 row)
|
|
||||||
</code></pre>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>Trying something experimental:</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<pre><code>dspacetest=# update metadatavalue set confidence=500 where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security';
|
|
||||||
UPDATE 960
|
|
||||||
</code></pre>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>And then re-indexing authority and Discovery&hellip;?</li>
|
|
||||||
<li>After Discovery reindex the CCAFS authors are all together in the Authors sidebar facet</li>
|
|
||||||
<li>The docs for the ORCiD and Authority stuff for DSpace 5 mention changing the browse indexes to use the Authority as well:</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<pre><code>webui.browse.index.2 = author:metadataAuthority:dc.contributor.author:authority
|
|
||||||
</code></pre>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>That would only be for the &ldquo;Browse by&rdquo; function&hellip; so we&rsquo;ll have to see what effect that has later</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<h2 id="2016-06-04:6783872e82b68b1517e00f494e6b6504">2016-06-04</h2>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>Re-sync DSpace Test with CGSpace and perform test of metadata migration again</li>
|
|
||||||
<li>Run phase two of metadata migrations on CGSpace (see the <a href="https://gist.github.com/alanorth/1a730bec5ac9457a8fb0e3e72c98d09c">migration notes</a>)</li>
|
|
||||||
<li>Run all system updates and reboot CGSpace server</li>
|
|
||||||
</ul>
|
|
||||||
</description>
|
|
||||||
</item>
|
|
||||||
|
|
||||||
<item>
|
<item>
|
||||||
<title>April, 2016</title>
|
<title>April, 2016</title>
|
||||||
<link>/cgspace-notes/2016-04/</link>
|
<link>/cgspace-notes/2016-04/</link>
|
||||||
|
@ -3,20 +3,20 @@
|
|||||||
|
|
||||||
<url>
|
<url>
|
||||||
<loc>/cgspace-notes/</loc>
|
<loc>/cgspace-notes/</loc>
|
||||||
<lastmod>2016-05-01T23:06:00+03:00</lastmod>
|
<lastmod>2016-06-01T10:53:00+03:00</lastmod>
|
||||||
<priority>0</priority>
|
<priority>0</priority>
|
||||||
</url>
|
</url>
|
||||||
|
|
||||||
|
<url>
|
||||||
|
<loc>/cgspace-notes/2016-06/</loc>
|
||||||
|
<lastmod>2016-06-01T10:53:00+03:00</lastmod>
|
||||||
|
</url>
|
||||||
|
|
||||||
<url>
|
<url>
|
||||||
<loc>/cgspace-notes/2016-05/</loc>
|
<loc>/cgspace-notes/2016-05/</loc>
|
||||||
<lastmod>2016-05-01T23:06:00+03:00</lastmod>
|
<lastmod>2016-05-01T23:06:00+03:00</lastmod>
|
||||||
</url>
|
</url>
|
||||||
|
|
||||||
<url>
|
|
||||||
<loc>/cgspace-notes/2016-06/</loc>
|
|
||||||
<lastmod>2016-05-01T10:53:00+03:00</lastmod>
|
|
||||||
</url>
|
|
||||||
|
|
||||||
<url>
|
<url>
|
||||||
<loc>/cgspace-notes/2016-04/</loc>
|
<loc>/cgspace-notes/2016-04/</loc>
|
||||||
<lastmod>2016-04-04T11:06:00+03:00</lastmod>
|
<lastmod>2016-04-04T11:06:00+03:00</lastmod>
|
||||||
|
@ -61,6 +61,32 @@
|
|||||||
<section class="article-list">
|
<section class="article-list">
|
||||||
<h1>Notes</h1>
|
<h1>Notes</h1>
|
||||||
|
|
||||||
|
<hr/>
|
||||||
|
<article>
|
||||||
|
<header>
|
||||||
|
<h2><a href="/cgspace-notes/2016-06/">June, 2016</a></h2>
|
||||||
|
<div class="post-meta clearfix">
|
||||||
|
<div class="post-date pull-left">
|
||||||
|
Posted on
|
||||||
|
<time datetime="2016-06-01T10:53:00+03:00">
|
||||||
|
Jun 1, 2016
|
||||||
|
</time>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</header>
|
||||||
|
<div>
|
||||||
|
2016-06-01 Experimenting with IFPRI OAI (we want to harvest their publications) After reading the ContentDM documentation I found IFPRI’s OAI endpoint: http://ebrary.ifpri.org/oai/oai.php After reading the OAI documentation and testing with an OAI validator I found out how to get their publications This is their publications set: http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&from=2016-01-01&set=p15738coll2&metadataPrefix=oai_dc You can see the others by using the OAI ListSets verb: http://ebrary.ifpri.org/oai/oai.php?verb=ListSets Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in dc.identifier.fund to cg.identifier.cpwfproject and then the rest to dc.description.sponsorship dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA'); UPDATE 497 dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75; UPDATE 14 Fix a few minor miscellaneous issues in dspace.cfg (#227) 2016-06-02 Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with cg.coverage.admin-unit Seems that the Browse configuration in dspace.cfg can’t handle the ‘-’ in the field name: webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error I’ve sent a message to the DSpace mailing list to ask about the Browse index definition A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue I found a thread on the mailing list talking about it and there is bug report and a patch: https://jira.duraspace.org/browse/DS-2740 The patch applies successfully on DSpace 5.1 so I will try it later 2016-06-03 Investigating the CCAFS authority 
issue, I exported the metadata for the Videos collection The top two authors are: CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500 CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600 So the only difference is the “confidence” Ok, well THAT is interesting: dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like '%Orth, %'; text_value | authority | confidence ------------+--------------------------------------+------------ Orth, A.
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<footer>
|
||||||
|
<ul class="pager">
|
||||||
|
<li class="next"><a href="/cgspace-notes/2016-06/">Read more <span aria-hidden="true">»</span></a></li>
|
||||||
|
</ul>
|
||||||
|
</footer>
|
||||||
|
|
||||||
|
</article>
|
||||||
|
|
||||||
|
|
||||||
<hr/>
|
<hr/>
|
||||||
<article>
|
<article>
|
||||||
<header>
|
<header>
|
||||||
@ -87,32 +113,6 @@
|
|||||||
</article>
|
</article>
|
||||||
|
|
||||||
|
|
||||||
<hr/>
|
|
||||||
<article>
|
|
||||||
<header>
|
|
||||||
<h2><a href="/cgspace-notes/2016-06/">June, 2016</a></h2>
|
|
||||||
<div class="post-meta clearfix">
|
|
||||||
<div class="post-date pull-left">
|
|
||||||
Posted on
|
|
||||||
<time datetime="2016-05-01T10:53:00+03:00">
|
|
||||||
May 1, 2016
|
|
||||||
</time>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
</header>
|
|
||||||
<div>
|
|
||||||
2016-06-01 Experimenting with IFPRI OAI (we want to harvest their publications) After reading the ContentDM documentation I found IFPRI’s OAI endpoint: http://ebrary.ifpri.org/oai/oai.php After reading the OAI documentation and testing with an OAI validator I found out how to get their publications This is their publications set: http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&from=2016-01-01&set=p15738coll2&metadataPrefix=oai_dc You can see the others by using the OAI ListSets verb: http://ebrary.ifpri.org/oai/oai.php?verb=ListSets Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in dc.identifier.fund to cg.identifier.cpwfproject and then the rest to dc.description.sponsorship dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA'); UPDATE 497 dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75; UPDATE 14 Fix a few minor miscellaneous issues in dspace.cfg (#227) 2016-06-02 Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with cg.coverage.admin-unit Seems that the Browse configuration in dspace.cfg can’t handle the ‘-’ in the field name: webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error I’ve sent a message to the DSpace mailing list to ask about the Browse index definition A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue I found a thread on the mailing list talking about it and there is bug report and a patch: https://jira.duraspace.org/browse/DS-2740 The patch applies successfully on DSpace 5.1 so I will try it later 2016-06-03 Investigating the CCAFS authority 
issue, I exported the metadata for the Videos collection The top two authors are: CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500 CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600 So the only difference is the “confidence” Ok, well THAT is interesting: dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like '%Orth, %'; text_value | authority | confidence ------------+--------------------------------------+------------ Orth, A.
|
|
||||||
</div>
|
|
||||||
|
|
||||||
<footer>
|
|
||||||
<ul class="pager">
|
|
||||||
<li class="next"><a href="/cgspace-notes/2016-06/">Read more <span aria-hidden="true">»</span></a></li>
|
|
||||||
</ul>
|
|
||||||
</footer>
|
|
||||||
|
|
||||||
</article>
|
|
||||||
|
|
||||||
|
|
||||||
<hr/>
|
<hr/>
|
||||||
<article>
|
<article>
|
||||||
<header>
|
<header>
|
||||||
|
@ -6,9 +6,164 @@
|
|||||||
<description>Recent content in Notes on CGSpace Notes</description>
|
<description>Recent content in Notes on CGSpace Notes</description>
|
||||||
<generator>Hugo -- gohugo.io</generator>
|
<generator>Hugo -- gohugo.io</generator>
|
||||||
<language>en-us</language>
|
<language>en-us</language>
|
||||||
<lastBuildDate>Sun, 01 May 2016 23:06:00 +0300</lastBuildDate>
|
<lastBuildDate>Wed, 01 Jun 2016 10:53:00 +0300</lastBuildDate>
|
||||||
<atom:link href="/cgspace-notes/tags/notes/index.xml" rel="self" type="application/rss+xml" />
|
<atom:link href="/cgspace-notes/tags/notes/index.xml" rel="self" type="application/rss+xml" />
|
||||||
|
|
||||||
|
<item>
|
||||||
|
<title>June, 2016</title>
|
||||||
|
<link>/cgspace-notes/2016-06/</link>
|
||||||
|
<pubDate>Wed, 01 Jun 2016 10:53:00 +0300</pubDate>
|
||||||
|
|
||||||
|
<guid>/cgspace-notes/2016-06/</guid>
|
||||||
|
<description>
|
||||||
|
|
||||||
|
<h2 id="2016-06-01:6783872e82b68b1517e00f494e6b6504">2016-06-01</h2>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Experimenting with IFPRI OAI (we want to harvest their publications)</li>
|
||||||
|
<li>After reading the <a href="https://www.oclc.org/support/services/contentdm/help/server-admin-help/oai-support.en.html">ContentDM documentation</a> I found IFPRI&rsquo;s OAI endpoint: <a href="http://ebrary.ifpri.org/oai/oai.php">http://ebrary.ifpri.org/oai/oai.php</a></li>
|
||||||
|
<li>After reading the <a href="https://www.openarchives.org/OAI/openarchivesprotocol.html">OAI documentation</a> and testing with an <a href="http://validator.oaipmh.com/">OAI validator</a> I found out how to get their publications</li>
|
||||||
|
<li>This is their publications set: <a href="http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc">http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc</a></li>
|
||||||
|
<li>You can see the others by using the OAI <code>ListSets</code> verb: <a href="http://ebrary.ifpri.org/oai/oai.php?verb=ListSets">http://ebrary.ifpri.org/oai/oai.php?verb=ListSets</a></li>
|
||||||
|
<li>Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in <code>dc.identifier.fund</code> to <code>cg.identifier.cpwfproject</code> and then the rest to <code>dc.description.sponsorship</code></li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA');
|
||||||
|
UPDATE 497
|
||||||
|
dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75;
|
||||||
|
UPDATE 14
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Fix a few minor miscellaneous issues in <code>dspace.cfg</code> (<a href="https://github.com/ilri/DSpace/pull/227">#227</a>)</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h2 id="2016-06-02:6783872e82b68b1517e00f494e6b6504">2016-06-02</h2>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with <code>cg.coverage.admin-unit</code></li>
|
||||||
|
<li>Seems that the Browse configuration in <code>dspace.cfg</code> can&rsquo;t handle the &lsquo;-&rsquo; in the field name:</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error</li>
|
||||||
|
<li>I&rsquo;ve sent a message to the DSpace mailing list to ask about the Browse index definition</li>
|
||||||
|
<li>A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue</li>
|
||||||
|
<li>I found a thread on the mailing list talking about it and there is a bug report and a patch: <a href="https://jira.duraspace.org/browse/DS-2740">https://jira.duraspace.org/browse/DS-2740</a></li>
|
||||||
|
<li>The patch applies successfully on DSpace 5.1 so I will try it later</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h2 id="2016-06-03:6783872e82b68b1517e00f494e6b6504">2016-06-03</h2>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Investigating the CCAFS authority issue, I exported the metadata for the Videos collection</li>
|
||||||
|
<li>The top two authors are:</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500
|
||||||
|
CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>So the only difference is the &ldquo;confidence&rdquo;</li>
|
||||||
|
<li>Ok, well THAT is interesting:</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like '%Orth, %';
|
||||||
|
text_value | authority | confidence
|
||||||
|
------------+--------------------------------------+------------
|
||||||
|
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
||||||
|
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
||||||
|
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
||||||
|
Orth, Alan | | -1
|
||||||
|
Orth, Alan | | -1
|
||||||
|
Orth, Alan | | -1
|
||||||
|
Orth, Alan | | -1
|
||||||
|
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
|
||||||
|
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
|
||||||
|
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
||||||
|
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
||||||
|
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
|
||||||
|
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
|
||||||
|
(13 rows)
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>And now an actually relevant example:</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence = 500;
|
||||||
|
count
|
||||||
|
-------
|
||||||
|
707
|
||||||
|
(1 row)
|
||||||
|
|
||||||
|
dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence != 500;
|
||||||
|
count
|
||||||
|
-------
|
||||||
|
253
|
||||||
|
(1 row)
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Trying something experimental:</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>dspacetest=# update metadatavalue set confidence=500 where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security';
|
||||||
|
UPDATE 960
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>And then re-indexing authority and Discovery&hellip;?</li>
|
||||||
|
<li>After Discovery reindex the CCAFS authors are all together in the Authors sidebar facet</li>
|
||||||
|
<li>The docs for the ORCiD and Authority stuff for DSpace 5 mention changing the browse indexes to use the Authority as well:</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>webui.browse.index.2 = author:metadataAuthority:dc.contributor.author:authority
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>That would only be for the &ldquo;Browse by&rdquo; function&hellip; so we&rsquo;ll have to see what effect that has later</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h2 id="2016-06-04:6783872e82b68b1517e00f494e6b6504">2016-06-04</h2>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Re-sync DSpace Test with CGSpace and perform test of metadata migration again</li>
|
||||||
|
<li>Run phase two of metadata migrations on CGSpace (see the <a href="https://gist.github.com/alanorth/1a730bec5ac9457a8fb0e3e72c98d09c">migration notes</a>)</li>
|
||||||
|
<li>Run all system updates and reboot CGSpace server</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h2 id="2016-06-07:6783872e82b68b1517e00f494e6b6504">2016-06-07</h2>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Figured out how to export a list of the unique values from a metadata field ordered by count:</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=29 group by text_value order by count desc) to /tmp/sponsorship.csv with csv;
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>Identified the next round of fields to migrate:
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>dc.title.jtitle → dc.source</li>
|
||||||
|
<li>dc.crsubject.crpsubject → cg.contributor.crp</li>
|
||||||
|
<li>dc.contributor.affiliation → cg.contributor.affiliation</li>
|
||||||
|
<li>dc.Species → cg.species</li>
|
||||||
|
<li>dc.contributor.corporate → dc.contributor</li>
|
||||||
|
<li>dc.identifier.url → cg.identifier.url</li>
|
||||||
|
<li>dc.identifier.doi → cg.identifier.doi</li>
|
||||||
|
<li>dc.identifier.googleurl → cg.identifier.googleurl</li>
|
||||||
|
<li>dc.identifier.dataurl → cg.identifier.dataurl</li>
|
||||||
|
</ul></li>
|
||||||
|
</ul>
|
||||||
|
</description>
|
||||||
|
</item>
|
||||||
|
|
||||||
<item>
|
<item>
|
||||||
<title>May, 2016</title>
|
<title>May, 2016</title>
|
||||||
<link>/cgspace-notes/2016-05/</link>
|
<link>/cgspace-notes/2016-05/</link>
|
||||||
@ -316,136 +471,6 @@ sys 0m20.540s
|
|||||||
</description>
|
</description>
|
||||||
</item>
|
</item>
|
||||||
|
|
||||||
<item>
|
|
||||||
<title>June, 2016</title>
|
|
||||||
<link>/cgspace-notes/2016-06/</link>
|
|
||||||
<pubDate>Sun, 01 May 2016 10:53:00 +0300</pubDate>
|
|
||||||
|
|
||||||
<guid>/cgspace-notes/2016-06/</guid>
|
|
||||||
<description>
|
|
||||||
|
|
||||||
<h2 id="2016-06-01:6783872e82b68b1517e00f494e6b6504">2016-06-01</h2>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>Experimenting with IFPRI OAI (we want to harvest their publications)</li>
|
|
||||||
<li>After reading the <a href="https://www.oclc.org/support/services/contentdm/help/server-admin-help/oai-support.en.html">ContentDM documentation</a> I found IFPRI&rsquo;s OAI endpoint: <a href="http://ebrary.ifpri.org/oai/oai.php">http://ebrary.ifpri.org/oai/oai.php</a></li>
|
|
||||||
<li>After reading the <a href="https://www.openarchives.org/OAI/openarchivesprotocol.html">OAI documentation</a> and testing with an <a href="http://validator.oaipmh.com/">OAI validator</a> I found out how to get their publications</li>
|
|
||||||
<li>This is their publications set: <a href="http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc">http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&amp;from=2016-01-01&amp;set=p15738coll2&amp;metadataPrefix=oai_dc</a></li>
|
|
||||||
<li>You can see the others by using the OAI <code>ListSets</code> verb: <a href="http://ebrary.ifpri.org/oai/oai.php?verb=ListSets">http://ebrary.ifpri.org/oai/oai.php?verb=ListSets</a></li>
|
|
||||||
<li>Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in <code>dc.identifier.fund</code> to <code>cg.identifier.cpwfproject</code> and then the rest to <code>dc.description.sponsorship</code></li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<pre><code>dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA');
|
|
||||||
UPDATE 497
|
|
||||||
dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75;
|
|
||||||
UPDATE 14
|
|
||||||
</code></pre>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>Fix a few minor miscellaneous issues in <code>dspace.cfg</code> (<a href="https://github.com/ilri/DSpace/pull/227">#227</a>)</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<h2 id="2016-06-02:6783872e82b68b1517e00f494e6b6504">2016-06-02</h2>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with <code>cg.coverage.admin-unit</code></li>
|
|
||||||
<li>Seems that the Browse configuration in <code>dspace.cfg</code> can&rsquo;t handle the &lsquo;-&rsquo; in the field name:</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<pre><code>webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text
|
|
||||||
</code></pre>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error</li>
|
|
||||||
<li>I&rsquo;ve sent a message to the DSpace mailing list to ask about the Browse index definition</li>
|
|
||||||
<li>A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue</li>
|
|
||||||
<li>I found a thread on the mailing list talking about it and there is a bug report and a patch: <a href="https://jira.duraspace.org/browse/DS-2740">https://jira.duraspace.org/browse/DS-2740</a></li>
|
|
||||||
<li>The patch applies successfully on DSpace 5.1 so I will try it later</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<h2 id="2016-06-03:6783872e82b68b1517e00f494e6b6504">2016-06-03</h2>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>Investigating the CCAFS authority issue, I exported the metadata for the Videos collection</li>
|
|
||||||
<li>The top two authors are:</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<pre><code>CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500
|
|
||||||
CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600
|
|
||||||
</code></pre>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>So the only difference is the &ldquo;confidence&rdquo;</li>
|
|
||||||
<li>Ok, well THAT is interesting:</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<pre><code>dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like '%Orth, %';
|
|
||||||
text_value | authority | confidence
|
|
||||||
------------+--------------------------------------+------------
|
|
||||||
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
|
||||||
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
|
||||||
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
|
||||||
Orth, Alan | | -1
|
|
||||||
Orth, Alan | | -1
|
|
||||||
Orth, Alan | | -1
|
|
||||||
Orth, Alan | | -1
|
|
||||||
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
|
|
||||||
Orth, A. | 05c2c622-d252-4efb-b9ed-95a07d3adf11 | -1
|
|
||||||
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
|
||||||
Orth, A. | ab606e3a-2b04-4c7d-9423-14beccf54257 | -1
|
|
||||||
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
|
|
||||||
Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 | 600
|
|
||||||
(13 rows)
|
|
||||||
</code></pre>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>And now an actually relevant example:</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<pre><code>dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence = 500;
|
|
||||||
count
|
|
||||||
-------
|
|
||||||
707
|
|
||||||
(1 row)
|
|
||||||
|
|
||||||
dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence != 500;
|
|
||||||
count
|
|
||||||
-------
|
|
||||||
253
|
|
||||||
(1 row)
|
|
||||||
</code></pre>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>Trying something experimental:</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<pre><code>dspacetest=# update metadatavalue set confidence=500 where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security';
|
|
||||||
UPDATE 960
|
|
||||||
</code></pre>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>And then re-indexing authority and Discovery&hellip;?</li>
|
|
||||||
<li>After Discovery reindex the CCAFS authors are all together in the Authors sidebar facet</li>
|
|
||||||
<li>The docs for the ORCiD and Authority stuff for DSpace 5 mention changing the browse indexes to use the Authority as well:</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<pre><code>webui.browse.index.2 = author:metadataAuthority:dc.contributor.author:authority
|
|
||||||
</code></pre>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>That would only be for the &ldquo;Browse by&rdquo; function&hellip; so we&rsquo;ll have to see what effect that has later</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<h2 id="2016-06-04:6783872e82b68b1517e00f494e6b6504">2016-06-04</h2>
|
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>Re-sync DSpace Test with CGSpace and perform test of metadata migration again</li>
|
|
||||||
<li>Run phase two of metadata migrations on CGSpace (see the <a href="https://gist.github.com/alanorth/1a730bec5ac9457a8fb0e3e72c98d09c">migration notes</a>)</li>
|
|
||||||
<li>Run all system updates and reboot CGSpace server</li>
|
|
||||||
</ul>
|
|
||||||
</description>
|
|
||||||
</item>
|
|
||||||
|
|
||||||
<item>
|
<item>
|
||||||
<title>April, 2016</title>
|
<title>April, 2016</title>
|
||||||
<link>/cgspace-notes/2016-04/</link>
|
<link>/cgspace-notes/2016-04/</link>
|
||||||
|
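The 2016-06-03 notes in this diff compare author values of the form `name::authority::confidence` and find that the two top CCAFS entries differ only in confidence. A minimal sketch of how that comparison could be automated over an exported author column, assuming every value follows that three-part shape; the function names `parse_author` and `mixed_confidence` are hypothetical and are not part of DSpace:

```python
from collections import defaultdict


def parse_author(value):
    """Split 'name::authority::confidence' into its three parts.

    rsplit is used so that a '::' inside the name itself (an assumption:
    this should be rare) does not shift the authority and confidence fields.
    """
    name, authority, confidence = value.rsplit("::", 2)
    return name, authority, int(confidence)


def mixed_confidence(values):
    """Return {(name, authority): sorted confidences} for every
    name/authority pair that appears with more than one confidence."""
    seen = defaultdict(set)
    for value in values:
        name, authority, confidence = parse_author(value)
        seen[(name, authority)].add(confidence)
    return {key: sorted(conf) for key, conf in seen.items() if len(conf) > 1}


# The two values quoted in the notes above.
authors = [
    "CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500",
    "CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600",
]
result = mixed_confidence(authors)
for (name, authority), confidences in result.items():
    print(name, authority, confidences)
```

Any pair this reports is a candidate for the kind of `update metadatavalue set confidence=...` normalization tried in the notes; the sketch only detects the mismatch and does not touch the database.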