cgspace-notes/public/2017-02/index.html

378 lines
12 KiB
HTML
Raw Normal View History

2017-02-07 16:07:44 +01:00
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta property="og:title" content="February, 2017" />
<meta property="og:description" content="2017-02-07
An item was mapped twice erroneously again, so I had to remove one of the mappings manually:
dspace=# select * from collection2item where item_id = &#39;80278&#39;;
id | collection_id | item_id
-------&#43;---------------&#43;---------
92551 | 313 | 80278
92550 | 313 | 80278
90774 | 1051 | 80278
(3 rows)
dspace=# delete from collection2item where id = 92551 and item_id = 80278;
DELETE 1
2017-02-08 07:11:37 +01:00
Create issue on GitHub to track the addition of CCAFS Phase II project tags (#301)
Looks like we&rsquo;ll be using cg.identifier.ccafsprojectpii as the field name
2017-02-07 16:07:44 +01:00
" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2017-02/" />
<meta property="article:published_time" content="2017-02-07T07:04:52-08:00"/>
<meta property="article:modified_time" content="2017-02-07T07:04:52-08:00"/>
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="February, 2017"/>
<meta name="twitter:description" content="2017-02-07
An item was mapped twice erroneously again, so I had to remove one of the mappings manually:
dspace=# select * from collection2item where item_id = &#39;80278&#39;;
id | collection_id | item_id
-------&#43;---------------&#43;---------
92551 | 313 | 80278
92550 | 313 | 80278
90774 | 1051 | 80278
(3 rows)
dspace=# delete from collection2item where id = 92551 and item_id = 80278;
DELETE 1
2017-02-08 07:11:37 +01:00
Create issue on GitHub to track the addition of CCAFS Phase II project tags (#301)
Looks like we&rsquo;ll be using cg.identifier.ccafsprojectpii as the field name
2017-02-07 16:07:44 +01:00
"/>
<meta name="generator" content="Hugo 0.18.1" />
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "BlogPosting",
"headline": "February, 2017",
"url": "https://alanorth.github.io/cgspace-notes/2017-02/",
2017-02-16 18:41:58 +01:00
"wordCount": "774",
2017-02-07 16:07:44 +01:00
"datePublished": "2017-02-07T07:04:52-08:00",
"dateModified": "2017-02-07T07:04:52-08:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
}
,
"keywords": "Notes"
}
</script>
<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/2017-02/">
<title>February, 2017 | CGSpace Notes</title>
<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.css" rel="stylesheet" integrity="sha384-qRVpIj9hSzsBhmO8Y7YEKF2UFra2sJQtl9V/uFKKDvy&#43;Wjh9zgTku6VRgT8YdPoD" crossorigin="anonymous">
</head>
<body>
<div class="blog-masthead">
<div class="container">
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>
<header class="blog-header">
<div class="container">
<h1 class="blog-title"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
<p class="lead blog-description">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
</div>
</header>
<div class="container">
<div class="row">
<div class="col-sm-8 blog-main">
<article class="blog-post">
<header>
<h2 class="blog-post-title"><a href="https://alanorth.github.io/cgspace-notes/2017-02/">February, 2017</a></h2>
<p class="blog-post-meta"><time datetime="2017-02-07T07:04:52-08:00">Tue Feb 07, 2017</time> by Alan Orth in
<i class="fa fa-tag" aria-hidden="true"></i>&nbsp;<a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>
</p>
</header>
<h2 id="2017-02-07">2017-02-07</h2>
<ul>
<li>An item was mapped twice erroneously again, so I had to remove one of the mappings manually:</li>
</ul>
<pre><code>dspace=# select * from collection2item where item_id = '80278';
id | collection_id | item_id
-------+---------------+---------
92551 | 313 | 80278
92550 | 313 | 80278
90774 | 1051 | 80278
(3 rows)
dspace=# delete from collection2item where id = 92551 and item_id = 80278;
DELETE 1
</code></pre>
2017-02-08 07:11:37 +01:00
<ul>
<li>Create issue on GitHub to track the addition of CCAFS Phase II project tags (<a href="https://github.com/ilri/DSpace/issues/301">#301</a>)</li>
<li>Looks like we&rsquo;ll be using <code>cg.identifier.ccafsprojectpii</code> as the field name</li>
</ul>
2017-02-07 16:07:44 +01:00
<p></p>
2017-02-08 18:31:52 +01:00
<h2 id="2017-02-08">2017-02-08</h2>
<ul>
<li>We also need to rename some of the CCAFS Phase I flagships:
<ul>
<li>CLIMATE-SMART AGRICULTURAL PRACTICESCLIMATE-SMART TECHNOLOGIES AND PRACTICES</li>
<li>CLIMATE RISK MANAGEMENTCLIMATE SERVICES AND SAFETY NETS</li>
<li>LOW EMISSIONS AGRICULTURELOW EMISSIONS DEVELOPMENT</li>
<li>POLICIES AND INSTITUTIONSPRIORITIES AND POLICIES FOR CSA</li>
</ul></li>
<li>The climate risk management one doesn&rsquo;t exist, so I will have to ask Magdalena if they want me to add it to the input forms</li>
2017-02-09 07:27:37 +01:00
<li>Start testing some nearly 500 author corrections that CCAFS sent me:</li>
2017-02-08 18:31:52 +01:00
</ul>
2017-02-09 07:27:37 +01:00
<pre><code>$ ./fix-metadata-values.py -i /tmp/CCAFS-Authors-Feb-7.csv -f dc.contributor.author -t 'correct name' -m 3 -d dspace -u dspace -p fuuu
</code></pre>
2017-02-10 07:36:23 +01:00
<h2 id="2017-02-09">2017-02-09</h2>
<ul>
<li>More work on CCAFS Phase II stuff</li>
<li>Looks like simply adding a new metadata field to <code>dspace/config/registries/cgiar-types.xml</code> and restarting DSpace causes the field to get added to the rregistry</li>
<li>It requires a restart but at least it allows you to manage the registry programmatically</li>
<li>It&rsquo;s not a very good way to manage the registry, though, as removing one there doesn&rsquo;t cause it to be removed from the registry, and we always restore from database backups so there would never be a scenario when we needed these to be created</li>
<li>Testing some corrections on CCAFS Phase II flagships (<code>cg.subject.ccafs</code>):</li>
</ul>
<pre><code>$ ./fix-metadata-values.py -i ccafs-flagships-feb7.csv -f cg.subject.ccafs -t correct -m 210 -d dspace -u dspace -p fuuu
</code></pre>
2017-02-11 00:28:56 +01:00
<h2 id="2017-02-10">2017-02-10</h2>
<ul>
<li>CCAFS said they want to wait on the flagship updates (<code>cg.subject.ccafs</code>) on CGSpace, perhaps for a month or so</li>
<li>Help Marianne Gadeberg (WLE) with some user permissions as it seems she had previously been using a personal email account, and is now on a CGIAR one</li>
<li>I manually added her new account to ~25 authorizations that her hold user was on</li>
</ul>
2017-02-14 12:03:09 +01:00
<h2 id="2017-02-14">2017-02-14</h2>
<ul>
<li>Add <code>SCALING</code> to ILRI subjects (<a href="https://github.com/ilri/DSpace/pull/304">#304</a>), as Sisay&rsquo;s attempts were all sloppy</li>
<li>Cherry pick some patches from the DSpace 5.7 branch:
<ul>
<li>DS-3363 CSV import error says &ldquo;row&rdquo;, means &ldquo;column&rdquo;: f7b6c83e991db099003ee4e28ca33d3c7bab48c0</li>
<li>DS-3479 avoid adding empty metadata values during import: 329f3b48a6de7fad074d825fd12118f7e181e151</li>
<li>[DS-3456] 5x Clarify command line options for statisics import/export tools (#1623): 567ec083c8a94eb2bcc1189816eb4f767745b278</li>
<li>[DS-3458]5x Allow Shard Process to Append to an existing repo: 3c8ecb5d1fd69a1dcfee01feed259e80abbb7749</li>
</ul></li>
<li>I still need to test these, especially as the last two which change some stuff with Solr maintenance</li>
</ul>
2017-02-15 15:04:55 +01:00
<h2 id="2017-02-15">2017-02-15</h2>
<ul>
2017-02-16 10:46:05 +01:00
<li>Update rvm on DSpace Test and CGSpace as there was a <a href="https://github.com/justinsteven/advisories/blob/master/2017_rvm_cd_command_execution.md">security disclosure about versions less than 1.28.0</a></li>
2017-02-15 15:04:55 +01:00
</ul>
2017-02-16 10:46:05 +01:00
<h2 id="2017-02-16">2017-02-16</h2>
<ul>
<li>Looking at memory info from munin on CGSpace:</li>
</ul>
<p><img src="/cgspace-notes/2017/02/meminfo_phisical-week.png" alt="CGSpace meminfo" /></p>
<ul>
<li>We are using only ~8GB of RAM for applications, and 16GB for caches!</li>
<li>The Linode machine we&rsquo;re on has 24GB of RAM but only because that&rsquo;s the only instance that had enough disk space for us (384GB)&hellip;</li>
<li>We should probably look into Google Compute Engine or Digital Ocean where we can get more storage without having to follow a linear increase in instance pricing for CPU/memory as well</li>
<li>Especially because we only use 2 out of 8 CPUs basically:</li>
</ul>
<p><img src="/cgspace-notes/2017/02/cpu-week.png" alt="CGSpace CPU" /></p>
2017-02-16 13:20:22 +01:00
<ul>
<li>Fix issue with duplicate declaration of in atmire-dspace-xmlui <code>pom.xml</code> (causing non-fatal warnings during the maven build)</li>
<li>Experiment with making DSpace generate HTTPS handle links, first a change in dspace.cfg or the site&rsquo;s properties file:</li>
</ul>
<pre><code>handle.canonical.prefix = https://hdl.handle.net/
</code></pre>
<ul>
<li>And then a SQL command to update existing records:</li>
</ul>
<pre><code>dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http://hdl.handle.net', 'https://hdl.handle.net') where metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'uri');
UPDATE 58193
</code></pre>
<ul>
<li>Seems to work fine!</li>
2017-02-16 18:31:01 +01:00
<li>I noticed a few items that have incorrect DOI links (<code>dc.identifier.doi</code>), and after looking in the database I see there are over 100 that are missing the scheme or are just plain wrong:</li>
2017-02-16 13:20:22 +01:00
</ul>
2017-02-16 18:41:58 +01:00
<pre><code>dspace=# select distinct text_value from metadatavalue where resource_type_id=2 and metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'doi') and text_value not like 'http%://%';
2017-02-16 18:31:01 +01:00
</code></pre>
<ul>
2017-02-16 18:41:58 +01:00
<li>This will replace any that begin with <code>10.</code> and change them to <code>https://dx.doi.org/10.</code>:</li>
2017-02-16 18:31:01 +01:00
</ul>
<pre><code>dspace=# update metadatavalue set text_value = regexp_replace(text_value, '(^10\..+$)', 'https://dx.doi.org/\1') where metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'doi') and text_value like '10.%';
</code></pre>
2017-02-16 18:41:58 +01:00
<ul>
<li>This will get any that begin with <code>doi:10.</code> and change them to <code>https://dx.doi.org/10.x</code>:</li>
</ul>
<pre><code>dspace=# update metadatavalue set text_value = regexp_replace(text_value, '^doi:(10\..+$)', 'https://dx.doi.org/\1') where resource_type_id=2 and metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'identifier' and qualifier = 'doi') and text_value like 'doi:10%';
</code></pre>
2017-02-07 16:07:44 +01:00
</article>
</div> <!-- /.blog-main -->
<aside class="col-sm-3 offset-sm-1 blog-sidebar">
<section class="sidebar-module">
<h4>Recent Posts</h4>
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2017-02/">February, 2017</a></li>
<li><a href="/cgspace-notes/2017-01/">January, 2017</a></li>
<li><a href="/cgspace-notes/2016-12/">December, 2016</a></li>
<li><a href="/cgspace-notes/2016-11/">November, 2016</a></li>
<li><a href="/cgspace-notes/2016-10/">October, 2016</a></li>
</ol>
</section>
<section class="sidebar-module">
<h4>Links</h4>
<ol class="list-unstyled">
<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
</ol>
</section>
</aside>
</div> <!-- /.row -->
</div> <!-- /.container -->
<footer class="blog-footer">
<p>
Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
</p>
<p>
<a href="#">Back to top</a>
</p>
</footer>
</body>
</html>