Add notes for 2019-12-17

2019-12-17 14:49:24 +02:00
parent d83c951532
commit d54e5b69f1
90 changed files with 1420 additions and 1377 deletions


@ -15,7 +15,7 @@
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="May, 2017"/>
<meta name="twitter:description" content="2017-05-01 ICARDA apparently started working on CG Core on their MEL repository They have implemented a few cg.* fields, but not very consistently, and have even copied some CGSpace items: https://mel.cgiar.org/xmlui/handle/20.500.11766/6911?show=full https://cgspace.cgiar.org/handle/10568/73683 2017-05-02 Atmire got back about the Workflow Statistics issue, and apparently it&#39;s a bug in the CUA module so they will send us a pull request 2017-05-04 Sync DSpace Test with database and assetstore from CGSpace Re-deploy DSpace Test with Atmire&#39;s CUA patch for workflow statistics, run system updates, and restart the server Now I can see the workflow statistics and am able to select users, but everything returns 0 items Megan says there are still some mapped items that are not appearing since last week, so I forced a full index-discovery -b Need to remember to check if the collection has more items (currently 39 on CGSpace, but 118 on the freshly reindexed DSpace Test) tomorrow: https://cgspace."/>
<meta name="generator" content="Hugo 0.60.1" />
<meta name="generator" content="Hugo 0.61.0" />
@ -96,7 +96,7 @@
</p>
</header>
<h2 id="20170501">2017-05-01</h2>
<h2 id="2017-05-01">2017-05-01</h2>
<ul>
<li>ICARDA apparently started working on CG Core on their MEL repository</li>
<li>They have implemented a few <code>cg.*</code> fields, but not very consistently, and have even copied some CGSpace items:
@ -106,11 +106,11 @@
</ul>
</li>
</ul>
<h2 id="20170502">2017-05-02</h2>
<h2 id="2017-05-02">2017-05-02</h2>
<ul>
<li>Atmire got back about the Workflow Statistics issue, and apparently it's a bug in the CUA module so they will send us a pull request</li>
</ul>
<h2 id="20170504">2017-05-04</h2>
<h2 id="2017-05-04">2017-05-04</h2>
<ul>
<li>Sync DSpace Test with database and assetstore from CGSpace</li>
<li>Re-deploy DSpace Test with Atmire's CUA patch for workflow statistics, run system updates, and restart the server</li>
@ -118,7 +118,7 @@
<li>Megan says there are still some mapped items that are not appearing since last week, so I forced a full <code>index-discovery -b</code></li>
<li>Need to remember to check if the collection has more items (currently 39 on CGSpace, but 118 on the freshly reindexed DSpace Test) tomorrow: <a href="https://cgspace.cgiar.org/handle/10568/80731">https://cgspace.cgiar.org/handle/10568/80731</a></li>
</ul>
<h2 id="20170505">2017-05-05</h2>
<h2 id="2017-05-05">2017-05-05</h2>
<ul>
<li>Discovered that CGSpace has ~700 items that are missing the <code>cg.identifier.status</code> field</li>
<li>Need to perhaps try using the &ldquo;required metadata&rdquo; curation task to find items missing these fields:</li>
@ -127,13 +127,13 @@
</code></pre><ul>
<li>It seems the curation task dies when it finds an item which has missing metadata</li>
</ul>
<h2 id="20170506">2017-05-06</h2>
<h2 id="2017-05-06">2017-05-06</h2>
<ul>
<li>Add &ldquo;Blog Post&rdquo; to <code>dc.type</code></li>
<li>Create ticket on Atmire tracker to ask about commissioning them to develop the feature to expose ORCID via REST/OAI: <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=510">https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=510</a></li>
<li>According to the <a href="https://wiki.duraspace.org/display/DSDOC5x/Curation+System">DSpace curation docs</a> the fact that the <code>requiredmetadata</code> curation task stops when it finds a missing metadata field is by design</li>
</ul>
<h2 id="20170507">2017-05-07</h2>
<h2 id="2017-05-07">2017-05-07</h2>
<ul>
<li>Testing one replacement for CCAFS Flagships (<code>cg.subject.ccafs</code>), first changed in the submission forms, and then in the database:</li>
</ul>
@ -142,7 +142,7 @@
<li>Also, CCAFS wants to re-order their flagships to prioritize the Phase II ones</li>
<li>Waiting for feedback from CCAFS, then I can merge <a href="https://github.com/ilri/DSpace/pull/320">#320</a></li>
</ul>
<h2 id="20170508">2017-05-08</h2>
<h2 id="2017-05-08">2017-05-08</h2>
<ul>
<li>Start working on CGIAR Library migration</li>
<li>We decided to use AIP export to preserve the hierarchies and handles of communities and collections</li>
@ -171,7 +171,7 @@ $ for item in /home/aorth/10947-1/ITEM@10947-*; do [dspace]/bin/dspace packager
<li>This uses the webui's item list sort options, see <code>webui.itemlist.sort-option</code> in <code>dspace.cfg</code></li>
<li>The equivalent Discovery search would be: <a href="https://cgspace.cgiar.org/discover?filtertype_1=crpsubject&amp;filter_relational_operator_1=equals&amp;filter_1=WATER%2C+LAND+AND+ECOSYSTEMS&amp;submit_apply_filter=&amp;query=&amp;rpp=10&amp;sort_by=dc.date.issued_dt&amp;order=desc">https://cgspace.cgiar.org/discover?filtertype_1=crpsubject&amp;filter_relational_operator_1=equals&amp;filter_1=WATER%2C+LAND+AND+ECOSYSTEMS&amp;submit_apply_filter=&amp;query=&amp;rpp=10&amp;sort_by=dc.date.issued_dt&amp;order=desc</a></li>
</ul>
<h2 id="20170509">2017-05-09</h2>
<h2 id="2017-05-09">2017-05-09</h2>
<ul>
<li>The CGIAR Library metadata has some blank metadata values, which leads to <code>|||</code> in the Discovery facets</li>
<li>Clean these up in the database using:</li>
@ -188,7 +188,7 @@ $ for item in /home/aorth/10947-1/ITEM@10947-*; do [dspace]/bin/dspace packager
<li>I think those errors actually come from me running the <code>update-sequences.sql</code> script while Tomcat/DSpace are running</li>
<li>Apparently you need to stop Tomcat!</li>
</ul>
<h2 id="20170510">2017-05-10</h2>
<h2 id="2017-05-10">2017-05-10</h2>
<ul>
<li>Atmire says they are willing to extend the ORCID implementation, and I've asked them to provide a quote</li>
<li>I clarified that the scope of the implementation should be that ORCIDs are stored in the database and exposed via REST / API like other fields</li>
@ -208,13 +208,13 @@ $ for item in /home/aorth/10947-1/ITEM@10947-*; do [dspace]/bin/dspace packager
<li>After this I ran the <code>update-sequences.sql</code> script (with Tomcat shut down), and cleaned up the 200+ blank metadata records:</li>
</ul>
<pre><code>dspace=# delete from metadatavalue where resource_type_id=2 and text_value='';
</code></pre><h2 id="20170513">2017-05-13</h2>
</code></pre><h2 id="2017-05-13">2017-05-13</h2>
<ul>
<li>After quite a bit of troubleshooting with importing cleaned up data as CSV, it seems that there are actually <a href="https://en.wikipedia.org/wiki/Null_character">NUL</a> characters in the <code>dc.description.abstract</code> field (at least) on the lines where CSV importing was failing</li>
<li>I tried to find a way to remove the characters in vim or Open Refine, but decided it was quicker to just remove the column temporarily and import it</li>
<li>The import was successful and detected 2022 changes, which are likely the remaining records that were failing to import before</li>
</ul>
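<p>A minimal sketch of how the NUL characters could be stripped from the CSV before import, as an alternative to removing the column. This is not what was done above; the function names and file paths are hypothetical, and it assumes the export is UTF-8 encoded:</p>

```python
def strip_nul_characters(text: str) -> str:
    """Return text with every NUL (U+0000) character removed."""
    return text.replace("\x00", "")


def clean_csv_file(src_path: str, dst_path: str, encoding: str = "utf-8") -> None:
    """Copy src_path to dst_path, dropping NUL characters from each line.

    newline="" keeps the original line endings untouched, so only the
    NUL characters differ between the two files.
    """
    with open(src_path, "r", encoding=encoding, newline="") as src, \
         open(dst_path, "w", encoding=encoding, newline="") as dst:
        for line in src:
            dst.write(strip_nul_characters(line))


if __name__ == "__main__":
    # Hypothetical file names, for illustration only
    clean_csv_file("/tmp/export.csv", "/tmp/export-clean.csv")
```

<p>Working line by line keeps memory use low for large exports, and the cleaned copy leaves the original file available for comparison.</p>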
<h2 id="20170515">2017-05-15</h2>
<h2 id="2017-05-15">2017-05-15</h2>
<ul>
<li>To delete the blank lines that cause issues during import we need to use a regex in vim <code>g/^$/d</code></li>
<li>After that I started looking in the <code>dc.subject</code> field to try to pull countries and regions out, but there are too many values in there</li>
@ -241,12 +241,12 @@ $ for item in /home/aorth/10947-1/ITEM@10947-*; do [dspace]/bin/dspace packager
<p>Fix cron jobs for log management on DSpace Test, as they weren't catching <code>dspace.log.*</code> files correctly; over six months of them had accumulated and were taking up many gigabytes of disk space</p>
</li>
</ul>
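<p>For reference, a rough Python equivalent of the vim <code>g/^$/d</code> command mentioned above, as a sketch only. The function name is hypothetical. Like the vim pattern, it removes only lines that are completely empty and keeps lines containing whitespace; it would also remove empty lines inside quoted multi-line CSV fields, which may or may not be wanted:</p>

```python
def drop_blank_lines(lines):
    """Return the lines that are not completely empty.

    A line counts as empty when nothing remains after removing its
    line terminator, which mirrors the vim pattern ^$.
    """
    return [line for line in lines if line.rstrip("\r\n") != ""]


if __name__ == "__main__":
    sample = ["id,title\n", "\n", "1,Example\n", "\r\n", "2,Another\n"]
    # Prints the three non-empty lines with their terminators intact
    print(drop_blank_lines(sample))
```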
<h2 id="20170516">2017-05-16</h2>
<h2 id="2017-05-16">2017-05-16</h2>
<ul>
<li>Discuss updates to WLE themes for their Phase II</li>
<li>Make an issue to track the changes to <code>cg.subject.wle</code>: <a href="https://github.com/ilri/DSpace/issues/322">#322</a></li>
</ul>
<h2 id="20170517">2017-05-17</h2>
<h2 id="2017-05-17">2017-05-17</h2>
<ul>
<li>Looking into the error I get when trying to create a new collection on DSpace Test:</li>
</ul>
@ -275,13 +275,13 @@ $ for item in /home/aorth/10947-1/ITEM@10947-*; do [dspace]/bin/dspace packager
</code></pre><ul>
<li>After that I can create collections just fine, though I'm not sure if it has other side effects</li>
</ul>
<h2 id="20170521">2017-05-21</h2>
<h2 id="2017-05-21">2017-05-21</h2>
<ul>
<li>Start creating a basic theme for the CGIAR System Organization's community on CGSpace</li>
<li>Using colors from the <a href="http://library.cgiar.org/handle/10947/2699">CGIAR Branding guidelines (2014)</a></li>
<li>Make a GitHub issue to track this work: <a href="https://github.com/ilri/DSpace/issues/324">#324</a></li>
</ul>
<h2 id="20170522">2017-05-22</h2>
<h2 id="2017-05-22">2017-05-22</h2>
<ul>
<li>Do some cleanups of community and collection names in CGIAR System Management Office community on DSpace Test, as well as move some items as Peter requested</li>
<li>Peter wanted a list of authors in here, so I generated a list of collections using the &ldquo;View Source&rdquo; on each community and this hacky awk:</li>
@ -311,7 +311,7 @@ from metadatavalue
where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author')
AND resource_type_id = 2
AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10947/2', '10947/3', '10947/10', '10947/4', '10947/5', '10947/6', '10947/7', '10947/8', '10947/9', '10947/11', '10947/25', '10947/12', '10947/26', '10947/27', '10947/28', '10947/29', '10947/30', '10947/13', '10947/14', '10947/15', '10947/16', '10947/31', '10947/32', '10947/33', '10947/34', '10947/35', '10947/36', '10947/37', '10947/17', '10947/18', '10947/38', '10947/19', '10947/39', '10947/40', '10947/41', '10947/42', '10947/43', '10947/2512', '10947/44', '10947/20', '10947/21', '10947/45', '10947/46', '10947/47', '10947/48', '10947/49', '10947/22', '10947/23', '10947/24', '10947/50', '10947/51', '10947/2518', '10947/2776', '10947/2790', '10947/2521', '10947/2522', '10947/2782', '10947/2525', '10947/2836', '10947/2524', '10947/2878', '10947/2520', '10947/2523', '10947/2786', '10947/2631', '10947/2589', '10947/2519', '10947/2708', '10947/2526', '10947/2871', '10947/2527', '10947/4467', '10947/3457', '10947/2528', '10947/2529', '10947/2533', '10947/2530', '10947/2531', '10947/2532', '10947/2538', '10947/2534', '10947/2540', '10947/2900', '10947/2539', '10947/2784', '10947/2536', '10947/2805', '10947/2541', '10947/2535', '10947/2537', '10568/93761'))) group by text_value order by count desc) to /tmp/cgiar-librar-authors.csv with csv;
</code></pre><h2 id="20170523">2017-05-23</h2>
</code></pre><h2 id="2017-05-23">2017-05-23</h2>
<ul>
<li>Add Affiliation to filters on Listing and Reports module (<a href="https://github.com/ilri/DSpace/pull/325">#325</a>)</li>
<li>Start looking at WLE's Phase II metadata updates but it seems they are not tagging their items properly, as their website importer infers which theme to use based on the name of the CGSpace collection!</li>
@ -323,12 +323,12 @@ COPY 111
</code></pre><ul>
<li>Respond to Atmire message about ORCIDs, saying that right now we'd prefer to just have them available via REST API like any other metadata field, and that I'm available for a Skype call</li>
</ul>
<h2 id="20170526">2017-05-26</h2>
<h2 id="2017-05-26">2017-05-26</h2>
<ul>
<li>Increase max file size in nginx so that CIP can upload some larger PDFs</li>
<li>Agree to talk with Atmire after the June DSpace developers meeting where they will be discussing exposing ORCIDs via REST/OAI</li>
</ul>
<h2 id="20170528">2017-05-28</h2>
<h2 id="2017-05-28">2017-05-28</h2>
<ul>
<li>File an issue on GitHub to explore/track migration to proper country/region codes (ISO 2/3 and UN M.49): <a href="https://github.com/ilri/DSpace/issues/326">#326</a></li>
<li>Ask Peter how the Landportal.info people should acknowledge us as the source of data on their website</li>
@ -354,7 +354,7 @@ UPDATE 187
<li>Run the corrections on CGSpace and then update discovery / authority</li>
<li>I notice that there are a handful of <code>java.lang.OutOfMemoryError: Java heap space</code> errors in the Catalina logs on CGSpace; I should go look into that&hellip;</li>
</ul>
<h2 id="20170529">2017-05-29</h2>
<h2 id="2017-05-29">2017-05-29</h2>
<ul>
<li>Discuss WLE themes and subjects with Mia and Macaroni Bros</li>
<li>We decided we need to create metadata fields for Phase I and II themes</li>