Add notes for 2019-12-17

2019-12-17 14:49:24 +02:00
parent d83c951532
commit d54e5b69f1
90 changed files with 1420 additions and 1377 deletions


@@ -35,7 +35,7 @@ CGSpace
Abenet had another similar issue a few days ago when trying to find the stats for 2018 in the RTB community
"/>
<meta name="generator" content="Hugo 0.60.1" />
<meta name="generator" content="Hugo 0.61.0" />
@@ -116,7 +116,7 @@ Abenet had another similar issue a few days ago when trying to find the stats fo
</p>
</header>
<h2 id="20190701">2019-07-01</h2>
<h2 id="2019-07-01">2019-07-01</h2>
<ul>
<li>Create an &ldquo;AfricaRice books and book chapters&rdquo; collection on CGSpace for AfricaRice</li>
<li>Last month Sisay asked why the following &ldquo;most popular&rdquo; statistics link for a range of months in 2018 works for the CIAT community on DSpace Test, but not on CGSpace:
@@ -205,7 +205,7 @@ Abenet had another similar issue a few days ago when trying to find the stats fo
-Dcom.sun.management.jmxremote.port=1337
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
</code></pre><h2 id="20190702">2019-07-02</h2>
</code></pre><h2 id="2019-07-02">2019-07-02</h2>
<ul>
<li>Help upload twenty-seven posters from the 2019-05 Sharefair to CGSpace
<ul>
@@ -229,11 +229,11 @@ $ dspace import -a -e me@cgiar.org -m 2019-07-02-Sharefair.map -s /tmp/Sharefair
</ul>
</li>
</ul>
<h2 id="20190703">2019-07-03</h2>
<h2 id="2019-07-03">2019-07-03</h2>
<ul>
<li>Atmire responded about the <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=685">Solr issue</a> and said they would be willing to help</li>
</ul>
<h2 id="20190704">2019-07-04</h2>
<h2 id="2019-07-04">2019-07-04</h2>
<ul>
<li>Maria Garruccio sent me some new ORCID identifiers for Bioversity authors
<ul>
@@ -255,11 +255,11 @@ $ ./resolve-orcids.py -i /tmp/2019-07-04-orcid-ids.txt -o 2019-07-04-orcid-names
<li>But when I ran <code>fix-metadata-values.py</code> I didn't see any changes:</li>
</ul>
<pre><code>$ ./fix-metadata-values.py -i 2019-07-04-update-orcids.csv -db dspace -u dspace -p 'fuuu' -f cg.creator.id -m 240 -t correct -d
</code></pre><h2 id="20190706">2019-07-06</h2>
</code></pre><h2 id="2019-07-06">2019-07-06</h2>
<ul>
<li>Send a reminder to Marie about my notes on the <a href="https://github.com/AgriculturalSemantics/cg-core/issues/2">CG Core v2 issue I created two weeks ago</a></li>
</ul>
<h2 id="20190708">2019-07-08</h2>
<h2 id="2019-07-08">2019-07-08</h2>
<ul>
<li>Communicate with Atmire about the Solr statistics cores issue
<ul>
@@ -297,7 +297,7 @@ dc.identifier.issn
978-3-319-58789-9
2320-7035
2593-9173
</code></pre><h2 id="20190709">2019-07-09</h2>
</code></pre><h2 id="2019-07-09">2019-07-09</h2>
<ul>
<li>Thinking about data cleaning automation again and found some resources about Python and Pandas:
<ul>
@@ -306,7 +306,7 @@ dc.identifier.issn
</ul>
</li>
</ul>
<h2 id="20190711">2019-07-11</h2>
<h2 id="2019-07-11">2019-07-11</h2>
<ul>
<li>Skype call with Marie Angelique about CG Core v2
<ul>
@@ -329,7 +329,7 @@ dc.identifier.issn
</code></pre><ul>
<li>I'm assuming something happened in his browser (like a refresh) after the item was submitted&hellip;</li>
</ul>
<h2 id="20190712">2019-07-12</h2>
<h2 id="2019-07-12">2019-07-12</h2>
<ul>
<li>Atmire responded with some initial feedback about our Tomcat configuration related to the <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=685">Solr issue I raised recently</a>
<ul>
@@ -350,7 +350,7 @@ dc.identifier.issn
<pre><code># su - postgres
$ psql dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (167394);'
UPDATE 1
</code></pre><h2 id="20190716">2019-07-16</h2>
</code></pre><h2 id="2019-07-16">2019-07-16</h2>
<ul>
<li>Completely reset the Podman configuration on my laptop because there were some layers that I couldn't delete and it had been some time since I did a cleanup:</li>
</ul>
@@ -371,7 +371,7 @@ $ psql -h localhost -U postgres -f ~/src/git/DSpace/dspace/etc/postgres/update-s
<li>Start working on implementing the <a href="https://gist.github.com/alanorth/2db39e91f48d116e00a4edffd6ba6409">CG Core v2 changes</a> on my local DSpace test environment</li>
<li>Make a pull request to CG Core v2 with some fixes for typos in the specification (<a href="https://github.com/AgriculturalSemantics/cg-core/pull/5">#5</a>)</li>
</ul>
<h2 id="20190718">2019-07-18</h2>
<h2 id="2019-07-18">2019-07-18</h2>
<ul>
<li>Talk to Moayad about the remaining issues for OpenRXV / AReS
<ul>
@@ -394,7 +394,7 @@ Please see the DSpace documentation for assistance.
</code></pre><ul>
<li>I emailed ICT to ask them to reset it and make the expiration period longer if possible</li>
</ul>
<h2 id="20190719">2019-07-19</h2>
<h2 id="2019-07-19">2019-07-19</h2>
<ul>
<li>ICT reset the password for the CGSpace support account and apparently removed the expiry requirement
<ul>
@@ -402,7 +402,7 @@ Please see the DSpace documentation for assistance.
</ul>
</li>
</ul>
<h2 id="20190720">2019-07-20</h2>
<h2 id="2019-07-20">2019-07-20</h2>
<ul>
<li>Create an account for Lionelle Samnick on CGSpace because the registration isn't working for some reason:</li>
</ul>
@@ -417,7 +417,7 @@ Please see the DSpace documentation for assistance.
<li>Some invalid ISSNs in dc.identifier.issn (they look like ISBNs)</li>
<li>I see some ISSNs in the dc.identifier.isbn field</li>
<li>I see some invalid ISBNs that look like Excel errors (9,78E+12)</li>
<li>For DOI we just use the URL, not &ldquo;DOI: <a href="https://doi.org..">https://doi.org..</a>.&rdquo;</li>
<li>For DOI we just use the URL, not &ldquo;DOI: <a href="https://doi.org...%22">https://doi.org...&quot;</a></li>
<li>I see an invalid &ldquo;LEAVE BLANK&rdquo; in the cg.contributor.crp field</li>
<li>Country field is using &ldquo;,&rdquo; for multiple values instead of &ldquo;||&rdquo;</li>
<li>Region field is using &ldquo;,&rdquo; for multiple values instead of &ldquo;||&rdquo;</li>
@@ -426,7 +426,7 @@ Please see the DSpace documentation for assistance.
</ul>
</li>
</ul>
<h2 id="20190722">2019-07-22</h2>
<h2 id="2019-07-22">2019-07-22</h2>
<ul>
<li>Raise an <a href="https://github.com/AgriculturalSemantics/cg-core/issues/8">issue on CG Core v2 spec regarding country and region coverage</a>
<ul>
@@ -445,7 +445,7 @@ Please see the DSpace documentation for assistance.
<li>I left a note saying that DSpace is technically limited to a flat schema so we use <code>cg.coverage.country: Kenya</code></li>
<li>Do a little more work on CG Core v2 in the input forms</li>
</ul>
<h2 id="20190725">2019-07-25</h2>
<h2 id="2019-07-25">2019-07-25</h2>
<ul>
<li>
<p>Generate a list of the ORCID identifiers that we added to CGSpace in 2019 for Sara Jani at ICARDA</p>
@@ -461,7 +461,7 @@ Please see the DSpace documentation for assistance.
<li>A few strange publishers after splitting multi-value cells, like &ldquo;(Belgium)&rdquo;</li>
<li>Deleted four ISSNs that are actually ISBNs and are already present in the ISBN field</li>
<li>Eight invalid ISBNs</li>
<li>Convert all DOIs to &ldquo;<a href="https://doi.org">https://doi.org</a>&rdquo; format and fix one invalid DOI</li>
<li>Convert all DOIs to &ldquo;<a href="https://doi.org%22">https://doi.org&quot;</a> format and fix one invalid DOI</li>
<li>Fix a handful of incorrect CRPs that seem to have been split on comma &ldquo;,&rdquo;</li>
<li>Lots of strange values in cg.link.reference, and I normalized all DOIs to <a href="https://doi.org">https://doi.org</a> format
<ul>
@@ -488,7 +488,7 @@ from stdnum import issn
isbn.validate('978-92-9043-389-7')
issn.validate('1020-3362')
</code></pre><h2 id="20190726">2019-07-26</h2>
</code></pre><h2 id="2019-07-26">2019-07-26</h2>
<ul>
<li>
<p>Bioversity sent me an updated CSV file that fixes some of the issues I pointed out yesterday</p>
@@ -506,7 +506,7 @@ issn.validate('1020-3362')
</code></pre><ul>
<li>I whipped up a quick script using Python Pandas to do whitespace cleanup</li>
</ul>
<h2 id="20190729">2019-07-29</h2>
<h2 id="2019-07-29">2019-07-29</h2>
<ul>
<li>I turned the Pandas script into a proper Python package called <a href="https://git.sr.ht/~alanorth/csv-metadata-quality">csv-metadata-quality</a>
<ul>
@@ -520,7 +520,7 @@ issn.validate('1020-3362')
</li>
<li>Inform Bioversity that there is an error in their CSV, seemingly caused by quotes in the citation field</li>
</ul>
<h2 id="20190730">2019-07-30</h2>
<h2 id="2019-07-30">2019-07-30</h2>
<ul>
<li>Add support for removing newlines (line feeds) to <a href="https://git.sr.ht/~alanorth/csv-metadata-quality">csv-metadata-quality</a></li>
<li>On the subject of validating some of our fields like countries and regions, Abenet pointed out that these should all be valid AGROVOC terms, so we can actually try to validate against that!</li>