mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2019-12-17
@ -35,7 +35,7 @@ http://localhost:3000/solr/statistics/update?stream.body=%3Ccommit/%3E
|
||||
Then I reduced the JVM heap size from 6144 back to 5120m
|
||||
Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked the Ansible infrastructure scripts to support hosts choosing which distribution they want to use
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.60.1" />
|
||||
<meta name="generator" content="Hugo 0.61.0" />
|
||||
|
||||
|
||||
|
||||
@ -116,7 +116,7 @@ Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="20180501">2018-05-01</h2>
|
||||
<h2 id="2018-05-01">2018-05-01</h2>
|
||||
<ul>
|
||||
<li>I cleared the Solr statistics core on DSpace Test by issuing two commands directly to the Solr admin interface:
|
||||
<ul>
|
||||
@ -127,7 +127,7 @@ Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked
|
||||
<li>Then I reduced the JVM heap size from 6144 back to 5120m</li>
|
||||
<li>Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked the <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure scripts</a> to support hosts choosing which distribution they want to use</li>
|
||||
</ul>
|
||||
<h2 id="20180502">2018-05-02</h2>
|
||||
<h2 id="2018-05-02">2018-05-02</h2>
|
||||
<ul>
|
||||
<li>Advise Fabio Fidanza about integrating CGSpace content in the new CGIAR corporate website</li>
|
||||
<li>I think they can mostly rely on using the <code>cg.contributor.crp</code> field</li>
|
||||
@ -161,7 +161,7 @@ Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<h2 id="20180503">2018-05-03</h2>
|
||||
<h2 id="2018-05-03">2018-05-03</h2>
|
||||
<ul>
|
||||
<li>It turns out that the IITA records that I was helping Sisay with in March were imported in 2018-04 without a final check by Abenet or me</li>
|
||||
<li>There are lots of errors on language, CRP, and even some encoding errors on abstract fields</li>
|
||||
@ -172,7 +172,7 @@ Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked
|
||||
<li>Abenet sent a list of 46 ORCID identifiers for ILRI authors so I need to get their names using my <a href="https://gist.github.com/alanorth/57a88379126d844563c1410bd7b8d12b">resolve-orcids.py</a> script and merge them into our controlled vocabulary</li>
|
||||
<li>In the messed-up IITA records from 2018-04 I see sixty DOIs in an incorrect format (cg.identifier.doi)</li>
|
||||
</ul>
|
||||
<h2 id="20180506">2018-05-06</h2>
|
||||
<h2 id="2018-05-06">2018-05-06</h2>
|
||||
<ul>
|
||||
<li>While fixing the IITA records from Sisay, I found that sixty DOIs have a completely invalid format like <code>http:dx.doi.org10.1016j.cropro.2008.07.003</code></li>
|
||||
<li>I corrected all the DOIs and then checked them for validity with a quick bash loop:</li>
|
||||
@ -218,7 +218,7 @@ $ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-i
|
||||
<li>I made a pull request (<a href="https://github.com/ilri/DSpace/pull/373">#373</a>) for this that I'll merge some time next week (I'm expecting Atmire to get back to us about DSpace 5.8 soon)</li>
|
||||
<li>After testing quickly I just decided to merge it, and I noticed that I don't even need to restart Tomcat for the changes to get loaded</li>
|
||||
</ul>
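As a rough illustration of the DOI problem described in the 2018-05-06 notes, the sketch below checks whether a DOI URL is well formed before any network lookup. It is not the bash loop mentioned in the notes; the regular expression, the function name `is_well_formed_doi`, and the corrected example DOI are assumptions made for illustration.

```python
import re

# Assumed shape of a usable DOI URL: scheme, doi.org host, "10.<registrant>/<suffix>".
# This pattern is an illustrative assumption, not a rule taken from the notes.
DOI_URL = re.compile(r"^https?://(?:dx\.)?doi\.org/10\.\d{4,9}/\S+$")


def is_well_formed_doi(url: str) -> bool:
    """Return True if the URL looks like a syntactically valid DOI link."""
    return DOI_URL.match(url.strip()) is not None


if __name__ == "__main__":
    # The malformed value quoted in the notes (slashes are missing).
    print(is_well_formed_doi("http:dx.doi.org10.1016j.cropro.2008.07.003"))
    # A presumed corrected form of the same DOI (hypothetical example).
    print(is_well_formed_doi("https://doi.org/10.1016/j.cropro.2008.07.003"))
```

A syntax check like this only catches formatting damage; confirming that a DOI actually resolves still needs a request to the resolver.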
|
||||
<h2 id="20180507">2018-05-07</h2>
|
||||
<h2 id="2018-05-07">2018-05-07</h2>
|
||||
<ul>
|
||||
<li>I spent a bit of time playing with <a href="https://github.com/codeforkjeff/conciliator">conciliator</a> and Solr, trying to figure out how to reconcile columns in OpenRefine with data in our existing Solr cores (like CRP subjects)</li>
|
||||
<li>The documentation regarding the Solr stuff is limited, and I cannot figure out what all the fields in <code>conciliator.properties</code> are supposed to be</li>
|
||||
@ -226,7 +226,7 @@ $ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-i
|
||||
<li>That, combined with splitting our multi-value fields on “||” in OpenRefine is amaaaaazing, because after reconciliation you can just join them again</li>
|
||||
<li>Oh wow, you can also facet on the individual values once you've split them! That's going to be amazing for proofing CRPs, subjects, etc.</li>
|
||||
</ul>
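The split, reconcile, and join workflow from the 2018-05-07 notes can be sketched outside OpenRefine as follows. This is a minimal sketch under stated assumptions: the helper names and the `corrections` mapping are hypothetical, and only the "||" separator comes from the notes.

```python
# Separator used for multi-value fields, as described in the notes.
SEPARATOR = "||"


def split_values(cell: str) -> list[str]:
    """Split a multi-value cell into individual, trimmed values."""
    return [value.strip() for value in cell.split(SEPARATOR) if value.strip()]


def join_values(values) -> str:
    """Join individual values back into one multi-value cell."""
    return SEPARATOR.join(values)


def reconcile_cell(cell: str, corrections: dict[str, str]) -> str:
    """Replace each value that has a known correction, then re-join the cell."""
    return join_values(corrections.get(value, value) for value in split_values(cell))


if __name__ == "__main__":
    # Hypothetical mapping from a misspelled value to its controlled-vocabulary form.
    corrections = {"Livestock & Fish": "Livestock and Fish"}
    print(reconcile_cell("Livestock & Fish||Example Subject", corrections))
```

Working on the split values also makes it easy to count or facet individual values before joining them again.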
|
||||
<h2 id="20180509">2018-05-09</h2>
|
||||
<h2 id="2018-05-09">2018-05-09</h2>
|
||||
<ul>
|
||||
<li>Udana asked about the Book Chapters we had been proofing on DSpace Test in 2018-04</li>
|
||||
<li>I told him that there were still some TODO items for him on that data, for example to update the <code>dc.language.iso</code> field for the Spanish items</li>
|
||||
@ -271,7 +271,7 @@ Livestock and Fish
|
||||
</code></pre><ul>
|
||||
<li>I tried to reconcile against a CSV of our countries but reconcile-csv crashes</li>
|
||||
</ul>
|
||||
<h2 id="20180513">2018-05-13</h2>
|
||||
<h2 id="2018-05-13">2018-05-13</h2>
|
||||
<ul>
|
||||
<li>It turns out there was a space in my “country” header that was causing reconcile-csv to crash</li>
|
||||
<li>After removing that it works fine!</li>
|
||||
@ -291,12 +291,12 @@ Livestock and Fish
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
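The stray space in the "country" header noted on 2018-05-13 is the kind of problem that can be cleaned up before handing a CSV to a reconciliation tool. The sketch below uses Python's standard `csv` module; the function name and the sample data are hypothetical, and it assumes the input has a header row.

```python
import csv
import io


def read_rows_with_clean_headers(text: str) -> list[dict[str, str]]:
    """Parse CSV text, stripping stray whitespace from the header names."""
    reader = csv.DictReader(io.StringIO(text))
    # Reading .fieldnames consumes the header row; reassigning it replaces
    # the keys used for every subsequent row.
    reader.fieldnames = [name.strip() for name in reader.fieldnames]
    return list(reader)


if __name__ == "__main__":
    # Hypothetical sample with a trailing space after "country" in the header.
    sample = "id,country \n1,KENYA\n2,ETHIOPIA\n"
    for row in read_rows_with_clean_headers(sample):
        print(row["country"])
```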
|
||||
<h2 id="20180514">2018-05-14</h2>
|
||||
<h2 id="2018-05-14">2018-05-14</h2>
|
||||
<ul>
|
||||
<li>Send a message to the OpenRefine mailing list about the bug with reconciling multi-value cells</li>
|
||||
<li>Help Silvia Alonso get a list of all her publications since 2013 from Listings and Reports</li>
|
||||
</ul>
|
||||
<h2 id="20180515">2018-05-15</h2>
|
||||
<h2 id="2018-05-15">2018-05-15</h2>
|
||||
<ul>
|
||||
<li>Turns out I was doing the OpenRefine reconciliation wrong: I needed to copy the matched values to a new column!</li>
|
||||
<li>Also, I learned how to do something cool with Jython expressions in OpenRefine</li>
|
||||
@ -358,7 +358,7 @@ $ ./bin/post -c countries ~/src/git/DSpace/2018-05-10-countries.csv
|
||||
<li>I copied over the DSpace <code>search_text</code> field type from the DSpace Solr config (had to remove some properties so Solr would start) but it doesn't seem to be any better at matching than the <code>text_en</code> type</li>
|
||||
<li>I think I need to focus on trying to return scores with conciliator</li>
|
||||
</ul>
|
||||
<h2 id="20180516">2018-05-16</h2>
|
||||
<h2 id="2018-05-16">2018-05-16</h2>
|
||||
<ul>
|
||||
<li>Discuss GDPR with James Stapleton
|
||||
<ul>
|
||||
@ -381,7 +381,7 @@ $ ./bin/post -c countries ~/src/git/DSpace/2018-05-10-countries.csv
|
||||
<li>According to the <a href="https://developers.google.com/analytics/devguides/collection/analyticsjs/field-reference#anonymizeIp">analytics.js protocol parameter documentation</a> this means that IPs are being anonymized</li>
|
||||
<li>After finding and fixing some duplicates in IITA's <code>IITA_April_27</code> test collection on DSpace Test (10568/92703) I told Sisay that he can move them to IITA's Journal Articles collection on CGSpace</li>
|
||||
</ul>
|
||||
<h2 id="20180517">2018-05-17</h2>
|
||||
<h2 id="2018-05-17">2018-05-17</h2>
|
||||
<ul>
|
||||
<li>Testing reconciliation of countries against Solr via conciliator, I notice that <code>CÔTE D'IVOIRE</code> doesn't match <code>COTE D'IVOIRE</code>, whereas with reconcile-csv it does</li>
|
||||
<li>Also, when reconciling regions against Solr via conciliator <code>EASTERN AFRICA</code> doesn't match <code>EAST AFRICA</code>, whereas with reconcile-csv it does</li>
|
||||
@ -401,23 +401,23 @@ $ ./bin/post -c countries ~/src/git/DSpace/2018-05-10-countries.csv
|
||||
<li>This cookie could be set by a user clicking a link in a privacy policy, for example</li>
|
||||
<li>The additional Javascript could be easily added to our existing <code>googleAnalytics</code> template in each XMLUI theme</li>
|
||||
</ul>
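The accent mismatch seen on 2018-05-17 when reconciling countries can be illustrated with a small accent-folding comparison. This is only a sketch of one common technique (Unicode decomposition followed by removal of combining marks); it does not describe how conciliator, Solr, or reconcile-csv match values internally, and the function name `fold` is hypothetical.

```python
import unicodedata


def fold(text: str) -> str:
    """Return a case-insensitive, accent-free form of the text for comparison."""
    # NFKD splits accented characters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


if __name__ == "__main__":
    # The two spellings from the notes compare equal once folded.
    print(fold("CÔTE D'IVOIRE") == fold("COTE D'IVOIRE"))
```

Accent folding does not help with the `EASTERN AFRICA` versus `EAST AFRICA` case, which needs fuzzy or stemmed matching instead.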
|
||||
<h2 id="20180518">2018-05-18</h2>
|
||||
<h2 id="2018-05-18">2018-05-18</h2>
|
||||
<ul>
|
||||
<li>Do a final check on the thirty (30) IWMI Book Chapters for Udana and upload them to CGSpace</li>
|
||||
<li>These were previously on <a href="https://dspacetest.cgiar.org/handle/10568/91679">DSpace Test as “IWMI test collection”</a> in 2018-04</li>
|
||||
</ul>
|
||||
<h2 id="20180520">2018-05-20</h2>
|
||||
<h2 id="2018-05-20">2018-05-20</h2>
|
||||
<ul>
|
||||
<li>Run all system updates on DSpace Test (linode19), re-deploy DSpace with latest <code>5_x-dev</code> branch (including GDPR IP anonymization), and reboot the server</li>
|
||||
<li>Run all system updates on CGSpace (linode18), re-deploy DSpace with latest <code>5_x-dev</code> branch (including GDPR IP anonymization), and reboot the server</li>
|
||||
</ul>
|
||||
<h2 id="20180521">2018-05-21</h2>
|
||||
<h2 id="2018-05-21">2018-05-21</h2>
|
||||
<ul>
|
||||
<li>Geoffrey from IITA got back with more questions about depositing items programmatically into the CGSpace workflow</li>
|
||||
<li>I pointed out that <a href="http://swordapp.org/">SWORD</a> might be an option, as <a href="https://wiki.duraspace.org/display/DSDOC5x/SWORDv2+Server">DSpace supports the SWORDv2 protocol</a> (although we have never tested it)</li>
|
||||
<li>Work on implementing <a href="https://cookieconsent.insites.com">cookie consent</a> popup for all XMLUI themes (SASS theme with primary / secondary branding from Bootstrap)</li>
|
||||
</ul>
|
||||
<h2 id="20180522">2018-05-22</h2>
|
||||
<h2 id="2018-05-22">2018-05-22</h2>
|
||||
<ul>
|
||||
<li>Skype with James Stapleton about last minute GDPR wording</li>
|
||||
<li>After spending yesterday working on integration and theming of the cookieconsent popup, today I cannot get the damn “Agree” button to dismiss the popup!</li>
|
||||
@ -427,7 +427,7 @@ $ ./bin/post -c countries ~/src/git/DSpace/2018-05-10-countries.csv
|
||||
<li>This is a waste of TWO full days of work</li>
|
||||
<li>Marissa Van Epp asked if I could add <code>PII-FP1_PACCA2</code> to the CCAFS phase II project tags on CGSpace so I created a ticket to track it (<a href="https://github.com/ilri/DSpace/issues/376">#376</a>)</li>
|
||||
</ul>
|
||||
<h2 id="20180523">2018-05-23</h2>
|
||||
<h2 id="2018-05-23">2018-05-23</h2>
|
||||
<ul>
|
||||
<li>I'm investigating how many non-CGIAR users we have registered on CGSpace:</li>
|
||||
</ul>
|
||||
@ -439,14 +439,14 @@ $ ./bin/post -c countries ~/src/git/DSpace/2018-05-10-countries.csv
|
||||
<li>I made a pull request for the GDPR compliance popup (<a href="https://github.com/ilri/DSpace/pull/377">#377</a>) and merged it to the <code>5_x-prod</code> branch</li>
|
||||
<li>I will deploy it to CGSpace tonight</li>
|
||||
</ul>
|
||||
<h2 id="20180528">2018-05-28</h2>
|
||||
<h2 id="2018-05-28">2018-05-28</h2>
|
||||
<ul>
|
||||
<li>Daniel Haile-Michael sent a message that CGSpace was down (I am currently in Oregon so the time difference is ~10 hours)</li>
|
||||
<li>I looked in the logs but didn't see anything that would be the cause of the crash</li>
|
||||
<li>Atmire finalized the DSpace 5.8 testing and sent a pull request: <a href="https://github.com/ilri/DSpace/pull/378">https://github.com/ilri/DSpace/pull/378</a></li>
|
||||
<li>They have asked if I can test this and get back to them by June 11th</li>
|
||||
</ul>
|
||||
<h2 id="20180530">2018-05-30</h2>
|
||||
<h2 id="2018-05-30">2018-05-30</h2>
|
||||
<ul>
|
||||
<li>Talk to Samantha from Bioversity about something related to Google Analytics; I'm still not sure what they want</li>
|
||||
<li>DSpace Test crashed last night, seems to be related to system memory (not JVM heap)</li>
|
||||
@ -479,7 +479,7 @@ $ sed 's/.*Item1.*/\n&/g' ~/cifor-duplicates.txt > ~/cifor-duplicates-cle
|
||||
<li>Then I format the list of handles and put it into this SQL query to export authors from items ONLY in those collections (too many to list here):</li>
|
||||
</ul>
|
||||
<pre><code>dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/67236','10568/67274',...))) group by text_value order by count desc) to /tmp/ilri-authors.csv with csv;
|
||||
</code></pre><h2 id="20180531">2018-05-31</h2>
|
||||
</code></pre><h2 id="2018-05-31">2018-05-31</h2>
|
||||
<ul>
|
||||
<li>Clarify CGSpace's usage of Google Analytics and personally identifiable information during user registration for Bioversity team who had been asking about GDPR compliance</li>
|
||||
<li>Testing running PostgreSQL in a Docker container on localhost because when I'm on Arch Linux there isn't an easily installable package for particular PostgreSQL versions</li>
|
||||
|