mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2020-01-27
@ -9,7 +9,7 @@
<meta property="og:description" content="2016-09-01
Discuss helping CCAFS with some batch tagging of ORCID IDs for their authors
Discuss how the migration of CGIAR's Active Directory to a flat structure will break our LDAP groups in DSpace
Discuss how the migration of CGIAR’s Active Directory to a flat structure will break our LDAP groups in DSpace
We had been using DC=ILRI to determine whether a user was ILRI or not
It looks like we might be able to use OUs now, instead of DCs:
@ -25,13 +25,13 @@ $ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b "dc=cgiarad,dc=or
<meta name="twitter:description" content="2016-09-01
Discuss helping CCAFS with some batch tagging of ORCID IDs for their authors
Discuss how the migration of CGIAR's Active Directory to a flat structure will break our LDAP groups in DSpace
Discuss how the migration of CGIAR’s Active Directory to a flat structure will break our LDAP groups in DSpace
We had been using DC=ILRI to determine whether a user was ILRI or not
It looks like we might be able to use OUs now, instead of DCs:
$ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b "dc=cgiarad,dc=org" -D "admigration1@cgiarad.org" -W "(sAMAccountName=admigration1)"
"/>
<meta name="generator" content="Hugo 0.62.2" />
<meta name="generator" content="Hugo 0.63.1" />
@ -61,7 +61,7 @@ $ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b "dc=cgiarad,dc=or
<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.a20c1a4367639632cdb341d23c27ca44fedcc75b0f8b3cbea6203010da153d3c.css" rel="stylesheet" integrity="sha256-ogwaQ2djljLNs0HSPCfKRP7cx1sPizy+piAwENoVPTw=" crossorigin="anonymous">
<link href="https://alanorth.github.io/cgspace-notes/css/style.23e2c3298bcc8c1136c19aba330c211ec94c36f7c4454ea15cf4d3548370042a.css" rel="stylesheet" integrity="sha256-I+LDKYvMjBE2wZq6MwwhHslMNvfERU6hXPTTVINwBCo=" crossorigin="anonymous">
<!-- RSS 2.0 feed -->
@ -109,14 +109,14 @@ $ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b "dc=cgiarad,dc=or
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2016-09/">September, 2016</a></h2>
<p class="blog-post-meta"><time datetime="2016-09-01T15:53:00+03:00">Thu Sep 01, 2016</time> by Alan Orth in
<i class="fa fa-tag" aria-hidden="true"></i> <a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>
<span class="fas fa-tag" aria-hidden="true"></span> <a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>
</p>
</header>
<h2 id="2016-09-01">2016-09-01</h2>
<ul>
<li>Discuss helping CCAFS with some batch tagging of ORCID IDs for their authors</li>
<li>Discuss how the migration of CGIAR's Active Directory to a flat structure will break our LDAP groups in DSpace</li>
<li>Discuss how the migration of CGIAR’s Active Directory to a flat structure will break our LDAP groups in DSpace</li>
<li>We had been using <code>DC=ILRI</code> to determine whether a user was ILRI or not</li>
<li>It looks like we might be able to use OUs now, instead of DCs:</li>
</ul>
@ -242,7 +242,7 @@ TLSv1/EDH-RSA-DES-CBC3-SHA
<li>See: <a href="http://www.fileformat.info/info/unicode/char/e1/index.htm">http://www.fileformat.info/info/unicode/char/e1/index.htm</a></li>
<li>See: <a href="http://demo.icu-project.org/icu-bin/nbrowser?t=%C3%A1&s=&uv=0">http://demo.icu-project.org/icu-bin/nbrowser?t=%C3%A1&s=&uv=0</a></li>
<li>If I unzip the original zip from CIAT on Windows, re-zip it with 7zip on Windows, and then unzip it on Linux directly, the file names seem to be proper UTF-8</li>
<li>We should definitely clean filenames so they don't use characters that are tricky to process in CSV and shell scripts, like: <code>,</code>, <code>'</code>, and <code>"</code></li>
<li>We should definitely clean filenames so they don’t use characters that are tricky to process in CSV and shell scripts, like: <code>,</code>, <code>'</code>, and <code>"</code></li>
</ul>
<pre><code>value.replace("'","").replace(",","").replace('"','')
</code></pre><ul>
@ -254,7 +254,7 @@ TLSv1/EDH-RSA-DES-CBC3-SHA
<li>The CSV file was giving file names in UTF-8, and unzipping the zip on Mac OS X and transferring it was converting the file names to Unicode equivalence like I saw above</li>
</ul>
</li>
<li>Import CIAT Gender Network records to CGSpace, first creating the SAF bundles as my user, then importing as the <code>tomcat7</code> user, and deleting the bundle, for each collection's items:</li>
<li>Import CIAT Gender Network records to CGSpace, first creating the SAF bundles as my user, then importing as the <code>tomcat7</code> user, and deleting the bundle, for each collection’s items:</li>
</ul>
<pre><code>$ ./safbuilder.sh -c /home/aorth/ciat-gender-2016-09-06/66601.csv
$ JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx512m" /home/cgspace.cgiar.org/bin/dspace import -a -e aorth@mjanja.ch -c 10568/66601 -s /home/aorth/ciat-gender-2016-09-06/SimpleArchiveFormat -m 66601.map
@ -263,7 +263,7 @@ $ rm -rf ~/ciat-gender-2016-09-06/SimpleArchiveFormat/
<ul>
<li>Erase and rebuild DSpace Test based on latest Ubuntu 16.04, PostgreSQL 9.5, and Java 8 stuff</li>
<li>Reading about PostgreSQL maintenance and it seems manual vacuuming is only for certain workloads, such as heavy update/write loads</li>
<li>I suggest we disable our nightly manual vacuum task, as we're a mostly read workload, and I'd rather stick as close to the documentation as possible since we haven't done any testing/observation of PostgreSQL</li>
<li>I suggest we disable our nightly manual vacuum task, as we’re a mostly read workload, and I’d rather stick as close to the documentation as possible since we haven’t done any testing/observation of PostgreSQL</li>
<li>See: <a href="https://www.postgresql.org/docs/9.3/static/routine-vacuuming.html">https://www.postgresql.org/docs/9.3/static/routine-vacuuming.html</a></li>
<li>CGSpace went down and the error seems to be the same as always (lately):</li>
</ul>
@ -295,7 +295,7 @@ org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error
<pre><code>Exception in thread "http-bio-127.0.0.1-8081-exec-25" java.lang.OutOfMemoryError: Java heap space
at java.lang.StringCoding.decode(StringCoding.java:215)
</code></pre><ul>
<li>We haven't seen that in quite a while…</li>
<li>We haven’t seen that in quite a while…</li>
<li>Indeed, in a month of logs it only occurs 15 times:</li>
</ul>
<pre><code># grep -rsI "OutOfMemoryError" /var/log/tomcat7/catalina.* | wc -l
@ -397,17 +397,17 @@ java.util.Map does not have a no-arg default constructor.
</ul>
<pre><code>JAVA_OPTS="-Djava.awt.headless=true -Xms3584m -Xmx3584m -XX:MaxPermSize=256m -XX:+UseConcMarkSweepGC -Dfile.encoding=UTF-8 -XX:-UseGCOverheadLimit -XX:MaxGCPauseMillis=250 -XX:GCTimeRatio=9 -XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m -XX:InitiatingHeapOccupancyPercent=75 -XX:+UseLargePages -XX:+AggressiveOpts"
</code></pre><ul>
<li>So I'm going to bump the heap +512m and remove all the other experimental shit (and update ansible!)</li>
<li>So I’m going to bump the heap +512m and remove all the other experimental shit (and update ansible!)</li>
<li>Increased JVM heap to 4096m on CGSpace (linode01)</li>
</ul>
<h2 id="2016-09-15">2016-09-15</h2>
<ul>
<li>Looking at Google Webmaster Tools again, it seems the work I did on URL query parameters and blocking via the <code>X-Robots-Tag</code> HTTP header in March, 2016 seem to have had a positive effect on Google's index for CGSpace</li>
<li>Looking at Google Webmaster Tools again, it seems the work I did on URL query parameters and blocking via the <code>X-Robots-Tag</code> HTTP header in March, 2016 seem to have had a positive effect on Google’s index for CGSpace</li>
</ul>
<p><img src="/cgspace-notes/2016/09/google-webmaster-tools-index.png" alt="Google Webmaster Tools for CGSpace"></p>
<h2 id="2016-09-16">2016-09-16</h2>
<ul>
<li>CGSpace crashed again, and there are TONS of heap space errors but the datestamps aren't on those lines so I'm not sure if they were yesterday:</li>
<li>CGSpace crashed again, and there are TONS of heap space errors but the datestamps aren’t on those lines so I’m not sure if they were yesterday:</li>
</ul>
<pre><code>dn:CN=Orentlicher\, Natalie (CIAT),OU=Standard,OU=Users,OU=HQ,OU=CIATHUB,dc=cgiarad,dc=org
Thu Sep 15 18:45:25 UTC 2016 | Query:id: 55785 AND type:2
@ -434,12 +434,12 @@ Exception in thread "Thread-54216" org.apache.solr.client.solrj.impl.H
at com.atmire.statistics.SolrLogThread.run(SourceFile:25)
</code></pre><ul>
<li>I bumped the heap space from 4096m to 5120m to see if this is <em>really</em> about heap speace or not.</li>
<li>Looking into some of these errors that I've seen this week but haven't noticed before:</li>
<li>Looking into some of these errors that I’ve seen this week but haven’t noticed before:</li>
</ul>
<pre><code># zcat -f -- /var/log/tomcat7/catalina.* | grep -c 'Failed to generate the schema for the JAX-B elements'
113
</code></pre><ul>
<li>I've sent a message to Atmire about the Solr error to see if it's related to their batch update module</li>
<li>I’ve sent a message to Atmire about the Solr error to see if it’s related to their batch update module</li>
</ul>
<h2 id="2016-09-19">2016-09-19</h2>
<ul>
@ -474,7 +474,7 @@ $ ./delete-metadata-values.py -f cg.contributor.affiliation -i affiliations_pb-2
<p><img src="/cgspace-notes/2016/09/cgspace-search.png" alt="CGSpace search with “OR” boolean logic">
<img src="/cgspace-notes/2016/09/dspacetest-search.png" alt="DSpace Test search with “AND” boolean logic"></p>
<ul>
<li>Found a way to improve the configuration of Atmire's Content and Usage Analysis (CUA) module for date fields</li>
<li>Found a way to improve the configuration of Atmire’s Content and Usage Analysis (CUA) module for date fields</li>
</ul>
<pre><code>-content.analysis.dataset.option.8=metadata:dateAccessioned:discovery
+content.analysis.dataset.option.8=metadata:dc.date.accessioned:date(month)
@ -500,8 +500,8 @@ $ ./delete-metadata-values.py -i sponsors-delete-8.csv -f dc.description.sponsor
<li>Merge accession date improvements for CUA module (<a href="https://github.com/ilri/DSpace/pull/275">#275</a>)</li>
<li>Merge addition of accession date to Discovery search filters (<a href="https://github.com/ilri/DSpace/pull/276">#276</a>)</li>
<li>Merge updates to sponsorship controlled vocabulary (<a href="https://github.com/ilri/DSpace/pull/277">#277</a>)</li>
<li>I've been trying to add a search filter for <code>dc.description</code> so the IITA people can search for some tags they use there, but for some reason the filter never shows up in Atmire's CUA</li>
<li>Not sure if it's something like we already have too many filters there (30), or the filter name is reserved, etc…</li>
<li>I’ve been trying to add a search filter for <code>dc.description</code> so the IITA people can search for some tags they use there, but for some reason the filter never shows up in Atmire’s CUA</li>
<li>Not sure if it’s something like we already have too many filters there (30), or the filter name is reserved, etc…</li>
<li>Generate a list of ILRI subjects for Peter and Abenet to look through/fix:</li>
</ul>
<pre><code>dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where resource_type_id=2 and metadata_field_id=203 group by text_value order by count desc) to /tmp/ilrisubjects.csv with csv;
@ -509,7 +509,7 @@ $ ./delete-metadata-values.py -i sponsors-delete-8.csv -f dc.description.sponsor
<li>Regenerate Discovery indexes a few times after playing with <code>discovery.xml</code> index definitions (syntax, parameters, etc).</li>
<li>Merge changes to boolean logic in Solr search (<a href="https://github.com/ilri/DSpace/pull/274">#274</a>)</li>
<li>Run all sponsorship and affiliation fixes on CGSpace, deploy latest <code>5_x-prod</code> branch, and re-index Discovery on CGSpace</li>
<li>Tested OCSP stapling on DSpace Test's nginx and it works:</li>
<li>Tested OCSP stapling on DSpace Test’s nginx and it works:</li>
</ul>
<pre><code>$ openssl s_client -connect dspacetest.cgiar.org:443 -servername dspacetest.cgiar.org -tls1_2 -tlsextdebug -status
...
@ -519,7 +519,7 @@ OCSP Response Data:
...
Cert Status: good
</code></pre><ul>
<li>I've been monitoring this for almost two years in this GitHub issue: <a href="https://github.com/ilri/DSpace/issues/38">https://github.com/ilri/DSpace/issues/38</a></li>
<li>I’ve been monitoring this for almost two years in this GitHub issue: <a href="https://github.com/ilri/DSpace/issues/38">https://github.com/ilri/DSpace/issues/38</a></li>
</ul>
<h2 id="2016-09-27">2016-09-27</h2>
<ul>
@ -552,10 +552,10 @@ UPDATE 101
<li>Make a placeholder pull request for <code>discovery.xml</code> changes (<a href="https://github.com/ilri/DSpace/pull/278">#278</a>), as I still need to test their effect on Atmire content analysis module</li>
<li>Make a placeholder pull request for Font Awesome changes (<a href="https://github.com/ilri/DSpace/pull/279">#279</a>), which replaces the GitHub image in the footer with an icon, and add style for RSS and @ icons that I will start replacing in community/collection HTML intros</li>
<li>Had some issues with local test server after messing with Solr too much, had to blow everything away and re-install from CGSpace</li>
<li>Going to try to update Sonja Vermeulen's authority to 2b4166b7-6e4d-4f66-9d8b-ddfbec9a6ae0, as that seems to be one of her authorities that has an ORCID</li>
<li>Going to try to update Sonja Vermeulen’s authority to 2b4166b7-6e4d-4f66-9d8b-ddfbec9a6ae0, as that seems to be one of her authorities that has an ORCID</li>
<li>Merge Font Awesome changes (<a href="https://github.com/ilri/DSpace/pull/279">#279</a>)</li>
<li>Minor fix to a string in Atmire's CUA module (<a href="https://github.com/ilri/DSpace/pull/280">#280</a>)</li>
<li>This seems to be what I'll need to do for Sonja Vermeulen (but with <code>2b4166b7-6e4d-4f66-9d8b-ddfbec9a6ae0</code> instead on the live site):</li>
<li>Minor fix to a string in Atmire’s CUA module (<a href="https://github.com/ilri/DSpace/pull/280">#280</a>)</li>
<li>This seems to be what I’ll need to do for Sonja Vermeulen (but with <code>2b4166b7-6e4d-4f66-9d8b-ddfbec9a6ae0</code> instead on the live site):</li>
</ul>
<pre><code>dspacetest=# update metadatavalue set authority='09e4da69-33a3-45ca-b110-7d3f82d2d6d2', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Vermeulen, S%';
dspacetest=# update metadatavalue set authority='09e4da69-33a3-45ca-b110-7d3f82d2d6d2', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Vermeulen SJ%';
@ -576,8 +576,8 @@ $ ./delete-metadata-values.py -i ilrisubjects-delete-13.csv -f cg.subject.ilri -
<pre><code>dspacetest=# select distinct text_value from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/5472', '10568/5473')));
</code></pre><h2 id="2016-09-30">2016-09-30</h2>
<ul>
<li>Deny access to REST API's <code>find-by-metadata-field</code> endpoint to protect against an upstream security issue (DS-3250)</li>
<li>There is a patch but it is only for 5.5 and doesn't apply cleanly to 5.1</li>
<li>Deny access to REST API’s <code>find-by-metadata-field</code> endpoint to protect against an upstream security issue (DS-3250)</li>
<li>There is a patch but it is only for 5.5 and doesn’t apply cleanly to 5.1</li>
</ul>