Mirror of https://github.com/alanorth/cgspace-notes.git
Add notes for 2022-03-04
@@ -32,7 +32,7 @@ After running DSpace for over five years I’ve never needed to look in any
This will save us a few gigs of backup space we’re paying for on S3
Also, I noticed the checker log has some errors we should pay attention to:
"/>
-<meta name="generator" content="Hugo 0.92.2" />
+<meta name="generator" content="Hugo 0.93.1" />
@@ -150,7 +150,7 @@ java.io.FileNotFoundException: /home/cgspace.cgiar.org/assetstore/64/29/06/64290
******************************************************
</code></pre><ul>
<li>So this would be the <code>tomcat7</code> Unix user, who seems to have a default limit of 1024 files in its shell</li>
-<li>For what it’s worth, we have been setting the actual Tomcat 7 process' limit to 16384 for a few years (in <code>/etc/default/tomcat7</code>)</li>
+<li>For what it’s worth, we have been setting the actual Tomcat 7 process’ limit to 16384 for a few years (in <code>/etc/default/tomcat7</code>)</li>
<li>Looks like cron will read limits from <code>/etc/security/limits.*</code> so we can do something for the tomcat7 user there</li>
<li>Submit pull request for Tomcat 7 limits in Ansible dspace role (<a href="https://github.com/ilri/rmg-ansible-public/pull/30">#30</a>)</li>
</ul>
@@ -159,10 +159,10 @@ java.io.FileNotFoundException: /home/cgspace.cgiar.org/assetstore/64/29/06/64290
<li>Reduce Amazon S3 storage used for logs from 46 GB to 6GB by deleting a bunch of logs we don’t need!</li>
</ul>
<pre tabindex="0"><code># s3cmd ls s3://cgspace.cgiar.org/log/ > /tmp/s3-logs.txt
-# grep checker.log /tmp/s3-logs.txt | awk '{print $4}' | xargs s3cmd del
-# grep cocoon.log /tmp/s3-logs.txt | awk '{print $4}' | xargs s3cmd del
-# grep handle-plugin.log /tmp/s3-logs.txt | awk '{print $4}' | xargs s3cmd del
-# grep solr.log /tmp/s3-logs.txt | awk '{print $4}' | xargs s3cmd del
+# grep checker.log /tmp/s3-logs.txt | awk '{print $4}' | xargs s3cmd del
+# grep cocoon.log /tmp/s3-logs.txt | awk '{print $4}' | xargs s3cmd del
+# grep handle-plugin.log /tmp/s3-logs.txt | awk '{print $4}' | xargs s3cmd del
+# grep solr.log /tmp/s3-logs.txt | awk '{print $4}' | xargs s3cmd del
</code></pre><ul>
<li>Also, adjust the cron jobs for backups so they only backup <code>dspace.log</code> and some stats files (.dat)</li>
<li>Try to do some metadata field migrations using the Atmire batch UI (<code>dc.Species</code> → <code>cg.species</code>) but it took several hours and even missed a few records</li>
@@ -199,13 +199,13 @@ UPDATE 51258
<li>Looking at the DOI issue <a href="https://www.yammer.com/dspacedevelopers/#/Threads/show?threadId=678507860">reported by Leroy from CIAT a few weeks ago</a></li>
<li>It seems the <code>dx.doi.org</code> URLs are much more proper in our repository!</li>
</ul>
-<pre tabindex="0"><code>dspacetest=# select count(*) from metadatavalue where metadata_field_id=74 and text_value like 'http://dx.doi.org%';
+<pre tabindex="0"><code>dspacetest=# select count(*) from metadatavalue where metadata_field_id=74 and text_value like 'http://dx.doi.org%';
count
-------
5638
(1 row)
-dspacetest=# select count(*) from metadatavalue where metadata_field_id=74 and text_value like 'http://doi.org%';
+dspacetest=# select count(*) from metadatavalue where metadata_field_id=74 and text_value like 'http://doi.org%';
count
-------
3
@@ -231,11 +231,11 @@ dspacetest=# select count(*) from metadatavalue where metadata_field_id=74 and t
<li>I decided to keep the set of subjects that had <code>FMD</code> and <code>RANGELANDS</code> added, as it appears to have been requested to have been added, and might be the newer list</li>
<li>I found 226 blank metadatavalues:</li>
</ul>
-<pre tabindex="0"><code>dspacetest# select * from metadatavalue where resource_type_id=2 and text_value='';
+<pre tabindex="0"><code>dspacetest# select * from metadatavalue where resource_type_id=2 and text_value='';
</code></pre><ul>
<li>I think we should delete them and do a full re-index:</li>
</ul>
-<pre tabindex="0"><code>dspacetest=# delete from metadatavalue where resource_type_id=2 and text_value='';
+<pre tabindex="0"><code>dspacetest=# delete from metadatavalue where resource_type_id=2 and text_value='';
DELETE 226
</code></pre><ul>
<li>I deleted them on CGSpace but I’ll wait to do the re-index as we’re going to be doing one in a few days for the metadata changes anyways</li>
@@ -294,7 +294,7 @@ UPDATE metadatavalue SET metadata_field_id=215 WHERE metadata_field_id=106
UPDATE 3872
UPDATE metadatavalue SET metadata_field_id=217 WHERE metadata_field_id=108
UPDATE 46075
-$ JAVA_OPTS="-Xms512m -Xmx512m -Dfile.encoding=UTF-8" ~/dspace/bin/dspace index-discovery -bf
+$ JAVA_OPTS="-Xms512m -Xmx512m -Dfile.encoding=UTF-8" ~/dspace/bin/dspace index-discovery -bf
</code></pre><ul>
<li>CGSpace was down but I’m not sure why, this was in <code>catalina.out</code>:</li>
</ul>
@@ -387,7 +387,7 @@ UPDATE 46075
<li>Basically, this gives us the ability to use the latest upstream stable 9.3.x release (currently 9.3.12)</li>
<li>Looking into the REST API errors again, it looks like these started appearing a few days ago in the tens of thousands:</li>
</ul>
-<pre tabindex="0"><code>$ grep -c "Aborting context in finally statement" dspace.log.2016-04-20
+<pre tabindex="0"><code>$ grep -c "Aborting context in finally statement" dspace.log.2016-04-20
21252
</code></pre><ul>
<li>I found a recent discussion on the DSpace mailing list and I’ve asked for advice there</li>
@@ -423,7 +423,7 @@ UPDATE 46075
<li>Looks like the last one was “down” from about four hours ago</li>
<li>I think there must be something with this REST stuff:</li>
</ul>
-<pre tabindex="0"><code># grep -c "Aborting context in finally statement" dspace.log.2016-04-*
+<pre tabindex="0"><code># grep -c "Aborting context in finally statement" dspace.log.2016-04-*
dspace.log.2016-04-01:0
dspace.log.2016-04-02:0
dspace.log.2016-04-03:0