mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2020-01-27
@@ -10,8 +10,8 @@
|
||||
|
||||
Looking at log file use on CGSpace and notice that we need to work on our cron setup a bit
|
||||
We are backing up all logs in the log folder, including useless stuff like solr, cocoon, handle-plugin, etc
|
||||
-After running DSpace for over five years I've never needed to look in any other log file than dspace.log, leave alone one from last year!
-This will save us a few gigs of backup space we're paying for on S3
+After running DSpace for over five years I’ve never needed to look in any other log file than dspace.log, leave alone one from last year!
+This will save us a few gigs of backup space we’re paying for on S3
|
||||
Also, I noticed the checker log has some errors we should pay attention to:
|
||||
" />
|
||||
<meta property="og:type" content="article" />
|
||||
@@ -25,11 +25,11 @@ Also, I noticed the checker log has some errors we should pay attention to:
|
||||
|
||||
Looking at log file use on CGSpace and notice that we need to work on our cron setup a bit
|
||||
We are backing up all logs in the log folder, including useless stuff like solr, cocoon, handle-plugin, etc
|
||||
-After running DSpace for over five years I've never needed to look in any other log file than dspace.log, leave alone one from last year!
-This will save us a few gigs of backup space we're paying for on S3
+After running DSpace for over five years I’ve never needed to look in any other log file than dspace.log, leave alone one from last year!
+This will save us a few gigs of backup space we’re paying for on S3
|
||||
Also, I noticed the checker log has some errors we should pay attention to:
|
||||
"/>
|
||||
-<meta name="generator" content="Hugo 0.62.2" />
+<meta name="generator" content="Hugo 0.63.1" />
|
||||
|
||||
|
||||
|
||||
@@ -59,7 +59,7 @@ Also, I noticed the checker log has some errors we should pay attention to:
|
||||
|
||||
<!-- combined, minified CSS -->
|
||||
|
||||
-<link href="https://alanorth.github.io/cgspace-notes/css/style.a20c1a4367639632cdb341d23c27ca44fedcc75b0f8b3cbea6203010da153d3c.css" rel="stylesheet" integrity="sha256-ogwaQ2djljLNs0HSPCfKRP7cx1sPizy+piAwENoVPTw=" crossorigin="anonymous">
+<link href="https://alanorth.github.io/cgspace-notes/css/style.23e2c3298bcc8c1136c19aba330c211ec94c36f7c4454ea15cf4d3548370042a.css" rel="stylesheet" integrity="sha256-I+LDKYvMjBE2wZq6MwwhHslMNvfERU6hXPTTVINwBCo=" crossorigin="anonymous">
|
||||
|
||||
|
||||
<!-- RSS 2.0 feed -->
|
||||
@@ -107,7 +107,7 @@ Also, I noticed the checker log has some errors we should pay attention to:
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2016-04/">April, 2016</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2016-04-04T11:06:00+03:00">Mon Apr 04, 2016</time> by Alan Orth in
|
||||
|
||||
-<i class="fa fa-tag" aria-hidden="true"></i> <a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>
+<span class="fas fa-tag" aria-hidden="true"></span> <a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>
|
||||
|
||||
</p>
|
||||
</header>
|
||||
@@ -115,8 +115,8 @@ Also, I noticed the checker log has some errors we should pay attention to:
|
||||
<ul>
|
||||
<li>Looking at log file use on CGSpace and notice that we need to work on our cron setup a bit</li>
|
||||
<li>We are backing up all logs in the log folder, including useless stuff like solr, cocoon, handle-plugin, etc</li>
|
||||
-<li>After running DSpace for over five years I've never needed to look in any other log file than dspace.log, leave alone one from last year!</li>
-<li>This will save us a few gigs of backup space we're paying for on S3</li>
+<li>After running DSpace for over five years I’ve never needed to look in any other log file than dspace.log, leave alone one from last year!</li>
+<li>This will save us a few gigs of backup space we’re paying for on S3</li>
|
||||
<li>Also, I noticed the <code>checker</code> log has some errors we should pay attention to:</li>
|
||||
</ul>
|
||||
<pre><code>Run start time: 03/06/2016 04:00:22
|
||||
@@ -143,13 +143,13 @@ java.io.FileNotFoundException: /home/cgspace.cgiar.org/assetstore/64/29/06/64290
|
||||
******************************************************
|
||||
</code></pre><ul>
|
||||
<li>So this would be the <code>tomcat7</code> Unix user, who seems to have a default limit of 1024 files in its shell</li>
|
||||
-<li>For what it's worth, we have been setting the actual Tomcat 7 process’ limit to 16384 for a few years (in <code>/etc/default/tomcat7</code>)</li>
+<li>For what it’s worth, we have been setting the actual Tomcat 7 process’ limit to 16384 for a few years (in <code>/etc/default/tomcat7</code>)</li>
|
||||
<li>Looks like cron will read limits from <code>/etc/security/limits.*</code> so we can do something for the tomcat7 user there</li>
|
||||
<li>Submit pull request for Tomcat 7 limits in Ansible dspace role (<a href="https://github.com/ilri/rmg-ansible-public/pull/30">#30</a>)</li>
|
||||
</ul>
|
||||
<h2 id="2016-04-05">2016-04-05</h2>
|
||||
<ul>
|
||||
-<li>Reduce Amazon S3 storage used for logs from 46 GB to 6GB by deleting a bunch of logs we don't need!</li>
+<li>Reduce Amazon S3 storage used for logs from 46 GB to 6GB by deleting a bunch of logs we don’t need!</li>
|
||||
</ul>
|
||||
<pre><code># s3cmd ls s3://cgspace.cgiar.org/log/ > /tmp/s3-logs.txt
|
||||
# grep checker.log /tmp/s3-logs.txt | awk '{print $4}' | xargs s3cmd del
|
||||
@@ -184,8 +184,8 @@ UPDATE metadatavalue SET metadata_field_id=203 WHERE metadata_field_id=76
|
||||
UPDATE 51258
|
||||
</code></pre><h2 id="2016-04-08">2016-04-08</h2>
|
||||
<ul>
|
||||
-<li>Discuss metadata renaming with Abenet, we decided it's better to start with the center-specific subjects like ILRI, CIFOR, CCAFS, IWMI, and CPWF</li>
-<li>I've e-mailed CCAFS and CPWF people to ask them how much time it will take for them to update their systems to cope with this change</li>
+<li>Discuss metadata renaming with Abenet, we decided it’s better to start with the center-specific subjects like ILRI, CIFOR, CCAFS, IWMI, and CPWF</li>
+<li>I’ve e-mailed CCAFS and CPWF people to ask them how much time it will take for them to update their systems to cope with this change</li>
|
||||
</ul>
|
||||
<h2 id="2016-04-10">2016-04-10</h2>
|
||||
<ul>
|
||||
@@ -208,7 +208,7 @@ dspacetest=# select count(*) from metadatavalue where metadata_field_id=74 and t
|
||||
<h2 id="2016-04-11">2016-04-11</h2>
|
||||
<ul>
|
||||
<li>The donut is already updated and shows the correct number now</li>
|
||||
-<li>CCAFS people say it will only take them an hour to update their code for the metadata renames, so I proposed we'd do it tentatively on Monday the 18th.</li>
+<li>CCAFS people say it will only take them an hour to update their code for the metadata renames, so I proposed we’d do it tentatively on Monday the 18th.</li>
|
||||
</ul>
|
||||
<h2 id="2016-04-12">2016-04-12</h2>
|
||||
<ul>
|
||||
@@ -217,7 +217,7 @@ dspacetest=# select count(*) from metadatavalue where metadata_field_id=74 and t
|
||||
<pre><code>dspacetest=# select text_value, count(*) from metadatavalue where metadata_field_id=217 group by text_value order by count(*) desc;
|
||||
</code></pre><ul>
|
||||
<li>Listings and Reports is still not returning reliable data for <code>dc.type</code></li>
|
||||
-<li>I think we need to ask Atmire, as their documentation isn't too clear on the format of the filter configs</li>
+<li>I think we need to ask Atmire, as their documentation isn’t too clear on the format of the filter configs</li>
|
||||
<li>Alternatively, I want to see if I move all the data from <code>dc.type.output</code> to <code>dc.type</code> and then re-index, if it behaves better</li>
|
||||
<li>Looking at our <code>input-forms.xml</code> I see we have two sets of ILRI subjects, but one has a few extra subjects</li>
|
||||
<li>Remove one set of ILRI subjects and remove duplicate <code>VALUE CHAINS</code> from existing list (<a href="https://github.com/ilri/DSpace/pull/216">#216</a>)</li>
|
||||
@@ -231,9 +231,9 @@ dspacetest=# select count(*) from metadatavalue where metadata_field_id=74 and t
|
||||
<pre><code>dspacetest=# delete from metadatavalue where resource_type_id=2 and text_value='';
|
||||
DELETE 226
|
||||
</code></pre><ul>
|
||||
-<li>I deleted them on CGSpace but I'll wait to do the re-index as we're going to be doing one in a few days for the metadata changes anyways</li>
+<li>I deleted them on CGSpace but I’ll wait to do the re-index as we’re going to be doing one in a few days for the metadata changes anyways</li>
|
||||
<li>In other news, moving the <code>dc.type.output</code> to <code>dc.type</code> and re-indexing seems to have fixed the Listings and Reports issue from above</li>
|
||||
-<li>Unfortunately this isn't a very good solution, because Listings and Reports config should allow us to filter on <code>dc.type.*</code> but the documentation isn't very clear and I couldn't reach Atmire today</li>
+<li>Unfortunately this isn’t a very good solution, because Listings and Reports config should allow us to filter on <code>dc.type.*</code> but the documentation isn’t very clear and I couldn’t reach Atmire today</li>
|
||||
<li>We want to do the <code>dc.type.output</code> move on CGSpace anyways, but we should wait as it might affect other external people!</li>
|
||||
</ul>
|
||||
<h2 id="2016-04-14">2016-04-14</h2>
|
||||
@@ -289,7 +289,7 @@ UPDATE metadatavalue SET metadata_field_id=217 WHERE metadata_field_id=108
|
||||
UPDATE 46075
|
||||
$ JAVA_OPTS="-Xms512m -Xmx512m -Dfile.encoding=UTF-8" ~/dspace/bin/dspace index-discovery -bf
|
||||
</code></pre><ul>
|
||||
-<li>CGSpace was down but I'm not sure why, this was in <code>catalina.out</code>:</li>
+<li>CGSpace was down but I’m not sure why, this was in <code>catalina.out</code>:</li>
|
||||
</ul>
|
||||
<pre><code>Apr 18, 2016 7:32:26 PM com.sun.jersey.spi.container.ContainerResponse logException
|
||||
SEVERE: Mapped exception to response: 500 (Internal Server Error)
|
||||
@@ -334,14 +334,14 @@ javax.ws.rs.WebApplicationException
|
||||
<pre><code># delete from metadatavalue where resource_type_id=2 and metadata_field_id=96;
|
||||
# delete from metadatavalue where resource_type_id=2 and metadata_field_id=83;
|
||||
</code></pre><ul>
|
||||
-<li>They are old ICRAF fields and we haven't used them since 2011 or so</li>
+<li>They are old ICRAF fields and we haven’t used them since 2011 or so</li>
|
||||
<li>Also delete them from the metadata registry</li>
|
||||
<li>CGSpace went down again, <code>dspace.log</code> had this:</li>
|
||||
</ul>
|
||||
<pre><code>2016-04-19 15:02:17,025 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL connection Error -
|
||||
org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
|
||||
</code></pre><ul>
|
||||
-<li>I restarted Tomcat and PostgreSQL and now it's back up</li>
+<li>I restarted Tomcat and PostgreSQL and now it’s back up</li>
|
||||
<li>I bet this is the same crash as yesterday, but I only saw the errors in <code>catalina.out</code></li>
|
||||
<li>Looks to be related to this, from <code>dspace.log</code>:</li>
|
||||
</ul>
|
||||
@@ -383,24 +383,24 @@ UPDATE 46075
|
||||
<pre><code>$ grep -c "Aborting context in finally statement" dspace.log.2016-04-20
|
||||
21252
|
||||
</code></pre><ul>
|
||||
-<li>I found a recent discussion on the DSpace mailing list and I've asked for advice there</li>
-<li>Looks like this issue was noted and fixed in DSpace 5.5 (we're on 5.1): <a href="https://jira.duraspace.org/browse/DS-2936">https://jira.duraspace.org/browse/DS-2936</a></li>
-<li>I've sent a message to Atmire asking about compatibility with DSpace 5.5</li>
+<li>I found a recent discussion on the DSpace mailing list and I’ve asked for advice there</li>
+<li>Looks like this issue was noted and fixed in DSpace 5.5 (we’re on 5.1): <a href="https://jira.duraspace.org/browse/DS-2936">https://jira.duraspace.org/browse/DS-2936</a></li>
+<li>I’ve sent a message to Atmire asking about compatibility with DSpace 5.5</li>
|
||||
</ul>
|
||||
<h2 id="2016-04-21">2016-04-21</h2>
|
||||
<ul>
|
||||
<li>Fix a bunch of metadata consistency issues with IITA Journal Articles (Peer review, Formally published, messed up DOIs, etc)</li>
|
||||
-<li>Atmire responded with DSpace 5.5 compatible versions for their modules, so I'll start testing those in a few weeks</li>
+<li>Atmire responded with DSpace 5.5 compatible versions for their modules, so I’ll start testing those in a few weeks</li>
|
||||
</ul>
|
||||
<h2 id="2016-04-22">2016-04-22</h2>
|
||||
<ul>
|
||||
-<li>Import 95 records into <a href="https://cgspace.cgiar.org/handle/10568/42219">CTA's Agrodok collection</a></li>
+<li>Import 95 records into <a href="https://cgspace.cgiar.org/handle/10568/42219">CTA’s Agrodok collection</a></li>
|
||||
</ul>
|
||||
<h2 id="2016-04-26">2016-04-26</h2>
|
||||
<ul>
|
||||
<li>Test embargo during item upload</li>
|
||||
<li>Seems to be working but the help text is misleading as to the date format</li>
|
||||
-<li>It turns out the <code>robots.txt</code> issue we thought we solved last month isn't solved because you can't use wildcards in URL patterns: <a href="https://jira.duraspace.org/browse/DS-2962">https://jira.duraspace.org/browse/DS-2962</a></li>
+<li>It turns out the <code>robots.txt</code> issue we thought we solved last month isn’t solved because you can’t use wildcards in URL patterns: <a href="https://jira.duraspace.org/browse/DS-2962">https://jira.duraspace.org/browse/DS-2962</a></li>
|
||||
<li>Write some nginx rules to add <code>X-Robots-Tag</code> HTTP headers to the dynamic requests from <code>robots.txt</code> instead</li>
|
||||
<li>A few URLs to test with:
|
||||
<ul>
|
||||
@@ -449,17 +449,17 @@ dspace.log.2016-04-27:7271
|
||||
<li>Add Spanish XMLUI strings so those users see “CGSpace” instead of “DSpace” in the user interface (<a href="https://github.com/ilri/DSpace/pull/222">#222</a>)</li>
|
||||
<li>Submit patch to upstream DSpace for the misleading help text in the embargo step of the item submission: <a href="https://jira.duraspace.org/browse/DS-3172">https://jira.duraspace.org/browse/DS-3172</a></li>
|
||||
<li>Update infrastructure playbooks for nginx 1.10.x (stable) release: <a href="https://github.com/ilri/rmg-ansible-public/issues/32">https://github.com/ilri/rmg-ansible-public/issues/32</a></li>
|
||||
-<li>Currently running on DSpace Test, we'll give it a few days before we adjust CGSpace</li>
-<li>CGSpace down, restarted tomcat and it's back up</li>
+<li>Currently running on DSpace Test, we’ll give it a few days before we adjust CGSpace</li>
+<li>CGSpace down, restarted tomcat and it’s back up</li>
|
||||
</ul>
|
||||
<h2 id="2016-04-28">2016-04-28</h2>
|
||||
<ul>
|
||||
-<li>Problems with stability again. I've blocked access to <code>/rest</code> for now to see if the number of errors in the log files drop</li>
+<li>Problems with stability again. I’ve blocked access to <code>/rest</code> for now to see if the number of errors in the log files drop</li>
|
||||
<li>Later we could maybe start logging access to <code>/rest</code> and perhaps whitelist some IPs…</li>
|
||||
</ul>
|
||||
<h2 id="2016-04-30">2016-04-30</h2>
|
||||
<ul>
|
||||
-<li>Logs for today and yesterday have zero references to this REST error, so I'm going to open back up the REST API but log all requests</li>
+<li>Logs for today and yesterday have zero references to this REST error, so I’m going to open back up the REST API but log all requests</li>
|
||||
</ul>
|
||||
<pre><code>location /rest {
|
||||
access_log /var/log/nginx/rest.log;
|