mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2021-09-13
@ -46,7 +46,7 @@ I see thousands of them in the logs for the last few months, so it’s not r
|
||||
I’ve raised a ticket with Atmire to ask
|
||||
Another worrying error from dspace.log is:
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.87.0" />
|
||||
<meta name="generator" content="Hugo 0.88.1" />
|
||||
@ -137,7 +137,7 @@ Another worrying error from dspace.log is:
|
||||
<li>CGSpace was down for five hours in the morning while I was sleeping</li>
|
||||
<li>While looking in the logs for errors, I see tons of warnings about Atmire MQM:</li>
|
||||
</ul>
|
||||
<pre><code>2016-12-02 03:00:32,352 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=CREATE, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632305, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
|
||||
<pre tabindex="0"><code>2016-12-02 03:00:32,352 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=CREATE, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632305, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
|
||||
2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY_METADATA, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632309, dispatcher=1544803905, detail="dc.title", transactionID="TX157907838689377964651674089851855413607")
|
||||
2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=ITEM, SubjectID=80044, ObjectType=BUNDLE, ObjectID=70316, TimeStamp=1480647632311, dispatcher=1544803905, detail="THUMBNAIL", transactionID="TX157907838689377964651674089851855413607")
|
||||
2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=BUNDLE, SubjectID=70316, ObjectType=BITSTREAM, ObjectID=86715, TimeStamp=1480647632318, dispatcher=1544803905, detail="-1", transactionID="TX157907838689377964651674089851855413607")
|
||||
@ -147,7 +147,7 @@ Another worrying error from dspace.log is:
|
||||
<li>I’ve raised a ticket with Atmire to ask</li>
|
||||
<li>Another worrying error from dspace.log is:</li>
|
||||
</ul>
|
||||
<pre><code>org.springframework.web.util.NestedServletException: Handler processing failed; nested exception is java.lang.NoSuchMethodError: com.atmire.statistics.generator.DSpaceObjectDatasetGenerator.toDatasetQuery(Lorg/dspace/core/Context;)Lcom/atmire/statistics/content/DatasetQuery;
|
||||
<pre tabindex="0"><code>org.springframework.web.util.NestedServletException: Handler processing failed; nested exception is java.lang.NoSuchMethodError: com.atmire.statistics.generator.DSpaceObjectDatasetGenerator.toDatasetQuery(Lorg/dspace/core/Context;)Lcom/atmire/statistics/content/DatasetQuery;
|
||||
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:972)
|
||||
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)
|
||||
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882)
|
||||
@ -236,13 +236,13 @@ Caused by: java.lang.NoSuchMethodError: com.atmire.statistics.generator.DSpaceOb
|
||||
</code></pre><ul>
|
||||
<li>The first error I see in dspace.log this morning is:</li>
|
||||
</ul>
|
||||
<pre><code>2016-12-02 03:00:46,656 ERROR org.dspace.authority.AuthorityValueFinder @ anonymous::Error while retrieving AuthorityValue from solr:query\colon; id\colon;"b0b541c1-ec15-48bf-9209-6dbe8e338cdc"
|
||||
<pre tabindex="0"><code>2016-12-02 03:00:46,656 ERROR org.dspace.authority.AuthorityValueFinder @ anonymous::Error while retrieving AuthorityValue from solr:query\colon; id\colon;"b0b541c1-ec15-48bf-9209-6dbe8e338cdc"
|
||||
org.apache.solr.client.solrj.SolrServerException: Server refused connection at: http://localhost:8081/solr/authority
|
||||
</code></pre><ul>
|
||||
<li>Looking through DSpace’s solr log I see that about 20 seconds before this, there were a few 30+ KiB solr queries</li>
|
||||
<li>The last logs here right before Solr became unresponsive (and right after I restarted it five hours later) were:</li>
|
||||
</ul>
|
||||
<pre><code>2016-12-02 03:00:42,606 INFO org.apache.solr.core.SolrCore @ [statistics] webapp=/solr path=/select params={q=containerItem:72828+AND+type:0&shards=localhost:8081/solr/statistics-2010,localhost:8081/solr/statistics&fq=-isInternal:true&fq=-(author_mtdt:"CGIAR\+Institutional\+Learning\+and\+Change\+Initiative"++AND+subject_mtdt:"PARTNERSHIPS"+AND+subject_mtdt:"RESEARCH"+AND+subject_mtdt:"AGRICULTURE"+AND+subject_mtdt:"DEVELOPMENT"++AND+iso_mtdt:"en"+)&rows=0&wt=javabin&version=2} hits=0 status=0 QTime=19
|
||||
<pre tabindex="0"><code>2016-12-02 03:00:42,606 INFO org.apache.solr.core.SolrCore @ [statistics] webapp=/solr path=/select params={q=containerItem:72828+AND+type:0&shards=localhost:8081/solr/statistics-2010,localhost:8081/solr/statistics&fq=-isInternal:true&fq=-(author_mtdt:"CGIAR\+Institutional\+Learning\+and\+Change\+Initiative"++AND+subject_mtdt:"PARTNERSHIPS"+AND+subject_mtdt:"RESEARCH"+AND+subject_mtdt:"AGRICULTURE"+AND+subject_mtdt:"DEVELOPMENT"++AND+iso_mtdt:"en"+)&rows=0&wt=javabin&version=2} hits=0 status=0 QTime=19
|
||||
2016-12-02 08:28:23,908 INFO org.apache.solr.servlet.SolrDispatchFilter @ SolrDispatchFilter.init()
|
||||
</code></pre><ul>
|
||||
<li>DSpace’s own Solr logs don’t give IP addresses, so I will have to enable Nginx’s logging of <code>/solr</code> so I can see where this request came from</li>
|
||||
@ -255,7 +255,7 @@ org.apache.solr.client.solrj.SolrServerException: Server refused connection at:
|
||||
<li>I got a weird report from the CGSpace checksum checker this morning</li>
|
||||
<li>It says 732 bitstreams have potential issues, for example:</li>
|
||||
</ul>
|
||||
<pre><code>------------------------------------------------
|
||||
<pre tabindex="0"><code>------------------------------------------------
|
||||
Bitstream Id = 6
|
||||
Process Start Date = Dec 4, 2016
|
||||
Process End Date = Dec 4, 2016
|
||||
@ -278,7 +278,7 @@ Result = The bitstream could not be found
|
||||
<li>For what it’s worth, there is no item on DSpace Test or S3 backups with that checksum either…</li>
|
||||
<li>In other news, I’m looking at JVM settings from the Solr 4.10.2 release, from <code>bin/solr.in.sh</code>:</li>
|
||||
</ul>
|
||||
<pre><code># These GC settings have shown to work well for a number of common Solr workloads
|
||||
<pre tabindex="0"><code># These GC settings have shown to work well for a number of common Solr workloads
|
||||
GC_TUNE="-XX:-UseSuperWord \
|
||||
-XX:NewRatio=3 \
|
||||
-XX:SurvivorRatio=4 \
|
||||
@ -311,7 +311,7 @@ GC_TUNE="-XX:-UseSuperWord \
|
||||
<li>Atmire responded about the MQM warnings in the DSpace logs</li>
|
||||
<li>Apparently we need to change the batch edit consumers in <code>dspace/config/dspace.cfg</code>:</li>
|
||||
</ul>
|
||||
<pre><code>event.consumer.batchedit.filters = Community|Collection+Create
|
||||
<pre tabindex="0"><code>event.consumer.batchedit.filters = Community|Collection+Create
|
||||
</code></pre><ul>
|
||||
<li>I haven’t tested it yet, but I created a pull request: <a href="https://github.com/ilri/DSpace/pull/289">#289</a></li>
|
||||
</ul>
|
||||
@ -319,7 +319,7 @@ GC_TUNE="-XX:-UseSuperWord \
|
||||
<ul>
|
||||
<li>Some author authority corrections and name standardizations for Peter:</li>
|
||||
</ul>
|
||||
<pre><code>dspace=# update metadatavalue set authority='b041f2f4-19e7-4113-b774-0439baabd197', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Mora Benard%';
|
||||
<pre tabindex="0"><code>dspace=# update metadatavalue set authority='b041f2f4-19e7-4113-b774-0439baabd197', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Mora Benard%';
|
||||
UPDATE 11
|
||||
dspace=# update metadatavalue set text_value = 'Hoek, Rein van der', authority='4d6cbce2-6fd5-4b43-9363-58d18e7952c9', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Hoek, R%';
|
||||
UPDATE 36
|
||||
@ -343,7 +343,7 @@ UPDATE 561
|
||||
<li>The docs say a good starting point for a dedicated server is 25% of the system RAM, and our server isn’t dedicated (also runs Solr, which can benefit from OS cache) so let’s try 1024MB</li>
|
||||
<li>In other news, the authority reindexing keeps crashing (I was manually running it after the author updates above):</li>
|
||||
</ul>
|
||||
<pre><code>$ time JAVA_OPTS="-Xms768m -Xmx768m -Dfile.encoding=UTF-8" /home/dspacetest.cgiar.org/bin/dspace index-authority
|
||||
<pre tabindex="0"><code>$ time JAVA_OPTS="-Xms768m -Xmx768m -Dfile.encoding=UTF-8" /home/dspacetest.cgiar.org/bin/dspace index-authority
|
||||
Retrieving all data
|
||||
Initialize org.dspace.authority.indexer.DSpaceAuthorityIndexer
|
||||
Exception: null
|
||||
@ -376,7 +376,7 @@ sys 0m22.647s
|
||||
<li>For example, do a Solr query for “first_name:Grace” and look at the results</li>
|
||||
<li>Querying that ID shows the fields that need to be changed:</li>
|
||||
</ul>
|
||||
<pre><code>{
|
||||
<pre tabindex="0"><code>{
|
||||
"responseHeader": {
|
||||
"status": 0,
|
||||
"QTime": 1,
|
||||
@ -409,7 +409,7 @@ sys 0m22.647s
|
||||
<li>I think I can just update the <code>value</code>, <code>first_name</code>, and <code>last_name</code> fields…</li>
|
||||
<li>The update syntax should be something like this, but I’m getting errors from Solr:</li>
|
||||
</ul>
|
||||
<pre><code>$ curl 'localhost:8081/solr/authority/update?commit=true&wt=json&indent=true' -H 'Content-type:application/json' -d '[{"id":"1","price":{"set":100}}]'
|
||||
<pre tabindex="0"><code>$ curl 'localhost:8081/solr/authority/update?commit=true&wt=json&indent=true' -H 'Content-type:application/json' -d '[{"id":"1","price":{"set":100}}]'
|
||||
{
|
||||
"responseHeader":{
|
||||
"status":400,
|
||||
@ -421,13 +421,13 @@ sys 0m22.647s
|
||||
<li>When I try using the XML format I get an error that the <code>updateLog</code> needs to be configured for that core</li>
|
||||
<li>Maybe I can just remove the authority UUID from the records, run the indexing again so it creates a new one for each name variant, then match them correctly?</li>
|
||||
</ul>
|
||||
<pre><code>dspace=# update metadatavalue set authority=null, confidence=-1 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Grace, D%';
|
||||
<pre tabindex="0"><code>dspace=# update metadatavalue set authority=null, confidence=-1 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Grace, D%';
|
||||
UPDATE 561
|
||||
</code></pre><ul>
|
||||
<li>Then I’ll reindex discovery and authority and see how the authority Solr core looks</li>
|
||||
<li>After this, now there are authorities for some of the “Grace, D.” and “Grace, Delia” text_values in the database (the first version is actually the same authority that already exists in the core, so it was just added back to some text_values, but the second one is new):</li>
|
||||
</ul>
|
||||
<pre><code>$ curl 'localhost:8081/solr/authority/select?q=id%3A18ea1525-2513-430a-8817-a834cd733fbc&wt=json&indent=true'
|
||||
<pre tabindex="0"><code>$ curl 'localhost:8081/solr/authority/select?q=id%3A18ea1525-2513-430a-8817-a834cd733fbc&wt=json&indent=true'
|
||||
{
|
||||
"responseHeader":{
|
||||
"status":0,
|
||||
@ -453,7 +453,7 @@ UPDATE 561
|
||||
<li>In this case it seems that since there were also two different IDs in the original database, I just picked the wrong one!</li>
|
||||
<li>Better to use:</li>
|
||||
</ul>
|
||||
<pre><code>dspace=# update metadatavalue set text_value='Grace, Delia', authority='bfa61d7c-7583-4175-991c-2e7315000f0c', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Grace, D%';
|
||||
<pre tabindex="0"><code>dspace=# update metadatavalue set text_value='Grace, Delia', authority='bfa61d7c-7583-4175-991c-2e7315000f0c', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Grace, D%';
|
||||
</code></pre><ul>
|
||||
<li>This proves that unifying author name varieties in authorities is easy, but fixing the name in the authority is tricky!</li>
|
||||
<li>Perhaps another way is to just add our own UUID to the authority field for the text_value we like, then re-index authority so they get synced from PostgreSQL to Solr, then set the other text_values to use that authority ID</li>
|
||||
@ -461,7 +461,7 @@ UPDATE 561
|
||||
<li>Deploy “take task” hack/fix on CGSpace (<a href="https://github.com/ilri/DSpace/pull/290">#290</a>)</li>
|
||||
<li>I ran the following author corrections and then reindexed discovery:</li>
|
||||
</ul>
|
||||
<pre><code>update metadatavalue set authority='b041f2f4-19e7-4113-b774-0439baabd197', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Mora Benard%';
|
||||
<pre tabindex="0"><code>update metadatavalue set authority='b041f2f4-19e7-4113-b774-0439baabd197', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Mora Benard%';
|
||||
update metadatavalue set text_value = 'Hoek, Rein van der', authority='4d6cbce2-6fd5-4b43-9363-58d18e7952c9', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Hoek, R%';
|
||||
update metadatavalue set text_value = 'Hoek, Rein van der', authority='4d6cbce2-6fd5-4b43-9363-58d18e7952c9', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like '%an der Hoek%' and text_value !~ '^.*W\.?$';
|
||||
update metadatavalue set authority='18349f29-61b1-44d7-ac60-89e55546e812', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value like 'Thorne, P%';
|
||||
@ -471,7 +471,7 @@ update metadatavalue set text_value='Grace, Delia', authority='bfa61d7c-7583-417
|
||||
<ul>
|
||||
<li>Something weird happened and Peter Thorne’s names all ended up as “Thorne”, I guess because the original authority had that as its name value:</li>
|
||||
</ul>
|
||||
<pre><code>dspace=# select distinct text_value, authority, confidence from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value like 'Thorne%';
|
||||
<pre tabindex="0"><code>dspace=# select distinct text_value, authority, confidence from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value like 'Thorne%';
|
||||
text_value | authority | confidence
|
||||
------------------+--------------------------------------+------------
|
||||
Thorne, P.J. | 18349f29-61b1-44d7-ac60-89e55546e812 | 600
|
||||
@ -484,12 +484,12 @@ update metadatavalue set text_value='Grace, Delia', authority='bfa61d7c-7583-417
|
||||
</code></pre><ul>
|
||||
<li>I generated a new UUID using <code>uuidgen | tr [A-Z] [a-z]</code> and set it along with correct name variation for all records:</li>
|
||||
</ul>
|
||||
<pre><code>dspace=# update metadatavalue set authority='b2f7603d-2fb5-4018-923a-c4ec8d85b3bb', text_value='Thorne, P.J.' where resource_type_id=2 and metadata_field_id=3 and authority='18349f29-61b1-44d7-ac60-89e55546e812';
|
||||
<pre tabindex="0"><code>dspace=# update metadatavalue set authority='b2f7603d-2fb5-4018-923a-c4ec8d85b3bb', text_value='Thorne, P.J.' where resource_type_id=2 and metadata_field_id=3 and authority='18349f29-61b1-44d7-ac60-89e55546e812';
|
||||
UPDATE 43
|
||||
</code></pre><ul>
|
||||
<li>Apparently we also need to normalize Phil Thornton’s names to <code>Thornton, Philip K.</code>:</li>
|
||||
</ul>
|
||||
<pre><code>dspace=# select distinct text_value, authority, confidence from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value ~ '^Thornton[,\.]? P.*';
|
||||
<pre tabindex="0"><code>dspace=# select distinct text_value, authority, confidence from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value ~ '^Thornton[,\.]? P.*';
|
||||
text_value | authority | confidence
|
||||
---------------------+--------------------------------------+------------
|
||||
Thornton, P | 0d8369bb-57f7-4b2f-92aa-af820b183aca | 600
|
||||
@ -506,7 +506,7 @@ UPDATE 43
|
||||
</code></pre><ul>
|
||||
<li>Seems his original authorities are using an incorrect version of the name so I need to generate another UUID and tie it to the correct name, then reindex:</li>
|
||||
</ul>
|
||||
<pre><code>dspace=# update metadatavalue set authority='2df8136e-d8f4-4142-b58c-562337cab764', text_value='Thornton, Philip K.', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value ~ '^Thornton[,\.]? P.*';
|
||||
<pre tabindex="0"><code>dspace=# update metadatavalue set authority='2df8136e-d8f4-4142-b58c-562337cab764', text_value='Thornton, Philip K.', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value ~ '^Thornton[,\.]? P.*';
|
||||
UPDATE 362
|
||||
</code></pre><ul>
|
||||
<li>It seems that, when you are messing with authority and author text values in the database, it is better to run authority reindex first (postgres→solr authority core) and then Discovery reindex (postgres→solr Discovery core)</li>
|
||||
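The reindexing order described above can be sketched as a small shell function. This is only a minimal sketch under stated assumptions, not the procedure actually used on the server: the function name `reindex_after_author_fixes` and the `DSPACE_BIN` variable are hypothetical, `index-authority` is the command shown earlier in these notes, and `index-discovery` is assumed to be the Discovery reindexing command for this DSpace version.

```shell
#!/bin/sh
# Hypothetical helper: reindex in the order suggested in the notes.
# DSPACE_BIN is an assumed variable pointing at the dspace launcher script.
DSPACE_BIN="${DSPACE_BIN:-/home/dspacetest.cgiar.org/bin/dspace}"

reindex_after_author_fixes() {
    # 1. PostgreSQL -> Solr authority core; stop here if it fails
    "$DSPACE_BIN" index-authority || return 1
    # 2. PostgreSQL -> Solr Discovery core (assumed command name)
    "$DSPACE_BIN" index-discovery || return 1
}
```

Calling `reindex_after_author_fixes` after a batch of `metadatavalue` updates would run both steps and stop early if the authority reindex crashes, which keeps Discovery from being rebuilt against a stale authority core.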
@ -520,7 +520,7 @@ UPDATE 362
|
||||
<li>Set PostgreSQL’s <code>shared_buffers</code> on CGSpace to 10% of system RAM (1200MB)</li>
|
||||
<li>Run the following author corrections on CGSpace:</li>
|
||||
</ul>
|
||||
<pre><code>dspace=# update metadatavalue set authority='34df639a-42d8-4867-a3f2-1892075fcb3f', text_value='Thorne, P.J.' where resource_type_id=2 and metadata_field_id=3 and (authority='18349f29-61b1-44d7-ac60-89e55546e812' or authority='021cd183-946b-42bb-964e-522ebff02993');
|
||||
<pre tabindex="0"><code>dspace=# update metadatavalue set authority='34df639a-42d8-4867-a3f2-1892075fcb3f', text_value='Thorne, P.J.' where resource_type_id=2 and metadata_field_id=3 and (authority='18349f29-61b1-44d7-ac60-89e55546e812' or authority='021cd183-946b-42bb-964e-522ebff02993');
|
||||
dspace=# update metadatavalue set authority='2df8136e-d8f4-4142-b58c-562337cab764', text_value='Thornton, Philip K.', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value ~ '^Thornton[,\.]? P.*';
|
||||
</code></pre><ul>
|
||||
<li>The authority IDs were different now than when I was looking a few days ago so I had to adjust them here</li>
|
||||
@ -534,7 +534,7 @@ dspace=# update metadatavalue set authority='2df8136e-d8f4-4142-b58c-562337cab76
|
||||
<ul>
|
||||
<li>Looking at CIAT records from last week again, they have a lot of double authors like:</li>
|
||||
</ul>
|
||||
<pre><code>International Center for Tropical Agriculture::3026b1de-9302-4f3e-85ab-ef48da024eb2::600
|
||||
<pre tabindex="0"><code>International Center for Tropical Agriculture::3026b1de-9302-4f3e-85ab-ef48da024eb2::600
|
||||
International Center for Tropical Agriculture::3026b1de-9302-4f3e-85ab-ef48da024eb2::500
|
||||
International Center for Tropical Agriculture::3026b1de-9302-4f3e-85ab-ef48da024eb2::0
|
||||
</code></pre><ul>
|
||||
@ -542,7 +542,7 @@ International Center for Tropical Agriculture::3026b1de-9302-4f3e-85ab-ef48da024
|
||||
<li>Removing the duplicates in OpenRefine and uploading a CSV to DSpace says “no changes detected”</li>
|
||||
<li>Seems like the only way to sort of clean these up would be to start in SQL:</li>
|
||||
</ul>
|
||||
<pre><code>dspace=# select distinct text_value, authority, confidence from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value like 'International Center for Tropical Agriculture';
|
||||
<pre tabindex="0"><code>dspace=# select distinct text_value, authority, confidence from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value like 'International Center for Tropical Agriculture';
|
||||
text_value | authority | confidence
|
||||
-----------------------------------------------+--------------------------------------+------------
|
||||
International Center for Tropical Agriculture | cc726b78-a2f4-4ee9-af98-855c2ea31c36 | -1
|
||||
@ -577,14 +577,14 @@ UPDATE 35
|
||||
<li>So basically, new cron jobs for logs should look something like this:</li>
|
||||
<li>Find any file named <code>*.log*</code> that isn’t <code>dspace.log*</code>, isn’t already zipped, and is older than one day, and zip it:</li>
|
||||
</ul>
|
||||
<pre><code># find /home/dspacetest.cgiar.org/log -regextype posix-extended -iregex ".*\.log.*" ! -iregex ".*dspace\.log.*" ! -iregex ".*\.(gz|lrz|lzo|xz)" ! -newermt "Yesterday" -exec schedtool -B -e ionice -c2 -n7 xz {} \;
|
||||
<pre tabindex="0"><code># find /home/dspacetest.cgiar.org/log -regextype posix-extended -iregex ".*\.log.*" ! -iregex ".*dspace\.log.*" ! -iregex ".*\.(gz|lrz|lzo|xz)" ! -newermt "Yesterday" -exec schedtool -B -e ionice -c2 -n7 xz {} \;
|
||||
</code></pre><ul>
|
||||
<li>Since there is <code>xzgrep</code> and <code>xzless</code> we can actually just zip them after one day, why not?!</li>
|
||||
<li>We can keep the zipped ones for two weeks just in case we need to look for errors, etc, and delete them after that</li>
|
||||
<li>I use <code>schedtool -B</code> and <code>ionice -c2 -n7</code> to set the CPU scheduling to <code>SCHED_BATCH</code> and the IO to best effort which should, in theory, impact important system processes like Tomcat and PostgreSQL less</li>
|
||||
<li>When the tasks are running you can see that the policies do apply:</li>
|
||||
</ul>
|
||||
<pre><code>$ schedtool $(ps aux | grep "xz /home" | grep -v grep | awk '{print $2}') && ionice -p $(ps aux | grep "xz /home" | grep -v grep | awk '{print $2}')
|
||||
<pre tabindex="0"><code>$ schedtool $(ps aux | grep "xz /home" | grep -v grep | awk '{print $2}') && ionice -p $(ps aux | grep "xz /home" | grep -v grep | awk '{print $2}')
|
||||
PID 17049: PRIO 0, POLICY B: SCHED_BATCH , NICE 0, AFFINITY 0xf
|
||||
best-effort: prio 7
|
||||
</code></pre><ul>
|
||||
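The notes above mention keeping the compressed logs for two weeks and deleting them after that, but only show the compression step. A minimal sketch of the deletion step might look like the following; the function name `prune_old_logs` and its arguments are hypothetical, and it assumes a `find` that supports `-delete` (as GNU find does).

```shell
#!/bin/sh
# Hypothetical helper: delete xz-compressed logs older than a retention period.
# $1 = log directory, $2 = retention in days (defaults to 14, i.e. two weeks)
prune_old_logs() {
    log_dir="$1"
    keep_days="${2:-14}"
    # -mtime +N matches files whose age in whole days is greater than N;
    # -print lists each file that is removed so a cron mail shows what happened
    find "$log_dir" -type f -name '*.xz' -mtime "+$keep_days" -print -delete
}
```

A cron entry could then call something like `prune_old_logs /home/dspacetest.cgiar.org/log`. Only files ending in `.xz` are touched, so uncompressed logs and `dspace.log*` files are left alone.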
@ -594,7 +594,7 @@ best-effort: prio 7
|
||||
<li>Some users pointed out issues with the “most popular” stats on a community or collection</li>
|
||||
<li>This error appears in the logs when you try to view them:</li>
|
||||
</ul>
|
||||
<pre><code>2016-12-13 21:17:37,486 ERROR org.dspace.app.xmlui.cocoon.DSpaceCocoonServletFilter @ Serious Error Occurred Processing Request!
|
||||
<pre tabindex="0"><code>2016-12-13 21:17:37,486 ERROR org.dspace.app.xmlui.cocoon.DSpaceCocoonServletFilter @ Serious Error Occurred Processing Request!
|
||||
org.springframework.web.util.NestedServletException: Handler processing failed; nested exception is java.lang.NoSuchMethodError: com.atmire.statistics.generator.DSpaceObjectDatasetGenerator.toDatasetQuery(Lorg/dspace/core/Context;)Lcom/atmire/statistics/content/DatasetQuery;
|
||||
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:972)
|
||||
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)
|
||||
@ -679,7 +679,7 @@ Caused by: java.lang.NoSuchMethodError: com.atmire.statistics.generator.DSpaceOb
|
||||
<li>None of our users in African institutes will have IPv6, but some Europeans might, so I need to check if any submissions have been added since then</li>
|
||||
<li>Update some names and authorities in the database:</li>
|
||||
</ul>
|
||||
<pre><code>dspace=# update metadatavalue set authority='5ff35043-942e-4d0a-b377-4daed6e3c1a3', confidence=600, text_value='Duncan, Alan' where resource_type_id=2 and metadata_field_id=3 and text_value ~ '^.*Duncan,? A.*';
|
||||
<pre tabindex="0"><code>dspace=# update metadatavalue set authority='5ff35043-942e-4d0a-b377-4daed6e3c1a3', confidence=600, text_value='Duncan, Alan' where resource_type_id=2 and metadata_field_id=3 and text_value ~ '^.*Duncan,? A.*';
|
||||
UPDATE 204
|
||||
dspace=# update metadatavalue set authority='46804b53-ea30-4a85-9ccf-b79a35816fa9', confidence=600, text_value='Mekonnen, Kindu' where resource_type_id=2 and metadata_field_id=3 and text_value like '%Mekonnen, K%';
|
||||
UPDATE 89
|
||||
@ -692,7 +692,7 @@ UPDATE 140
|
||||
<li>Enable OCSP stapling for hosts >= Ubuntu 16.04 in our Ansible playbooks (<a href="https://github.com/ilri/rmg-ansible-public/pull/76">#76</a>)</li>
|
||||
<li>Working for DSpace Test on the second response:</li>
|
||||
</ul>
|
||||
<pre><code>$ openssl s_client -connect dspacetest.cgiar.org:443 -servername dspacetest.cgiar.org -tls1_2 -tlsextdebug -status
|
||||
<pre tabindex="0"><code>$ openssl s_client -connect dspacetest.cgiar.org:443 -servername dspacetest.cgiar.org -tls1_2 -tlsextdebug -status
|
||||
...
|
||||
OCSP response: no response sent
|
||||
$ openssl s_client -connect dspacetest.cgiar.org:443 -servername dspacetest.cgiar.org -tls1_2 -tlsextdebug -status
|
||||
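Since the first handshake above returned no stapled response and the second one did, a check for stapling probably needs to retry. The following is a rough sketch under stated assumptions: the function names `has_stapled_ocsp` and `check_ocsp_stapling` are hypothetical, the `openssl s_client` flags are the ones used above, and the match string `OCSP Response Data:` is taken from the successful output in these notes.

```shell
#!/bin/sh
# Return 0 if the s_client output on stdin contains a stapled OCSP response.
has_stapled_ocsp() {
    grep -q 'OCSP Response Data:'
}

# Hypothetical wrapper: query the host up to $2 times (default 3),
# because the first handshake may report "no response sent".
check_ocsp_stapling() {
    host="$1"
    tries="${2:-3}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # stdin is closed so s_client exits after the handshake
        if openssl s_client -connect "$host:443" -servername "$host" \
            -tls1_2 -tlsextdebug -status </dev/null 2>/dev/null | has_stapled_ocsp; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}
```

For example, `check_ocsp_stapling dspacetest.cgiar.org && echo stapled` would print `stapled` only if one of the attempts includes OCSP response data.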
@ -704,12 +704,12 @@ OCSP Response Data:
|
||||
<li>Migrate CGSpace to new server, roughly following these steps:</li>
|
||||
<li>On old server:</li>
|
||||
</ul>
|
||||
<pre><code># service tomcat7 stop
|
||||
<pre tabindex="0"><code># service tomcat7 stop
|
||||
# /home/backup/scripts/postgres_backup.sh
|
||||
</code></pre><ul>
|
||||
<li>On new server:</li>
|
||||
</ul>
|
||||
<pre><code># systemctl stop tomcat7
|
||||
<pre tabindex="0"><code># systemctl stop tomcat7
|
||||
# rsync -4 -av --delete 178.79.187.182:/home/cgspace.cgiar.org/assetstore/ /home/cgspace.cgiar.org/assetstore/
|
||||
# rsync -4 -av --delete 178.79.187.182:/home/backup/ /home/backup/
|
||||
# rsync -4 -av --delete 178.79.187.182:/home/cgspace.cgiar.org/solr/ /home/cgspace.cgiar.org/solr
|
||||
@ -750,7 +750,7 @@ $ exit
|
||||
<li>Abenet wanted a CSV of the IITA community, but the web export doesn’t include the <code>dc.date.accessioned</code> field</li>
|
||||
<li>I had to export it from the command line using the <code>-a</code> flag:</li>
|
||||
</ul>
|
||||
<pre><code>$ [dspace]/bin/dspace metadata-export -a -f /tmp/iita.csv -i 10568/68616
|
||||
<pre tabindex="0"><code>$ [dspace]/bin/dspace metadata-export -a -f /tmp/iita.csv -i 10568/68616
|
||||
</code></pre><h2 id="2016-12-28">2016-12-28</h2>
|
||||
<ul>
|
||||
<li>We’ve been getting two alerts per day about CPU usage on the new server from Linode</li>
|
||||