Add notes for 2019-12-17
@@ -45,7 +45,7 @@ Generate list of authors on CGSpace for Peter to go through and correct:
 dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors.csv with csv;
 COPY 54701
 "/>
-<meta name="generator" content="Hugo 0.60.1" />
+<meta name="generator" content="Hugo 0.61.0" />
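<ul>
<li>The nested subquery can be sanity checked on its own first (a sketch, assuming the stock DSpace 5 metadata registry):</li>
</ul>
<pre><code>$ psql -d dspace -c "SELECT metadata_field_id FROM metadatafieldregistry WHERE element = 'contributor' AND qualifier = 'author';"
</code></pre>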
@@ -126,11 +126,11 @@ COPY 54701
 </p>
 </header>
-<h2 id="20171101">2017-11-01</h2>
+<h2 id="2017-11-01">2017-11-01</h2>
 <ul>
 <li>The CORE developers responded to say they are looking into their bot not respecting our robots.txt</li>
 </ul>
-<h2 id="20171102">2017-11-02</h2>
+<h2 id="2017-11-02">2017-11-02</h2>
 <ul>
 <li>Today there have been no hits by CORE and no alerts from Linode (coincidence?)</li>
 </ul>
@@ -156,12 +156,12 @@ COPY 54701
 <li>Also, some dates with completely invalid formats, like “2010- 06” and “2011-3-28”</li>
 <li>I also collapsed some consecutive whitespace on a handful of fields</li>
 </ul>
-<h2 id="20171103">2017-11-03</h2>
+<h2 id="2017-11-03">2017-11-03</h2>
 <ul>
 <li>Atmire got back to us to say that they estimate it will take two days of labor to implement the change to Listings and Reports</li>
 <li>I said I'd ask Abenet if she wants that feature</li>
 </ul>
-<h2 id="20171104">2017-11-04</h2>
+<h2 id="2017-11-04">2017-11-04</h2>
 <ul>
 <li>I finished looking through Sisay's CIAT records for the “Alianzas de Aprendizaje” data</li>
 <li>I corrected about half of the authors to standardize them</li>
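<ul>
<li>A rough way to pull out the invalid dates for review might be a regex filter on the issue date field (sketch only, assuming it is registered as element <code>date</code>, qualifier <code>issued</code>):</li>
</ul>
<pre><code>$ psql -d dspace -c "SELECT DISTINCT text_value FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id IN (SELECT metadata_field_id FROM metadatafieldregistry WHERE element = 'date' AND qualifier = 'issued') AND NOT text_value ~ '^[0-9]{4}(-[0-9]{2}(-[0-9]{2})?)?$';"
</code></pre>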
@@ -198,7 +198,7 @@ COPY 54701
 </code></pre><ul>
 <li>For now I don't know what this user is!</li>
 </ul>
-<h2 id="20171105">2017-11-05</h2>
+<h2 id="2017-11-05">2017-11-05</h2>
 <ul>
 <li>Peter asked if I could fix the appearance of “International Livestock Research Institute” in the author lookup during item submission</li>
 <li>It looks to be just an issue with the user interface expecting authors to have both a first and last name:</li>
@@ -226,7 +226,7 @@ COPY 54701
 <li>This guide shows how to <a href="https://geekflare.com/enable-jmx-tomcat-to-monitor-administer/">enable JMX in Tomcat</a> by modifying <code>CATALINA_OPTS</code></li>
 <li>I was able to successfully connect to my local Tomcat with jconsole!</li>
 </ul>
-<h2 id="20171107">2017-11-07</h2>
+<h2 id="2017-11-07">2017-11-07</h2>
 <ul>
 <li>CGSpace went down and up a few times this morning, first around 3AM, then around 7</li>
 <li>Tsega had to restart Tomcat 7 to fix it temporarily</li>
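<ul>
<li>Roughly what the <code>CATALINA_OPTS</code> additions look like for a local JMX test (sketch; the port and the disabled auth/SSL are only sensible on localhost):</li>
</ul>
<pre><code># in Tomcat's bin/setenv.sh, local testing only
CATALINA_OPTS="$CATALINA_OPTS -Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=9010 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false"

# then connect with jconsole
$ jconsole localhost:9010
</code></pre>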
@@ -464,7 +464,7 @@ $ grep -Io -E 'session_id=[A-Z0-9]{32}:ip_addr=104.196.152.243' dspace.log.2017-
 </ul>
 <pre><code># grep "Baiduspider/2.0" /var/log/nginx/access.log | awk '{print $1}' | sort -n | uniq | wc -l
 164
-</code></pre><h2 id="20171108">2017-11-08</h2>
+</code></pre><h2 id="2017-11-08">2017-11-08</h2>
 <ul>
 <li>Linode sent several alerts last night about CPU usage and outbound traffic rate at 6:13PM</li>
 <li>Linode sent another alert about CPU usage in the morning at 6:12AM</li>
@@ -526,7 +526,7 @@ proxy_set_header User-Agent $ua;
 <li>Run system updates on CGSpace and reboot the server</li>
 <li>Re-deploy latest <code>5_x-prod</code> branch on CGSpace and DSpace Test (includes the clickable thumbnails, CCAFS phase II project tags, and updated news text)</li>
 </ul>
-<h2 id="20171109">2017-11-09</h2>
+<h2 id="2017-11-09">2017-11-09</h2>
 <ul>
 <li>Awesome, it seems my bot mapping stuff in nginx actually reduced the number of Tomcat sessions used by the CIAT scraper today, total requests and unique sessions:</li>
 </ul>
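<ul>
<li>One way to compare the two, assuming 104.196.152.243 is still the scraper's address (sketch):</li>
</ul>
<pre><code># total requests from the scraper in today's nginx log
$ grep -c 104.196.152.243 /var/log/nginx/access.log

# unique DSpace sessions it generated today
$ grep 104.196.152.243 dspace.log.2017-11-09 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
</code></pre>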
@@ -550,13 +550,13 @@ $ grep 104.196.152.243 dspace.log.2017-11-07 | grep -o -E 'session_id=[A-Z0-9]{3
 <li>This gets me thinking, I wonder if I can use something like nginx's rate limiter to automatically change the user agent of clients who make too many requests</li>
 <li>Perhaps using a combination of geo and map, like illustrated here: <a href="https://www.nginx.com/blog/rate-limiting-nginx/">https://www.nginx.com/blog/rate-limiting-nginx/</a></li>
 </ul>
-<h2 id="20171111">2017-11-11</h2>
+<h2 id="2017-11-11">2017-11-11</h2>
 <ul>
 <li>I was looking at the Google index and noticed there are 4,090 search results for dspace.ilri.org but only seven for mahider.ilri.org</li>
 <li>Search with something like: inurl:dspace.ilri.org inurl:https</li>
 <li>I want to get rid of those legacy domains eventually!</li>
 </ul>
-<h2 id="20171112">2017-11-12</h2>
+<h2 id="2017-11-12">2017-11-12</h2>
 <ul>
 <li>Update the <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure templates</a> to be a little more modular and flexible</li>
 <li>Looking at the top client IPs on CGSpace so far this morning, even though it's only been eight hours:</li>
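<ul>
<li>The usual pipeline for a quick look at the top clients (sketch, assuming the combined log format with the client IP in the first field):</li>
</ul>
<pre><code># top ten client IPs by request count so far today
$ awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -n 10
</code></pre>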
@@ -630,7 +630,7 @@ Server: nginx
 <li>The first request works, the second is denied with an HTTP 503!</li>
 <li>I need to remember to check the Munin graphs for PostgreSQL and JVM next week to see how this affects them</li>
 </ul>
-<h2 id="20171113">2017-11-13</h2>
+<h2 id="2017-11-13">2017-11-13</h2>
 <ul>
 <li>At the end of the day I checked the logs and it really looks like the Baidu rate limiting is working, HTTP 200 vs 503:</li>
 </ul>
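<ul>
<li>A quick tally of Baidu's status codes from the access log (sketch, assuming the combined log format with the status code in the ninth field):</li>
</ul>
<pre><code>$ grep "Baiduspider" /var/log/nginx/access.log | awk '{print $9}' | sort | uniq -c | sort -rn
</code></pre>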
@@ -659,7 +659,7 @@ Server: nginx
 <li>After uploading and looking at the data in DSpace Test I saw more errors with CRPs, subjects (one item had four copies of all of its subjects, another had a “.” in it), affiliations, sponsors, etc.</li>
 <li>Atmire responded to the <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=510">ticket about ORCID stuff</a> a few days ago, today I told them that I need to talk to Peter and the partners to see what we would like to do</li>
 </ul>
-<h2 id="20171114">2017-11-14</h2>
+<h2 id="2017-11-14">2017-11-14</h2>
 <ul>
 <li>Deploy some nginx configuration updates to CGSpace</li>
 <li>They had been waiting on a branch for a few months and I think I just forgot about them</li>
@@ -674,13 +674,13 @@ dspace6=# CREATE EXTENSION pgcrypto;
 <li>I'm not sure if we can use separate profiles like we did before with <code>mvn -Denv=blah</code> to use blah.properties</li>
 <li>It seems we need to use “system properties” to override settings, ie: <code>-Ddspace.dir=/Users/aorth/dspace6</code></li>
 </ul>
-<h2 id="20171115">2017-11-15</h2>
+<h2 id="2017-11-15">2017-11-15</h2>
 <ul>
 <li>Send Adam Hunt an invite to the DSpace Developers network on Yammer</li>
 <li>He is the new head of communications at WLE, since Michael left</li>
 <li>Merge changes to item view's wording of link metadata (<a href="https://github.com/ilri/DSpace/pull/348">#348</a>)</li>
 </ul>
-<h2 id="20171117">2017-11-17</h2>
+<h2 id="2017-11-17">2017-11-17</h2>
 <ul>
 <li>Uptime Robot said that CGSpace went down today and I see lots of <code>Timeout waiting for idle object</code> errors in the DSpace logs</li>
 <li>I looked in PostgreSQL using <code>SELECT * FROM pg_stat_activity;</code> and saw that there were 73 active connections</li>
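<ul>
<li>It might be handier to count connections by state instead of eyeballing the full <code>pg_stat_activity</code> output, something like (sketch):</li>
</ul>
<pre><code>$ psql -d dspace -c "SELECT state, count(*) AS count FROM pg_stat_activity GROUP BY state ORDER BY count DESC;"
</code></pre>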
@@ -724,7 +724,7 @@ dspace6=# CREATE EXTENSION pgcrypto;
 <ul>
 <li>Switch DSpace Test to using the G1GC for JVM so I can see what the JVM graph looks like eventually, and start evaluating it for production</li>
 </ul>
-<h2 id="20171119">2017-11-19</h2>
+<h2 id="2017-11-19">2017-11-19</h2>
 <ul>
 <li>Linode sent an alert that CGSpace was using a lot of CPU around 4–6 AM</li>
 <li>Looking in the nginx access logs I see the most active XMLUI users between 4 and 6 AM:</li>
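<ul>
<li>A filter for that window could look like this (sketch, assuming the default nginx <code>$time_local</code> format):</li>
</ul>
<pre><code># most active clients between 04:00 and 05:59 on 19 November
$ grep -E '19/Nov/2017:0[45]:' /var/log/nginx/access.log | awk '{print $1}' | sort | uniq -c | sort -rn | head -n 10
</code></pre>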
@@ -762,18 +762,18 @@ $ grep -c com.atmire.utils.UpdateSolrStatsMetadata dspace.log.2017-11-19
 <li>It's been a few days since I enabled the G1GC on DSpace Test and the JVM graph definitely changed:</li>
 </ul>
 <p><img src="/cgspace-notes/2017/11/tomcat-jvm-g1gc.png" alt="Tomcat G1GC"></p>
-<h2 id="20171120">2017-11-20</h2>
+<h2 id="2017-11-20">2017-11-20</h2>
 <ul>
 <li>I found <a href="https://www.cakesolutions.net/teamblogs/low-pause-gc-on-the-jvm">an article about JVM tuning</a> that gives some pointers on how to enable GC logging and tools that analyze the logs for you</li>
 <li>Also notes on <a href="https://blog.gceasy.io/2016/11/15/rotating-gc-log-files/">rotating GC logs</a></li>
 <li>I decided to switch DSpace Test back to the CMS garbage collector because it is designed for low pauses and high throughput (like G1GC!) and because we haven't even tried to monitor or tune it</li>
 </ul>
-<h2 id="20171121">2017-11-21</h2>
+<h2 id="2017-11-21">2017-11-21</h2>
 <ul>
 <li>Magdalena was having problems logging in via LDAP and it seems to be a problem with the CGIAR LDAP server:</li>
 </ul>
 <pre><code>2017-11-21 11:11:09,621 WARN org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=2FEC0E5286C17B6694567FFD77C3171C:ip_addr=77.241.141.58:ldap_authentication:type=failed_auth javax.naming.CommunicationException\colon; simple bind failed\colon; svcgroot2.cgiarad.org\colon;3269 [Root exception is javax.net.ssl.SSLHandshakeException\colon; sun.security.validator.ValidatorException\colon; PKIX path validation failed\colon; java.security.cert.CertPathValidatorException\colon; validity check failed]
-</code></pre><h2 id="20171122">2017-11-22</h2>
+</code></pre><h2 id="2017-11-22">2017-11-22</h2>
 <ul>
 <li>Linode sent an alert that the CPU usage on the CGSpace server was very high around 4 to 6 AM</li>
 <li>The logs don't show anything particularly abnormal between those hours:</li>
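<ul>
<li>The “validity check failed” part of that PKIX error suggests the certificate on the LDAP global catalog port has expired; one way to check from the command line (sketch):</li>
</ul>
<pre><code># show the validity dates of the certificate presented on port 3269
$ echo | openssl s_client -connect svcgroot2.cgiarad.org:3269 2>/dev/null | openssl x509 -noout -dates
</code></pre>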
@@ -794,7 +794,7 @@ $ grep -c com.atmire.utils.UpdateSolrStatsMetadata dspace.log.2017-11-19
 <li>In other news, it looks like the JVM garbage collection pattern is back to its standard jigsaw pattern after switching back to CMS a few days ago:</li>
 </ul>
 <p><img src="/cgspace-notes/2017/11/tomcat-jvm-cms.png" alt="Tomcat JVM with CMS GC"></p>
-<h2 id="20171123">2017-11-23</h2>
+<h2 id="2017-11-23">2017-11-23</h2>
 <ul>
 <li>Linode alerted again that CPU usage was high on CGSpace from 4:13 to 6:13 AM</li>
 <li>I see a lot of Googlebot (66.249.66.90) in the XMLUI access logs</li>
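<ul>
<li>For reference, switching collectors and turning on rotated GC logs is just a matter of JVM flags in Tomcat's <code>JAVA_OPTS</code> (a sketch with assumed paths and sizes, not our production settings):</li>
</ul>
<pre><code># CMS, which DSpace Test is back on now; swap for -XX:+UseG1GC to try G1 again
JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC"

# basic GC logging with rotation (Java 7/8 style flags)
JAVA_OPTS="$JAVA_OPTS -verbose:gc -Xloggc:/var/log/tomcat7/gc.log \
  -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
  -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=10M"
</code></pre>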
@@ -838,7 +838,7 @@ $ grep -c com.atmire.utils.UpdateSolrStatsMetadata dspace.log.2017-11-19
 <li>Apparently setting <code>random_page_cost</code> to 1 is “common” advice for systems running PostgreSQL on SSD (the default is 4)</li>
 <li>So I deployed this on DSpace Test and will check the Munin PostgreSQL graphs in a few days to see if anything changes</li>
 </ul>
-<h2 id="20171124">2017-11-24</h2>
+<h2 id="2017-11-24">2017-11-24</h2>
 <ul>
 <li>It's too early to tell for sure, but after I made the <code>random_page_cost</code> change on DSpace Test's PostgreSQL yesterday the number of connections dropped drastically:</li>
 </ul>
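<ul>
<li>The change itself is tiny; a sketch using <code>ALTER SYSTEM</code>, though it could just as well go into <code>postgresql.conf</code> via the Ansible templates:</li>
</ul>
<pre><code>$ psql -U postgres -c "ALTER SYSTEM SET random_page_cost = 1;"
$ psql -U postgres -c "SELECT pg_reload_conf();"
$ psql -U postgres -c "SHOW random_page_cost;"
</code></pre>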
@@ -857,7 +857,7 @@ $ grep -c com.atmire.utils.UpdateSolrStatsMetadata dspace.log.2017-11-19
 </code></pre><ul>
 <li>I should probably tell CGIAR people to have CGNET stop that</li>
 </ul>
-<h2 id="20171126">2017-11-26</h2>
+<h2 id="2017-11-26">2017-11-26</h2>
 <ul>
 <li>Linode alerted that the CGSpace server was using too much CPU from 5:18 to 7:18 AM</li>
 <li>Yet another mystery because the load for all domains looks fine at that time:</li>
@@ -873,7 +873,7 @@ $ grep -c com.atmire.utils.UpdateSolrStatsMetadata dspace.log.2017-11-19
   298 157.55.39.206
   379 66.249.66.70
  1855 66.249.66.90
-</code></pre><h2 id="20171129">2017-11-29</h2>
+</code></pre><h2 id="2017-11-29">2017-11-29</h2>
 <ul>
 <li>Linode alerted that CGSpace was using 279% CPU from 6 to 8 AM this morning</li>
 <li>About an hour later Uptime Robot said that the server was down</li>
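<ul>
<li>Those ranges look like Bing (157.55.x) and Google (66.249.x) crawlers; reverse DNS is a quick sanity check (sketch):</li>
</ul>
<pre><code># Googlebot addresses should resolve under googlebot.com, Bing's under search.msn.com
$ host 66.249.66.90
$ host 157.55.39.206
</code></pre>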
@@ -911,7 +911,7 @@ $ cat dspace.log.2017-11-28 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | u
 <li>I will bump DSpace's <code>db.maxconnections</code> from 60 to 90, and PostgreSQL's <code>max_connections</code> from 183 to 273 (which is using my loose formula of 90 * webapps + 3)</li>
 <li>I really need to figure out how to get DSpace to use a PostgreSQL connection pool</li>
 </ul>
-<h2 id="20171130">2017-11-30</h2>
+<h2 id="2017-11-30">2017-11-30</h2>
 <ul>
 <li>Linode alerted about high CPU usage on CGSpace again around 6 to 8 AM</li>
 <li>Then Uptime Robot said CGSpace was down a few minutes later, but it resolved itself I think (or Tsega restarted Tomcat, I don't know)</li>
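<ul>
<li>Sanity checking the loose formula: three webapps at 90 connections each, plus a small buffer, gives the 273 above (sketch):</li>
</ul>
<pre><code>$ echo $((90 * 3 + 3))
273
</code></pre>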