mirror of https://github.com/alanorth/cgspace-notes.git (synced 2024-11-21 22:25:02 +01:00)
Add notes for 2018-04-20
parent ec824f22d4
commit 20b80513e4
@ -396,3 +396,88 @@ sys 2m2.687s
```

- This time is with about 70,000 items in the repository

## 2018-04-20

- Gabriela from CIP emailed to say that CGSpace was returning a white page, but I haven't seen any emails from UptimeRobot
- I confirm that it's just giving a white page around 4:16
- The DSpace logs show that there are no database connections:

```
org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-715] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:250; busy:18; idle:0; lastwait:5000].
```

- And there have been shit tons of these errors in today's log (starting only 20 minutes ago, luckily):

```
# grep -c 'org.apache.tomcat.jdbc.pool.PoolExhaustedException' /home/cgspace.cgiar.org/log/dspace.log.2018-04-20
32147
```
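
The raw count does not show when the errors began. A rough sketch for bucketing them per minute, assuming (as in the excerpts in these notes) that each DSpace log entry starts with a `YYYY-MM-DD HH:MM:SS` timestamp and that the exception text may sit on a following line without its own timestamp. The function name `errors_per_minute` is made up for illustration:

```shell
# Hypothetical helper: count PoolExhaustedException lines per minute.
# Assumes each log entry begins with "YYYY-MM-DD HH:MM:SS,mmm" and that the
# exception may be printed on a continuation line without a timestamp.
errors_per_minute() {
    awk '
        # Remember the minute of the most recent timestamped entry
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]/ {
            minute = substr($0, 1, 16)
        }
        # Attribute each exception line to that minute
        /org\.apache\.tomcat\.jdbc\.pool\.PoolExhaustedException/ {
            count[minute]++
        }
        END {
            for (m in count) print m, count[m]
        }
    ' "$1" | sort
}
```

Running `errors_per_minute /home/cgspace.cgiar.org/log/dspace.log.2018-04-20 | head` should then list the first minutes in which the pool was exhausted.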

- I can't even log into PostgreSQL as the `postgres` user, WTF?

```
$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
^C
```

- Here are the most active IPs today:

```
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "20/Apr/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
    917 207.46.13.182
    935 213.55.99.121
    970 40.77.167.134
    978 207.46.13.80
   1422 66.249.64.155
   1577 50.116.102.77
   2456 95.108.181.88
   3216 104.196.152.243
   4325 70.32.83.92
  10718 45.5.184.2
```

- It doesn't even seem like there is a lot of traffic compared to the previous days:

```
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "20/Apr/2018" | wc -l
74931
# zcat --force /var/log/nginx/*.log.1 /var/log/nginx/*.log.2.gz | grep -E "19/Apr/2018" | wc -l
91073
# zcat --force /var/log/nginx/*.log.2.gz /var/log/nginx/*.log.3.gz | grep -E "18/Apr/2018" | wc -l
93459
```
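
The three commands above each read a different set of rotated files. A possible one-pass variant is sketched here, assuming nginx's default combined log format, where the fourth whitespace-separated field looks like `[20/Apr/2018:16:35:09`. The name `requests_per_day` is illustrative, and the final sort is lexical, so it only orders days correctly within a single month:

```shell
# Hypothetical helper: count requests per day from nginx access log lines on stdin.
# Assumes the combined log format, i.e. field 4 is "[DD/Mon/YYYY:HH:MM:SS".
requests_per_day() {
    awk '
        # Strip the leading "[" and keep the 11-character date
        { day = substr($4, 2, 11); count[day]++ }
        END { for (d in count) print count[d], d }
    ' | sort -k 2
}
```

It could be fed the same way as the commands in these notes, for example `zcat --force /var/log/nginx/*.log* | requests_per_day`.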

- I tried to restart Tomcat but `systemctl` hangs
- I tried to reboot the server from the command line but after a few minutes it didn't come back up
- Looking at the Linode console I see that it is stuck trying to shut down
- Even "Reboot" via the Linode console doesn't work!
- After shutting it down a few times via the Linode console it finally rebooted
- Everything is back but I have no idea what caused this; I suspect something with the hosting provider
- Also super weird: the last entry in the DSpace log file is from `2018-04-20 16:35:09`, and then the next one jumps straight to `2018-04-20 19:15:04` (about 2 hours 40 minutes later!):

```
2018-04-20 16:35:09,144 ERROR org.dspace.app.util.AbstractDSpaceWebapp @ Failed to record shutdown in Webapp table.
org.apache.tomcat.jdbc.pool.PoolExhaustedException: [localhost-startStop-2] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:250; busy:18; idle:0; lastwait:5000].
    at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:685)
    at org.apache.tomcat.jdbc.pool.ConnectionPool.getConnection(ConnectionPool.java:187)
    at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:128)
    at org.dspace.storage.rdbms.DatabaseManager.getConnection(DatabaseManager.java:632)
    at org.dspace.core.Context.init(Context.java:121)
    at org.dspace.core.Context.<init>(Context.java:95)
    at org.dspace.app.util.AbstractDSpaceWebapp.deregister(AbstractDSpaceWebapp.java:97)
    at org.dspace.app.util.DSpaceContextListener.contextDestroyed(DSpaceContextListener.java:146)
    at org.apache.catalina.core.StandardContext.listenerStop(StandardContext.java:5115)
    at org.apache.catalina.core.StandardContext.stopInternal(StandardContext.java:5779)
    at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:224)
    at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1588)
    at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1577)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2018-04-20 19:15:04,006 INFO org.dspace.core.ConfigurationManager @ Loading from classloader: file:/home/cgspace.cgiar.org/config/dspace.cfg
```

- Very suspect!
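
A jump like this can also be found mechanically. The following is only a sketch, assuming entries start with `YYYY-MM-DD HH:MM:SS` and that the file covers a single day (as the dated file name `dspace.log.2018-04-20` suggests), so comparing seconds since midnight is enough. `log_gaps` is a hypothetical helper name:

```shell
# Hypothetical helper: report jumps between consecutive timestamped log entries.
# Usage: log_gaps MIN_GAP_SECONDS LOGFILE
# Assumes entries start with "YYYY-MM-DD HH:MM:SS" and all fall on one day.
log_gaps() {
    awk -v min_gap="$1" '
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]/ {
            # Convert HH:MM:SS to seconds since midnight
            split(substr($0, 12, 8), t, ":")
            now = t[1] * 3600 + t[2] * 60 + t[3]
            stamp = substr($0, 1, 19)
            gap = now - prev
            if (seen && gap >= min_gap)
                print prev_stamp " -> " stamp " (" gap "s)"
            prev = now
            prev_stamp = stamp
            seen = 1
        }
    ' "$2"
}
```

For the two entries quoted in these notes, `log_gaps 3600 /home/cgspace.cgiar.org/log/dspace.log.2018-04-20` would report the jump as `2018-04-20 16:35:09 -> 2018-04-20 19:15:04 (9595s)`, which is a little under 2 hours 40 minutes. Stack trace lines are skipped because they do not start with a timestamp.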
@ -21,7 +21,7 @@ Catalina logs at least show some memory errors yesterday:
<meta property="article:published_time" content="2018-04-01T16:13:54+02:00"/>

<meta property="article:modified_time" content="2018-04-19T12:40:52+03:00"/>
<meta property="article:modified_time" content="2018-04-19T14:28:16+03:00"/>

@ -53,9 +53,9 @@ Catalina logs at least show some memory errors yesterday:
"@type": "BlogPosting",
"headline": "April, 2018",
"url": "https://alanorth.github.io/cgspace-notes/2018-04/",
"wordCount": "2148",
"wordCount": "2549",
"datePublished": "2018-04-01T16:13:54+02:00",
"dateModified": "2018-04-19T12:40:52+03:00",
"dateModified": "2018-04-19T14:28:16+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -565,6 +565,99 @@ sys 2m2.687s
<li>This time is with about 70,000 items in the repository</li>
</ul>

<h2 id="2018-04-20">2018-04-20</h2>

<ul>
<li>Gabriela from CIP emailed to say that CGSpace was returning a white page, but I haven’t seen any emails from UptimeRobot</li>
<li>I confirm that it’s just giving a white page around 4:16</li>
<li>The DSpace logs show that there are no database connections:</li>
</ul>

<pre><code>org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-715] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:250; busy:18; idle:0; lastwait:5000].
</code></pre>

<ul>
<li>And there have been shit tons of these errors in today’s log (starting only 20 minutes ago, luckily):</li>
</ul>

<pre><code># grep -c 'org.apache.tomcat.jdbc.pool.PoolExhaustedException' /home/cgspace.cgiar.org/log/dspace.log.2018-04-20
32147
</code></pre>

<ul>
<li>I can’t even log into PostgreSQL as the <code>postgres</code> user, WTF?</li>
</ul>

<pre><code>$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
^C
</code></pre>

<ul>
<li>Here are the most active IPs today:</li>
</ul>

<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "20/Apr/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
    917 207.46.13.182
    935 213.55.99.121
    970 40.77.167.134
    978 207.46.13.80
   1422 66.249.64.155
   1577 50.116.102.77
   2456 95.108.181.88
   3216 104.196.152.243
   4325 70.32.83.92
  10718 45.5.184.2
</code></pre>

<ul>
<li>It doesn’t even seem like there is a lot of traffic compared to the previous days:</li>
</ul>

<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "20/Apr/2018" | wc -l
74931
# zcat --force /var/log/nginx/*.log.1 /var/log/nginx/*.log.2.gz | grep -E "19/Apr/2018" | wc -l
91073
# zcat --force /var/log/nginx/*.log.2.gz /var/log/nginx/*.log.3.gz | grep -E "18/Apr/2018" | wc -l
93459
</code></pre>

<ul>
<li>I tried to restart Tomcat but <code>systemctl</code> hangs</li>
<li>I tried to reboot the server from the command line but after a few minutes it didn’t come back up</li>
<li>Looking at the Linode console I see that it is stuck trying to shut down</li>
<li>Even “Reboot” via Linode console doesn’t work!</li>
<li>After shutting it down a few times via the Linode console it finally rebooted</li>
<li>Everything is back but I have no idea what caused this; I suspect something with the hosting provider</li>
<li>Also super weird: the last entry in the DSpace log file is from <code>2018-04-20 16:35:09</code>, and then the next one jumps straight to <code>2018-04-20 19:15:04</code> (about 2 hours 40 minutes later!):</li>
</ul>

<pre><code>2018-04-20 16:35:09,144 ERROR org.dspace.app.util.AbstractDSpaceWebapp @ Failed to record shutdown in Webapp table.
org.apache.tomcat.jdbc.pool.PoolExhaustedException: [localhost-startStop-2] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:250; busy:18; idle:0; lastwait:5000].
    at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:685)
    at org.apache.tomcat.jdbc.pool.ConnectionPool.getConnection(ConnectionPool.java:187)
    at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:128)
    at org.dspace.storage.rdbms.DatabaseManager.getConnection(DatabaseManager.java:632)
    at org.dspace.core.Context.init(Context.java:121)
    at org.dspace.core.Context.&lt;init&gt;(Context.java:95)
    at org.dspace.app.util.AbstractDSpaceWebapp.deregister(AbstractDSpaceWebapp.java:97)
    at org.dspace.app.util.DSpaceContextListener.contextDestroyed(DSpaceContextListener.java:146)
    at org.apache.catalina.core.StandardContext.listenerStop(StandardContext.java:5115)
    at org.apache.catalina.core.StandardContext.stopInternal(StandardContext.java:5779)
    at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:224)
    at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1588)
    at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1577)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2018-04-20 19:15:04,006 INFO org.dspace.core.ConfigurationManager @ Loading from classloader: file:/home/cgspace.cgiar.org/config/dspace.cfg
</code></pre>

<ul>
<li>Very suspect!</li>
</ul>

@ -4,7 +4,7 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/2018-04/</loc>
<lastmod>2018-04-19T12:40:52+03:00</lastmod>
<lastmod>2018-04-19T14:28:16+03:00</lastmod>
</url>

<url>
@ -159,7 +159,7 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2018-04-19T12:40:52+03:00</lastmod>
<lastmod>2018-04-19T14:28:16+03:00</lastmod>
<priority>0</priority>
</url>

@ -170,7 +170,7 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
<lastmod>2018-04-19T12:40:52+03:00</lastmod>
<lastmod>2018-04-19T14:28:16+03:00</lastmod>
<priority>0</priority>
</url>

@ -182,13 +182,13 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2018-04-19T12:40:52+03:00</lastmod>
<lastmod>2018-04-19T14:28:16+03:00</lastmod>
<priority>0</priority>
</url>

<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/</loc>
<lastmod>2018-04-19T12:40:52+03:00</lastmod>
<lastmod>2018-04-19T14:28:16+03:00</lastmod>
<priority>0</priority>
</url>