cgspace-notes/docs/2024-09/index.html
2024-12-04 16:27:49 +03:00

322 lines
15 KiB
HTML
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

<!DOCTYPE html>
<html lang="en" >
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta property="og:title" content="September, 2024" />
<meta property="og:description" content="2024-09-01
Upgrade CGSpace to DSpace 7.6.2
" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2024-09/" />
<meta property="article:published_time" content="2024-09-01T21:16:00-07:00" />
<meta property="article:modified_time" content="2024-09-30T07:56:53+03:00" />
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="September, 2024"/>
<meta name="twitter:description" content="2024-09-01
Upgrade CGSpace to DSpace 7.6.2
"/>
<meta name="generator" content="Hugo 0.133.1">
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "BlogPosting",
"headline": "September, 2024",
"url": "https://alanorth.github.io/cgspace-notes/2024-09/",
"wordCount": "769",
"datePublished": "2024-09-01T21:16:00-07:00",
"dateModified": "2024-09-30T07:56:53+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"keywords": "Notes"
}
</script>
<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/2024-09/">
<title>September, 2024 | CGSpace Notes</title>
<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.c6ba80bc50669557645abe05f86b73cc5af84408ed20f1551a267bc19ece8228.css" rel="stylesheet" integrity="sha256-xrqAvFBmlVdkWr4F&#43;GtzzFr4RAjtIPFVGiZ7wZ7Ogig=" crossorigin="anonymous">
<!-- minified Font Awesome for SVG icons -->
<script defer src="https://alanorth.github.io/cgspace-notes/js/fontawesome.min.f5072c55a0721857184db93a50561d7dc13975b4de2e19db7f81eb5f3fa57270.js" integrity="sha256-9QcsVaByGFcYTbk6UFYdfcE5dbTeLhnbf4HrXz&#43;lcnA=" crossorigin="anonymous"></script>
<!-- RSS 2.0 feed -->
</head>
<body>
<div class="blog-masthead">
<div class="container">
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>
<header class="blog-header">
<div class="container">
<h1 class="blog-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
<p class="lead blog-description" dir="auto">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
</div>
</header>
<div class="container">
<div class="row">
<div class="col-sm-8 blog-main">
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2024-09/">September, 2024</a></h2>
<p class="blog-post-meta">
<time datetime="2024-09-01T21:16:00-07:00">Sun Sep 01, 2024</time>
in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2024-09-01">2024-09-01</h2>
<ul>
<li>Upgrade CGSpace to DSpace 7.6.2</li>
</ul>
<h2 id="2024-09-05">2024-09-05</h2>
<ul>
<li>Finalize work on migrating DSpace Angular from Yarn to NPM</li>
</ul>
<h2 id="2024-09-06">2024-09-06</h2>
<ul>
<li>This morning Tomcat crashed due to an OOM kill:</li>
</ul>
<pre tabindex="0"><code>Sep 06 00:00:24 server systemd[1]: tomcat9.service: A process of this unit has been killed by the OOM killer.
Sep 06 00:00:25 server systemd[1]: tomcat9.service: Main process exited, code=killed, status=9/KILL
Sep 06 00:00:25 server systemd[1]: tomcat9.service: Failed with result &#39;oom-kill&#39;.
</code></pre><ul>
<li>According to the system journal, it was a Node.js dspace-angular process that tried to allocate memory and failed, thus invoking the OOM killer</li>
<li>Currently I see high memory usage in those processes:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ pm2 status
</span></span><span style="display:flex;"><span>┌────┬──────────────┬─────────────┬─────────┬─────────┬──────────┬────────┬──────┬───────────┬──────────┬──────────┬──────────┬──────────┐
</span></span><span style="display:flex;"><span>│ id │ name │ namespace │ version │ mode │ pid │ uptime │ ↺ │ status │ cpu │ mem │ user │ watching │
</span></span><span style="display:flex;"><span>├────┼──────────────┼─────────────┼─────────┼─────────┼──────────┼────────┼──────┼───────────┼──────────┼──────────┼──────────┼──────────┤
</span></span><span style="display:flex;"><span>│ 0 │ dspace-ui │ default │ 7.6.3-… │ cluster │ 994 │ 4D │ 0 │ online │ 0% │ 3.4gb │ dspace │ disabled │
</span></span><span style="display:flex;"><span>│ 1 │ dspace-ui │ default │ 7.6.3-… │ cluster │ 1015 │ 4D │ 0 │ online │ 0% │ 3.4gb │ dspace │ disabled │
</span></span><span style="display:flex;"><span>│ 2 │ dspace-ui │ default │ 7.6.3-… │ cluster │ 1029 │ 4D │ 0 │ online │ 0% │ 3.4gb │ dspace │ disabled │
</span></span><span style="display:flex;"><span>│ 3 │ dspace-ui │ default │ 7.6.3-… │ cluster │ 1042 │ 4D │ 0 │ online │ 0% │ 3.4gb │ dspace │ disabled │
</span></span><span style="display:flex;"><span>└────┴──────────────┴─────────────┴─────────┴─────────┴──────────┴────────┴──────┴───────────┴──────────┴──────────┴──────────┴──────────┘
</span></span></code></pre></div><ul>
<li>I bet if I look in the logs I&rsquo;d find some kind of heavy traffic on the frontend, causing high caching for Angular SSR</li>
</ul>
<h2 id="2024-09-08">2024-09-08</h2>
<ul>
<li>Analyzing memory use in our DSpace hosts, which have 32GB of memory
<ul>
<li>Effective cache of PostgreSQL is estimated at 11GB, which seems way high since the database is only 2GB</li>
<li>Realistically this should be how we adjust, with PostgreSQL using ~8GB (or less) and each dspace-angular process pinned at 2GB&hellip;</li>
</ul>
</li>
</ul>
<blockquote>
<p>Total - Solr - Tomcat Postgres - Nginx - Angular
31366 (1024×4.4) 7168 (8×1024) 512 - (4x2048) = 2796.4 left&hellip;</p>
</blockquote>
<ul>
<li>I put some of these changes in on DSpace Test and will monitor this week</li>
</ul>
<h2 id="2024-09-10">2024-09-10</h2>
<ul>
<li>Some bot in South Africa made a ton of requests on the API and made the load hit the roof:</li>
</ul>
<pre tabindex="0"><code># grep -E &#39;10/Sep/2024:[10-11]&#39; /var/log/nginx/api-access.log | awk &#39;{print $1}&#39; | sort | uniq -c | sort -h
...
149720 102.182.38.90
</code></pre><ul>
<li>They are using several user agents so are obviously a bot:</li>
</ul>
<pre tabindex="0"><code>Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:130.0) Gecko/20100101 Firefox/130.0
Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/111.0
Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:11.0) Gecko/20100101 Firefox/11.0
</code></pre><ul>
<li>I added them to the list of bot networks in nginx and the load went down</li>
</ul>
<h2 id="2024-09-11">2024-09-11</h2>
<ul>
<li>Upgrade DSpace 7 Test to Ubuntu 24.04</li>
<li>I did some minor maintenance to test dspace-statistics-api with Python 3.12
<ul>
<li>I tagged version 1.4.4 and released it on GitHub</li>
</ul>
</li>
</ul>
<h2 id="2024-09-14">2024-09-14</h2>
<ul>
<li>Noticed a persistent higher than usual load on CGSpace and checked the server logs
<ul>
<li>Found some new data center subnets to block because they were making thousands of requests with normal user agents</li>
</ul>
</li>
<li>I enabled HTTP/3 in nginx</li>
<li>I enabled the SSR patch in Angular: <a href="https://github.com/DSpace/dspace-angular/issues/3110">https://github.com/DSpace/dspace-angular/issues/3110</a></li>
</ul>
<h2 id="2024-09-16">2024-09-16</h2>
<ul>
<li>Experiment with the <!-- raw HTML omitted -->dspace-statistics-api-js<!-- raw HTML omitted --> on DSpace 7 Test
<ul>
<li>In the past it always caused Solr to run out of memory, but I increased Solr&rsquo;s heap from 2g to 3g and it runs without crashing</li>
<li>I attached VisualVM to Solr with a 3g and 4g heap and iterated over 1260 pages of results in the dspace-statistics-api-js:</li>
</ul>
</li>
</ul>
<p><img src="/cgspace-notes/2024/09/2024-09-16-Solr-3g-heap.png" alt="Solr with 3g heap"></p>
<p><img src="/cgspace-notes/2024/09/2024-09-16-Solr-4g-heap.png" alt="Solr with 4g heap"></p>
<h2 id="2024-09-23">2024-09-23</h2>
<ul>
<li>Upgrade PostgreSQL from version 14 to 15 on DSpace Test the same way I did last year:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span># apt update
</span></span><span style="display:flex;"><span># apt install postgresql-15
</span></span><span style="display:flex;"><span># Update configs with Ansible
</span></span><span style="display:flex;"><span># systemctl stop tomcat9
</span></span><span style="display:flex;"><span># pg_ctlcluster <span style="color:#ae81ff">14</span> main stop
</span></span><span style="display:flex;"><span># tar -cvzpf var-lib-postgresql-14.tar.gz /var/lib/postgresql/14
</span></span><span style="display:flex;"><span># tar -cvzpf etc-postgresql-14.tar.gz /etc/postgresql/14
</span></span><span style="display:flex;"><span># pg_ctlcluster <span style="color:#ae81ff">15</span> main stop
</span></span><span style="display:flex;"><span># pg_dropcluster <span style="color:#ae81ff">15</span> main
</span></span><span style="display:flex;"><span># pg_upgradecluster <span style="color:#ae81ff">14</span> main
</span></span><span style="display:flex;"><span># pg_ctlcluster <span style="color:#ae81ff">15</span> main start
</span></span><span style="display:flex;"><span>...
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span>ERROR: could not find function &#34;xml_is_well_formed&#34; in file &#34;/usr/lib/postgresql/15/lib/pgxml.so&#34;
</span></span><span style="display:flex;"><span>ERROR: function public.xml_is_well_formed(text) does not exist
</span></span><span style="display:flex;"><span>ERROR: could not find function &#34;xml_is_well_formed&#34; in file &#34;/usr/lib/postgresql/15/lib/pgxml.so&#34;
</span></span><span style="display:flex;"><span>ERROR: function public.xml_valid(text) does not exist
</span></span></code></pre></div><ul>
<li>After that I <a href="https://adamj.eu/tech/2021/04/13/reindexing-all-tables-after-upgrading-to-postgresql-13/">re-indexed the database indexes</a> using a query:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ su - postgres
</span></span><span style="display:flex;"><span>$ cat /tmp/generate-reindex.sql
</span></span><span style="display:flex;"><span>SELECT &#39;REINDEX TABLE CONCURRENTLY &#39; || quote_ident(relname) || &#39; /*&#39; || pg_size_pretty(pg_total_relation_size(C.oid)) || &#39;*/;&#39;
</span></span><span style="display:flex;"><span>FROM pg_class C
</span></span><span style="display:flex;"><span>LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
</span></span><span style="display:flex;"><span>WHERE nspname = &#39;public&#39;
</span></span><span style="display:flex;"><span> AND C.relkind = &#39;r&#39;
</span></span><span style="display:flex;"><span> AND nspname !~ &#39;^pg_toast&#39;
</span></span><span style="display:flex;"><span>ORDER BY pg_total_relation_size(C.oid) ASC;
</span></span><span style="display:flex;"><span>$ psql dspace &lt; /tmp/generate-reindex.sql &gt; /tmp/reindex.sql
</span></span><span style="display:flex;"><span>$ &lt;trim the extra stuff from /tmp/reindex.sql&gt;
</span></span><span style="display:flex;"><span>$ psql dspace &lt; /tmp/reindex.sql
</span></span></code></pre></div><ul>
<li>The database shrunk by 186MB!</li>
</ul>
<h2 id="2024-09-29">2024-09-29</h2>
<ul>
<li>I upgraded the database on CGSpace to PostgreSQL 15</li>
</ul>
<!-- raw HTML omitted -->
</article>
</div> <!-- /.blog-main -->
<aside class="col-sm-3 ml-auto blog-sidebar">
<section class="sidebar-module">
<h4>Recent Posts</h4>
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2024-12/">December, 2024</a></li>
<li><a href="/cgspace-notes/2024-11/">November, 2024</a></li>
<li><a href="/cgspace-notes/2024-10/">October, 2024</a></li>
<li><a href="/cgspace-notes/2024-09/">September, 2024</a></li>
<li><a href="/cgspace-notes/2024-08/">August, 2024</a></li>
</ol>
</section>
<section class="sidebar-module">
<h4>Links</h4>
<ol class="list-unstyled">
<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
</ol>
</section>
</aside>
</div> <!-- /.row -->
</div> <!-- /.container -->
<footer class="blog-footer">
<p dir="auto">
Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
</p>
<p>
<a href="#">Back to top</a>
</p>
</footer>
</body>
</html>