Add notes for 2020-02-07

Alan Orth 2020-02-07 14:44:08 +02:00
parent 38177b2a6f
commit 009cc870ba
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
7 changed files with 17564 additions and 8 deletions

@@ -303,4 +303,40 @@ $ ./run.sh -s http://localhost:8983/solr/statistics -a import -o ~/Downloads/sta
- OK that imported! I wonder if it works... maybe I'll try another day
## 2020-02-07
- I did some investigation into DSpace indexing performance using flame graphs
  - Excellent introduction: http://www.brendangregg.com/flamegraphs.html
  - Using flame graphs with Java: https://netflixtechblog.com/java-in-flames-e763b3d32166
  - Fantastic wrapper scripts for doing perf on Java processes: https://github.com/jvm-profiling-tools/perf-map-agent
```
$ cd ~/src/git/perf-map-agent
$ cmake .
$ make
$ ./bin/create-links-in ~/.local/bin
$ export FLAMEGRAPH_DIR=/home/aorth/src/git/FlameGraph
$ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk
$ export JAVA_OPTS="-XX:+PreserveFramePointer"
$ ~/dspace63/bin/dspace index-discovery -b &
# pid of tomcat java process
$ perf-java-flames 4478
# pid of java indexing process
$ perf-java-flames 11359
```
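- Before pointing `perf-java-flames` at a PID, it may be worth confirming that the JVM really was started with the flag exported above. A minimal sketch, assuming Linux's `/proc/<pid>/cmdline`; the helper names are hypothetical and not part of perf-map-agent:

```shell
# Succeeds if a NUL-separated command line on stdin contains the flag.
cmdline_has_frame_pointer() {
  tr '\000' '\n' | grep -qxF -- '-XX:+PreserveFramePointer'
}

# Check a running process by PID (Linux only: /proc/<pid>/cmdline holds
# the process arguments separated by NUL bytes).
pid_has_frame_pointer() {
  cmdline_has_frame_pointer < "/proc/$1/cmdline"
}

# Example, using the indexing PID from the notes above:
#   pid_has_frame_pointer 11359 && echo "frame pointers preserved"
```

- This only sees flags passed on the command line; `jinfo -flag PreserveFramePointer <pid>` from the JDK should report the effective value if the flag was set some other way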
- All Java processes need to have `-XX:+PreserveFramePointer` if you want to trace their methods
- I did the same tests against DSpace 5.8 and 6.4-SNAPSHOT's CLI indexing process and Tomcat process
  - For what it's worth, it appears all the Hibernate stuff is in the CLI processes, so we don't need to trace the Tomcat process
- Here is the flame graph for DSpace 5.8's `dspace index-discovery -b` Java process:
![DSpace 5.8 index-discovery flame graph](/cgspace-notes/2020/02/flamegraph-java-cli-dspace58.svg)
- Here is the flame graph for DSpace 6.4-SNAPSHOT's `dspace index-discovery -b` Java process:
![DSpace 6.4-SNAPSHOT index-discovery flame graph](/cgspace-notes/2020/02/flamegraph-java-cli-dspace64-snapshot.svg)
- If the width of the stacks indicates time, then it's clear that Hibernate takes longer...
- Apparently there is a "flame diff" tool, I wonder if we can use that to compare!
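- A rough way to put a number on the width comparison, assuming the folded stacks behind each SVG are available as text (one `frame;frame;... count` line per stack, the format FlameGraph's `stackcollapse-perf.pl` emits). The function name and file names below are hypothetical:

```shell
# Print the percentage of samples whose stack mentions PATTERN (case-insensitive).
# Reads folded stacks on stdin: "frame;frame;frame <count>" per line.
folded_share() {
  awk -v pat="$1" '
    BEGIN { pat = tolower(pat) }
    NF >= 2 {
      count = $NF                      # last field is the sample count
      stack = tolower($0)
      sub(/ [0-9]+$/, "", stack)       # drop the count before matching
      total += count
      if (index(stack, pat) > 0) matched += count
    }
    END {
      if (total > 0) printf "%.1f\n", 100 * matched / total
      else print "0.0"
    }'
}

# Example (hypothetical file names):
#   folded_share hibernate < dspace58.folded
#   folded_share hibernate < dspace64-snapshot.folded
```

- For a visual comparison, FlameGraph also ships `difffolded.pl`, which takes two folded files and pipes into `flamegraph.pl` to draw a differential flame graph; that may be the "flame diff" tool mentioned above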
<!-- vim: set sw=2 ts=2: -->

@@ -20,7 +20,7 @@ The code finally builds and runs with a fresh install
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2020-02/" />
<meta property="article:published_time" content="2020-02-02T11:56:30+02:00" />
-<meta property="article:modified_time" content="2020-02-06T12:47:25+02:00" />
+<meta property="article:modified_time" content="2020-02-06T16:54:41+02:00" />
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="February, 2020"/>
@@ -45,9 +45,9 @@ The code finally builds and runs with a fresh install
"@type": "BlogPosting",
"headline": "February, 2020",
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2020-02\/",
-"wordCount": "2069",
+"wordCount": "2254",
"datePublished": "2020-02-02T11:56:30+02:00",
-"dateModified": "2020-02-06T12:47:25+02:00",
+"dateModified": "2020-02-06T16:54:41+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -420,6 +420,46 @@ $ ./run.sh -s http://localhost:8080/solr/statistics-2019 -a import -o ~/Download
</code></pre><ul>
<li>OK that imported! I wonder if it works&hellip; maybe I&rsquo;ll try another day</li>
</ul>
<h2 id="2020-02-07">2020-02-07</h2>
<ul>
<li>I did some investigation into DSpace indexing performance using flame graphs
<ul>
<li>Excellent introduction: <a href="http://www.brendangregg.com/flamegraphs.html">http://www.brendangregg.com/flamegraphs.html</a></li>
<li>Using flame graphs with Java: <a href="https://netflixtechblog.com/java-in-flames-e763b3d32166">https://netflixtechblog.com/java-in-flames-e763b3d32166</a></li>
<li>Fantastic wrapper scripts for doing perf on Java processes: <a href="https://github.com/jvm-profiling-tools/perf-map-agent">https://github.com/jvm-profiling-tools/perf-map-agent</a></li>
</ul>
</li>
</ul>
<pre><code>$ cd ~/src/git/perf-map-agent
$ cmake .
$ make
$ ./bin/create-links-in ~/.local/bin
$ export FLAMEGRAPH_DIR=/home/aorth/src/git/FlameGraph
$ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk
$ export JAVA_OPTS=&quot;-XX:+PreserveFramePointer&quot;
$ ~/dspace63/bin/dspace index-discovery -b &amp;
# pid of tomcat java process
$ perf-java-flames 4478
# pid of java indexing process
$ perf-java-flames 11359
</code></pre><ul>
<li>All Java processes need to have <code>-XX:+PreserveFramePointer</code> if you want to trace their methods</li>
<li>I did the same tests against DSpace 5.8 and 6.4-SNAPSHOT&rsquo;s CLI indexing process and Tomcat process
<ul>
<li>For what it&rsquo;s worth, it appears all the Hibernate stuff is in the CLI processes, so we don&rsquo;t need to trace the Tomcat process</li>
</ul>
</li>
<li>Here is the flame graph for DSpace 5.8&rsquo;s <code>dspace index-discovery -b</code> Java process:</li>
</ul>
<p><img src="/cgspace-notes/2020/02/flamegraph-java-cli-dspace58.svg" alt="DSpace 5.8 index-discovery flame graph"></p>
<ul>
<li>Here is the flame graph for DSpace 6.4-SNAPSHOT&rsquo;s <code>dspace index-discovery -b</code> Java process:</li>
</ul>
<p><img src="/cgspace-notes/2020/02/flamegraph-java-cli-dspace64-snapshot.svg" alt="DSpace 6.4-SNAPSHOT index-discovery flame graph"></p>
<ul>
<li>If the width of the stacks indicates time, then it&rsquo;s clear that Hibernate takes longer&hellip;</li>
<li>Apparently there is a &ldquo;flame diff&rdquo; tool, I wonder if we can use that to compare!</li>
</ul>
<!-- raw HTML omitted -->

File diff suppressed because it is too large (201 KiB)

File diff suppressed because it is too large (225 KiB)

@@ -4,27 +4,27 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
-<lastmod>2020-02-06T12:47:25+02:00</lastmod>
+<lastmod>2020-02-06T16:54:41+02:00</lastmod>
</url>
<url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
-<lastmod>2020-02-06T12:47:25+02:00</lastmod>
+<lastmod>2020-02-06T16:54:41+02:00</lastmod>
</url>
<url>
<loc>https://alanorth.github.io/cgspace-notes/2020-02/</loc>
-<lastmod>2020-02-06T12:47:25+02:00</lastmod>
+<lastmod>2020-02-06T16:54:41+02:00</lastmod>
</url>
<url>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
-<lastmod>2020-02-06T12:47:25+02:00</lastmod>
+<lastmod>2020-02-06T16:54:41+02:00</lastmod>
</url>
<url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
-<lastmod>2020-02-06T12:47:25+02:00</lastmod>
+<lastmod>2020-02-06T16:54:41+02:00</lastmod>
</url>
<url>

File diff suppressed because it is too large (201 KiB)

File diff suppressed because it is too large (225 KiB)