Update notes for 2018-10-17

Alan Orth 2018-10-17 00:33:01 +03:00
parent 878e3188ec
commit 0afdffa34f
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
3 changed files with 109 additions and 8 deletions

@ -298,4 +298,52 @@ $ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest nosuperuser
dspace=# \copy (SELECT (CASE when metadata_schema_id=1 THEN 'dc' WHEN metadata_schema_id=2 THEN 'cg' END) AS schema, element, qualifier, scope_note FROM metadatafieldregistry where metadata_schema_id IN (1,2)) TO /tmp/cgspace-schema.csv WITH CSV HEADER;
```
- While talking to the CodeObia guys about the REST API, I started to wonder why it's so slow and how I can quantify it in order to ask the dspace-tech mailing list for help profiling it
- Interestingly, the speed doesn't get better after you request the same thing multiple times; it's consistently bad on both CGSpace and DSpace Test!
```
$ time http --print h 'https://cgspace.cgiar.org/rest/items?expand=metadata,bitstreams,parentCommunityList&limit=100&offset=0'
...
0.35s user 0.06s system 1% cpu 25.133 total
0.31s user 0.04s system 1% cpu 25.223 total
0.27s user 0.06s system 1% cpu 27.858 total
0.20s user 0.05s system 1% cpu 23.838 total
0.30s user 0.05s system 1% cpu 24.301 total
$ time http --print h 'https://dspacetest.cgiar.org/rest/items?expand=metadata,bitstreams,parentCommunityList&limit=100&offset=0'
...
0.22s user 0.03s system 1% cpu 17.248 total
0.23s user 0.02s system 1% cpu 16.856 total
0.23s user 0.04s system 1% cpu 16.460 total
0.24s user 0.04s system 1% cpu 21.043 total
0.22s user 0.04s system 1% cpu 17.132 total
```
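- To compare the runs more easily, the wall-clock figures can be averaged. A minimal sketch, assuming the zsh `time` format shown above; the helper names (`wall_clock_seconds`, `summarize`) are made up for illustration and the numbers are copied by hand from the output:

```python
import re
import statistics

# Wall-clock lines copied from the `time` output above.
CGSPACE = """\
0.35s user 0.06s system 1% cpu 25.133 total
0.31s user 0.04s system 1% cpu 25.223 total
0.27s user 0.06s system 1% cpu 27.858 total
0.20s user 0.05s system 1% cpu 23.838 total
0.30s user 0.05s system 1% cpu 24.301 total
"""

DSPACE_TEST = """\
0.22s user 0.03s system 1% cpu 17.248 total
0.23s user 0.02s system 1% cpu 16.856 total
0.23s user 0.04s system 1% cpu 16.460 total
0.24s user 0.04s system 1% cpu 21.043 total
0.22s user 0.04s system 1% cpu 17.132 total
"""

# The number between "cpu" and "total" is the elapsed (wall-clock) time.
TOTAL_RE = re.compile(r"cpu\s+([\d.]+)\s+total")


def wall_clock_seconds(output):
    """Extract the elapsed seconds from each `time` line in the output."""
    return [float(match.group(1)) for match in TOTAL_RE.finditer(output)]


def summarize(output):
    """Return the run count, mean, and median of the elapsed times."""
    totals = wall_clock_seconds(output)
    return {
        "n": len(totals),
        "mean": statistics.mean(totals),
        "median": statistics.median(totals),
    }


for name, output in (("CGSpace", CGSPACE), ("DSpace Test", DSPACE_TEST)):
    stats = summarize(output)
    print(f"{name}: n={stats['n']} mean={stats['mean']:.2f}s median={stats['median']:.2f}s")
```

- With the numbers above this gives a mean of about 25.3 seconds for CGSpace and about 17.7 seconds for DSpace Test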
- I should note that at this time CGSpace is using Oracle Java and DSpace Test is using OpenJDK (both version 8)
- I wonder if the Java garbage collector is important here, or if there are missing indexes in PostgreSQL?
- I switched DSpace Test to the G1GC garbage collector and tried again and now the results are worse!
```
$ time http --print h 'https://dspacetest.cgiar.org/rest/items?expand=metadata,bitstreams,parentCommunityList&limit=100&offset=0'
...
0.20s user 0.03s system 0% cpu 25.017 total
0.23s user 0.02s system 1% cpu 23.299 total
0.24s user 0.02s system 1% cpu 22.496 total
0.22s user 0.03s system 1% cpu 22.720 total
0.23s user 0.03s system 1% cpu 22.632 total
```
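- To put a number on "worse", the mean of these five runs can be compared with the mean of the five earlier DSpace Test runs. A rough sketch (the lists are copied by hand from the outputs above and `percent_change` is just an illustrative helper), keeping in mind that five runs each is a small sample and nothing else was controlled for:

```python
import statistics

# Elapsed seconds on DSpace Test, copied from the `time` output above.
BEFORE_G1GC = [17.248, 16.856, 16.460, 21.043, 17.132]
AFTER_G1GC = [25.017, 23.299, 22.496, 22.720, 22.632]


def percent_change(before, after):
    """Relative change of the mean, in percent (positive means slower)."""
    mean_before = statistics.mean(before)
    mean_after = statistics.mean(after)
    return (mean_after - mean_before) / mean_before * 100.0


change = percent_change(BEFORE_G1GC, AFTER_G1GC)
print(f"mean before: {statistics.mean(BEFORE_G1GC):.2f}s")
print(f"mean after:  {statistics.mean(AFTER_G1GC):.2f}s")
print(f"change:      {change:+.1f}%")
```

- That works out to roughly 31% slower with G1GC in this small sample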
- If I make a request without the expands it is roughly eight times faster:
```
$ time http --print h 'https://dspacetest.cgiar.org/rest/items?limit=100&offset=0'
...
0.20s user 0.03s system 7% cpu 3.098 total
0.22s user 0.03s system 8% cpu 2.896 total
0.21s user 0.05s system 9% cpu 2.787 total
0.23s user 0.02s system 8% cpu 2.896 total
```
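- The same comparison could be scripted instead of running `time http` by hand. A sketch using only the Python standard library; `fetch`, `time_calls`, and `slowdown` are hypothetical helpers, and `urllib` does not send the same headers as HTTPie, so the timings may not match exactly:

```python
import time
import urllib.request

# Elapsed seconds on DSpace Test, copied from the outputs above.
WITH_EXPANDS = [25.017, 23.299, 22.496, 22.720, 22.632]
WITHOUT_EXPANDS = [3.098, 2.896, 2.787, 2.896]


def fetch(url):
    """Request the URL and read the whole body, discarding it."""
    with urllib.request.urlopen(url) as response:
        response.read()


def time_calls(func, runs=5, clock=time.perf_counter):
    """Call func `runs` times and return the elapsed seconds of each call."""
    durations = []
    for _ in range(runs):
        start = clock()
        func()
        durations.append(clock() - start)
    return durations


def slowdown(slow, fast):
    """Ratio of the mean of the slow runs to the mean of the fast runs."""
    return (sum(slow) / len(slow)) / (sum(fast) / len(fast))


print(f"with expands is {slowdown(WITH_EXPANDS, WITHOUT_EXPANDS):.1f}x slower")
```

- For example, `time_calls(lambda: fetch(url), runs=5)` could be run once for each of the two URLs above, and the two result lists passed to `slowdown`. With the figures already recorded here the ratio is about 8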
- I sent a mail to dspace-tech to ask how to profile this...
<!-- vim: set sw=2 ts=2: -->

@ -9,7 +9,7 @@
<meta property="og:description" content="2018-10-01 Phil Thornton got an ORCID identifier so we need to add it to the list on CGSpace and tag his existing items I created a GitHub issue to track this #389, because I&rsquo;m super busy in Nairobi right now 2018-10-03 I see Moayad was busy collecting item views and downloads from CGSpace yesterday: # zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;02/Oct/2018&quot; | awk &#39;{print $1} &#39; | sort | uniq -c | sort -n | tail -n 10 933 40." /> <meta property="og:description" content="2018-10-01 Phil Thornton got an ORCID identifier so we need to add it to the list on CGSpace and tag his existing items I created a GitHub issue to track this #389, because I&rsquo;m super busy in Nairobi right now 2018-10-03 I see Moayad was busy collecting item views and downloads from CGSpace yesterday: # zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;02/Oct/2018&quot; | awk &#39;{print $1} &#39; | sort | uniq -c | sort -n | tail -n 10 933 40." />
<meta property="og:type" content="article" /> <meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2018-10/" /><meta property="article:published_time" content="2018-10-01T22:31:54&#43;03:00"/> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2018-10/" /><meta property="article:published_time" content="2018-10-01T22:31:54&#43;03:00"/>
<meta property="article:modified_time" content="2018-10-15T17:26:03&#43;03:00"/> <meta property="article:modified_time" content="2018-10-16T17:26:18&#43;03:00"/>
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="October, 2018"/> <meta name="twitter:title" content="October, 2018"/>
@ -24,9 +24,9 @@
"@type": "BlogPosting", "@type": "BlogPosting",
"headline": "October, 2018", "headline": "October, 2018",
"url": "https://alanorth.github.io/cgspace-notes/2018-10/", "url": "https://alanorth.github.io/cgspace-notes/2018-10/",
"wordCount": "2132", "wordCount": "2451",
"datePublished": "2018-10-01T22:31:54&#43;03:00", "datePublished": "2018-10-01T22:31:54&#43;03:00",
"dateModified": "2018-10-15T17:26:03&#43;03:00", "dateModified": "2018-10-16T17:26:18&#43;03:00",
"author": { "author": {
"@type": "Person", "@type": "Person",
"name": "Alan Orth" "name": "Alan Orth"
@ -435,6 +435,59 @@ $ psql -h localhost -U postgres dspacetest -c 'alter user dspacetest nosuperuser
<pre><code>dspace=# \copy (SELECT (CASE when metadata_schema_id=1 THEN 'dc' WHEN metadata_schema_id=2 THEN 'cg' END) AS schema, element, qualifier, scope_note FROM metadatafieldregistry where metadata_schema_id IN (1,2)) TO /tmp/cgspace-schema.csv WITH CSV HEADER;
</code></pre>
<ul>
<li>While talking to the CodeObia guys about the REST API, I started to wonder why it&rsquo;s so slow and how I can quantify it in order to ask the dspace-tech mailing list for help profiling it</li>
<li>Interestingly, the speed doesn&rsquo;t get better after you request the same thing multiple times; it&rsquo;s consistently bad on both CGSpace and DSpace Test!</li>
</ul>
<pre><code>$ time http --print h 'https://cgspace.cgiar.org/rest/items?expand=metadata,bitstreams,parentCommunityList&amp;limit=100&amp;offset=0'
...
0.35s user 0.06s system 1% cpu 25.133 total
0.31s user 0.04s system 1% cpu 25.223 total
0.27s user 0.06s system 1% cpu 27.858 total
0.20s user 0.05s system 1% cpu 23.838 total
0.30s user 0.05s system 1% cpu 24.301 total
$ time http --print h 'https://dspacetest.cgiar.org/rest/items?expand=metadata,bitstreams,parentCommunityList&amp;limit=100&amp;offset=0'
...
0.22s user 0.03s system 1% cpu 17.248 total
0.23s user 0.02s system 1% cpu 16.856 total
0.23s user 0.04s system 1% cpu 16.460 total
0.24s user 0.04s system 1% cpu 21.043 total
0.22s user 0.04s system 1% cpu 17.132 total
</code></pre>
<ul>
<li>I should note that at this time CGSpace is using Oracle Java and DSpace Test is using OpenJDK (both version 8)</li>
<li>I wonder if the Java garbage collector is important here, or if there are missing indexes in PostgreSQL?</li>
<li>I switched DSpace Test to the G1GC garbage collector and tried again and now the results are worse!</li>
</ul>
<pre><code>$ time http --print h 'https://dspacetest.cgiar.org/rest/items?expand=metadata,bitstreams,parentCommunityList&amp;limit=100&amp;offset=0'
...
0.20s user 0.03s system 0% cpu 25.017 total
0.23s user 0.02s system 1% cpu 23.299 total
0.24s user 0.02s system 1% cpu 22.496 total
0.22s user 0.03s system 1% cpu 22.720 total
0.23s user 0.03s system 1% cpu 22.632 total
</code></pre>
<ul>
<li>If I make a request without the expands it is roughly eight times faster:</li>
</ul>
<pre><code>$ time http --print h 'https://dspacetest.cgiar.org/rest/items?limit=100&amp;offset=0'
...
0.20s user 0.03s system 7% cpu 3.098 total
0.22s user 0.03s system 8% cpu 2.896 total
0.21s user 0.05s system 9% cpu 2.787 total
0.23s user 0.02s system 8% cpu 2.896 total
</code></pre>
<ul>
<li>I sent a mail to dspace-tech to ask how to profile this&hellip;</li>
</ul>
<!-- vim: set sw=2 ts=2: -->

@ -4,7 +4,7 @@
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/2018-10/</loc> <loc>https://alanorth.github.io/cgspace-notes/2018-10/</loc>
<lastmod>2018-10-15T17:26:03+03:00</lastmod> <lastmod>2018-10-16T17:26:18+03:00</lastmod>
</url> </url>
<url> <url>
@ -189,7 +189,7 @@
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/</loc> <loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2018-10-15T17:26:03+03:00</lastmod> <lastmod>2018-10-16T17:26:18+03:00</lastmod>
<priority>0</priority> <priority>0</priority>
</url> </url>
@ -200,7 +200,7 @@
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc> <loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
<lastmod>2018-10-15T17:26:03+03:00</lastmod> <lastmod>2018-10-16T17:26:18+03:00</lastmod>
<priority>0</priority> <priority>0</priority>
</url> </url>
@ -212,13 +212,13 @@
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc> <loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2018-10-15T17:26:03+03:00</lastmod> <lastmod>2018-10-16T17:26:18+03:00</lastmod>
<priority>0</priority> <priority>0</priority>
</url> </url>
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/tags/</loc> <loc>https://alanorth.github.io/cgspace-notes/tags/</loc>
<lastmod>2018-10-15T17:26:03+03:00</lastmod> <lastmod>2018-10-16T17:26:18+03:00</lastmod>
<priority>0</priority> <priority>0</priority>
</url> </url>