Update notes for 2020-02-06

Alan Orth 2020-02-06 16:54:41 +02:00
parent ef006037c1
commit 38177b2a6f
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
3 changed files with 61 additions and 10 deletions


@@ -269,10 +269,38 @@ $ ls -lh /tmp/statistics-2019-01.json
- Then I tested importing this by creating a new core in my development environment: - Then I tested importing this by creating a new core in my development environment:
``` ```
$ curl 'http://localhost:8080/solr/admin/cores?action=CREATE&name=statistics-2019&instanceDir=/home/aorth/dspace63/solr/statistics&dataDir=/home/aorth/dspace63/solr/statistics-2019/data' $ curl 'http://localhost:8080/solr/admin/cores?action=CREATE&name=statistics-2019&instanceDir=/home/aorth/dspace/solr/statistics&dataDir=/home/aorth/dspace/solr/statistics-2019/data'
$ ./run.sh -s http://localhost:8080/solr/statistics-2019 -a import -o ~/Downloads/statistics-2019-01.json -k uid $ ./run.sh -s http://localhost:8080/solr/statistics-2019 -a import -o ~/Downloads/statistics-2019-01.json -k uid
``` ```
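The CoreAdmin request in the `curl` command above can also be assembled programmatically. This is a minimal sketch, assuming the same host, port, core name and paths as in the notes; the helper name `core_create_url` is hypothetical and not part of DSpace or Solr.

```python
from urllib.parse import urlencode


def core_create_url(base, name, instance_dir, data_dir):
    """Build a Solr CoreAdmin CREATE URL (hypothetical helper for illustration)."""
    params = {
        "action": "CREATE",
        "name": name,
        "instanceDir": instance_dir,
        "dataDir": data_dir,
    }
    # safe="/" keeps path separators readable instead of encoding them as %2F
    return f"{base}/admin/cores?{urlencode(params, safe='/')}"


url = core_create_url(
    "http://localhost:8080/solr",
    "statistics-2019",
    "/home/aorth/dspace/solr/statistics",
    "/home/aorth/dspace/solr/statistics-2019/data",
)
print(url)
```

Building the query string from a dictionary avoids quoting mistakes when the paths change between environments.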
- This imports the records into the core, but DSpace can't see them, and when I restart Tomcat the core is not seen by Solr... - This imports the records into the core, but DSpace can't see them, and when I restart Tomcat the core is not seen by Solr...
- I got the core to load by adding it to `dspace/solr/solr.xml` manually, ie:
```
<cores adminPath="/admin/cores">
...
<core name="statistics" instanceDir="statistics" />
<core name="statistics-2019" instanceDir="statistics">
<property name="dataDir" value="/home/aorth/dspace/solr/statistics-2019/data" />
</core>
...
</cores>
```
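To check which cores a `solr.xml` fragment like the one above declares, something along these lines may help. It is only a sketch: the fragment is copied from the notes with the elided `...` entries left out, and `list_cores` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Fragment from the notes; the elided "..." entries are intentionally not reproduced.
SOLR_XML_FRAGMENT = """
<cores adminPath="/admin/cores">
  <core name="statistics" instanceDir="statistics" />
  <core name="statistics-2019" instanceDir="statistics">
    <property name="dataDir" value="/home/aorth/dspace/solr/statistics-2019/data" />
  </core>
</cores>
"""


def list_cores(xml_text):
    """Map each declared core name to its dataDir property (None if not set)."""
    root = ET.fromstring(xml_text)
    cores = {}
    for core in root.iter("core"):
        data_dir = None
        for prop in core.findall("property"):
            if prop.get("name") == "dataDir":
                data_dir = prop.get("value")
        cores[core.get("name")] = data_dir
    return cores


declared = list_cores(SOLR_XML_FRAGMENT)
print(declared)
```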
- But I don't like having to do that... why doesn't it load automatically?
- I sent a mail to the dspace-tech mailing list to ask about it
- Just for fun I tried to load these stats into a Solr 7.7.2 instance using the DSpace 7 solr config:
- First, create a Solr statistics core using the DSpace 7 config:
```
$ ./bin/solr create_core -c statistics -d ~/src/git/DSpace/dspace/solr/statistics/conf -p 8983
```
- Then try to import the stats, skipping a shitload of fields that are apparently added to our Solr statistics by Atmire modules:
```
$ ./run.sh -s http://localhost:8983/solr/statistics -a import -o ~/Downloads/statistics-2019-01.json -k uid -S author_mtdt,author_mtdt_search,iso_mtdt_search,iso_mtdt,subject_mtdt,subject_mtdt_search,containerCollection,containerCommunity,containerItem,countryCode_ngram,countryCode_search,cua_version,dateYear,dateYearMonth,geoipcountrycode,ip_ngram,ip_search,isArchived,isInternal,isWithdrawn,containerBitstream,file_id,referrer_ngram,referrer_search,userAgent_ngram,userAgent_search,version_id,complete_query,complete_query_search,filterquery,ngram_query_search,ngram_simplequery_search,simple_query,simple_query_search,range,rangeDescription,rangeDescription_ngram,rangeDescription_search,range_ngram,range_search,actingGroupId,actorMemberGroupId,bitstreamCount,solr_update_time_stamp,bitstreamId
```
- OK that imported! I wonder if it works... maybe I'll try another day
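The effect of skipping fields during the import above can be illustrated with a small sketch. This is not how `run.sh` implements `-S`; it only shows the idea of dropping named fields from each record before sending it to Solr. The sample record is made up, and `drop_fields` is a hypothetical helper name.

```python
def drop_fields(docs, skip):
    """Return copies of the documents without the fields named in skip."""
    skip = set(skip)
    return [{k: v for k, v in doc.items() if k not in skip} for doc in docs]


# Made-up record for illustration; real statistics documents have many more fields.
docs = [{"uid": "abc", "type": 2, "cua_version": "5.x", "isArchived": True}]

# A short subset of the skip list used in the notes.
cleaned = drop_fields(docs, ["cua_version", "isArchived", "dateYear"])
print(cleaned)
```

Fields named in the skip list but absent from a record are simply ignored, so one list can be reused across exports.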
<!-- vim: set sw=2 ts=2: --> <!-- vim: set sw=2 ts=2: -->


@@ -20,7 +20,7 @@ The code finally builds and runs with a fresh install
<meta property="og:type" content="article" /> <meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2020-02/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2020-02/" />
<meta property="article:published_time" content="2020-02-02T11:56:30+02:00" /> <meta property="article:published_time" content="2020-02-02T11:56:30+02:00" />
<meta property="article:modified_time" content="2020-02-06T10:01:17+02:00" /> <meta property="article:modified_time" content="2020-02-06T12:47:25+02:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="February, 2020"/> <meta name="twitter:title" content="February, 2020"/>
@@ -45,9 +45,9 @@ The code finally builds and runs with a fresh install
"@type": "BlogPosting", "@type": "BlogPosting",
"headline": "February, 2020", "headline": "February, 2020",
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2020-02\/", "url": "https:\/\/alanorth.github.io\/cgspace-notes\/2020-02\/",
"wordCount": "1926", "wordCount": "2069",
"datePublished": "2020-02-02T11:56:30+02:00", "datePublished": "2020-02-02T11:56:30+02:00",
"dateModified": "2020-02-06T10:01:17+02:00", "dateModified": "2020-02-06T12:47:25+02:00",
"author": { "author": {
"@type": "Person", "@type": "Person",
"name": "Alan Orth" "name": "Alan Orth"
@@ -392,10 +392,33 @@ $ ls -lh /tmp/statistics-2019-01.json
</code></pre><ul> </code></pre><ul>
<li>Then I tested importing this by creating a new core in my development environment:</li> <li>Then I tested importing this by creating a new core in my development environment:</li>
</ul> </ul>
<pre><code>$ curl 'http://localhost:8080/solr/admin/cores?action=CREATE&amp;name=statistics-2019&amp;instanceDir=/home/aorth/dspace63/solr/statistics&amp;dataDir=/home/aorth/dspace63/solr/statistics-2019/data' <pre><code>$ curl 'http://localhost:8080/solr/admin/cores?action=CREATE&amp;name=statistics-2019&amp;instanceDir=/home/aorth/dspace/solr/statistics&amp;dataDir=/home/aorth/dspace/solr/statistics-2019/data'
$ ./run.sh -s http://localhost:8080/solr/statistics-2019 -a import -o ~/Downloads/statistics-2019-01.json -k uid $ ./run.sh -s http://localhost:8080/solr/statistics-2019 -a import -o ~/Downloads/statistics-2019-01.json -k uid
</code></pre><ul> </code></pre><ul>
<li>This imports the records into the core, but DSpace can&rsquo;t see them, and when I restart Tomcat the core is not seen by Solr&hellip;</li> <li>This imports the records into the core, but DSpace can&rsquo;t see them, and when I restart Tomcat the core is not seen by Solr&hellip;</li>
<li>I got the core to load by adding it to <code>dspace/solr/solr.xml</code> manually, ie:</li>
</ul>
<pre><code> &lt;cores adminPath=&quot;/admin/cores&quot;&gt;
...
&lt;core name=&quot;statistics&quot; instanceDir=&quot;statistics&quot; /&gt;
&lt;core name=&quot;statistics-2019&quot; instanceDir=&quot;statistics&quot;&gt;
&lt;property name=&quot;dataDir&quot; value=&quot;/home/aorth/dspace/solr/statistics-2019/data&quot; /&gt;
&lt;/core&gt;
...
&lt;/cores&gt;
</code></pre><ul>
<li>But I don&rsquo;t like having to do that&hellip; why doesn&rsquo;t it load automatically?</li>
<li>I sent a mail to the dspace-tech mailing list to ask about it</li>
<li>Just for fun I tried to load these stats into a Solr 7.7.2 instance using the DSpace 7 solr config:</li>
<li>First, create a Solr statistics core using the DSpace 7 config:</li>
</ul>
<pre><code>$ ./bin/solr create_core -c statistics -d ~/src/git/DSpace/dspace/solr/statistics/conf -p 8983
</code></pre><ul>
<li>Then try to import the stats, skipping a shitload of fields that are apparently added to our Solr statistics by Atmire modules:</li>
</ul>
<pre><code>$ ./run.sh -s http://localhost:8983/solr/statistics -a import -o ~/Downloads/statistics-2019-01.json -k uid -S author_mtdt,author_mtdt_search,iso_mtdt_search,iso_mtdt,subject_mtdt,subject_mtdt_search,containerCollection,containerCommunity,containerItem,countryCode_ngram,countryCode_search,cua_version,dateYear,dateYearMonth,geoipcountrycode,ip_ngram,ip_search,isArchived,isInternal,isWithdrawn,containerBitstream,file_id,referrer_ngram,referrer_search,userAgent_ngram,userAgent_search,version_id,complete_query,complete_query_search,filterquery,ngram_query_search,ngram_simplequery_search,simple_query,simple_query_search,range,rangeDescription,rangeDescription_ngram,rangeDescription_search,range_ngram,range_search,actingGroupId,actorMemberGroupId,bitstreamCount,solr_update_time_stamp,bitstreamId
</code></pre><ul>
<li>OK that imported! I wonder if it works&hellip; maybe I&rsquo;ll try another day</li>
</ul> </ul>
<!-- raw HTML omitted --> <!-- raw HTML omitted -->


@@ -4,27 +4,27 @@
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc> <loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
<lastmod>2020-02-06T10:01:17+02:00</lastmod> <lastmod>2020-02-06T12:47:25+02:00</lastmod>
</url> </url>
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/</loc> <loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2020-02-06T10:01:17+02:00</lastmod> <lastmod>2020-02-06T12:47:25+02:00</lastmod>
</url> </url>
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/2020-02/</loc> <loc>https://alanorth.github.io/cgspace-notes/2020-02/</loc>
<lastmod>2020-02-06T10:01:17+02:00</lastmod> <lastmod>2020-02-06T12:47:25+02:00</lastmod>
</url> </url>
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc> <loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
<lastmod>2020-02-06T10:01:17+02:00</lastmod> <lastmod>2020-02-06T12:47:25+02:00</lastmod>
</url> </url>
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc> <loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2020-02-06T10:01:17+02:00</lastmod> <lastmod>2020-02-06T12:47:25+02:00</lastmod>
</url> </url>
<url> <url>