mirror of https://github.com/alanorth/cgspace-notes.git (synced 2024-11-22 06:35:03 +01:00)
Update notes for 2019-01-21
parent 5f4d3668a2
commit cec787c696
@ -704,5 +704,38 @@ print(results.hits)
```

- So I guess I need to figure out how to use join queries and maybe even switch to using raw Python requests with JSON
- This enumerates the list of Solr cores and returns JSON format:

```
http://localhost:3000/solr/admin/cores?action=STATUS&wt=json
```
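
- A minimal sketch of pulling the core names out of that response in Python. It assumes the JSON has a top-level `status` object keyed by core name; the `core_names` helper and the sample payload are made up for illustration:

```python
import json


def core_names(status_response):
    """Return the sorted core names from a parsed CoreAdmin STATUS response.

    Assumes a top-level "status" object keyed by core name.
    """
    return sorted(status_response.get("status", {}))


# Hypothetical, trimmed-down payload; a real response has many more fields
sample = json.loads('{"status": {"statistics-2018": {}, "statistics": {}}}')
print(core_names(sample))
```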

- I think I figured out how to search across shards: I needed to give the full URL of each of the other cores in the `shards` parameter
- Now I get more results when I start adding the other statistics cores:

```
$ http 'http://localhost:3000/solr/statistics/select?&indent=on&rows=0&q=*:*' | grep numFound
<result name="response" numFound="2061320" start="0">
$ http 'http://localhost:3000/solr/statistics/select?&shards=localhost:8081/solr/statistics-2018&indent=on&rows=0&q=*:*' | grep numFound
<result name="response" numFound="16280292" start="0" maxScore="1.0">
$ http 'http://localhost:3000/solr/statistics/select?&shards=localhost:8081/solr/statistics-2018,localhost:8081/solr/statistics-2017&indent=on&rows=0&q=*:*' | grep numFound
<result name="response" numFound="25606142" start="0" maxScore="1.0">
$ http 'http://localhost:3000/solr/statistics/select?&shards=localhost:8081/solr/statistics-2018,localhost:8081/solr/statistics-2017,localhost:8081/solr/statistics-2016&indent=on&rows=0&q=*:*' | grep numFound
<result name="response" numFound="31532212" start="0" maxScore="1.0">
```

- I should be able to modify the dspace-statistics-api to check the shards via the Solr core status, then add the `shards` parameter to each query to make the search distributed among the cores
- I implemented a proof of concept to query the Solr STATUS for active cores and to add them with a `shards` query string
- A few things I noticed:
    - Solr doesn't mind if you use an empty `shards` parameter
    - Solr doesn't mind if you have an extra comma at the end of the `shards` parameter
    - If you are searching multiple cores, you need to include the base core in the `shards` parameter as well
    - For example, compare the following two queries, first including the base core and the shard in the `shards` parameter, and then only including the shard:

```
$ http 'http://localhost:8081/solr/statistics/select?indent=on&rows=0&q=type:2+id:11576&fq=isBot:false&fq=statistics_type:view&shards=localhost:8081/solr/statistics,localhost:8081/solr/statistics-2018' | grep numFound
<result name="response" numFound="275" start="0" maxScore="12.205825">
$ http 'http://localhost:8081/solr/statistics/select?indent=on&rows=0&q=type:2+id:11576&fq=isBot:false&fq=statistics_type:view&shards=localhost:8081/solr/statistics-2018' | grep numFound
<result name="response" numFound="241" start="0" maxScore="12.205825">
```
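
- A rough sketch of how the `shards` value could be assembled so that the base core is always included. The function name `build_shards`, the `statistics-` prefix filter, and the host value are assumptions for illustration, not necessarily what dspace-statistics-api does:

```python
def build_shards(host, cores, base_core="statistics"):
    """Build a comma-separated value for Solr's `shards` parameter."""
    # Always include the base core, otherwise its hits are missing
    wanted = {base_core}
    # Keep only yearly cores such as statistics-2018 (assumed naming scheme)
    wanted.update(c for c in cores if c.startswith(base_core + "-"))
    return ",".join(f"{host}/solr/{name}" for name in sorted(wanted))


# Hypothetical list of core names, e.g. taken from the core STATUS response
print(build_shards("localhost:8081", ["search", "statistics", "statistics-2018"]))
```

- With those core names this produces the same `shards` value as the first of the two queries above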

<!-- vim: set sw=2 ts=2: -->
@ -27,7 +27,7 @@ I don’t see anything interesting in the web server logs around that time t
" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2019-01/" /><meta property="article:published_time" content="2019-01-02T09:48:30+02:00"/>
<meta property="article:modified_time" content="2019-01-21T12:54:29+02:00"/>
<meta property="article:modified_time" content="2019-01-21T14:16:56+02:00"/>

<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="January, 2019"/>
@ -60,9 +60,9 @@ I don’t see anything interesting in the web server logs around that time t
"@type": "BlogPosting",
"headline": "January, 2019",
"url": "https://alanorth.github.io/cgspace-notes/2019-01/",
"wordCount": "3266",
"wordCount": "3507",
"datePublished": "2019-01-02T09:48:30+02:00",
"dateModified": "2019-01-21T12:54:29+02:00",
"dateModified": "2019-01-21T14:16:56+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -957,8 +957,45 @@ print(results.hits)

<ul>
<li>So I guess I need to figure out how to use join queries and maybe even switch to using raw Python requests with JSON</li>
<li>This enumerates the list of Solr cores and returns JSON format:</li>
</ul>

<pre><code>http://localhost:3000/solr/admin/cores?action=STATUS&wt=json
</code></pre>

<ul>
<li>I think I figured out how to search across shards: I needed to give the full URL of each of the other cores in the <code>shards</code> parameter</li>
<li>Now I get more results when I start adding the other statistics cores:</li>
</ul>

<pre><code>$ http 'http://localhost:3000/solr/statistics/select?&indent=on&rows=0&q=*:*' | grep numFound
<result name="response" numFound="2061320" start="0">
$ http 'http://localhost:3000/solr/statistics/select?&shards=localhost:8081/solr/statistics-2018&indent=on&rows=0&q=*:*' | grep numFound
<result name="response" numFound="16280292" start="0" maxScore="1.0">
$ http 'http://localhost:3000/solr/statistics/select?&shards=localhost:8081/solr/statistics-2018,localhost:8081/solr/statistics-2017&indent=on&rows=0&q=*:*' | grep numFound
<result name="response" numFound="25606142" start="0" maxScore="1.0">
$ http 'http://localhost:3000/solr/statistics/select?&shards=localhost:8081/solr/statistics-2018,localhost:8081/solr/statistics-2017,localhost:8081/solr/statistics-2016&indent=on&rows=0&q=*:*' | grep numFound
<result name="response" numFound="31532212" start="0" maxScore="1.0">
</code></pre>

<ul>
<li>I should be able to modify the dspace-statistics-api to check the shards via the Solr core status, then add the <code>shards</code> parameter to each query to make the search distributed among the cores</li>
<li>I implemented a proof of concept to query the Solr STATUS for active cores and to add them with a <code>shards</code> query string</li>
<li>A few things I noticed:

<ul>
<li>Solr doesn’t mind if you use an empty <code>shards</code> parameter</li>
<li>Solr doesn’t mind if you have an extra comma at the end of the <code>shards</code> parameter</li>
<li>If you are searching multiple cores, you need to include the base core in the <code>shards</code> parameter as well</li>
<li>For example, compare the following two queries, first including the base core and the shard in the <code>shards</code> parameter, and then only including the shard:</li>
</ul></li>
</ul>

<pre><code>$ http 'http://localhost:8081/solr/statistics/select?indent=on&rows=0&q=type:2+id:11576&fq=isBot:false&fq=statistics_type:view&shards=localhost:8081/solr/statistics,localhost:8081/solr/statistics-2018' | grep numFound
<result name="response" numFound="275" start="0" maxScore="12.205825">
$ http 'http://localhost:8081/solr/statistics/select?indent=on&rows=0&q=type:2+id:11576&fq=isBot:false&fq=statistics_type:view&shards=localhost:8081/solr/statistics-2018' | grep numFound
<result name="response" numFound="241" start="0" maxScore="12.205825">
</code></pre>

<!-- vim: set sw=2 ts=2: -->
@ -4,7 +4,7 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/2019-01/</loc>
<lastmod>2019-01-21T12:54:29+02:00</lastmod>
<lastmod>2019-01-21T14:16:56+02:00</lastmod>
</url>

<url>
@ -204,7 +204,7 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2019-01-21T12:54:29+02:00</lastmod>
<lastmod>2019-01-21T14:16:56+02:00</lastmod>
<priority>0</priority>
</url>
@ -215,7 +215,7 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
<lastmod>2019-01-21T12:54:29+02:00</lastmod>
<lastmod>2019-01-21T14:16:56+02:00</lastmod>
<priority>0</priority>
</url>
@ -227,13 +227,13 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2019-01-21T12:54:29+02:00</lastmod>
<lastmod>2019-01-21T14:16:56+02:00</lastmod>
<priority>0</priority>
</url>

<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/</loc>
<lastmod>2019-01-21T12:54:29+02:00</lastmod>
<lastmod>2019-01-21T14:16:56+02:00</lastmod>
<priority>0</priority>
</url>