Update notes for 2019-01-21

2019-01-21 23:54:39 +02:00
parent 5f4d3668a2
commit cec787c696
3 changed files with 78 additions and 8 deletions

@@ -704,5 +704,38 @@ print(results.hits)
```
- So I guess I need to figure out how to use join queries and maybe even switch to using raw Python requests with JSON
- This enumerates the list of Solr cores and returns JSON format:
```
http://localhost:3000/solr/admin/cores?action=STATUS&wt=json
```
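- A minimal sketch of reading the core names from that response in Python, assuming the JSON body has a top-level `status` object keyed by core name (the helper names `core_names` and `fetch_core_names` are hypothetical, not part of dspace-statistics-api):

```python
import json
from urllib.request import urlopen

# Base URL taken from the notes above; adjust for your Solr instance.
SOLR_URL = "http://localhost:3000/solr"


def core_names(status_response):
    """Return the sorted core names from a decoded STATUS response.

    Assumes the response has a top-level "status" object whose keys
    are the core names.
    """
    return sorted(status_response.get("status", {}))


def fetch_core_names(solr_url=SOLR_URL):
    """Ask Solr for its core status and return the core names."""
    with urlopen(f"{solr_url}/admin/cores?action=STATUS&wt=json") as response:
        return core_names(json.load(response))
```

- Keeping the parsing separate from the HTTP request makes it possible to check the logic against a saved response without a running Solr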
- I think I figured out how to search across shards: I needed to give the full URL of each of the other cores in the `shards` parameter
- Now I get more results when I start adding the other statistics cores:
```
$ http 'http://localhost:3000/solr/statistics/select?&indent=on&rows=0&q=*:*' | grep numFound
<result name="response" numFound="2061320" start="0">
$ http 'http://localhost:3000/solr/statistics/select?&shards=localhost:8081/solr/statistics-2018&indent=on&rows=0&q=*:*' | grep numFound
<result name="response" numFound="16280292" start="0" maxScore="1.0">
$ http 'http://localhost:3000/solr/statistics/select?&shards=localhost:8081/solr/statistics-2018,localhost:8081/solr/statistics-2017&indent=on&rows=0&q=*:*' | grep numFound
<result name="response" numFound="25606142" start="0" maxScore="1.0">
$ http 'http://localhost:3000/solr/statistics/select?&shards=localhost:8081/solr/statistics-2018,localhost:8081/solr/statistics-2017,localhost:8081/solr/statistics-2016&indent=on&rows=0&q=*:*' | grep numFound
<result name="response" numFound="31532212" start="0" maxScore="1.0">
```
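- The same kind of query could be built from Python, roughly like this (a sketch only: `select_url` and `num_found` are hypothetical helpers, and it assumes `wt=json` so the count can be read from `response.numFound` instead of grepping the XML):

```python
from urllib.parse import urlencode


def select_url(core_url, shards=(), query="*:*"):
    """Build a select URL that only asks for the hit count.

    shards is a sequence of host:port/solr/core strings like the ones
    used in the commands above; it is left out entirely when empty.
    """
    # rows=0 because only numFound is needed, not the documents
    params = {"q": query, "rows": 0, "wt": "json"}
    if shards:
        params["shards"] = ",".join(shards)
    # Keep ':', '/', ',' and '*' readable instead of percent-encoding them
    return f"{core_url}/select?{urlencode(params, safe=':/,*')}"


def num_found(select_response):
    """Read the hit count from a decoded JSON select response."""
    return select_response["response"]["numFound"]
```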
- I should be able to modify the dspace-statistics-api to check the shards via the Solr core status, then add the `shards` parameter to each query to make the search distributed among the cores
- I implemented a proof of concept to query the Solr STATUS for active cores and to add them with a `shards` query string
- A few things I noticed:
  - Solr doesn't mind if you use an empty `shards` parameter
  - Solr doesn't mind if you have an extra comma at the end of the `shards` parameter
  - If you are searching multiple cores, you need to include the base core in the `shards` parameter as well
- For example, compare the following two queries, first including the base core and the shard in the `shards` parameter, and then only including the shard:
```
$ http 'http://localhost:8081/solr/statistics/select?indent=on&rows=0&q=type:2+id:11576&fq=isBot:false&fq=statistics_type:view&shards=localhost:8081/solr/statistics,localhost:8081/solr/statistics-2018' | grep numFound
<result name="response" numFound="275" start="0" maxScore="12.205825">
$ http 'http://localhost:8081/solr/statistics/select?indent=on&rows=0&q=type:2+id:11576&fq=isBot:false&fq=statistics_type:view&shards=localhost:8081/solr/statistics-2018' | grep numFound
<result name="response" numFound="241" start="0" maxScore="12.205825">
```
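- A rough sketch of how the `shards` value could be assembled so that the base core is never left out (this is not the actual proof of concept; `shards_param` and its arguments are hypothetical names):

```python
def shards_param(solr_host, cores, base_core="statistics"):
    """Build a shards value that always lists the base core first.

    solr_host is a host:port string such as "localhost:8081" and cores
    is a list of core names, for example from the Solr core STATUS.
    """
    # Put the base core first and drop any duplicate of it from the list
    ordered = [base_core] + [core for core in cores if core != base_core]
    return ",".join(f"{solr_host}/solr/{core}" for core in ordered)
```

- With `cores=["statistics-2018"]` this yields the `shards` value from the first query above, which is the one that returned 275 hits rather than 241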
<!-- vim: set sw=2 ts=2: -->