mirror of https://github.com/alanorth/cgspace-notes.git (synced 2024-11-26 08:28:18 +01:00)
Update notes for 2018-09-17
parent 4cfa9aa101
commit 817f470888
@ -294,5 +294,45 @@ https://cgspace.cgiar.org/rest/statlets?handle=10568/97103
- Check if it's possible to have items deposited via REST use a workflow so we can perhaps tell ICARDA to use that from MEL
- Agree that we'll publicize AReS explorer on the week before the Big Data Platform workshop
- Put a link and/or picture on the CGSpace homepage saying "Visualized CGSpace research" or something, and post a message on Yammer
- I want to explore creating a thin API to make the item view and download stats available from Solr so CodeObia can use them in the AReS explorer
- Currently CodeObia is exploring using the Atmire statlets internal API, but I don't really like that...
- There are some example queries on the [DSpace Solr wiki](https://wiki.duraspace.org/display/DSPACE/Solr)
- For example, this query reports 1655 hits for item [10568/10630](https://cgspace.cgiar.org/handle/10568/10630):

```
$ http 'http://localhost:3000/solr/statistics/select?indent=on&rows=0&q=type:0+owningItem:11576&fq=isBot:false'
```

- The id in the Solr query is the item's database id (get it from the REST API or something)
- Next, I adapted a query to get the downloads and it shows 889, which is similar to the number Atmire's statlet shows, though the query logic here is confusing:

```
$ http 'http://localhost:3000/solr/statistics/select?indent=on&rows=0&q=type:0+owningItem:11576&fq=isBot:false&fq=-(bundleName:[*+TO+*]-bundleName:ORIGINAL)&fq=-(statistics_type:[*+TO+*]+-statistics_type:view)'
```

- According to the [SolrQuerySyntax](https://wiki.apache.org/solr/SolrQuerySyntax) page on the Apache wiki, the `[* TO *]` syntax just selects a range (in this case all values for a field)
- So it seems to be:
  - `type:0` is for bitstreams according to the DSpace Solr documentation
  - `-(bundleName:[*+TO+*]-bundleName:ORIGINAL)` seems to be a [negative query starting with all documents](https://wiki.apache.org/solr/NegativeQueryProblems), subtracting those with `bundleName:ORIGINAL`, and then negating the whole thing... meaning only documents from `bundleName:ORIGINAL`?
- What the shit, I think I'm right: the simplified logic in *this* query returns the same 889:

```
$ http 'http://localhost:3000/solr/statistics/select?indent=on&rows=0&q=type:0+owningItem:11576&fq=isBot:false&fq=bundleName:ORIGINAL&fq=-(statistics_type:[*+TO+*]+-statistics_type:view)'
```
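
- To sanity-check that reading of the negated filter, here is a toy model in Python. It is not Solr itself: the four sample documents are made up for illustration, and a missing key stands for a record with no `bundleName`. Under those assumptions the negated form looks slightly broader than plain `bundleName:ORIGINAL`, because it should also keep records that have no `bundleName` at all. The two queries agreeing on 889 would then just mean this item has no such records among its hits:

```python
# Toy model only: each dict stands for one Solr statistics record.
# A missing "bundleName" key models a record with no value for that field.
docs = [
    {"id": 1, "bundleName": "ORIGINAL"},
    {"id": 2, "bundleName": "THUMBNAIL"},
    {"id": 3, "bundleName": "ORIGINAL"},
    {"id": 4},  # hypothetical record without any bundleName
]


def negated_filter(doc):
    """Models -(bundleName:[* TO *] -bundleName:ORIGINAL)."""
    # Inner clause: the field has some value, and that value is not ORIGINAL.
    inner = "bundleName" in doc and doc["bundleName"] != "ORIGINAL"
    # Outer minus sign: keep everything the inner clause did NOT match.
    return not inner


def simple_filter(doc):
    """Models the simplified bundleName:ORIGINAL."""
    return doc.get("bundleName") == "ORIGINAL"


negated_ids = [d["id"] for d in docs if negated_filter(d)]
simple_ids = [d["id"] for d in docs if simple_filter(d)]

print("negated:", negated_ids)
print("simple: ", simple_ids)
```

- The two lists differ only by the record without a `bundleName`, so the simplification is safe whenever every matching record has that field set.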

- And if I simplify the `statistics_type` logic the same way, it still returns the same 889!

```
$ http 'http://localhost:3000/solr/statistics/select?indent=on&rows=0&q=type:0+owningItem:11576&fq=isBot:false&fq=bundleName:ORIGINAL&fq=statistics_type:view'
```

- As for item views, I suppose that's just the same query, minus the `bundleName:ORIGINAL`:

```
$ http 'http://localhost:3000/solr/statistics/select?indent=on&rows=0&q=type:0+owningItem:11576&fq=isBot:false&fq=-bundleName:ORIGINAL&fq=statistics_type:view'
```

- That one returns 766, which is exactly 1655 minus 889...
- Also, Solr's `fq` is similar to the regular `q` query parameter, but each filter is cached separately in Solr's filter cache, so it should be faster when several queries reuse the same filters
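
- A minimal sketch of how a thin API might assemble these queries, in Python with only the standard library. The function name `stats_query_url` and its `kind` argument are hypothetical; the Solr URL, item id and filters are copied from the commands above. `urlencode` percent-encodes the colons, which is ordinary URL encoding and decodes back to the same query. One caveat: `q` still says `type:0`, so the "views" variant counts bitstream hits outside the `ORIGINAL` bundle rather than hits on the item page itself, which may or may not be what is wanted:

```python
from urllib.parse import urlencode

# Base URL taken from the notes; adjust for a real deployment.
SOLR_SELECT = "http://localhost:3000/solr/statistics/select"


def stats_query_url(item_id, kind):
    """Build a count-only Solr URL for one item (hypothetical helper).

    kind is "downloads" or "views", mirroring the two queries in the notes.
    """
    filters = ["isBot:false", "statistics_type:view"]
    if kind == "downloads":
        filters.append("bundleName:ORIGINAL")
    elif kind == "views":
        filters.append("-bundleName:ORIGINAL")
    else:
        raise ValueError(f"unknown kind: {kind!r}")

    # rows=0 asks Solr for the hit count only, not the records themselves.
    params = [
        ("indent", "on"),
        ("rows", "0"),
        ("q", f"type:0 owningItem:{item_id}"),
    ]
    # Each filter becomes its own fq parameter, as in the original commands.
    params += [("fq", f) for f in filters]
    return SOLR_SELECT + "?" + urlencode(params)


if __name__ == "__main__":
    print(stats_query_url(11576, "downloads"))
    print(stats_query_url(11576, "views"))
```

- Keeping every filter in its own `fq` follows the commands above and lets the shared `isBot:false` and `statistics_type:view` filters be reused across both kinds of query.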

<!-- vim: set sw=2 ts=2: -->
@ -18,7 +18,7 @@ I’m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I
" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2018-09/" /><meta property="article:published_time" content="2018-09-02T09:55:54+03:00"/>
<meta property="article:modified_time" content="2018-09-17T17:34:48+03:00"/>
<meta property="article:modified_time" content="2018-09-17T19:53:08+03:00"/>
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="September, 2018"/>
<meta name="twitter:description" content="2018-09-02
@ -41,9 +41,9 @@ I’m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I
"@type": "BlogPosting",
"headline": "September, 2018",
"url": "https://alanorth.github.io/cgspace-notes/2018-09/",
"wordCount": "2107",
"wordCount": "2386",
"datePublished": "2018-09-02T09:55:54+03:00",
"dateModified": "2018-09-17T17:34:48+03:00",
"dateModified": "2018-09-17T19:53:08+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -440,6 +440,54 @@ $ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=50.116.102.77' dspace.log.2018-09-
<ul>
<li>Put a link and/or picture on the CGSpace homepage saying “Visualized CGSpace research” or something, and post a message on Yammer</li>
</ul></li>
<li>I want to explore creating a thin API to make the item view and download stats available from Solr so CodeObia can use them in the AReS explorer</li>
<li>Currently CodeObia is exploring using the Atmire statlets internal API, but I don’t really like that…</li>
<li>There are some example queries on the <a href="https://wiki.duraspace.org/display/DSPACE/Solr">DSpace Solr wiki</a></li>
<li>For example, this query reports 1655 hits for item <a href="https://cgspace.cgiar.org/handle/10568/10630"><sup>10568</sup>⁄<sub>10630</sub></a>:</li>
</ul>

<pre><code>$ http 'http://localhost:3000/solr/statistics/select?indent=on&rows=0&q=type:0+owningItem:11576&fq=isBot:false'
</code></pre>

<ul>
<li>The id in the Solr query is the item’s database id (get it from the REST API or something)</li>
<li>Next, I adapted a query to get the downloads and it shows 889, which is similar to the number Atmire’s statlet shows, though the query logic here is confusing:</li>
</ul>

<pre><code>$ http 'http://localhost:3000/solr/statistics/select?indent=on&rows=0&q=type:0+owningItem:11576&fq=isBot:false&fq=-(bundleName:[*+TO+*]-bundleName:ORIGINAL)&fq=-(statistics_type:[*+TO+*]+-statistics_type:view)'
</code></pre>

<ul>
<li>According to the <a href="https://wiki.apache.org/solr/SolrQuerySyntax">SolrQuerySyntax</a> page on the Apache wiki, the <code>[* TO *]</code> syntax just selects a range (in this case all values for a field)</li>
<li>So it seems to be:

<ul>
<li><code>type:0</code> is for bitstreams according to the DSpace Solr documentation</li>
<li><code>-(bundleName:[*+TO+*]-bundleName:ORIGINAL)</code> seems to be a <a href="https://wiki.apache.org/solr/NegativeQueryProblems">negative query starting with all documents</a>, subtracting those with <code>bundleName:ORIGINAL</code>, and then negating the whole thing… meaning only documents from <code>bundleName:ORIGINAL</code>?</li>
</ul></li>
<li>What the shit, I think I’m right: the simplified logic in <em>this</em> query returns the same 889:</li>
</ul>

<pre><code>$ http 'http://localhost:3000/solr/statistics/select?indent=on&rows=0&q=type:0+owningItem:11576&fq=isBot:false&fq=bundleName:ORIGINAL&fq=-(statistics_type:[*+TO+*]+-statistics_type:view)'
</code></pre>

<ul>
<li>And if I simplify the <code>statistics_type</code> logic the same way, it still returns the same 889!</li>
</ul>

<pre><code>$ http 'http://localhost:3000/solr/statistics/select?indent=on&rows=0&q=type:0+owningItem:11576&fq=isBot:false&fq=bundleName:ORIGINAL&fq=statistics_type:view'
</code></pre>

<ul>
<li>As for item views, I suppose that’s just the same query, minus the <code>bundleName:ORIGINAL</code>:</li>
</ul>

<pre><code>$ http 'http://localhost:3000/solr/statistics/select?indent=on&rows=0&q=type:0+owningItem:11576&fq=isBot:false&fq=-bundleName:ORIGINAL&fq=statistics_type:view'
</code></pre>

<ul>
<li>That one returns 766, which is exactly 1655 minus 889…</li>
<li>Also, Solr’s <code>fq</code> is similar to the regular <code>q</code> query parameter, but each filter is cached separately in Solr’s filter cache, so it should be faster when several queries reuse the same filters</li>
</ul>

<!-- vim: set sw=2 ts=2: -->
@ -4,7 +4,7 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/2018-09/</loc>
<lastmod>2018-09-17T17:34:48+03:00</lastmod>
<lastmod>2018-09-17T19:53:08+03:00</lastmod>
</url>

<url>
@ -184,7 +184,7 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2018-09-17T17:34:48+03:00</lastmod>
<lastmod>2018-09-17T19:53:08+03:00</lastmod>
<priority>0</priority>
</url>

@ -195,7 +195,7 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
<lastmod>2018-09-17T17:34:48+03:00</lastmod>
<lastmod>2018-09-17T19:53:08+03:00</lastmod>
<priority>0</priority>
</url>

@ -207,13 +207,13 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2018-09-17T17:34:48+03:00</lastmod>
<lastmod>2018-09-17T19:53:08+03:00</lastmod>
<priority>0</priority>
</url>

<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/</loc>
<lastmod>2018-09-17T17:34:48+03:00</lastmod>
<lastmod>2018-09-17T19:53:08+03:00</lastmod>
<priority>0</priority>
</url>