Add notes for 2018-09-18

Alan Orth 2018-09-18 15:52:20 +03:00
parent 817f470888
commit d41afc28d4
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
3 changed files with 62 additions and 8 deletions

@ -335,4 +335,29 @@ $ http 'http://localhost:3000/solr/statistics/select?indent=on&rows=0&q=type:0+o
- That one returns 766, which is exactly 1655 minus 889...
- Also, Solr's `fq` is similar to the regular `q` query parameter, but its results are cached independently in Solr's filter cache, so it should be faster when the same filter is reused across multiple queries
## 2018-09-18
- I managed to create a simple proof of concept REST API to expose item view and download statistics: [cgspace-statistics-api](https://github.com/alanorth/cgspace-statistics-api)
- It uses the Python-based [Falcon](https://falcon.readthedocs.io) web framework and talks to Solr directly using the [SolrClient](https://github.com/moonlitesolutions/SolrClient) library (which seems to have issues in Python 3.7 currently)
- After deploying on DSpace Test I can then get the stats for an item using its ID:
```
$ http -b 'https://dspacetest.cgiar.org/rest/statistics/item?id=110988'
{
"downloads": 2,
"id": 110988,
"views": 15
}
```
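- A minimal sketch of how such an endpoint might build its Solr queries, using only the Python standard library. The field names (`type:2` for items, `type:0` with `owningItem` for bitstreams, `isBot`, `statistics_type`) follow the usual DSpace statistics schema but are assumptions here, as are the helper names `stats_query_url` and `count_from_response`; this is not necessarily what cgspace-statistics-api does:

```python
from urllib.parse import urlencode

# Assumed Solr endpoint: the statistics core queried earlier in these notes
SOLR_SELECT = "http://localhost:3000/solr/statistics/select"


def stats_query_url(kind, item_id):
    """Build a Solr URL that counts views or downloads for one item.

    The field names are assumptions based on the usual DSpace
    statistics schema: type:2 = item, type:0 = bitstream.
    """
    if kind == "views":
        q = f"type:2 AND id:{item_id}"
    elif kind == "downloads":
        q = f"type:0 AND owningItem:{item_id}"
    else:
        raise ValueError(f"unknown kind: {kind}")

    params = {
        "q": q,
        # Filter query, cached separately from the main query
        "fq": "isBot:false AND statistics_type:view",
        "rows": 0,  # only the hit count is needed, not the documents
        "wt": "json",
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"


def count_from_response(response):
    """Extract the hit count from a decoded Solr JSON response."""
    return response["response"]["numFound"]
```

- Fetching `stats_query_url("views", 110988)` and passing the decoded JSON to `count_from_response` would give the number for the `views` field; the same pattern with `"downloads"` would give `downloads`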
- The numbers differ from those in Atmire's statlets for some reason, but since I'm querying Solr directly, I have no idea where their numbers come from!
- Moayad from CodeObia asked if I could make the API paginate over all items, for example: /statistics?limit=100&page=1
- Getting all the item IDs from PostgreSQL is certainly easy:
```
dspace=# select item_id from item where in_archive is True and withdrawn is False and discoverable is True;
```
- The rest of the Falcon tooling will be more difficult...
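- A rough sketch of how `limit` and `page` could map onto the PostgreSQL query above. It assumes pages are numbered from zero (so `page=1` is the second page) and uses `%s` placeholders in the style of a driver like psycopg2; both are assumptions, and the helper names are hypothetical:

```python
def page_bounds(limit, page):
    """Return (limit, offset) for a zero-based page number."""
    if limit < 1 or page < 0:
        raise ValueError("limit must be >= 1 and page must be >= 0")
    return limit, limit * page


def paginated_item_query(limit, page):
    """Return SQL text and parameters for one page of item IDs."""
    limit, offset = page_bounds(limit, page)
    sql = (
        "SELECT item_id FROM item "
        "WHERE in_archive IS TRUE AND withdrawn IS FALSE "
        "AND discoverable IS TRUE "
        # A stable ordering keeps pages from overlapping between requests
        "ORDER BY item_id LIMIT %s OFFSET %s"
    )
    return sql, (limit, offset)
```

- The `ORDER BY` matters here: without a fixed ordering PostgreSQL is free to return rows in a different order on each request, so consecutive pages could repeat or skip items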
<!-- vim: set sw=2 ts=2: -->

@ -18,7 +18,7 @@ I&rsquo;m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I
" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2018-09/" /><meta property="article:published_time" content="2018-09-02T09:55:54&#43;03:00"/>
<meta property="article:modified_time" content="2018-09-17T19:53:08&#43;03:00"/>
<meta property="article:modified_time" content="2018-09-18T01:16:21&#43;03:00"/>
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="September, 2018"/>
<meta name="twitter:description" content="2018-09-02
@ -41,9 +41,9 @@ I&rsquo;m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I
"@type": "BlogPosting",
"headline": "September, 2018",
"url": "https://alanorth.github.io/cgspace-notes/2018-09/",
"wordCount": "2386",
"wordCount": "2546",
"datePublished": "2018-09-02T09:55:54&#43;03:00",
"dateModified": "2018-09-17T19:53:08&#43;03:00",
"dateModified": "2018-09-18T01:16:21&#43;03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -490,6 +490,35 @@ $ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=50.116.102.77' dspace.log.2018-09-
<li>Also, Solr&rsquo;s <code>fq</code> is similar to the regular <code>q</code> query parameter, but its results are cached independently in Solr&rsquo;s filter cache, so it should be faster when the same filter is reused across multiple queries</li>
</ul>
<h2 id="2018-09-18">2018-09-18</h2>
<ul>
<li>I managed to create a simple proof of concept REST API to expose item view and download statistics: <a href="https://github.com/alanorth/cgspace-statistics-api">cgspace-statistics-api</a></li>
<li>It uses the Python-based <a href="https://falcon.readthedocs.io">Falcon</a> web framework and talks to Solr directly using the <a href="https://github.com/moonlitesolutions/SolrClient">SolrClient</a> library (which seems to have issues in Python 3.7 currently)</li>
<li>After deploying on DSpace Test I can then get the stats for an item using its ID:</li>
</ul>
<pre><code>$ http -b 'https://dspacetest.cgiar.org/rest/statistics/item?id=110988'
{
&quot;downloads&quot;: 2,
&quot;id&quot;: 110988,
&quot;views&quot;: 15
}
</code></pre>
<ul>
<li>The numbers differ from those in Atmire&rsquo;s statlets for some reason, but since I&rsquo;m querying Solr directly, I have no idea where their numbers come from!</li>
<li>Moayad from CodeObia asked if I could make the API paginate over all items, for example: /statistics?limit=100&amp;page=1</li>
<li>Getting all the item IDs from PostgreSQL is certainly easy:</li>
</ul>
<pre><code>dspace=# select item_id from item where in_archive is True and withdrawn is False and discoverable is True;
</code></pre>
<ul>
<li>The rest of the Falcon tooling will be more difficult&hellip;</li>
</ul>
<!-- vim: set sw=2 ts=2: -->

@ -4,7 +4,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/2018-09/</loc>
<lastmod>2018-09-17T19:53:08+03:00</lastmod>
<lastmod>2018-09-18T01:16:21+03:00</lastmod>
</url>
<url>
@ -184,7 +184,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2018-09-17T19:53:08+03:00</lastmod>
<lastmod>2018-09-18T01:16:21+03:00</lastmod>
<priority>0</priority>
</url>
@ -195,7 +195,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
<lastmod>2018-09-17T19:53:08+03:00</lastmod>
<lastmod>2018-09-18T01:16:21+03:00</lastmod>
<priority>0</priority>
</url>
@ -207,13 +207,13 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2018-09-17T19:53:08+03:00</lastmod>
<lastmod>2018-09-18T01:16:21+03:00</lastmod>
<priority>0</priority>
</url>
<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/</loc>
<lastmod>2018-09-17T19:53:08+03:00</lastmod>
<lastmod>2018-09-18T01:16:21+03:00</lastmod>
<priority>0</priority>
</url>