Add notes for 2018-10-04

Alan Orth 2018-10-04 11:05:59 +03:00
parent 0bec73dbaa
commit ecdc130a62
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
3 changed files with 44 additions and 8 deletions


@@ -115,4 +115,20 @@ dc.contributor.author,cg.creator.id
"Thornton, Phillip K.",Philip Thornton: 0000-0002-1854-0182
```
## 2018-10-04
- Salem raised an issue that the dspace-statistics-api reports downloads for some items that have no bitstreams (like many limited access items)
- Every item has at least a `LICENSE` bundle, and some have a `THUMBNAIL` bundle, but the indexing code is specifically checking for downloads from the `ORIGINAL` bundle
- [10568/97460](https://cgspace.cgiar.org/handle/10568/97460) (100550): has a thumbnail bitstream
- [10568/96112](https://cgspace.cgiar.org/handle/10568/96112) (96736): has only a LICENSE bitstream
- I see there are other bundles we might need to pay attention to: `TEXT`, `@_LOGO-COLLECTION_@`, `@_LOGO-COMMUNITY_@`, etc...
- On a hunch I dropped the statistics table and re-indexed and now those two items above have no downloads
- So it's fixed, but I'm not sure why!
- Peter wants to know the number of API requests per month, which was about 250,000 in September (excluding statlet requests):
```
# zcat --force /var/log/nginx/{oai,rest}.log* | grep -E 'Sep/2018' | grep -c -v 'statlets'
251226
```
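
The same filter can be sketched in Python to tally every month in one pass instead of grepping for one month at a time. This is only a rough sketch under assumptions: the function name `count_requests_per_month`, the sample log lines, and the `/rest/statlets` path are all hypothetical, and the regex assumes nginx's default bracketed timestamp. Unlike the `grep -E 'Sep/2018'` above, which matches the string anywhere in the line, this only looks at the timestamp.

```python
import re
from collections import Counter

# Assumes nginx's default timestamp, e.g. [30/Sep/2018:23:59:58 +0300],
# and captures the "Sep/2018" part of it.
TIMESTAMP_MONTH = re.compile(r"\[\d{2}/([A-Z][a-z]{2}/\d{4}):")


def count_requests_per_month(lines, exclude="statlets"):
    """Count log lines per month, skipping any line that contains `exclude`."""
    counts = Counter()
    for line in lines:
        if exclude in line:
            continue  # same effect as `grep -v 'statlets'`
        match = TIMESTAMP_MONTH.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines, not taken from the real logs
sample = [
    '192.0.2.1 - - [30/Sep/2018:23:59:58 +0300] "GET /rest/items HTTP/1.1" 200 512',
    '192.0.2.2 - - [30/Sep/2018:23:59:59 +0300] "GET /rest/statlets HTTP/1.1" 200 128',
    '192.0.2.1 - - [01/Oct/2018:00:00:01 +0300] "GET /oai/request HTTP/1.1" 200 2048',
]

print(count_requests_per_month(sample))
```

For the real logs the lines could come from `gzip.open(path, "rt")` for the rotated `.gz` files and plain `open(path)` for the current ones, which is roughly what `zcat --force` does.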
<!-- vim: set sw=2 ts=2: -->


@@ -9,7 +9,7 @@
<meta property="og:description" content="2018-10-01 Phil Thornton got an ORCID identifier so we need to add it to the list on CGSpace and tag his existing items I created a GitHub issue to track this #389, because I&rsquo;m super busy in Nairobi right now 2018-10-03 I see Moayad was busy collecting item views and downloads from CGSpace yesterday: # zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;02/Oct/2018&quot; | awk &#39;{print $1} &#39; | sort | uniq -c | sort -n | tail -n 10 933 40." />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2018-10/" /><meta property="article:published_time" content="2018-10-01T22:31:54&#43;03:00"/>
-<meta property="article:modified_time" content="2018-10-03T21:52:12&#43;03:00"/>
+<meta property="article:modified_time" content="2018-10-03T22:35:11&#43;03:00"/>
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="October, 2018"/>
@@ -24,9 +24,9 @@
"@type": "BlogPosting",
"headline": "October, 2018",
"url": "https://alanorth.github.io/cgspace-notes/2018-10/",
-"wordCount": "528",
+"wordCount": "668",
"datePublished": "2018-10-01T22:31:54&#43;03:00",
-"dateModified": "2018-10-03T21:52:12&#43;03:00",
+"dateModified": "2018-10-03T22:35:11&#43;03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -216,6 +216,26 @@ Given Names Deactivated Family Name Deactivated: 0000-0001-7930-5752
&quot;Thornton, Phillip K.&quot;,Philip Thornton: 0000-0002-1854-0182
</code></pre>
<h2 id="2018-10-04">2018-10-04</h2>
<ul>
<li>Salem raised an issue that the dspace-statistics-api reports downloads for some items that have no bitstreams (like many limited access items)</li>
<li>Every item has at least a <code>LICENSE</code> bundle, and some have a <code>THUMBNAIL</code> bundle, but the indexing code is specifically checking for downloads from the <code>ORIGINAL</code> bundle
<ul>
<li><a href="https://cgspace.cgiar.org/handle/10568/97460"><sup>10568</sup>&frasl;<sub>97460</sub></a> (100550): has a thumbnail bitstream</li>
<li><a href="https://cgspace.cgiar.org/handle/10568/96112"><sup>10568</sup>&frasl;<sub>96112</sub></a> (96736): has only a LICENSE bitstream</li>
</ul></li>
<li>I see there are other bundles we might need to pay attention to: <code>TEXT</code>, <code>@_LOGO-COLLECTION_@</code>, <code>@_LOGO-COMMUNITY_@</code>, etc&hellip;</li>
<li>On a hunch I dropped the statistics table and re-indexed and now those two items above have no downloads</li>
<li>So it&rsquo;s fixed, but I&rsquo;m not sure why!</li>
<li>Peter wants to know the number of API requests per month, which was about 250,000 in September (excluding statlet requests):</li>
</ul>
<pre><code># zcat --force /var/log/nginx/{oai,rest}.log* | grep -E 'Sep/2018' | grep -c -v 'statlets'
251226
</code></pre>
<!-- vim: set sw=2 ts=2: -->


@@ -4,7 +4,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/2018-10/</loc>
-<lastmod>2018-10-03T21:52:12+03:00</lastmod>
+<lastmod>2018-10-03T22:35:11+03:00</lastmod>
</url>
<url>
@@ -189,7 +189,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
-<lastmod>2018-10-03T21:52:12+03:00</lastmod>
+<lastmod>2018-10-03T22:35:11+03:00</lastmod>
<priority>0</priority>
</url>
@@ -200,7 +200,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
-<lastmod>2018-10-03T21:52:12+03:00</lastmod>
+<lastmod>2018-10-03T22:35:11+03:00</lastmod>
<priority>0</priority>
</url>
@@ -212,13 +212,13 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
-<lastmod>2018-10-03T21:52:12+03:00</lastmod>
+<lastmod>2018-10-03T22:35:11+03:00</lastmod>
<priority>0</priority>
</url>
<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/</loc>
-<lastmod>2018-10-03T21:52:12+03:00</lastmod>
+<lastmod>2018-10-03T22:35:11+03:00</lastmod>
<priority>0</priority>
</url>