mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-11-04 22:33:02 +01:00
Add notes for 2017-04-12
This commit is contained in:
parent
22a4cc077c
commit
2c87b5f951
@@ -123,3 +123,37 @@ $ grep -c profile /tmp/filter-media-cmyk.txt
|
|||||||
## 2017-04-11
|
## 2017-04-11
|
||||||
|
|
||||||
- Looking at the item from CIFOR it hasn't been updated yet, maybe they aren't running the cron job
|
- Looking at the item from CIFOR it hasn't been updated yet, maybe they aren't running the cron job
|
||||||
|
|
||||||
|
## 2017-04-12
|
||||||
|
|
||||||
|
- CIFOR says they have cleaned their OAI cache and run the import again, but I still don't see any updates in their OAI
|
||||||
|
- Looking at CIFOR's OAI using different metadata formats, like qualified Dublin Core and DSpace Intermediate Metadata:
|
||||||
|
- QDC: https://data.cifor.org/dspace/oai/request?verb=ListRecords&resumptionToken=qdc///col_11463_6/900
|
||||||
|
- DIM: https://data.cifor.org/dspace/oai/request?verb=ListRecords&resumptionToken=dim///col_11463_6/900
|
||||||
|
- Looking at one of CGSpace's items in OAI it doesn't seem that metadata fields other than those in the DC schema are exported:
|
||||||
|
- https://cgspace.cgiar.org/handle/10568/33346?show=full
|
||||||
|
- https://cgspace.cgiar.org/oai/request?verb=ListRecords&metadataPrefix=dim&set=col_10568_68619
|
||||||
|
- Side note: WTF, I just saw an item on CGSpace's OAI that is using `dc.cplace.country` and `dc.rplace.region`, which we stopped using in 2016 after the metadata migrations:
|
||||||
|
|
||||||
|
![stale metadata in OAI](/cgspace-notes/2017/04/cplace.png)
|
||||||
|
|
||||||
|
- The particular item is [10568/6](http://hdl.handle.net/10568/6) and, for what it's worth, the stale metadata only appears in the OAI view:
|
||||||
|
- XMLUI: https://cgspace.cgiar.org/handle/10568/6?show=full
|
||||||
|
- OAI: https://cgspace.cgiar.org/oai/request?verb=GetRecord&metadataPrefix=dim&identifier=oai:cgspace.cgiar.org:10568/6
|
||||||
|
- I don't see these fields anywhere in our source code or the database's metadata registry, so maybe it's just a cache issue
|
||||||
|
- I will have to check the OAI cron scripts on DSpace Test, and then run them on CGSpace
|
||||||
|
- Running `dspace oai import` and `dspace oai clean-cache` has no effect, but this seems to rebuild the cache from scratch:
|
||||||
|
|
||||||
|
```
|
||||||
|
$ /home/dspacetest.cgiar.org/bin/dspace oai import -c
|
||||||
|
...
|
||||||
|
63900 items imported so far...
|
||||||
|
64000 items imported so far...
|
||||||
|
Total: 64056 items
|
||||||
|
Purging cached OAI responses.
|
||||||
|
OAI 2.0 manager action ended. It took 829 seconds.
|
||||||
|
```
|
||||||
|
|
||||||
|
- After reading some threads on the DSpace mailing list, I see that `clean-cache` actually only deals with cached _responses_, i.e. the responses to client requests in the OAI web application
|
||||||
|
- These are stored in `[dspace]/var/oai/requests/`
|
||||||
|
- The import command should theoretically catch situations like this where an item's metadata was updated, but in this case we changed the metadata schema and it doesn't seem to catch it (could be a bug!)
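- A rough sketch of how the stale-field check above could be scripted, assuming Python 3 and only the standard library. The helper names (`get_record_url`, `stale_fields`) and the sample XML are made up for illustration; the DIM namespace and attribute names are what I believe DSpace's OAI `dim` format uses, so verify them against a real response:

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumption: namespace used by DSpace's DIM metadata format in OAI responses
DIM_NS = "http://www.dspace.org/xmlns/dspace/dim"

# The two fields we stopped using after the 2016 metadata migrations
STALE_FIELDS = {"dc.cplace.country", "dc.rplace.region"}


def get_record_url(base_url, host, handle, prefix="dim"):
    """Build an OAI-PMH GetRecord URL for a single item (hypothetical helper)."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "metadataPrefix": prefix,
            "identifier": f"oai:{host}:{handle}",
        },
        safe=":/",  # keep the identifier readable instead of percent-encoding it
    )
    return f"{base_url}?{query}"


def stale_fields(xml_text):
    """Return the set of stale field names found in a DIM XML document."""
    root = ET.fromstring(xml_text)
    found = set()
    for field in root.iter(f"{{{DIM_NS}}}field"):
        parts = [field.get("mdschema"), field.get("element"), field.get("qualifier")]
        name = ".".join(p for p in parts if p)
        if name in STALE_FIELDS:
            found.add(name)
    return found


# Made-up sample, not a real response from CGSpace
SAMPLE = """<dim:dim xmlns:dim="http://www.dspace.org/xmlns/dspace/dim">
  <dim:field mdschema="dc" element="title">Example title</dim:field>
  <dim:field mdschema="dc" element="cplace" qualifier="country">Example country</dim:field>
</dim:dim>"""

if __name__ == "__main__":
    print(get_record_url("https://cgspace.cgiar.org/oai/request", "cgspace.cgiar.org", "10568/6"))
    print(stale_fields(SAMPLE))
```

- Feeding the body of the real `GetRecord` response into `stale_fields()` instead of the sample would show whether a full re-import cleared the old fields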
|
||||||
|
@@ -30,7 +30,7 @@ $ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Th
|
|||||||
|
|
||||||
|
|
||||||
<meta property="article:published_time" content="2017-04-02T17:08:52+02:00"/>
|
<meta property="article:published_time" content="2017-04-02T17:08:52+02:00"/>
|
||||||
<meta property="article:modified_time" content="2017-04-10T17:25:12+03:00"/>
|
<meta property="article:modified_time" content="2017-04-11T20:46:03+03:00"/>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@@ -79,9 +79,9 @@ $ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Th
|
|||||||
"@type": "BlogPosting",
|
"@type": "BlogPosting",
|
||||||
"headline": "April, 2017",
|
"headline": "April, 2017",
|
||||||
"url": "https://alanorth.github.io/cgspace-notes/2017-04/",
|
"url": "https://alanorth.github.io/cgspace-notes/2017-04/",
|
||||||
"wordCount": "784",
|
"wordCount": "1063",
|
||||||
"datePublished": "2017-04-02T17:08:52+02:00",
|
"datePublished": "2017-04-02T17:08:52+02:00",
|
||||||
"dateModified": "2017-04-10T17:25:12+03:00",
|
"dateModified": "2017-04-11T20:46:03+03:00",
|
||||||
"author": {
|
"author": {
|
||||||
"@type": "Person",
|
"@type": "Person",
|
||||||
"name": "Alan Orth"
|
"name": "Alan Orth"
|
||||||
@@ -290,6 +290,54 @@ ILAC_Brief21_PMCA.pdf: 113462 bytes, checksum: 249fef468f401c066a119f5db687add0
|
|||||||
<li>Looking at the item from CIFOR it hasn’t been updated yet, maybe they aren’t running the cron job</li>
|
<li>Looking at the item from CIFOR it hasn’t been updated yet, maybe they aren’t running the cron job</li>
|
||||||
</ul>
|
</ul>
|
||||||
|
|
||||||
|
<h2 id="2017-04-12">2017-04-12</h2>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>CIFOR says they have cleaned their OAI cache and run the import again, but I still don’t see any updates in their OAI</li>
|
||||||
|
<li>Looking at CIFOR’s OAI using different metadata formats, like qualified Dublin Core and DSpace Intermediate Metadata:
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>QDC: <a href="https://data.cifor.org/dspace/oai/request?verb=ListRecords&resumptionToken=qdc///col_11463_6/900">https://data.cifor.org/dspace/oai/request?verb=ListRecords&resumptionToken=qdc///col_11463_6/900</a></li>
|
||||||
|
<li>DIM: <a href="https://data.cifor.org/dspace/oai/request?verb=ListRecords&resumptionToken=dim///col_11463_6/900">https://data.cifor.org/dspace/oai/request?verb=ListRecords&resumptionToken=dim///col_11463_6/900</a></li>
|
||||||
|
</ul></li>
|
||||||
|
<li>Looking at one of CGSpace’s items in OAI it doesn’t seem that metadata fields other than those in the DC schema are exported:
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li><a href="https://cgspace.cgiar.org/handle/10568/33346?show=full">https://cgspace.cgiar.org/handle/10568/33346?show=full</a></li>
|
||||||
|
<li><a href="https://cgspace.cgiar.org/oai/request?verb=ListRecords&metadataPrefix=dim&set=col_10568_68619">https://cgspace.cgiar.org/oai/request?verb=ListRecords&metadataPrefix=dim&set=col_10568_68619</a></li>
|
||||||
|
</ul></li>
|
||||||
|
<li>Side note: WTF, I just saw an item on CGSpace’s OAI that is using <code>dc.cplace.country</code> and <code>dc.rplace.region</code>, which we stopped using in 2016 after the metadata migrations:</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<p><img src="/cgspace-notes/2017/04/cplace.png" alt="stale metadata in OAI" /></p>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>The particular item is <a href="http://hdl.handle.net/10568/6">10568/6</a> and, for what it’s worth, the stale metadata only appears in the OAI view:
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>XMLUI: <a href="https://cgspace.cgiar.org/handle/10568/6?show=full">https://cgspace.cgiar.org/handle/10568/6?show=full</a></li>
|
||||||
|
<li>OAI: <a href="https://cgspace.cgiar.org/oai/request?verb=GetRecord&metadataPrefix=dim&identifier=oai:cgspace.cgiar.org:10568/6">https://cgspace.cgiar.org/oai/request?verb=GetRecord&metadataPrefix=dim&identifier=oai:cgspace.cgiar.org:10568/6</a></li>
|
||||||
|
</ul></li>
|
||||||
|
<li>I don’t see these fields anywhere in our source code or the database’s metadata registry, so maybe it’s just a cache issue</li>
|
||||||
|
<li>I will have to check the OAI cron scripts on DSpace Test, and then run them on CGSpace</li>
|
||||||
|
<li>Running <code>dspace oai import</code> and <code>dspace oai clean-cache</code> has no effect, but this seems to rebuild the cache from scratch:</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<pre><code>$ /home/dspacetest.cgiar.org/bin/dspace oai import -c
|
||||||
|
...
|
||||||
|
63900 items imported so far...
|
||||||
|
64000 items imported so far...
|
||||||
|
Total: 64056 items
|
||||||
|
Purging cached OAI responses.
|
||||||
|
OAI 2.0 manager action ended. It took 829 seconds.
|
||||||
|
</code></pre>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li>After reading some threads on the DSpace mailing list, I see that <code>clean-cache</code> actually only deals with cached <em>responses</em>, i.e. the responses to client requests in the OAI web application</li>
|
||||||
|
<li>These are stored in <code>[dspace]/var/oai/requests/</code></li>
|
||||||
|
<li>The import command should theoretically catch situations like this where an item’s metadata was updated, but in this case we changed the metadata schema and it doesn’t seem to catch it (could be a bug!)</li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
BIN
public/2017/04/cplace.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 84 KiB |
@@ -3,7 +3,7 @@
|
|||||||
|
|
||||||
<url>
|
<url>
|
||||||
<loc>https://alanorth.github.io/cgspace-notes/2017-04/</loc>
|
<loc>https://alanorth.github.io/cgspace-notes/2017-04/</loc>
|
||||||
<lastmod>2017-04-10T17:25:12+03:00</lastmod>
|
<lastmod>2017-04-11T20:46:03+03:00</lastmod>
|
||||||
</url>
|
</url>
|
||||||
|
|
||||||
<url>
|
<url>
|
||||||
@@ -93,7 +93,7 @@
|
|||||||
|
|
||||||
<url>
|
<url>
|
||||||
<loc>https://alanorth.github.io/cgspace-notes/</loc>
|
<loc>https://alanorth.github.io/cgspace-notes/</loc>
|
||||||
<lastmod>2017-04-10T17:25:12+03:00</lastmod>
|
<lastmod>2017-04-11T20:46:03+03:00</lastmod>
|
||||||
<priority>0</priority>
|
<priority>0</priority>
|
||||||
</url>
|
</url>
|
||||||
|
|
||||||
@@ -104,19 +104,19 @@
|
|||||||
|
|
||||||
<url>
|
<url>
|
||||||
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
|
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
|
||||||
<lastmod>2017-04-10T17:25:12+03:00</lastmod>
|
<lastmod>2017-04-11T20:46:03+03:00</lastmod>
|
||||||
<priority>0</priority>
|
<priority>0</priority>
|
||||||
</url>
|
</url>
|
||||||
|
|
||||||
<url>
|
<url>
|
||||||
<loc>https://alanorth.github.io/cgspace-notes/post/</loc>
|
<loc>https://alanorth.github.io/cgspace-notes/post/</loc>
|
||||||
<lastmod>2017-04-10T17:25:12+03:00</lastmod>
|
<lastmod>2017-04-11T20:46:03+03:00</lastmod>
|
||||||
<priority>0</priority>
|
<priority>0</priority>
|
||||||
</url>
|
</url>
|
||||||
|
|
||||||
<url>
|
<url>
|
||||||
<loc>https://alanorth.github.io/cgspace-notes/tags/</loc>
|
<loc>https://alanorth.github.io/cgspace-notes/tags/</loc>
|
||||||
<lastmod>2017-04-10T17:25:12+03:00</lastmod>
|
<lastmod>2017-04-11T20:46:03+03:00</lastmod>
|
||||||
<priority>0</priority>
|
<priority>0</priority>
|
||||||
</url>
|
</url>
|
||||||
|
|
||||||
|
BIN
static/2017/04/cplace.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 84 KiB |