mirror of https://github.com/alanorth/cgspace-notes.git (synced 2024-11-22 22:55:04 +01:00)
Update notes for 2017-04-12
parent 2c87b5f951
commit d3a5169489
@@ -123,10 +123,13 @@ $ grep -c profile /tmp/filter-media-cmyk.txt
 ## 2017-04-11
 
 - Looking at the item from CIFOR it hasn't been updated yet, maybe they aren't running the cron job
+- I emailed Usman from CIFOR to ask if he's running the cron job
 
 ## 2017-04-12
 
-- CIFOR says they have cleaned their OAI cache and run the import again, but I still don't see any updates in their OAI
+- CIFOR says they have cleaned their OAI cache and that the cron job for OAI import is enabled
+- Now I see updated fields, like `dc.date.issued` but none from the CG or CIFOR namespaces
+- Also, DSpace Test hasn't re-harvested this item yet, so I will wait one more day before forcing a re-harvest
 - Looking at CIFOR's OAI using different metadata formats, like qualified Dublin Core and DSpace Intermediate Metadata:
 - QDC: https://data.cifor.org/dspace/oai/request?verb=ListRecords&resumptionToken=qdc///col_11463_6/900
 - DIM: https://data.cifor.org/dspace/oai/request?verb=ListRecords&resumptionToken=dim///col_11463_6/900
@@ -157,3 +160,22 @@ OAI 2.0 manager action ended. It took 829 seconds.
 - After reading some threads on the DSpace mailing list, I see that `clean-cache` is actually only for caching _responses_, ie to client requests in the OAI web application
 - These are stored in `[dspace]/var/oai/requests/`
 - The import command should theoretically catch situations like this where an item's metadata was updated, but in this case we changed the metadata schema and it doesn't seem to catch it (could be a bug!)
+- Attempting a full rebuild of OAI on CGSpace:
+
+```
+$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 /home/cgspace.cgiar.org/bin/dspace oai import -c
+...
+58700 items imported so far...
+Total: 58789 items
+Purging cached OAI responses.
+OAI 2.0 manager action ended. It took 1032 seconds.
+
+real 17m20.156s
+user 4m35.293s
+sys 1m29.310s
+```
+
+- Now the data for 10568/6 is correct in OAI: https://cgspace.cgiar.org/oai/request?verb=GetRecord&metadataPrefix=dim&identifier=oai:cgspace.cgiar.org:10568/6
+- Perhaps I need to file a bug for this, or at least ask on the DSpace Test mailing list?
+- I wonder if we could use a crosswalk to convert to a format that CG Core wants, like `<date Type="Available">`
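The OAI requests in the notes above all follow the same query-string pattern, so here is a minimal Python sketch of how such URLs could be assembled and how a resumption token like `qdc///col_11463_6/900` might be read. The helper names `get_record_url` and `split_resumption_token` are hypothetical, and the field layout of the token (metadataPrefix/from/until/set/offset) is an assumption; the notes themselves do not spell it out.

```python
from urllib.parse import urlencode


def get_record_url(base_url, identifier, metadata_prefix="dim"):
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    # safe=":/" leaves the colons and the slash in the identifier unescaped,
    # which matches the URL form used in the notes.
    query = urlencode(
        {
            "verb": "GetRecord",
            "metadataPrefix": metadata_prefix,
            "identifier": identifier,
        },
        safe=":/",
    )
    return f"{base_url}?{query}"


def split_resumption_token(token):
    """Split a token such as qdc///col_11463_6/900 into named parts.

    ASSUMPTION: the five slash-separated fields are
    metadataPrefix/from/until/set/offset. The notes do not state this.
    """
    prefix, from_date, until_date, set_spec, offset = token.split("/")
    return {
        "metadataPrefix": prefix,
        "from": from_date or None,    # empty field means "no lower bound"
        "until": until_date or None,  # empty field means "no upper bound"
        "set": set_spec or None,
        "offset": int(offset),
    }


if __name__ == "__main__":
    print(get_record_url("https://cgspace.cgiar.org/oai/request",
                         "oai:cgspace.cgiar.org:10568/6"))
    print(split_resumption_token("qdc///col_11463_6/900"))
```

Swapping `metadata_prefix` between `dim` and `qdc` would give the two views of the same record that the notes compare when checking whether the CG and CIFOR namespaces show up.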
@@ -30,7 +30,7 @@ $ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Th
 
 
 <meta property="article:published_time" content="2017-04-02T17:08:52+02:00"/>
-<meta property="article:modified_time" content="2017-04-11T20:46:03+03:00"/>
+<meta property="article:modified_time" content="2017-04-12T14:39:42+03:00"/>
 
 
 
@@ -79,9 +79,9 @@ $ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Th
 "@type": "BlogPosting",
 "headline": "April, 2017",
 "url": "https://alanorth.github.io/cgspace-notes/2017-04/",
-"wordCount": "1063",
+"wordCount": "1208",
 "datePublished": "2017-04-02T17:08:52+02:00",
-"dateModified": "2017-04-11T20:46:03+03:00",
+"dateModified": "2017-04-12T14:39:42+03:00",
 "author": {
 "@type": "Person",
 "name": "Alan Orth"
@@ -288,12 +288,15 @@ ILAC_Brief21_PMCA.pdf: 113462 bytes, checksum: 249fef468f401c066a119f5db687add0
 
 <ul>
 <li>Looking at the item from CIFOR it hasn’t been updated yet, maybe they aren’t running the cron job</li>
+<li>I emailed Usman from CIFOR to ask if he’s running the cron job</li>
 </ul>
 
 <h2 id="2017-04-12">2017-04-12</h2>
 
 <ul>
-<li>CIFOR says they have cleaned their OAI cache and run the import again, but I still don’t see any updates in their OAI</li>
+<li>CIFOR says they have cleaned their OAI cache and that the cron job for OAI import is enabled</li>
+<li>Now I see updated fields, like <code>dc.date.issued</code> but none from the CG or CIFOR namespaces</li>
+<li>Also, DSpace Test hasn’t re-harvested this item yet, so I will wait one more day before forcing a re-harvest</li>
 <li>Looking at CIFOR’s OAI using different metadata formats, like qualified Dublin Core and DSpace Intermediate Metadata:
 
 <ul>
@@ -336,6 +339,26 @@ OAI 2.0 manager action ended. It took 829 seconds.
 <li>After reading some threads on the DSpace mailing list, I see that <code>clean-cache</code> is actually only for caching <em>responses</em>, ie to client requests in the OAI web application</li>
 <li>These are stored in <code>[dspace]/var/oai/requests/</code></li>
 <li>The import command should theoretically catch situations like this where an item’s metadata was updated, but in this case we changed the metadata schema and it doesn’t seem to catch it (could be a bug!)</li>
+<li>Attempting a full rebuild of OAI on CGSpace:</li>
+</ul>
+
+<pre><code>$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 /home/cgspace.cgiar.org/bin/dspace oai import -c
+...
+58700 items imported so far...
+Total: 58789 items
+Purging cached OAI responses.
+OAI 2.0 manager action ended. It took 1032 seconds.
+
+real 17m20.156s
+user 4m35.293s
+sys 1m29.310s
+</code></pre>
+
+<ul>
+<li>Now the data for <sup>10568</sup>⁄<sub>6</sub> is correct in OAI: <a href="https://cgspace.cgiar.org/oai/request?verb=GetRecord&amp;metadataPrefix=dim&amp;identifier=oai:cgspace.cgiar.org:10568/6">https://cgspace.cgiar.org/oai/request?verb=GetRecord&amp;metadataPrefix=dim&amp;identifier=oai:cgspace.cgiar.org:10568/6</a></li>
+<li>Perhaps I need to file a bug for this, or at least ask on the DSpace Test mailing list?</li>
+<li>I wonder if we could use a crosswalk to convert to a format that CG Core wants, like <code>&lt;date Type=&quot;Available&quot;&gt;</code></li>
 </ul>
 
 
@@ -3,7 +3,7 @@
 
 <url>
 <loc>https://alanorth.github.io/cgspace-notes/2017-04/</loc>
-<lastmod>2017-04-11T20:46:03+03:00</lastmod>
+<lastmod>2017-04-12T14:39:42+03:00</lastmod>
 </url>
 
 <url>
@@ -93,7 +93,7 @@
 
 <url>
 <loc>https://alanorth.github.io/cgspace-notes/</loc>
-<lastmod>2017-04-11T20:46:03+03:00</lastmod>
+<lastmod>2017-04-12T14:39:42+03:00</lastmod>
 <priority>0</priority>
 </url>
 
@@ -104,19 +104,19 @@
 
 <url>
 <loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
-<lastmod>2017-04-11T20:46:03+03:00</lastmod>
+<lastmod>2017-04-12T14:39:42+03:00</lastmod>
 <priority>0</priority>
 </url>
 
 <url>
 <loc>https://alanorth.github.io/cgspace-notes/post/</loc>
-<lastmod>2017-04-11T20:46:03+03:00</lastmod>
+<lastmod>2017-04-12T14:39:42+03:00</lastmod>
 <priority>0</priority>
 </url>
 
 <url>
 <loc>https://alanorth.github.io/cgspace-notes/tags/</loc>
-<lastmod>2017-04-11T20:46:03+03:00</lastmod>
+<lastmod>2017-04-12T14:39:42+03:00</lastmod>
 <priority>0</priority>
 </url>
 