Update notes for 2017-04-12

Alan Orth 2017-04-12 16:01:30 +03:00
parent 2c87b5f951
commit d3a5169489
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
3 changed files with 55 additions and 10 deletions

@@ -123,10 +123,13 @@ $ grep -c profile /tmp/filter-media-cmyk.txt
 ## 2017-04-11
 - Looking at the item from CIFOR it hasn't been updated yet, maybe they aren't running the cron job
+- I emailed Usman from CIFOR to ask if he's running the cron job
 ## 2017-04-12
-- CIFOR says they have cleaned their OAI cache and run the import again, but I still don't see any updates in their OAI
+- CIFOR says they have cleaned their OAI cache and that the cron job for OAI import is enabled
+- Now I see updated fields, like `dc.date.issued` but none from the CG or CIFOR namespaces
+- Also, DSpace Test hasn't re-harvested this item yet, so I will wait one more day before forcing a re-harvest
 - Looking at CIFOR's OAI using different metadata formats, like qualified Dublin Core and DSpace Intermediate Metadata:
 - QDC: https://data.cifor.org/dspace/oai/request?verb=ListRecords&resumptionToken=qdc///col_11463_6/900
 - DIM: https://data.cifor.org/dspace/oai/request?verb=ListRecords&resumptionToken=dim///col_11463_6/900
@@ -157,3 +160,22 @@ OAI 2.0 manager action ended. It took 829 seconds.
 - After reading some threads on the DSpace mailing list, I see that `clean-cache` is actually only for caching _responses_, ie to client requests in the OAI web application
 - These are stored in `[dspace]/var/oai/requests/`
 - The import command should theoretically catch situations like this where an item's metadata was updated, but in this case we changed the metadata schema and it doesn't seem to catch it (could be a bug!)
+- Attempting a full rebuild of OAI on CGSpace:
+```
+$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 /home/cgspace.cgiar.org/bin/dspace oai import -c
+...
+58700 items imported so far...
+Total: 58789 items
+Purging cached OAI responses.
+OAI 2.0 manager action ended. It took 1032 seconds.
+real 17m20.156s
+user 4m35.293s
+sys 1m29.310s
+```
+- Now the data for 10568/6 is correct in OAI: https://cgspace.cgiar.org/oai/request?verb=GetRecord&metadataPrefix=dim&identifier=oai:cgspace.cgiar.org:10568/6
+- Perhaps I need to file a bug for this, or at least ask on the DSpace Test mailing list?
+- I wonder if we could use a crosswalk to convert to a format that CG Core wants, like `<date Type="Available">`
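
The GetRecord URL used above to check 10568/6 can be assembled from its parts, which makes it easy to repeat the check for other handles or metadata formats. This is only a minimal sketch: `oai_getrecord_url` is a hypothetical helper (not part of DSpace), and it assumes the repository name in the OAI identifier is the same as the host in the base URL, as it is for CGSpace in the notes.

```shell
# Hypothetical helper, not part of DSpace: build an OAI-PMH GetRecord URL.
# Assumes the OAI identifier uses the host of the base URL as its repository name.
oai_getrecord_url() {
    base="$1"    # OAI request endpoint, e.g. https://cgspace.cgiar.org/oai/request
    prefix="$2"  # metadata format, e.g. dim or qdc
    handle="$3"  # item handle, e.g. 10568/6

    # Strip the scheme and everything after the first slash to get the host.
    host=${base#*://}
    host=${host%%/*}

    printf '%s?verb=GetRecord&metadataPrefix=%s&identifier=oai:%s:%s\n' \
        "$base" "$prefix" "$host" "$handle"
}

# Reproduces the URL from the notes for handle 10568/6 in DIM format.
oai_getrecord_url https://cgspace.cgiar.org/oai/request dim 10568/6
```

The printed URL can then be passed to a client such as `curl` to inspect the record after a re-import.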

@@ -30,7 +30,7 @@ $ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p &quot;ImageMagick PDF Th
 <meta property="article:published_time" content="2017-04-02T17:08:52&#43;02:00"/>
-<meta property="article:modified_time" content="2017-04-11T20:46:03&#43;03:00"/>
+<meta property="article:modified_time" content="2017-04-12T14:39:42&#43;03:00"/>
@@ -79,9 +79,9 @@ $ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p &quot;ImageMagick PDF Th
 "@type": "BlogPosting",
 "headline": "April, 2017",
 "url": "https://alanorth.github.io/cgspace-notes/2017-04/",
-"wordCount": "1063",
+"wordCount": "1208",
 "datePublished": "2017-04-02T17:08:52&#43;02:00",
-"dateModified": "2017-04-11T20:46:03&#43;03:00",
+"dateModified": "2017-04-12T14:39:42&#43;03:00",
 "author": {
 "@type": "Person",
 "name": "Alan Orth"
@@ -288,12 +288,15 @@ ILAC_Brief21_PMCA.pdf: 113462 bytes, checksum: 249fef468f401c066a119f5db687add0
 <ul>
 <li>Looking at the item from CIFOR it hasn&rsquo;t been updated yet, maybe they aren&rsquo;t running the cron job</li>
+<li>I emailed Usman from CIFOR to ask if he&rsquo;s running the cron job</li>
 </ul>
 <h2 id="2017-04-12">2017-04-12</h2>
 <ul>
-<li>CIFOR says they have cleaned their OAI cache and run the import again, but I still don&rsquo;t see any updates in their OAI</li>
+<li>CIFOR says they have cleaned their OAI cache and that the cron job for OAI import is enabled</li>
+<li>Now I see updated fields, like <code>dc.date.issued</code> but none from the CG or CIFOR namespaces</li>
+<li>Also, DSpace Test hasn&rsquo;t re-harvested this item yet, so I will wait one more day before forcing a re-harvest</li>
 <li>Looking at CIFOR&rsquo;s OAI using different metadata formats, like qualified Dublin Core and DSpace Intermediate Metadata:
 <ul>
@@ -336,6 +339,26 @@ OAI 2.0 manager action ended. It took 829 seconds.
 <li>After reading some threads on the DSpace mailing list, I see that <code>clean-cache</code> is actually only for caching <em>responses</em>, ie to client requests in the OAI web application</li>
 <li>These are stored in <code>[dspace]/var/oai/requests/</code></li>
 <li>The import command should theoretically catch situations like this where an item&rsquo;s metadata was updated, but in this case we changed the metadata schema and it doesn&rsquo;t seem to catch it (could be a bug!)</li>
+<li>Attempting a full rebuild of OAI on CGSpace:</li>
+</ul>
+<pre><code>$ export JAVA_OPTS=&quot;-Dfile.encoding=UTF-8 -Xmx1024m&quot;
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 /home/cgspace.cgiar.org/bin/dspace oai import -c
+...
+58700 items imported so far...
+Total: 58789 items
+Purging cached OAI responses.
+OAI 2.0 manager action ended. It took 1032 seconds.
+real 17m20.156s
+user 4m35.293s
+sys 1m29.310s
+</code></pre>
+<ul>
+<li>Now the data for <sup>10568</sup>&frasl;<sub>6</sub> is correct in OAI: <a href="https://cgspace.cgiar.org/oai/request?verb=GetRecord&amp;metadataPrefix=dim&amp;identifier=oai:cgspace.cgiar.org:10568/6">https://cgspace.cgiar.org/oai/request?verb=GetRecord&amp;metadataPrefix=dim&amp;identifier=oai:cgspace.cgiar.org:10568/6</a></li>
+<li>Perhaps I need to file a bug for this, or at least ask on the DSpace Test mailing list?</li>
+<li>I wonder if we could use a crosswalk to convert to a format that CG Core wants, like <code>&lt;date Type=&quot;Available&quot;&gt;</code></li>
 </ul>

@@ -3,7 +3,7 @@
 <url>
 <loc>https://alanorth.github.io/cgspace-notes/2017-04/</loc>
-<lastmod>2017-04-11T20:46:03+03:00</lastmod>
+<lastmod>2017-04-12T14:39:42+03:00</lastmod>
 </url>
 <url>
@@ -93,7 +93,7 @@
 <url>
 <loc>https://alanorth.github.io/cgspace-notes/</loc>
-<lastmod>2017-04-11T20:46:03+03:00</lastmod>
+<lastmod>2017-04-12T14:39:42+03:00</lastmod>
 <priority>0</priority>
 </url>
@@ -104,19 +104,19 @@
 <url>
 <loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
-<lastmod>2017-04-11T20:46:03+03:00</lastmod>
+<lastmod>2017-04-12T14:39:42+03:00</lastmod>
 <priority>0</priority>
 </url>
 <url>
 <loc>https://alanorth.github.io/cgspace-notes/post/</loc>
-<lastmod>2017-04-11T20:46:03+03:00</lastmod>
+<lastmod>2017-04-12T14:39:42+03:00</lastmod>
 <priority>0</priority>
 </url>
 <url>
 <loc>https://alanorth.github.io/cgspace-notes/tags/</loc>
-<lastmod>2017-04-11T20:46:03+03:00</lastmod>
+<lastmod>2017-04-12T14:39:42+03:00</lastmod>
 <priority>0</priority>
 </url>