diff --git a/content/post/2017-04.md b/content/post/2017-04.md
index 6ad329731..28cf81cf5 100644
--- a/content/post/2017-04.md
+++ b/content/post/2017-04.md
@@ -123,3 +123,37 @@ $ grep -c profile /tmp/filter-media-cmyk.txt
 ## 2017-04-11
 
 - Looking at the item from CIFOR it hasn't been updated yet, maybe they aren't running the cron job
+
+## 2017-04-12
+
+- CIFOR says they have cleaned their OAI cache and run the import again, but I still don't see any updates in their OAI
+- Looking at CIFOR's OAI using different metadata formats, like qualified Dublin Core and DSpace Intermediate Metadata:
+  - QDC: https://data.cifor.org/dspace/oai/request?verb=ListRecords&resumptionToken=qdc///col_11463_6/900
+  - DIM: https://data.cifor.org/dspace/oai/request?verb=ListRecords&resumptionToken=dim///col_11463_6/900
+- Looking at one of CGSpace's items in OAI it doesn't seem that metadata fields other than those in the DC schema are exported:
+  - https://cgspace.cgiar.org/handle/10568/33346?show=full
+  - https://cgspace.cgiar.org/oai/request?verb=ListRecords&metadataPrefix=dim&set=col_10568_68619
+- Side note: WTF, I just saw an item on CGSpace's OAI that is using `dc.cplace.country` and `dc.rplace.region`, which we stopped using in 2016 after the metadata migrations:
+
+![stale metadata in OAI](/cgspace-notes/2017/04/cplace.png)
+
+- The particular item is [10568/6](http://hdl.handle.net/10568/6) and, for what it's worth, the stale metadata only appears in the OAI view:
+  - XMLUI: https://cgspace.cgiar.org/handle/10568/6?show=full
+  - OAI: https://cgspace.cgiar.org/oai/request?verb=GetRecord&metadataPrefix=dim&identifier=oai:cgspace.cgiar.org:10568/6
+- I don't see these fields anywhere in our source code or the database's metadata registry, so maybe it's just a cache issue
+- I will have to check the OAI cron scripts on DSpace Test, and then run them on CGSpace
+- Running `dspace oai import` and `dspace oai clean-cache` have zero effect, but this seems to rebuild the cache from scratch:
+
+```
+$ /home/dspacetest.cgiar.org/bin/dspace oai import -c
+...
+63900 items imported so far...
+64000 items imported so far...
+Total: 64056 items
+Purging cached OAI responses.
+OAI 2.0 manager action ended. It took 829 seconds.
+```
+
+- After reading some threads on the DSpace mailing list, I see that `clean-cache` is actually only for caching _responses_, ie to client requests in the OAI web application
+- These are stored in `[dspace]/var/oai/requests/`
+- The import command should theoretically catch situations like this where an item's metadata was updated, but in this case we changed the metadata schema and it doesn't seem to catch it (could be a bug!)
diff --git a/public/2017-04/index.html b/public/2017-04/index.html
index f8e134d1b..8a908c2c2 100644
--- a/public/2017-04/index.html
+++ b/public/2017-04/index.html
@@ -30,7 +30,7 @@ $ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Th
-
+
@@ -79,9 +79,9 @@ $ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Th
 "@type": "BlogPosting",
 "headline": "April, 2017",
 "url": "https://alanorth.github.io/cgspace-notes/2017-04/",
- "wordCount": "784",
+ "wordCount": "1063",
 "datePublished": "2017-04-02T17:08:52+02:00",
- "dateModified": "2017-04-10T17:25:12+03:00",
+ "dateModified": "2017-04-11T20:46:03+03:00",
 "author": {
 "@type": "Person",
 "name": "Alan Orth"
@@ -290,6 +290,54 @@ ILAC_Brief21_PMCA.pdf: 113462 bytes, checksum: 249fef468f401c066a119f5db687add0
  • Looking at the item from CIFOR it hasn’t been updated yet, maybe they aren’t running the cron job
+
+2017-04-12
+
+stale metadata in OAI
+
+$ /home/dspacetest.cgiar.org/bin/dspace oai import -c
+...
+63900 items imported so far...
+64000 items imported so far...
+Total: 64056 items
+Purging cached OAI responses.
+OAI 2.0 manager action ended. It took 829 seconds.
+
diff --git a/public/2017/04/cplace.png b/public/2017/04/cplace.png
new file mode 100644
index 000000000..b891cd635
Binary files /dev/null and b/public/2017/04/cplace.png differ
diff --git a/public/sitemap.xml b/public/sitemap.xml
index 4c0497ad4..52929aa54 100644
--- a/public/sitemap.xml
+++ b/public/sitemap.xml
@@ -3,7 +3,7 @@
 https://alanorth.github.io/cgspace-notes/2017-04/
- 2017-04-10T17:25:12+03:00
+ 2017-04-11T20:46:03+03:00
@@ -93,7 +93,7 @@
 https://alanorth.github.io/cgspace-notes/
- 2017-04-10T17:25:12+03:00
+ 2017-04-11T20:46:03+03:00
 0
@@ -104,19 +104,19 @@
 https://alanorth.github.io/cgspace-notes/tags/notes/
- 2017-04-10T17:25:12+03:00
+ 2017-04-11T20:46:03+03:00
 0
 https://alanorth.github.io/cgspace-notes/post/
- 2017-04-10T17:25:12+03:00
+ 2017-04-11T20:46:03+03:00
 0
 https://alanorth.github.io/cgspace-notes/tags/
- 2017-04-10T17:25:12+03:00
+ 2017-04-11T20:46:03+03:00
 0
diff --git a/static/2017/04/cplace.png b/static/2017/04/cplace.png
new file mode 100644
index 000000000..b891cd635
Binary files /dev/null and b/static/2017/04/cplace.png differ
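
Note on the 2017-04-12 entry: the stale-field check it describes (looking at an item's OAI `GetRecord` response in the `dim` format for `dc.cplace.country` and `dc.rplace.region`) could be scripted roughly as below. This is only a sketch under assumptions, not something from the notes themselves: the helper names `get_record_url`, `dim_fields`, and `find_stale_fields` are made up for illustration, the DIM namespace and the `mdschema`/`element`/`qualifier` attributes are what I would expect DSpace's DIM output to look like, and `SAMPLE_RESPONSE` is a hand-written fragment rather than a real response from CGSpace.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumption: namespace used by DSpace Intermediate Metadata (DIM) in OAI output.
DIM_NS = "http://www.dspace.org/xmlns/dspace/dim"

# Hand-written, illustrative fragment; NOT an actual response from any server.
SAMPLE_RESPONSE = """\
<metadata>
  <dim:dim xmlns:dim="http://www.dspace.org/xmlns/dspace/dim">
    <dim:field mdschema="dc" element="title">EXAMPLE TITLE</dim:field>
    <dim:field mdschema="dc" element="cplace" qualifier="country">EXAMPLE COUNTRY</dim:field>
    <dim:field mdschema="dc" element="rplace" qualifier="region">EXAMPLE REGION</dim:field>
  </dim:dim>
</metadata>
"""


def get_record_url(base_url, repository_id, handle, metadata_prefix="dim"):
    """Build an OAI-PMH GetRecord URL like the one quoted in the notes."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "metadataPrefix": metadata_prefix,
            "identifier": f"oai:{repository_id}:{handle}",
        },
        safe=":/",  # keep the identifier readable, as in the notes
    )
    return f"{base_url}?{query}"


def dim_fields(xml_text):
    """Yield (dotted field name, value) for every DIM field in the XML."""
    root = ET.fromstring(xml_text)
    for field in root.iter(f"{{{DIM_NS}}}field"):
        parts = (field.get("mdschema"), field.get("element"), field.get("qualifier"))
        name = ".".join(part for part in parts if part)
        yield name, (field.text or "").strip()


def find_stale_fields(xml_text, stale_names):
    """Return the (name, value) pairs whose field name is in stale_names."""
    stale = set(stale_names)
    return [(name, value) for name, value in dim_fields(xml_text) if name in stale]
```

To try it against a live server, fetch the URL from `get_record_url("https://cgspace.cgiar.org/oai/request", "cgspace.cgiar.org", "10568/6")` with any HTTP client and pass the body to `find_stale_fields(body, ["dc.cplace.country", "dc.rplace.region"])`. An empty list would suggest the OAI index no longer carries the old fields for that item.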