From fc6e55937f7ef930e9a8e61f914bb824552ddf4a Mon Sep 17 00:00:00 2001
From: Alan Orth
Date: Sat, 4 Jun 2016 11:28:02 +0300
Subject: [PATCH] Update notes for 2016-06-03

---
 content/2016-06.md           | 66 ++++++++++++++++++++++++++++++++
 public/2016-06/index.html    | 73 ++++++++++++++++++++++++++++++++++++
 public/index.html            |  2 +-
 public/index.xml             | 73 ++++++++++++++++++++++++++++++++++++
 public/tags/notes/index.html |  2 +-
 public/tags/notes/index.xml  | 73 ++++++++++++++++++++++++++++++++++++
 6 files changed, 287 insertions(+), 2 deletions(-)

diff --git a/content/2016-06.md b/content/2016-06.md
index 55d01b3a4..f1e6a643b 100644
--- a/content/2016-06.md
+++ b/content/2016-06.md
@@ -38,3 +38,69 @@ webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text
 - A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue
 - I found a thread on the mailing list talking about it and there is bug report and a patch: https://jira.duraspace.org/browse/DS-2740
 - The patch applies successfully on DSpace 5.1 so I will try it later
+
+## 2016-06-03
+
+- Investigating the CCAFS authority issue, I exported the metadata for the Videos collection
+- The top two authors are:
+
+```
+CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500
+CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600
+```
+
+- So the only difference is the "confidence"
+- Ok, well THAT is interesting:
+
+```
+dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like '%Orth, %';
+ text_value |              authority               | confidence
+------------+--------------------------------------+------------
+ Orth, A.   | ab606e3a-2b04-4c7d-9423-14beccf54257 |         -1
+ Orth, A.   | ab606e3a-2b04-4c7d-9423-14beccf54257 |         -1
+ Orth, A.   | ab606e3a-2b04-4c7d-9423-14beccf54257 |         -1
+ Orth, Alan |                                      |         -1
+ Orth, Alan |                                      |         -1
+ Orth, Alan |                                      |         -1
+ Orth, Alan |                                      |         -1
+ Orth, A.   | 05c2c622-d252-4efb-b9ed-95a07d3adf11 |         -1
+ Orth, A.   | 05c2c622-d252-4efb-b9ed-95a07d3adf11 |         -1
+ Orth, A.   | ab606e3a-2b04-4c7d-9423-14beccf54257 |         -1
+ Orth, A.   | ab606e3a-2b04-4c7d-9423-14beccf54257 |         -1
+ Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 |        600
+ Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 |        600
+(13 rows)
+```
+
+- And now an actually relevant example:
+
+```
+dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence = 500;
+ count
+-------
+   707
+(1 row)
+
+dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence != 500;
+ count
+-------
+   253
+(1 row)
+```
+
+- Trying something experimental:
+
+```
+dspacetest=# update metadatavalue set confidence=500 where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security';
+UPDATE 960
+```
+
+- And then re-indexing authority and Discovery...?
+- After Discovery reindex the CCAFS authors are all together in the Authors sidebar facet
+- The docs for the ORCiD and Authority stuff for DSpace 5 mention changing the browse indexes to use the Authority as well:
+
+```
+webui.browse.index.2 = author:metadataAuthority:dc.contributor.author:authority
+```
+
+- That would only be for the "Browse by" function... so we'll have to see what effect that has later
diff --git a/public/2016-06/index.html b/public/2016-06/index.html
index 163f8ec73..b75b3b06d 100644
--- a/public/2016-06/index.html
+++ b/public/2016-06/index.html
@@ -116,6 +116,79 @@ UPDATE 14
  • A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue
  • I found a thread on the mailing list talking about it and there is bug report and a patch: https://jira.duraspace.org/browse/DS-2740
  • The patch applies successfully on DSpace 5.1 so I will try it later
  • + + +

    2016-06-03

    + + + +
    CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500
    +CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600
    +
    + + + +
    dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like '%Orth, %';
    + text_value |              authority               | confidence
    +------------+--------------------------------------+------------
    + Orth, A.   | ab606e3a-2b04-4c7d-9423-14beccf54257 |         -1
    + Orth, A.   | ab606e3a-2b04-4c7d-9423-14beccf54257 |         -1
    + Orth, A.   | ab606e3a-2b04-4c7d-9423-14beccf54257 |         -1
    + Orth, Alan |                                      |         -1
    + Orth, Alan |                                      |         -1
    + Orth, Alan |                                      |         -1
    + Orth, Alan |                                      |         -1
    + Orth, A.   | 05c2c622-d252-4efb-b9ed-95a07d3adf11 |         -1
    + Orth, A.   | 05c2c622-d252-4efb-b9ed-95a07d3adf11 |         -1
    + Orth, A.   | ab606e3a-2b04-4c7d-9423-14beccf54257 |         -1
    + Orth, A.   | ab606e3a-2b04-4c7d-9423-14beccf54257 |         -1
    + Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 |        600
    + Orth, Alan | ad281dbf-ef81-4007-96c3-a7f5d2eaa6d9 |        600
    +(13 rows)
    +
    + + + +
    dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence = 500;
    + count
    +-------
    +   707
    +(1 row)
    +
    +dspacetest=# select count(*) from metadatavalue where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security' and confidence != 500;
    + count
    +-------
    +   253
    +(1 row)
    +
    + + + +
    dspacetest=# update metadatavalue set confidence=500 where metadata_field_id=3 and text_value like 'CGIAR Research Program on Climate Change, Agriculture and Food Security';
    +UPDATE 960
    +
    + + + +
    webui.browse.index.2 = author:metadataAuthority:dc.contributor.author:authority
    +
    + +
diff --git a/public/index.html b/public/index.html
index 8072c0105..5eeb8a276 100644
--- a/public/index.html
+++ b/public/index.html
@@ -99,7 +99,7 @@
- 2016-06-01 Experimenting with IFPRI OAI (we want to harvest their publications) After reading the ContentDM documentation I found IFPRI’s OAI endpoint: http://ebrary.ifpri.org/oai/oai.php After reading the OAI documentation and testing with an OAI validator I found out how to get their publications This is their publications set: http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&from=2016-01-01&set=p15738coll2&metadataPrefix=oai_dc You can see the others by using the OAI ListSets verb: http://ebrary.ifpri.org/oai/oai.php?verb=ListSets Working on second phase of metadata migration, looks like this will
+ 2016-06-01 Experimenting with IFPRI OAI (we want to harvest their publications) After reading the ContentDM documentation I found IFPRI’s OAI endpoint: http://ebrary.ifpri.org/oai/oai.php After reading the OAI documentation and testing with an OAI validator I found out how to get their publications This is their publications set: http://ebrary.ifpri.org/oai/oai.php?verb=ListRecords&from=2016-01-01&set=p15738coll2&metadataPrefix=oai_dc You can see the others by using the OAI ListSets verb: http://ebrary.ifpri.org/oai/oai.php?verb=ListSets Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in dc.identifier.fund to cg.identifier.cpwfproject and then the rest to dc.description.sponsorship dspacetest=# update metadatavalue set metadata_field_id=130 where metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA'); UPDATE 497 dspacetest=# update metadatavalue set metadata_field_id=29 where metadata_field_id=75; UPDATE 14 Fix a few minor miscellaneous issues in dspace.cfg (#227) 2016-06-02 Testing the configuration and theme changes for the upcoming metadata migration and I found some issues with cg.coverage.admin-unit Seems that the Browse configuration in dspace.cfg can’t handle the ‘-’ in the field name: webui.browse.index.12 = subregion:metadata:cg.coverage.admin-unit:text But actually, I think since DSpace 4 or 5 (we are 5.1) the Browse indexes come from Discovery (defined in discovery.xml) so this is really just a parsing error I’ve sent a message to the DSpace mailing list to ask about the Browse index definition A user was having problems with submission and from the stacktrace it looks like a Sherpa/Romeo issue I found a thread on the mailing list talking about it and there is bug report and a patch: https://jira.duraspace.org/browse/DS-2740 The patch applies successfully on DSpace 5.1 so I will try it later 2016-06-03 Investigating the CCAFS authority issue, I exported the metadata for the Videos collection The top two authors are: CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::500 CGIAR Research Program on Climate Change, Agriculture and Food Security::acd00765-02f1-4b5b-92fa-bfa3877229ce::600 So the only difference is the “confidence” Ok, well THAT is interesting: dspacetest=# select text_value, authority, confidence from metadatavalue where metadata_field_id=3 and text_value like '%Orth, %'; text_value | authority | confidence ------------+--------------------------------------+------------ Orth, A.