diff --git a/content/posts/2021-02.md b/content/posts/2021-02.md
index b02505879..60b7ffd15 100644
--- a/content/posts/2021-02.md
+++ b/content/posts/2021-02.md
@@ -201,6 +201,13 @@ $ ./ilri/delete-metadata-values.py -i /tmp/2020-10-28-Series-PB.csv -db dspace -
 - Sistematización de experiencias Proyecto ACORDAR
 - Strüngmann Forum
 - Unité de Recherche
+- I ended up using [python-ftfy](https://github.com/LuminosoInsight/python-ftfy) to fix those very easily, then replaced them in the CSV
+- Then I trimmed whitespace at the beginning, end, and around the ";", and applied the 1,600 fixes using `fix-metadata-values.py`:
+
+```console
+$ ./ilri/fix-metadata-values.py -i /tmp/2020-10-28-Series-PB.csv -db dspace -u dspace -p 'fuuu' -f dc.relation.ispartofseries -t 'correct' -m 43
+```
+
 - Help Peter debug an issue with one of Alan Duncan's new FEAST Data reports on CGSpace
 - For some reason the default policy for the item was "COLLECTION_492_DEFAULT_READ" group, which had zero members
 - I changed them all to Anonymous and the item was accessible
diff --git a/docs/2021-02/index.html b/docs/2021-02/index.html
index f7e1ba54f..91cba332d 100644
--- a/docs/2021-02/index.html
+++ b/docs/2021-02/index.html
@@ -32,7 +32,7 @@ $ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty
-
+
@@ -70,9 +70,9 @@ $ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty
 "@type": "BlogPosting",
 "headline": "February, 2021",
 "url": "https://alanorth.github.io/cgspace-notes/2021-02/",
- "wordCount": "1355",
+ "wordCount": "1406",
 "datePublished": "2021-02-01T10:13:54+02:00",
- "dateModified": "2021-02-01T12:28:54+02:00",
+ "dateModified": "2021-02-04T17:28:20+02:00",
 "author": {
 "@type": "Person",
 "name": "Alan Orth"
@@ -357,6 +357,11 @@ $ dspace oai import -c
  • Unité de Recherche
+ • I ended up using python-ftfy to fix those very easily, then replaced them in the CSV
+ • Then I trimmed whitespace at the beginning, end, and around the “;”, and applied the 1,600 fixes using fix-metadata-values.py:
+
+   $ ./ilri/fix-metadata-values.py -i /tmp/2020-10-28-Series-PB.csv -db dspace -u dspace -p 'fuuu' -f dc.relation.ispartofseries -t 'correct' -m 43
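The clean-up described in the added notes (repair the encoding with python-ftfy, then trim whitespace at the beginning, end, and around the ";") could be sketched roughly as follows. This is an illustration under assumptions, not the code used for this commit: the helper name `clean_series` is hypothetical, the sketch assumes "trimming" means removing all whitespace on both sides of the separator, and it does not reproduce the CSV layout that `fix-metadata-values.py` expects, since the diff does not show it. `ftfy.fix_text` is the library's real entry point; the fallback only exists so the sketch runs where ftfy is not installed.

```python
try:
    # ftfy is a third-party package (pip install ftfy); fix_text() repairs
    # mis-decoded text (mojibake) in a string.
    from ftfy import fix_text
except ImportError:
    # Fallback so the sketch still runs without ftfy: leave the text unchanged.
    def fix_text(text):
        return text


def clean_series(value, separator=";"):
    """Hypothetical helper: repair encoding, then trim whitespace.

    Whitespace is removed at the beginning and end of the value and on
    both sides of each separator (an assumption about the intended format).
    """
    repaired = fix_text(value)
    parts = [part.strip() for part in repaired.split(separator)]
    return separator.join(parts)


if __name__ == "__main__":
    # Made-up sample value, not taken from the real CSV.
    print(clean_series("  Example Working Paper ; 12  "))
```

Applied to each value in the series column, the cleaned strings would then serve as the "correct" values passed to the fix script with `-t 'correct'`.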