From 41ba0acca9760de0d0bfc6397657ae0c970ce3a8 Mon Sep 17 00:00:00 2001
From: Alan Orth
Date: Tue, 20 Jun 2017 12:00:40 +0300
Subject: [PATCH] Add notes for 2017-06-20

---
 content/post/2017-06.md   | 16 ++++++++++++++++
 public/2017-06/index.html | 26 +++++++++++++++++++++++---
 public/sitemap.xml        | 10 +++++-----
 3 files changed, 44 insertions(+), 8 deletions(-)

diff --git a/content/post/2017-06.md b/content/post/2017-06.md
index b999d6aaa..4282cb32b 100644
--- a/content/post/2017-06.md
+++ b/content/post/2017-06.md
@@ -91,3 +91,19 @@ $ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" [dspace]/bin/dspace import --add -
 
 - Redeploy CGSpace with latest changes from `5_x-prod`, run system updates, and reboot the server
 - Continue working on ansible infrastructure changes for CGIAR Library
+
+## 2017-06-20
+
+- Import Abenet and Peter's changes to the CGIAR Library CRP community
+- Due to them using Windows and renaming some columns there were formatting, encoding, and duplicate metadata value issues
+- I had to remove some fields from the CSV and rename some back to, ie, `dc.subject[en_US]` just so DSpace would detect changes properly
+- Now it looks much better: https://dspacetest.cgiar.org/handle/10947/2517
+- Removing the HTML tags and HTML/XML entities using the following GREL:
+    - `replace(value,/<\/?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)\/?>/,'')`
+    - `value.unescape("html").unescape("xml")`
+- Finally import 914 CIAT Book Chapters to CGSpace in two batches:
+
+```
+$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" [dspace]/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/35701 --source /home/aorth/CIAT-Books/SimpleArchiveFormat/ --mapfile=/tmp/ciat-books.map &> /tmp/ciat-books.log
+$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" [dspace]/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/35701 --source /home/aorth/CIAT-Books/SimpleArchiveFormat/ --mapfile=/tmp/ciat-books2.map &> /tmp/ciat-books2.log
+```
diff --git a/public/2017-06/index.html b/public/2017-06/index.html
index cd5fd3226..5af0afc07 100644
--- a/public/2017-06/index.html
+++ b/public/2017-06/index.html
@@ -13,7 +13,7 @@
-
+
@@ -45,9 +45,9 @@
   "@type": "BlogPosting",
   "headline": "June, 2017",
   "url": "https://alanorth.github.io/cgspace-notes/2017-06/",
-  "wordCount": "892",
+  "wordCount": "1001",
   "datePublished": "2017-06-01T10:14:52+03:00",
-  "dateModified": "2017-06-07T18:12:09+03:00",
+  "dateModified": "2017-06-18T14:53:20+03:00",
   "author": {
     "@type": "Person",
     "name": "Alan Orth"
@@ -223,6 +223,26 @@
  • Continue working on ansible infrastructure changes for CGIAR Library
  • +

    2017-06-20

    + + + +
    $ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" [dspace]/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/35701 --source /home/aorth/CIAT-Books/SimpleArchiveFormat/ --mapfile=/tmp/ciat-books.map &> /tmp/ciat-books.log
    +$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" [dspace]/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/35701 --source /home/aorth/CIAT-Books/SimpleArchiveFormat/ --mapfile=/tmp/ciat-books2.map &> /tmp/ciat-books2.log
    +
+
diff --git a/public/sitemap.xml b/public/sitemap.xml
index ab0635056..75e91dc73 100644
--- a/public/sitemap.xml
+++ b/public/sitemap.xml
@@ -4,7 +4,7 @@
 https://alanorth.github.io/cgspace-notes/2017-06/
-2017-06-07T18:12:09+03:00
+2017-06-18T14:53:20+03:00
@@ -104,7 +104,7 @@
 https://alanorth.github.io/cgspace-notes/
-2017-06-07T18:12:09+03:00
+2017-06-18T14:53:20+03:00
 0
@@ -115,19 +115,19 @@
 https://alanorth.github.io/cgspace-notes/tags/notes/
-2017-06-07T18:12:09+03:00
+2017-06-18T14:53:20+03:00
 0
 https://alanorth.github.io/cgspace-notes/post/
-2017-06-07T18:12:09+03:00
+2017-06-18T14:53:20+03:00
 0
 https://alanorth.github.io/cgspace-notes/tags/
-2017-06-07T18:12:09+03:00
+2017-06-18T14:53:20+03:00
 0
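
Note on the GREL cleanup in the 2017-06-20 notes: the two expressions (strip HTML tags with a regex, then unescape HTML/XML entities) can be approximated outside OpenRefine. The following is a minimal Python sketch, not the method used in the commit. The function names `strip_tags` and `clean_value` are hypothetical, and it assumes that Python's `html.unescape` is an acceptable stand-in for GREL's `unescape("html").unescape("xml")`, since the XML predefined entities are a subset of the HTML ones.

```python
import html
import re

# Same pattern as the GREL replace() in the notes, written as a Python raw string.
TAG_RE = re.compile(
    r"""<\/?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)\/?>"""
)


def strip_tags(value: str) -> str:
    """Remove anything that looks like an HTML/XML tag (hypothetical helper)."""
    return TAG_RE.sub("", value)


def clean_value(value: str) -> str:
    """Strip tags first, then decode entities, in the order the notes list them."""
    return html.unescape(strip_tags(value))


if __name__ == "__main__":
    print(clean_value('<p class="x">Rice &amp; beans</p>'))
```

Keeping the order (tags first, entities second) means text that was entity-escaped in the metadata, such as `&lt;b&gt;`, is decoded to literal characters and not treated as markup to remove.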