Add notes for 2017-06-20

Alan Orth 2017-06-20 12:00:40 +03:00
parent 4756e9025b
commit 41ba0acca9
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
3 changed files with 44 additions and 8 deletions

@@ -91,3 +91,19 @@ $ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" [dspace]/bin/dspace import --add -
- Redeploy CGSpace with latest changes from `5_x-prod`, run system updates, and reboot the server
- Continue working on ansible infrastructure changes for CGIAR Library
## 2017-06-20
- Import Abenet and Peter's changes to the CGIAR Library CRP community
- Because they used Windows and renamed some columns, there were formatting, encoding, and duplicate metadata value issues
- I had to remove some fields from the CSV and rename others back to their original names (e.g., `dc.subject[en_US]`) so that DSpace would detect the changes properly
- Now it looks much better: https://dspacetest.cgiar.org/handle/10947/2517
- I removed the HTML tags and HTML/XML entities using the following GREL:
  - `replace(value,/<\/?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)\/?>/,'')`
  - `value.unescape("html").unescape("xml")`
- Finally import 914 CIAT Book Chapters to CGSpace in two batches:
```
$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" [dspace]/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/35701 --source /home/aorth/CIAT-Books/SimpleArchiveFormat/ --mapfile=/tmp/ciat-books.map &> /tmp/ciat-books.log
$ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" [dspace]/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/35701 --source /home/aorth/CIAT-Books/SimpleArchiveFormat/ --mapfile=/tmp/ciat-books2.map &> /tmp/ciat-books2.log
```
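
The two GREL expressions above can be approximated outside OpenRefine. This is a minimal Python sketch, not what was actually run on the data: it reuses the same regular expression, and `strip_html` is a hypothetical helper name. It assumes a single pass of `html.unescape()` is an acceptable stand-in for GREL's chained `unescape("html").unescape("xml")`, so doubly escaped entities would be decoded only once.

```python
import html
import re

# Same pattern as the GREL replace() call, written as a Python raw string.
TAG_PATTERN = re.compile(
    r"""<\/?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)\/?>"""
)


def strip_html(value: str) -> str:
    """Remove HTML tags, then decode HTML/XML character entities."""
    without_tags = TAG_PATTERN.sub("", value)
    # html.unescape() decodes named and numeric character references,
    # which includes the five predefined XML entities.
    return html.unescape(without_tags)


if __name__ == "__main__":
    print(strip_html('<p>Rice &amp; <a href="x">beans</a></p>'))
```

Tags are stripped before entities are decoded, so escaped markup such as `&lt;b&gt;` ends up as literal text rather than being removed.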

@@ -13,7 +13,7 @@
<meta property="article:published_time" content="2017-06-01T10:14:52&#43;03:00"/>
<meta property="article:modified_time" content="2017-06-07T18:12:09&#43;03:00"/>
<meta property="article:modified_time" content="2017-06-18T14:53:20&#43;03:00"/>
@@ -45,9 +45,9 @@
"@type": "BlogPosting",
"headline": "June, 2017",
"url": "https://alanorth.github.io/cgspace-notes/2017-06/",
"wordCount": "892",
"wordCount": "1001",
"datePublished": "2017-06-01T10:14:52&#43;03:00",
"dateModified": "2017-06-07T18:12:09&#43;03:00",
"dateModified": "2017-06-18T14:53:20&#43;03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -223,6 +223,26 @@
<li>Continue working on ansible infrastructure changes for CGIAR Library</li>
</ul>
<h2 id="2017-06-20">2017-06-20</h2>
<ul>
<li>Import Abenet and Peter&rsquo;s changes to the CGIAR Library CRP community</li>
<li>Because they used Windows and renamed some columns, there were formatting, encoding, and duplicate metadata value issues</li>
<li>I had to remove some fields from the CSV and rename others back to their original names (e.g., <code>dc.subject[en_US]</code>) so that DSpace would detect the changes properly</li>
<li>Now it looks much better: <a href="https://dspacetest.cgiar.org/handle/10947/2517">https://dspacetest.cgiar.org/handle/10947/2517</a></li>
<li>I removed the HTML tags and HTML/XML entities using the following GREL:
<ul>
<li><code>replace(value,/&lt;\/?\w+((\s+\w+(\s*=\s*(?:&quot;.*?&quot;|'.*?'|[^'&quot;&gt;\s]+))?)+\s*|\s*)\/?&gt;/,'')</code></li>
<li><code>value.unescape(&quot;html&quot;).unescape(&quot;xml&quot;)</code></li>
</ul></li>
<li>Finally import 914 CIAT Book Chapters to CGSpace in two batches:</li>
</ul>
<pre><code>$ JAVA_OPTS=&quot;-Xmx1024m -Dfile.encoding=UTF-8&quot; [dspace]/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/35701 --source /home/aorth/CIAT-Books/SimpleArchiveFormat/ --mapfile=/tmp/ciat-books.map &amp;&gt; /tmp/ciat-books.log
$ JAVA_OPTS=&quot;-Xmx1024m -Dfile.encoding=UTF-8&quot; [dspace]/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/35701 --source /home/aorth/CIAT-Books/SimpleArchiveFormat/ --mapfile=/tmp/ciat-books2.map &amp;&gt; /tmp/ciat-books2.log
</code></pre>

@@ -4,7 +4,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/2017-06/</loc>
<lastmod>2017-06-07T18:12:09+03:00</lastmod>
<lastmod>2017-06-18T14:53:20+03:00</lastmod>
</url>
<url>
@@ -104,7 +104,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2017-06-07T18:12:09+03:00</lastmod>
<lastmod>2017-06-18T14:53:20+03:00</lastmod>
<priority>0</priority>
</url>
@@ -115,19 +115,19 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
<lastmod>2017-06-07T18:12:09+03:00</lastmod>
<lastmod>2017-06-18T14:53:20+03:00</lastmod>
<priority>0</priority>
</url>
<url>
<loc>https://alanorth.github.io/cgspace-notes/post/</loc>
<lastmod>2017-06-07T18:12:09+03:00</lastmod>
<lastmod>2017-06-18T14:53:20+03:00</lastmod>
<priority>0</priority>
</url>
<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/</loc>
<lastmod>2017-06-07T18:12:09+03:00</lastmod>
<lastmod>2017-06-18T14:53:20+03:00</lastmod>
<priority>0</priority>
</url>