mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-11-22 14:45:03 +01:00
Update notes for 2018-12-02
parent de150e2cf1
commit cad7ceaba1
@ -56,4 +56,28 @@ $ gs -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAli
DEBUG: FC_WEIGHT didn't match
```

- Start proofing the latest round of 226 IITA archive records that Bosede sent last week and Sisay uploaded to DSpace Test this weekend ([IITA_Dec_1_1997 aka Daniel1807](https://dspacetest.cgiar.org/handle/10568/108298))
  - One item missing the authorship type
  - Some invalid countries (smart quotes, misspellings)
  - Added countries to some items that mentioned research in particular countries in their abstracts
  - One item had "MADAGASCAR" for ISI Journal
  - Minor corrections in IITA subject (LIVELIHOOD→LIVELIHOODS)
  - Trim whitespace in abstract field
  - Fix some sponsors (though some with "Governments of Canada" etc I'm not sure why those are plural)
  - Eighteen items had `en||fr` for the language, but the content was only in French so changed them to just `fr`
  - Six items had encoding errors in French text so I will ask Bosede to re-do them carefully
  - Correct and normalize a few AGROVOC subjects
- Expand my "encoding error" detection GREL to include `~` as I saw a lot of that in some copy pasted French text recently:

```
or(
  isNotNull(value.match(/.*\uFFFD.*/)),
  isNotNull(value.match(/.*\u00A0.*/)),
  isNotNull(value.match(/.*\u200A.*/)),
  isNotNull(value.match(/.*\u2019.*/)),
  isNotNull(value.match(/.*\u00b4.*/)),
  isNotNull(value.match(/.*\u007e.*/))
)
```
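
For checking an exported CSV outside OpenRefine, the same test could be sketched in Python roughly as follows. This is not part of the original notes: the names `SUSPECT_CHARS` and `has_encoding_error` are hypothetical, and it is not an exact port of the GREL, since it scans every character instead of doing a whole-string regex match and so may behave differently on multi-line values.

```python
# Hypothetical helper, not from the notes: flags the same six code points
# that the GREL expression above looks for.
SUSPECT_CHARS = {
    "\ufffd",  # replacement character
    "\u00a0",  # no-break space
    "\u200a",  # hair space
    "\u2019",  # right single quotation mark (smart quote)
    "\u00b4",  # acute accent
    "\u007e",  # tilde
}


def has_encoding_error(value):
    """Return True if value contains any of the suspect characters."""
    if value is None:
        return False
    return any(ch in SUSPECT_CHARS for ch in value)


if __name__ == "__main__":
    # Made-up sample strings for illustration, not real metadata values
    for sample in ["Madagascar", "l\u2019agriculture", "caf\ufffd"]:
        print(repr(sample), has_encoding_error(sample))
```

Like the GREL, this only flags values for manual review. A smart quote or a tilde can be legitimate, so a hit is a hint rather than proof of an encoding error.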
<!-- vim: set sw=2 ts=2: -->
@ -21,7 +21,7 @@ I noticed that there is another issue with PDF thumbnails on CGSpace, and I see
" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2018-12/" /><meta property="article:published_time" content="2018-12-02T02:09:30+02:00"/>
<meta property="article:modified_time" content="2018-12-02T10:47:41+02:00"/>
<meta property="article:modified_time" content="2018-12-02T10:57:41+02:00"/>

<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="December, 2018"/>
@ -48,9 +48,9 @@ I noticed that there is another issue with PDF thumbnails on CGSpace, and I see
"@type": "BlogPosting",
"headline": "December, 2018",
"url": "https://alanorth.github.io/cgspace-notes/2018-12/",
"wordCount": "301",
"wordCount": "463",
"datePublished": "2018-12-02T02:09:30+02:00",
"dateModified": "2018-12-02T10:47:41+02:00",
"dateModified": "2018-12-02T10:57:41+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -172,6 +172,34 @@ zsh: segmentation fault (core dumped) gs -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -
DEBUG: FC_WEIGHT didn't match
</code></pre>

<ul>
<li>Start proofing the latest round of 226 IITA archive records that Bosede sent last week and Sisay uploaded to DSpace Test this weekend (<a href="https://dspacetest.cgiar.org/handle/10568/108298">IITA_Dec_1_1997 aka Daniel1807</a>)

<ul>
<li>One item missing the authorship type</li>
<li>Some invalid countries (smart quotes, misspellings)</li>
<li>Added countries to some items that mentioned research in particular countries in their abstracts</li>
<li>One item had “MADAGASCAR” for ISI Journal</li>
<li>Minor corrections in IITA subject (LIVELIHOOD→LIVELIHOODS)</li>
<li>Trim whitespace in abstract field</li>
<li>Fix some sponsors (though some with “Governments of Canada” etc I’m not sure why those are plural)</li>
<li>Eighteen items had <code>en||fr</code> for the language, but the content was only in French so changed them to just <code>fr</code></li>
<li>Six items had encoding errors in French text so I will ask Bosede to re-do them carefully</li>
<li>Correct and normalize a few AGROVOC subjects</li>
</ul></li>
<li>Expand my “encoding error” detection GREL to include <code>~</code> as I saw a lot of that in some copy pasted French text recently:</li>
</ul>

<pre><code>or(
isNotNull(value.match(/.*\uFFFD.*/)),
isNotNull(value.match(/.*\u00A0.*/)),
isNotNull(value.match(/.*\u200A.*/)),
isNotNull(value.match(/.*\u2019.*/)),
isNotNull(value.match(/.*\u00b4.*/)),
isNotNull(value.match(/.*\u007e.*/))
)
</code></pre>

<!-- vim: set sw=2 ts=2: -->
@ -4,7 +4,7 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/2018-12/</loc>
<lastmod>2018-12-02T10:47:41+02:00</lastmod>
<lastmod>2018-12-02T10:57:41+02:00</lastmod>
</url>

<url>
@ -199,7 +199,7 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2018-12-02T10:47:41+02:00</lastmod>
<lastmod>2018-12-02T10:57:41+02:00</lastmod>
<priority>0</priority>
</url>

@ -210,7 +210,7 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
<lastmod>2018-12-02T10:47:41+02:00</lastmod>
<lastmod>2018-12-02T10:57:41+02:00</lastmod>
<priority>0</priority>
</url>

@ -222,13 +222,13 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2018-12-02T10:47:41+02:00</lastmod>
<lastmod>2018-12-02T10:57:41+02:00</lastmod>
<priority>0</priority>
</url>

<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/</loc>
<lastmod>2018-12-02T10:47:41+02:00</lastmod>
<lastmod>2018-12-02T10:57:41+02:00</lastmod>
<priority>0</priority>
</url>
