mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Update notes for 2020-10-22
@@ -23,7 +23,7 @@ During the FlywayDB migration I got an error:
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2020-10/" />
<meta property="article:published_time" content="2020-10-06T16:55:54+03:00" />
<meta property="article:modified_time" content="2020-10-21T15:36:31+03:00" />
<meta property="article:modified_time" content="2020-10-22T11:58:26+03:00" />

<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="October, 2020"/>
@@ -51,9 +51,9 @@ During the FlywayDB migration I got an error:
"@type": "BlogPosting",
"headline": "October, 2020",
"url": "https://alanorth.github.io/cgspace-notes/2020-10/",
"wordCount": "4233",
"wordCount": "4350",
"datePublished": "2020-10-06T16:55:54+03:00",
"dateModified": "2020-10-21T15:36:31+03:00",
"dateModified": "2020-10-22T11:58:26+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -839,6 +839,31 @@ $ csvcut -c 'id,dc.subject[],dc.subject[en_US],cg.subject.ilri[],cg.subject.ilri
</ul>
</li>
<li>Add two new blocks to list the top communities and collections on AReS</li>
<li>I want to extract all CRPs and affiliations from AReS to do some text processing and create some mappings…
<ul>
<li>First extract 10,000 affiliations from Elasticsearch by only including the <code>affiliation</code> source:</li>
</ul>
</li>
</ul>
<pre><code>$ http 'http://localhost:9200/openrxv-items-final/_search?_source_includes=affiliation&size=10000&q=*:*' > /tmp/affiliations.json
</code></pre><ul>
<li>Then I decided to try a different approach and I adjusted my <code>convert-mapping.py</code> script to reconsider some replacement patterns with acronyms from the original AReS <code>mapping.json</code> file to hopefully address some MEL to CGSpace mappings
<ul>
<li>For example, to change this:
<ul>
<li>find: International Livestock Research Institute</li>
<li>replace: International Livestock Research Institute - ILRI</li>
</ul>
</li>
<li>… into this:
<ul>
<li>find: International Livestock Research Institute - ILRI</li>
<li>replace: International Livestock Research Institute</li>
</ul>
</li>
</ul>
</li>
<li>I re-uploaded the mappings to Elasticsearch like I did yesterday and restarted the harvesting</li>
</ul>
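The affiliation export mentioned above could be post-processed along these lines. This is only a sketch: it relies on the standard Elasticsearch search response layout (`hits.hits[]._source`), and it assumes the `affiliation` field holds either a single string or a list of strings. The function names and the sample data are made up for illustration and are not taken from the notes.

```python
import json
from collections import Counter


def count_affiliations(response):
    """Count affiliation values in an Elasticsearch search response dict."""
    counts = Counter()
    for hit in response.get("hits", {}).get("hits", []):
        value = hit.get("_source", {}).get("affiliation")
        if value is None:
            continue
        # Assumption: the field is either one string or a list of strings.
        values = [value] if isinstance(value, str) else value
        counts.update(v.strip() for v in values if v and v.strip())
    return counts


def load_counts(path):
    """Read a saved search response (e.g. /tmp/affiliations.json) and count it."""
    with open(path, encoding="utf-8") as handle:
        return count_affiliations(json.load(handle))


# Hand-written sample in the shape of a search response (not real data).
sample = {
    "hits": {
        "hits": [
            {"_source": {"affiliation": [
                "International Livestock Research Institute",
                "Example University",
            ]}},
            {"_source": {"affiliation": "International Livestock Research Institute"}},
            {"_source": {}},
        ]
    }
}

counts = count_affiliations(sample)
for name, total in counts.most_common():
    print(total, name)
```

With the real export, `load_counts("/tmp/affiliations.json")` would give the same kind of tally, which is a convenient starting point for spotting near-duplicate names that need a mapping.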
<!-- raw HTML omitted -->
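The find/replace swap described in the notes can be sketched as follows. This is not the actual `convert-mapping.py`: the entry layout (dicts with `find` and `replace` keys), the `ACRONYM_SUFFIX` pattern, and the function name are assumptions made for illustration only.

```python
import re

# Hypothetical pattern for "<full name> - <ACRONYM>", e.g.
# "International Livestock Research Institute - ILRI".
ACRONYM_SUFFIX = re.compile(r"^(?P<name>.+) - (?P<acronym>[A-Z][A-Za-z]+)$")


def swap_acronym_mapping(entry):
    """Swap find/replace when the replacement is the find value plus an
    acronym suffix; otherwise return an unchanged copy of the entry."""
    match = ACRONYM_SUFFIX.match(entry["replace"])
    if match and match.group("name") == entry["find"]:
        return {"find": entry["replace"], "replace": entry["find"]}
    return dict(entry)


# Illustrative entries only; the second one has no acronym suffix.
mappings = [
    {
        "find": "International Livestock Research Institute",
        "replace": "International Livestock Research Institute - ILRI",
    },
    {
        "find": "ILRI",
        "replace": "International Livestock Research Institute",
    },
]

converted = [swap_acronym_mapping(m) for m in mappings]
for m in converted:
    print(m)
```

Only entries whose replacement is exactly the find value plus an acronym are swapped, so unrelated mappings pass through untouched.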