mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2021-01-04
@@ -20,7 +20,7 @@ I started processing those (about 411,000 records):
 <meta property="og:type" content="article" />
 <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2020-12/" />
 <meta property="article:published_time" content="2020-12-01T11:32:54+02:00" />
-<meta property="article:modified_time" content="2020-12-30T09:44:45+02:00" />
+<meta property="article:modified_time" content="2021-01-04T14:09:58+02:00" />
 
 
 
@@ -46,9 +46,9 @@ I started processing those (about 411,000 records):
 "@type": "BlogPosting",
 "headline": "December, 2020",
 "url": "https://alanorth.github.io/cgspace-notes/2020-12/",
-"wordCount": "3785",
+"wordCount": "3772",
 "datePublished": "2020-12-01T11:32:54+02:00",
-"dateModified": "2020-12-30T09:44:45+02:00",
+"dateModified": "2021-01-04T14:09:58+02:00",
 "author": {
 "@type": "Person",
 "name": "Alan Orth"
@@ -631,7 +631,7 @@ java.lang.UnsupportedOperationException: Multiple update components target the s
 </code></pre><ul>
 <li>I sent the full stack to Atmire to investigate
 <ul>
-<li>I know we’ve had thisi “Multiple update components target the same field” error in the past with DSpace 5.x and Atmire said it was harmless, but would nevertheless be fixed in a future update</li>
+<li>I know we’ve had this “Multiple update components target the same field” error in the past with DSpace 5.x and Atmire said it was harmless, but would nevertheless be fixed in a future update</li>
 </ul>
 </li>
 <li>I was trying to export the ILRI community on CGSpace so I could update one of the ILRI author’s names, but it throws an error…</li>
@@ -661,28 +661,35 @@ java.lang.NullPointerException
 dc.contributor.author,correct
 "Padmakumar, V.P.","Varijakshapanicker, Padmakumar"
 $ ./fix-metadata-values.py -i 2020-12-17-update-ILRI-author.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t 'correct' -m 3
-
-- Abenet needed a list of all 2020 outputs from the Livestock CRP that were Limited Access
-- I exported the community from CGSpace and used `csvcut` and `csvgrep` to get a list:
-
-</code></pre><p>$ csvcut -c ‘dc.identifier.citation[en_US],dc.identifier.uri,dc.identifier.uri[],dc.identifier.uri[en_US],dc.date.issued,dc.date.issued[],dc.date.issued[en_US],cg.identifier.status[en_US]’ ~/Downloads/10568-80099.csv | csvgrep -c ‘cg.identifier.status[en_US]’ -m ‘Limited Access’ | csvgrep -c ‘dc.date.issued’ -m 2020 -c ‘dc.date.issued[]’ -m 2020 -c ‘dc.date.issued[en_US]’ -m 2020 > /tmp/limited-2020.csv</p>
-<pre><code>
-## 2020-12-18
-
-- I added support for indexing community views and downloads to [dspace-statistics-api](https://github.com/ilri/dspace-statistics-api)
-- I still have to add the API endpoints to make the stats available
-- Also, I played a little bit with Swagger via [falcon-swagger-ui](https://github.com/rdidyk/falcon-swagger-ui) and I think I can get that working for better API documentation / testing
-- Atmire sent some feedback on the DeduplicateValuesProcessor
-- They confirm that it should process _all_ duplicates, not just those in `owningComm` and `owningColl`
-- They asked me to try it again on DSpace Test now that I've resync'd the Solr statistics cores from production
-- I started processing the statistics core on DSpace Test
-
-## 2020-12-20
-
-- The DeduplicateValuesProcessor has been running on DSpace Test since two days ago and it almost completed its second twelve-hour run, but crashed near the end:
-
-```console
-...
+</code></pre><ul>
+<li>Abenet needed a list of all 2020 outputs from the Livestock CRP that were Limited Access
+<ul>
+<li>I exported the community from CGSpace and used <code>csvcut</code> and <code>csvgrep</code> to get a list:</li>
+</ul>
+</li>
+</ul>
+<pre><code>$ csvcut -c 'dc.identifier.citation[en_US],dc.identifier.uri,dc.identifier.uri[],dc.identifier.uri[en_US],dc.date.issued,dc.date.issued[],dc.date.issued[en_US],cg.identifier.status[en_US]' ~/Downloads/10568-80099.csv | csvgrep -c 'cg.identifier.status[en_US]' -m 'Limited Access' | csvgrep -c 'dc.date.issued' -m 2020 -c 'dc.date.issued[]' -m 2020 -c 'dc.date.issued[en_US]' -m 2020 > /tmp/limited-2020.csv
+</code></pre><h2 id="2020-12-18">2020-12-18</h2>
+<ul>
+<li>I added support for indexing community views and downloads to <a href="https://github.com/ilri/dspace-statistics-api">dspace-statistics-api</a>
+<ul>
+<li>I still have to add the API endpoints to make the stats available</li>
+<li>Also, I played a little bit with Swagger via <a href="https://github.com/rdidyk/falcon-swagger-ui">falcon-swagger-ui</a> and I think I can get that working for better API documentation / testing</li>
+</ul>
+</li>
+<li>Atmire sent some feedback on the DeduplicateValuesProcessor
+<ul>
+<li>They confirm that it should process <em>all</em> duplicates, not just those in <code>owningComm</code> and <code>owningColl</code></li>
+<li>They asked me to try it again on DSpace Test now that I’ve resync’d the Solr statistics cores from production</li>
+<li>I started processing the statistics core on DSpace Test</li>
+</ul>
+</li>
+</ul>
+<h2 id="2020-12-20">2020-12-20</h2>
+<ul>
+<li>The DeduplicateValuesProcessor has been running on DSpace Test since two days ago and it almost completed its second twelve-hour run, but crashed near the end:</li>
+</ul>
+<pre><code class="language-console" data-lang="console">...
 Run 1 — 100% — 8,230,000/8,239,228 docs — 39s — 9h 8m 31s
 Exception: Java heap space
 java.lang.OutOfMemoryError: Java heap space