mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2016-10-04
@ -112,6 +112,28 @@
<li>Looks like we’ll just have to add the text to the About page (without a link) or add a separate page</li>
</ul>
<h2 id="2016-10-04">2016-10-04</h2>
<ul>
<li>Start testing cleanups of authors that Peter sent last week</li>
<li>Out of 40,000+ rows, Peter had indicated corrections for ~3,200 of them. That is too many to review by hand, so I did some basic quality checking:
<ul>
<li>Trim leading/trailing whitespace</li>
<li>Find invalid characters</li>
<li>Cluster values to merge obvious authors</li>
</ul></li>
<li>That left us with 3,180 valid corrections and 3 deletions:</li>
</ul>
<pre><code>$ ./fix-metadata-values.py -i authors-fix-3180.csv -f dc.contributor.author -t correct -m 3 -d dspacetest -u dspacetest -p fuuu
$ ./delete-metadata-values.py -i authors-delete-3.csv -f dc.contributor.author -m 3 -d dspacetest -u dspacetest -p fuuu
</code></pre>
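<p>The notes do not say which tool was used for the quality checks listed above, so the following is only a rough Python sketch of what the three steps could look like. The function names (<code>trim</code>, <code>has_invalid_characters</code>, <code>fingerprint</code>, <code>cluster</code>) are hypothetical, "invalid" is assumed here to mean control characters or the Unicode replacement character, and the clustering uses a simple fingerprint key similar in spirit to key-collision clustering in tools such as OpenRefine.</p>

```python
import re
import unicodedata
from collections import defaultdict


def trim(value):
    """Remove leading and trailing whitespace from an author name."""
    return value.strip()


def has_invalid_characters(value):
    """Assumption: 'invalid' means control characters or U+FFFD."""
    return any(
        unicodedata.category(ch) == "Cc" or ch == "\ufffd" for ch in value
    )


def fingerprint(value):
    """Build a key that ignores case, accents, punctuation and token order."""
    decomposed = unicodedata.normalize("NFKD", value.strip().lower())
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    tokens = re.sub(r"[^\w\s]", "", ascii_only).split()
    return " ".join(sorted(set(tokens)))


def cluster(values):
    """Group distinct spellings that share a fingerprint."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    # Only keys with more than one distinct spelling are merge candidates.
    return [sorted(group) for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    # Made-up sample values, not taken from the real authors CSV.
    sample = ["Orth, Alan", "Orth, Alan ", "alan orth", "Ballantyne, Peter"]
    for group in cluster(sample):
        print(group)
```

<p>Each group returned by <code>cluster</code> is only a candidate for merging; a person would still need to confirm that the spellings refer to the same author before adding them to a corrections CSV.</p>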
<ul>
<li>Remove old about page (<a href="https://github.com/ilri/DSpace/pull/284">#284</a>)</li>
</ul>
</article>