mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2020-01-27
@@ -9,8 +9,8 @@
<meta property="og:description" content="2017-01-02
I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error
I tested on DSpace Test as well and it doesn't work there either
I asked on the dspace-tech mailing list because it seems to be broken, and actually now I'm not sure if we've ever had the sharding task run successfully over all these years
I tested on DSpace Test as well and it doesn’t work there either
I asked on the dspace-tech mailing list because it seems to be broken, and actually now I’m not sure if we’ve ever had the sharding task run successfully over all these years
" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2017-01/" />
@@ -22,10 +22,10 @@ I asked on the dspace-tech mailing list because it seems to be broken, and actua
<meta name="twitter:description" content="2017-01-02
I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error
I tested on DSpace Test as well and it doesn't work there either
I asked on the dspace-tech mailing list because it seems to be broken, and actually now I'm not sure if we've ever had the sharding task run successfully over all these years
I tested on DSpace Test as well and it doesn’t work there either
I asked on the dspace-tech mailing list because it seems to be broken, and actually now I’m not sure if we’ve ever had the sharding task run successfully over all these years
"/>
<meta name="generator" content="Hugo 0.62.2" />
<meta name="generator" content="Hugo 0.63.1" />
@@ -55,7 +55,7 @@ I asked on the dspace-tech mailing list because it seems to be broken, and actua
<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.a20c1a4367639632cdb341d23c27ca44fedcc75b0f8b3cbea6203010da153d3c.css" rel="stylesheet" integrity="sha256-ogwaQ2djljLNs0HSPCfKRP7cx1sPizy+piAwENoVPTw=" crossorigin="anonymous">
<link href="https://alanorth.github.io/cgspace-notes/css/style.23e2c3298bcc8c1136c19aba330c211ec94c36f7c4454ea15cf4d3548370042a.css" rel="stylesheet" integrity="sha256-I+LDKYvMjBE2wZq6MwwhHslMNvfERU6hXPTTVINwBCo=" crossorigin="anonymous">
<!-- RSS 2.0 feed -->
@@ -103,15 +103,15 @@ I asked on the dspace-tech mailing list because it seems to be broken, and actua
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2017-01/">January, 2017</a></h2>
<p class="blog-post-meta"><time datetime="2017-01-02T10:43:00+03:00">Mon Jan 02, 2017</time> by Alan Orth in
<i class="fa fa-tag" aria-hidden="true"></i> <a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>
<span class="fas fa-tag" aria-hidden="true"></span> <a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>
</p>
</header>
<h2 id="2017-01-02">2017-01-02</h2>
<ul>
<li>I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error</li>
<li>I tested on DSpace Test as well and it doesn't work there either</li>
<li>I asked on the dspace-tech mailing list because it seems to be broken, and actually now I'm not sure if we've ever had the sharding task run successfully over all these years</li>
<li>I tested on DSpace Test as well and it doesn’t work there either</li>
<li>I asked on the dspace-tech mailing list because it seems to be broken, and actually now I’m not sure if we’ve ever had the sharding task run successfully over all these years</li>
</ul>
<h2 id="2017-01-04">2017-01-04</h2>
<ul>
@@ -186,18 +186,18 @@ Caused by: java.net.SocketException: Broken pipe (Write failed)
</ul>
<h2 id="2017-01-08">2017-01-08</h2>
<ul>
<li>Put Sisay's <code>item-view.xsl</code> code to show mapped collections on CGSpace (<a href="https://github.com/ilri/DSpace/pull/295">#295</a>)</li>
<li>Put Sisay’s <code>item-view.xsl</code> code to show mapped collections on CGSpace (<a href="https://github.com/ilri/DSpace/pull/295">#295</a>)</li>
</ul>
<h2 id="2017-01-09">2017-01-09</h2>
<ul>
<li>A user wrote to tell me that the new display of an item's mappings had a crazy bug for at least one item: <a href="https://cgspace.cgiar.org/handle/10568/78596">https://cgspace.cgiar.org/handle/10568/78596</a></li>
<li>A user wrote to tell me that the new display of an item’s mappings had a crazy bug for at least one item: <a href="https://cgspace.cgiar.org/handle/10568/78596">https://cgspace.cgiar.org/handle/10568/78596</a></li>
<li>She said she only mapped it once, but it appears to be mapped 184 times</li>
</ul>
<p><img src="/cgspace-notes/2017/01/mapping-crazy-duplicate.png" alt="Crazy item mapping"></p>
<h2 id="2017-01-10">2017-01-10</h2>
<ul>
<li>I tried to clean up the duplicate mappings by exporting the item's metadata to CSV, editing, and re-importing, but DSpace said “no changes were detected”</li>
<li>I've asked on the dspace-tech mailing list to see if anyone can help</li>
<li>I tried to clean up the duplicate mappings by exporting the item’s metadata to CSV, editing, and re-importing, but DSpace said “no changes were detected”</li>
<li>I’ve asked on the dspace-tech mailing list to see if anyone can help</li>
<li>I found an old post on the mailing list discussing a similar issue, and listing some SQL commands that might help</li>
<li>For example, this shows 186 mappings for the item, the first three of which are real:</li>
</ul>
@@ -226,7 +226,7 @@ UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 15:
<pre><code>print("Fixing {} occurences of: {}".format(records_to_fix, record[0].encode('utf-8')))
</code></pre><ul>
<li>See: <a href="http://stackoverflow.com/a/36427358/487333">http://stackoverflow.com/a/36427358/487333</a></li>
<li>I'm actually not sure if we need to encode() the strings to UTF-8 before writing them to the database… I've never had this issue before</li>
<li>I’m actually not sure if we need to encode() the strings to UTF-8 before writing them to the database… I’ve never had this issue before</li>
<li>Now back to cleaning up some journal titles so we can make the controlled vocabulary:</li>
</ul>
<pre><code>$ ./fix-metadata-values.py -i /tmp/fix-27-journal-titles.csv -f dc.source -t correct -m 55 -d dspace -u dspace -p 'fuuu'
@@ -237,7 +237,7 @@ UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 15:
</code></pre><ul>
<li>The values are a bit dirty and outdated, since the file I had given to Abenet and Peter was from November</li>
<li>I will have to go through these and fix some more before making the controlled vocabulary</li>
<li>Added 30 more corrections or so, now there are 49 total and I'll have to get the top 500 after applying them</li>
<li>Added 30 more corrections or so, now there are 49 total and I’ll have to get the top 500 after applying them</li>
</ul>
<h2 id="2017-01-13">2017-01-13</h2>
<ul>
@@ -256,12 +256,12 @@ delete from collection2item where id = '91082';
<li>Helping clean up some file names in the 232 CIAT records that Sisay worked on last week</li>
<li>There are about 30 files with <code>%20</code> (space) and Spanish accents in the file name</li>
<li>At first I thought we should fix these, but actually it is <a href="https://www.w3.org/TR/html4/appendix/notes.html#h-B.2.1">prescribed by the W3 working group to convert these to UTF8 and URL encode them</a>!</li>
<li>And the file names don't really matter either, as long as the SAF Builder tool can read them—after that DSpace renames them with a hash in the assetstore</li>
<li>And the file names don’t really matter either, as long as the SAF Builder tool can read them—after that DSpace renames them with a hash in the assetstore</li>
<li>Seems like the only ones I should replace are the <code>'</code> apostrophe characters, as <code>%27</code>:</li>
</ul>
<pre><code>value.replace("'",'%27')
</code></pre><ul>
<li>Add the item's Type to the filename column as a hint to SAF Builder so it can set a more useful description field:</li>
<li>Add the item’s Type to the filename column as a hint to SAF Builder so it can set a more useful description field:</li>
</ul>
<pre><code>value + "__description:" + cells["dc.type"].value
</code></pre><ul>
@@ -279,18 +279,18 @@ $ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -
</ul>
<h2 id="2017-01-19">2017-01-19</h2>
<ul>
<li>In testing a random sample of CIAT's PDFs for compressability, it looks like all of these methods generally increase the file size so we will just import them as they are</li>
<li>In testing a random sample of CIAT’s PDFs for compressability, it looks like all of these methods generally increase the file size so we will just import them as they are</li>
<li>Import 232 CIAT records into CGSpace:</li>
</ul>
<pre><code>$ JAVA_OPTS="-Xmx512m -Dfile.encoding=UTF-8" /home/cgspace.cgiar.org/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/68704 --source /home/aorth/CIAT_232/SimpleArchiveFormat/ --mapfile=/tmp/ciat.map &> /tmp/ciat.log
</code></pre><h2 id="2017-01-22">2017-01-22</h2>
<ul>
<li>Looking at some records that Sisay is having problems importing into DSpace Test (seems to be because of copious whitespace return characters from Excel's CSV exporter)</li>
<li>Looking at some records that Sisay is having problems importing into DSpace Test (seems to be because of copious whitespace return characters from Excel’s CSV exporter)</li>
<li>There were also some issues with an invalid dc.date.issued field, and I trimmed leading / trailing whitespace and cleaned up some URLs with unneeded parameters like ?show=full</li>
</ul>
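<p>The cleanup described above (stray line breaks from a CSV export, leading and trailing whitespace, URLs with unneeded parameters like <code>?show=full</code>) could be sketched in Python roughly as follows. This is only an illustration and not the method used in these notes; the function names <code>clean_value</code> and <code>strip_query_params</code> are hypothetical.</p>

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def clean_value(value: str) -> str:
    """Collapse line breaks and runs of whitespace into one space, then trim.

    Assumption: a single space is an acceptable replacement for embedded
    carriage returns and newlines in a metadata value.
    """
    return re.sub(r"\s+", " ", value).strip()


def strip_query_params(url: str, unwanted=("show",)) -> str:
    """Drop the named query parameters from a URL and keep any others.

    The default of removing only "show" is an assumption for illustration.
    """
    parts = urlsplit(url)
    kept = [
        (key, val)
        for key, val in parse_qsl(parts.query, keep_blank_values=True)
        if key not in unwanted
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

<p>For example, <code>strip_query_params("https://cgspace.cgiar.org/handle/10568/78596?show=full")</code> would return the bare handle URL, and <code>clean_value</code> would turn a value with embedded carriage returns into a single line.</p>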
<h2 id="2017-01-23">2017-01-23</h2>
<ul>
<li>I merged Atmire's pull request into the development branch so they can deploy it on DSpace Test</li>
<li>I merged Atmire’s pull request into the development branch so they can deploy it on DSpace Test</li>
<li>Move some old ILRI Program communities to a new subcommunity for former programs (10568/79164):</li>
</ul>
<pre><code>$ for community in 10568/171 10568/27868 10568/231 10568/27869 10568/150 10568/230 10568/32724 10568/172; do /home/cgspace.cgiar.org/bin/dspace community-filiator --remove --parent=10568/27866 --child="$community" && /home/cgspace.cgiar.org/bin/dspace community-filiator --set --parent=10568/79164 --child="$community"; done