mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Update notes for 2017-09-19
@ -13,7 +13,7 @@
|
||||
|
||||
|
||||
<meta property="article:published_time" content="2017-09-18T16:38:35+03:00"/>
|
||||
<meta property="article:modified_time" content="2017-09-18T18:05:57+03:00"/>
|
||||
<meta property="article:modified_time" content="2017-09-18T21:24:27+03:00"/>
|
||||
|
||||
|
||||
|
||||
@ -37,9 +37,9 @@
|
||||
"@type": "BlogPosting",
|
||||
"headline": "CGIAR Library Migration",
|
||||
"url": "https://alanorth.github.io/cgspace-notes/cgiar-library-migration/",
|
||||
"wordCount": "1169",
|
||||
"wordCount": "1167",
|
||||
"datePublished": "2017-09-18T16:38:35+03:00",
|
||||
"dateModified": "2017-09-18T18:05:57+03:00",
|
||||
"dateModified": "2017-09-18T21:24:27+03:00",
|
||||
"author": {
|
||||
"@type": "Person",
|
||||
"name": "Alan Orth"
|
||||
@ -108,7 +108,7 @@
|
||||
</header>
|
||||
|
||||
|
||||
<p><em>Temporarily making this a page because it seems Hugo (currently 0.27.1) cannot use a custom slug for a post when there is a permalink defined in <code>config.toml</code></em></p>
|
||||
<p><em>Note: I’m temporarily making this a page because it seems Hugo (currently 0.27.1) cannot use a custom slug for a post when there is a permalink defined in <code>config.toml</code></em></p>
|
||||
|
||||
<p>Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called <em>CGIAR System Organization</em>.</p>
|
||||
|
||||
@ -129,7 +129,7 @@
|
||||
<li><label><input type="checkbox" checked disabled class="task-list-item"> <a href="https://library.cgiar.org/bitstream/handle/10947/2699/CGIAR_Branding_Guidelines_and_Toolkit.pdf">https://library.cgiar.org/bitstream/handle/10947/2699/CGIAR_Branding_Guidelines_and_Toolkit.pdf</a></label></li>
|
||||
<li><label><input type="checkbox" checked disabled class="task-list-item"> <a href="https://library.cgiar.org/handle/10947/4258">https://library.cgiar.org/handle/10947/4258</a></label></li>
|
||||
</ul></li>
|
||||
<li><label><input type="checkbox" disabled class="task-list-item"> Merge <a href="https://github.com/ilri/DSpace/pull/339">#339</a> to <code>5_x-prod</code> branch and rebuild DSpace</label></li>
|
||||
<li><label><input type="checkbox" checked disabled class="task-list-item"> Merge <a href="https://github.com/ilri/DSpace/pull/339">#339</a> to <code>5_x-prod</code> branch and rebuild DSpace</label></li>
|
||||
<li><label><input type="checkbox" checked disabled class="task-list-item"> Increase <code>max_connections</code> in <code>/etc/postgresql/9.5/main/postgresql.conf</code> by ~10
|
||||
|
||||
<ul>
|
||||
@ -138,13 +138,9 @@
|
||||
<li><label><input type="checkbox" checked disabled class="task-list-item"> Temporarily disable nightly <code>index-discovery</code> cron job because the import process will be taking place during some of this time and I don’t want them to be competing to update the Solr index</label></li>
|
||||
</ul>
|
||||
|
||||
<h2 id="migration">Migration</h2>
|
||||
<h2 id="migration-process">Migration Process</h2>
|
||||
|
||||
<p>Process for the actual migration:</p>
|
||||
|
||||
<ul>
|
||||
<li>Export all top-level communities and collections from DSpace Test:</li>
|
||||
</ul>
|
||||
<p><strong>Export all top-level communities and collections from DSpace Test:</strong></p>
|
||||
|
||||
<pre><code>$ export PATH=$PATH:/home/dspacetest.cgiar.org/bin
|
||||
$ dspace packager -d -a -t AIP -e aorth@mjanja.ch -i 10947/2515 10947-2515/10947-2515.zip
|
||||
@ -161,51 +157,50 @@ $ dspace packager -d -a -t AIP -e aorth@mjanja.ch -i 10568/93760 10568-93760/105
|
||||
$ dspace packager -d -a -t AIP -e aorth@mjanja.ch -i 10947/1 10947-1/10947-1.zip
|
||||
</code></pre>
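The export commands above share one pattern: a handle such as `10947/2515` becomes both the directory and the file name `10947-2515/10947-2515.zip`. Below is a minimal sketch of generating those commands in a loop. It is a dry run that only prints the commands and does not call `dspace`. The function name `aip_export_cmd` is hypothetical, and the two handles in the loop are just examples taken from the notes.

```shell
#!/bin/sh
# Hypothetical helper (not part of DSpace): print the packager export
# command for one handle. Dry run only; nothing is executed.
aip_export_cmd() {
    export_handle="$1"    # e.g. 10947/2515
    export_eperson="$2"   # e-mail of the DSpace user running the export
    # 10947/2515 -> 10947-2515, the naming used in the commands above
    export_name=$(printf '%s' "$export_handle" | tr '/' '-')
    echo "dspace packager -d -a -t AIP -e ${export_eperson} -i ${export_handle} ${export_name}/${export_name}.zip"
}

# Example: print the commands for two of the handles from the notes
for h in 10947/2515 10947/1; do
    aip_export_cmd "$h" aorth@mjanja.ch
done
```

Piping the printed lines to `sh` would run them, assuming `dspace` is on the `PATH` and the target directories already exist.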
|
||||
|
||||
<ul class="task-list">
|
||||
<li>Import to CGSpace (also see <a href="http://alanorth.github.io/cgspace-notes/2017-05/#2017-05-10">notes from 2017-05-10</a>)
|
||||
<p><strong>Import to CGSpace (also see <a href="http://alanorth.github.io/cgspace-notes/2017-05/#2017-05-10">notes from 2017-05-10</a>):</strong></p>
|
||||
|
||||
<ul class="task-list">
|
||||
<li><label><input type="checkbox" checked disabled class="task-list-item"> Copy all exports from DSpace Test</label></li>
|
||||
<li><label><input type="checkbox" checked disabled class="task-list-item"> Add ingestion overrides to <code>dspace.cfg</code> before import:</label></li>
|
||||
</ul></li>
|
||||
</ul>
|
||||
|
||||
<pre><code> mets.dspaceAIP.ingest.crosswalk.METSRIGHTS = NIL
|
||||
mets.dspaceAIP.ingest.crosswalk.DSPACE-ROLES = NIL
|
||||
<pre><code>mets.dspaceAIP.ingest.crosswalk.METSRIGHTS = NIL
|
||||
mets.dspaceAIP.ingest.crosswalk.DSPACE-ROLES = NIL
|
||||
</code></pre>
|
||||
|
||||
<ul class="task-list">
|
||||
<li><label><input type="checkbox" checked disabled class="task-list-item"> Import communities and collections, paying attention to options to skip missing parents and ignore handles:</label></li>
|
||||
</ul>
|
||||
|
||||
<pre><code> $ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx3072m -XX:-UseGCOverheadLimit -XX:+TieredCompilation -XX:TieredStopAtLevel=1"
|
||||
$ export PATH=$PATH:/home/cgspace.cgiar.org/bin
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2515/10947-2515.zip
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2516/10947-2516.zip
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2517/10947-2517.zip
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2518/10947-2518.zip
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2519/10947-2519.zip
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2708/10947-2708.zip
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2526/10947-2526.zip
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2871/10947-2871.zip
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-4467/10947-4467.zip
|
||||
$ dspace packager -s -u -t AIP -o ignoreHandle=false -e aorth@mjanja.ch -p 10568/83389 10947-2527/10947-2527.zip
|
||||
$ for item in 10947-2527/ITEM@10947-*; do dspace packager -r -f -u -t AIP -e aorth@mjanja.ch $item; done
|
||||
$ dspace packager -s -t AIP -o ignoreHandle=false -e aorth@mjanja.ch -p 10568/83389 10947-1/10947-1.zip
|
||||
$ for collection in 10947-1/COLLECTION@10947-*; do dspace packager -s -o ignoreHandle=false -t AIP -e aorth@mjanja.ch -p 10947/1 $collection; done
|
||||
$ for item in 10947-1/ITEM@10947-*; do dspace packager -r -f -u -t AIP -e aorth@mjanja.ch $item; done
|
||||
<pre><code>$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx3072m -XX:-UseGCOverheadLimit -XX:+TieredCompilation -XX:TieredStopAtLevel=1"
|
||||
$ export PATH=$PATH:/home/cgspace.cgiar.org/bin
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2515/10947-2515.zip
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2516/10947-2516.zip
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2517/10947-2517.zip
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2518/10947-2518.zip
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2519/10947-2519.zip
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2708/10947-2708.zip
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2526/10947-2526.zip
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-2871/10947-2871.zip
|
||||
$ dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e aorth@mjanja.ch -p 10568/83389 10947-4467/10947-4467.zip
|
||||
$ dspace packager -s -u -t AIP -o ignoreHandle=false -e aorth@mjanja.ch -p 10568/83389 10947-2527/10947-2527.zip
|
||||
$ for item in 10947-2527/ITEM@10947-*; do dspace packager -r -f -u -t AIP -e aorth@mjanja.ch $item; done
|
||||
$ dspace packager -s -t AIP -o ignoreHandle=false -e aorth@mjanja.ch -p 10568/83389 10947-1/10947-1.zip
|
||||
$ for collection in 10947-1/COLLECTION@10947-*; do dspace packager -s -o ignoreHandle=false -t AIP -e aorth@mjanja.ch -p 10947/1 $collection; done
|
||||
$ for item in 10947-1/ITEM@10947-*; do dspace packager -r -f -u -t AIP -e aorth@mjanja.ch $item; done
|
||||
</code></pre>
|
||||
|
||||
<ul class="task-list">
|
||||
<li>This submits AIP hierarchies recursively (-r) and suppresses errors when an item’s parent collection hasn’t been created yet—for example, if the item is mapped</li>
|
||||
<li>The large historic archive (<sup>10947</sup>⁄<sub>1</sub>) is created in several steps because it requires a lot of memory and often crashes</li>
|
||||
<p>This submits AIP hierarchies recursively (-r) and suppresses errors when an item’s parent collection hasn’t been created yet—for example, if the item is mapped. The large historic archive (<sup>10947</sup>⁄<sub>1</sub>) is created in several steps because it requires a lot of memory and often crashes.</p>
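The nine community imports above differ only in the package name, so they can be sketched as a loop. This is a dry run that prints each command rather than running it. The function name `aip_restore_cmd` is hypothetical, and the flag descriptions in the comments are my reading of the packager options, not taken from the notes.

```shell
#!/bin/sh
# Hypothetical helper (not part of DSpace): print the import command for
# one exported community. Dry run only; nothing is executed.
# As I read the packager options: -r restore mode, -a include child
# objects, -u no interactive prompts.
aip_restore_cmd() {
    restore_name="$1"      # e.g. 10947-2515
    restore_parent="$2"    # handle of the parent community on the target
    restore_eperson="$3"   # e-mail of the DSpace user running the import
    echo "dspace packager -r -u -a -t AIP -o skipIfParentMissing=true -e ${restore_eperson} -p ${restore_parent} ${restore_name}/${restore_name}.zip"
}

# The nine packages imported with identical options in the notes
for pkg in 10947-2515 10947-2516 10947-2517 10947-2518 10947-2519 \
           10947-2708 10947-2526 10947-2871 10947-4467; do
    aip_restore_cmd "$pkg" 10568/83389 aorth@mjanja.ch
done
```

The two large archives (`10947-2527` and `10947-1`) are left out on purpose: the notes import them in separate steps with different options.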
|
||||
|
||||
<li><p>Create new subcommunities and collections for content we reorganized into new hierarchies from the original:</p>
|
||||
<p><strong>Create new subcommunities and collections for content we reorganized into new hierarchies from the original:</strong></p>
|
||||
|
||||
<ul class="task-list">
|
||||
<li><label><input type="checkbox" checked disabled class="task-list-item"> Create <em>CGIAR System Management Board</em> sub-community: <sup>10568</sup>⁄<sub>83536</sub>
|
||||
|
||||
<ul class="task-list">
|
||||
<li><label><input type="checkbox" checked disabled class="task-list-item"> Create <em>CGIAR System Management Board</em> sub-community: <sup>10568</sup>⁄<sub>83536</sub></label></li>
|
||||
<li><label><input type="checkbox" checked disabled class="task-list-item"> Content from <em>CGIAR System Management Board documents</em> collection (<sup>10947</sup>⁄<sub>4561</sub>) goes here</label></li>
|
||||
<li>Import collection hierarchy first and then the items:</li>
|
||||
</ul></label></li>
|
||||
</ul>
|
||||
|
||||
<pre><code>$ dspace packager -r -t AIP -o ignoreHandle=false -e aorth@mjanja.ch -p 10568/83536 10568-93760/COLLECTION@10947-4651.zip
|
||||
@ -213,45 +208,42 @@ $ for item in 10568-93760/ITEM@10947-465*; do dspace packager -r -f -u -t AIP -e
|
||||
</code></pre>
|
||||
|
||||
<ul class="task-list">
|
||||
<li><label><input type="checkbox" checked disabled class="task-list-item"> Create <em>CGIAR System Management Office</em> sub-community: <sup>10568</sup>⁄<sub>83537</sub></label></li>
|
||||
<li><label><input type="checkbox" checked disabled class="task-list-item"> Create <em>CGIAR System Management Office</em> sub-community: <sup>10568</sup>⁄<sub>83537</sub>
|
||||
|
||||
<ul class="task-list">
|
||||
<li><label><input type="checkbox" checked disabled class="task-list-item"> Create <em>CGIAR System Management Office documents</em> collection: <sup>10568</sup>⁄<sub>83538</sub></label></li>
|
||||
<li>Import items to collection individually in replace mode (-r) while explicitly preserving handles and ignoring parents:</li>
|
||||
</ul></label></li>
|
||||
</ul>
|
||||
|
||||
<pre><code>$ for item in 10568-93759/ITEM@10947-46*; do dspace packager -r -t AIP -o ignoreHandle=false -o ignoreParent=true -e aorth@mjanja.ch -p 10568/83538 $item; done
|
||||
</code></pre>
|
||||
|
||||
<ul>
|
||||
<li>Get the handles for the last few items from CGIAR Library that were created since we did the migration to DSpace Test in May:</li>
|
||||
</ul></li>
|
||||
</ul>
|
||||
<p><strong>Get the handles for the last few items from CGIAR Library that were created since we did the migration to DSpace Test in May:</strong></p>
|
||||
|
||||
<pre><code> dspace=# select handle from item, handle where handle.resource_id = item.item_id AND item.item_id in (select item_id from metadatavalue where metadata_field_id=11 and date(text_value) > '2017-05-01T00:00:00Z');
|
||||
<pre><code>dspace=# select handle from item, handle where handle.resource_id = item.item_id AND item.item_id in (select item_id from metadatavalue where metadata_field_id=11 and date(text_value) > '2017-05-01T00:00:00Z');
|
||||
</code></pre>
|
||||
|
||||
<ul>
|
||||
<li>Export them from the CGIAR Library:</li>
|
||||
</ul>
|
||||
|
||||
<pre><code> # for handle in 10947/4658 10947/4659 10947/4660 10947/4661 10947/4665 10947/4664 10947/4666 10947/4669; do /usr/local/dspace/bin/dspace packager -d -a -t AIP -e m.marus@cgiar.org -i $handle ${handle}.zip; done
|
||||
<pre><code># for handle in 10947/4658 10947/4659 10947/4660 10947/4661 10947/4665 10947/4664 10947/4666 10947/4669; do /usr/local/dspace/bin/dspace packager -d -a -t AIP -e m.marus@cgiar.org -i $handle ${handle}.zip; done
|
||||
</code></pre>
|
||||
|
||||
<ul>
|
||||
<li>Import on CGSpace:</li>
|
||||
</ul>
|
||||
|
||||
<pre><code> $ for item in 10947-latest/*.zip; do dspace packager -r -u -t AIP -e aorth@mjanja.ch $item; done
|
||||
<pre><code>$ for item in 10947-latest/*.zip; do dspace packager -r -u -t AIP -e aorth@mjanja.ch $item; done
|
||||
</code></pre>
|
||||
|
||||
<ul class="task-list">
|
||||
<li><label><input type="checkbox" disabled class="task-list-item"> Shut down Tomcat and run <code>update-sequences.sql</code> as the system’s <code>postgres</code> user</label></li>
|
||||
</ul>
|
||||
|
||||
<h2 id="post-migration">Post Migration</h2>
|
||||
|
||||
<ul class="task-list">
|
||||
<li><label><input type="checkbox" disabled class="task-list-item"> Shut down Tomcat and run <code>update-sequences.sql</code> as the system’s <code>postgres</code> user</label></li>
|
||||
<li><label><input type="checkbox" checked disabled class="task-list-item"> Remove ingestion overrides from <code>dspace.cfg</code></label></li>
|
||||
<li><label><input type="checkbox" disabled class="task-list-item"> Reset PostgreSQL <code>max_connections</code> to 183</label></li>
|
||||
<li><label><input type="checkbox" checked disabled class="task-list-item"> Reset PostgreSQL <code>max_connections</code> to 183</label></li>
|
||||
<li><label><input type="checkbox" checked disabled class="task-list-item"> Enable nightly <code>index-discovery</code> cron job</label></li>
|
||||
<li>HTTPS certificates:
|
||||
|
||||
@ -260,12 +252,12 @@ $ for item in 10568-93760/ITEM@10947-465*; do dspace packager -r -f -u -t AIP -e
|
||||
</ul></li>
|
||||
</ul>
|
||||
|
||||
<pre><code> $ keytool -list -keystore tomcat.keystore
|
||||
$ keytool -importkeystore -srckeystore tomcat.keystore -destkeystore library.cgiar.org.p12 -deststoretype PKCS12 -srcalias tomcat
|
||||
$ openssl pkcs12 -in library.cgiar.org.p12 -nokeys -out library.cgiar.org.crt.pem
|
||||
$ openssl pkcs12 -in library.cgiar.org.p12 -nodes -nocerts -out library.cgiar.org.key.pem
|
||||
$ wget https://certs.godaddy.com/repository/gdroot-g2.crt https://certs.godaddy.com/repository/gdig2.crt.pem
|
||||
$ cat library.cgiar.org.crt.pem gdig2.crt.pem > library.cgiar.org-chained.pem
|
||||
<pre><code>$ keytool -list -keystore tomcat.keystore
|
||||
$ keytool -importkeystore -srckeystore tomcat.keystore -destkeystore library.cgiar.org.p12 -deststoretype PKCS12 -srcalias tomcat
|
||||
$ openssl pkcs12 -in library.cgiar.org.p12 -nokeys -out library.cgiar.org.crt.pem
|
||||
$ openssl pkcs12 -in library.cgiar.org.p12 -nodes -nocerts -out library.cgiar.org.key.pem
|
||||
$ wget https://certs.godaddy.com/repository/gdroot-g2.crt https://certs.godaddy.com/repository/gdig2.crt.pem
|
||||
$ cat library.cgiar.org.crt.pem gdig2.crt.pem > library.cgiar.org-chained.pem
|
||||
</code></pre>
|
||||
|
||||
<ul class="task-list">
|
||||
@ -275,18 +267,15 @@ $ for item in 10568-93760/ITEM@10947-465*; do dspace packager -r -f -u -t AIP -e
|
||||
<li>CNAME: cgspace.cgiar.org</li>
|
||||
</ul></label></li>
|
||||
<li><label><input type="checkbox" disabled class="task-list-item"> Re-deploy DSpace from freshly built <code>5_x-prod</code> branch</label></li>
|
||||
<li><label><input type="checkbox" disabled class="task-list-item"> Merge <code>cgiar-library</code> branch to <code>master</code> and re-run ansible nginx templates</label></li>
|
||||
<li><label><input type="checkbox" disabled class="task-list-item"> Run system updates and reboot server</label></li>
|
||||
<li><label><input type="checkbox" disabled class="task-list-item"> Switch to Let’s Encrypt HTTPS certificates (after DNS is updated and server isn’t busy)</label></li>
|
||||
<li><label><input type="checkbox" disabled class="task-list-item"> Switch to Let’s Encrypt HTTPS certificates (after DNS is updated and server isn’t busy):</label></li>
|
||||
</ul>
|
||||
|
||||
<pre><code>$ sudo systemctl stop tomcat7
|
||||
$ ./letsencrypt-auto certonly --standalone -d library.cgiar.org
|
||||
</code></pre>
|
||||
|
||||
<ul class="task-list">
|
||||
<li><label><input type="checkbox" disabled class="task-list-item"> Merge <code>cgiar-library</code> branch to <code>master</code> and re-run ansible nginx templates</label></li>
|
||||
</ul>
|
||||
|
||||
<h2 id="troubleshooting">Troubleshooting</h2>
|
||||
|
||||
<h3 id="foreign-key-error-in-dspace-cleanup">Foreign Key Error in <code>dspace cleanup</code></h3>
|
||||
|