Update notes

Alan Orth 2017-09-19 21:57:38 +03:00
parent 3462d89b33
commit e667b681c1
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
9 changed files with 57 additions and 66 deletions

@ -13,7 +13,7 @@ Rough notes for importing the CGIAR Library content. It was decided that this co
## Pre-migration Technical TODOs
Things that need to happen before the migration:
- [x] Create top-level community on CGSpace to hold the CGIAR Library content: 10568/83389
- [x] Create top-level community on CGSpace to hold the CGIAR Library content: `10568/83389`
- [x] Update nginx redirects in ansible templates
- [x] Update handle in DSpace XMLUI config
- Set up nginx redirects for URLs like:
@ -23,6 +23,16 @@ Things that need to happen before the migration:
- [x] Increase `max_connections` in `/etc/postgresql/9.5/main/postgresql.conf` by ~10
- `SELECT * FROM pg_stat_activity;` seems to show ~6 extra connections used by the command line tools during import
- [x] Temporarily disable nightly `index-discovery` cron job because the import process will be taking place during some of this time and I don't want them to be competing to update the Solr index
- [x] Copy HTTPS certificate key pair from CGIAR Library server's Tomcat keystore:
```
$ keytool -list -keystore tomcat.keystore
$ keytool -importkeystore -srckeystore tomcat.keystore -destkeystore library.cgiar.org.p12 -deststoretype PKCS12 -srcalias tomcat
$ openssl pkcs12 -in library.cgiar.org.p12 -nokeys -out library.cgiar.org.crt.pem
$ openssl pkcs12 -in library.cgiar.org.p12 -nodes -nocerts -out library.cgiar.org.key.pem
$ wget https://certs.godaddy.com/repository/gdroot-g2.crt https://certs.godaddy.com/repository/gdig2.crt.pem
$ cat library.cgiar.org.crt.pem gdig2.crt.pem > library.cgiar.org-chained.pem
```
## Migration Process
@ -79,8 +89,8 @@ This submits AIP hierarchies recursively (-r) and suppresses errors when an item
**Create new subcommunities and collections for content we reorganized into new hierarchies from the original:**
- [x] Create _CGIAR System Management Board_ sub-community: 10568/83536
- [x] Content from _CGIAR System Management Board documents_ collection (10947/4561) goes here
- [x] Create _CGIAR System Management Board_ sub-community: `10568/83536`
- [x] Content from _CGIAR System Management Board documents_ collection (`10947/4561`) goes here
- Import collection hierarchy first and then the items:
```
@ -88,8 +98,8 @@ $ dspace packager -r -t AIP -o ignoreHandle=false -e aorth@mjanja.ch -p 10568/83
$ for item in 10568-93760/ITEM@10947-465*; do dspace packager -r -f -u -t AIP -e aorth@mjanja.ch $item; done
```
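The per-item import loop above can be rehearsed before touching the repository. The following is only a sketch under stated assumptions: `dry_run_import` is a hypothetical helper (not part of DSpace), and it matches every `ITEM@*` file in a directory rather than the narrower pattern used in the notes. It prints the `dspace packager` commands, with the same flags as the loop above, instead of running them.

```shell
#!/bin/sh
# Hypothetical helper (not part of DSpace): print the packager command for
# each exported item AIP in a directory, without executing anything.
dry_run_import() {
    dir="$1"      # directory holding the exported AIP files
    eperson="$2"  # e-mail address of the submitting user
    count=0
    for item in "$dir"/ITEM@*; do
        # Skip the literal pattern when the glob matches nothing
        [ -e "$item" ] || continue
        echo "dspace packager -r -f -u -t AIP -e $eperson $item"
        count=$((count + 1))
    done
    # Report the total on stderr so stdout contains only commands
    echo "$count item(s) would be imported" >&2
}
```

Calling `dry_run_import 10568-93760 aorth@mjanja.ch` would list one command per item file in that export directory, which can be reviewed before running the real loop.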
- [x] Create _CGIAR System Management Office_ sub-community: 10568/83537
- [x] Create _CGIAR System Management Office documents_ collection: 10568/83538
- [x] Create _CGIAR System Management Office_ sub-community: `10568/83537`
- [x] Create _CGIAR System Management Office documents_ collection: `10568/83538`
- Import items to collection individually in replace mode (-r) while explicitly preserving handles and ignoring parents:
```
@ -116,7 +126,7 @@ $ for item in 10947-latest/*.zip; do dspace packager -r -u -t AIP -e aorth@mjanj
## Post Migration
- [ ] Shut down Tomcat and run `update-sequences.sql` as the system's `postgres` user
- [x] Shut down Tomcat and run `update-sequences.sql` as the system's `postgres` user
- [x] Remove ingestion overrides from `dspace.cfg`
- [x] Reset PostgreSQL `max_connections` to 183
- [x] Enable nightly `index-discovery` cron job
@ -153,24 +163,11 @@ $ sudo su -
- Now I'm wondering how we'll do this when we move servers in the future, because the `make-handle-config` basically assumes you only have one handle
- Also, there is `dspace make-handle-config` and `bin/make-handle-config` and both behave differently (the first is interactive, the second reads your `dspace.cfg` and generates your handle config and `sitebndl.zip` accordingly)
- I'm really not sure on the proper order of events actually
- HTTPS certificates:
- [x] Install current certificates from their Tomcat keystore
```
$ keytool -list -keystore tomcat.keystore
$ keytool -importkeystore -srckeystore tomcat.keystore -destkeystore library.cgiar.org.p12 -deststoretype PKCS12 -srcalias tomcat
$ openssl pkcs12 -in library.cgiar.org.p12 -nokeys -out library.cgiar.org.crt.pem
$ openssl pkcs12 -in library.cgiar.org.p12 -nodes -nocerts -out library.cgiar.org.key.pem
$ wget https://certs.godaddy.com/repository/gdroot-g2.crt https://certs.godaddy.com/repository/gdig2.crt.pem
$ cat library.cgiar.org.crt.pem gdig2.crt.pem > library.cgiar.org-chained.pem
```
- [ ] Update DNS records:
- CNAME: cgspace.cgiar.org
- [ ] Re-deploy DSpace from freshly built `5_x-prod` branch
- [ ] Merge `cgiar-library` branch to `master` and re-run ansible nginx templates
- [ ] Run system updates and reboot server
- [x] Re-deploy DSpace from freshly built `5_x-prod` branch
- [x] Merge `cgiar-library` branch to `master` and re-run ansible nginx templates
- [x] Run system updates and reboot server
- [ ] Switch to Let's Encrypt HTTPS certificates (after DNS is updated and server isn't busy):
```

@ -413,4 +413,5 @@ $ for item in 10568-93759/ITEM@10947-46*; do ~/dspace/bin/dspace packager -r -t
- I had a look at the collection and noticed a bunch of issues with item types and donors, so I asked him to fix those and import it to DSpace Test again first
- Abenet wants to be able to filter by ISI Journal in advanced search on queries like this: https://cgspace.cgiar.org/discover?filtertype_0=dateIssued&filtertype_1=dateIssued&filter_relational_operator_1=equals&filter_relational_operator_0=equals&filter_1=%5B2010+TO+2017%5D&filter_0=2017&filtertype=type&filter_relational_operator=equals&filter=Journal+Article
- I opened an issue to track this ([#340](https://github.com/ilri/DSpace/issues/340)) and will test it on DSpace Test soon
- Marianne Gadeberg from WLE asked if I would add an account for Adam Hunt on CGSpace and give him permissions to approve all WLE publications
- I told him to register first, as he's a CGIAR user and needs an account to be created before I can add him to the groups

@ -33,7 +33,7 @@ Another worrying error from dspace.log is:
<meta property="article:published_time" content="2016-12-02T10:43:00&#43;03:00"/>
<meta property="article:modified_time" content="2017-01-10T16:21:47&#43;02:00"/>
<meta property="article:modified_time" content="2017-09-19T16:07:20&#43;03:00"/>
@ -79,7 +79,7 @@ Another worrying error from dspace.log is:
"url": "https://alanorth.github.io/cgspace-notes/2016-12/",
"wordCount": "4078",
"datePublished": "2016-12-02T10:43:00&#43;03:00",
"dateModified": "2017-01-10T16:21:47&#43;02:00",
"dateModified": "2017-09-19T16:07:20&#43;03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"

@ -61,7 +61,7 @@ Ask Sisay to clean up the WLE approvers a bit, as Marianne&rsquo;s user account
"@type": "BlogPosting",
"headline": "September, 2017",
"url": "https://alanorth.github.io/cgspace-notes/2017-09/",
"wordCount": "2886",
"wordCount": "2937",
"datePublished": "2017-09-07T16:54:52&#43;07:00",
"dateModified": "2017-09-19T12:53:00&#43;03:00",
"author": {
@ -585,6 +585,8 @@ DELETE 207
<li>I had a look at the collection and noticed a bunch of issues with item types and donors, so I asked him to fix those and import it to DSpace Test again first</li>
<li>Abenet wants to be able to filter by ISI Journal in advanced search on queries like this: <a href="https://cgspace.cgiar.org/discover?filtertype_0=dateIssued&amp;filtertype_1=dateIssued&amp;filter_relational_operator_1=equals&amp;filter_relational_operator_0=equals&amp;filter_1=%5B2010+TO+2017%5D&amp;filter_0=2017&amp;filtertype=type&amp;filter_relational_operator=equals&amp;filter=Journal+Article">https://cgspace.cgiar.org/discover?filtertype_0=dateIssued&amp;filtertype_1=dateIssued&amp;filter_relational_operator_1=equals&amp;filter_relational_operator_0=equals&amp;filter_1=%5B2010+TO+2017%5D&amp;filter_0=2017&amp;filtertype=type&amp;filter_relational_operator=equals&amp;filter=Journal+Article</a></li>
<li>I opened an issue to track this (<a href="https://github.com/ilri/DSpace/issues/340">#340</a>) and will test it on DSpace Test soon</li>
<li>Marianne Gadeberg from WLE asked if I would add an account for Adam Hunt on CGSpace and give him permissions to approve all WLE publications</li>
<li>I told him to register first, as he&rsquo;s a CGIAR user and needs an account to be created before I can add him to the groups</li>
</ul>

@ -20,7 +20,7 @@
<description>Note: I&amp;rsquo;m temporarily making this a page because it seems Hugo (currently 0.27.1) cannot use a custom slug for a post when there is a permalink defined in config.toml
Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called CGIAR System Organization.
Pre-migration Technical TODOs Things that need to happen before the migration:
Create top-level community on CGSpace to hold the CGIAR Library content: 10568&amp;frasl;83389 Update nginx redirects in ansible templates Update handle in DSpace XMLUI config Set up nginx redirects for URLs like: https://library.</description>
Create top-level community on CGSpace to hold the CGIAR Library content: 10568/83389 Update nginx redirects in ansible templates Update handle in DSpace XMLUI config Set up nginx redirects for URLs like: https://library.</description>
</item>
</channel>

@ -13,7 +13,7 @@
<meta property="article:published_time" content="2017-09-18T16:38:35&#43;03:00"/>
<meta property="article:modified_time" content="2017-09-19T12:53:00&#43;03:00"/>
<meta property="article:modified_time" content="2017-09-19T16:07:20&#43;03:00"/>
@ -37,9 +37,9 @@
"@type": "BlogPosting",
"headline": "CGIAR Library Migration",
"url": "https://alanorth.github.io/cgspace-notes/cgiar-library-migration/",
"wordCount": "1319",
"wordCount": "1321",
"datePublished": "2017-09-18T16:38:35&#43;03:00",
"dateModified": "2017-09-19T12:53:00&#43;03:00",
"dateModified": "2017-09-19T16:07:20&#43;03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -117,7 +117,7 @@
<p>Things that need to happen before the migration:</p>
<ul class="task-list">
<li><label><input type="checkbox" checked disabled class="task-list-item"> Create top-level community on CGSpace to hold the CGIAR Library content: <sup>10568</sup>&frasl;<sub>83389</sub>
<li><label><input type="checkbox" checked disabled class="task-list-item"> Create top-level community on CGSpace to hold the CGIAR Library content: <code>10568/83389</code>
<ul class="task-list">
<li><label><input type="checkbox" checked disabled class="task-list-item"> Update nginx redirects in ansible templates</label></li>
@ -136,8 +136,17 @@
<li><code>SELECT * FROM pg_stat_activity;</code> seems to show ~6 extra connections used by the command line tools during import</li>
</ul></label></li>
<li><label><input type="checkbox" checked disabled class="task-list-item"> Temporarily disable nightly <code>index-discovery</code> cron job because the import process will be taking place during some of this time and I don&rsquo;t want them to be competing to update the Solr index</label></li>
<li><label><input type="checkbox" checked disabled class="task-list-item"> Copy HTTPS certificate key pair from CGIAR Library server&rsquo;s Tomcat keystore:</label></li>
</ul>
<pre><code>$ keytool -list -keystore tomcat.keystore
$ keytool -importkeystore -srckeystore tomcat.keystore -destkeystore library.cgiar.org.p12 -deststoretype PKCS12 -srcalias tomcat
$ openssl pkcs12 -in library.cgiar.org.p12 -nokeys -out library.cgiar.org.crt.pem
$ openssl pkcs12 -in library.cgiar.org.p12 -nodes -nocerts -out library.cgiar.org.key.pem
$ wget https://certs.godaddy.com/repository/gdroot-g2.crt https://certs.godaddy.com/repository/gdig2.crt.pem
$ cat library.cgiar.org.crt.pem gdig2.crt.pem &gt; library.cgiar.org-chained.pem
</code></pre>
<h2 id="migration-process">Migration Process</h2>
<p><strong>Export all top-level communities and collections from DSpace Test:</strong></p>
@ -195,10 +204,10 @@ $ for item in 10947-1/ITEM@10947-*; do dspace packager -r -f -u -t AIP -e aorth@
<p><strong>Create new subcommunities and collections for content we reorganized into new hierarchies from the original:</strong></p>
<ul class="task-list">
<li><label><input type="checkbox" checked disabled class="task-list-item"> Create <em>CGIAR System Management Board</em> sub-community: <sup>10568</sup>&frasl;<sub>83536</sub>
<li><label><input type="checkbox" checked disabled class="task-list-item"> Create <em>CGIAR System Management Board</em> sub-community: <code>10568/83536</code>
<ul class="task-list">
<li><label><input type="checkbox" checked disabled class="task-list-item"> Content from <em>CGIAR System Management Board documents</em> collection (<sup>10947</sup>&frasl;<sub>4561</sub>) goes here</label></li>
<li><label><input type="checkbox" checked disabled class="task-list-item"> Content from <em>CGIAR System Management Board documents</em> collection (<code>10947/4561</code>) goes here</label></li>
<li>Import collection hierarchy first and then the items:</li>
</ul></label></li>
</ul>
@ -208,10 +217,10 @@ $ for item in 10568-93760/ITEM@10947-465*; do dspace packager -r -f -u -t AIP -e
</code></pre>
<ul class="task-list">
<li><label><input type="checkbox" checked disabled class="task-list-item"> Create <em>CGIAR System Management Office</em> sub-community: <sup>10568</sup>&frasl;<sub>83537</sub>
<li><label><input type="checkbox" checked disabled class="task-list-item"> Create <em>CGIAR System Management Office</em> sub-community: <code>10568/83537</code>
<ul class="task-list">
<li><label><input type="checkbox" checked disabled class="task-list-item"> Create <em>CGIAR System Management Office documents</em> collection: <sup>10568</sup>&frasl;<sub>83538</sub></label></li>
<li><label><input type="checkbox" checked disabled class="task-list-item"> Create <em>CGIAR System Management Office documents</em> collection: <code>10568/83538</code></label></li>
<li>Import items to collection individually in replace mode (-r) while explicitly preserving handles and ignoring parents:</li>
</ul></label></li>
</ul>
@ -241,7 +250,7 @@ $ for item in 10568-93760/ITEM@10947-465*; do dspace packager -r -f -u -t AIP -e
<h2 id="post-migration">Post Migration</h2>
<ul class="task-list">
<li><label><input type="checkbox" disabled class="task-list-item"> Shut down Tomcat and run <code>update-sequences.sql</code> as the system&rsquo;s <code>postgres</code> user</label></li>
<li><label><input type="checkbox" checked disabled class="task-list-item"> Shut down Tomcat and run <code>update-sequences.sql</code> as the system&rsquo;s <code>postgres</code> user</label></li>
<li><label><input type="checkbox" checked disabled class="task-list-item"> Remove ingestion overrides from <code>dspace.cfg</code></label></li>
<li><label><input type="checkbox" checked disabled class="task-list-item"> Reset PostgreSQL <code>max_connections</code> to 183</label></li>
<li><label><input type="checkbox" checked disabled class="task-list-item"> Enable nightly <code>index-discovery</code> cron job</label></li>
@ -279,33 +288,15 @@ $ for item in 10568-93760/ITEM@10947-465*; do dspace packager -r -f -u -t AIP -e
<li>Copy the resulting <code>sitebndl.zip</code> somewhere so we can send it to Handle.net</li>
<li>Now I&rsquo;m wondering how we&rsquo;ll do this when we move servers in the future, because the <code>make-handle-config</code> basically assumes you only have one handle</li>
<li>Also, there is <code>dspace make-handle-config</code> and <code>bin/make-handle-config</code> and both behave differently (the first is interactive, the second reads your <code>dspace.cfg</code> and generates your handle config and <code>sitebndl.zip</code> accordingly)</li>
<li><p>I&rsquo;m really not sure on the proper order of events actually</p></li>
<li><p>HTTPS certificates:</p>
<ul class="task-list">
<li><label><input type="checkbox" checked disabled class="task-list-item"> Install current certificates from their Tomcat keystore</label></li>
</ul></li>
</ul>
<pre><code>$ keytool -list -keystore tomcat.keystore
$ keytool -importkeystore -srckeystore tomcat.keystore -destkeystore library.cgiar.org.p12 -deststoretype PKCS12 -srcalias tomcat
$ openssl pkcs12 -in library.cgiar.org.p12 -nokeys -out library.cgiar.org.crt.pem
$ openssl pkcs12 -in library.cgiar.org.p12 -nodes -nocerts -out library.cgiar.org.key.pem
$ wget https://certs.godaddy.com/repository/gdroot-g2.crt https://certs.godaddy.com/repository/gdig2.crt.pem
$ cat library.cgiar.org.crt.pem gdig2.crt.pem &gt; library.cgiar.org-chained.pem
</code></pre>
<ul class="task-list">
<li>I&rsquo;m really not sure on the proper order of events actually</li>
<li><label><input type="checkbox" disabled class="task-list-item"> Update DNS records:
<ul>
<li>CNAME: cgspace.cgiar.org</li>
</ul></label></li>
<li><label><input type="checkbox" disabled class="task-list-item"> Re-deploy DSpace from freshly built <code>5_x-prod</code> branch</label></li>
<li><label><input type="checkbox" disabled class="task-list-item"> Merge <code>cgiar-library</code> branch to <code>master</code> and re-run ansible nginx templates</label></li>
<li><label><input type="checkbox" disabled class="task-list-item"> Run system updates and reboot server</label></li>
<li><label><input type="checkbox" checked disabled class="task-list-item"> Re-deploy DSpace from freshly built <code>5_x-prod</code> branch</label></li>
<li><label><input type="checkbox" checked disabled class="task-list-item"> Merge <code>cgiar-library</code> branch to <code>master</code> and re-run ansible nginx templates</label></li>
<li><label><input type="checkbox" checked disabled class="task-list-item"> Run system updates and reboot server</label></li>
<li><label><input type="checkbox" disabled class="task-list-item"> Switch to Let&rsquo;s Encrypt HTTPS certificates (after DNS is updated and server isn&rsquo;t busy):</label></li>
</ul>

@ -20,7 +20,7 @@
<description>Note: I&amp;rsquo;m temporarily making this a page because it seems Hugo (currently 0.27.1) cannot use a custom slug for a post when there is a permalink defined in config.toml
Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called CGIAR System Organization.
Pre-migration Technical TODOs Things that need to happen before the migration:
Create top-level community on CGSpace to hold the CGIAR Library content: 10568&amp;frasl;83389 Update nginx redirects in ansible templates Update handle in DSpace XMLUI config Set up nginx redirects for URLs like: https://library.</description>
Create top-level community on CGSpace to hold the CGIAR Library content: 10568/83389 Update nginx redirects in ansible templates Update handle in DSpace XMLUI config Set up nginx redirects for URLs like: https://library.</description>
</item>
<item>

@ -27,7 +27,7 @@ Disallow: /cgspace-notes/2015-12/
Disallow: /cgspace-notes/2015-11/
Disallow: /cgspace-notes/
Disallow: /cgspace-notes/categories/
Disallow: /cgspace-notes/categories/notes/
Disallow: /cgspace-notes/tags/notes/
Disallow: /cgspace-notes/categories/notes/
Disallow: /cgspace-notes/post/
Disallow: /cgspace-notes/tags/

@ -4,7 +4,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/cgiar-library-migration/</loc>
<lastmod>2017-09-19T12:53:00+03:00</lastmod>
<lastmod>2017-09-19T16:07:20+03:00</lastmod>
</url>
<url>
@ -54,7 +54,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/2016-12/</loc>
<lastmod>2017-01-10T16:21:47+02:00</lastmod>
<lastmod>2017-09-19T16:07:20+03:00</lastmod>
</url>
<url>
@ -124,7 +124,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2017-09-19T12:53:00+03:00</lastmod>
<lastmod>2017-09-19T16:07:20+03:00</lastmod>
<priority>0</priority>
</url>
@ -134,14 +134,14 @@
</url>
<url>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
<lastmod>2017-09-19T12:53:00+03:00</lastmod>
<priority>0</priority>
</url>
<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
<lastmod>2017-09-19T12:53:00+03:00</lastmod>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
<lastmod>2017-09-19T16:07:20+03:00</lastmod>
<priority>0</priority>
</url>