From 4b94ef0fcc76246d23225e0db2302a7c9a45fe78 Mon Sep 17 00:00:00 2001
From: Alan Orth
Date: Sat, 29 Sep 2018 15:00:03 +0300
Subject: [PATCH] Add notes for 2018-09-29

---
 content/posts/2018-09.md | 19 ++++++++++++++++++-
 docs/2018-09/index.html  | 27 +++++++++++++++++++++++----
 docs/robots.txt          |  2 +-
 docs/sitemap.xml         | 20 ++++++++++----------
 4 files changed, 52 insertions(+), 16 deletions(-)

diff --git a/content/posts/2018-09.md b/content/posts/2018-09.md
index 3af594e8d..9d1b91671 100644
--- a/content/posts/2018-09.md
+++ b/content/posts/2018-09.md
@@ -536,7 +536,7 @@ Indexing item downloads (page 260 of 260)
 - I did a batch replacement of the access rights with my [fix-metadata-values.py](https://gist.github.com/alanorth/df92cbfb54d762ba21b28f7cd83b6897) script on DSpace Test:
 
 ```
-$ ./fix-metadata-values.py -i /tmp/fix-access-status.csv -db dspace-u dspace-p 'fuuu' -f cg.identifier.status -t correct -m 206
+$ ./fix-metadata-values.py -i /tmp/fix-access-status.csv -db dspace -u dspace -p 'fuuu' -f cg.identifier.status -t correct -m 206
 ```
 
 - This changes "Open Access" to "Unrestricted Access" and "Limited Access" to "Restricted Access"
@@ -590,4 +590,21 @@ $ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=68.6.87.12' dspace.log.2018-09-26
 - I will add their IPs to the list of bad bots in nginx so we can add a "bot" user agent to them and let Tomcat's Crawler Session Manager Valve handle them
 - I asked Atmire to prepare an invoice for 125 credits
 
+## 2018-09-29
+
+- I merged some changes to author affiliations from Sisay as well as some corrections to organizational names using smart quotes like `Université d’Abomey Calavi` ([#388](https://github.com/ilri/DSpace/pull/388))
+- Peter sent me a list of 43 author names to fix, but it had some encoding errors like `Belalcázar, John` like usual (I will tell him to stop trying to export as UTF-8 because it never seems to work)
+- I did batch replaces for both on CGSpace with my [fix-metadata-values.py](https://gist.github.com/alanorth/df92cbfb54d762ba21b28f7cd83b6897) script:
+
+```
+$ ./fix-metadata-values.py -i 2018-09-29-fix-affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -t correct -m 211
+$ ./fix-metadata-values.py -i 2018-09-29-fix-authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t correct -m 3
+```
+
+- Afterwards I started a full Discovery re-index:
+
+```
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+```
+
diff --git a/docs/2018-09/index.html b/docs/2018-09/index.html
index b5644c2db..5e045f88e 100644
--- a/docs/2018-09/index.html
+++ b/docs/2018-09/index.html
@@ -18,7 +18,7 @@ I’m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I
 " />
-
+
 I did a batch replacement of the access rights with my fix-metadata-values.py script on DSpace Test:
-$ ./fix-metadata-values.py -i /tmp/fix-access-status.csv -db dspace-u dspace-p 'fuuu' -f cg.identifier.status -t correct -m 206
+$ ./fix-metadata-values.py -i /tmp/fix-access-status.csv -db dspace -u dspace -p 'fuuu' -f cg.identifier.status -t correct -m 206
 
    @@ -787,6 +787,25 @@ $ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=68.6.87.12' dspace.log.2018-09-26
  • I asked Atmire to prepare an invoice for 125 credits
+

2018-09-29

+ +
    +
  • I merged some changes to author affiliations from Sisay as well as some corrections to organizational names using smart quotes like Université d’Abomey Calavi (#388)
  • +
  • Peter sent me a list of 43 author names to fix, but it had some encoding errors like Belalcázar, John like usual (I will tell him to stop trying to export as UTF-8 because it never seems to work)
  • +
  • I did batch replaces for both on CGSpace with my fix-metadata-values.py script:
  • +
+ +
$ ./fix-metadata-values.py -i 2018-09-29-fix-affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -t correct -m 211
+$ ./fix-metadata-values.py -i 2018-09-29-fix-authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t correct -m 3
+
+ +
    +
  • Afterwards I started a full Discovery re-index:
  • +
+ +
$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+
diff --git a/docs/robots.txt b/docs/robots.txt
index 5620a7e3e..3582866fa 100644
--- a/docs/robots.txt
+++ b/docs/robots.txt
@@ -39,7 +39,7 @@ Disallow: /cgspace-notes/2015-12/
 Disallow: /cgspace-notes/2015-11/
 Disallow: /cgspace-notes/
 Disallow: /cgspace-notes/categories/
-Disallow: /cgspace-notes/tags/notes/
 Disallow: /cgspace-notes/categories/notes/
+Disallow: /cgspace-notes/tags/notes/
 Disallow: /cgspace-notes/posts/
 Disallow: /cgspace-notes/tags/
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index 381b440f4..7db200d0e 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -4,7 +4,7 @@
 https://alanorth.github.io/cgspace-notes/2018-09/
-2018-09-27T10:52:23+03:00
+2018-09-27T19:16:20+03:00
@@ -184,7 +184,7 @@
 https://alanorth.github.io/cgspace-notes/
-2018-09-27T10:52:23+03:00
+2018-09-27T19:16:20+03:00
 0
@@ -193,27 +193,27 @@
 0
-
-https://alanorth.github.io/cgspace-notes/tags/notes/
-2018-09-27T10:52:23+03:00
-0
-
-
 https://alanorth.github.io/cgspace-notes/categories/notes/
 2018-03-09T22:10:33+02:00
 0
+
+https://alanorth.github.io/cgspace-notes/tags/notes/
+2018-09-27T19:16:20+03:00
+0
+
+
 https://alanorth.github.io/cgspace-notes/posts/
-2018-09-27T10:52:23+03:00
+2018-09-27T19:16:20+03:00
 0
 https://alanorth.github.io/cgspace-notes/tags/
-2018-09-27T10:52:23+03:00
+2018-09-27T19:16:20+03:00
 0