From d36443d3e82c89d7d1484edb383d4c168f6ea17d Mon Sep 17 00:00:00 2001
From: Alan Orth
Date: Tue, 4 Oct 2016 11:34:57 +0300
Subject: [PATCH] Add notes for 2016-10-04

---
 content/post/2016-10.md     | 16 ++++++++++++++++
 public/2016-10/index.html   | 22 ++++++++++++++++++++++
 public/index.xml            | 22 ++++++++++++++++++++++
 public/post/index.xml       | 22 ++++++++++++++++++++++
 public/tags/notes/index.xml | 22 ++++++++++++++++++++++
 5 files changed, 104 insertions(+)

diff --git a/content/post/2016-10.md b/content/post/2016-10.md
index 6e7730d44..6f13ba3ba 100644
--- a/content/post/2016-10.md
+++ b/content/post/2016-10.md
@@ -24,3 +24,19 @@ tags = ["Notes"]
 ![Bootstrap issue with in-page anchors](2016/10/bootstrap-issue.png)
 
 - Looks like we'll just have to add the text to the About page (without a link) or add a separate page
+
+## 2016-10-04
+
+- Start testing cleanups of authors that Peter sent last week
+- Out of 40,000+ rows, Peter had indicated corrections for ~3,200 of them—too many to look through carefully, so I did some basic quality checking:
+  - Trim leading/trailing whitespace
+  - Find invalid characters
+  - Cluster values to merge obvious authors
+- That left us with 3,180 valid corrections and 3 deletions:
+
+```
+$ ./fix-metadata-values.py -i authors-fix-3180.csv -f dc.contributor.author -t correct -m 3 -d dspacetest -u dspacetest -p fuuu
+$ ./delete-metadata-values.py -i authors-delete-3.csv -f dc.contributor.author -m 3 -d dspacetest -u dspacetest -p fuuu
+```
+
+- Remove old about page ([#284](https://github.com/ilri/DSpace/pull/284))
diff --git a/public/2016-10/index.html b/public/2016-10/index.html
index 7f5b35210..59a58bd76 100644
--- a/public/2016-10/index.html
+++ b/public/2016-10/index.html
@@ -112,6 +112,28 @@
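 <ul>
 <li>Looks like we&rsquo;ll just have to add the text to the About page (without a link) or add a separate page</li>
 </ul>
+
+<h2 id="2016-10-04">2016-10-04</h2>
+
+<ul>
+<li>Start testing cleanups of authors that Peter sent last week</li>
+<li>Out of 40,000+ rows, Peter had indicated corrections for ~3,200 of them—too many to look through carefully, so I did some basic quality checking:
+
+<ul>
+<li>Trim leading/trailing whitespace</li>
+<li>Find invalid characters</li>
+<li>Cluster values to merge obvious authors</li>
+</ul></li>
+<li>That left us with 3,180 valid corrections and 3 deletions:</li>
+</ul>
+
+<pre><code>$ ./fix-metadata-values.py -i authors-fix-3180.csv -f dc.contributor.author -t correct -m 3 -d dspacetest -u dspacetest -p fuuu
+$ ./delete-metadata-values.py -i authors-delete-3.csv -f dc.contributor.author -m 3 -d dspacetest -u dspacetest -p fuuu
+</code></pre>
+
+<ul>
+<li>Remove old about page (<a href="https://github.com/ilri/DSpace/pull/284">#284</a>)</li>
+</ul>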
diff --git a/public/index.xml b/public/index.xml
index 9adfdec2c..6b55dbda4 100644
--- a/public/index.xml
+++ b/public/index.xml
@@ -44,6 +44,28 @@
 <ul>
 <li>Looks like we&rsquo;ll just have to add the text to the About page (without a link) or add a separate page</li>
 </ul>
+
+<h2 id="2016-10-04">2016-10-04</h2>
+
+<ul>
+<li>Start testing cleanups of authors that Peter sent last week</li>
+<li>Out of 40,000+ rows, Peter had indicated corrections for ~3,200 of them—too many to look through carefully, so I did some basic quality checking:
+
+<ul>
+<li>Trim leading/trailing whitespace</li>
+<li>Find invalid characters</li>
+<li>Cluster values to merge obvious authors</li>
+</ul></li>
+<li>That left us with 3,180 valid corrections and 3 deletions:</li>
+</ul>
+
+<pre><code>$ ./fix-metadata-values.py -i authors-fix-3180.csv -f dc.contributor.author -t correct -m 3 -d dspacetest -u dspacetest -p fuuu
+$ ./delete-metadata-values.py -i authors-delete-3.csv -f dc.contributor.author -m 3 -d dspacetest -u dspacetest -p fuuu
+</code></pre>
+
+<ul>
+<li>Remove old about page (<a href="https://github.com/ilri/DSpace/pull/284">#284</a>)</li>
+</ul>
diff --git a/public/post/index.xml b/public/post/index.xml
index 19376a044..9bef17e88 100644
--- a/public/post/index.xml
+++ b/public/post/index.xml
@@ -44,6 +44,28 @@
 <ul>
 <li>Looks like we&rsquo;ll just have to add the text to the About page (without a link) or add a separate page</li>
 </ul>
+
+<h2 id="2016-10-04">2016-10-04</h2>
+
+<ul>
+<li>Start testing cleanups of authors that Peter sent last week</li>
+<li>Out of 40,000+ rows, Peter had indicated corrections for ~3,200 of them—too many to look through carefully, so I did some basic quality checking:
+
+<ul>
+<li>Trim leading/trailing whitespace</li>
+<li>Find invalid characters</li>
+<li>Cluster values to merge obvious authors</li>
+</ul></li>
+<li>That left us with 3,180 valid corrections and 3 deletions:</li>
+</ul>
+
+<pre><code>$ ./fix-metadata-values.py -i authors-fix-3180.csv -f dc.contributor.author -t correct -m 3 -d dspacetest -u dspacetest -p fuuu
+$ ./delete-metadata-values.py -i authors-delete-3.csv -f dc.contributor.author -m 3 -d dspacetest -u dspacetest -p fuuu
+</code></pre>
+
+<ul>
+<li>Remove old about page (<a href="https://github.com/ilri/DSpace/pull/284">#284</a>)</li>
+</ul>
diff --git a/public/tags/notes/index.xml b/public/tags/notes/index.xml
index d4dd3fd97..5ba73f0ca 100644
--- a/public/tags/notes/index.xml
+++ b/public/tags/notes/index.xml
@@ -43,6 +43,28 @@
 <ul>
 <li>Looks like we&rsquo;ll just have to add the text to the About page (without a link) or add a separate page</li>
 </ul>
+
+<h2 id="2016-10-04">2016-10-04</h2>
+
+<ul>
+<li>Start testing cleanups of authors that Peter sent last week</li>
+<li>Out of 40,000+ rows, Peter had indicated corrections for ~3,200 of them—too many to look through carefully, so I did some basic quality checking:
+
+<ul>
+<li>Trim leading/trailing whitespace</li>
+<li>Find invalid characters</li>
+<li>Cluster values to merge obvious authors</li>
+</ul></li>
+<li>That left us with 3,180 valid corrections and 3 deletions:</li>
+</ul>
+
+<pre><code>$ ./fix-metadata-values.py -i authors-fix-3180.csv -f dc.contributor.author -t correct -m 3 -d dspacetest -u dspacetest -p fuuu
+$ ./delete-metadata-values.py -i authors-delete-3.csv -f dc.contributor.author -m 3 -d dspacetest -u dspacetest -p fuuu
+</code></pre>
+
+<ul>
+<li>Remove old about page (<a href="https://github.com/ilri/DSpace/pull/284">#284</a>)</li>
+</ul>
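
The three quality checks listed in the 2016-10-04 notes (trim whitespace, find invalid characters, cluster similar values) could be sketched roughly as below. This is only an illustration: the notes do not say which tool was used, so the function names, the definition of "invalid" (control characters and U+FFFD), and the fingerprint-style clustering key are all assumptions, and the author names are made up.

```python
import re
import unicodedata
from collections import defaultdict


def trim(value):
    """Trim leading/trailing whitespace from a metadata value."""
    return value.strip()


def has_invalid_characters(value):
    """Assumed definition of 'invalid': control characters or U+FFFD."""
    return any(
        ch == "\ufffd" or unicodedata.category(ch) == "Cc" for ch in value
    )


def fingerprint(value):
    """Hypothetical clustering key: lowercase, drop punctuation, sort unique tokens."""
    tokens = re.sub(r"[^\w\s]", "", value.lower()).split()
    return " ".join(sorted(set(tokens)))


def cluster(values):
    """Group values that share a fingerprint; return only groups with variants."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Made-up sample values, not taken from the real author list.
sample = ["  Doe, Jane ", "Doe, Jane.", "Roe, Richard", "Doe,\x00 Jane"]
trimmed = [trim(v) for v in sample]
suspect = [v for v in trimmed if has_invalid_characters(v)]
clusters = cluster([v for v in trimmed if not has_invalid_characters(v)])
print(suspect)
print(clusters)
```

Values flagged as suspect would be reviewed by hand, and each cluster reviewed before picking one spelling as the correction.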