Update notes for 2016-09-19

Alan Orth 2016-09-19 17:52:47 +03:00
parent b3bd4b1d2b
commit 6fc8031da4
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
5 changed files with 66 additions and 13 deletions

@ -276,11 +276,10 @@ org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error
```
- Looking at the top 20 IPs or so, most are Yahoo, MSN, Google, Baidu, TurnitIn (iParadigm), etc... do we have any real users?
- Generate a list of all Affiliations for Peter Ballantyne to go through, make corrections, and create a lookup list from:
- Generate a list of all author affiliations for Peter Ballantyne to go through, make corrections, and create a lookup list from:
```
dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=211 group by text_value order by count desc)
to /tmp/affiliations.csv with csv;
dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=211 group by text_value order by count desc) to /tmp/affiliations.csv with csv;
```
- Looking into the Catalina logs again around the time of the first crash, I see:
@ -387,3 +386,15 @@ Exception in thread "Thread-54216" org.apache.solr.client.solrj.impl.HttpSolrSer
```
- I've sent a message to Atmire about the Solr error to see if it's related to their batch update module
## 2016-09-19
- Work on cleanups for author affiliations after Peter sent me his list of corrections/deletions:
```
$ ./fix-metadata-values.py -i affiliations_pb-322-corrections.csv -f cg.contributor.affiliation -t correct -m 211 -d dspace -u dspace -p fuuu
$ ./delete-metadata-values.py -f cg.contributor.affiliation -i affiliations_pb-2-deletions.csv -m 211 -u dspace -d dspace -p fuuu
```
- After that we need to take the top ~300 and make a controlled vocabulary for it
- I dumped a list of the top 300 affiliations from the database, sorted it alphabetically in OpenRefine, and created a controlled vocabulary for it ([#267](https://github.com/ilri/DSpace/pull/267))
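- A minimal Python sketch of the "top 300, sorted alphabetically" step, for reference only. The real work was done in OpenRefine; this assumes a two-column CSV (`text_value`, `count`) like the `\copy` export above. The function names and the `node` / `isComposedBy` XML layout are illustrative assumptions, not necessarily what the pull request contains:

```python
import csv
import io
import xml.etree.ElementTree as ET


def top_affiliations(csv_text, limit=300):
    """Return the `limit` most frequent affiliations, sorted alphabetically."""
    rows = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        rows.append((row[0].strip(), int(row[1])))
    # Most frequent first; list.sort is stable, so ties keep their file order.
    rows.sort(key=lambda r: r[1], reverse=True)
    # Drop empty values and duplicates, then sort case-insensitively.
    names = {name for name, _ in rows[:limit] if name}
    return sorted(names, key=str.casefold)


def to_vocabulary_xml(names, vocab_id="affiliations"):
    """Render names as nested <node> elements (assumed vocabulary layout)."""
    root = ET.Element("node", id=vocab_id, label=vocab_id)
    children = ET.SubElement(root, "isComposedBy")
    for name in names:
        ET.SubElement(children, "node", id=name, label=name)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up sample rows, not real counts from the database.
    sample = "ILRI,120\nCIAT,95\nBioversity International,40\nUnknown,1\n"
    names = top_affiliations(sample, limit=3)
    print(names)
    print(to_vocabulary_xml(names))
```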

@ -395,11 +395,10 @@ org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error
<ul>
<li>Looking at the top 20 IPs or so, most are Yahoo, MSN, Google, Baidu, TurnitIn (iParadigm), etc&hellip; do we have any real users?</li>
<li>Generate a list of all Affiliations for Peter Ballantyne to go through, make corrections, and create a lookup list from:</li>
<li>Generate a list of all author affiliations for Peter Ballantyne to go through, make corrections, and create a lookup list from:</li>
</ul>
<pre><code>dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=211 group by text_value order by count desc)
to /tmp/affiliations.csv with csv;
<pre><code>dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=211 group by text_value order by count desc) to /tmp/affiliations.csv with csv;
</code></pre>
<ul>
@ -519,6 +518,21 @@ Exception in thread &quot;Thread-54216&quot; org.apache.solr.client.solrj.impl.H
<ul>
<li>I&rsquo;ve sent a message to Atmire about the Solr error to see if it&rsquo;s related to their batch update module</li>
</ul>
<h2 id="2016-09-19">2016-09-19</h2>
<ul>
<li>Work on cleanups for author affiliations after Peter sent me his list of corrections/deletions:</li>
</ul>
<pre><code>$ ./fix-metadata-values.py -i affiliations_pb-322-corrections.csv -f cg.contributor.affiliation -t correct -m 211 -d dspace -u dspace -p fuuu
$ ./delete-metadata-values.py -f cg.contributor.affiliation -i affiliations_pb-2-deletions.csv -m 211 -u dspace -d dspace -p fuuu
</code></pre>
<ul>
<li>After that we need to take the top ~300 and make a controlled vocabulary for it</li>
<li>I dumped a list of the top 300 affiliations from the database, sorted it alphabetically in OpenRefine, and created a controlled vocabulary for it (<a href="https://github.com/ilri/DSpace/pull/267">#267</a>)</li>
</ul>
</section>

@ -333,11 +333,10 @@ org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error
&lt;ul&gt;
&lt;li&gt;Looking at the top 20 IPs or so, most are Yahoo, MSN, Google, Baidu, TurnitIn (iParadigm), etc&amp;hellip; do we have any real users?&lt;/li&gt;
&lt;li&gt;Generate a list of all Affiliations for Peter Ballantyne to go through, make corrections, and create a lookup list from:&lt;/li&gt;
&lt;li&gt;Generate a list of all author affiliations for Peter Ballantyne to go through, make corrections, and create a lookup list from:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=211 group by text_value order by count desc)
to /tmp/affiliations.csv with csv;
&lt;pre&gt;&lt;code&gt;dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=211 group by text_value order by count desc) to /tmp/affiliations.csv with csv;
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
@ -458,6 +457,21 @@ Exception in thread &amp;quot;Thread-54216&amp;quot; org.apache.solr.client.solr
&lt;ul&gt;
&lt;li&gt;I&amp;rsquo;ve sent a message to Atmire about the Solr error to see if it&amp;rsquo;s related to their batch update module&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-09-19&#34;&gt;2016-09-19&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Work on cleanups for author affiliations after Peter sent me his list of corrections/deletions:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;$ ./fix-metadata-values.py -i affiliations_pb-322-corrections.csv -f cg.contributor.affiliation -t correct -m 211 -d dspace -u dspace -p fuuu
$ ./delete-metadata-values.py -f cg.contributor.affiliation -i affiliations_pb-2-deletions.csv -m 211 -u dspace -d dspace -p fuuu
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;After that we need to take the top ~300 and make a controlled vocabulary for it&lt;/li&gt;
&lt;li&gt;I dumped a list of the top 300 affiliations from the database, sorted it alphabetically in OpenRefine, and created a controlled vocabulary for it (&lt;a href=&#34;https://github.com/ilri/DSpace/pull/267&#34;&gt;#267&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
</description>
</item>

@ -1 +1 @@
<!DOCTYPE html><html><head><title>https://alanorth.github.io/cgspace-notes/</title><link rel="canonical" href="https://alanorth.github.io/cgspace-notes/"/><meta http-equiv="content-type" content="text/html; charset=utf-8" /><meta http-equiv="refresh" content="0; url=https://alanorth.github.io/cgspace-notes/" /></head></html>
<!DOCTYPE html><html><head><link rel="canonical" href="https://alanorth.github.io/cgspace-notes/"/><meta http-equiv="content-type" content="text/html; charset=utf-8" /><meta http-equiv="refresh" content="0;url=https://alanorth.github.io/cgspace-notes/" /></head></html>

@ -333,11 +333,10 @@ org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error
&lt;ul&gt;
&lt;li&gt;Looking at the top 20 IPs or so, most are Yahoo, MSN, Google, Baidu, TurnitIn (iParadigm), etc&amp;hellip; do we have any real users?&lt;/li&gt;
&lt;li&gt;Generate a list of all Affiliations for Peter Ballantyne to go through, make corrections, and create a lookup list from:&lt;/li&gt;
&lt;li&gt;Generate a list of all author affiliations for Peter Ballantyne to go through, make corrections, and create a lookup list from:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=211 group by text_value order by count desc)
to /tmp/affiliations.csv with csv;
&lt;pre&gt;&lt;code&gt;dspacetest=# \copy (select text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=211 group by text_value order by count desc) to /tmp/affiliations.csv with csv;
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
@ -458,6 +457,21 @@ Exception in thread &amp;quot;Thread-54216&amp;quot; org.apache.solr.client.solr
&lt;ul&gt;
&lt;li&gt;I&amp;rsquo;ve sent a message to Atmire about the Solr error to see if it&amp;rsquo;s related to their batch update module&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-09-19&#34;&gt;2016-09-19&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Work on cleanups for author affiliations after Peter sent me his list of corrections/deletions:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;$ ./fix-metadata-values.py -i affiliations_pb-322-corrections.csv -f cg.contributor.affiliation -t correct -m 211 -d dspace -u dspace -p fuuu
$ ./delete-metadata-values.py -f cg.contributor.affiliation -i affiliations_pb-2-deletions.csv -m 211 -u dspace -d dspace -p fuuu
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;After that we need to take the top ~300 and make a controlled vocabulary for it&lt;/li&gt;
&lt;li&gt;I dumped a list of the top 300 affiliations from the database, sorted it alphabetically in OpenRefine, and created a controlled vocabulary for it (&lt;a href=&#34;https://github.com/ilri/DSpace/pull/267&#34;&gt;#267&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
</description>
</item>