Update notes for 2016-11-08

Alan Orth 2016-11-08 12:44:29 +02:00
parent 42ca377f12
commit cfe5796b3a
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
5 changed files with 46 additions and 0 deletions

@ -85,3 +85,13 @@ $ ./fix-metadata-values.py -i /tmp/CRPs.csv -f cg.contributor.crp -t correct -m
![Listings and Reports broken in DSpace 5.5](2016/11/listings-and-reports-55.png)
- I've filed a ticket with Atmire
- Thinking about batch updates for ORCIDs and authors
- Playing with [SolrClient](https://github.com/moonlitesolutions/SolrClient) in Python to query Solr
- All records in the authority core are either `authority_type:orcid` or `authority_type:person`
- There is a `deleted` field and all items seem to be `false`, but it might be an important sanity check to remember
- The way to go is probably to have a CSV of author names and authority IDs, then to batch update them in PostgreSQL
- Dump of the top ~200 authors in CGSpace:
```
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id=3 group by text_value order by count desc limit 210) to /tmp/210-authors.csv with csv;
```
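- A minimal sketch of the CSV-driven batch update idea, not something that has been run against CGSpace. It assumes a CSV with `text_value` and `authority` header columns, that `metadatavalue` has `authority` and `confidence` columns, and that a confidence of `600` means "accepted". The function name `build_updates` and the sample rows are hypothetical. It only builds parameterized statements (DB-API `%s` placeholders, as used by psycopg2) so they can be reviewed before anything touches PostgreSQL:

```python
import csv
import io

# ASSUMPTION: the "authority" and "confidence" column names are not confirmed
# by these notes; text_value and metadata_field_id=3 come from the dump query.
UPDATE_SQL = (
    "UPDATE metadatavalue SET authority = %s, confidence = %s "
    "WHERE metadata_field_id = %s AND text_value = %s"
)


def build_updates(csv_file, field_id=3, confidence=600):
    """Yield (sql, params) pairs, one per complete name/authority row."""
    for row in csv.DictReader(csv_file):
        name = (row.get("text_value") or "").strip()
        authority_id = (row.get("authority") or "").strip()
        if not name or not authority_id:
            continue  # skip incomplete rows rather than guess a value
        yield UPDATE_SQL, (authority_id, confidence, field_id, name)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = io.StringIO(
        "text_value,authority\n"
        '"Doe, Jane",hypothetical-authority-id\n'
    )
    for sql, params in build_updates(sample):
        print(sql, params)
```

  Each pair could then be passed to a cursor's `execute()` inside a single transaction, so the whole batch can be rolled back if the row counts look wrong.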

@ -186,8 +186,17 @@ COPY 22
<ul>
<li>I&rsquo;ve filed a ticket with Atmire</li>
<li>Thinking about batch updates for ORCIDs and authors</li>
<li>Playing with <a href="https://github.com/moonlitesolutions/SolrClient">SolrClient</a> in Python to query Solr</li>
<li>All records in the authority core are either <code>authority_type:orcid</code> or <code>authority_type:person</code></li>
<li>There is a <code>deleted</code> field and all items seem to be <code>false</code>, but it might be an important sanity check to remember</li>
<li>The way to go is probably to have a CSV of author names and authority IDs, then to batch update them in PostgreSQL</li>
<li>Dump of the top ~200 authors in CGSpace:</li>
</ul>
<pre><code>dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id=3 group by text_value order by count desc limit 210) to /tmp/210-authors.csv with csv;
</code></pre>

@ -113,7 +113,16 @@ COPY 22
&lt;ul&gt;
&lt;li&gt;I&amp;rsquo;ve filed a ticket with Atmire&lt;/li&gt;
&lt;li&gt;Thinking about batch updates for ORCIDs and authors&lt;/li&gt;
&lt;li&gt;Playing with &lt;a href=&#34;https://github.com/moonlitesolutions/SolrClient&#34;&gt;SolrClient&lt;/a&gt; in Python to query Solr&lt;/li&gt;
&lt;li&gt;All records in the authority core are either &lt;code&gt;authority_type:orcid&lt;/code&gt; or &lt;code&gt;authority_type:person&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;There is a &lt;code&gt;deleted&lt;/code&gt; field and all items seem to be &lt;code&gt;false&lt;/code&gt;, but it might be an important sanity check to remember&lt;/li&gt;
&lt;li&gt;The way to go is probably to have a CSV of author names and authority IDs, then to batch update them in PostgreSQL&lt;/li&gt;
&lt;li&gt;Dump of the top ~200 authors in CGSpace:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id=3 group by text_value order by count desc limit 210) to /tmp/210-authors.csv with csv;
&lt;/code&gt;&lt;/pre&gt;
</description>
</item>

@ -113,7 +113,16 @@ COPY 22
&lt;ul&gt;
&lt;li&gt;I&amp;rsquo;ve filed a ticket with Atmire&lt;/li&gt;
&lt;li&gt;Thinking about batch updates for ORCIDs and authors&lt;/li&gt;
&lt;li&gt;Playing with &lt;a href=&#34;https://github.com/moonlitesolutions/SolrClient&#34;&gt;SolrClient&lt;/a&gt; in Python to query Solr&lt;/li&gt;
&lt;li&gt;All records in the authority core are either &lt;code&gt;authority_type:orcid&lt;/code&gt; or &lt;code&gt;authority_type:person&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;There is a &lt;code&gt;deleted&lt;/code&gt; field and all items seem to be &lt;code&gt;false&lt;/code&gt;, but it might be an important sanity check to remember&lt;/li&gt;
&lt;li&gt;The way to go is probably to have a CSV of author names and authority IDs, then to batch update them in PostgreSQL&lt;/li&gt;
&lt;li&gt;Dump of the top ~200 authors in CGSpace:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id=3 group by text_value order by count desc limit 210) to /tmp/210-authors.csv with csv;
&lt;/code&gt;&lt;/pre&gt;
</description>
</item>

@ -112,7 +112,16 @@ COPY 22
&lt;ul&gt;
&lt;li&gt;I&amp;rsquo;ve filed a ticket with Atmire&lt;/li&gt;
&lt;li&gt;Thinking about batch updates for ORCIDs and authors&lt;/li&gt;
&lt;li&gt;Playing with &lt;a href=&#34;https://github.com/moonlitesolutions/SolrClient&#34;&gt;SolrClient&lt;/a&gt; in Python to query Solr&lt;/li&gt;
&lt;li&gt;All records in the authority core are either &lt;code&gt;authority_type:orcid&lt;/code&gt; or &lt;code&gt;authority_type:person&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;There is a &lt;code&gt;deleted&lt;/code&gt; field and all items seem to be &lt;code&gt;false&lt;/code&gt;, but it might be an important sanity check to remember&lt;/li&gt;
&lt;li&gt;The way to go is probably to have a CSV of author names and authority IDs, then to batch update them in PostgreSQL&lt;/li&gt;
&lt;li&gt;Dump of the top ~200 authors in CGSpace:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id=3 group by text_value order by count desc limit 210) to /tmp/210-authors.csv with csv;
&lt;/code&gt;&lt;/pre&gt;
</description>
</item>