Add notes for 2022-03-04

2022-03-04 15:30:06 +03:00
parent 7453499827
commit 27acbac859
115 changed files with 6550 additions and 6444 deletions

@@ -48,7 +48,7 @@ I filed a bug on OpenRXV: https://github.com/ilri/OpenRXV/issues/39
I filed an issue on OpenRXV to make some minor edits to the admin UI: https://github.com/ilri/OpenRXV/issues/40
"/>
<meta name="generator" content="Hugo 0.92.2" />
<meta name="generator" content="Hugo 0.93.1" />
@@ -173,7 +173,7 @@ $ grep -c added /tmp/2020-09-02-countrycodetagger.log
</code></pre><ul>
<li>I tried to query LDAP directly using the application credentials with ldapsearch and it works:</li>
</ul>
<pre tabindex="0"><code>$ ldapsearch -x -H ldaps://AZCGNEROOT2.CGIARAD.ORG:636/ -b &quot;dc=cgiarad,dc=org&quot; -D &quot;applicationaccount@cgiarad.org&quot; -W &quot;(sAMAccountName=me)&quot;
<pre tabindex="0"><code>$ ldapsearch -x -H ldaps://AZCGNEROOT2.CGIARAD.ORG:636/ -b &#34;dc=cgiarad,dc=org&#34; -D &#34;applicationaccount@cgiarad.org&#34; -W &#34;(sAMAccountName=me)&#34;
</code></pre><ul>
<li>According to the <a href="https://wiki.lyrasis.org/display/DSDOC6x/Authentication+Plugins#AuthenticationPlugins-LDAPAuthentication">DSpace 6 docs</a> we need to escape commas in our LDAP parameters due to the new configuration system
<ul>
@@ -206,8 +206,8 @@ Report
Formally Published
Poster
Unrefereed reprint
$ ./delete-metadata-values.py -i 2020-09-03-delete-review-status.csv -db dspace -u dspace -p 'fuuu' -f dc.description.version -m 68
$ ./fix-metadata-values.py -i 2020-09-03-fix-review-status.csv -db dspace -u dspace -p 'fuuu' -f dc.description.version -t 'correct' -m 68
$ ./delete-metadata-values.py -i 2020-09-03-delete-review-status.csv -db dspace -u dspace -p &#39;fuuu&#39; -f dc.description.version -m 68
$ ./fix-metadata-values.py -i 2020-09-03-fix-review-status.csv -db dspace -u dspace -p &#39;fuuu&#39; -f dc.description.version -t &#39;correct&#39; -m 68
</code></pre><ul>
<li>Start reviewing 95 items for IITA (20201stbatch)
<ul>
@@ -259,9 +259,9 @@ java.lang.NullPointerException
</li>
<li>I will update our nearly 6,000 metadata values for CIFOR in the database accordingly:</li>
</ul>
<pre tabindex="0"><code>dspace=# UPDATE metadatavalue SET text_value = regexp_replace(text_value, '^(http://)?www\.cifor\.org/(nc/)?online-library/browse/view-publication/publication/([[:digit:]]+)\.html$', 'https://www.cifor.org/knowledge/publication/\3') WHERE metadata_field_id=219 AND text_value ~ 'www\.cifor\.org/(nc/)?online-library/browse/view-publication/publication/[[:digit:]]+';
dspace=# UPDATE metadatavalue SET text_value = regexp_replace(text_value, '^https?://www\.cifor\.org/library/([[:digit:]]+)/?$', 'https://www.cifor.org/knowledge/publication/\1') WHERE metadata_field_id=219 AND text_value ~ 'https?://www\.cifor\.org/library/[[:digit:]]+/?';
dspace=# UPDATE metadatavalue SET text_value = regexp_replace(text_value, '^https?://www\.cifor\.org/pid/([[:digit:]]+)/?$', 'https://www.cifor.org/knowledge/publication/\1') WHERE metadata_field_id=219 AND text_value ~ 'https?://www\.cifor\.org/pid/[[:digit:]]+';
<pre tabindex="0"><code>dspace=# UPDATE metadatavalue SET text_value = regexp_replace(text_value, &#39;^(http://)?www\.cifor\.org/(nc/)?online-library/browse/view-publication/publication/([[:digit:]]+)\.html$&#39;, &#39;https://www.cifor.org/knowledge/publication/\3&#39;) WHERE metadata_field_id=219 AND text_value ~ &#39;www\.cifor\.org/(nc/)?online-library/browse/view-publication/publication/[[:digit:]]+&#39;;
dspace=# UPDATE metadatavalue SET text_value = regexp_replace(text_value, &#39;^https?://www\.cifor\.org/library/([[:digit:]]+)/?$&#39;, &#39;https://www.cifor.org/knowledge/publication/\1&#39;) WHERE metadata_field_id=219 AND text_value ~ &#39;https?://www\.cifor\.org/library/[[:digit:]]+/?&#39;;
dspace=# UPDATE metadatavalue SET text_value = regexp_replace(text_value, &#39;^https?://www\.cifor\.org/pid/([[:digit:]]+)/?$&#39;, &#39;https://www.cifor.org/knowledge/publication/\1&#39;) WHERE metadata_field_id=219 AND text_value ~ &#39;https?://www\.cifor\.org/pid/[[:digit:]]+&#39;;
</code></pre><ul>
<li>I did some cleanup on the author affiliations of the IITA data our 2019-04 list using reconcile-csv and OpenRefine:
<ul>
@@ -328,7 +328,7 @@ AFRICA SOUTH OF SAHARA,SUB-SAHARAN AFRICA
NORTH AFRICA,NORTHERN AFRICA
WEST ASIA,WESTERN ASIA
SOUTHWEST ASIA,SOUTHWESTERN ASIA
$ ./fix-metadata-values.py -i 2020-09-10-fix-cgspace-regions.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.region -t 'correct' -m 227 -d -n
$ ./fix-metadata-values.py -i 2020-09-10-fix-cgspace-regions.csv -db dspace -u dspace -p &#39;fuuu&#39; -f cg.coverage.region -t &#39;correct&#39; -m 227 -d -n
Connected to database.
Would fix 12227 occurences of: EAST AFRICA
Would fix 7996 occurences of: WEST AFRICA
@@ -417,7 +417,7 @@ Would fix 3 occurences of: SOUTHWEST ASIA
</ul>
</li>
</ul>
<pre tabindex="0"><code>value + &quot;__description:&quot; + cells[&quot;dc.type&quot;].value
<pre tabindex="0"><code>value + &#34;__description:&#34; + cells[&#34;dc.type&#34;].value
</code></pre><ul>
<li>Then I created a SAF bundle with SAFBuilder:</li>
</ul>
@@ -477,9 +477,9 @@ Would fix 3 occurences of: SOUTHWEST ASIA
</ul>
<pre tabindex="0"><code>$ cat 2020-09-17-add-bioversity-orcids.csv
dc.contributor.author,cg.creator.id
&quot;Etten, Jacob van&quot;,&quot;Jacob van Etten: 0000-0001-7554-2558&quot;
&quot;van Etten, Jacob&quot;,&quot;Jacob van Etten: 0000-0001-7554-2558&quot;
$ ./add-orcid-identifiers-csv.py -i 2020-09-17-add-bioversity-orcids.csv -db dspace -u dspace -p 'dom@in34sniper'
&#34;Etten, Jacob van&#34;,&#34;Jacob van Etten: 0000-0001-7554-2558&#34;
&#34;van Etten, Jacob&#34;,&#34;Jacob van Etten: 0000-0001-7554-2558&#34;
$ ./add-orcid-identifiers-csv.py -i 2020-09-17-add-bioversity-orcids.csv -db dspace -u dspace -p &#39;dom@in34sniper&#39;
</code></pre><ul>
<li>I sent a follow-up message to Atmire to look into the two remaining issues with the DSpace 6 upgrade
<ul>
@@ -496,7 +496,7 @@ $ ./add-orcid-identifiers-csv.py -i 2020-09-17-add-bioversity-orcids.csv -db dsp
</ul>
</li>
</ul>
<pre tabindex="0"><code>https://cgspace.cgiar.org/open-search/discover?query=type:&quot;Journal Article&quot; AND status:&quot;Open Access&quot; AND crpsubject:&quot;Water, Land and Ecosystems&quot; AND &quot;tradeoffs&quot;&amp;rpp=100
<pre tabindex="0"><code>https://cgspace.cgiar.org/open-search/discover?query=type:&#34;Journal Article&#34; AND status:&#34;Open Access&#34; AND crpsubject:&#34;Water, Land and Ecosystems&#34; AND &#34;tradeoffs&#34;&amp;rpp=100
</code></pre><ul>
<li>I noticed that my <code>move-collections.sh</code> script didn&rsquo;t work on DSpace 6 because of the change from IDs to UUIDs, so I modified it to quote the collection <code>resource_id</code> parameters in the PostgreSQL query</li>
</ul>
@@ -538,7 +538,7 @@ dspacestatistics=# SELECT SUM(downloads) FROM items;
</ul>
<pre tabindex="0"><code>dspace=# BEGIN;
BEGIN
dspace=# DELETE FROM metadatavalue WHERE text_value='Report' AND resource_type_id=2 AND metadata_field_id=68;
dspace=# DELETE FROM metadatavalue WHERE text_value=&#39;Report&#39; AND resource_type_id=2 AND metadata_field_id=68;
DELETE 12
dspace=# COMMIT;
</code></pre><ul>
@@ -573,23 +573,23 @@ dspace=# COMMIT;
</li>
</ul>
<pre tabindex="0"><code>...
item_ids = ['0079470a-87a1-4373-beb1-b16e3f0c4d81', '007a9df1-0871-4612-8b28-5335982198cb']
item_ids_str = ' OR '.join(item_ids).replace('-', '\-')
item_ids = [&#39;0079470a-87a1-4373-beb1-b16e3f0c4d81&#39;, &#39;007a9df1-0871-4612-8b28-5335982198cb&#39;]
item_ids_str = &#39; OR &#39;.join(item_ids).replace(&#39;-&#39;, &#39;\-&#39;)
...
solr_query_params = {
&quot;q&quot;: f&quot;id:({item_ids_str})&quot;,
&quot;fq&quot;: &quot;type:2 AND isBot:false AND statistics_type:view AND time:[2020-01-01T00:00:00Z TO 2020-09-02T00:00:00Z]&quot;,
&quot;facet&quot;: &quot;true&quot;,
&quot;facet.field&quot;: &quot;id&quot;,
&quot;facet.mincount&quot;: 1,
&quot;facet.limit&quot;: 1,
&quot;facet.offset&quot;: 0,
&quot;stats&quot;: &quot;true&quot;,
&quot;stats.field&quot;: &quot;id&quot;,
&quot;stats.calcdistinct&quot;: &quot;true&quot;,
&quot;shards&quot;: shards,
&quot;rows&quot;: 0,
&quot;wt&quot;: &quot;json&quot;,
&#34;q&#34;: f&#34;id:({item_ids_str})&#34;,
&#34;fq&#34;: &#34;type:2 AND isBot:false AND statistics_type:view AND time:[2020-01-01T00:00:00Z TO 2020-09-02T00:00:00Z]&#34;,
&#34;facet&#34;: &#34;true&#34;,
&#34;facet.field&#34;: &#34;id&#34;,
&#34;facet.mincount&#34;: 1,
&#34;facet.limit&#34;: 1,
&#34;facet.offset&#34;: 0,
&#34;stats&#34;: &#34;true&#34;,
&#34;stats.field&#34;: &#34;id&#34;,
&#34;stats.calcdistinct&#34;: &#34;true&#34;,
&#34;shards&#34;: shards,
&#34;rows&#34;: 0,
&#34;wt&#34;: &#34;json&#34;,
}
</code></pre><ul>
<li>The date range format for Solr is important, but it seems we only need to add <code>T00:00:00Z</code> to the normal ISO 8601 YYYY-MM-DD strings</li>
@@ -600,61 +600,61 @@ solr_query_params = {
</ul>
<pre tabindex="0"><code>$ curl -s -d @request.json https://dspacetest.cgiar.org/rest/statistics/items | json_pp
{
&quot;currentPage&quot; : 0,
&quot;limit&quot; : 10,
&quot;statistics&quot; : [
&#34;currentPage&#34; : 0,
&#34;limit&#34; : 10,
&#34;statistics&#34; : [
{
&quot;downloads&quot; : 3329,
&quot;id&quot; : &quot;b2c1bbfd-65b0-438c-9e49-d271c49b2696&quot;,
&quot;views&quot; : 1565
&#34;downloads&#34; : 3329,
&#34;id&#34; : &#34;b2c1bbfd-65b0-438c-9e49-d271c49b2696&#34;,
&#34;views&#34; : 1565
},
{
&quot;downloads&quot; : 3797,
&quot;id&quot; : &quot;f44cf173-2344-4eb2-8f00-ee55df32c76f&quot;,
&quot;views&quot; : 48
&#34;downloads&#34; : 3797,
&#34;id&#34; : &#34;f44cf173-2344-4eb2-8f00-ee55df32c76f&#34;,
&#34;views&#34; : 48
},
{
&quot;downloads&quot; : 11064,
&quot;id&quot; : &quot;8542f9da-9ce1-4614-abf4-f2e3fdb4b305&quot;,
&quot;views&quot; : 26
&#34;downloads&#34; : 11064,
&#34;id&#34; : &#34;8542f9da-9ce1-4614-abf4-f2e3fdb4b305&#34;,
&#34;views&#34; : 26
},
{
&quot;downloads&quot; : 6782,
&quot;id&quot; : &quot;2324aa41-e9de-4a2b-bc36-16241464683e&quot;,
&quot;views&quot; : 19
&#34;downloads&#34; : 6782,
&#34;id&#34; : &#34;2324aa41-e9de-4a2b-bc36-16241464683e&#34;,
&#34;views&#34; : 19
},
{
&quot;downloads&quot; : 48,
&quot;id&quot; : &quot;0fe573e7-042a-4240-a4d9-753b61233908&quot;,
&quot;views&quot; : 12
&#34;downloads&#34; : 48,
&#34;id&#34; : &#34;0fe573e7-042a-4240-a4d9-753b61233908&#34;,
&#34;views&#34; : 12
},
{
&quot;downloads&quot; : 0,
&quot;id&quot; : &quot;000e61ca-695d-43e5-9ab8-1f3fd7a67a32&quot;,
&quot;views&quot; : 4
&#34;downloads&#34; : 0,
&#34;id&#34; : &#34;000e61ca-695d-43e5-9ab8-1f3fd7a67a32&#34;,
&#34;views&#34; : 4
},
{
&quot;downloads&quot; : 0,
&quot;id&quot; : &quot;000dc7cd-9485-424b-8ecf-78002613cc87&quot;,
&quot;views&quot; : 1
&#34;downloads&#34; : 0,
&#34;id&#34; : &#34;000dc7cd-9485-424b-8ecf-78002613cc87&#34;,
&#34;views&#34; : 1
},
{
&quot;downloads&quot; : 0,
&quot;id&quot; : &quot;000e1616-3901-4431-80b1-c6bc67312d8c&quot;,
&quot;views&quot; : 1
&#34;downloads&#34; : 0,
&#34;id&#34; : &#34;000e1616-3901-4431-80b1-c6bc67312d8c&#34;,
&#34;views&#34; : 1
},
{
&quot;downloads&quot; : 0,
&quot;id&quot; : &quot;000ea897-5557-49c7-9f54-9fa192c0f83b&quot;,
&quot;views&quot; : 1
&#34;downloads&#34; : 0,
&#34;id&#34; : &#34;000ea897-5557-49c7-9f54-9fa192c0f83b&#34;,
&#34;views&#34; : 1
},
{
&quot;downloads&quot; : 0,
&quot;id&quot; : &quot;000ec427-97e5-4766-85a5-e8dd62199ab5&quot;,
&quot;views&quot; : 1
&#34;downloads&#34; : 0,
&#34;id&#34; : &#34;000ec427-97e5-4766-85a5-e8dd62199ab5&#34;,
&#34;views&#34; : 1
}
],
&quot;totalPages&quot; : 13
&#34;totalPages&#34; : 13
}
</code></pre><ul>
<li>I deployed it on DSpace Test and sent a note to Salem so he can test it</li>