<li>Add <code>dc.type</code> to the output options for Atmire’s Listings and Reports module (<a href="https://github.com/ilri/DSpace/pull/286">#286</a>)</li>
<li>Looks like the OAI bug from DSpace 5.1 that caused validation at Base Search to fail is now fixed and DSpace Test passes validation! (<a href="https://github.com/ilri/DSpace/issues/63">#63</a>)</li>
<li>After re-deploying and re-indexing I didn’t see the same issue, and the indexing completed in 85 minutes, which is about how long it is supposed to take</li>
<li>I noticed some weird CRPs in the database, and they don’t show up in Discovery for some reason, perhaps because of the <code>:</code></li>
<li>I’ll export these and fix them in batch:</li>
</ul>
<pre><code>dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id=230 group by text_value order by count desc) to /tmp/crp.csv with csv;
</code></pre>
<ul>
<li>Add <code>AMR</code> to ILRI subjects and remove one duplicate instance of IITA in author affiliations controlled vocabulary (<a href="https://github.com/ilri/DSpace/pull/288">#288</a>)</li>
</ul>
<pre><code>dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id=3 group by text_value order by count desc limit 210) to /tmp/210-authors.csv with csv;
</code></pre>
<ul>
<li>CGSpace crashed so I quickly ran system updates, applied one or two of the waiting changes from the <code>5_x-prod</code> branch, and rebooted the server</li>
<li>The error was <code>Timeout waiting for idle object</code> but I haven’t looked into the Tomcat logs to see what happened</li>
<li>Also, I ran the corrections for CRPs from earlier this week</li>
<li>But the results are deceiving because metadata fields can have text languages and your query must match exactly!</li>
</ul>
<pre><code>dspace=# select distinct text_value, text_lang from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS';
text_value | text_lang
------------+-----------
SEEDS |
SEEDS |
SEEDS | en_US
(3 rows)
</code></pre>
<ul>
<li>So basically, the text language here could be null, blank, or en_US</li>
<li>To query metadata with these properties, you can do:</li>
<li>The results (55+34=89) don’t seem to match those from the database:</li>
</ul>
<pre><code>dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS' and text_lang is null;
count
-------
15
dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS' and text_lang='';
count
-------
4
dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS' and text_lang='en_US';
count
-------
66
</code></pre>
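<ul>
<li>The three counts above can also be collected in one grouped query. Here is a minimal sketch of that idea in Python, using an in-memory SQLite table as a stand-in for the PostgreSQL <code>metadatavalue</code> table (an assumption, as is the helper name <code>count_by_lang</code>). The rows are synthetic and only recreate the 15, 4, and 66 counts reported above:</li>
</ul>

```python
import sqlite3

# In-memory SQLite stand-in for the PostgreSQL metadatavalue table (assumption:
# only the four columns used in the queries above are modelled).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadatavalue ("
    "resource_type_id INTEGER, metadata_field_id INTEGER, "
    "text_value TEXT, text_lang TEXT)"
)

# Synthetic rows that recreate the three text_lang variants and their counts.
rows = [(2, 203, "SEEDS", None)] * 15      # text_lang is null
rows += [(2, 203, "SEEDS", "")] * 4        # text_lang is blank
rows += [(2, 203, "SEEDS", "en_US")] * 66  # text_lang is en_US
conn.executemany("INSERT INTO metadatavalue VALUES (?, ?, ?, ?)", rows)


def count_by_lang(connection, field_id, value):
    """Return {text_lang: count} for one field and value, keeping NULL and '' apart."""
    cursor = connection.execute(
        "SELECT text_lang, COUNT(*) FROM metadatavalue "
        "WHERE resource_type_id = 2 AND metadata_field_id = ? AND text_value = ? "
        "GROUP BY text_lang",
        (field_id, value),
    )
    return dict(cursor.fetchall())


counts = count_by_lang(conn, 203, "SEEDS")
total = sum(counts.values())
print(counts, total)
```

<ul>
<li>Grouping by <code>text_lang</code> keeps the null and blank rows in separate buckets, so the per-language counts and their total come back from a single pass instead of three separate queries.</li>
</ul>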
<ul>
<li>So, querying from the API I get 55 + 34 = 89 results, but the database actually only has 85…</li>
<li>And the <code>find-by-metadata-field</code> endpoint doesn’t seem to have a way to get all items with the field, or a wildcard value</li>
<li>I’ll ask a question on the dspace-tech mailing list</li>
<li>And speaking of <code>text_lang</code>, this is interesting:</li>
</ul>
<pre><code>dspacetest=# select distinct text_lang from metadatavalue where resource_type_id=2;
text_lang
-----------
ethnob
en
spa
EN
es
frn
en_
en_US
EN_US
eng
en_U
fr
(14 rows)
</code></pre>
<ul>
<li>Generate a list of all these so I can fix them in batch:</li>
</ul>
<pre><code>dspace=# \copy (select distinct text_lang, count(*) from metadatavalue where resource_type_id=2 group by text_lang order by count desc) to /tmp/text-langs.csv with csv;
COPY 14
</code></pre>
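<ul>
<li>Before fixing anything in batch, it might help to draft a mapping from the messy values to a canonical one. The sketch below is only an illustration: the helper <code>propose_text_lang</code> and the rule it applies (treat truncated or differently cased forms of <code>en_US</code> as <code>en_US</code>, leave everything else for manual review) are my assumptions, not something decided above:</li>
</ul>

```python
# Canonical value that the truncated or upper-cased variants are assumed to mean.
CANONICAL = "en_US"

# Non-empty text_lang values taken from the query output above.
observed = ["ethnob", "en", "spa", "EN", "es", "frn",
            "en_", "en_US", "EN_US", "eng", "en_U", "fr"]


def propose_text_lang(value):
    """Return a proposed replacement for one text_lang value, or the value unchanged."""
    if value is None or value == "":
        # Null and blank values need a separate decision, so leave them alone.
        return value
    # Hypothetical rule: anything containing an underscore that is a
    # case-insensitive prefix of en_US (en_, en_U, EN_US) becomes en_US.
    if "_" in value and CANONICAL.lower().startswith(value.lower()):
        return CANONICAL
    return value


# Keep only the values that would actually change.
proposed = {v: propose_text_lang(v) for v in observed if propose_text_lang(v) != v}
print(proposed)
```

<ul>
<li>Values such as <code>eng</code>, <code>frn</code>, or <code>ethnob</code> are deliberately left untouched here, because deciding what they should become is a judgment call rather than a mechanical fix.</li>
</ul>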
<ul>
<li>Perhaps we need to fix them all in batch, or experiment with fixing only certain metadatavalues:</li>
</ul>
<pre><code>dspace=# update metadatavalue set text_lang='en_US' where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS';