mirror of https://github.com/alanorth/cgspace-notes.git (synced 2024-11-26 00:18:21 +01:00)
Update notes for 2016-11-10
parent b7d9b1e86b
commit 9d06d39752
@ -101,3 +101,92 @@ dspace=# \copy (select distinct text_value, count(*) from metadatavalue where me
- CGSpace crashed so I quickly ran system updates, applied one or two of the waiting changes from the `5_x-prod` branch, and rebooted the server
- The error was `Timeout waiting for idle object` but I haven't looked into the Tomcat logs to see what happened
- Also, I ran the corrections for CRPs from earlier this week

## 2016-11-10

- Helping Megan Zandstra and CIAT with some questions about the REST API
- Playing with `find-by-metadata-field`, this works:

```
$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "http://localhost:8080/rest/items/find-by-metadata-field" -d '{"key": "cg.subject.ilri","value": "SEEDS"}'
```

- But the results are deceiving because metadata fields can have text languages and your query must match exactly!

```
dspace=# select distinct text_value, text_lang from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS';
 text_value | text_lang
------------+-----------
 SEEDS      |
 SEEDS      |
 SEEDS      | en_US
(3 rows)
```

- So basically, the text language here could be null, blank, or en_US
- To query metadata with these properties, you can do:

```
$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "http://localhost:8080/rest/items/find-by-metadata-field" -d '{"key": "cg.subject.ilri","value": "SEEDS"}' | jq length
55
$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "http://localhost:8080/rest/items/find-by-metadata-field" -d '{"key": "cg.subject.ilri","value": "SEEDS", "language":""}' | jq length
34
$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "http://localhost:8080/rest/items/find-by-metadata-field" -d '{"key": "cg.subject.ilri","value": "SEEDS", "language":"en_US"}' | jq length
```
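
- A minimal Python sketch of the three request bodies used in the curl commands above: one without a `language` key, one with a blank language, and one with `en_US`. The helper name `build_payload` and the `OMIT` sentinel are hypothetical, and the sketch only builds the JSON; it does not call the REST API:

```python
import json

# Sentinel meaning "leave the language key out of the request body entirely".
OMIT = object()


def build_payload(key, value, language=OMIT):
    """Return the JSON body for a find-by-metadata-field POST request."""
    payload = {"key": key, "value": value}
    if language is not OMIT:
        payload["language"] = language
    return json.dumps(payload)


# The three variants queried above: no language key, blank, and en_US.
payloads = [
    build_payload("cg.subject.ilri", "SEEDS"),
    build_payload("cg.subject.ilri", "SEEDS", language=""),
    build_payload("cg.subject.ilri", "SEEDS", language="en_US"),
]

for body in payloads:
    print(body)
```

- Keeping "no key" distinct from a blank string matters here, because the first two curl queries return different counts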

- The results (55+34=89) don't seem to match those from the database:

```
dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS' and text_lang is null;
 count
-------
    15
dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS' and text_lang='';
 count
-------
     4
dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS' and text_lang='en_US';
 count
-------
    66
```
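
- A quick tally of the numbers recorded so far, as a small Python sketch. The values are copied from the query output in these notes; the dictionary labels are only illustrative, and no count is shown for the `en_US` API query, so it is left out:

```python
# Row counts from the three psql queries, keyed by text_lang case.
db_counts = {"null": 15, "blank": 4, "en_US": 66}

# Item counts returned by the first two API queries.
api_counts = {"no language key": 55, "blank language": 34}

db_total = sum(db_counts.values())
api_total = sum(api_counts.values())

print(f"database rows: {db_total}")
print(f"API items: {api_total}")
print(f"difference: {api_total - db_total}")
```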

- So, querying from the API I get 55 + 34 = 89 results, but the database actually only has 85...
- And the `find-by-metadata-field` endpoint doesn't seem to have a way to get all items with the field, or a wildcard value
- I'll ask a question on the dspace-tech mailing list
- And speaking of `text_lang`, this is interesting:

```
dspacetest=# select distinct text_lang from metadatavalue where resource_type_id=2;
 text_lang
-----------

 ethnob
 en
 spa
 EN
 es
 frn
 en_
 en_US

 EN_US
 eng
 en_U
 fr
(14 rows)
```

- Generate a list of all these so I can fix them in batch:

```
dspace=# \copy (select distinct text_lang, count(*) from metadatavalue where resource_type_id=2 group by text_lang order by count desc) to /tmp/text-langs.csv with csv;
COPY 14
```

- Perhaps we need to fix them all in batch, or experiment with fixing only certain metadatavalues:

```
dspace=# update metadatavalue set text_lang='en_US' where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS';
UPDATE 85
```
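
- One possible way to prepare a batch fix is to map each stray `text_lang` variant to a canonical code and generate one UPDATE per variant. A minimal Python sketch: the `CANONICAL` mapping and its target codes are assumptions for illustration only (these notes do not decide what the targets should be), `ethnob` is left unmapped because it is unclear what it should become, and null or blank values are not touched:

```python
# Hypothetical mapping from observed text_lang values to a canonical code.
# The targets are assumptions for illustration, not decisions from these notes.
CANONICAL = {
    "en": "en_US",
    "EN": "en_US",
    "eng": "en_US",
    "en_": "en_US",
    "en_U": "en_US",
    "EN_US": "en_US",
    "spa": "es",
    "frn": "fr",
}


def update_statements(mapping):
    """Yield one UPDATE statement per variant that needs changing."""
    for old, new in sorted(mapping.items()):
        if old != new:
            # Values come from the fixed mapping above, not from user input.
            yield (
                "update metadatavalue set text_lang='{}' "
                "where resource_type_id=2 and text_lang='{}';".format(new, old)
            )


statements = list(update_statements(CANONICAL))
for sql in statements:
    print(sql)
```

- The generated statements would still need review against the counts in `/tmp/text-langs.csv` before running any of them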
@ -205,6 +205,102 @@ COPY 22
<li>Also, I ran the corrections for CRPs from earlier this week</li>
</ul>

<h2 id="2016-11-10">2016-11-10</h2>

<ul>
<li>Helping Megan Zandstra and CIAT with some questions about the REST API</li>
<li>Playing with <code>find-by-metadata-field</code>, this works:</li>
</ul>

<pre><code>$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "http://localhost:8080/rest/items/find-by-metadata-field" -d '{"key": "cg.subject.ilri","value": "SEEDS"}'
</code></pre>

<ul>
<li>But the results are deceiving because metadata fields can have text languages and your query must match exactly!</li>
</ul>

<pre><code>dspace=# select distinct text_value, text_lang from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS';
 text_value | text_lang
------------+-----------
 SEEDS      |
 SEEDS      |
 SEEDS      | en_US
(3 rows)
</code></pre>

<ul>
<li>So basically, the text language here could be null, blank, or en_US</li>
<li>To query metadata with these properties, you can do:</li>
</ul>

<pre><code>$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "http://localhost:8080/rest/items/find-by-metadata-field" -d '{"key": "cg.subject.ilri","value": "SEEDS"}' | jq length
55
$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "http://localhost:8080/rest/items/find-by-metadata-field" -d '{"key": "cg.subject.ilri","value": "SEEDS", "language":""}' | jq length
34
$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "http://localhost:8080/rest/items/find-by-metadata-field" -d '{"key": "cg.subject.ilri","value": "SEEDS", "language":"en_US"}' | jq length
</code></pre>

<ul>
<li>The results (55+34=89) don’t seem to match those from the database:</li>
</ul>

<pre><code>dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS' and text_lang is null;
 count
-------
    15
dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS' and text_lang='';
 count
-------
     4
dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS' and text_lang='en_US';
 count
-------
    66
</code></pre>

<ul>
<li>So, querying from the API I get 55 + 34 = 89 results, but the database actually only has 85…</li>
<li>And the <code>find-by-metadata-field</code> endpoint doesn’t seem to have a way to get all items with the field, or a wildcard value</li>
<li>I’ll ask a question on the dspace-tech mailing list</li>
<li>And speaking of <code>text_lang</code>, this is interesting:</li>
</ul>

<pre><code>dspacetest=# select distinct text_lang from metadatavalue where resource_type_id=2;
 text_lang
-----------

 ethnob
 en
 spa
 EN
 es
 frn
 en_
 en_US

 EN_US
 eng
 en_U
 fr
(14 rows)
</code></pre>

<ul>
<li>Generate a list of all these so I can fix them in batch:</li>
</ul>

<pre><code>dspace=# \copy (select distinct text_lang, count(*) from metadatavalue where resource_type_id=2 group by text_lang order by count desc) to /tmp/text-langs.csv with csv;
COPY 14
</code></pre>

<ul>
<li>Perhaps we need to fix them all in batch, or experiment with fixing only certain metadatavalues:</li>
</ul>

<pre><code>dspace=# update metadatavalue set text_lang='en_US' where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS';
UPDATE 85
</code></pre>

@ -131,6 +131,102 @@ COPY 22
<li>The error was <code>Timeout waiting for idle object</code> but I haven&rsquo;t looked into the Tomcat logs to see what happened</li>
<li>Also, I ran the corrections for CRPs from earlier this week</li>
</ul>

<h2 id="2016-11-10">2016-11-10</h2>

<ul>
<li>Helping Megan Zandstra and CIAT with some questions about the REST API</li>
<li>Playing with <code>find-by-metadata-field</code>, this works:</li>
</ul>

<pre><code>$ curl -s -H &quot;accept: application/json&quot; -H &quot;Content-Type: application/json&quot; -X POST &quot;http://localhost:8080/rest/items/find-by-metadata-field&quot; -d '{&quot;key&quot;: &quot;cg.subject.ilri&quot;,&quot;value&quot;: &quot;SEEDS&quot;}'
</code></pre>

<ul>
<li>But the results are deceiving because metadata fields can have text languages and your query must match exactly!</li>
</ul>

<pre><code>dspace=# select distinct text_value, text_lang from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS';
 text_value | text_lang
------------+-----------
 SEEDS      |
 SEEDS      |
 SEEDS      | en_US
(3 rows)
</code></pre>

<ul>
<li>So basically, the text language here could be null, blank, or en_US</li>
<li>To query metadata with these properties, you can do:</li>
</ul>

<pre><code>$ curl -s -H &quot;accept: application/json&quot; -H &quot;Content-Type: application/json&quot; -X POST &quot;http://localhost:8080/rest/items/find-by-metadata-field&quot; -d '{&quot;key&quot;: &quot;cg.subject.ilri&quot;,&quot;value&quot;: &quot;SEEDS&quot;}' | jq length
55
$ curl -s -H &quot;accept: application/json&quot; -H &quot;Content-Type: application/json&quot; -X POST &quot;http://localhost:8080/rest/items/find-by-metadata-field&quot; -d '{&quot;key&quot;: &quot;cg.subject.ilri&quot;,&quot;value&quot;: &quot;SEEDS&quot;, &quot;language&quot;:&quot;&quot;}' | jq length
34
$ curl -s -H &quot;accept: application/json&quot; -H &quot;Content-Type: application/json&quot; -X POST &quot;http://localhost:8080/rest/items/find-by-metadata-field&quot; -d '{&quot;key&quot;: &quot;cg.subject.ilri&quot;,&quot;value&quot;: &quot;SEEDS&quot;, &quot;language&quot;:&quot;en_US&quot;}' | jq length
</code></pre>

<ul>
<li>The results (55+34=89) don&rsquo;t seem to match those from the database:</li>
</ul>

<pre><code>dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS' and text_lang is null;
 count
-------
    15
dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS' and text_lang='';
 count
-------
     4
dspace=# select count(text_value) from metadatavalue where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS' and text_lang='en_US';
 count
-------
    66
</code></pre>

<ul>
<li>So, querying from the API I get 55 + 34 = 89 results, but the database actually only has 85&hellip;</li>
<li>And the <code>find-by-metadata-field</code> endpoint doesn&rsquo;t seem to have a way to get all items with the field, or a wildcard value</li>
<li>I&rsquo;ll ask a question on the dspace-tech mailing list</li>
<li>And speaking of <code>text_lang</code>, this is interesting:</li>
</ul>

<pre><code>dspacetest=# select distinct text_lang from metadatavalue where resource_type_id=2;
 text_lang
-----------

 ethnob
 en
 spa
 EN
 es
 frn
 en_
 en_US

 EN_US
 eng
 en_U
 fr
(14 rows)
</code></pre>

<ul>
<li>Generate a list of all these so I can fix them in batch:</li>
</ul>

<pre><code>dspace=# \copy (select distinct text_lang, count(*) from metadatavalue where resource_type_id=2 group by text_lang order by count desc) to /tmp/text-langs.csv with csv;
COPY 14
</code></pre>

<ul>
<li>Perhaps we need to fix them all in batch, or experiment with fixing only certain metadatavalues:</li>
</ul>

<pre><code>dspace=# update metadatavalue set text_lang='en_US' where resource_type_id=2 and metadata_field_id=203 and text_value='SEEDS';
UPDATE 85
</code></pre>
</description>
</item>
