From 13f4d47ed8260f2ab00d8ac59bf495b43db01720 Mon Sep 17 00:00:00 2001
From: Alan Orth
Date: Tue, 28 Jan 2020 17:37:27 +0200
Subject: [PATCH] Update notes for 2020-01-28

---
 content/posts/2020-01.md | 23 +++++++++++++++++++++++
 docs/2020-01/index.html  | 32 +++++++++++++++++++++++++++++++-
 2 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/content/posts/2020-01.md b/content/posts/2020-01.md
index 2bfdfd1d1..4cb2433ea 100644
--- a/content/posts/2020-01.md
+++ b/content/posts/2020-01.md
@@ -301,4 +301,27 @@ org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError:
 - I made a [pull request](https://github.com/ilri/DSpace/pull/443) and merged it to the `5_x-prod` branch and will deploy on CGSpace later tonight
 - I am curious if anyone on the dspace-tech mailing list has run into this, so I will try to send a message about this there when I get a chance
 
+## 2020-01-28
+
+- Generate a list of CIP subjects for Abenet:
+
+```
+dspace=# \COPY (SELECT DISTINCT text_value as "cg.subject.cip", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 127 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-01-28-cip-subjects.csv WITH CSV HEADER;
+COPY 77
+```
+
+- Start looking over the IITA records from earlier this month ([IITA_201907_Jan13](https://dspacetest.cgiar.org/handle/10568/106567))
+  - Delete one duplicate, map one item from ILRI community
+  - The following items are duplicates or something (there is not enough metadata to tell for sure):
+    - https://dspacetest.cgiar.org/handle/10568/106682
+    - https://dspacetest.cgiar.org/handle/10568/106653
+    - https://dspacetest.cgiar.org/handle/10568/106694
+  - This item doesn't exist in the journal, and Weed Science volume 55 was published in 2007, not 2003:
+    - https://dspacetest.cgiar.org/handle/10568/106665
+  - All items using `cg.journal.title` instead of `dc.source`
+  - Several items were missing ISSN despite having a journal title
+  - Many items were missing DOIs, abstracts, etc
+  - I did some metadata enrichment by searching for the items and copying relevant data from journal pages
+  - I asked Bosede to try to do the same for the rest of the journal articles
+
diff --git a/docs/2020-01/index.html b/docs/2020-01/index.html
index 46cc6fb31..865ee39b9 100644
--- a/docs/2020-01/index.html
+++ b/docs/2020-01/index.html
@@ -63,7 +63,7 @@ I tweeted the CGSpace repository link
   "@type": "BlogPosting",
   "headline": "January, 2020",
   "url": "https:\/\/alanorth.github.io\/cgspace-notes\/2020-01\/",
-  "wordCount": "2754",
+  "wordCount": "2910",
   "datePublished": "2020-01-06T10:48:30+02:00",
   "dateModified": "2020-01-27T16:20:44+02:00",
   "author": {
@@ -446,6 +446,36 @@ org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError:
+

2020-01-28

+ +
dspace=# \COPY (SELECT DISTINCT text_value as "cg.subject.cip", count(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 127 GROUP BY text_value ORDER BY count DESC) to /tmp/2020-01-28-cip-subjects.csv WITH CSV HEADER;
+COPY 77
+