diff --git a/content/post/2017-05.md b/content/post/2017-05.md
index 584067be5..8de99f28f 100644
--- a/content/post/2017-05.md
+++ b/content/post/2017-05.md
@@ -126,3 +126,9 @@ $ for item in /home/aorth/10947-1/ITEM@10947-*; do [dspace]/bin/dspace packager
 ```
 dspace=# delete from metadatavalue where resource_type_id=2 and text_value='';
 ```
+
+## 2017-05-13
+
+- After quite a bit of troubleshooting with importing cleaned up data as CSV, it seems that there are actually [NUL](https://en.wikipedia.org/wiki/Null_character) characters in the `dc.description.abstract` field (at least) on the lines where CSV importing was failing
+- I tried to find a way to remove the characters in vim or Open Refine, but decided it was quicker to just remove the column temporarily and import it
+- The import was successful and detected 2022 changes, which should likely be the rest that were failing to import before
diff --git a/public/2017-05/index.html b/public/2017-05/index.html
index 78898707d..461dbf830 100644
--- a/public/2017-05/index.html
+++ b/public/2017-05/index.html
@@ -13,7 +13,7 @@
-
+
@@ -45,9 +45,9 @@
 "@type": "BlogPosting",
 "headline": "May, 2017",
 "url": "https://alanorth.github.io/cgspace-notes/2017-05/",
- "wordCount": "1037",
+ "wordCount": "1122",
 "datePublished": "2017-05-01T16:21:52+02:00",
- "dateModified": "2017-05-10T11:20:27+03:00",
+ "dateModified": "2017-05-10T23:44:44+03:00",
 "author": {
 "@type": "Person",
 "name": "Alan Orth"
@@ -263,6 +263,14 @@ $ for item in /home/aorth/10947-1/ITEM@10947-*; do [dspace]/bin/dspace packager
dspace=# delete from metadatavalue where resource_type_id=2 and text_value='';
+dc.description.abstract field (at least) on the lines where CSV importing was failing
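
Note on the 2017-05-13 entry: the notes work around the NUL characters by temporarily dropping the affected column. As an alternative, the characters could probably be stripped from the CSV before import. The following is only a minimal sketch, not what was done in the notes; the function names `strip_nul` and `clean_csv`, the file paths, and the assumption that the export is UTF-8 are all hypothetical.

```python
def strip_nul(text: str) -> str:
    """Return text with every NUL (U+0000) character removed."""
    return text.replace("\x00", "")


def clean_csv(src_path: str, dst_path: str, encoding: str = "utf-8") -> int:
    """Copy src_path to dst_path without NUL characters.

    Returns the number of NUL characters that were removed.
    The UTF-8 default is an assumption about the exported file.
    """
    # newline="" keeps the original line endings untouched, so quoted
    # multi-line fields such as abstracts are not altered.
    with open(src_path, "r", encoding=encoding, newline="") as src:
        data = src.read()

    removed = data.count("\x00")

    with open(dst_path, "w", encoding=encoding, newline="") as dst:
        dst.write(strip_nul(data))

    return removed
```

Called as `clean_csv("in.csv", "out.csv")` (hypothetical file names), it reads the whole file into memory, which should be fine for a metadata export of a few thousand rows. The returned count gives a quick check of how many NUL characters were actually present.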