<!DOCTYPE html>
< html lang = "en" >
< head >
< meta charset = "utf-8" >
< meta name = "viewport" content = "width=device-width, initial-scale=1, shrink-to-fit=no" >
< meta property = "og:title" content = "February, 2024" / >
< meta property = "og:description" content = "2024-02-05
Delete duplicate metadata as described in my DSpace issue from last year: https://github.com/DSpace/DSpace/issues/8253
Lower case all the AGROVOC subjects on CGSpace
" />
< meta property = "og:type" content = "article" / >
< meta property = "og:url" content = "https://alanorth.github.io/cgspace-notes/2024-01/" / >
< meta property = "article:published_time" content = "2024-01-05T11:10:00+03:00" / >
< meta property = "article:modified_time" content = "2024-02-27T17:18:35+03:00" / >
< meta name = "twitter:card" content = "summary" / >
< meta name = "twitter:title" content = "February, 2024" / >
< meta name = "twitter:description" content = "2024-02-05
Delete duplicate metadata as described in my DSpace issue from last year: https://github.com/DSpace/DSpace/issues/8253
Lower case all the AGROVOC subjects on CGSpace
"/>
< meta name = "generator" content = "Hugo 0.123.6" >
< script type = "application/ld+json" >
{
"@context": "http://schema.org",
"@type": "BlogPosting",
"headline": "February, 2024",
"url": "https://alanorth.github.io/cgspace-notes/2024-01/",
"wordCount": "551",
"datePublished": "2024-01-05T11:10:00+03:00",
"dateModified": "2024-02-27T17:18:35+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"keywords": "Notes"
}
< / script >
< link rel = "canonical" href = "https://alanorth.github.io/cgspace-notes/2024-01/" >
< title > February, 2024 | CGSpace Notes< / title >
<!-- combined, minified CSS -->
< link href = "https://alanorth.github.io/cgspace-notes/css/style.c6ba80bc50669557645abe05f86b73cc5af84408ed20f1551a267bc19ece8228.css" rel = "stylesheet" integrity = "sha256-xrqAvFBmlVdkWr4F+GtzzFr4RAjtIPFVGiZ7wZ7Ogig=" crossorigin = "anonymous" >
<!-- minified Font Awesome for SVG icons -->
< script defer src = "https://alanorth.github.io/cgspace-notes/js/fontawesome.min.f5072c55a0721857184db93a50561d7dc13975b4de2e19db7f81eb5f3fa57270.js" integrity = "sha256-9QcsVaByGFcYTbk6UFYdfcE5dbTeLhnbf4HrXz+lcnA=" crossorigin = "anonymous" > < / script >
<!-- RSS 2.0 feed -->
< / head >
< body >
< div class = "blog-masthead" >
< div class = "container" >
< nav class = "nav blog-nav" >
< a class = "nav-link " href = "https://alanorth.github.io/cgspace-notes/" > Home< / a >
< / nav >
< / div >
< / div >
< header class = "blog-header" >
< div class = "container" >
< h1 class = "blog-title" dir = "auto" > < a href = "https://alanorth.github.io/cgspace-notes/" rel = "home" > CGSpace Notes< / a > < / h1 >
< p class = "lead blog-description" dir = "auto" > Documenting day-to-day work on the < a href = "https://cgspace.cgiar.org" > CGSpace< / a > repository.< / p >
< / div >
< / header >
< div class = "container" >
< div class = "row" >
< div class = "col-sm-8 blog-main" >
< article class = "blog-post" >
< header >
< h2 class = "blog-post-title" dir = "auto" > < a href = "https://alanorth.github.io/cgspace-notes/2024-01/" > February, 2024< / a > < / h2 >
< p class = "blog-post-meta" >
< time datetime = "2024-01-05T11:10:00+03:00" > Fri Jan 05, 2024< / time >
in
< span class = "fas fa-folder" aria-hidden = "true" > < / span > < a href = "/categories/notes/" rel = "category tag" > Notes< / a >
< / p >
< / header >
< h2 id = "2024-02-05" > 2024-02-05< / h2 >
< ul >
< li > Delete duplicate metadata as described in my DSpace issue from last year: < a href = "https://github.com/DSpace/DSpace/issues/8253" > https://github.com/DSpace/DSpace/issues/8253< / a > < / li >
< li > Lower case all the AGROVOC subjects on CGSpace< / li >
< / ul >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-sql" data-lang = "sql" > < span style = "display:flex;" > < span > dspace< span style = "color:#f92672" > =#< / span > < span style = "color:#66d9ef" > BEGIN< / span > ;
< / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#66d9ef" > BEGIN< / span >
< / span > < / span > < span style = "display:flex;" > < span > dspace< span style = "color:#f92672" > =*#< / span > < span style = "color:#66d9ef" > UPDATE< / span > metadatavalue < span style = "color:#66d9ef" > SET< / span > text_value< span style = "color:#f92672" > =< / span > < span style = "color:#66d9ef" > LOWER< / span > (text_value) < span style = "color:#66d9ef" > WHERE< / span > dspace_object_id < span style = "color:#66d9ef" > IN< / span > (< span style = "color:#66d9ef" > SELECT< / span > uuid < span style = "color:#66d9ef" > FROM< / span > item) < span style = "color:#66d9ef" > AND< / span > metadata_field_id< span style = "color:#f92672" > =< / span > < span style = "color:#ae81ff" > 187< / span > < span style = "color:#66d9ef" > AND< / span > text_value < span style = "color:#f92672" > ~< / span > < span style = "color:#e6db74" > '[[:upper:]]'< / span > ;
< / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#66d9ef" > UPDATE< / span > < span style = "color:#ae81ff" > 180< / span >
< / span > < / span > < span style = "display:flex;" > < span > dspace< span style = "color:#f92672" > =*#< / span > < span style = "color:#66d9ef" > COMMIT< / span > ;
< / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#66d9ef" > COMMIT< / span >
< / span > < / span > < / code > < / pre > < / div > < h2 id = "2024-02-06" > 2024-02-06< / h2 >
< ul >
< li > Discuss IWMI using the CGSpace REST API for their new website< / li >
< li > Export the IWMI community to extract their ORCID identifiers:< / li >
< / ul >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ dspace metadata-export -i 10568/16814 -f /tmp/iwmi.csv
< / span > < / span > < span style = "display:flex;" > < span > $ csvcut -c < span style = "color:#e6db74" > 'cg.creator.identifier,cg.creator.identifier[en_US]'< / span > ~/Downloads/2024-02-06-iwmi.csv < span style = "color:#ae81ff" > \
< / span > < / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#ae81ff" > < / span > | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' \
< / span > < / span > < span style = "display:flex;" > < span > | sort -u \
< / span > < / span > < span style = "display:flex;" > < span > | tee /tmp/iwmi-orcids.txt \
< / span > < / span > < span style = "display:flex;" > < span > | wc -l
< / span > < / span > < span style = "display:flex;" > < span > 353
< / span > < / span > < span style = "display:flex;" > < span > $ ./ilri/resolve_orcids.py -i /tmp/iwmi-orcids.txt -o /tmp/iwmi-orcids-names.csv -d
< / span > < / span > < / code > < / pre > < / div > < ul >
< li > I noticed some similar-looking names in our list, so I clustered them in OpenRefine and manually checked a dozen or so to update our list< / li >
< / ul >
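< p > As a rough cross-check of the grep pipeline above, the same extraction could be sketched in Python. This is only an illustration under stated assumptions: the < code >extract_orcids< / code > helper, the sample rows, and the stricter pattern (digits only, with an optional final “X” check character) are hypothetical and not part of the scripts used in these notes.< / p >
< div class = "highlight" > < pre tabindex = "0" > < code class = "language-python" data-lang = "python" >
import csv
import io
import re

# Assumption: ORCID iDs are four groups of four digits, where the final
# character may be "X". This is stricter than the grep pattern above.
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def extract_orcids(csv_text, columns):
    """Return the sorted, de-duplicated ORCID iDs found in the given columns."""
    found = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column in columns:
            # A column missing from the export yields None, so fall back to ""
            found.update(ORCID_PATTERN.findall(row.get(column) or ""))
    return sorted(found)


# Hypothetical sample rows, not taken from the real IWMI export
sample = (
    "id,cg.creator.identifier\n"
    '1,"Jane Doe: 0000-0002-1825-0097||John Doe: 0000-0001-5109-3700"\n'
    '2,"Jane Doe: 0000-0002-1825-0097"\n'
)
orcids = extract_orcids(
    sample, ["cg.creator.identifier", "cg.creator.identifier[en_US]"]
)
print(len(orcids))
< / code > < / pre > < / div >
< p > Collecting matches in a set plays the role of < code >sort -u< / code >, and the length of the result corresponds to the < code >wc -l< / code > count.< / p >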
< h2 id = "2024-02-07" > 2024-02-07< / h2 >
< ul >
< li > Maria asked me about the “missing” item from last week again
< ul >
< li > I could see it when I used the Admin search, but not in her workflow< / li >
< li > It was submitted by TIP, so I checked that user’s workspace and found it there< / li >
< li > After depositing, it went into the workflow so Maria should be able to see it now< / li >
< / ul >
< / li >
< / ul >
< h2 id = "2024-02-09" > 2024-02-09< / h2 >
< ul >
< li > Minor edits to CGSpace submission form< / li >
< li > Upload 55 ISNAR book chapters to CGSpace from Peter< / li >
< / ul >
< h2 id = "2024-02-19" > 2024-02-19< / h2 >
< ul >
< li > Looking into the collection mapping issue on CGSpace
< ul >
< li > It seems to be by design in DSpace 7: < a href = "https://github.com/DSpace/dspace-angular/issues/1203" > https://github.com/DSpace/dspace-angular/issues/1203< / a > < / li >
< li > This is a massive setback for us…< / li >
< / ul >
< / li >
< / ul >
< h2 id = "2024-02-20" > 2024-02-20< / h2 >
< ul >
< li > Minor work on OpenRXV to fix a bug in the ng-select drop downs< / li >
< li > Minor work on the DSpace 7 nginx configuration to allow requesting robots.txt and sitemaps without hitting rate limits< / li >
< / ul >
< h2 id = "2024-02-21" > 2024-02-21< / h2 >
< ul >
< li > Minor updates on OpenRXV, including one bug fix for missing mapped collections
< ul >
< li > Salem had to re-work the harvester for DSpace 7 since the mapped collections and parent collection list are separate!< / li >
< / ul >
< / li >
< / ul >
< h2 id = "2024-02-22" > 2024-02-22< / h2 >
< ul >
< li > Discuss tagging of datasets and re-work the submission form to encourage use of the DOI field for any item that has a DOI, and the normal URL field if not
< ul >
< li > The “cg.identifier.dataurl” field will be used for “related” datasets< / li >
< li > I still have to check and move some metadata for existing datasets< / li >
< / ul >
< / li >
< / ul >
< h2 id = "2024-02-23" > 2024-02-23< / h2 >
< ul >
< li > This morning Tomcat died due to an OOM kill from the kernel:< / li >
< / ul >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > kernel: Out of memory: Killed process 698 (java) total-vm:14151300kB, anon-rss:9665812kB, file-rss:320kB, shmem-rss:0kB, UID:997 pgtables:20436kB oom_score_adj:0
< / span > < / span > < / code > < / pre > < / div > < ul >
< li > I don’t see any abnormal pattern in my Grafana graphs, for JVM or system load… very weird< / li >
< li > I updated the submission form on CGSpace to include the new changes to URLs for datasets
< ul >
< li > I also updated about 80 datasets to move the URLs to the correct field< / li >
< / ul >
< / li >
< / ul >
< h2 id = "2024-02-25" > 2024-02-25< / h2 >
< ul >
< li > This morning Tomcat died while I was doing a CSV export, with an OOM kill from the kernel:< / li >
< / ul >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > kernel: Out of memory: Killed process 720768 (java) total-vm:14079976kB, anon-rss:9301684kB, file-rss:152kB, shmem-rss:0kB, UID:997 pgtables:19488kB oom_score_adj:0
< / span > < / span > < / code > < / pre > < / div > < ul >
< li > I don’t know why this is happening so often recently…< / li >
< / ul >
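< p > To compare OOM kills like the ones quoted in these notes, a small Python sketch could pull the numbers out of the kernel lines and convert them to GiB. The < code >parse_oom_line< / code > helper and its pattern are assumptions written for this illustration, and the conversion assumes the kernel's “kB” figures are units of 1024 bytes.< / p >
< div class = "highlight" > < pre tabindex = "0" > < code class = "language-python" data-lang = "python" >
import re

# Matches the kernel OOM-kill lines quoted in these notes.
# Assumption: the "kB" values are units of 1024 bytes.
OOM_PATTERN = re.compile(
    r"Killed process (\d+) \((\S+)\) total-vm:(\d+)kB, anon-rss:(\d+)kB"
)


def parse_oom_line(line):
    """Return pid, process name, and memory sizes in GiB, or None if no match."""
    match = OOM_PATTERN.search(line)
    if match is None:
        return None
    pid, name, total_vm_kb, anon_rss_kb = match.groups()
    kib_per_gib = 1024 * 1024
    return {
        "pid": int(pid),
        "name": name,
        "total_vm_gib": int(total_vm_kb) / kib_per_gib,
        "anon_rss_gib": int(anon_rss_kb) / kib_per_gib,
    }


# The two kernel messages from 2024-02-23 and 2024-02-25
lines = [
    "kernel: Out of memory: Killed process 698 (java) total-vm:14151300kB, "
    "anon-rss:9665812kB, file-rss:320kB, shmem-rss:0kB, UID:997 "
    "pgtables:20436kB oom_score_adj:0",
    "kernel: Out of memory: Killed process 720768 (java) total-vm:14079976kB, "
    "anon-rss:9301684kB, file-rss:152kB, shmem-rss:0kB, UID:997 "
    "pgtables:19488kB oom_score_adj:0",
]
results = [parse_oom_line(line) for line in lines]
for result in results:
    print(result["pid"], result["name"], round(result["anon_rss_gib"], 2))
< / code > < / pre > < / div >
< p > Putting the resident size of each kill side by side makes it easier to see whether the Java process was at a similar size each time.< / p >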
< h2 id = "2024-02-27" > 2024-02-27< / h2 >
< ul >
< li > IFPRI sent me a list of authors to add to our list for now, until we can find a better way of doing it
< ul >
< li > I extracted the existing authors from our controlled vocabulary and combined them with IFPRI’s:< / li >
< / ul >
< / li >
< / ul >
< div class = "highlight" > < pre tabindex = "0" style = "color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;" > < code class = "language-console" data-lang = "console" > < span style = "display:flex;" > < span > $ xmllint --xpath < span style = "color:#e6db74" > '//node/isComposedBy/node()'< / span > dspace/config/controlled-vocabularies/dc-contributor-author.xml < span style = "color:#ae81ff" > \
< / span > < / span > < / span > < span style = "display:flex;" > < span > < span style = "color:#ae81ff" > < / span > | grep -oE 'label=".*"' \
< / span > < / span > < span style = "display:flex;" > < span > | sed -e 's/label="//' -e 's/"$//' > /tmp/authors
< / span > < / span > < span style = "display:flex;" > < span > $ cat /tmp/authors /tmp/ifpri-authors | sort -u > /tmp/new-authors
< / span > < / span > < / code > < / pre > < / div > < h2 id = "2024-02-28" > 2024-02-28< / h2 >
< ul >
< li > I figured out a way to add a new Angular component to handle all our relation fields< / li >
< / ul >
<!-- raw HTML omitted -->
< / article >
< / div > <!-- /.blog-main -->
< aside class = "col-sm-3 ml-auto blog-sidebar" >
< section class = "sidebar-module" >
< h4 > Recent Posts< / h4 >
< ol class = "list-unstyled" >
< li > < a href = "/cgspace-notes/2024-01/" > February, 2024< / a > < / li >
< li > < a href = "/cgspace-notes/2024-01/" > January, 2024< / a > < / li >
< li > < a href = "/cgspace-notes/2023-12/" > December, 2023< / a > < / li >
< li > < a href = "/cgspace-notes/2023-11/" > November, 2023< / a > < / li >
< li > < a href = "/cgspace-notes/2023-10/" > October, 2023< / a > < / li >
< / ol >
< / section >
< section class = "sidebar-module" >
< h4 > Links< / h4 >
< ol class = "list-unstyled" >
< li > < a href = "https://cgspace.cgiar.org" > CGSpace< / a > < / li >
< li > < a href = "https://dspacetest.cgiar.org" > DSpace Test< / a > < / li >
< li > < a href = "https://github.com/ilri/DSpace" > CGSpace @ GitHub< / a > < / li >
< / ol >
< / section >
< / aside >
< / div > <!-- /.row -->
< / div > <!-- /.container -->
< footer class = "blog-footer" >
< p dir = "auto" >
Blog template created by < a href = "https://twitter.com/mdo" > @mdo< / a > , ported to Hugo by < a href = 'https://twitter.com/mralanorth' > @mralanorth< / a > .
< / p >
< p >
< a href = "#" > Back to top< / a >
< / p >
< / footer >
< / body >
< / html >