<meta property="og:description" content="2023-12-01 There is still high load on CGSpace and I don’t know why I don’t see a high number of sessions compared to previous days in the last few weeks $ for file in dspace.log.2023-11-[23]*; do echo &quot;$file&quot;; grep -a -oE 'session_id=[A-Z0-9]{32}' &quot;$file&quot; | sort | uniq | wc -l; done dspace.log.2023-11-20 22865 dspace.log.2023-11-21 20296 dspace.log.2023-11-22 19688 dspace.log.2023-11-23 17906 dspace.log.2023-11-24 18453 dspace.log.2023-11-25 17513 dspace.log.2023-11-26 19037 dspace.log.2023-11-27 21103 dspace.log.2023-11-28 23023 dspace.log.2023-11-29 23545 dspace."/>
<li>Sent a message to Altmetric support because the item IWMI highlighted last month still doesn’t show the attention score for the Handle, even though I tweeted it several times weeks ago</li>
<li>Spent some time writing a Python script to fix the literal MaxMind City JSON objects in our Solr statistics
<ul>
<li>There are about 1.6 million of these, so I exported them using solr-import-export-json with the query <code>city:com*</code> but ended up finding many that have missing bundles, container bitstreams, etc:</li>
</ul>
</li>
</ul>
<pre tabindex="0"><code>city:com* AND -bundleName:[* TO *] AND -containerBitstream:[* TO *] AND -file_id:[* TO *] AND -owningItem:[* TO *] AND -version_id:[* TO *]
</code></pre><ul>
<li>(Note the negation to find fields that are missing)</li>
<li>I don’t know what I want to do with these yet</li>
</ul>
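<p>As a rough illustration of the negation pattern in the query above, the same string can be assembled from a list of field names. This is only a sketch: the helper name <code>build_missing_fields_query</code> is hypothetical and is not part of <code>fix_maxmind_stats.py</code>.</p>

```python
def build_missing_fields_query(base_query, fields):
    """Build a Solr query for documents matching base_query that lack every field in `fields`."""
    # "-field:[* TO *]" negates the "field has any value" range query,
    # so each clause keeps only documents where that field is missing
    clauses = [f"-{field}:[* TO *]" for field in fields]
    return " AND ".join([base_query, *clauses])


# Field names are taken from the query in the notes above
query = build_missing_fields_query(
    "city:com*",
    ["bundleName", "containerBitstream", "file_id", "owningItem", "version_id"],
)
print(query)
```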
<h2 id="2023-12-05">2023-12-05</h2>
<ul>
<li>I finished the <code>fix_maxmind_stats.py</code> script and fixed 1.6 million records and imported them on CGSpace after testing on DSpace 7 Test</li>
<li>Altmetric said there was a glitch regarding the Handle and DOI linking and they successfully re-scraped the item page and linked them
<ul>
<li>They sent me a list of current production IPs and I notice that some of them are in our nginx bot network list:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ <span style="color:#66d9ef">for</span> network in <span style="color:#66d9ef">$(</span>csvcut -c network /tmp/ips.csv | sed 1d | sort -u<span style="color:#66d9ef">)</span>; <span style="color:#66d9ef">do</span> grepcidr $network ~/src/git/rmg-ansible-public/roles/dspace/files/nginx/bot-networks.conf; <span style="color:#66d9ef">done</span></span></span></code></pre></div>
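<p>A similar check could be sketched in Python with the standard <code>ipaddress</code> module. This only approximates what <code>grepcidr</code> does, the function name <code>overlapping_networks</code> is hypothetical, and the networks below are made-up documentation ranges rather than the real Altmetric or bot list entries.</p>

```python
import ipaddress


def overlapping_networks(candidates, bot_networks):
    """Return (candidate, bot_network) pairs whose address ranges overlap."""
    bots = [ipaddress.ip_network(n, strict=False) for n in bot_networks]
    hits = []
    for candidate in candidates:
        net = ipaddress.ip_network(candidate, strict=False)
        for bot in bots:
            # Only compare networks of the same IP version
            if net.version == bot.version and net.overlaps(bot):
                hits.append((str(net), str(bot)))
    return hits


# Placeholder data (RFC 5737 documentation ranges), not real production IPs
candidates = ["192.0.2.0/28", "198.51.100.0/24"]
bot_networks = ["192.0.2.0/24", "203.0.113.0/24"]
hits = overlapping_networks(candidates, bot_networks)
print(hits)
```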
<li>Finalized the script to generate Solr statistics for Alliance researcher Mirjam
<ul>
<li>The script is <code>ilri/generate_solr_statistics.py</code></li>
<li>I generated ~3,200 statistics based on her records of the download statistics of <a href="https://hdl.handle.net/10568/131997">that item</a> and imported them on CGSpace</li>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>localhost/dspace7= ☘ \COPY (SELECT DISTINCT text_value AS "dc.contributor.author", count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id = 3 GROUP BY "dc.contributor.author" ORDER BY count DESC) to /tmp/2023-12-08-authors.csv WITH CSV HEADER;</span></span></code></pre></div>
<li>The Alliance TIP team is testing deposits to the DSpace 7 REST API and getting an HTTP 500 error
<ul>
<li>In the DSpace logs I see this after they log in, create the item, and update the metadata:</li>
</ul>
</li>
</ul>
<pre tabindex="0"><code>2023-12-19 17:49:28,022 ERROR unknown unknown org.dspace.rest.Resource @ Something get wrong. Aborting context in finally statement.
</code></pre><ul>
<li>I found some messages on the dspace-tech mailing list suggesting this might be an old bug: <a href="https://groups.google.com/g/dspace-tech/c/My1GUFYFGoU/m/tS7-WAJPAwAJ">https://groups.google.com/g/dspace-tech/c/My1GUFYFGoU/m/tS7-WAJPAwAJ</a>
<ul>
<li>I restarted Tomcat and told the Alliance TIP team to try again</li>
</ul>
</li>
</ul>
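<p>To pull lines like the error above out of a DSpace log, a small parser could look roughly like this. It is a sketch that assumes every line follows the same layout as the single line quoted above (timestamp, level, context, then <code>@</code> and the message); the pattern and the name <code>parse_log_line</code> are illustrative, not part of DSpace.</p>

```python
import re

# Assumed layout: "<timestamp> <LEVEL> <context> @ <message>"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<context>.*?) @ "
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict of the line's parts, or None if it does not match the assumed layout."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


sample = (
    "2023-12-19 17:49:28,022 ERROR unknown unknown org.dspace.rest.Resource "
    "@ Something get wrong. Aborting context in finally statement."
)
entry = parse_log_line(sample)
print(entry["level"], entry["message"])
```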
<h2 id="2023-12-20">2023-12-20</h2>
<ul>
<li>The Alliance guys said that submitting via REST works now… sigh, so that’s just some old DSpace 5/6 REST API bug</li>
<li>I lowercased all our AGROVOC keywords in <code>dcterms.subject</code> in SQL:</li>
</span></span><span style="display:flex;"><span>dspace=*# UPDATE metadatavalue SET text_value=LOWER(text_value) WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=187 AND text_value ~ '[[:upper:]]';