mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Update notes
@ -11,11 +11,12 @@
Remove cg.subject.wle and cg.identifier.wletheme from CGSpace input form after confirming with IWMI colleagues that they no longer need them (WLE closed in 2021)
iso-codes 4.13.0 was released, which incorporates my changes to the common names for Iran, Laos, and Syria
I finally got through with porting the input form from DSpace 6 to DSpace 7
" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2023-03/" />
<meta property="article:published_time" content="2023-03-01T07:58:36+03:00" />
<meta property="article:modified_time" content="2023-03-01T07:58:36+03:00" />
<meta property="article:modified_time" content="2023-03-01T08:30:25+03:00" />
@ -25,6 +26,7 @@ iso-codes 4.13.0 was released, which incorporates my changes to the common names
Remove cg.subject.wle and cg.identifier.wletheme from CGSpace input form after confirming with IWMI colleagues that they no longer need them (WLE closed in 2021)
iso-codes 4.13.0 was released, which incorporates my changes to the common names for Iran, Laos, and Syria
I finally got through with porting the input form from DSpace 6 to DSpace 7
"/>
<meta name="generator" content="Hugo 0.110.0">
@ -36,9 +38,9 @@ iso-codes 4.13.0 was released, which incorporates my changes to the common names
"@type": "BlogPosting",
"headline": "March, 2023",
"url": "https://alanorth.github.io/cgspace-notes/2023-03/",
"wordCount": "41",
"wordCount": "380",
"datePublished": "2023-03-01T07:58:36+03:00",
"dateModified": "2023-03-01T07:58:36+03:00",
"dateModified": "2023-03-01T08:30:25+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -116,8 +118,70 @@ iso-codes 4.13.0 was released, which incorporates my changes to the common names
<ul>
<li>Remove <code>cg.subject.wle</code> and <code>cg.identifier.wletheme</code> from CGSpace input form after confirming with IWMI colleagues that they no longer need them (WLE closed in 2021)</li>
<li><a href="https://salsa.debian.org/iso-codes-team/iso-codes/-/blob/main/CHANGELOG.md#4130-2023-02-28">iso-codes 4.13.0 was released</a>, which incorporates my changes to the common names for Iran, Laos, and Syria</li>
<li>I finally got through with porting the input form from DSpace 6 to DSpace 7</li>
</ul>
<!-- raw HTML omitted -->
<ul>
<li>I can’t put my finger on it, but the input form has to be formatted very particularly: for example, if your rows have more than two fields in them without a sufficient Bootstrap grid style, or if you use a <code>twobox</code>, etc, the entire form step appears blank</li>
</ul>
<h2 id="2023-03-02">2023-03-02</h2>
<ul>
<li>I did some experiments with the new <a href="https://datapythonista.me/blog/pandas-20-and-the-arrow-revolution-part-i">Pandas 2.0.0rc0 Apache Arrow support</a>
<ul>
<li>There is a change to the way nulls are handled and it causes my tests for <code>pd.isna(field)</code> to fail</li>
<li>I think we need to consider blanks as null, but I’m not sure</li>
</ul>
</li>
<li>I made some adjustments to the Discovery sidebar facets on DSpace 6 while I was looking at the DSpace 7 configuration
<ul>
<li>I downgraded CIFOR subject, Humidtropics subject, Drylands subject, ICARDA subject, and Language from DiscoverySearchFilterFacet to DiscoverySearchFilter in <code>discovery.xml</code> since we are no longer using them in sidebar facets</li>
</ul>
</li>
</ul>
<h2 id="2023-03-03">2023-03-03</h2>
<ul>
<li>Atmire merged one of my old pull requests into COUNTER-Robots:
<ul>
<li><a href="https://github.com/atmire/COUNTER-Robots/pull/54">COUNTER_Robots_list.json: Add new bots</a></li>
</ul>
</li>
<li>I will update the local ILRI overrides in our DSpace spider agents file</li>
</ul>
<h2 id="2023-03-04">2023-03-04</h2>
<ul>
<li>Submit a <a href="https://github.com/flyingcircusio/pycountry/pull/156">pull request on pycountry to use iso-codes 4.13.0</a></li>
</ul>
<h2 id="2023-03-05">2023-03-05</h2>
<ul>
<li>Start a harvest on AReS</li>
</ul>
<h2 id="2023-03-06">2023-03-06</h2>
<ul>
<li>Export CGSpace to do Initiative collection mappings
<ul>
<li>There were thirty-three that needed updating</li>
</ul>
</li>
<li>Send Abenet and Sam a list of twenty-one CAS publications that had been marked as “multiple documents” that we uploaded as metadata-only items
<ul>
<li>Goshu will download the PDFs for each and upload them to the items on CGSpace manually</li>
</ul>
</li>
<li>I spent some time trying to get csv-metadata-quality working with the new Arrow backend for Pandas 2.0.0rc0
<ul>
<li>It seems there is a problem recognizing empty strings as NA with <code>pd.isna()</code></li>
<li>If I do <code>pd.isna(field) or field == ""</code> then it works as expected, but that feels hacky</li>
<li>I’m going to test again on the next release…</li>
<li>Note that I had been setting both of these global options:</li>
</ul>
</li>
</ul>
<pre tabindex="0"><code>pd.options.mode.dtype_backend = 'pyarrow'
pd.options.mode.nullable_dtypes = True
</code></pre><ul>
<li>Then reading the CSV like this:</li>
</ul>
<pre tabindex="0"><code>df = pd.read_csv(args.input_file, engine='pyarrow', dtype='string[pyarrow]')
</code></pre><!-- raw HTML omitted -->
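The empty-string behaviour described in the 2023-03-06 notes can be illustrated with a minimal sketch. It assumes a pandas version that provides the nullable <code>string</code> dtype, and the helper name <code>is_blank</code> is hypothetical (it is not taken from csv-metadata-quality). <code>pd.isna()</code> only recognises real missing values such as <code>None</code>, <code>NaN</code>, and <code>pd.NA</code>, so an empty string has to be checked separately, which is roughly what the <code>pd.isna(field) or field == ""</code> workaround does.

```python
import pandas as pd


def is_blank(field) -> bool:
    """Return True for missing values and for empty strings.

    Hypothetical helper for illustration only.
    """
    # Check for real missing values first: comparing pd.NA to a string
    # yields pd.NA, which cannot be used directly as a boolean.
    if pd.isna(field):
        return True
    # An empty string is a valid (non-missing) value to pandas,
    # so it needs an explicit comparison.
    return isinstance(field, str) and field == ""


# A small column with a normal value, an empty string, and a missing value.
values = pd.Series(["Kenya", "", None], dtype="string")
flags = [is_blank(v) for v in values]
print(flags)
```

Whether blanks should count as null at all is still an open question in the notes above; this sketch only shows one way to make the check explicit in a single place instead of repeating the <code>or field == ""</code> condition.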