title | date | author | categories |
---|---|---|---|
March, 2023 | 2023-03-01T07:58:36+03:00 | Alan Orth | |
2023-03-01
- Remove `cg.subject.wle` and `cg.identifier.wletheme` from CGSpace input form after confirming with IWMI colleagues that they no longer need them (WLE closed in 2021)
- iso-codes 4.13.0 was released, which incorporates my changes to the common names for Iran, Laos, and Syria
- I finally got through with porting the input form from DSpace 6 to DSpace 7
- I can't put my finger on it, but the input form has to be formatted very particularly: for example, if your rows have more than two fields in them without a sufficient Bootstrap grid style, or if you use a `twobox`, etc, the entire form step appears blank
2023-03-02
- I did some experiments with the new Pandas 2.0.0rc0 Apache Arrow support
- There is a change to the way nulls are handled and it causes my tests for `pd.isna(field)` to fail
- I think we need to consider blanks as null, but I'm not sure
- I made some adjustments to the Discovery sidebar facets on DSpace 6 while I was looking at the DSpace 7 configuration
- I downgraded CIFOR subject, Humidtropics subject, Drylands subject, ICARDA subject, and Language from DiscoverySearchFilterFacet to DiscoverySearchFilter in `discovery.xml` since we are no longer using them in sidebar facets
2023-03-03
- Atmire merged one of my old pull requests into COUNTER-Robots:
- I will update the local ILRI overrides in our DSpace spider agents file
2023-03-04
2023-03-05
- Start a harvest on AReS
2023-03-06
- Export CGSpace to do Initiative collection mappings
- There were thirty-three that needed updating
- Send Abenet and Sam a list of twenty-one CAS publications that had been marked as "multiple documents" that we uploaded as metadata-only items
- Goshu will download the PDFs for each and upload them to the items on CGSpace manually
- I spent some time trying to get csv-metadata-quality working with the new Arrow backend for Pandas 2.0.0rc0
- It seems there is a problem recognizing empty strings as NA with `pd.isna()`
- If I do `pd.isna(field) or field == ""` then it works as expected, but that feels hacky
- I'm going to test again on the next release...
- Note that I had been setting both of these global options:
pd.options.mode.dtype_backend = 'pyarrow'
pd.options.mode.nullable_dtypes = True
- Then reading the CSV like this:
df = pd.read_csv(args.input_file, engine='pyarrow', dtype='string[pyarrow]')
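
A minimal sketch of the `pd.isna(field) or field == ""` workaround from the notes above, wrapped in a helper so the check lives in one place. The helper name `is_blank_or_na` is hypothetical (it is not part of csv-metadata-quality), it assumes scalar field values rather than whole columns, and treating whitespace-only strings as blank is an extra assumption beyond the original check. It uses plain Python values and `pd.NA`, so it does not depend on the Arrow backend options.

```python
import pandas as pd


def is_blank_or_na(field):
    """Return True for pandas missing values and for empty strings.

    Hypothetical helper: assumes `field` is a scalar (one cell), not a
    Series or list. Whitespace-only strings are also treated as blank,
    which goes slightly beyond the `field == ""` check in the notes.
    """
    if pd.isna(field):
        return True
    return isinstance(field, str) and field.strip() == ""


# Compare plain pd.isna() with the helper on a few scalar values.
values = ["Kenya", "", "  ", None, float("nan"), pd.NA]
for value in values:
    print(repr(value), bool(pd.isna(value)), is_blank_or_na(value))
```

The point of the comparison is that `pd.isna("")` is false for an empty string, so a check that should count blanks as missing has to test for them explicitly.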