CGSpace Notes

Documenting day-to-day work on the CGSpace repository.

July, 2017

2017-07-01

  • Run system updates and reboot DSpace Test

2017-07-04

  • Merge changes for WLE Phase II theme rename (#329)
  • Looking at extracting the metadata registries from ICARDA’s MEL DSpace database so we can compare fields with CGSpace
  • We can use psql’s expanded output format (-x) plus sed to format the output into quasi-XML:

$ psql dspacenew -x -c 'select element, qualifier, scope_note from metadatafieldregistry where metadata_schema_id=5 order by element, qualifier;' | sed -r 's:^-\[ RECORD (.*) \]-+$:</dc-type>\n<dc-type>\n<schema>cg</schema>:;s:([^ ]*) +\| (.*):  <\1>\2</\1>:;s:^$:</dc-type>:;1s:</dc-type>\n::'
  • The sed script is from a post on the PostgreSQL mailing list
  • Abenet says the ILRI board wants to be able to have “lead author” for every item, so I’ve whipped up a WIP test in the 5_x-lead-author branch
  • It works but is still very rough and we haven’t thought out the whole lifecycle yet
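
To see what the psql/sed pipeline from 2017-07-04 does without touching a database, here is a minimal sketch that feeds the same sed script a hand-written sample of psql's expanded output. The sample row (contributor/author with an empty scope note) and the function names sample_psql_output and to_quasi_xml are made up for illustration; the sketch also assumes GNU sed, since it relies on -r and on \n in the replacement text.

```shell
# Hypothetical sample of psql's expanded (-x) output for a single registry row.
# The values are invented; real output comes from the psql command in the notes.
sample_psql_output() {
  printf '%s\n' \
    '-[ RECORD 1 ]----------' \
    'element    | contributor' \
    'qualifier  | author' \
    'scope_note | ' \
    ''
}

# Same sed script as in the notes (GNU sed assumed):
#   1. turn each record header into a close tag, an open tag and the schema line
#   2. turn each "name | value" row into <name>value</name>
#   3. turn the trailing blank line into the final close tag
#   4. drop the stray close tag emitted for the very first record
to_quasi_xml() {
  sed -r 's:^-\[ RECORD (.*) \]-+$:</dc-type>\n<dc-type>\n<schema>cg</schema>:;s:([^ ]*) +\| (.*):  <\1>\2</\1>:;s:^$:</dc-type>:;1s:</dc-type>\n::'
}

sample_psql_output | to_quasi_xml
```

For this sample the pipeline prints one dc-type element containing the schema, element, qualifier and an empty scope_note. Note that the script copies values verbatim, so a scope note containing & or < would need escaping before the result is treated as real XML.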

Testing lead author in submission form

  • I assume that “lead author” would actually be the first question on the item submission form
  • We also need to check to see which ORCID authority core this uses, because it seems to be using an entirely new one rather than the one for dc.contributor.author (which makes sense of course, but fuck, all the author problems aren’t bad enough?!)
  • We would also need to edit the XMLUI item displays to incorporate this into the authors list
  • And fuck, then anyone consuming our data via REST / OAI will not notice that we have an author outside of dc.contributor.author… ugh
  • What if we modify the item submission form to use type-bind fields to show/hide certain fields depending on the type?

2017-07-05

  • Adjust WLE Research Theme to include both Phase I and II on the submission form according to editor feedback (#330)
  • Generate a list of fields in the current CGSpace cg schema so we can record them properly in the metadata registry:
$ psql dspace -x -c 'select element, qualifier, scope_note from metadatafieldregistry where metadata_schema_id=2 order by element, qualifier;' | sed -r 's:^-\[ RECORD (.*) \]-+$:</dc-type>\n<dc-type>\n<schema>cg</schema>:;s:([^ ]*) +\| (.*):  <\1>\2</\1>:;s:^$:</dc-type>:;1s:</dc-type>\n::' > cg-types.xml
  • CGSpace was unavailable briefly, and I saw this error in the DSpace log file:
2017-07-05 13:05:36,452 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL connection Error -
org.postgresql.util.PSQLException: FATAL: remaining connection slots are reserved for non-replication superuser connections
  • Looking at the pg_stat_activity view I saw there were indeed 98 active connections to PostgreSQL, and at this time the limit is 100, so that makes sense
  • Tsega restarted Tomcat and it’s working now
  • Abenet said she was generating a report with Atmire’s CUA module, so it could be due to that?
  • Looking in the logs I see this random error again that I should report to DSpace:
2017-07-05 13:50:07,196 ERROR org.dspace.statistics.SolrLogger @ COUNTRY ERROR: EU
  • Seems to come from dspace-api/src/main/java/org/dspace/statistics/SolrLogger.java
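
The connection-slot error from earlier in the day comes down to simple arithmetic, sketched below. PostgreSQL keeps superuser_reserved_connections slots back from max_connections, and ordinary roles are refused once the remainder is used up. The helper name free_slots is hypothetical, and the value 3 for the reserved slots is an assumption (it is PostgreSQL's default; the notes do not say what CGSpace uses).

```shell
# Hypothetical helper: slots still available to ordinary (non-superuser) roles.
# Arguments: max_connections, superuser_reserved_connections, connections in use.
free_slots() {
  echo $(( $1 - $2 - $3 ))
}

# Against a live server the inputs could be read with psql (not run here):
#   max_conn=$(psql -At -c 'SHOW max_connections;')
#   reserved=$(psql -At -c 'SHOW superuser_reserved_connections;')
#   in_use=$(psql -At -c 'SELECT count(*) FROM pg_stat_activity;')

# Figures from the notes: limit of 100 and 98 connections; reserved=3 is assumed.
free_slots 100 3 98
```

A result of zero or less means new non-superuser connections, such as those from DSpace's connection pool, would be rejected with the error seen in the log.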

2017-07-06

  • Sisay tried to help by making a pull request for the RTB flagships but there are formatting errors, unrelated changes, and the flagship names are not in the style I requested
  • Abenet talked to CIP and they said they are actually ok with using collection names rather than adding a new metadata field

2017-07-13

  • Remove UKaid from the controlled vocabulary for dc.description.sponsorship, as Department for International Development, United Kingdom is the correct form and it is already present (#334)