<!DOCTYPE html> <html lang="en" > <head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"> <meta property="og:title" content="October, 2020" /> <meta property="og:description" content="2020-10-06 Add tests for the new /items POST handlers to the DSpace 6.x branch of my dspace-statistics-api It took a bit of extra work because I had to learn how to mock the responses for when Solr is not available Tag and release version 1.3.0 on GitHub: https://github.com/ilri/dspace-statistics-api/releases/tag/v1.3.0 Trying to test the changes Atmire sent last week but I had to re-create my local database from a recent CGSpace dump During the FlywayDB migration I got an error: " /> <meta property="og:type" content="article" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2020-10/" /> <meta property="article:published_time" content="2020-10-06T16:55:54+03:00" /> <meta property="article:modified_time" content="2020-10-08T11:15:49+03:00" /> <meta name="twitter:card" content="summary"/> <meta name="twitter:title" content="October, 2020"/> <meta name="twitter:description" content="2020-10-06 Add tests for the new /items POST handlers to the DSpace 6.x branch of my dspace-statistics-api It took a bit of extra work because I had to learn how to mock the responses for when Solr is not available Tag and release version 1.3.0 on GitHub: https://github.com/ilri/dspace-statistics-api/releases/tag/v1.3.0 Trying to test the changes Atmire sent last week but I had to re-create my local database from a recent CGSpace dump During the FlywayDB migration I got an error: "/> <meta name="generator" content="Hugo 0.76.2" /> <script type="application/ld+json"> { "@context": "http://schema.org", "@type": "BlogPosting", "headline": "October, 2020", "url": "https://alanorth.github.io/cgspace-notes/2020-10/", "wordCount": "1195", "datePublished": "2020-10-06T16:55:54+03:00", "dateModified": "2020-10-08T11:15:49+03:00", "author": { "@type": "Person", "name": "Alan Orth" }, "keywords": "Notes" } </script> <link rel="canonical" href="https://alanorth.github.io/cgspace-notes/2020-10/"> <title>October, 2020 | CGSpace Notes</title> <!-- combined, minified CSS --> <link href="https://alanorth.github.io/cgspace-notes/css/style.6da5c906cc7a8fbb93f31cd2316c5dbe3f19ac4aa6bfb066f1243045b8f6061e.css" rel="stylesheet" integrity="sha256-baXJBsx6j7uT8xzSMWxdvj8ZrEqmv7Bm8SQwRbj2Bh4=" crossorigin="anonymous"> <!-- minified Font Awesome for SVG icons --> <script defer src="https://alanorth.github.io/cgspace-notes/js/fontawesome.min.f3d2a1f5980bab30ddd0d8cadbd496475309fc48e2b1d052c5c09e6facffcb0f.js" integrity="sha256-89Kh9ZgLqzDd0NjK29SWR1MJ/EjisdBSxcCeb6z/yw8=" crossorigin="anonymous"></script> <!-- RSS 2.0 feed --> </head> <body> <div class="blog-masthead"> <div class="container"> <nav class="nav blog-nav"> <a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a> </nav> </div> </div> <header class="blog-header"> <div class="container"> <h1 class="blog-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1> <p class="lead blog-description" dir="auto">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p> </div> </header> <div class="container"> <div class="row"> <div class="col-sm-8 blog-main"> <article class="blog-post"> <header> <h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-10/">October, 2020</a></h2> <p class="blog-post-meta"><time datetime="2020-10-06T16:55:54+03:00">Tue Oct 06, 2020</time> by Alan Orth in <span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a> </p> </header> <h2 id="2020-10-06">2020-10-06</h2> <ul> <li>Add tests for the new <code>/items</code> POST handlers to the DSpace 6.x branch of my <a href="https://github.com/ilri/dspace-statistics-api/tree/v6_x">dspace-statistics-api</a> <ul> <li>It took a bit of extra work because I had to learn how to mock the responses for when Solr is not available</li> <li>Tag and release version 1.3.0 on GitHub: <a href="https://github.com/ilri/dspace-statistics-api/releases/tag/v1.3.0">https://github.com/ilri/dspace-statistics-api/releases/tag/v1.3.0</a></li> </ul> </li> <li>Trying to test the changes Atmire sent last week but I had to re-create my local database from a recent CGSpace dump <ul> <li>During the FlywayDB migration I got an error:</li> </ul> </li> </ul> <pre><code>2020-10-06 21:36:04,138 ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper @ Batch entry 0 update public.bitstreamformatregistry set description='Electronic publishing', internal='FALSE', mimetype='application/epub+zip', short_description='EPUB', support_level=1 where bitstream_format_id=78 was aborted: ERROR: duplicate key value violates unique constraint "bitstreamformatregistry_short_description_key" Detail: Key (short_description)=(EPUB) already exists. Call getNextException to see other errors in the batch. 2020-10-06 21:36:04,138 WARN org.hibernate.engine.jdbc.spi.SqlExceptionHelper @ SQL Error: 0, SQLState: 23505 2020-10-06 21:36:04,138 ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper @ ERROR: duplicate key value violates unique constraint "bitstreamformatregistry_short_description_key" Detail: Key (short_description)=(EPUB) already exists. 2020-10-06 21:36:04,142 ERROR org.hibernate.engine.jdbc.batch.internal.BatchingBatch @ HHH000315: Exception executing batch [could not execute batch] 2020-10-06 21:36:04,143 ERROR org.dspace.storage.rdbms.DatabaseRegistryUpdater @ Error attempting to update Bitstream Format and/or Metadata Registries org.hibernate.exception.ConstraintViolationException: could not execute batch at org.hibernate.exception.internal.SQLStateConversionDelegate.convert(SQLStateConversionDelegate.java:129) at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:49) at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:124) at org.hibernate.engine.jdbc.batch.internal.BatchingBatch.performExecution(BatchingBatch.java:122) at org.hibernate.engine.jdbc.batch.internal.BatchingBatch.doExecuteBatch(BatchingBatch.java:101) at org.hibernate.engine.jdbc.batch.internal.AbstractBatchImpl.execute(AbstractBatchImpl.java:161) at org.hibernate.engine.jdbc.internal.JdbcCoordinatorImpl.executeBatch(JdbcCoordinatorImpl.java:207) at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:390) at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:304) at org.hibernate.event.internal.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:349) at org.hibernate.event.internal.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:56) at org.hibernate.internal.SessionImpl.flush(SessionImpl.java:1195) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.hibernate.context.internal.ThreadLocalSessionContext$TransactionProtectionWrapper.invoke(ThreadLocalSessionContext.java:352) at com.sun.proxy.$Proxy162.flush(Unknown Source) at org.dspace.core.HibernateDBConnection.commit(HibernateDBConnection.java:83) at org.dspace.core.Context.commit(Context.java:435) at org.dspace.core.Context.complete(Context.java:380) at org.dspace.administer.MetadataImporter.loadRegistry(MetadataImporter.java:164) at org.dspace.storage.rdbms.DatabaseRegistryUpdater.updateRegistries(DatabaseRegistryUpdater.java:72) at org.dspace.storage.rdbms.DatabaseRegistryUpdater.afterMigrate(DatabaseRegistryUpdater.java:121) at org.flywaydb.core.internal.command.DbMigrate$3.doInTransaction(DbMigrate.java:250) at org.flywaydb.core.internal.util.jdbc.TransactionTemplate.execute(TransactionTemplate.java:72) at org.flywaydb.core.internal.command.DbMigrate.migrate(DbMigrate.java:246) at org.flywaydb.core.Flyway$1.execute(Flyway.java:959) at org.flywaydb.core.Flyway$1.execute(Flyway.java:917) at org.flywaydb.core.Flyway.execute(Flyway.java:1373) at org.flywaydb.core.Flyway.migrate(Flyway.java:917) at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:663) at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:575) at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:551) at org.dspace.core.Context.<clinit>(Context.java:103) at org.dspace.app.util.AbstractDSpaceWebapp.register(AbstractDSpaceWebapp.java:74) at org.dspace.app.util.DSpaceWebappListener.contextInitialized(DSpaceWebappListener.java:31) at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:5197) at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5720) at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183) at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:1016) at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:992) </code></pre><ul> <li>I checked the database migrations with <code>dspace database info</code> and they were all OK <ul> <li>Then I restarted the Tomcat again and it started up OK…</li> </ul> </li> <li>There were two issues I had reported to Atmire last month: <ul> <li>Importing items from the command line throws a <code>NullPointerException</code> from <code>com.atmire.dspace.cua.CUASolrLoggerServiceImpl</code> for every item, but the item still gets imported</li> <li>No results for author name in Listing and Reports, despite there being hits in Discovery search</li> </ul> </li> <li>To test the first one I imported a very simple CSV file with one item with minimal data <ul> <li>There is a new error now (but the item does get imported):</li> </ul> </li> </ul> <pre><code>$ dspace metadata-import -f /tmp/2020-10-06-import-test.csv -e aorth@mjanja.ch Loading @mire database changes for module MQM Changes have been processed ----------------------------------------------------------- New item: + New owning collection (10568/3): ILRI articles in journals + Add (dc.contributor.author): Orth, Alan + Add (dc.date.issued): 2020-09-01 + Add (dc.title): Testing CUA import NPE 1 item(s) will be changed Do you want to make these changes? [y/n] y ----------------------------------------------------------- New item: aff5e78d-87c9-438d-94f8-1050b649961c (10568/108548) + New owning collection (10568/3): ILRI articles in journals + Added (dc.contributor.author): Orth, Alan + Added (dc.date.issued): 2020-09-01 + Added (dc.title): Testing CUA import NPE Tue Oct 06 22:06:14 CEST 2020 | Query:containerItem:aff5e78d-87c9-438d-94f8-1050b649961c Error while updating org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Expected mime type application/octet-stream but got text/html. <!doctype html><html lang="en"><head><title>HTTP Status 404 – Not Found</title><style type="text/css">body {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b {color:white;background-color:#525D76;} h1 {font-size:22px;} h2 {font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP Status 404 – Not Found</h1><hr class="line" /><p><b>Type</b> Status Report</p><p><b>Message</b> The requested resource [/solr/update] is not available</p><p><b>Description</b> The origin server did not find a current representation for the target resource or is not willing to disclose that one exists.</p><hr class="line" /><h3>Apache Tomcat/7.0.104</h3></body></html> at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:512) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206) at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124) at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:168) at com.atmire.dspace.cua.CUASolrLoggerServiceImpl$5.visit(SourceFile:1131) at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.visitEachStatisticShard(SourceFile:212) at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.update(SourceFile:1104) at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.update(SourceFile:1093) at org.dspace.statistics.StatisticsLoggingConsumer.consume(SourceFile:104) at org.dspace.event.BasicDispatcher.consume(BasicDispatcher.java:177) at org.dspace.event.BasicDispatcher.dispatch(BasicDispatcher.java:123) at org.dspace.core.Context.dispatchEvents(Context.java:455) at org.dspace.core.Context.commit(Context.java:424) at org.dspace.core.Context.complete(Context.java:380) at org.dspace.app.bulkedit.MetadataImport.main(MetadataImport.java:1399) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229) at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81) </code></pre><ul> <li>Also, I tested Listings and Reports and there are still no hits for “Orth, Alan” as a contributor, despite there being dozens of items in the repository and the Solr query generated by Listings and Reports actually returning hits:</li> </ul> <pre><code>2020-10-06 22:23:44,116 INFO org.apache.solr.core.SolrCore @ [search] webapp=/solr path=/select params={q=*:*&fl=handle,search.resourcetype,search.resourceid,search.uniqueid&start=0&fq=NOT(withdrawn:true)&fq=NOT(discoverable:false)&fq=search.resourcetype:2&fq=author_keyword:Orth,\+A.+OR+author_keyword:Orth,\+Alan&fq=dateIssued.year:[2013+TO+2021]&rows=500&wt=javabin&version=2} hits=18 status=0 QTime=10 </code></pre><ul> <li>Solr returns <code>hits=18</code> for the L&R query, but there are no result shown in the L&R UI</li> <li>I sent all this feedback to Atmire…</li> </ul> <h2 id="2020-10-07">2020-10-07</h2> <ul> <li>Udana from IWMI had asked about stats discrepencies from reports they had generated in previous months or years <ul> <li>I told him that we very often purge bots and the number of stats can change drastically</li> <li>Also, I told him that it is not possible to compare stats from previous exports and that the stats should be taking with a grain of salt</li> </ul> </li> <li>Testing POSTing items to the DSpace 6 REST API <ul> <li>We need to authenticate to get a JSESSIONID cookie first:</li> </ul> </li> </ul> <pre><code>$ http -f POST https://dspacetest.cgiar.org/rest/login email=aorth@fuuu.com 'password=fuuuu' $ http https://dspacetest.cgiar.org/rest/status Cookie:JSESSIONID=EABAC9EFF942028AA52DFDA16DBCAFDE </code></pre><ul> <li>Then we post an item in JSON format to <code>/rest/collections/{uuid}/items</code>:</li> </ul> <pre><code>$ http POST https://dspacetest.cgiar.org/rest/collections/f10ad667-2746-4705-8b16-4439abe61d22/items Cookie:JSESSIONID=EABAC9EFF942028AA52DFDA16DBCAFDE < item-object.json </code></pre><ul> <li>Format of JSON is:</li> </ul> <pre><code>{ "metadata": [ { "key": "dc.title", "value": "Testing REST API post", "language": "en_US" }, { "key": "dc.contributor.author", "value": "Orth, Alan", "language": "en_US" }, { "key": "dc.date.issued", "value": "2020-09-01", "language": "en_US" } ], "archived":"false", "withdrawn":"false" } </code></pre><ul> <li>What is unclear to me is the <code>archived</code> parameter, it seems to do nothing… perhaps it is only used for the <code>/items</code> endpoint when printing information about an item <ul> <li>If I submit to a collection that has a workflow, even as a super admin and with “archived=false” in the JSON, the item enters the workflow (“Awaiting editor’s attention”)</li> <li>If I submit to a new collection without a workflow the item gets archived immediately</li> <li>I created <a href="https://gist.github.com/alanorth/40fc3092aefd78f978cca00e8abeeb7a">some notes</a> to share with Salem and Leroy for future reference when we start discussion POSTing items to the REST API</li> </ul> </li> <li>I created an account for Salem on DSpace Test and added it to the submitters group of an ICARDA collection with no other workflow steps so we can see what happens <ul> <li>We are curious to see if he gets a UUID when posting from MEL</li> </ul> </li> <li>I did some tests by adding his account to certain workflow steps and trying to POST the item</li> <li>Member of collection “Submitters” step: <ul> <li>HTTP Status 401 – Unauthorized</li> <li>The request has not been applied because it lacks valid authentication credentials for the target resource.</li> </ul> </li> <li>Member of collection “Accept/Reject” step: <ul> <li>Same error…</li> </ul> </li> <li>Member of collection “Accept/Reject/Edit Metadata” step: <ul> <li>Same error…</li> </ul> </li> <li>Member of collection Administrators with no other workflow steps…: <ul> <li>Posts straight to archive</li> </ul> </li> <li>Member of collection Administrators with empty “Accept/Reject/Edit Metadata” step: <ul> <li>Posts straight to archive</li> </ul> </li> <li>Member of collection Administrators with populated “Accept/Reject/Edit Metadata” step: <ul> <li>Does <em>not</em> post straight to archive, goes to workflow</li> </ul> </li> <li>Note that community administrators have no role in item submission other than being able to create/manage collection groups</li> </ul> <h2 id="2020-10-08">2020-10-08</h2> <ul> <li>I did some testing of the DSpace 5 REST API because Salem and I were curious <ul> <li>The authentication is a little different (uses a serialized JSON object instead of a form and the token is an HTTP header instead of a cookie):</li> </ul> </li> </ul> <pre><code>$ http POST http://localhost:8080/rest/login email=aorth@fuuu.com 'password=ddddd' $ http http://localhost:8080/rest/status rest-dspace-token:d846f138-75d3-47ba-9180-b88789a28099 $ http POST http://localhost:8080/rest/collections/1549/items rest-dspace-token:d846f138-75d3-47ba-9180-b88789a28099 < item-object.json </code></pre><ul> <li>The item submission works exactly the same as in DSpace 6:</li> </ul> <ol> <li>The submitting user must be a collection admin</li> <li>If the collection has a workflow the item will enter it and the API returns an item ID</li> <li>If the collection does not have a workflow then the item is committed to the archive and you get a Handle</li> </ol> <!-- raw HTML omitted --> </article> </div> <!-- /.blog-main --> <aside class="col-sm-3 ml-auto blog-sidebar"> <section class="sidebar-module"> <h4>Recent Posts</h4> <ol class="list-unstyled"> <li><a href="/cgspace-notes/2020-10/">October, 2020</a></li> <li><a href="/cgspace-notes/2020-09/">September, 2020</a></li> <li><a href="/cgspace-notes/2020-08/">August, 2020</a></li> <li><a href="/cgspace-notes/2020-07/">July, 2020</a></li> <li><a href="/cgspace-notes/2020-06/">June, 2020</a></li> </ol> </section> <section class="sidebar-module"> <h4>Links</h4> <ol class="list-unstyled"> <li><a href="https://cgspace.cgiar.org">CGSpace</a></li> <li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li> <li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li> </ol> </section> </aside> </div> <!-- /.row --> </div> <!-- /.container --> <footer class="blog-footer"> <p dir="auto"> Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>. </p> <p> <a href="#">Back to top</a> </p> </footer> </body> </html>