diff --git a/content/posts/2023-08.md b/content/posts/2023-08.md
index 6c2a41771..d26ab76c5 100644
--- a/content/posts/2023-08.md
+++ b/content/posts/2023-08.md
@@ -152,4 +152,25 @@ $ ./run.sh -s http://localhost:8081/solr/statistics -a import -o /tmp/statistics
- Export CGSpace to check for missing Initiative collection mappings
+## 2023-08-19
+
+- Start a harvest on AReS
+
+## 2023-08-21
+
+- Experiment with the DSpace 7 REST API
+ - I wrote a Python script to benchmark harvesting all 100,000+ items using the `/api/discover/search/objects` endpoint 100 items at a time
+ - I was able to harvest all 106,000 items in fifty-two minutes, which seems slow, but that's about ten times faster than with the legacy REST API...
+ - Still, I need to benchmark a bit more, as the item response doesn't include collection mappings or thumbnails
+- Reading the [API docs](https://github.com/DSpace/RestContract/blob/main/README.md#etags--conditional-headers) it seems that we should be able to use the standard `If-Modified-Since` header for some endpoints
+ - I tried it on the `/api/discover/search/objects` and `/api/core/items` endpoints, but apparently those don't support this header because I don't see a `Last-Modified` header in the response
+ - According to the docs, the missing header means that these endpoints indeed don't support it...
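The header check described above can be sketched in a few lines of Python. This is only an illustration, not the script used in these notes: the function names `advertises_last_modified` and `probe` are made up here, and the sketch assumes that a missing `Last-Modified` response header is enough to conclude that `If-Modified-Since` won't be honored.

```python
from typing import Iterable
from urllib.request import Request, urlopen


def advertises_last_modified(header_names: Iterable[str]) -> bool:
    """Return True if a Last-Modified header is among the response header names."""
    # HTTP header names are case-insensitive, so compare in lower case.
    return any(name.lower() == "last-modified" for name in header_names)


def probe(url: str) -> bool:
    """Fetch a URL and report whether the response carries Last-Modified.

    Hypothetical helper: pass it any endpoint URL you want to check.
    """
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request) as response:
        return advertises_last_modified(response.headers.keys())
```

If `probe()` returns `False` for an endpoint, sending `If-Modified-Since` to it is unlikely to save anything.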
+
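For reference, the paginated harvest from the 2023-08-21 notes could look roughly like the sketch below. It is not the actual benchmark script. It assumes the search endpoint accepts `dsoType`, `page` and `size` query parameters and returns items under `_embedded.searchResult._embedded.objects[]._embedded.indexableObject`, with paging details in `_embedded.searchResult.page`. The names `fetch_page` and `harvest` are illustrative only.

```python
import json
from typing import Callable, Iterator
from urllib.parse import urlencode
from urllib.request import urlopen

PAGE_SIZE = 100  # items per request, as in the notes


def fetch_page(base_url: str, page: int, size: int = PAGE_SIZE) -> dict:
    """Fetch one page of item search results as parsed JSON."""
    # Assumed query parameters; check them against your DSpace version.
    query = urlencode({"dsoType": "item", "page": page, "size": size})
    with urlopen(f"{base_url}/api/discover/search/objects?{query}") as response:
        return json.load(response)


def harvest(get_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield every item, requesting pages until totalPages is reached."""
    page = 0
    while True:
        result = get_page(page)["_embedded"]["searchResult"]
        for obj in result.get("_embedded", {}).get("objects", []):
            yield obj["_embedded"]["indexableObject"]
        page += 1
        if page >= result["page"]["totalPages"]:
            break
```

With a placeholder base URL, usage would be `harvest(functools.partial(fetch_page, "https://example.org/server"))`. At 100 items per page, 106,000 items is roughly 1,060 requests, so fifty-two minutes (3,120 seconds) works out to about 2.9 seconds per request, or about 34 items per second.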
+## 2023-08-22
+
+- I was experimenting with the DSpace 7 REST API again
+ - This time looking at the thumbnail responses in item endpoints
+ - According to [the documentation](https://github.com/DSpace/RestContract/blob/main/items.md#main-thumbnail) the API responds with HTTP 200 if the item has a thumbnail, and HTTP 204 (No Content) if it doesn't
+ - That means we need to make a request for each item just to find out whether it has a thumbnail!
+
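The 200/204 behavior described above could be checked per item with something like the following sketch. It assumes the thumbnail lives at `/api/core/items/<uuid>/thumbnail`, as the linked documentation section suggests. The function names and the injectable `opener` parameter are made up for illustration.

```python
from typing import Callable
from urllib.request import urlopen


def interpret_thumbnail_status(status: int) -> bool:
    """Map the documented status codes to 'has a thumbnail' or not."""
    if status == 200:
        return True
    if status == 204:
        return False
    # Anything else is outside the two documented cases.
    raise ValueError(f"unexpected HTTP status {status}")


def item_has_thumbnail(base_url: str, item_uuid: str, opener: Callable = urlopen) -> bool:
    """Issue one request per item; the status code is the only signal."""
    # Assumed endpoint path; verify it against the REST contract.
    url = f"{base_url}/api/core/items/{item_uuid}/thumbnail"
    with opener(url) as response:
        return interpret_thumbnail_status(response.status)
```

At one extra request per item, this check would add on the order of 106,000 requests to a full harvest, which is why it is worth benchmarking separately.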
diff --git a/docs/2023-08/index.html b/docs/2023-08/index.html
index a569ab4ae..094994529 100644
--- a/docs/2023-08/index.html
+++ b/docs/2023-08/index.html
@@ -19,7 +19,7 @@ Start working on some batch uploads for IFPRI
-
+
@@ -44,9 +44,9 @@ Start working on some batch uploads for IFPRI
"@type": "BlogPosting",
"headline": "August, 2023",
"url": "https://alanorth.github.io/cgspace-notes/2023-08/",
- "wordCount": "1057",
+ "wordCount": "1254",
"datePublished": "2023-08-03T11:18:36+03:00",
- "dateModified": "2023-08-14T18:38:03+02:00",
+ "dateModified": "2023-08-18T23:54:07+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -293,6 +293,36 @@ Start working on some batch uploads for IFPRI
- Export CGSpace to check for missing Initiative collection mappings
+2023-08-19
+
+- Start a harvest on AReS
+
+2023-08-21
+
+- Experiment with the DSpace 7 REST API
+
+- I wrote a Python script to benchmark harvesting all 100,000+ items using the
/api/discover/search/objects
endpoint 100 items at a time
+- I was able to harvest all 106,000 items in fifty-two minutes, which seems slow, but that’s about ten times faster than with the legacy REST API…
+- Still, I need to benchmark a bit more, as the item response doesn’t include collection mappings or thumbnails
+
+
+- Reading the API docs it seems that we should be able to use the standard
If-Modified-Since
header for some endpoints
+
+- I tried it on the
/api/discover/search/objects
and /api/core/items
endpoints, but apparently those don’t support this header because I don’t see a Last-Modified
header in the response
+- According to the docs, the missing header means that these endpoints indeed don’t support it…
+
+
+
+2023-08-22
+
+- I was experimenting with the DSpace 7 REST API again
+
+- This time looking at the thumbnail responses in item endpoints
+- According to the documentation the API responds with HTTP 200 if the item has a thumbnail, and HTTP 204 (No Content) if it doesn’t
+- That means we need to make a request for each item just to find out whether it has a thumbnail!
+
+
+
diff --git a/docs/categories/index.html b/docs/categories/index.html
index 24272fc1d..50e3ab22a 100644
--- a/docs/categories/index.html
+++ b/docs/categories/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/index.html b/docs/categories/notes/index.html
index a1d7b8ae3..acdb502f6 100644
--- a/docs/categories/notes/index.html
+++ b/docs/categories/notes/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/2/index.html b/docs/categories/notes/page/2/index.html
index 49d22d709..804b0c433 100644
--- a/docs/categories/notes/page/2/index.html
+++ b/docs/categories/notes/page/2/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/3/index.html b/docs/categories/notes/page/3/index.html
index 450f3be0e..a118bc6e9 100644
--- a/docs/categories/notes/page/3/index.html
+++ b/docs/categories/notes/page/3/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/4/index.html b/docs/categories/notes/page/4/index.html
index c770d9bfa..534348cf4 100644
--- a/docs/categories/notes/page/4/index.html
+++ b/docs/categories/notes/page/4/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/5/index.html b/docs/categories/notes/page/5/index.html
index 9e6bde4be..0fda278ed 100644
--- a/docs/categories/notes/page/5/index.html
+++ b/docs/categories/notes/page/5/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/6/index.html b/docs/categories/notes/page/6/index.html
index 45adebb0a..1d836679d 100644
--- a/docs/categories/notes/page/6/index.html
+++ b/docs/categories/notes/page/6/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/7/index.html b/docs/categories/notes/page/7/index.html
index c65298be0..0ebdd58f7 100644
--- a/docs/categories/notes/page/7/index.html
+++ b/docs/categories/notes/page/7/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/8/index.html b/docs/categories/notes/page/8/index.html
index 71a7ac9b3..d38af4afc 100644
--- a/docs/categories/notes/page/8/index.html
+++ b/docs/categories/notes/page/8/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/index.html b/docs/index.html
index 1da9ea9dc..6dbb5239b 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/10/index.html b/docs/page/10/index.html
index 68e6eaf41..4f58dc7fd 100644
--- a/docs/page/10/index.html
+++ b/docs/page/10/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/2/index.html b/docs/page/2/index.html
index 70909e190..3f79cca61 100644
--- a/docs/page/2/index.html
+++ b/docs/page/2/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/3/index.html b/docs/page/3/index.html
index 33aa9f70a..84b6ea817 100644
--- a/docs/page/3/index.html
+++ b/docs/page/3/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/4/index.html b/docs/page/4/index.html
index c71689f60..6c8dc1df7 100644
--- a/docs/page/4/index.html
+++ b/docs/page/4/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/5/index.html b/docs/page/5/index.html
index e417991e9..c6a12ca13 100644
--- a/docs/page/5/index.html
+++ b/docs/page/5/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/6/index.html b/docs/page/6/index.html
index 9a516f45e..39dab5763 100644
--- a/docs/page/6/index.html
+++ b/docs/page/6/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/7/index.html b/docs/page/7/index.html
index 5cf31854b..5f0ba3462 100644
--- a/docs/page/7/index.html
+++ b/docs/page/7/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/8/index.html b/docs/page/8/index.html
index 51549c6f4..e19c78018 100644
--- a/docs/page/8/index.html
+++ b/docs/page/8/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/9/index.html b/docs/page/9/index.html
index 1cd43c577..45a0cbd7e 100644
--- a/docs/page/9/index.html
+++ b/docs/page/9/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/index.html b/docs/posts/index.html
index 5a577503c..de8fc3beb 100644
--- a/docs/posts/index.html
+++ b/docs/posts/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/10/index.html b/docs/posts/page/10/index.html
index 29798f184..72bf77911 100644
--- a/docs/posts/page/10/index.html
+++ b/docs/posts/page/10/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/2/index.html b/docs/posts/page/2/index.html
index 0197118ad..a0021cfeb 100644
--- a/docs/posts/page/2/index.html
+++ b/docs/posts/page/2/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/3/index.html b/docs/posts/page/3/index.html
index 51fe81ff5..2b54b74b9 100644
--- a/docs/posts/page/3/index.html
+++ b/docs/posts/page/3/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/4/index.html b/docs/posts/page/4/index.html
index 31e348847..b94a785c1 100644
--- a/docs/posts/page/4/index.html
+++ b/docs/posts/page/4/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/5/index.html b/docs/posts/page/5/index.html
index 02d7c0cf4..a633b1d2d 100644
--- a/docs/posts/page/5/index.html
+++ b/docs/posts/page/5/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/6/index.html b/docs/posts/page/6/index.html
index fb249ac1a..90360b4cf 100644
--- a/docs/posts/page/6/index.html
+++ b/docs/posts/page/6/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/7/index.html b/docs/posts/page/7/index.html
index 6f728bc13..070922fc4 100644
--- a/docs/posts/page/7/index.html
+++ b/docs/posts/page/7/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/8/index.html b/docs/posts/page/8/index.html
index 0faf37c16..9b578a962 100644
--- a/docs/posts/page/8/index.html
+++ b/docs/posts/page/8/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/9/index.html b/docs/posts/page/9/index.html
index de57306b6..6cd199739 100644
--- a/docs/posts/page/9/index.html
+++ b/docs/posts/page/9/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index 4e0910c21..cfcbddc6f 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -3,19 +3,19 @@
xmlns:xhtml="http://www.w3.org/1999/xhtml">
https://alanorth.github.io/cgspace-notes/2023-08/
- 2023-08-14T18:38:03+02:00
+ 2023-08-18T23:54:07+03:00
https://alanorth.github.io/cgspace-notes/categories/
- 2023-08-14T18:38:03+02:00
+ 2023-08-18T23:54:07+03:00
https://alanorth.github.io/cgspace-notes/
- 2023-08-14T18:38:03+02:00
+ 2023-08-18T23:54:07+03:00
https://alanorth.github.io/cgspace-notes/categories/notes/
- 2023-08-14T18:38:03+02:00
+ 2023-08-18T23:54:07+03:00
https://alanorth.github.io/cgspace-notes/posts/
- 2023-08-14T18:38:03+02:00
+ 2023-08-18T23:54:07+03:00
https://alanorth.github.io/cgspace-notes/2023-07/
2023-08-02T23:04:11+03:00