From 81f3ee523b7bfd5fae9d1e1697f98a554d215c64 Mon Sep 17 00:00:00 2001
From: Alan Orth
Date: Sat, 22 Sep 2018 00:49:53 +0300
Subject: [PATCH] Add notes for 2018-09-21

---
 content/posts/2018-09.md | 25 ++++++++++++++++++++++++
 docs/2018-09/index.html  | 41 +++++++++++++++++++++++++++++++++++++---
 docs/sitemap.xml         | 10 +++++-----
 3 files changed, 68 insertions(+), 8 deletions(-)

diff --git a/content/posts/2018-09.md b/content/posts/2018-09.md
index 2ac2b32b9..3cc8f722f 100644
--- a/content/posts/2018-09.md
+++ b/content/posts/2018-09.md
@@ -368,5 +368,30 @@ dspace=# select item_id from item where in_archive is True and withdrawn is Fals
 ## 2018-09-20
 
 - Contact Atmire to ask how we can buy more credits for future development
+- I researched the Solr `filterCache` size and found that the formula for the potential memory use of the **whole cache**, where each entry takes up to `(maxDoc/8) + 128` bytes, is:
+
+```
+((maxDoc/8) + 128) * (size_defined_in_solrconfig.xml)
+```
+
+- Which means that, for our statistics core with *149 million* documents, a full `filterCache` of 512 entries could use 8.9 GB (about 17.8 MB per entry)!
+
+```
+((149374568/8) + 128) * 512 = 9560037888 bytes (8.9 GB)
+```
+
+- So I think we can forget about tuning this for now!
+- [Discussion on the mailing list about `filterCache` size](http://lucene.472066.n3.nabble.com/Calculating-filterCache-size-td4142526.html)
+- [Article discussing testing methodology for different `filterCache` sizes](https://docs.google.com/document/d/1vl-nmlprSULvNZKQNrqp65eLnLhG9s_ydXQtg9iML10/edit)
+- Discuss Handle links on Twitter with IWMI
+
+## 2018-09-21
+
+- I see that there was a nice optimization to the ImageMagick PDF CMYK detection in the upstream `dspace-5_x` branch: [DS-3664](https://github.com/DSpace/DSpace/pull/2204)
+- The fix will go into DSpace 5.10 and we are currently on DSpace 5.8, but I think I'll cherry-pick that fix into our `5_x-prod` branch:
+  - 4e8c7b578bdbe26ead07e36055de6896bbf02f83: ImageMagick: Only execute "identify" on first page
+- I think it would also be nice to cherry-pick the fixes for [DS-3883](https://github.com/DSpace/DSpace/pull/2020), which is related to optimizing the XMLUI item display of items with many bitstreams
+  - a0ea20bd1821720b111e2873b08e03ce2bf93307: DS-3883: Don't loop through original bitstreams if only displaying thumbnails
+  - 8d81e825dee62c2aa9d403a505e4a4d798964e8d: DS-3883: If only including thumbnails, only load the main item thumbnail.
diff --git a/docs/2018-09/index.html b/docs/2018-09/index.html
index edd64ad34..687c22900 100644
--- a/docs/2018-09/index.html
+++ b/docs/2018-09/index.html
@@ -18,7 +18,7 @@ I’m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I " /> - +
  • Contact Atmire to ask how we can buy more credits for future development
  • +
  • I researched the Solr filterCache size and found that the formula for the potential memory use of the whole cache, where each entry takes up to (maxDoc/8) + 128 bytes, is:
  • + + +
    ((maxDoc/8) + 128) * (size_defined_in_solrconfig.xml)
    +
    + + + +
    ((149374568/8) + 128) * 512 = 9560037888 bytes (8.9 GB)
    +
    + + + +

    2018-09-21

+
+
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index 64e5af139..d9aa6b4c5 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -4,7 +4,7 @@
 https://alanorth.github.io/cgspace-notes/2018-09/
- 2018-09-19T20:40:18+03:00
+ 2018-09-20T13:11:48+03:00
@@ -184,7 +184,7 @@
 https://alanorth.github.io/cgspace-notes/
- 2018-09-19T20:40:18+03:00
+ 2018-09-20T13:11:48+03:00
 0
@@ -195,7 +195,7 @@
 https://alanorth.github.io/cgspace-notes/tags/notes/
- 2018-09-19T20:40:18+03:00
+ 2018-09-20T13:11:48+03:00
 0
@@ -207,13 +207,13 @@
 https://alanorth.github.io/cgspace-notes/posts/
- 2018-09-19T20:40:18+03:00
+ 2018-09-20T13:11:48+03:00
 0
 https://alanorth.github.io/cgspace-notes/tags/
- 2018-09-19T20:40:18+03:00
+ 2018-09-20T13:11:48+03:00
 0
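
A minimal sketch of the `filterCache` arithmetic from the 2018-09-20 notes, splitting the formula into its per-entry term and the multiplication by the configured size. The function names are hypothetical and chosen only for illustration; the `maxDoc` value and the size of 512 are the ones quoted in the notes, and the result is an upper bound on potential use, not a measurement.

```python
def filter_cache_entry_bytes(max_doc: int) -> int:
    """Potential size of one cache entry: one bit per document plus 128 bytes."""
    return max_doc // 8 + 128


def filter_cache_total_bytes(max_doc: int, cache_size: int) -> int:
    """Potential size of a full cache holding `cache_size` entries."""
    return filter_cache_entry_bytes(max_doc) * cache_size


if __name__ == "__main__":
    max_doc = 149_374_568  # documents in the statistics core (from the notes)
    cache_size = 512       # size value used in the notes' calculation

    per_entry = filter_cache_entry_bytes(max_doc)
    total = filter_cache_total_bytes(max_doc, cache_size)

    # Binary units (2**20, 2**30) reproduce the "8.9 GB" figure from the notes.
    print(f"per entry:  {per_entry} bytes ({per_entry / 2**20:.1f} MiB)")
    print(f"full cache: {total} bytes ({total / 2**30:.1f} GiB)")
```

Keeping the per-entry term separate makes it easy to try a smaller size: the total scales linearly with the number of entries.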