Refresh all collections harvested from other Dataverse instances/resume automated regular harvesting #408

@landreev

Since an unexpected breakthrough (for lack of a better word *) in IQSS/dataverse#11828, it has become possible to refresh large harvested collections. This issue tracks going through the entire list of other Dataverse instances that our prod. instance is configured to harvest from, and making sure each one is refreshed now and updated regularly going forward.

| Updated | No failures | Name | Collection | Notes |
|---|---|---|---|---|
|  |  | Borealis | borealis_harvested | See note 1) below |
|  |  | Odum | odum | See note 1) below |
|  |  | QDR | qdr |  |
|  |  | UVA | uva_harvested | See note 1) below |
|  |  | Dataverse-NL | dataverseNL_harvested |  |
|  |  | UBC Abacus | ubc_harvested |  |
|  |  | TDR | tdr_harvested |  |
|  |  | Dataverse-NO | dataverseNO_harvested |  |
|  |  | Madroño | madrono_harvested |  |
|  |  | ICRAF | icraf_harvested | some failures; see note 1) below |
|  |  | CIMMYT-software | cimmyt_harvested | 1 failure; see note 1) below |
|  |  | CIMMYT-iwyp | cimmyt_harvested |  |
|  |  | CIMMYT | cimmyt_harvested | A few invalid records; todo: review the log |
|  |  | heiDATA | heidata_harvested | some failures; see note 1) below |
|  |  | CIP | cip_harvested | This dataverse instance appears to have been down for a while |
|  |  | ICRISAT | icrisat-harvested | 1 failure; see note 1) below |
|  |  | CIFOR | cifor_harvested |  |
|  |  | ICARDA-CGIAR | icarda_harvested |  |
|  |  | PUCP | pucp |  |
|  |  | IDSC | idsc_harvested |  |
|  |  | Recherche Data Gouv | recherchedatagouv | See note 2) below |
|  |  | Historic Shapefiles, U of Oxford | oxfordmiddleeastcentreshapefiles |  |
|  |  | DFC Ghana | DFC_Ghana |  |
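
The "No failures" column above could be cross-checked with a small script instead of by hand. The following is only a sketch: it assumes each harvesting client is available as a dictionary with `nickName`, `lastHarvest` and `lastDatasetsFailed` keys (for example, parsed from whatever the instance's harvesting-client listing returns). Those key names, the function name and the sample records are assumptions made for illustration, not real harvest results.

```python
from typing import Any


def clients_needing_attention(clients: list[dict[str, Any]]) -> list[tuple[str, str]]:
    """Return (nickname, reason) pairs for clients whose last harvest looks incomplete.

    The key names used here are assumed for illustration; adjust them to
    match the actual client records.
    """
    flagged = []
    for client in clients:
        name = client.get("nickName", "<unnamed>")
        failed = client.get("lastDatasetsFailed") or 0
        if client.get("lastHarvest") is None:
            # No harvest has been recorded for this client at all.
            flagged.append((name, "never harvested"))
        elif failed > 0:
            # The last run finished, but some datasets could not be imported.
            flagged.append((name, f"{failed} dataset(s) failed"))
    return flagged


# Illustrative records only; these are NOT real harvest results.
sample = [
    {"nickName": "example_a", "lastHarvest": "2025-01-01", "lastDatasetsFailed": 0},
    {"nickName": "example_b", "lastHarvest": "2025-01-01", "lastDatasetsFailed": 3},
    {"nickName": "example_c", "lastHarvest": None},
]

for name, reason in clients_needing_attention(sample):
    print(f"{name}: {reason}")
```

Anything the function flags would still need a look at the harvest log, since a failure count alone does not say whether note 1) or note 2) is the cause.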

*) "breakthrough" may be too fancy a word - it turned out that the observed performance degradation was on account of something very stupid we were doing on the database side and, as a result, it was very easy to address without needing any code changes.

Notes:

  1. There appears to be a minor bug affecting re-harvests of datasets from which some files have been dropped since the last harvest. Dev. issue: "Harvesting: a bug in dataset update (UpdateHarvestedDatasetCommand)", dataverse#11933
  2. There appears to be a bug in import, where a DDI record with the XML attribute xml:lang on the codeBook element cannot be imported. It can be described as minor in the grand scheme of things (in terms of the number of datasets affected, all of them in Recherche Data Gouv, vs. the total number of harvested datasets). However, it makes most of the datasets in that specific collection unimportable. Dev. issue: "Harvesting: DDI import fails when there's a xml:lang attribute present in the codeBook element", dataverse#11932
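
To gauge how many records in a collection would hit the condition in note 2), the DDI documents can be screened for an xml:lang attribute on the root codeBook element. The sketch below uses only Python's standard `xml.etree.ElementTree`; the function name and the two sample documents are made up for illustration, and the check only detects the attribute. It does not reproduce the import failure itself.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the reserved "xml:" prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def codebook_has_xml_lang(ddi_xml: str) -> bool:
    """Return True if the root element is codeBook and carries xml:lang."""
    root = ET.fromstring(ddi_xml)
    # Strip a "{namespace}" prefix, if any, to compare the local element name.
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name == "codeBook" and XML_LANG in root.attrib


# Minimal hand-written documents, for illustration only.
with_lang = '<codeBook xmlns="ddi:codebook:2_5" xml:lang="fr"><stdyDscr/></codeBook>'
without_lang = '<codeBook xmlns="ddi:codebook:2_5"><stdyDscr/></codeBook>'

print(codebook_has_xml_lang(with_lang))
print(codebook_has_xml_lang(without_lang))
```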

Labels

FY26 Sprint 8 (2025-10-08 - 2025-10-22)
FY26 Sprint 9 (2025-10-22 - 2025-11-05)
FY26 Sprint 10 (2025-11-05 - 2025-11-19)
Size: 10 (a percentage of a sprint)

Projects

Status

In Progress 💻
