Scientific data repositories on the Web: An initial survey
Science Data Repositories (SDRs) have been recognized as both critical to science and undergoing fundamental change. A web sample study of 100 SDRs was conducted. Information from the SDRs' websites and from their administrators was reviewed to determine salient characteristics of the SDRs, which were then used to classify the SDRs into groups through a combination of cluster analysis and logistic regression. These characteristics were explored for their role in determining the groupings and for their relationship to the success of SDRs. Four characteristics were identified as important for further investigation: whether the SDR is supported by grants and contracts, whether support comes from multiple sponsors, the size of the SDR's holdings, and whether the SDR has a preservation policy. An inferential framework for understanding SDR composition is discussed; it is guided by observation, the collection and refinement of characteristics, and subsequent analysis of the elements of group membership. The development of SDRs is further examined from a business standpoint and in comparison with their most similar form, institutional repositories. Because this work identifies important characteristics of SDRs, including those that potentially affect their sustainability and success, it is expected to be helpful to those who operate SDRs.