Thesis details
The application of file identification, validation, and characterization tools in digital curation
Ford, Kevin M.; Cragin, Melissa H.; McDonough, Jerome P.
Keywords: Digital curation; Digital preservation; File identification; File validation; File characterization; Preservation tools; Preservation software
Others: https://www.ideals.illinois.edu/bitstream/handle/2142/24301/Ford_Kevin.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

File format identification, characterization, and validation are considered essential processes for digital preservation and, by extension, long-term data curation. These actions are performed on data objects by humans or computers in an attempt to identify the type of a given file, derive characterizing information that is specific to the file, and validate that the file conforms to its type specification. The present research reviews the literature surrounding these digital preservation activities, including their theoretical basis and the publications that accompanied the formal release of tools and services designed in response to that theoretical foundation. It also reports the results of extensive tests designed to evaluate the coverage of some of the software tools developed to perform file format identification, characterization, and validation. Tests of these tools demonstrate that more work is needed, particularly in terms of scalable solutions, to address the expanse of digital data to be preserved and curated. The breadth of file types these tools are expected to handle is so great as to call into question whether a scalable solution is feasible and, more broadly, whether such efforts will offer a meaningful return on investment. In addition, these tools, which serve to provide a type of baseline reading of a file in a repository, can be easily tricked. It is possible to generate files with nothing more than a proper file extension and a correct magic number and have the tools "positively" identify the file. Such a file is not the same as one that conforms to its specification and could therefore be considered valid. The ability to manipulate the results returned by these tools raises issues of identity, trust, security, and risk.
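The gap between identification and validation described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis and does not reproduce any of the tools it tested. PNG is used only as a convenient example format, the function names (`identify`, `first_chunk_is_valid_ihdr`, `make_ihdr_chunk`) are hypothetical, and the structural check covers only the first chunk, so it is far from full validation.

```python
import struct
import zlib

# The fixed 8-byte signature ("magic number") that begins every PNG file.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def identify(name: str, data: bytes) -> str:
    """Naive identification: file extension plus magic number only."""
    if name.lower().endswith(".png") and data.startswith(PNG_MAGIC):
        return "PNG"
    return "unknown"


def first_chunk_is_valid_ihdr(data: bytes) -> bool:
    """Partial structural check (an illustration, not full validation):
    the first chunk must be a 13-byte IHDR whose CRC matches."""
    body = data[len(PNG_MAGIC):]
    if len(body) < 8 + 13 + 4:  # length + type, payload, CRC
        return False
    length, chunk_type = struct.unpack(">I4s", body[:8])
    if chunk_type != b"IHDR" or length != 13:
        return False
    payload = body[8:21]
    (stored_crc,) = struct.unpack(">I", body[21:25])
    # The PNG CRC covers the chunk type and the chunk data.
    return zlib.crc32(chunk_type + payload) == stored_crc


def make_ihdr_chunk(width: int, height: int) -> bytes:
    """Build a well-formed IHDR chunk (8-bit RGB, no interlace)."""
    payload = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    crc = zlib.crc32(b"IHDR" + payload)
    return struct.pack(">I", 13) + b"IHDR" + payload + struct.pack(">I", crc)


# A spoofed file: correct extension and magic number, arbitrary content.
spoofed = PNG_MAGIC + b"this is not image data at all"
# The beginning of a structurally plausible PNG, for comparison.
wellformed_start = PNG_MAGIC + make_ihdr_chunk(1, 1)

print(identify("fake.png", spoofed))                # naive check accepts it
print(first_chunk_is_valid_ihdr(spoofed))           # structural check rejects it
print(first_chunk_is_valid_ihdr(wellformed_start))  # structural check accepts it
```

The spoofed bytes pass the extension and magic number test yet fail even this shallow structural check, which is the distinction between a "positive" identification and a file that actually conforms to its specification.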

【 Preview 】
Attachments
The application of file identification, validation, and characterization tools in digital curation (PDF, 981 KB)