Version 1
Received: 7 August 2023 / Approved: 7 August 2023 / Online: 8 August 2023 (14:12:01 CEST)
How to cite:
García-Ruiz, S.; H Reynolds, R.; Grant-Peters, M.; K Gustavsson, E.; Fairbrother-Browne, A.; Chen, Z.; W Brenton, J.; Ryten, M. AWS-S3-Integrity-Check: An Open-Source Bash Tool to Verify the Integrity of a Dataset Stored on Amazon S3. Preprints 2023, 2023080603. https://doi.org/10.20944/preprints202308.0603.v1
APA Style
García-Ruiz, S., H Reynolds, R., Grant-Peters, M., K Gustavsson, E., Fairbrother-Browne, A., Chen, Z., W Brenton, J., & Ryten, M. (2023). AWS-S3-Integrity-Check: An Open-Source Bash Tool to Verify the Integrity of a Dataset Stored on Amazon S3. Preprints. https://doi.org/10.20944/preprints202308.0603.v1
Chicago/Turabian Style
García-Ruiz, S., Jonathan W Brenton and Mina Ryten. 2023. "AWS-S3-Integrity-Check: An Open-Source Bash Tool to Verify the Integrity of a Dataset Stored on Amazon S3." Preprints. https://doi.org/10.20944/preprints202308.0603.v1
Abstract
Amazon Simple Storage Service (Amazon S3) has become a widely used and reliable platform for storing large biomedical datasets. However, unintended changes to the original data can occur during data writing and transmission, altering the contents of the transferred object and producing unexpected results when it is later accessed. Despite the interest in verifying end-to-end data integrity, there are no existing open-source, easy-to-use tools for this purpose. To bridge this gap, we present aws-s3-integrity-check, a user-friendly, lightweight and reliable bash tool to verify the integrity of a dataset stored within an Amazon S3 bucket. Using this tool, we verified the integrity of 1,045 records ranging from 5 bytes to 10 gigabytes (GB) in size and occupying a total of ~935 GB of Amazon S3 cloud storage space in ~114 minutes. The aws-s3-integrity-check tool also provides file-by-file on-screen and log-file-based information about the status of each individual integrity check. To the best of our knowledge, aws-s3-integrity-check is the only open-source tool that allows the integrity of a dataset uploaded to Amazon S3 to be verified in a quick, reliable and efficient manner. The tool is freely available for download and use at https://github.com/SoniaRuiz/aws-s3-integrity-check and https://hub.docker.com/r/soniaruiz/aws-s3-integrity-check.
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.