
s3thorp

Synchronisation of files with S3 using the hash of the file contents.

Codacy grade badge: https://api.codacy.com/project/badge/Grade/14ea6ad0825249c994a27a82d3485180

Originally based on Alex Kudlick's aws-s3-sync-by-hash.

The normal aws s3 sync ... command only uses the timestamp of files to decide which files need to be copied. This utility instead compares the MD5 hash of the file contents.
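As a minimal sketch of the idea of hashing file contents, the MD5 of a file can be computed in Scala by reading it in fixed-size chunks; this is illustrative only and is not the tool's actual implementation:

```scala
import java.io.{BufferedInputStream, FileInputStream}
import java.security.MessageDigest

// Compute the MD5 hash of a file's contents, reading it in 8kb chunks
// so memory use stays constant regardless of file size.
def md5HashOfFile(path: String): String = {
  val digest = MessageDigest.getInstance("MD5")
  val in = new BufferedInputStream(new FileInputStream(path))
  try {
    val buffer = new Array[Byte](8192) // small fixed buffer, not the whole file
    var read = in.read(buffer)
    while (read != -1) {
      digest.update(buffer, 0, read)
      read = in.read(buffer)
    }
  } finally in.close()
  digest.digest().map("%02x".format(_)).mkString // hex-encode the 16-byte digest
}
```

A file whose hash matches the object already stored under the same S3 key needs no transfer, even if its timestamp changed.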

Usage

  s3thorp
  Usage: s3thorp [options]

    -s, --source <value>  Source directory to sync to S3
    -b, --bucket <value>  S3 bucket name
    -p, --prefix <value>  Prefix within the S3 Bucket
    -i, --include <value> Include matching paths
    -x, --exclude <value> Exclude matching paths
    -v, --verbose <value> Verbosity level (1-5)

The --include and --exclude parameters can be used more than once.

Behaviour

When considering a local file, the following table governs what should happen:

| # | local file | remote key | hash of same key | hash of other keys | action              |
|---+------------+------------+------------------+--------------------+---------------------|
| 1 | exists     | exists     | matches          | -                  | do nothing          |
| 2 | exists     | is missing | -                | matches            | copy from other key |
| 3 | exists     | is missing | -                | no matches         | upload              |
| 4 | exists     | exists     | no match         | matches            | copy from other key |
| 5 | exists     | exists     | no match         | no matches         | upload              |
| 6 | is missing | exists     | -                | -                  | delete              |
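The decision table above could be sketched as a Scala pattern match; the type and function names here are hypothetical, not the project's actual domain model:

```scala
// Hypothetical model of the sync decision; names are illustrative only.
sealed trait Action
case object DoNothing extends Action
case object Upload extends Action
case object CopyFromOtherKey extends Action
case object Delete extends Action

def decide(
  localExists: Boolean,
  remoteExists: Boolean,
  sameKeyHashMatches: Boolean,  // hash at the same remote key matches the local file
  otherKeyHashMatches: Boolean  // some other remote key holds a matching hash
): Action =
  (localExists, remoteExists) match {
    case (true, true) if sameKeyHashMatches  => DoNothing        // row 1
    case (true, _)    if otherKeyHashMatches => CopyFromOtherKey // rows 2 and 4
    case (true, _)                           => Upload           // rows 3 and 5
    case (false, true)                       => Delete           // row 6
    case (false, false)                      => DoNothing        // nothing on either side
  }
```

Copying from another key (rows 2 and 4) avoids re-uploading content that already exists somewhere in the bucket under a different key.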