
s3thorp

Synchronisation of files with S3 using the hash of the file contents.

Codacy grade badge: https://api.codacy.com/project/badge/Grade/14ea6ad0825249c994a27a82d3485180

Originally based on Alex Kudlick's aws-s3-sync-by-hash.

The normal aws s3 sync ... command only uses the timestamp of files to decide which files need to be copied. This utility instead compares the MD5 hash of the file contents.
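As a minimal sketch of the idea of hashing file contents, the MD5 of a file can be computed in Scala by reading it in fixed-size chunks; this is illustrative only and is not the tool's actual implementation:

```scala
import java.io.{BufferedInputStream, FileInputStream}
import java.security.MessageDigest

// Compute the MD5 hash of a file's contents, reading it in 8kb chunks
// so memory use stays constant regardless of file size.
def md5HashOfFile(path: String): String = {
  val digest = MessageDigest.getInstance("MD5")
  val in = new BufferedInputStream(new FileInputStream(path))
  try {
    val buffer = new Array[Byte](8192) // small fixed buffer, not the whole file
    var read = in.read(buffer)
    while (read != -1) {
      digest.update(buffer, 0, read)
      read = in.read(buffer)
    }
  } finally in.close()
  digest.digest().map("%02x".format(_)).mkString // hex-encode the 16-byte digest
}
```

A file whose hash matches the object already stored under the same S3 key needs no transfer, even if its timestamp changed.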

Usage

  s3thorp
  Usage: s3thorp [options]

    -s, --source <value>  Source directory to sync to S3
    -b, --bucket <value>  S3 bucket name
    -p, --prefix <value>  Prefix within the S3 Bucket
    -i, --include <value> Include matching paths
    -x, --exclude <value> Exclude matching paths
    -v, --verbose <value> Verbosity level (1-5)

The --include and --exclude parameters can be used more than once.

Behaviour

When considering a local file, the following table governs what should happen:

| # | local file | remote key | hash of same key | hash of other keys | action              |
|---+------------+------------+------------------+--------------------+---------------------|
| 1 | exists     | exists     | matches          | -                  | do nothing          |
| 2 | exists     | is missing | -                | matches            | copy from other key |
| 3 | exists     | is missing | -                | no matches         | upload              |
| 4 | exists     | exists     | no match         | matches            | copy from other key |
| 5 | exists     | exists     | no match         | no matches         | upload              |
| 6 | is missing | exists     | -                | -                  | delete              |
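The decision table above could be sketched as a Scala pattern match; the type and function names here are hypothetical, not the project's actual domain model:

```scala
// Hypothetical model of the sync decision; names are illustrative only.
sealed trait Action
case object DoNothing extends Action
case object Upload extends Action
case object CopyFromOtherKey extends Action
case object Delete extends Action

def decide(
  localExists: Boolean,
  remoteExists: Boolean,
  sameKeyHashMatches: Boolean,  // hash at the same remote key matches the local file
  otherKeyHashMatches: Boolean  // some other remote key holds a matching hash
): Action =
  (localExists, remoteExists) match {
    case (true, true) if sameKeyHashMatches  => DoNothing        // row 1
    case (true, _)    if otherKeyHashMatches => CopyFromOtherKey // rows 2 and 4
    case (true, _)                           => Upload           // rows 3 and 5
    case (false, true)                       => Delete           // row 6
    case (false, false)                      => DoNothing        // nothing on either side
  }
```

Copying from another key (rows 2 and 4) avoids re-uploading content that already exists somewhere in the bucket under a different key.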