S3 Sync
Paul Campbell
s3thorp
Synchronisation of files with S3 using the hash of the file contents.
Originally based on Alex Kudlick's aws-s3-sync-by-hash.
The normal aws s3 sync ... command only uses the timestamp of files to decide which files need to be copied. This utility looks at the MD5 hash of the file contents.
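As a small illustration of why a content hash is a more stable signal than a timestamp, the sketch below changes only the modification time of a scratch file and shows that its MD5 stays the same. It assumes GNU coreutils `md5sum` is available (macOS ships `md5` instead); `file_md5` is a hypothetical helper name, not part of s3thorp.

```shell
# Print only the MD5 hex digest of a file (assumes GNU md5sum).
file_md5() {
  md5sum "$1" | cut -d' ' -f1
}

tmp=$(mktemp)
printf 'hello\n' > "$tmp"

before=$(file_md5 "$tmp")
# Change only the modification time (POSIX touch -t, format CCYYMMDDhhmm).
touch -t 200101010000 "$tmp"
after=$(file_md5 "$tmp")

if [ "$before" = "$after" ]; then
  echo "timestamp changed, content hash unchanged"
fi
rm -f "$tmp"
```

A timestamp-based comparison would treat this file as modified, while a hash-based comparison would not.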
Usage
s3thorp
Usage: s3thorp [options]
  -s, --source <value>              Source directory to sync to S3
  -b, --bucket <value>              S3 bucket name
  -p, --prefix <value>              Prefix within the S3 Bucket
  -x, --exclude <value>[,<values>]  Exclude matching paths
  -v, --verbose <value>             Verbosity level (1-5)
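A possible invocation combining these options might look like the sketch below. The directory, bucket, prefix and exclude values are hypothetical, and it assumes a launcher named `s3thorp` would be on the `PATH`; the README does not say how the tool is packaged. The command is printed rather than executed so it can be reviewed first.

```shell
# Hypothetical values: substitute your own directory, bucket and prefix.
SOURCE_DIR="./public"
BUCKET="example-bucket"
PREFIX="site"

# Print the command instead of running it.
# The exclude values are illustrative; how they are matched is not covered here.
s3thorp_command() {
  echo s3thorp --source "$SOURCE_DIR" --bucket "$BUCKET" --prefix "$PREFIX" \
    --exclude "target,.git" --verbose 2
}

s3thorp_command
```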
Behaviour
When considering a local file, the following table governs what should happen:
# | local file | remote key | hash of same key | hash of other keys | action |
1 | exists | exists | matches | - | do nothing |
2 | exists | is missing | - | matches | copy from other key |
3 | exists | is missing | - | no matches | upload |
4 | exists | exists | no match | matches | copy from other key |
5 | exists | exists | no match | no matches | upload |
6 | is missing | exists | - | - | delete |
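The table above can be read as a small decision function. The following is only a sketch of that logic in shell, not the project's implementation; `decide_action` and its argument values are hypothetical names chosen for illustration.

```shell
# decide_action LOCAL REMOTE SAME_KEY_HASH OTHER_KEY_HASH
#   LOCAL, REMOTE:   "exists" or "missing"
#   SAME_KEY_HASH:   "match" or "nomatch" (ignored when REMOTE is missing)
#   OTHER_KEY_HASH:  "match" or "nomatch"
decide_action() {
  local_file=$1 remote_key=$2 same_hash=$3 other_hash=$4

  if [ "$local_file" = "missing" ]; then
    if [ "$remote_key" = "exists" ]; then
      echo "delete"            # row 6
    else
      echo "do nothing"        # not in the table: assumed nothing to act on
    fi
    return
  fi

  if [ "$remote_key" = "exists" ] && [ "$same_hash" = "match" ]; then
    echo "do nothing"          # row 1
    return
  fi

  if [ "$other_hash" = "match" ]; then
    echo "copy from other key" # rows 2 and 4
  else
    echo "upload"              # rows 3 and 5
  fi
}

decide_action exists missing - nomatch
```

The last line corresponds to row 3 of the table: the local file exists, the remote key is missing, and no other key has the same hash.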
Creating Native Images
Note: the created image currently can't be run outside of the base of the project. See Issue #15
- Download and install GraalVM
- Install native-image using the graal updater
  gu install native-image
- Create native image
  native-image -cp `sbt 'export runtime:fullClasspath'|tail -n 1` \
    -H:Name=s3thorp \
    -H:Class=net.kemitix.s3thorp.Main \
    --allow-incomplete-classpath \
    --force-fallback
- Resulting file requires a JDK for execution