S3 Sync
Paul Campbell 96a83e6c3e
Convert Storage to full ZIO effect module (#133)
* [console] Rename MyConsole as Console

* [console] break infinite loop

* [console] fix typo

* [console] clean up helpers

* [cli] Main use ZIO#provide to run program

* [cli] Main define Program type alias

* [cli] Program handle cli args in Program

* [cli] Program doesn't extend PlanBuilder

* [cli] refactoring

* [cli] rename ParseArgs as CliArgs

* [cli] CliArgs#apply renamed as parse

* [storage-aws] S3StorageService renamed as S3Storage

* [storage-api] Rename StorageService as Storage.Service

* [storage-api] make Storage.copy effectTotal

* [storage-api] make Storage.delete effectTotal

* [storage-api] make Storage.shutdown effectTotal

* [storage-api] make Storage.upload effectTotal

* [storage-aws] Lister refactoring

* [storage-aws] make Lister into a trait

* [storage-aws] make Copier into a trait

* [storage-aws] make Deleter into a trait

* [storage-aws] make Uploader into a trait

* [storage-aws] AmazonS3 move error handling out of client wrapper

* [storage-aws] DeleterTest added

* [storage-aws] ListerTest added

* [storage-aws] Uploader refactoring

* [storage-aws] CopierTest test Copier directly

* [storage-aws] DeleterTest test Deleter directly

* [storage-aws] ListerTest test Lister directly

* [storage-aws] UploaderTest added

* [storage-aws] S3Storage.Live replaces S3StorageServiceBuilder

* Complete migration to Module for Storage

* [cli] Main define LiveThorpApp object

* [core] Add CoreTypes

* [cli] Program Refactoring

* [core] PlanBuilding Refactoring

* [changelog] updated

* [console] Console.Live Usage of get on optional type

* [storage-aws] AmazonS3ClientTestFixture Use wildcards when selecting more than 6 elements
2019-07-28 20:11:03 +01:00
.github [github] Add stale configuration 2019-05-14 07:05:48 +01:00
bin Rename project to Thorp (#75) 2019-06-17 15:33:49 +01:00
cli/src Convert Storage to full ZIO effect module (#133) 2019-07-28 20:11:03 +01:00
console/src/main/scala/net/kemitix/thorp/console Convert Storage to full ZIO effect module (#133) 2019-07-28 20:11:03 +01:00
core/src Convert Storage to full ZIO effect module (#133) 2019-07-28 20:11:03 +01:00
domain/src Don't use String as key in Map for hashes (#124) 2019-07-24 19:50:28 +01:00
project Update sbt-assembly to 0.14.10 (#105) 2019-07-06 13:47:24 +01:00
storage-api/src/main/scala/net/kemitix/thorp/storage Convert Storage to full ZIO effect module (#133) 2019-07-28 20:11:03 +01:00
storage-aws/src Convert Storage to full ZIO effect module (#133) 2019-07-28 20:11:03 +01:00
.gitignore Rename project to Thorp (#75) 2019-06-17 15:33:49 +01:00
.scalafmt.conf Apply scalafmt (#108) 2019-07-16 07:56:54 +01:00
.travis.yml [sbt,travis] revert most of "publish fat-jar", keeping cli jar name (#97) 2019-06-30 14:38:50 +01:00
build.sbt Remove Monocle dependency (#121) 2019-07-24 09:40:56 +01:00
CHANGELOG.org Convert Storage to full ZIO effect module (#133) 2019-07-28 20:11:03 +01:00
LICENSE Create LICENSE 2019-06-07 21:25:23 +01:00
README.org Sync more than one source directory into a single bucket/prefix (#25) 2019-07-12 07:42:42 +01:00

thorp

Synchronisation of files with S3 using the hash of the file contents.


Originally based on Alex Kudlick's aws-s3-sync-by-hash.

The normal aws s3 sync ... command decides which files need to be copied from their size and timestamp alone. This utility looks at the MD5 hash of the file contents instead.
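As an illustration of comparing by content hash, here is a minimal Scala sketch. It is not taken from the thorp sources: the object name `Md5Sketch` and the functions `md5Hex` and `needsUpload` are hypothetical, and reading the whole file into memory is a simplification that a real tool handling large files would probably avoid.

```scala
import java.nio.file.{Files, Path}
import java.security.MessageDigest

// Hypothetical helper, for illustration only; not part of thorp's API.
object Md5Sketch {

  // Hex-encoded MD5 digest of a byte array.
  def md5Hex(bytes: Array[Byte]): String =
    MessageDigest
      .getInstance("MD5")
      .digest(bytes)
      .map(b => "%02x".format(b))
      .mkString

  // Hex-encoded MD5 digest of a file's contents.
  // Assumption: the file is small enough to read into memory in one go.
  def md5Hex(path: Path): String =
    md5Hex(Files.readAllBytes(path))

  // A file needs transferring only when no remote hash is known for its key,
  // or when the known remote hash differs from the local one.
  def needsUpload(localHash: String, remoteHash: Option[String]): Boolean =
    !remoteHash.contains(localHash)
}
```

With hashes compared this way, a file whose timestamp changed but whose contents did not is left alone.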

Usage

  thorp
  Usage: thorp [options]

    -V, --version         Display the version and quit
    -B, --batch           Enabled batch-mode
    -s, --source <value>  Source directory to sync to S3
    -b, --bucket <value>  S3 bucket name
    -p, --prefix <value>  Prefix within the S3 Bucket
    -i, --include <value> Include matching paths
    -x, --exclude <value> Exclude matching paths
    -d, --debug           Enable debug logging
    --no-global           Ignore global configuration
    --no-user             Ignore user configuration

If you don't provide a source, the current directory will be used.

The --include and --exclude parameters can be used more than once.

The --source parameter can be used more than once, in which case, all files in all sources will be consolidated into the same bucket/prefix.

Batch mode

Batch mode disables the ANSI console display and logs simple messages that can be written to a file.

Configuration

Configuration will be read from these files:

  • Global: /etc/thorp.conf
  • User: ~/.config/thorp.conf
  • Source: ${source}/.thorp.conf

Command line arguments override those in Source, which override those in User, which override those in Global, which override any built-in config.

When there is more than one source, only the first ".thorp.conf" file found will be used.

Built-in config consists of using the current working directory as the source.

Note that include and exclude are cumulative across all configuration files.
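The precedence rules above can be sketched in Scala as follows. This is only an assumed model, not thorp's implementation: the names `Layer`, `merge` and `resolve` are hypothetical, the fields simply mirror the command line options, and the order in which accumulated patterns are listed is an arbitrary choice.

```scala
// Hypothetical model of configuration layering, for illustration only.
object ConfigPrecedenceSketch {

  // Settings that one layer (command line, Source, User, Global) may supply.
  final case class Layer(
      bucket: Option[String] = None,
      prefix: Option[String] = None,
      includes: List[String] = Nil,
      excludes: List[String] = Nil
  )

  // Combine two layers: `higher` wins for single-valued settings,
  // while include and exclude patterns accumulate from both layers.
  def merge(higher: Layer, lower: Layer): Layer =
    Layer(
      bucket = higher.bucket.orElse(lower.bucket),
      prefix = higher.prefix.orElse(lower.prefix),
      includes = higher.includes ++ lower.includes,
      excludes = higher.excludes ++ lower.excludes
    )

  // Resolve layers given from highest precedence to lowest,
  // e.g. List(commandLine, source, user, global).
  def resolve(layers: List[Layer]): Layer =
    layers.foldLeft(Layer())((acc, next) => merge(acc, next))
}
```

Under this model, a bucket given on the command line replaces one set in the User file, whereas an exclude pattern in the Global file still applies alongside any patterns given elsewhere.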

Behaviour

When considering a local file, the following table governs what should happen:

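| # | local file | remote key | hash of same key | hash of other keys | action              |
|---+------------+------------+------------------+--------------------+---------------------|
| 1 | exists     | exists     | matches          | -                  | do nothing          |
| 2 | exists     | is missing | -                | matches            | copy from other key |
| 3 | exists     | is missing | -                | no matches         | upload              |
| 4 | exists     | exists     | no match         | matches            | copy from other key |
| 5 | exists     | exists     | no match         | no matches         | upload              |
| 6 | is missing | exists     | -                | -                  | delete              |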
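The table can be read as a single decision function. The Scala sketch below is one way to express it, using hypothetical names (`ActionSketch`, `decide` and the `Action` cases) that are not taken from the thorp sources. A missing local file or remote key is modelled as `None`, and the case where neither exists, which the table does not cover, yields no action.

```scala
// Hypothetical encoding of the behaviour table, for illustration only.
object ActionSketch {

  sealed trait Action
  case object DoNothing        extends Action
  case object CopyFromOtherKey extends Action
  case object Upload           extends Action
  case object Delete           extends Action

  // localHash:   hash of the local file, None if the local file is missing
  // remoteHash:  hash stored under the same remote key, None if the key is missing
  // otherHashes: hashes stored under all other remote keys
  def decide(
      localHash: Option[String],
      remoteHash: Option[String],
      otherHashes: Set[String]
  ): Option[Action] =
    (localHash, remoteHash) match {
      case (Some(l), Some(r)) if l == r       => Some(DoNothing)        // row 1
      case (Some(l), _) if otherHashes(l)     => Some(CopyFromOtherKey) // rows 2 and 4
      case (Some(_), _)                       => Some(Upload)           // rows 3 and 5
      case (None, Some(_))                    => Some(Delete)           // row 6
      case (None, None)                       => None                   // not in the table
    }
}
```

Checking the same-key hash first means a file that is already in place is never copied or uploaded, even if other keys hold identical content.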

Executable JAR

To build an executable jar, run `sbt assembly`.

This will create the file `cli/target/scala-2.12/thorp`.

Copy this file to a directory on your `PATH`.