Paul Campbell
* thorp

Synchronisation of files with S3 using the hash of the file contents.

[[https://www.codacy.com/app/kemitix/thorp][file:https://img.shields.io/codacy/grade/c1719d44f1f045a8b71e1665a6d3ce6c.svg?style=for-the-badge]]

Originally based on Alex Kudlick's [[https://github.com/akud/aws-s3-sync-by-hash][aws-s3-sync-by-hash]].

The normal ~aws s3 sync ...~ command uses only the size and time stamp
of files to decide which files need to be copied. This utility compares
the MD5 hash of the file contents instead.

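A minimal sketch of comparing by content hash, written in Python purely
for illustration. It is not thorp's own code (thorp is built with sbt),
and the names ~md5_of_file~ and ~same_contents~ are hypothetical. Two
files count as identical when the MD5 of their bytes matches, whatever
their time stamps say.

#+begin_src python
import hashlib


def md5_of_file(path, chunk_size=8192):
    """Return the hex MD5 digest of a file's contents, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True when both files hash to the same MD5, regardless of mtime."""
    return md5_of_file(path_a) == md5_of_file(path_b)
#+end_src

Reading in chunks keeps memory use flat for large files; the chunk size
of 8192 bytes is an arbitrary choice.
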
* Usage

#+begin_example
  thorp
  Usage: thorp [options]

    -s, --source <value>     Source directory to sync to S3
    -b, --bucket <value>     S3 bucket name
    -p, --prefix <value>     Prefix within the S3 Bucket
    -i, --include <value>    Include matching paths
    -x, --exclude <value>    Exclude matching paths
    -d, --debug              Enable debug logging
    --no-global              Ignore global configuration
    --no-user                Ignore user configuration
#+end_example

If you don't provide a ~source~, the current directory will be used.

The ~--include~ and ~--exclude~ parameters can be used more than once.

* Configuration

Configuration will be read from these files:

- Global: ~/etc/thorp.conf~
- User: =~/.config/thorp.conf=
- Source: ~${source}/.thorp.conf~

Command line arguments override those in Source, which override those
in User, which override those in Global, which override any built-in
config.

Built-in config consists of using the current working directory as the
~source~.

Note that ~include~ and ~exclude~ are cumulative across all
configuration files.

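The precedence rules above can be sketched as a layered merge. This is
an illustrative Python sketch, not thorp's implementation: the function
~resolve_config~ and the use of plain dictionaries are hypothetical, and
the syntax of the ~.conf~ files is not shown because it is not described
here. The sketch also assumes that ~include~ and ~exclude~ given on the
command line accumulate in the same way as those from files.

#+begin_src python
CUMULATIVE_KEYS = ("include", "exclude")


def resolve_config(cli_args, source_cfg, user_cfg, global_cfg, built_in):
    """Merge configuration layers; later (higher priority) layers win.

    Each layer is a dict. Values for 'include' and 'exclude' are lists
    and are accumulated across layers instead of being overridden.
    """
    # Ordered from lowest to highest priority.
    layers = [built_in, global_cfg, user_cfg, source_cfg, cli_args]
    resolved = {key: [] for key in CUMULATIVE_KEYS}
    for layer in layers:
        for key, value in layer.items():
            if key in CUMULATIVE_KEYS:
                resolved[key].extend(value)  # cumulative
            else:
                resolved[key] = value  # higher priority overrides
    return resolved


resolved = resolve_config(
    cli_args={"prefix": "from-cli"},
    source_cfg={"bucket": "from-source", "include": ["a"]},
    user_cfg={"bucket": "from-user", "prefix": "from-user", "exclude": ["b"]},
    global_cfg={"exclude": ["c"]},
    built_in={"source": "."},  # current working directory
)
#+end_src

Here ~resolved~ takes ~prefix~ from the command line, ~bucket~ from the
Source layer, ~source~ from the built-in default, and carries the
~exclude~ entries from both the Global and User layers.
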
* Behaviour

When considering a local file, the following table governs what should happen:

|---+------------+------------+------------------+--------------------+---------------------|
| # | local file | remote key | hash of same key | hash of other keys | action              |
|---+------------+------------+------------------+--------------------+---------------------|
| 1 | exists     | exists     | matches          | -                  | do nothing          |
| 2 | exists     | is missing | -                | matches            | copy from other key |
| 3 | exists     | is missing | -                | no matches         | upload              |
| 4 | exists     | exists     | no match         | matches            | copy from other key |
| 5 | exists     | exists     | no match         | no matches         | upload              |
| 6 | is missing | exists     | -                | -                  | delete              |
|---+------------+------------+------------------+--------------------+---------------------|

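The table can be read as a small decision function. The following
Python sketch is illustrative only: ~choose_action~ and its boolean
parameters are hypothetical names, not thorp's API. The case where
neither the local file nor the remote key exists is not in the table,
so the sketch rejects it.

#+begin_src python
def choose_action(local_exists, remote_exists,
                  same_key_matches=False, other_key_matches=False):
    """Return the action from the behaviour table for one file/key pair.

    same_key_matches:  the remote key's hash equals the local file's hash.
    other_key_matches: some other remote key has the local file's hash.
    """
    if not local_exists:
        if remote_exists:
            return "delete"  # row 6
        raise ValueError("no local file and no remote key: not in the table")
    if remote_exists and same_key_matches:
        return "do nothing"  # row 1
    if other_key_matches:
        return "copy from other key"  # rows 2 and 4
    return "upload"  # rows 3 and 5
#+end_src

Rows 2 and 4 collapse into one branch, as do rows 3 and 5: once the
remote key is missing or stale, only the hashes of the other keys decide
between a copy and an upload.
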
* Executable JAR

To build as an executable jar, run ~sbt assembly~.

This will create the file
~cli/target/scala-2.12/thorp-assembly-$VERSION.jar~ (where ~$VERSION~
is replaced by the version being built).

Copy this file, renamed to ~thorp.jar~, into the same directory as
the ~bin/thorp~ shell script.