Commit graph

22 commits

Author SHA1 Message Date
afc55354e7
Apply scalafmt (#108)
* [scalafmt] Add .scalafmt

* [sbt] Add unused import warning and make them errors

* [core] ConfigurationBuilder remove unused import

* [core] ConfigurationBuilder apply scalafmt

* [domain] reformat

* [storage-api] reformat

* [core] reformat and some refactoring

* [storage-aws] reformat

* [cli] reformat

* [core] post rebase fix up

* [storage-aws] UploaderSuite tidy up

* [domain] MD5Hash tweak

Without doing this we have a file that we don't create a valid md5
hash for. See #103

* [storage-aws] UploaderSuite remove unused import

* [storage-aws] StorageServiceSuite reformatted

* [sbt] consistent settings for all modules

Can't enable as stubbing TransferManager attempts to use the default
constructor for TransferManager.

* [storage-aws] Add AmazonTransferManager

This gives us an interface we can safely stub in unit tests without
the compiler failing the build because TransferManager's default
constructor is deprecated.

* [storage-aws] Add AmazonUpload

Prevents deprecation errors due to deprecated parameters on Transfer.

* [sbt] mode import to top of file
2019-07-16 07:56:54 +01:00
6a55e74047
Not reading .thorp.conf file (#111)
* [domain] Config defaults to an empty Sources list

* [core] ConfigurationBuilterTest add test for parsing .thorp.conf

* [core] ConfigQuery if no Sources given returns current dir

* [core] rewrote config loader

- Only settings from explicit sources are used

* [changelog] updated

* [core] Remove stray println statements

* [core] SyncLogging tidy up multi-source messages
2019-07-15 07:01:06 +01:00
b15350d959
Sync more than one source directory into a single bucket/prefix (#25)
* [changelog] updated

* [readme] updated

* [core] ConfigQuery added sources()

* [cli] ParseArgs allow specifying multiple sources

* [domain,core,cli] Source datatype changed to Path

* [domain] Sources added to hold multiple paths in order

* [domain] Config sources change datatype to Sources

* [core] Scan sources for .thorp.config and include any sources listed

This allows the inclusion of a `.thorp.config` file in a source with a
single line `source = ....` that causes that other source to also be
synched into the same remote prefix as the current source.

* [core] ConfigurationBuilderTest add more pending tests

* [[core] ConfigurationBuilderTest rewrite using loan-pattern for fixtures

* [core] ConfigOptionTest use TemporaryFolder

* [core] ConfigOptionTest remove unused fields

* [cli] ParseArgsTest don't use get on an Option

* [core] ConfigurationBuilderTest don't use get on Either

* [core] TemporaryFolder Move import to top of file

* [core] TemporaryFolder use Try over try-finally

* [core] ConfigurationBuilderTest don't use get on Either

* [core] TemporaryFolders.withDirectory propogate errors

* [core] TemporaryFolders add writeFile and createFile

* [core] PlanBuilderTest create a plan with two sources with unique files in both

* [core] ActionGenerator only upload file by name in first source

create a plan
  two sources
    same filename in both
    - only upload file in first source

* [domain] LastModified with no params is now()

* [core] PlanBuilderTest 2 sources w/remote only in 2nd src do nothing

* [core] PlanBuilderTest 2 sources w/remote only in 2nd src do nothing

* [domain] RemoteKey map to a file when prefix is empty

* [domain] S3ObjectData defaults to empty

* [core] KeyGenerator Avoid delimiter when empty prefix key

* [core] PlanBuilderTest when remote not in sources delete from remote

* [core] PlanBuilderTest extract helper md5Hash()

* [core] PlanBuilderTest one source a file no matching remote key

* [core] PlanBuildingTest file with matching key and hash do nothing

* [core] PlanBuilderTest file w/matching remote key and different hash

* [core] PlanBuilderTest a remote key with and without local file

* [core] DummyStorageService Use wildcards when selecting more than 6 elements
2019-07-12 07:42:42 +01:00
00c04187e8
Display total size and progress for entire run (#94)
* [changelog] updated

* [core] Wrap Stream[LocalFile] as LocalFiles

* [core] LocalFiles counts files

* [core] LocalFiles sums file lengths

* [core] Restore logFileScan

* [storage-aws] Lister logs when fetching object summaries

* [storage-aws] Extract ListerLogger

* [core] Synchronise use leftMap

* [core] Syncronise extract assemblePlan

* [core] Wrap Stream[Action] in SyncPlan

* [core] Copy the file count and totalSizeBytes across to SyncPlan

* [cli] Program rename actions as syncPlan

* [cli] Program extract thorpArchive def

* [cli] Program extract createPlan def

* [cli] Program refactoring

* [cli] Program remove println showing version

* [cli] Program rename actions parameter as syncPlan

* [core] ThorpArchive add an index to each action

* [cli] Program make SyncPlan available to ThorpArchive

* [core] Pass SyncTotals to Archive

* [domain] Move SyncTotals into module

* [domain] Pass index and SyncTotals to UploadEventListener

* [domain] UploadEventLogger add file count a size progress bars

* [domain] UploadEventLogger better display stability and add file index

* [cli] Index files in correct order

* [cli] Program extends Synchronise

* [core] Rename Synchronise as PlanBuilder

* [cli] Program add test to check actions don't get reordered from plan

* [core] collect file size totals

* [domain] UploadEventLogger include percentage

* [cli] ProgramTest Use wildcards when selecting more than 6 elements
2019-07-04 18:58:31 +01:00
1267b6e313
Add a batch mode that provides a simple log output (#85)
* [changelog] Updated

* [readme] Updated

* [domain] Config Add batch-mode flag

* [core] ConfigOption Add BatchMode option

* [core] ConfigQuery Add batchMode query

Also replaced verbose exists case clauses with a simple contains.

* [core] ConfigOptions added to replace Seq[ConfigOption]

* [core] Syncronise rename method to createPlan

* [cli] Program rename apply as run

* [storage-aws] S3StorageServiceBuilder stop using IO to create object

* [storage-aws] S3StorageServiceBuilder make default service lazy

* [storage-aws] Rename S3ClientCopier => Copier

* [storage-aws] Rename S3ClientDeleter => Deleter

* [storage-aws] Rename S3ClientObjectLister => Lister

* [storage-aws] Only attach upload listener when in batch mode

Only detects batch mode when selected as a command line option

* [core] Synchronise use leftMap rather than swap.map.swap

* [cli] ParseArgs add `-B` and `--batch` options to enable batch mode

* [core] ThorpArchive logs file uploaded when in batch mode
2019-07-02 08:43:52 +01:00
c23376a037
Shutdown storage service once completed (#88) 2019-06-29 20:04:36 +01:00
ac9a52f93f
Use correct hash locally for comparing multi-part uploaded files (#82)
* [storage-aws] ETagGenerator add stub

* [core] MD5HashGenerator add hex and digest helpers

* [domain] MD5Hash can always provide base64 and also digest

Rather that store the base 64 digest some of the time, simply decode
it from the hex hash. The same for the binary digest.

MD5Hash is now cleaner now that it no longer has Option parameters.

* [core] MD5HashGenerator add stubs to allow reading file chunks

* [domain] MD5HashData add sub-objects

* [domain] MD5HashData move back into test where it belongs

* [sbt] add sbt-bloop plugin

* [domain] MD5HashData Add hash of big-file

* [domain] MD5HashData Add hash of big-file

* [core] MD5HashGenerator find end of chunk correctly

* [core] MD5HashGenerator offset is a Long

* [core] MD5HashGenerator don't read past the end of the file

* [storage-aws] ETagGenerator can reproduce ETags

* [storage-aws] ETagGeneratorTest added

* [storate-aws] ETagGenerator refactoring

* [storage-aws] ETageGenerator refactoring

* [core] SyncSuite remove redundant braces

* [storage-api] HashService added

* [storage-aws] S3HashService added

* [core] LocalFileStream refactoring

* [core] integrate HashService and ETagGenerator

* Optimise imports

* [domain] HexEncoder added to replace java 8 only DataTypeConverter

* [core] MD5HashGenerator refactoring

* [core] S3MetaDataEnricher refactoring

* [core] S3MetaDataEnricherSuite refactoring

* [storage-aws] ETagGeneratorTest refactoring

* [storage-aws] StorageServiceSuite refactoring

* [core] S3MetaDataEnricher refactoring

* [core] refactoring

* [storage-aws] refactoring
2019-06-29 19:07:51 +01:00
0f8708e19f
Restructure sync to use a State with foldLeft around actions (#74)
* [changelog] updated

* [cli] Program rename parameter

* [core] Add AppState

* [core] Synchronise rought draft replacement for Sync

Uses the AppState

* [core] Synchronise as sequential for-comprehensions

* [core] Synchronise as nested for-comprehensions

* [sbt] thorp(root) depends on cli moduke

* [core] Synchronise extract methods

* [core] Synchronise rewritten

* [core] Synchronise generates actions

* [core] Remove AppState

* [core] ActionSubmitter remove unused implicit config parameter

* [cli] Program rewritten to use Synchronise

* [core] Synchronise useValidConfig accepts Logger implicitly

* [core] Synchronise reorder methods

* [core] Synchronise refactor errorMessages

* [core] SyncLogging logRunStart accepts explicit parameters

* [core] remove old Sync

* [core] Synchronise restore logRunStart

* [domain] Terminal add types to public methods and values

* [domain] UploadEventLogger force flush to terminal

Also make part of the progress message in green.

Not flushing, by using println, cause odd behaviour. Works on normal
terminal, but not great in an emacs terminal. Oh well.

* [core] SyncLogging.logRunFinished remove unused parameters

* [cli] Program restore final summary

* [storage-aws] remove logging from module

* [core] ThorpArchive replaces ActionSubmitter

ActionSubmitter implementation becomes UnversionedMirrorArchive

* [domain] cleaner upload progress messages

* [cli] Program remove unused Logger

* [cli] Program rename parameter

* [core] SyncSuite use Synchronise

* [sbt] Allow storage-aws to share core test classes

* [domain] LocalFile stop storing a lambda

The lambda breaks the equality test between LocalFile instances.

* [core] MD4HashData add missing base64 digest for leafFile

* [core] Synchronise drop DoNothing actions

* [core] SyncSuite update tests

* [sbt] aggregate modules from root module
2019-06-25 08:27:38 +01:00
9d2271fdcf
Introduce backend abstraction to hide S3 (#76)
Add backend abstraction to hide S3
2019-06-22 07:20:59 +01:00
761c1c9784
Is AWS SDK calculating MD5Hash again for a local file? (#50)
* [aws-lib] Uploader provide request with the already calculated md5 hash

* [aws-lib] remove unused accepts method

* [aws-lib] Uploader refactoring

* [domain] Config remove unused threshold and max retries items

* [core] Show upload errors in summary

* [domain] LocalFile add helper to explicitly compare by hash value

Looking to add an optional field to MD5Hash but we want to do our
checks here only on the hash value, not whether a digest is available
or not.

* [core] Sync refactoring

* [core] SyncSuite invoke subject inside it method and after declaring expectations

* [core] SyncSuite use the localfile hash rather than something arbitrary

* [cli] Add `--no-global` and `--no-user` options

* [core] LocalFileStream refactoring

* [core] SyncSuite: ignore user and global configuration files

* [domain] MD5Hash now can optionally store the base64 encoded hash

* [core] MD5HashGenerator pass the digest to MD5Hash

* [aws-lib] Uploader use the base64 encoded hash

* [changelog] updated
2019-06-21 19:20:35 +01:00
eec79066f3
Smooth progress bar (#78)
* [domain] rewrite and expand ansi codes available

* [aws-api] refactor progress-bar

* [domain] move progress-bar into Terminal

* [aws-api] remove unused value

* [domain] use unicode characters to get a smooth progress bar
2019-06-20 22:58:21 +01:00
910688ee32
Add support for per directory configuration files (#71)
* [core] rename the config supplied from CLI as such

This distinguishes it as config supplied from the command line.

* [core] add ConfigOption

* [core] ConfigOption can update a Config

* [core] new validator for config

* [domain] Config doesn't validate source any more

* [cli] PrintLogger default to not print debug messages

* [cli] Use ConfigOptions and new ConfigValidator

* [sbt] Use common settings for project root

* [domain] RemoteKey can handle when prefix is empty

* [cli] remove banner

* [domain] Logger can create version of itself with debug flipped

* [core] Build and validate Config within core module

This means that the `thorp-lib` module can validate its input from a
list of `ConfigOption`s.

* [core] refactor ConfigurationBuilder

* [core] refactor ConfigurationBuilder

* [sbt] starting back from tagless-final by using IO where needed

* [core] Add ParseConfigFile

* [sbt] Make cats-effect available from the domain

* Roll back from tagless-final to just use cat-effect's IO

* [core] extract ParseConfigLines

* [core] ConfigurationBuilder rename apply as buildConfig

* [core] ParseConfig[Files,Lines] rename apply methods

* [core] refactor ParseConfigFile and add tests

* [core] Sync fix call to run

* [core] SyncSuite update tests to use ConfigOptions
2019-06-20 17:18:46 +01:00
9196dd623f
Rename project to Thorp (#75)
* [sbt] change application name

* [cli] rename package

* [cli] Change displayed application name and description

* [domain] rename package

* [core] fix bad package directory structure

* [core] rename package

* [aws-lib] rename package

* [aws-api] rename package

* [cli] rename programe for usage message

* [bin] rename and update script

* [gitignore] update

* [readme] update
2019-06-17 15:33:49 +01:00
90770eaafb
[cli] Remove verbosity flag (#63) 2019-06-14 20:21:58 +01:00
e3675b5394
Add a debug flag and make debug message hidden by default (#60)
* [cli] add a debug flag to control logging

* [core] show entering a directory as a debug message
2019-06-14 20:00:22 +01:00
21b8917395
Simplify logging (#59)
* [cli] Logger add debug method

* [cli] Logger: add alternate info method

* [cli] replace direct calls to legacy Logger.info

* [cli] Extract trait from Logger into core and rename

* [cli] Program use debug and new info

* [core] MD5HashGenerator uses Logger

* [core] Sync uses Logger

* [core] ActionSubmitter uses Logger

* [core] LocalFileStream uses Logger

* [core] SyncLogging uses Logger

* [domain] Move Logger into module

The allows it to be used in the aws-api module

* [aws-lib] S3ClientObjectLister uses Logger

* [aws-lib] Uploader uses Logger

* [aws-lib] S3ClientDeleter uses Logger

* [aws-lib] S3ClientCopier uses Logger

* [core] remove unused legacy logging

* [aws-lib] remove used logging method

* [cli] PrintLogger remove legacy info method
2019-06-14 17:18:53 +01:00
8c89cc2489
Error when calculating MD5Hash for large files (#56)
* [domain] SizeTranslation includes decimals for larger sizes

* [core] MD5HashGenerator rewrite for memory efficiency

No longer attempt to create an Array the size of the file to be
parsed.

Now it creates a single small buffer and reads 8kb chunks in at a
time. Only creating an additional smaller buffer to read the tail of
the file.

Remove methods to parsing only part of a file are they were no longer
used, and remove the relevant tests.
2019-06-11 20:38:14 +01:00
97efed76b4
Improve upload logging (#44)
* [aws-lib] fold S3ClientUploader trait into it's only implementation

This trait was only implemented by S3ClientTransferManager.

* [core] SyncLogging: more robust matching

No longer cares about parameters to case classes, just their types.

* [cli] Logger uses IO for log methods

* [aws-lib] remove 'transfer-manager'prefix and only show tryCount > 1

* [sbt,cli] remove log4j and scala-logging dependencies

* [domain] move QuoteStripper to Domain

Use it directly in MD5Hash to strip quotes from any input.

* [core] SyncLogging call info in proper context

If the IO.unit returned by the info calls isn't part of the chain that
is returned from the function, then the delayed IO action is never
called.

* [aws-lib] Display size in bytes of file being uploaded

* [core] call info in correct context

* [cli] call info in correct context

* [aws-lib] raise summary fetch message to info 1

* [cli] include correct level in info messages

* [aws-lib] S3ClientLogging adjust logging levels

* [aws-lib] display file sizes in english

* [aws-lib] ObjectLister use IO.bracket properly

* [aws-lib] Copier use IO.bracket properly

* [aws-lib] Deleter refactor

* [aws-lib] TransferManagerLogging remove unused methods

* [aws-lib] TransferManager refactor

* [aws-lib] TransferManager refactor

* [aws-lib] TransferManager displays log messages

Use the UploadProgressListener that was being ignored, and use
unsafeRunSync to execute the suspended effect within the IO[Unit].
Using unsafeRunSync is required to render the effects as the listener
returns Unit, meaning the suspended effects would be discarded.

* [domain] Extract SizeTranslation into module

* [aws-api] report bytes transferred in progress

* [core] fix calls to info

info now returns an IO already, so don't need to wrap it in one.

* [aws-lib] remove unused class

* [aws-lib] UploadProgress displays progress bar while uploading

* [aws-api] UploadProgressLogging optimise imports

* [aws-api] UploadProgressLogging rename variables

* [domain] add Terminal object

* [aws-api] UploadProgressLogging use console width and two lines

- Improved clearing of lines after progress bar
- Use console width for progress bar size

* [aws-lib] S3ClientLogging optimise imports

* [aws-lib] TransferManager clear line before logging

* [aws-lib] rename class as TransferManager

* [aws-lib] rename TransferManger as Uploader to not clash

We are using an AWS SDK class with the same name.
2019-06-10 19:45:36 +01:00
44c66c042c
Include and Exclude behave more like the AWS CLI (#48)
* [domain] rename Filter as Include

* [cli]ParseArgs allow exclude and include parameters to be repeated

* [core] don't include include/exclude details in logging

* [domain] combine Include and Exclude into Filter

Config now collect includes and Excludes into a single list and passed
each file to the Filter.isIncluded method, with the list of Filters,
to determine if a file should be included.
2019-06-08 20:31:20 +01:00
7ffa386b29
[core] MD5HashGenerator uses IO to return where there is file IO (#47)
* [core] MD5HashGenerator uses IO to return where there is file IO

This required that LocalFile in the domain module no longer be
supplied with a function to convert a File into an MD5Hash. Because
such a function requires reading the file it now must use IO, which we
don't allow in the domain module.

Unfortunate ripple effects out to users of MD5HashGenerator and
LocalFile.

* [aws-lib] Add own copy of test class MD5HashData
2019-06-08 18:19:15 +01:00
8ec667343a
Improve Code Quality (#37)
* [core] convert QuoteStripper to an object and move to core

* [aws-lib]S3ClientUploader: use case matching instead of else if blocks

* [aws-lib] put imports at top of file

* [domain] remove redundant braces after class definition

* [aws-lib] remove redundant braces after class definition

* [core] avoid using head on a collection
2019-06-06 19:49:07 +01:00
f54c50aaf3
Split into subprojects (#36)
* [sbt] define existing single module project as legacyRoot

* [sbt] add empty cli module depending on legacyRoot

* [cli] move Main to cli module

* [cli] move ParseArgs to cli module

* [sbt] limit scope of scopt dependency to cli module

* [cli] moved logging config to cli module

* [cli] rename module directory

* [aws-api] added empty module

* [sbt] aggregate builds from cli

* [aws-lib] add empty module

* [core] add empty module

* [sbt] add comment graphing module dependencies

* [sbt] adjust module dependencies to reflect plan

Include legacyRoot at the base until it can be redistributed

* [legacy] make some awssdk classes non-private

during this transition, these classes being private would cause problems

* [aws-lib] create S3ClientBuilder

This is copied from the legacy S3Client companion object

* [domain] add empty module

* [domain] move Bucket into module

* [legacy] RemoteKey no longer has dependency on Config

* [domain] move RemoteKey into module

* [domain] move MD5Hash into module

* [legacy] LocalFile no longer had dependency on MD5HashGenerator

* [domain] move LocalFile into module

* [domain] mode LastModified into module

* [domain] move RemoteMetaData into module

* [domain] move S3MetaData into module

* [domain] move Exclude into module

* [domain] move Filter into module

* [domain] move KeyModified into module

* [domain] move HashModified into module

* [domain] RemoteKey.resolve added

* [domain] add dependency on scalatest

* [domain] LocalFile.resolve added

* [legacy] Remove UnitTest

* [legacy] optimise imports

* [domain] move S3ObjectsData moved into module

* [legacy] wrapper for using GeneralProgressListener

* [domain] move Config into module

* [sbt] move aws-api below legacyRoot in dependencies

This will allow use to move S3Client into the aws-api module

* [legacy] rename S3Client companion as S3ClientBuilder

Preparation to move this into its own file.

* Inject Logger via CLI (#34)

* [S3Client] refactor defaultClient()

* [S3Client] transfermanager explicitly uses the same s3client

* [S3ClientPutObjectUploader] refactor putObjectRequest creation

* [cli] copy in Logging trait as Logger class

* [cli] Main uses Logger

* [cli] simplify Logger and pass to Sync.run

* [legacy] SyncLogging converted to companion

* [cli] Logger info can more easily use levels again

* [legacy] LocalFileStream uses injected info

* [legacy] S3MetaDataEnricher remove unused Logging

* [legacy] ActionGenerator remove unused Logging

* [legacy] convert ActionGenerator to an object

* [legacy] import log methods from SyncLogging

* [legacy] move getS3Status from S3Client to S3MetaDataEnricher

* [legact] convert ActionsSubmitter to an object

* [legacy] convert LocalFileStream to an object

* [legacy] move Action case classes inside companion

* [legacy] move UploadEvent case classes inside companion and rename

* [legacy] move S3Action case classes into companion

* [legacy] convert Sync to an object

* [cli] Logger takes verbosity level at construction

No longer needs to be passed the whole Config implicitly for each info
call.

* [legacy] stop passing implicit Config for logging purposes

Pass a more specific implicit info: Int => String => Unit instead

* [legacy] remove DummyS3Client

* [legacy] remove Logging

* [legacy] convert MD5HashGenerator to an object

* [aws-api] move S3Client into module

* [legacy] convert KeyGenerator to an object

* [legacy] don't use IO.unsafeRunSync directly

* [legacy] refactor/rewrite Sync.run

* [legacy] Rewrite sort using a for-comprehension

* [legacy] Sync inline sorting

* [legacy] SyncLogging rename method

* [legacy] repair tests

* [sbt] move core module to a dependency of legacyRoot

* [sbt] add test dependencies to core module

* [core] move classes into module

* [aws-lib] move classes into module

* [sbt] remove legacy root
2019-06-06 19:24:15 +01:00