90 commits

SHA1 Message Date
4cff0dd0c9 fix up tests to handle new stream return types 2019-05-24 07:52:36 +01:00
fa31882e51 [S3MetaDataEnricher,ActionSubmitter] return streams
Help to perpetuate the map/flatMap structure within for-comprehension
in Sync's run method.

Added DoNothing and DoNothingS3Action
2019-05-24 07:48:43 +01:00
bffc6c032c
Support multiple filters (#18)
* Support multiple filters

* Clean up imports

* [S3ClientLogging] log the remote key value

* Update changelog, readme and long arg name

* [SyncSuite] update test
2019-05-23 19:35:48 +01:00
37ac41093e
Improved S3Client logging (#17)
* [ThorpS3Client] Log event when event actually occurs

* [MD5HashGenerator] log activity reading md5 hash for local files

* [awssdk] Extract logging into S3ClientLogging

* [S3ClientLogging] raise logging levels

* [SyncLogging] Remove per-file logging

* [S3ClientLogging] More readable messages
2019-05-23 18:19:51 +01:00
0fe9b86471
Simple Exclusion Filter (#16)
* [filter] Parse filter from command line and add to config

* [filter] exclude files that match the filter
2019-05-23 09:21:09 +01:00
eacfc37095
Handle renames (#14)
* [sync] move thunks to s3client to bottom of class

Also, use the thunk methods from within run rather than accessing the
s3client object directly.

* Layout tweaks to put each parameter on own line

* [syncsuite] value renames and move sync.run outside it() call

Future tests will be evaluating the result of that call, so this
avoids repeatedly calling it.

* Add first pass at copy methods and some delete stubs

* [Bucket] Convert from type alias for String to a case class

* [SyncSuite] mark new tests as pending

* [RemoteKey] Convert from type alias for String to a case class

* [MD5Hash] Convert from type alias for String to a case class

* [LastModified] Convert from type alias for String to a case class

* [LocalFile] Revert to using a normal File

* [Sync] Use a for-comprehension and restructure S3MetaData

The for-comprehension will make it easier to generate multiple actions
out of the stream of enriched metadata. The restructured S3MetaData
avoids the need to wrap it in an Either in some cases.

* [ToUpload] Add a wrapper to indicate action required on File

* [S3Action] Stub actions for IO events

* [S3Action] Use UploadS3Action

* [Sync] Fix formatting when echoing parameters

* [logging] Change log level down to 4 for listing every file considered

* [Sync] Use a case class to hold counters

* [HashModified] Add case class to replace MD5Hash, LastModified tuples

* [logging] Move file considered logging to source of files

Rather than logging this where adding meta data, move to where the
files are being initially identified.

* [logging] Log all final counters

* Pass Config and HashLookup as implicit parameters

* [LocalFileStream] rename method as findFiles

* [S3MetaDataEnricher] rename method as getMetadata

* Rename selection filter and uploader trait and methods

* [MD5HashGenerator] Extract as trait

* [Action] Convert ToUpload into an Action sealed trait

* [ActionGenerator] refactored and removed logging

* fix up tests

* [LocalFileStream] adjust logging

* [RemoteMetaData] Added

* [ActionGenerator] remove redundant braces

* [LocalFile] Added as wrapper for File

* [Sync] run: remove redundant braces

* [Sync] run: rename HashLookup as S3ObjectsData

* WIP - toward copy action

* Extract S3ObjectsByHash for grouping

* extract internal wrapper for S3CatsIOClient

Remove some boilerplate from the middle of a test

* Explicitly name the Map parameters in expected result

* All lastModified are the same to avoid confusion

We aren't testing this field, just that the keys and hash values are correct.

* Rename variable

* space out object creation

* Fix test - error in expected result

Code has been working for ages!

* [readme] condense and simplify behaviour table, adding option delete

Reduce the complexity by only noting the distinct attributes leading
to each action.

Add the action of delete when a local file is missing.

* [S3MetaDataEnricherSuite] rename tests and note missing tests

* [ActionGeneratorSuite] rename tests and note missing tests

* Note unwritten tests as such

* [ActionGenerator] #2 local exists, remote is missing, other matches

* [S3ClientSuite] fix tests

* [S3MetaDataEnricherSuite] #2a local exists, remote is missing, remote matches, other matches - copy

* [S3MetaDataEnricherSuite] drop 'remote is missing, remote matches'

Impossible to represent this combination

* [S3MetaDataEnricherSuite] #3 local exists, remote is missing, remote no match, other no matches - upload

* [S3MetaDataEnricherSuite] Tests #1-3 rename variables consistently

* [S3MetadataEnricherSuite] #4 local exists, remote exists, remote no match, other matches - copy

* [S3MetadataEnricherSuite] #5 local exists, remote exists, remote no match, other no matches - upload

* [S3MetadataEnricherSuite] drop test #6 - no way to make request

* [ActionGeneratorSuite] standardise tests 2-4

* [ActionGeneratorSuite] #1 local exists, remote exists, remote matches - do nothing

* [ActionGeneratorSuite] Comment expected outcome

* [ActionGeneratorSuite] #5 local exists, remote exists, remote no match, other no matches - upload

* [Action] Add ToDelete case class

* Use ToDelete and fix up return types for DeleteS3Action

* [ActionGenerator] Add explicit case for #1

* [ActionGenerator] Add explicit check for local exists in #2

* [ActionGenerator] match case against #3

* [ActionGenerator] simplify case and match against #5

* [ActionGenerator] Add case for #4

* [ActionGenerator] Remove explicit checks for file existing

If we are called with a LocalFile parameter then we assume the file exists.

* [ActionGenerator] Avoid #1 matching condition #5

* [ActionGeneratorSuite] enable tests

* [test] remove stray println

* [SyncSuite] Add test helper RecordingSync

* [SyncSuite] Use RecordingSync

* [SyncSuite] enable rename test - excluding delete test

* [Sync] log and increment counters for copy and delete

* [Sync] Use case matched RemoteKey in log message

* [Sync] Reorder actions to do copy then upload then delete

* [S3Action] Drop Move as a distinct action

Can be implemented as a Copy followed by a Delete.

* [S3Action] Actions are ordered Copy, Upload then Delete

This allows sequencing of actions so that all the quick to accomplish
copies take place before bandwidth/time costly updates or destructive
deletes. Deletes come last after they have had the opportunity to be
used as the source for any copies.

* [Sync] Use S3Action's default sorting

* [Sync] extract logging of activity

* [SyncLogging] Extract logging out of Sync

Single Responsibility principle - Sync knows nothing about how it
logs, it just delegates to SyncLogging.

* [Sync] Rename variables and extract sort into private def

* [SyncLogging] Use IO context

* [SyncLogging] Remove moved counter

* [SyncLogging] Clean up and log start of run config info

* Verify that IO actions are evaluated before the program terminates

* [Sync] ensure logging runs

* [ActionGenerator] Don't upload files every time

* [ActionGenerator] fix remote hash for #5

* [SyncSuite] Add tests for delete and delete after rename

* [RemoteKey] Add asFile and isMissingLocally helpers

* [Sync] Generate delete actions

* Remove old extensions upon MD5HashGenerator

* [MD5Hash] prevent confusion by never allowing quotes

This means we need to filter quotes from md5hash values at source

* [Sync] ensure start log message is run

* [ThorpS3Client] Fix passing parameters for source key

* [ThorpS3Client] reformat byKey for clarity

* [S3Client] Add level 5 logging around s3 sdk calls

* fix up tests
2019-05-22 13:55:03 +01:00
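The "Handle renames" commit above orders actions as Copy, then Upload, then Delete. A minimal Scala sketch of that idea follows. All names, fields, and the numeric `order` values here are hypothetical and chosen for illustration; the actual definitions in the repository may differ.

```scala
// Hypothetical action hierarchy; not the repository's real definitions.
sealed trait S3Action {
  // Lower values sort first.
  def order: Int
}
final case class CopyS3Action(targetKey: String) extends S3Action { val order = 1 }
final case class UploadS3Action(targetKey: String) extends S3Action { val order = 2 }
final case class DeleteS3Action(targetKey: String) extends S3Action { val order = 3 }

object S3Action {
  // Default ordering: quick copies first, costly uploads next,
  // destructive deletes last so they can still serve as copy sources.
  implicit val ordering: Ordering[S3Action] =
    Ordering.by[S3Action, Int](_.order)
}

object OrderingExample {
  val actions: List[S3Action] = List(
    DeleteS3Action("old/name.txt"),
    UploadS3Action("new/file.txt"),
    CopyS3Action("new/name.txt")
  )

  def main(args: Array[String]): Unit =
    // sorted picks up the implicit Ordering from the S3Action companion
    println(actions.sorted)
}
```

Placing the ordering in the companion object means callers can write `actions.sorted` with no extra imports, which is one way to read "[Sync] Use S3Action's default sorting".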
00743c425c
Add configurable logging levels, selected from command line argument (#12)
* [config,parseargs] Accept v/verbose command line argument

* [parseargs] lowercase program name

* [logging] Log messages based on command line argument

* [readme] update usage
2019-05-16 21:59:40 +01:00
74afb288cc
[localfilestream] Compare test files within a Set (#11)
Fixes #10 

* [localfilestream] Compare test files within a Set

Removes issue of files being read in different orders.

* [localfilestream] add missing parameter type
2019-05-16 19:59:06 +01:00
ed6550e134 [sync] use listObjects and show count of files uploaded at end 2019-05-16 16:09:32 +01:00
74be5ec1ac [awssdk] add listObjects 2019-05-15 07:06:10 +01:00
64bf42921d [awssdk] Typo/rename class Throp* => Thorp* 2019-05-14 20:14:08 +01:00
11cbcb2312 Use logging in place of println 2019-05-11 20:18:55 +01:00
6761fb0ade [log4j] Configure logging level 2019-05-11 19:59:09 +01:00
4689c4537b [tests] move to resources tree 2019-05-11 13:58:02 +01:00
e963827fc5 [resource] test helper for loading resources 2019-05-11 13:58:02 +01:00
359ae1a900 [syncsuite] add tests for run 2019-05-11 07:52:27 +01:00
0931f82414 [sync] remove unneeded for-comprehension 2019-05-11 06:26:23 +01:00
218c0114c2 [syncsuite] improve tests for s3client thunk 2019-05-11 06:24:35 +01:00
5b397ce181 [s3client] upload returns S3's md5hash of the uploaded file 2019-05-10 22:35:01 +01:00
41e38f5cee [s3client] simple test for upload 2019-05-10 19:51:25 +01:00
bc1bffc345 [awssdk] Rewritten and simplified AWS SDK interface 2019-05-10 19:35:35 +01:00
fae876b554 [reactives3clienttest] clarify test descriptions 2019-05-10 18:31:30 +01:00
1abc34f30b [uploadselectionfilter] update call to uploadRequiredFilter 2019-05-10 18:30:52 +01:00
4c1cf89d51 Consistent logging format 2019-05-10 08:52:39 +01:00
27d1d1b99f [s3metadataenricher] remove quotes from remote hash 2019-05-10 08:39:18 +01:00
be266e0d41 [s3uploader] simplify logging 2019-05-10 08:29:59 +01:00
b61499ed8f [reactives3client] remove logging 2019-05-10 08:24:40 +01:00
c93bebb1e5 [reactives3client] handle NoSuchKeyException 2019-05-10 07:57:53 +01:00
f991c2b7b0 [s3uploadersuite] fix syntax 2019-05-10 07:56:30 +01:00
674e88d802 [reactives3clienttest] inline results 2019-05-09 23:09:47 +01:00
7593da9ce7 [reactives3client] [FAILING] when throws NoSuchKey returns None 2019-05-09 23:09:43 +01:00
1944d6620c [reactives3client] objectHead response okay then return Some 2019-05-09 22:50:15 +01:00
3eddc09a20 [catsios3client] Extract as trait 2019-05-09 21:11:46 +01:00
69029730e2 [reactives3client] Remove try with no catches 2019-05-09 18:47:52 +01:00
232ea40be6 [reactives3client] add console logging 2019-05-09 18:46:25 +01:00
e43b7dc0e3 [s3uploader] implement with (basic) test 2019-05-09 18:34:17 +01:00
befd6975fa [s3metadataenricher] rename intermediate function 2019-05-09 18:33:48 +01:00
6af5b8cafc [keygenerator] Extract as a trait 2019-05-09 18:33:29 +01:00
b348f18142 [s3client] Add upload method
Add DummyS3Client to help tests
2019-05-09 18:32:44 +01:00
e8a656ae4c [sync] rename type aliases to MD5Hash and RemoteKey 2019-05-09 18:29:42 +01:00
24b03f959e FIXUP upload filter 2019-05-09 17:40:05 +01:00
ebec3b1564 [s3uploader] Extract as trait 2019-05-09 17:37:06 +01:00
7d688954f7 [uploadselectionfilter] implement with tests 2019-05-09 17:34:17 +01:00
5090a78baa [uploadselectionfilter] Extracted as trait 2019-05-09 16:55:57 +01:00
86a53bf712 [awssdk] Move AWS SDK wrapper and trait to separate package 2019-05-09 11:50:59 +01:00
65c1915d53 [s3metadataenricher] enrich with metadata when remote doesn't exist returns file to upload 2019-05-09 08:48:46 +01:00
e75b2a4892 [s3metadataenricher] enrich with metadata when remote exists returns metadata 2019-05-09 08:45:57 +01:00
df0df49624 [s3metadataenricher] remove putStrLn 2019-05-09 08:29:18 +01:00
4e2729ae26 [reactives3client] handle key not found 2019-05-09 07:41:50 +01:00
7af4004c75 [s3client] objectHead returns an IO[Option[...]]
If the remote file is missing then return None.

S3MetaDataEnricher.enrichWithS3MetaData now returns an IO[Either[File,
S3MetaData]]. If objectHead returns None, then this returns the file,
otherwise, the Some[Hash, LastModified] from objectHead is used to
create the S3MetaData as before.
2019-05-09 07:11:27 +01:00
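The mapping described in commit 7af4004c75, from an optional objectHead result to an Either, can be sketched in Scala as follows. This is a simplified illustration under stated assumptions: the IO wrapper is left out so the example runs with the standard library alone, and the `S3MetaData` fields, the `HeadResult` alias, and the `enrich` name are hypothetical rather than taken from the repository.

```scala
import java.io.File

// Hypothetical shape; the real S3MetaData at this commit may differ.
final case class S3MetaData(localFile: File, remoteHash: String, remoteLastModified: String)

object EnrichSketch {
  // Stand-in for what objectHead yields once its IO has run:
  // None when the remote object is missing, Some((hash, lastModified)) otherwise.
  type HeadResult = Option[(String, String)]

  // Left(file): no remote counterpart, so the bare file is returned.
  // Right(metadata): the remote hash and timestamp are attached to the file.
  def enrich(file: File, head: HeadResult): Either[File, S3MetaData] =
    head match {
      case None                       => Left(file)
      case Some((hash, lastModified)) => Right(S3MetaData(file, hash, lastModified))
    }

  def main(args: Array[String]): Unit = {
    val file = new File("example.txt")
    println(enrich(file, None))
    println(enrich(file, Some(("abc123", "2019-05-09T07:11:27Z"))))
  }
}
```

In the commit itself both steps are described as running inside IO; the pattern match above would sit inside a `map` over the objectHead result.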