* s3thorp
Synchronisation of files with S3 using the hash of the file contents.

Based on Alex Kudlick's JavaScript implementation ~aws-s3-sync-by-hash~.

The normal ~aws s3 sync ...~ command only uses the time stamp of files
to decide which files need to be copied. This utility looks at the md5
hash of the file contents instead.
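The hash comparison at the heart of the tool can be sketched in plain Scala. ~md5File~ and ~needsUpload~ are illustrative names for this sketch, not the project's actual API:

```scala
import java.security.MessageDigest
import java.nio.file.{Files, Path}

// Hex-encoded MD5 of a file's contents.
def md5File(file: Path): String =
  MessageDigest.getInstance("MD5")
    .digest(Files.readAllBytes(file))
    .map(b => f"$b%02x")
    .mkString

// A file needs copying when its local hash differs from the hash
// recorded against the S3 object, or when a force flag is set.
def needsUpload(localHash: String, remoteHash: String, force: Boolean): Boolean =
  force || localHash != remoteHash
```

Comparing hashes rather than timestamps means an untouched file that was merely re-saved (new timestamp, same bytes) is not re-uploaded.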
** How does aws-s3-sync-by-hash do it?
The following is a rough, first-draft, pseudo-Scala impression of the process.
*** constructor

#+BEGIN_SRC scala
val options = // load command line arguments and AWS security keys
#+END_SRC
*** def sync(): Promise[Upload]

#+BEGIN_SRC scala
val uploadPromise = createUploadPromise()
if options contains delete
then createDeletePromise()
else return uploadPromise
#+END_SRC
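The control flow above can be sketched with plain functions standing in for the promise-returning methods; all names here are illustrative, not the project's API:

```scala
// Result of one sync pass: which keys were uploaded and deleted.
final case class SyncResult(uploaded: List[String], deleted: List[String])

// Always run the upload pass; run the delete pass only when the
// delete option was given on the command line.
def sync(deleteEnabled: Boolean,
         upload: () => List[String],
         delete: () => List[String]): SyncResult = {
  val uploaded = upload()
  val deleted  = if (deleteEnabled) delete() else Nil
  SyncResult(uploaded, deleted)
}
```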
*** def createUploadPromise(): Promise[Upload]

#+BEGIN_SRC scala
readdir(options(root))
loadS3MetaData
filterByHash
uploadFile
callback(file => uploadedFiles + file)
#+END_SRC
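The ~readdir~ step above amounts to walking the local root and collecting regular files. A minimal sketch, assuming Scala 2.13+ for ~scala.jdk.CollectionConverters~ (the function name is illustrative):

```scala
import java.nio.file.{Files, Path}
import scala.jdk.CollectionConverters._

// Recursively collect the regular files under a root directory,
// closing the underlying stream when done.
def localFiles(root: Path): List[Path] = {
  val stream = Files.walk(root)
  try stream.iterator.asScala.filter(p => Files.isRegularFile(p)).toList
  finally stream.close()
}
```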
*** def loadS3MetaData: Stream[S3MetaData]

#+BEGIN_SRC scala
HEAD(bucket, key) map (metadata =>
  S3MetaData(localFile, bucket, key, metadata.hash, metadata.lastModified))
#+END_SRC
*** def filterByHash(p: S3MetaData => Boolean): Stream[S3MetaData]

#+BEGIN_SRC scala
md5File(localFile)
filter(localHash => options.force || localHash != metadataHash)
#+END_SRC
*** def uploadFile(upload: Upload): IO[Unit]

#+BEGIN_SRC scala
S3Upload(bucket, key, localFile)
#+END_SRC
*** def createDeletePromise(): Promise[Delete]

#+BEGIN_SRC scala
S3AllKeys(bucket, key) filter (remoteKey => localFileExists(remoteKey).negate)
#+END_SRC
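The delete pass selects remote keys that no longer have a corresponding local file. Reduced to sets of keys, it can be sketched as (names are illustrative):

```scala
// A remote key is a deletion candidate when it does not appear among
// the keys derived from the local files.
def keysToDelete(remoteKeys: Set[String], localKeys: Set[String]): Set[String] =
  remoteKeys -- localKeys
```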
*** def deleteFile(delete: Delete): IO[Unit]

#+BEGIN_SRC scala
S3Delete(bucket, key, remoteKey)
#+END_SRC