* s3thorp
Synchronisation of files with S3 using the hash of the file contents.
Based on Alex Kudlick's JavaScript implementation [[https://github.com/akud/aws-s3-sync-by-hash][aws-s3-sync-by-hash]].
The normal ~aws s3 sync ...~ command only compares the size and time
stamp of files to decide which files need to be copied. This utility
compares the MD5 hash of the file contents instead.
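
As a minimal sketch of the hashing step, the following computes a hex-encoded MD5 digest using only the JDK. The names ~Md5Sketch~ and ~md5Hex~ are hypothetical and chosen for illustration; ~md5File~ borrows its name from the outline of the JavaScript process, and none of this is taken from the actual implementation.

#+begin_src scala
import java.nio.file.{Files, Path}
import java.security.MessageDigest

object Md5Sketch {
  /** Hex-encoded MD5 digest of a byte array. */
  def md5Hex(bytes: Array[Byte]): String =
    MessageDigest
      .getInstance("MD5")
      .digest(bytes)
      .map(b => "%02x".format(b & 0xff)) // mask to avoid sign extension
      .mkString

  /** Hex-encoded MD5 digest of a file's contents.
    * Reads the whole file into memory, which is fine for a sketch
    * but not for very large files. */
  def md5File(path: Path): String = md5Hex(Files.readAllBytes(path))
}
#+end_src

The comparison is only meaningful where the remote hash is also an MD5 of the contents. The S3 ETag, for example, is not a plain MD5 for multipart uploads.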
* How does aws-s3-sync-by-hash do it?
The following is a rough, first-draft, pseudo-Scala impression of the process.
** constructor
val options = Load command line arguments and AWS security keys.
** def sync(): Promise[Upload]
val uploadPromise = createUploadPromise()
if options contains delete then createDeletePromise()
else return uploadPromise
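
One way the ~sync~ step could look in real Scala is sketched below, with ~Future~ standing in for the JavaScript promise. The types ~Uploaded~, ~Deleted~ and ~SyncResult~ are hypothetical, and the sketch assumes that deletions only start once the uploads have completed, which the outline above does not state.

#+begin_src scala
import scala.concurrent.{ExecutionContext, Future}

object SyncSketch {
  // Hypothetical result types standing in for Upload and Delete.
  final case class Uploaded(keys: List[String])
  final case class Deleted(keys: List[String])
  final case class SyncResult(uploaded: Uploaded, deleted: Option[Deleted])

  /** Run the upload step, then the delete step only when `delete` is set.
    * Assumption: deletions start after the uploads have completed. */
  def sync(delete: Boolean)(
      upload: () => Future[Uploaded],
      remove: () => Future[Deleted]
  )(implicit ec: ExecutionContext): Future[SyncResult] =
    upload().flatMap { uploaded =>
      if (delete) remove().map(d => SyncResult(uploaded, Some(d)))
      else Future.successful(SyncResult(uploaded, None))
    }
}
#+end_src

Passing the two steps in as functions keeps the sketch independent of any S3 client, so it can be exercised with stubbed futures.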
** def createUploadPromise(): Promise[Upload]
readdir(options(root))
loadS3MetaData
filterByHash
uploadFile
callback(file => uploadedFiles += file)
** def loadS3MetaData: Stream[S3MetaData]
HEAD(bucket, key)
map (metadata => S3MetaData(localFile, bucket, key, metadata.hash, metadata.lastModified))
** def filterByHash(p: S3MetaData => Boolean): Stream[S3MetaData]
md5File(localFile)
filter(localHash => options.force || localHash != metadataHash)
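
The filter above can be sketched as a pure function. The trimmed-down ~S3MetaData~ case class, the ~remoteHash~ field and the ~localHash~ parameter are assumptions made for illustration. Treating a missing remote hash as "needs upload" is also an assumption.

#+begin_src scala
object FilterByHashSketch {
  // Hypothetical, trimmed-down version of the S3MetaData record.
  // remoteHash is None when the object does not exist remotely.
  final case class S3MetaData(
      localFile: String,
      key: String,
      remoteHash: Option[String]
  )

  /** Keep the entries that need uploading: every entry when `force` is set,
    * otherwise only those whose local hash differs from the remote hash.
    * A missing remote hash counts as a difference. */
  def filterByHash(force: Boolean, localHash: String => String)(
      entries: List[S3MetaData]
  ): List[S3MetaData] =
    entries.filter { e =>
      force || !e.remoteHash.contains(localHash(e.localFile))
    }
}
#+end_src

Taking ~localHash~ as a parameter means the hashing function (for instance an MD5 of the file contents) can be swapped for a lookup table when testing.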
** def uploadFile(upload: Upload): IO[Unit]
S3Upload(bucket, key, localFile)
** def createDeletePromise(): Promise[Delete]
S3AllKeys(bucket, key)
filter(remoteKey => !localFileExists(remoteKey))
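
A minimal sketch of selecting the keys to delete follows, assuming each remote key maps to a local path by resolving it against the sync root. The names ~DeleteSketch~ and ~keysToDelete~ are hypothetical, and listing the remote keys is left to the caller.

#+begin_src scala
import java.nio.file.{Files, Path}

object DeleteSketch {
  /** Remote keys that have no matching regular file under `root`.
    * Assumption: a key maps to a local path via root.resolve(key). */
  def keysToDelete(root: Path, remoteKeys: List[String]): List[String] =
    remoteKeys.filterNot(key => Files.isRegularFile(root.resolve(key)))
}
#+end_src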
** def deleteFile(delete: Delete): IO[Unit]
S3Delete(bucket, key, remoteKey)