Example App - Reinvent Cloud Drive#

There are many cloud drive products on the market, such as Google Drive, Microsoft OneDrive, Dropbox, and more. With AWS S3 and s3pathlib, you can build your own cloud drive product.

The primary technologies used are S3 bucket versioning and object lifecycle management. Bucket versioning preserves every historical version of a file and protects against accidental deletion, while lifecycle management automatically expires versions that are no longer current.
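This example assumes the bucket already has versioning enabled and a lifecycle rule that cleans up old versions. A minimal sketch of that bucket configuration using boto3 (the rule ID and the 30-day retention window are illustrative assumptions, not values mandated by this example):

```python
# Lifecycle rule that permanently removes noncurrent versions after
# 30 days (the retention window is an arbitrary choice for illustration).
lifecycle_config = {
    "Rules": [
        {
            "ID": "expire-old-versions",
            "Status": "Enabled",
            "Filter": {"Prefix": "cloud-drive/"},
            "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
        }
    ]
}


def setup_bucket(bucket: str) -> None:
    """Enable versioning and attach the lifecycle rule to the bucket."""
    import boto3  # imported lazily so the sketch stays importable offline

    s3 = boto3.client("s3")
    s3.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=lifecycle_config,
    )
```

With this in place, old file versions age out on their own instead of accumulating storage costs forever.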

First, let’s assume you have a customer, John, using your cloud drive product. He wants to sync /Users/John/cloud-drive/ on his laptop to s3://s3pathlib-versioning-enabled/cloud-drive/John/ on S3.

[62]:
from pathlib_mate import Path
from s3pathlib import S3Path

dir_root = Path("cloud-drive")
dir_root.mkdir_if_not_exists()
s3dir_root = S3Path("s3://s3pathlib-versioning-enabled/cloud-drive/John/")
s3dir_root.mkdir(exist_ok=True)

_ = dir_root.remove_if_exists()
_ = s3dir_root.delete(is_hard_delete=True)

Local File Change Handler#

The user wants changes on the local file system to be synced to the cloud drive. Here we define a function to handle a local file change event: it uploads the file to S3, which creates a new version, and returns the S3 path.

[63]:
def handle_local_file_change_event(path: Path):
    s3path = s3dir_root.joinpath(str(path.relative_to(dir_root)))
    s3path.write_bytes(path.read_bytes())
    return s3path
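The core of the handler is re-rooting the file’s path from the local sync root onto the S3 prefix. That mapping can be sketched with the standard library alone (`local_to_s3_uri` is an illustrative helper, not part of s3pathlib; the paths mirror the example above):

```python
from pathlib import PurePosixPath


def local_to_s3_uri(path: PurePosixPath, local_root: PurePosixPath, s3_root: str) -> str:
    """Map a file under the local sync root to its S3 URI by
    taking its path relative to the root and re-rooting it
    under the S3 prefix."""
    rel = path.relative_to(local_root).as_posix()
    return s3_root.rstrip("/") + "/" + rel


uri = local_to_s3_uri(
    PurePosixPath("/Users/John/cloud-drive/my-documents/doc.txt"),
    PurePosixPath("/Users/John/cloud-drive"),
    "s3://s3pathlib-versioning-enabled/cloud-drive/John/",
)
# uri == "s3://s3pathlib-versioning-enabled/cloud-drive/John/my-documents/doc.txt"
```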
[64]:
# user creates the first version of doc.txt
path_v1 = dir_root.joinpath("my-documents/doc.txt")
path_v1.parent.mkdir_if_not_exists()
path_v1.write_text("v1")

# invoke the handler
s3path_v1 = handle_local_file_change_event(path_v1)
[65]:
# you can validate the content of the file on S3
s3path_v1.read_text()
[65]:
'v1'
[66]:
for p in s3path_v1.list_object_versions().all():
    print(f"version_id = {p.version_id}, content = {p.read_text(version_id=p.version_id)!r}")
version_id = xMqg9Jp5l_PhcU4..gcYeGgJbVW6A4wB, content = 'v1'

S3 File Change Handler#

The user also wants changes made on the cloud drive to be synced back to the local file system.

[67]:
def handle_s3_file_change_event(s3path: S3Path):
    path = dir_root.joinpath(s3path.relative_to(s3dir_root).key)
    path.parent.mkdir_if_not_exists()
    path.write_bytes(s3path.read_bytes(version_id=s3path.version_id))
    return path
[68]:
# put a new version of doc.txt on S3
s3path_v2 = s3path_v1.write_text("v2")

# invoke the handler
path_v2 = handle_s3_file_change_event(s3path_v2)
[69]:
# you can validate the content of the file on your laptop
path_v2.read_text()
[69]:
'v2'

Delete and Recover#

With S3 bucket versioning, a deletion only places a delete marker on top of the latest version. If the user accidentally deletes a file on their laptop, S3 still has the copy, merely marked as “deleted”. Likewise, if the user accidentally deletes a file on the cloud drive, S3 only adds a delete marker. The user can always use a “Recently deleted” feature to recover the file; on the S3 side, this is just a list_object_versions API call to retrieve a historical version.
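One way to implement such a recovery is to delete the delete marker itself: in a versioned bucket, removing the latest delete marker by its version id makes the previous version current again. A sketch against the boto3 API (the helper name is illustrative):

```python
def recover_latest_delete(bucket: str, key: str) -> bool:
    """Undo an accidental delete by removing the newest delete marker.

    In a versioned bucket, delete_object without a VersionId only adds
    a delete marker; deleting that marker *with* its VersionId restores
    the previous version as the latest.
    """
    import boto3  # imported lazily so the sketch stays importable offline

    s3 = boto3.client("s3")
    resp = s3.list_object_versions(Bucket=bucket, Prefix=key)
    markers = [
        m
        for m in resp.get("DeleteMarkers", [])
        if m["Key"] == key and m["IsLatest"]
    ]
    if not markers:
        return False  # object is not currently deleted; nothing to recover
    s3.delete_object(Bucket=bucket, Key=key, VersionId=markers[0]["VersionId"])
    return True
```

A “Recently deleted” page can then be built by listing the keys whose latest entry is a delete marker and calling this helper when the user clicks “restore”.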