Example App - Reinvent Cloud Drive#
There are many cloud drive products on the market, such as Google Drive, Microsoft OneDrive, and Dropbox. With AWS S3 and s3pathlib, you can build your own cloud drive product.
The two primary S3 features used are Bucket Versioning and Object Lifecycle Management. Bucket versioning preserves every historical version of a file and protects against accidental deletion; lifecycle management automatically expires older versions so the version history does not grow without bound.
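Neither feature is enabled on a bucket by default. As a hedged sketch of the one-time setup (the bucket name comes from this example; the `configure_bucket` helper and the 90-day expiration window are my own assumptions, not part of s3pathlib), it could look like this with a boto3 client:

```python
# Hypothetical one-time bucket setup; a boto3 S3 client is passed in by the caller.
BUCKET = "s3pathlib-versioning-enabled"  # bucket name used throughout this example

# Lifecycle rule: permanently expire noncurrent (older) versions after 90 days,
# so storage cost for the version history stays bounded.
LIFECYCLE_CONFIG = {
    "Rules": [
        {
            "ID": "expire-old-versions",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # apply to every key in the bucket
            "NoncurrentVersionExpiration": {"NoncurrentDays": 90},
        }
    ]
}

def configure_bucket(s3_client) -> None:
    # keep every historical version of every object
    s3_client.put_bucket_versioning(
        Bucket=BUCKET,
        VersioningConfiguration={"Status": "Enabled"},
    )
    # auto-expire old versions according to the rule above
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=BUCKET,
        LifecycleConfiguration=LIFECYCLE_CONFIG,
    )
```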
First, assume you have a customer, John, who uses your cloud drive product. He wants to sync the /Users/John/cloud-drive/ folder on his laptop to s3://s3pathlib-versioning-enabled/cloud-drive/John/ on S3.
[62]:
from pathlib_mate import Path
from s3pathlib import S3Path

# local folder on the laptop
dir_root = Path("cloud-drive")
# the S3 folder backing John's cloud drive
s3dir_root = S3Path("s3://s3pathlib-versioning-enabled/cloud-drive/John/")

# start from a clean state: remove any leftovers from previous runs
_ = dir_root.remove_if_exists()
_ = s3dir_root.delete(is_hard_delete=True)

dir_root.mkdir_if_not_exists()
s3dir_root.mkdir(exist_ok=True)
Local File Change Handler#
The user wants changes on the local file system synced to the cloud drive. We define a function to handle the local-file-change event: it uploads the file to S3 (which creates a new object version) and returns the S3 path.
[63]:
def handle_local_file_change_event(path: Path) -> S3Path:
    # map the local path to its location under the S3 root
    s3path = s3dir_root.joinpath(str(path.relative_to(dir_root)))
    # upload the content; on a versioning-enabled bucket this creates a new version
    s3path.write_bytes(path.read_bytes())
    return s3path
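The notebook does not show how local change events are detected in the first place. One hypothetical approach (the function name and the polling strategy are my own, not part of s3pathlib; a real client would use OS file-system events) is a naive mtime-based poller that invokes a handler like the one above whenever a file appears or changes:

```python
import time
from pathlib import Path
from typing import Callable, Dict, Optional

def poll_local_changes(
    root: Path,
    on_change: Callable[[Path], None],
    interval: float = 1.0,
    max_rounds: Optional[int] = None,
) -> None:
    """Naive change detector: fire on_change for any new or modified file.

    Illustrative only; max_rounds=None polls forever.
    """
    seen: Dict[Path, float] = {}  # path -> last observed mtime
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                mtime = path.stat().st_mtime
                if seen.get(path) != mtime:
                    # new file, or content changed since the last round
                    seen[path] = mtime
                    on_change(path)
        rounds += 1
        if max_rounds is None or rounds < max_rounds:
            time.sleep(interval)
```

In the cloud-drive setting, `on_change` would be `handle_local_file_change_event`.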
[64]:
# the user creates the first version of doc.txt
path_v1 = dir_root.joinpath("my-documents/doc.txt")
path_v1.parent.mkdir_if_not_exists()
path_v1.write_text("v1")

# invoke the handler
s3path_v1 = handle_local_file_change_event(path_v1)
[65]:
# validate the content of the file on S3
s3path_v1.read_text()
[65]:
'v1'
[66]:
for p in s3path_v1.list_object_versions().all():
    print(f"version_id = {p.version_id}, content = {p.read_text(version_id=p.version_id)!r}")
version_id = xMqg9Jp5l_PhcU4..gcYeGgJbVW6A4wB, content = 'v1'
S3 File Change Handler#
The user also wants changes made on the cloud drive synced back to the local file system.
[67]:
def handle_s3_file_change_event(s3path: S3Path) -> Path:
    # map the S3 path back to its location under the local root
    path = dir_root.joinpath(s3path.relative_to(s3dir_root).key)
    path.parent.mkdir_if_not_exists()
    # download the exact version that triggered the event
    path.write_bytes(s3path.read_bytes(version_id=s3path.version_id))
    return path
[68]:
# put a new version of doc.txt on S3
s3path_v2 = s3path_v1.write_text("v2")

# invoke the handler
path_v2 = handle_s3_file_change_event(s3path_v2)
[69]:
# validate the content of the file on your laptop
path_v2.read_text()
[69]:
'v2'
Delete and Recover#
With S3 bucket versioning, a deletion does not destroy data: S3 simply places a delete marker on top of the latest version. If the user accidentally deletes a file on their laptop, S3 still holds a copy marked as “deleted”; if the user deletes a file on the cloud drive, S3 likewise only adds a delete marker. The user can always use a “Recently deleted” feature to recover the file; on the S3 side, this is just a list_object_versions API call to retrieve a historical version.
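As a sketch of the recovery step (the helper name and the sample data are my own; the dict shape follows the S3 ListObjectVersions response, which reports delete markers under "DeleteMarkers" rather than "Versions"), recovering a file amounts to picking the newest real version of the key:

```python
from datetime import datetime, timezone
from typing import Optional

def find_recoverable_version(response: dict, key: str) -> Optional[dict]:
    """Return the newest real version of `key`, ignoring delete markers.

    `response` mimics the shape of the ListObjectVersions API response;
    delete markers live under "DeleteMarkers", so they are skipped naturally.
    """
    versions = [v for v in response.get("Versions", []) if v["Key"] == key]
    if not versions:
        return None
    return max(versions, key=lambda v: v["LastModified"])

# Hypothetical response after doc.txt was overwritten once and then "deleted":
response = {
    "Versions": [
        {"Key": "cloud-drive/John/my-documents/doc.txt", "VersionId": "v1-id",
         "LastModified": datetime(2023, 1, 1, tzinfo=timezone.utc)},
        {"Key": "cloud-drive/John/my-documents/doc.txt", "VersionId": "v2-id",
         "LastModified": datetime(2023, 1, 2, tzinfo=timezone.utc)},
    ],
    "DeleteMarkers": [
        {"Key": "cloud-drive/John/my-documents/doc.txt", "IsLatest": True,
         "LastModified": datetime(2023, 1, 3, tzinfo=timezone.utc)},
    ],
}

recovered = find_recoverable_version(response, "cloud-drive/John/my-documents/doc.txt")
```

The recovered `VersionId` can then be passed to `read_bytes(version_id=...)` to restore the file content locally.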