utils
- s3pathlib.utils.split_s3_uri(s3_uri: str) -> Tuple[str, str]
Split an AWS S3 URI and return the bucket and key.
- Parameters:
s3_uri – example: "s3://my-bucket/my-folder/data.json"
Added in version 1.0.1.
- s3pathlib.utils.join_s3_uri(bucket: str, key: str) -> str
Build an AWS S3 URI from a bucket and a key.
- Parameters:
bucket – example: "my-bucket"
key – example: "my-folder/data.json" or "my-folder/"
Added in version 1.0.1.
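The round trip between a URI and its bucket/key pair can be illustrated with a minimal sketch. The two functions below are hypothetical stand-ins written for this example, not the library's source, and they assume the URI always starts with "s3://".

```python
from typing import Tuple


def split_s3_uri_sketch(s3_uri: str) -> Tuple[str, str]:
    """Hypothetical stand-in: split "s3://bucket/key" into (bucket, key)."""
    prefix = "s3://"
    if not s3_uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {s3_uri!r}")
    # Everything up to the first "/" after the scheme is the bucket;
    # the remainder (possibly empty) is the key.
    bucket, _, key = s3_uri[len(prefix):].partition("/")
    return bucket, key


def join_s3_uri_sketch(bucket: str, key: str) -> str:
    """Hypothetical stand-in: build "s3://bucket/key" from its two pieces."""
    return f"s3://{bucket}/{key}"


bucket, key = split_s3_uri_sketch("s3://my-bucket/my-folder/data.json")
print(bucket, key)
print(join_s3_uri_sketch(bucket, key))
```

With these definitions, splitting a URI and joining the result gives back the original string.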
- s3pathlib.utils.split_parts(key: str) -> List[str]
Split an S3 key into parts using the “/” delimiter.
Example:
>>> split_parts("a/b/c")
["a", "b", "c"]
>>> split_parts("//a//b//c//")
["a", "b", "c"]
Added in version 1.0.1.
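The behavior shown in the example (empty segments produced by leading, trailing, or repeated "/" are dropped) can be reproduced with a short sketch. The function name is hypothetical and this is not necessarily how the library implements it.

```python
from typing import List


def split_parts_sketch(key: str) -> List[str]:
    """Hypothetical stand-in: split on "/" and discard empty segments."""
    return [part for part in key.split("/") if part]


print(split_parts_sketch("a/b/c"))
print(split_parts_sketch("//a//b//c//"))
```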
- s3pathlib.utils.smart_join_s3_key(parts: List[str], is_dir: bool) -> str
Note: it assumes that there is no double slash in your path. It ensures that “/” never appears twice in a row in the S3 key.
- Parameters:
parts – list of S3 key path parts; each part may contain “/”
is_dir – if True, the S3 key ends with “/”; otherwise no trailing “/” is allowed.
Example:
>>> smart_join_s3_key(parts=["/a/", "b/", "/c"], is_dir=True)
a/b/c/
>>> smart_join_s3_key(parts=["/a/", "b/", "/c"], is_dir=False)
a/b/c
Added in version 1.0.1.
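One way to get the documented results is to break every part on "/", drop the empty pieces, and re-join with a single "/". The sketch below does that under a hypothetical name. How an empty parts list should behave is not stated above, so returning an empty string in that case is an assumption of this sketch.

```python
from typing import List


def smart_join_s3_key_sketch(parts: List[str], is_dir: bool) -> str:
    """Hypothetical stand-in: join key parts with exactly one "/" between them."""
    pieces: List[str] = []
    for part in parts:
        # Splitting and filtering removes leading, trailing and repeated "/".
        pieces.extend(piece for piece in part.split("/") if piece)
    key = "/".join(pieces)
    # Assumption of this sketch: an empty key stays empty even when is_dir=True.
    if is_dir and key:
        key += "/"
    return key


print(smart_join_s3_key_sketch(["/a/", "b/", "/c"], is_dir=True))
print(smart_join_s3_key_sketch(["/a/", "b/", "/c"], is_dir=False))
```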
- s3pathlib.utils.make_s3_console_url(bucket: str | None = None, prefix: str | None = None, s3_uri: str | None = None, version_id: str | None = None, aws_region: str | None = None, is_us_gov_cloud: bool = False) -> str
Return an AWS Console URL that you can open in your browser.
- Parameters:
bucket – example: "my-bucket"
prefix – example: "my-folder/"
s3_uri – example: "s3://my-bucket/my-folder/data.json"
Example:
>>> make_s3_console_url(s3_uri="s3://my-bucket/my-folder/data.json")
https://s3.console.aws.amazon.com/s3/object/my-bucket?prefix=my-folder/data.json
Added in version 1.0.1.
Changed in version 2.0.1: add version_id parameter.
Changed in version 2.2.2.
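The documented example output can be reproduced for the single-object case with a few lines of string handling. This is only a sketch under a hypothetical name: it covers the s3_uri form shown in the example and makes no attempt to model directories, version_id, aws_region, or is_us_gov_cloud, whose URL formats are not described here.

```python
def object_console_url_sketch(s3_uri: str) -> str:
    """Hypothetical stand-in covering only the documented object example."""
    prefix = "s3://"
    if not s3_uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {s3_uri!r}")
    bucket, _, key = s3_uri[len(prefix):].partition("/")
    # URL shape copied from the documented example output.
    return f"https://s3.console.aws.amazon.com/s3/object/{bucket}?prefix={key}"


print(object_console_url_sketch("s3://my-bucket/my-folder/data.json"))
```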
- s3pathlib.utils.ensure_s3_object(s3_key_or_uri: str) -> None
Raise an exception if the string is not in a valid format for an AWS S3 object.
Added in version 1.0.1.
- s3pathlib.utils.ensure_s3_dir(s3_key_or_uri: str) -> None
Raise an exception if the string is not in a valid format for an AWS S3 directory.
Added in version 1.0.1.
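The exact checks are not spelled out above. A plausible reading, based on the is_dir convention of smart_join_s3_key (a directory key ends with "/", an object key does not), is sketched below. Both the function names and the choice of ValueError are assumptions made for this example.

```python
def ensure_s3_object_sketch(s3_key_or_uri: str) -> None:
    """Hypothetical stand-in: an object key or URI must not end with "/"."""
    if s3_key_or_uri.endswith("/"):
        raise ValueError(f"{s3_key_or_uri!r} does not look like an S3 object")


def ensure_s3_dir_sketch(s3_key_or_uri: str) -> None:
    """Hypothetical stand-in: a directory key or URI must end with "/"."""
    if not s3_key_or_uri.endswith("/"):
        raise ValueError(f"{s3_key_or_uri!r} does not look like an S3 directory")


ensure_s3_object_sketch("s3://my-bucket/my-folder/data.json")  # passes silently
ensure_s3_dir_sketch("s3://my-bucket/my-folder/")  # passes silently
```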
- s3pathlib.utils.validate_s3_bucket(bucket)
Ref: https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html
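To give a feel for what the referenced naming rules involve, here is a sketch that checks a subset of them for general purpose buckets: length between 3 and 63 characters, only lowercase letters, digits, dots and hyphens, a letter or digit at both ends, no two adjacent dots, and no IP-address-like names. The function name and the use of ValueError are assumptions; the library's own checks and return behavior may differ, and the full rule list in the AWS guide is longer.

```python
import re

# Lowercase letters, digits, "." and "-"; must start and end with a letter
# or digit; total length 3 to 63 characters.
_BUCKET_PATTERN = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
_IP_LIKE_PATTERN = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def validate_s3_bucket_sketch(bucket: str) -> None:
    """Hypothetical stand-in: check a subset of the AWS bucket naming rules."""
    if _BUCKET_PATTERN.fullmatch(bucket) is None:
        raise ValueError(f"invalid characters or length in bucket name: {bucket!r}")
    if ".." in bucket:
        raise ValueError(f"bucket name has two adjacent periods: {bucket!r}")
    if _IP_LIKE_PATTERN.fullmatch(bucket) is not None:
        raise ValueError(f"bucket name is formatted like an IP address: {bucket!r}")


validate_s3_bucket_sketch("my-bucket")  # passes silently
```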
- s3pathlib.utils.validate_s3_key(key)
Ref: https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html#object-key-guidelines
- s3pathlib.utils.repr_data_size(size_in_bytes: int, precision: int = 2) -> str
Return a human-readable string representation of a file size. Sizes greater than 1 YB are not supported.
For example:
100 bytes => 100 B
100,000 bytes => 97.66 KB
100,000,000 bytes => 95.37 MB
100,000,000,000 bytes => 93.13 GB
100,000,000,000,000 bytes => 90.95 TB
100,000,000,000,000,000 bytes => 88.82 PB
and more …
Magnitude of data:
1000 kB kilobyte
1000 ** 2 MB megabyte
1000 ** 3 GB gigabyte
1000 ** 4 TB terabyte
1000 ** 5 PB petabyte
1000 ** 6 EB exabyte
1000 ** 7 ZB zettabyte
1000 ** 8 YB yottabyte
Added in version 1.0.1.
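The conversion examples listed above are consistent with dividing by 1024 at each step (for instance, 100,000 / 1024 ≈ 97.66), so the sketch below uses a factor of 1024. It is a hypothetical re-implementation meant only to show the arithmetic, not the library's code.

```python
def repr_data_size_sketch(size_in_bytes: int, precision: int = 2) -> str:
    """Hypothetical stand-in: format a byte count with a 1024-based unit."""
    if size_in_bytes < 1024:
        return f"{size_in_bytes} B"
    size = float(size_in_bytes)
    unit = "B"
    for unit in ["KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]:
        size /= 1024
        # Stop at the first unit where the number drops below 1024.
        if size < 1024:
            break
    return f"{size:.{precision}f} {unit}"


for n in (100, 100_000, 100_000_000, 100_000_000_000):
    print(n, "=>", repr_data_size_sketch(n))
```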
- s3pathlib.utils.parse_data_size(s) -> int
Parse a human-readable string representing a file size. Sizes greater than 1 YB are not supported.
Examples:
>>> parse_data_size("3.43 MB")
3596615
>>> parse_data_size("2_512.4 MB")
2634442342
>>> parse_data_size("2,512.4 MB")
2634442342
Added in version 1.0.5.
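The example results match a 1024-based conversion truncated to an integer (3.43 × 1024² = 3596615.68). A minimal sketch of such a parser follows. The function name, the accepted unit spellings, and the error type are assumptions of this example rather than documented library behavior.

```python
import re

# Power of 1024 associated with each unit accepted by this sketch.
_UNIT_POWERS = {
    "B": 0, "KB": 1, "MB": 2, "GB": 3, "TB": 4,
    "PB": 5, "EB": 6, "ZB": 7, "YB": 8,
}


def parse_data_size_sketch(s: str) -> int:
    """Hypothetical stand-in: turn a string like "3.43 MB" into a byte count."""
    # Drop the digit separators that appear in the documented examples.
    cleaned = s.replace("_", "").replace(",", "").strip()
    match = re.fullmatch(r"([0-9]*\.?[0-9]+)\s*([A-Za-z]+)", cleaned)
    if match is None or match.group(2).upper() not in _UNIT_POWERS:
        raise ValueError(f"cannot parse data size: {s!r}")
    number, unit = match.groups()
    # int() truncates the fractional byte, matching the documented results.
    return int(float(number) * 1024 ** _UNIT_POWERS[unit.upper()])


print(parse_data_size_sketch("3.43 MB"))
print(parse_data_size_sketch("2,512.4 MB"))
```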
- s3pathlib.utils.hash_binary(b: bytes, hash_meth: callable) -> str
Get the hash of a binary object.
- Parameters:
b – binary object
hash_meth – callable hash method, example: hashlib.md5
- Returns:
hash value in hex digits.
Added in version 1.0.1.
- s3pathlib.utils.md5_binary(b: bytes) -> str
Get the md5 hash of a binary object.
- Parameters:
b – binary object
- Returns:
hash value in hex digits.
Added in version 1.0.1.
- s3pathlib.utils.sha256_binary(b: bytes) -> str
Get the sha256 hash of a binary object.
- Parameters:
b – binary object
- Returns:
hash value in hex digits.
Added in version 1.0.1.
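The three binary hashing helpers above can be pictured as thin wrappers around the standard hashlib module. The sketch below uses hypothetical names and assumes hash_meth is a hashlib constructor such as hashlib.md5, as the parameter description suggests.

```python
import hashlib
from typing import Callable


def hash_binary_sketch(b: bytes, hash_meth: Callable) -> str:
    """Hypothetical stand-in: hash bytes and return the hex digest."""
    hasher = hash_meth()
    hasher.update(b)
    return hasher.hexdigest()


def md5_binary_sketch(b: bytes) -> str:
    return hash_binary_sketch(b, hashlib.md5)


def sha256_binary_sketch(b: bytes) -> str:
    return hash_binary_sketch(b, hashlib.sha256)


print(md5_binary_sketch(b"hello"))
print(sha256_binary_sketch(b"hello"))
```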
- s3pathlib.utils.hash_file(abspath: str, hash_meth: callable, nbytes: int = 0, chunk_size: int = 64) -> str
Get the hash of a file on local drive.
- Parameters:
abspath – absolute path of the file
hash_meth – callable hash method, example: hashlib.md5
nbytes – only hash the first nbytes of the file
chunk_size – internal option: the amount of data read and hashed per iteration, which keeps memory usage low.
- Returns:
hash value in hex digits.
Added in version 1.0.1.
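Chunked hashing, as described by the chunk_size parameter, can be sketched as follows. The function name is hypothetical, and two details are assumptions of this sketch because the text above does not state them: nbytes=0 is treated as "hash the whole file", and chunk_size is interpreted as a number of bytes.

```python
import hashlib
from typing import Callable


def hash_file_sketch(
    abspath: str,
    hash_meth: Callable,
    nbytes: int = 0,
    chunk_size: int = 65536,
) -> str:
    """Hypothetical stand-in: hash a local file without loading it all at once."""
    hasher = hash_meth()
    # Assumption: nbytes <= 0 means there is no limit.
    remaining = nbytes if nbytes > 0 else None
    with open(abspath, "rb") as f:
        while True:
            to_read = chunk_size if remaining is None else min(chunk_size, remaining)
            if to_read == 0:
                break  # the requested number of bytes has been hashed
            chunk = f.read(to_read)
            if not chunk:
                break  # end of file
            hasher.update(chunk)
            if remaining is not None:
                remaining -= len(chunk)
    return hasher.hexdigest()
```

A call such as hash_file_sketch(path, hashlib.md5, nbytes=1024) would then hash only the first 1024 bytes of the file.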
- s3pathlib.utils.grouper_list(l: Iterable, n: int) -> Iterable[list]
Divide an iterable into lists of fixed length n. The last list is not padded if it has fewer than n items.
Example:
>>> list(grouper_list(range(10), n=3))
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
- Parameters:
l – an iterable object
n – number of item per list
Added in version 1.0.1.
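The chunking behavior in the example can be expressed with itertools.islice. The generator below is a sketch under a hypothetical name, not the library's implementation.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def grouper_list_sketch(l: Iterable, n: int) -> Iterator[List]:
    """Hypothetical stand-in: yield lists of up to n items, without padding."""
    iterator = iter(l)
    while True:
        chunk = list(islice(iterator, n))
        if not chunk:
            return  # the input is exhausted
        yield chunk


print(list(grouper_list_sketch(range(10), n=3)))
```

Because it consumes the input lazily, this approach also works on generators and other iterables that cannot be indexed.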