python - What is the Unicode normalization form for AWS S3 buckets

Question

asked Jan 31, 2022 in Technique by 深蓝 (71.8m points)

While working with UTF-8 file names in an AWS S3 bucket, I found that some quoted file names (in a link to a file in the S3 bucket) can differ from the same file names as quoted by my Python app (I'm using the boto library). It turns out they differ because they use different Unicode normalization forms, and the problem goes away after applying unicodedata.normalize.

However, I haven't found any information about which normalization form AWS uses (NFC, NFKC, NFD, or NFKD), so I would highly appreciate a pointer to a trusted source that provides that information. Thanks.

1 Answer

深蓝 · Answer 1 · 2022-01-31T07:05:45+0000

It looks like S3 doesn't apply any normalization itself. If I upload a file with a Unicode name (e.g. Ärende.txt) to S3 using the S3 web console, once from a Mac and again from Windows, I end up with two files in S3. They look the same in the S3 console, but S3 considers them distinct because the encoding of the name is different.
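To see why two visually identical names can end up as separate objects, here is a minimal sketch comparing the composed (NFC) and decomposed (NFD) forms of a name such as Ärende.txt. Which platform produces which form is an assumption here (macOS has traditionally tended towards decomposed names, Windows towards composed ones); the point is only that the two forms encode to different UTF-8 bytes.

```python
import unicodedata

name = "Ärende.txt"

# NFC keeps "Ä" as one precomposed code point (U+00C4).
nfc = unicodedata.normalize("NFC", name)
# NFD splits it into "A" followed by a combining diaeresis (U+0308).
nfd = unicodedata.normalize("NFD", name)

print(nfc == nfd)            # False: the strings differ code point by code point
print(len(nfc), len(nfd))    # 10 11
print(nfc.encode("utf-8"))   # b'\xc3\x84rende.txt'
print(nfd.encode("utf-8"))   # b'A\xcc\x88rende.txt'
```

Since the byte sequences differ, a store that compares keys without normalizing them would treat these as two different keys, which matches the behaviour described above.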

You will have to consider exactly how this affects your application (and its users) and adjust accordingly. For example, if your users may switch between environments (Mac vs. Windows vs. Linux) and expect consistent cross-platform behaviour, then it seems you will need to normalize the names yourself. If your users consistently work from a single platform, then you most likely won't need to care.
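If you do decide to normalize names yourself, one possible approach is to pick a single form and apply it to every key before it reaches S3. The sketch below is only an illustration: `normalize_key` and `upload_normalized` are hypothetical helper names, NFC is an arbitrary choice of form, and the client is assumed to be a boto3-style S3 client exposing `put_object(Bucket=..., Key=..., Body=...)`.

```python
import unicodedata


def normalize_key(key: str, form: str = "NFC") -> str:
    """Return the key in a single, consistent Unicode normalization form."""
    return unicodedata.normalize(form, key)


def upload_normalized(s3_client, bucket: str, key: str, body: bytes) -> str:
    """Upload body under the normalized key and return the key actually used.

    s3_client is assumed to be a boto3-style S3 client; it is passed in
    rather than created here so the helper stays easy to test.
    """
    normalized = normalize_key(key)
    s3_client.put_object(Bucket=bucket, Key=normalized, Body=body)
    return normalized
```

The same `normalize_key` call would also need to be applied when building links or looking objects up, so that reads and writes agree on the form.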
