Conversation

@ayush1300 ayush1300 commented Jul 29, 2025

Description of PR

How was this patch tested?

  • Doc added

For code changes:

  • [Yes] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|:----------|--------:|:--------|:--------|
| +0 🆗 | reexec | 20m 29s | | Docker mode activated. |
| | _ Prechecks _ | | | |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 🆗 | markdownlint | 0m 0s | | markdownlint was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| | _ trunk Compile Tests _ | | | |
| +1 💚 | mvninstall | 45m 14s | | trunk passed |
| +1 💚 | mvnsite | 0m 43s | | trunk passed |
| +1 💚 | shadedclient | 86m 9s | | branch has no errors when building and testing our client artifacts. |
| | _ Patch Compile Tests _ | | | |
| +1 💚 | mvninstall | 0m 31s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | mvnsite | 0m 32s | | the patch passed |
| +1 💚 | shadedclient | 40m 31s | | patch has no errors when building and testing our client artifacts. |
| | _ Other Tests _ | | | |
| -1 ❌ | asflicense | 0m 38s | /results-asflicense.txt | The patch generated 1 ASF License warnings. |
| | | 149m 59s | | |
| Subsystem | Report/Notes |
|:----------|:-------------|
| Docker | ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7834/1/artifact/out/Dockerfile |
| GITHUB PR | #7834 |
| Optional Tests | dupname asflicense mvnsite codespell detsecrets markdownlint |
| uname | Linux 842afd4dd444 5.15.0-143-generic #153-Ubuntu SMP Fri Jun 13 19:10:45 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 95bfec7 |
| Max. process+thread count | 611 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7834/1/console |
| versions | git=2.25.1 maven=3.6.3 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

@steveloughran steveloughran changed the title HADOOP-19536 : Readme DOC for addition of S3 tags through S3A. HADOOP-19536. S3A : Add option for custom S3 tags while writing and deleting S3 objects Jul 30, 2025
Contributor
@steveloughran steveloughran left a comment


great start! renamed the PR title so it matches the overall work.

Can you actually use a different branch name than trunk, ideally something with the JIRA ID in it? This is because checking out a GitHub PR uses the branch name, and we all already have a branch called trunk.

Commented on soft-delete. It gets complicated fast in terms of trying to maintain the filesystem metaphor (or even preventing object overwrite, probes for existence, etc.).


### Soft Delete Tags
```properties
fs.s3a.soft.delete.enabled=true
```
Contributor

is the idea that when enabled, each delete(path) call is remapped to tagging the object for deletion?

is this for recovery or for a performance benefit?

Author

  1. Yes, the object will be tagged with the tag given by the user, or with a default tag for deletion.
  2. It is for recovery. Users can archive some S3 objects on the basis of tags and recover them in the future when they need to.
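The remapping discussed above can be made concrete with a small in-memory model. This is an illustrative sketch only, not S3A code: the `TaggingStore` class and its `soft_delete`/`restore` methods, and the default tag name, are all hypothetical. It shows the core idea that delete(path) becomes a tag write and recovery simply clears the tag.

```python
# Illustrative in-memory model of "soft delete via tagging".
# All names here are hypothetical, not the S3A implementation.

SOFT_DELETE_TAG = ("deletion-status", "soft-deleted")  # assumed default tag


class TaggingStore:
    """Models an object store where delete(path) writes a tag
    instead of removing the object's data."""

    def __init__(self):
        self.objects = {}  # path -> data
        self.tags = {}     # path -> dict of tags

    def put(self, path, data):
        self.objects[path] = data
        self.tags[path] = {}

    def soft_delete(self, path, tag=SOFT_DELETE_TAG):
        # delete(path) is remapped to tagging the object for deletion;
        # the data itself is left in place.
        key, value = tag
        self.tags[path][key] = value

    def restore(self, path):
        # Recovery: drop the soft-delete tag; nothing was ever removed.
        self.tags[path].pop(SOFT_DELETE_TAG[0], None)

    def is_soft_deleted(self, path):
        key, value = SOFT_DELETE_TAG
        return self.tags.get(path, {}).get(key) == value


store = TaggingStore()
store.put("s3a://my-bucket/file.txt", b"payload")
store.soft_delete("s3a://my-bucket/file.txt")
# The object and its data are still present; only a tag marks it deleted.
```

In a real deployment the tagging call would go to the S3 object-tagging API, and archival/recovery could then be driven by lifecycle rules keyed on the tag.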


## Soft Delete Feature

The soft delete feature allows you to tag objects instead of permanently deleting them, enabling data retention policies and recovery options.
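A configuration sketch of how this might look. Only `fs.s3a.soft.delete.enabled` appears in this PR's documentation; the tag key/value properties below are hypothetical illustrations of a user-supplied tag, not settings the patch defines:

```properties
# Key from this PR's documentation: remap deletes to tagging
fs.s3a.soft.delete.enabled=true
# Hypothetical example properties (names and values are illustrative only):
# the tag applied to an object instead of deleting it
fs.s3a.soft.delete.tag.key=deletion-status
fs.s3a.soft.delete.tag.value=archived
```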
Contributor

or people use versioned buckets, obviously

```
-rm s3a://my-bucket/file-to-archive.txt
```

### Future Capabilities (Planned)
Contributor

Don't make these commitments, as they get complex fast. E.g.:

  • What if a path has nothing but soft-deleted files underneath? Does it exist? Can I do a non-recursive rm of a directory with nothing but soft-delete entries underneath? We'd reject that now, as a LIST call would say stuff is there, and we don't do a HEAD on each file looking for a soft-delete marker.
  • What if I rename a soft-deleted file? Does it come back into existence?
  • What if I create a file with the header set to not create if a file is already there, but there's a soft-deleted entry?
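The first edge case can be sketched concretely. Again this is an illustrative in-memory model with hypothetical names, not S3A code: because existence checks are LIST-based and no per-object HEAD is done, a directory containing only soft-deleted files still appears non-empty, so a non-recursive delete would be rejected.

```python
# Illustrative model: LIST-based existence checks vs. soft-delete tags.
# All names are hypothetical, not the S3A implementation.

# Two files under "dir/", both merely tagged for deletion.
objects = {
    "dir/a.txt": {"deletion-status": "soft-deleted"},
    "dir/b.txt": {"deletion-status": "soft-deleted"},
}


def list_prefix(prefix):
    # A LIST call returns keys regardless of their tags:
    # no HEAD per object, so soft-delete markers are invisible here.
    return sorted(k for k in objects if k.startswith(prefix))


def rmdir_non_recursive(prefix):
    # Rejected whenever LIST reports anything under the prefix,
    # even if every entry is only tagged for deletion.
    if list_prefix(prefix):
        raise OSError(f"Directory not empty: {prefix}")


listing = list_prefix("dir/")
# LIST still reports both soft-deleted files, so rmdir_non_recursive("dir/")
# would raise, even though every file underneath is "deleted".
```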

Better to say

While tagged as soft delete, the files are still visible to filesystem operations
such as list and create.

Author

Got it, I will remove this section/rename this accordingly.

@steveloughran
Contributor

@ayush1300 why did you close this?

@hadoop-yetus

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|:----------|--------:|:--------|:--------|
| +0 🆗 | reexec | 14m 32s | | Docker mode activated. |
| | _ Prechecks _ | | | |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| | _ trunk Compile Tests _ | | | |
| +0 🆗 | mvndep | 9m 46s | | Maven dependency ordering for branch |
| +1 💚 | shadedclient | 45m 16s | | branch has no errors when building and testing our client artifacts. |
| | _ Patch Compile Tests _ | | | |
| +0 🆗 | mvndep | 0m 33s | | Maven dependency ordering for patch |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | shadedclient | 34m 4s | | patch has no errors when building and testing our client artifacts. |
| | _ Other Tests _ | | | |
| +1 💚 | asflicense | 0m 47s | | The patch does not generate ASF License warnings. |
| | | 96m 42s | | |
| Subsystem | Report/Notes |
|:----------|:-------------|
| Docker | ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7834/2/artifact/out/Dockerfile |
| GITHUB PR | #7834 |
| Optional Tests | dupname asflicense |
| uname | Linux 043bfbb40995 5.15.0-140-generic #150-Ubuntu SMP Sat Apr 12 06:00:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / cf4f97d |
| Max. process+thread count | 717 (vs. ulimit of 5500) |
| modules | C: U: |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7834/2/console |
| versions | git=2.25.1 maven=3.6.3 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

@steveloughran
Contributor

oh yes, I remember. I even asked for it. thanks
