Skip to content

Conversation

@ajantha-bhat
Copy link
Member

It is unused currently and Spark integration uses SparkFileWriterFactory instead of this.

* @deprecated since 1.7.0, will be removed in 1.8.0; use {@link SparkFileWriterFactory} instead.
*/
@Deprecated
class SparkAppenderFactory implements FileAppenderFactory<InternalRow> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While working on partition stats, I came across FileWriterFactory and AppenderFactory. After digging a bit, found that Spark was using AppenderFactory before but now uses FileWriterFactory. But the implementation of AppenderFactory still exist and confusing for users. Hence, deprecating it before removal in the next version.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this class is not used by the Spark engine integration. Hence, we can safely remove it.

Looking more into it, we might have lost metrics related function in FileWriterFactory that was present in AppenderFactory. Maybe I can work on it in the future for FileWriterFactory.

Copy link

@amoghjaha-db amoghjaha-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the class is package private do we want to just remove it upfront? It looks like it's currently used in some JMH benchmarks WritersBenchmark, and TestSparkMergingMetrics would need to be updated to test the SparkFileWriterFactory class instead.

@ajantha-bhat
Copy link
Member Author

ajantha-bhat commented Sep 6, 2024

@amoghjaha-db:

Since the class is package private do we want to just remove it upfront

I did get this thought initially and checked how we handled previously for other classes in this module. The package private classes like SparkFileScan was deprecated first and then removed. So, I went ahead and followed the same.

Screenshot 2024-09-03 at 5 45 44 PM

It looks like it's currently used in some JMH benchmarks WritersBenchmark, and TestSparkMergingMetrics would need to be updated to test the SparkFileWriterFactory class instead.

As I mentioned in one of the comments, SparkFileWriterFactory doesn't have metrics related function. So, we cannot update the test to use it. We have to delete the tests as well.

@aokolnychyi
Copy link
Contributor

This is indeed our legacy writers. We can deprecate and then remove to be safe.

@aokolnychyi aokolnychyi merged commit 09370dd into apache:main Sep 27, 2024
@aokolnychyi
Copy link
Contributor

Thanks, @ajantha-bhat!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants