Move analyze skipping index rules to config #288

Closed
wants to merge 1 commit into from

Conversation

rupal-bq
Contributor

Description

Issues Resolved

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Rupal Mahajan <maharup@amazon.com>
@rupal-bq
Contributor Author

@dai-chen

I'm not sure if this is the right API because I've only used table.schema(). Could you double check this along with the comment above later? #284 (comment)

I checked the partitioning API again; it returns the physical partitioning of this table. We used schema to get the list of columns and partitioning to get the partition columns.

@rupal-bq rupal-bq marked this pull request as ready for review March 19, 2024 21:15
@dai-chen dai-chen added maintenance Code refactoring 0.3 labels Mar 21, 2024
@dai-chen
Collaborator

@dai-chen

I'm not sure if this is the right API because I've only used table.schema(). Could you double check this along with the comment above later? #284 (comment)

I checked the partitioning API again; it returns the physical partitioning of this table. We used schema to get the list of columns and partitioning to get the partition columns.

Could you clarify what physical partitioning means here? I was concerned about whether this API returns a static list of partition columns or all partitions, because I see you also use toSet to deduplicate.
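
For context on the question above: in Spark's DataSource V2 API, `Table.partitioning()` is declared to return an array of `Transform` objects that describe how the table is partitioned, and each transform exposes the columns it references. Assuming the table discussed here is such a DataSource V2 `Table`, the result is tied to the table definition rather than to individual partition values. The sketch below uses hypothetical stand-in case classes (`NamedRef`, `PartitionTransform`) instead of Spark's own types so that it runs without Spark on the classpath; it is not the code from this PR. It shows one situation in which `toSet` would matter: two transforms that reference the same column.

```scala
// Hypothetical stand-ins that mirror the shape of Spark's NamedReference and
// Transform; they are NOT the Spark classes themselves.
final case class NamedRef(fieldNames: Seq[String])
final case class PartitionTransform(name: String, references: Seq[NamedRef])

object PartitionColumns {
  // Collect the distinct column names referenced by all partition transforms.
  // Nested field paths are joined with "." so that a reference to a.b stays
  // a single column name.
  def partitionColumns(partitioning: Seq[PartitionTransform]): Set[String] =
    partitioning
      .flatMap(_.references)
      .map(_.fieldNames.mkString("."))
      .toSet
}

object PartitionColumnsExample {
  // Two transforms reference "year"; the set keeps it only once.
  val sample: Seq[PartitionTransform] = Seq(
    PartitionTransform("identity", Seq(NamedRef(Seq("year")))),
    PartitionTransform("bucket", Seq(NamedRef(Seq("year")))),
    PartitionTransform("identity", Seq(NamedRef(Seq("month"))))
  )

  val columns: Set[String] = PartitionColumns.partitionColumns(sample)
}
```

If every partition column is referenced by exactly one transform, the `toSet` call changes nothing beyond the collection type.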

rules("PARTITION")._1,
rules("PARTITION")._2)
} else if (rules.contains(field.dataType.toString)) {
rules.getString("recommendation.data_type_rules.PARTITION.skipping_type"),
Collaborator

Extract a util method to avoid appending strings in different places?
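
One possible shape for such a helper, as a minimal sketch: the `recommendation.data_type_rules` prefix and the `skipping_type` field name are taken from the snippet above, while `RuleKeys` and `ruleKey` are hypothetical names that do not appear in the PR.

```scala
// Hypothetical helper: builds the config path for a rule in one place so call
// sites do not concatenate key fragments themselves.
object RuleKeys {
  // Prefix taken from the key shown in the reviewed snippet.
  private val Prefix = "recommendation.data_type_rules"

  // ruleName is e.g. "PARTITION"; field is e.g. "skipping_type".
  def ruleKey(ruleName: String, field: String): String =
    s"$Prefix.$ruleName.$field"
}
```

A call site could then read `rules.getString(RuleKeys.ruleKey("PARTITION", "skipping_type"))`, assuming `rules` is a Typesafe `Config` as in the snippet.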

@@ -7,27 +7,16 @@ package org.opensearch.flint.spark.skipping.recommendations

import scala.collection.mutable.ArrayBuffer

import org.opensearch.flint.spark.skipping.FlintSparkSkippingStrategy.SkippingKind.{BLOOM_FILTER, MIN_MAX, PARTITION, VALUE_SET}
import com.typesafe.config.{Config, ConfigFactory}

import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.flint.{loadTable, parseTableName}

class DataTypeSkippingStrategy extends AnalyzeSkippingStrategy {
Collaborator

Could you add the missing Javadoc on the new class, interface, and public methods?

@rupal-bq
Contributor Author

Thanks for reviewing @dai-chen. Closing this PR for now. Will address all comments in another PR.

Labels
0.3 maintenance Code refactoring
2 participants