loadFastaDna usage not obvious due to default method parameter #2183

Closed
heuermh opened this issue Jul 6, 2019 · 0 comments · Fixed by #2201

heuermh commented Jul 6, 2019

ADAMContext includes the following methods for loading data into SequenceDataset and SliceDataset:

  // load FASTA as DNA

  def loadFastaDna(pathName: String): SequenceDataset = {}

  def loadFastaDna(
    pathName: String,
    maximumLength: Long = 10000L): SliceDataset = {}

  // format guessing, including FASTA and Parquet

  def loadDnaSequences(
    pathName: String,
    optPredicate: Option[FilterPredicate] = None,
    optProjection: Option[Schema] = None): SequenceDataset = {}

  def loadSlices(
    pathName: String,
    maximumLength: Long = 10000L,
    optPredicate: Option[FilterPredicate] = None,
    optProjection: Option[Schema] = None): SliceDataset = {}

  // data frames

  def loadSequences(df: DataFrame): SequenceDataset = {}

  def loadSequences(df: DataFrame, metadataPathName: String): SequenceDataset = {}

  def loadSequences(
    df: DataFrame,
    references: SequenceDictionary): SequenceDataset = {}

  def loadSlices(df: DataFrame): SliceDataset = {}

  def loadSlices(df: DataFrame, metadataPathName: String): SliceDataset = {}

  def loadSlices(
    df: DataFrame,
    references: SequenceDictionary): SliceDataset = {}

This method, which declares a default value for its maximumLength parameter,

  def loadFastaDna(
    pathName: String,
    maximumLength: Long = 10000L): SliceDataset = {}

is not obviously different from

  def loadFastaDna(pathName: String): SequenceDataset = {}

especially in the Spark shell environment, where both overloads appear under the same name. Perhaps we should drop the default value for maximumLength in the loadFastaDna method that returns SliceDataset.
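To make the overlap concrete, here is a minimal self-contained sketch. The two case classes are hypothetical stand-ins for ADAM's SequenceDataset and SliceDataset, and OverloadSketch is an illustrative name, not part of ADAMContext. Only the two loadFastaDna signatures are taken from above. Under Scala's overloading resolution rules, an alternative that needs no default arguments is preferred, so the one-argument call should always pick the SequenceDataset overload and the default maximumLength is effectively never used.

```scala
// Hypothetical stand-ins for ADAM's dataset types; not the real classes.
final case class SequenceDataset(pathName: String)
final case class SliceDataset(pathName: String, maximumLength: Long)

object OverloadSketch {
  // Same shape as the two loadFastaDna signatures quoted in this issue.
  def loadFastaDna(pathName: String): SequenceDataset =
    SequenceDataset(pathName)

  def loadFastaDna(
    pathName: String,
    maximumLength: Long = 10000L): SliceDataset =
    SliceDataset(pathName, maximumLength)

  def main(args: Array[String]): Unit = {
    // One argument: both overloads are applicable, but the compiler prefers
    // the alternative that does not rely on a default argument.
    val sequences: SequenceDataset = loadFastaDna("sample.fa")

    // Slices are only reachable by passing maximumLength explicitly.
    val slices: SliceDataset = loadFastaDna("sample.fa", 10000L)

    println(sequences)
    println(slices)
  }
}
```

If the default were dropped as proposed, the second call would look exactly the same, and the signature would no longer suggest that loadFastaDna("sample.fa") could return slices.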
