
Named intermediate Datasets #188

@OlivierBlanvillain

case class Foo(bar: Int, baz: String, bal: Boolean)
val ds: TypedDataset[Foo]

This is nicely named and typed! However, after a select, the names are completely lost:

val ds1: TypedDataset[Tuple2[Int, String]] = ds.select(ds('bar), ds('baz))

The best one can do with the current API is to define a new case class for the intermediate representation and use .as[] to get a ds1 with useful column names.
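For reference, the current workaround might look roughly like the sketch below. It assumes frameless and Spark are on the classpath and that a local SparkSession is acceptable. BarBaz, NamedIntermediate and project are hypothetical names chosen for illustration, and the exact shape of the as call (written here with empty parentheses) may differ between frameless versions.

```scala
import org.apache.spark.sql.SparkSession
import frameless.TypedDataset

object NamedIntermediate {
  case class Foo(bar: Int, baz: String, bal: Boolean)

  // Hypothetical case class written by hand for the intermediate result.
  case class BarBaz(bar: Int, baz: String)

  def project(ds: TypedDataset[Foo]): TypedDataset[BarBaz] = {
    // select is typed as a tuple, so the field names are lost here.
    val ds1: TypedDataset[(Int, String)] = ds.select(ds('bar), ds('baz))
    // as re-types the tuple as BarBaz, matching fields by position and type.
    ds1.as[BarBaz]()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local Spark session is enough for this sketch.
    implicit val spark: SparkSession =
      SparkSession.builder().master("local[*]").appName("named-intermediate").getOrCreate()

    val ds = TypedDataset.create(Seq(Foo(1, "a", true), Foo(2, "b", false)))
    // Show the underlying Spark Dataset of the re-typed result.
    project(ds).dataset.show()

    spark.stop()
  }
}
```

The cost is one hand-written case class per intermediate shape, which is the boilerplate this issue would like to avoid.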

Some ideas to work around this issue:

  • Use a macro to generate a case class "on the fly", something like this:

    ds.selectNamed(ds('bar), ds('baz))
    // Expands to
    case class FooBarBaz(bar: Int, baz: String)
    ds.select(ds('bar), ds('baz)).as[FooBarBaz]
  • Instead of TupleN, type the resulting Dataset with a shapeless record and update.
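To illustrate the second idea, here is a minimal sketch of how a shapeless record keeps field names in its type. It uses plain shapeless 2 only and does not touch frameless. How such a record type would be wired into select and its encoders is left open here. RecordSketch and the sample values are assumptions made for the example.

```scala
import shapeless._
import shapeless.record._
import shapeless.syntax.singleton._

object RecordSketch {
  // A record is an HList whose elements are tagged with singleton-typed keys,
  // so the names "bar" and "baz" are part of the static type of row.
  val row = ("bar" ->> 1) :: ("baz" ->> "a") :: HNil

  // Field access by name is checked at compile time and returns the field's type.
  val bar: Int = row("bar")
  val baz: String = row("baz")

  // row("bal") would be rejected by the compiler, since the record has no such key.

  def main(args: Array[String]): Unit =
    println(s"bar=$bar, baz=$baz")
}
```

Unlike a TupleN, the record type carries both the names and the types of the selected columns, without declaring a case class for each projection.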
