spark-kassandra-extensions

A Kotlin wrapper for the Java Spark-Cassandra API. It extends the functionality of JavaSparkContext and JavaRDD to simplify working with Cassandra data through the Apache Spark Java API.

Getting started

Gradle (Jitpack dependency)

repositories {
    ...
    maven { url "https://jitpack.io" }
}

dependencies {
    // 'compile' was removed in Gradle 7; 'implementation' is its replacement
    implementation 'com.github.cortwave:spark-kassandra-extensions:0.2.1'
}

Examples

Briefing

A table named users exists in the test keyspace. The table has the following structure:

| Column name | Type    |
|-------------|---------|
| email       | text PK |
| age         | int     |
| name        | text    |
| city        | text    |

User Java POJO:

public class User {
    private String email;
    private String name;
    private Integer age;
    private String city;
    
    ... //getters and setters
}
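For reference, the elided getters and setters might look like the sketch below. It is an assumption, not the author's exact class: it follows the JavaBean conventions (no-argument constructor, one getter and setter per column, Serializable) that bean-style row mappers such as CassandraJavaUtil.mapRowTo generally expect. The class is package-private here only so the snippet compiles in a single file; in a project it would normally be public in its own User.java.

```java
import java.io.Serializable;

// Hypothetical completed form of the User POJO (an illustration, not the original source).
class User implements Serializable {
    private static final long serialVersionUID = 1L;

    private String email;   // primary key column
    private String name;
    private Integer age;
    private String city;

    // No-argument constructor so a mapper can instantiate the bean reflectively.
    public User() { }

    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public Integer getAge() { return age; }
    public void setAge(Integer age) { this.age = age; }

    public String getCity() { return city; }
    public void setCity(String city) { this.city = city; }
}
```

Each property name matches a column of the users table, so a row can be copied into the bean one setter call per column.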

or User Kotlin data class:

data class User(val email: String, val age: Int, val city: String, val name: String)

Read Cassandra table

Read a Cassandra table into an untyped RDD (the element type is CassandraRow)

  • Java example
CassandraTableScanJavaRDD<CassandraRow> usersUntypedTable = CassandraJavaUtil.javaFunctions(sparkContext)
                                  .cassandraTable("test", "users");
  • Kotlin example
val usersUntypedTable = sparkContext.cassandraTableRows("test", "users")

Read a Cassandra table into a typed RDD

  • Java example
CassandraTableScanJavaRDD<User> usersTable = CassandraJavaUtil.javaFunctions(sparkContext)
                                  .cassandraTable("test", "users", CassandraJavaUtil.mapRowTo(User.class));
  • Kotlin example
val usersTable = sparkContext.cassandraTable<User>("test", "users")

Save to Cassandra table

In the examples below, users is a JavaRDD<User>.

Save an RDD to a Cassandra table

  • Java example
CassandraJavaUtil.javaFunctions(users)
            .writerBuilder("test", "users", CassandraJavaUtil.mapToRow(User.class)).saveToCassandra();
  • Kotlin example
users.saveToCassandra("test", "users")