Package software.uncharted.sparkpipe.ops.core.rdd.io

package io

Input/output operations for RDDs, based on the SparkContext.textFile API
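
For example, here is a minimal pipeline sketch (the paths and app name are placeholders, not part of this API) that reads a text file and writes it back out using the standard sparkpipe Pipe(...).to(...).run() pattern; the implicit conversions documented below let the pipe start from a SparkSession even though these ops take a SparkContext.

    import org.apache.spark.sql.SparkSession
    import software.uncharted.sparkpipe.Pipe
    import software.uncharted.sparkpipe.ops.core.rdd.io._

    val session = SparkSession.builder.appName("rdd-io-example").getOrCreate()

    // read produces a (SparkContext) => RDD[String]; the implicit
    // mutateContextFcn below lifts it to a SparkSession function,
    // so the pipe can start at session.
    Pipe(session)
      .to(read("input.txt"))
      .to(write("output"))
      .run()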

Linear Supertypes: AnyRef, Any

Value Members

  1. val BZIP2_CODEC: String

    Codec value indicating that a text file should be written using the BZip2 codec

  2. val CODEC: String

    An option key with which to specify the compression codec with which to write text data

  3. val GZIP_CODEC: String

    Codec value indicating that a text file should be written using the GZip codec

  4. val MIN_PARTITIONS: String

    An option key with which to specify the minimum number of partitions into which to read an input file

  5. val TEXT_FORMAT: String

    Simple text format, one line per record.
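
    Example (sketch): the option keys and codec values above combine into the option maps taken by read and write below. The literal string values of the keys are not shown on this page, so the sketch uses the vals themselves.

        import software.uncharted.sparkpipe.ops.core.rdd.io

        // Read with at least 8 partitions; write gzip-compressed text.
        val readOptions  = Map(io.MIN_PARTITIONS -> "8")
        val writeOptions = Map(io.CODEC -> io.GZIP_CODEC)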

  6. object docs

    Stub object necessary due to https://issues.scala-lang.org/browse/SI-8124

    Documentation for ops.core.rdd.io can be found at software.uncharted.sparkpipe.ops.core.rdd.io

    Attributes: protected[this]

    See also: software.uncharted.sparkpipe.ops.core.rdd.io

  7. implicit def mutateContext(sqlc: SparkSession): SparkContext

    Translates a SparkSession into a SparkContext, so that RDD operations can be called with either

    sqlc

    A SparkSession in which to run operations

    returns

    The SparkContext underlying the given SparkSession
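
    Example (sketch; the app name is a placeholder): with this implicit in scope, a SparkSession can be supplied wherever a SparkContext is expected.

        import org.apache.spark.SparkContext
        import org.apache.spark.sql.SparkSession
        import software.uncharted.sparkpipe.ops.core.rdd.io.mutateContext

        val session = SparkSession.builder.appName("ctx-example").getOrCreate()

        // The implicit conversion applies here; this is equivalent
        // to writing session.sparkContext explicitly.
        val sc: SparkContext = session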

  8. implicit def mutateContextFcn[T](fcn: (SparkContext) ⇒ T): (SparkSession) ⇒ T

    Translates a function from a SparkContext into a function from a SparkSession, so that RDD operations can be run off a Pipe[SparkSession]

    T

    The return type of the function

    fcn

    The SparkContext-based function

    returns

    The same function, but working on a SparkSession.
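
    Example (sketch): lifting a hypothetical SparkContext-based op so it can be used where a SparkSession-based function is required, such as in a Pipe[SparkSession].

        import org.apache.spark.SparkContext
        import org.apache.spark.rdd.RDD
        import org.apache.spark.sql.SparkSession
        import software.uncharted.sparkpipe.ops.core.rdd.io.mutateContextFcn

        // A SparkContext-based op (an illustrative function, not part of this API)...
        val fromContext: SparkContext => RDD[Int] = sc => sc.parallelize(1 to 10)

        // ...implicitly converted to a SparkSession-based op.
        val fromSession: SparkSession => RDD[Int] = fromContext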

  9. def read(path: String, format: String = TEXT_FORMAT, options: Map[String, String] = Map[String, String]())(sc: SparkContext): RDD[String]

    Reads a file into an RDD

    path

    The location of the source data

    format

    The format in which to read the data. Currently, only "text" is supported.

    options

    A Map[String, String] of options. Currently, the only supported option is "minPartitions", which will set the minimum number of partitions into which the data is read.

    sc

    The spark context in which to read the data

    returns

    An RDD of the text of the source data, line by line
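
    Example (sketch; "data.txt" and the app name are placeholders):

        import org.apache.spark.sql.SparkSession
        import software.uncharted.sparkpipe.ops.core.rdd.io

        val sc = SparkSession.builder.appName("read-example").getOrCreate().sparkContext

        // Read the file as text with at least 4 partitions.
        val lines = io.read("data.txt", io.TEXT_FORMAT, Map(io.MIN_PARTITIONS -> "4"))(sc)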

  10. def write[T](path: String, format: String = TEXT_FORMAT, options: Map[String, String] = Map[String, String]())(input: RDD[T]): RDD[T]

    Writes an RDD

    T

    The type of data contained in the RDD

    path

    The location to which to write the data

    format

    The format in which to write the data. Currently, only "text" is supported.

    options

    A Map[String, String] of options. Currently, only the "codec" option is supported, for which valid values are "bzip2" and "gzip"; any other value will result in the default codec.

    input

    The RDD to write

    returns

    The input RDD
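
    Example (sketch; "out" and the app name are placeholders):

        import org.apache.spark.sql.SparkSession
        import software.uncharted.sparkpipe.ops.core.rdd.io

        val sc   = SparkSession.builder.appName("write-example").getOrCreate().sparkContext
        val data = sc.parallelize(Seq("a", "b", "c"))

        // Writes gzip-compressed text and returns `data` unchanged,
        // so the op can sit in the middle of a pipe.
        val same = io.write("out", io.TEXT_FORMAT, Map(io.CODEC -> io.GZIP_CODEC))(data)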
