Package software.uncharted.sparkpipe.ops.core.rdd.io

package io

Input/output operations for RDDs, based on the SparkContext.textFile API
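
For example, here is a minimal pipeline sketch (the paths and app name are placeholders, not part of this API) that reads a text file and writes it back out using the standard sparkpipe Pipe(...).to(...).run() pattern; the implicit conversions documented below let the pipe start from a SparkSession even though these ops take a SparkContext.

    import org.apache.spark.sql.SparkSession
    import software.uncharted.sparkpipe.Pipe
    import software.uncharted.sparkpipe.ops.core.rdd.io._

    val session = SparkSession.builder.appName("rdd-io-example").getOrCreate()

    // read produces a (SparkContext) => RDD[String]; the implicit
    // mutateContextFcn below lifts it to a SparkSession function,
    // so the pipe can start at session.
    Pipe(session)
      .to(read("input.txt"))
      .to(write("output"))
      .run()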

Linear Supertypes: AnyRef, Any

Value Members

  1. val BZIP2_CODEC: String

    Codec value indicating that a text file should be written using the BZip2 codec

  2. val CODEC: String

    An option key with which to specify the compression codec with which to write text data

  3. val GZIP_CODEC: String

    Codec value indicating that a text file should be written using the GZip codec

  4. val MIN_PARTITIONS: String

    An option key with which to specify the minimum number of partitions into which to read an input file

  5. val TEXT_FORMAT: String

    Simple text format, one line per record.
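
    Example (sketch): the option keys and codec values above combine into the option maps taken by read and write below. The literal string values of the keys are not shown on this page, so the sketch uses the vals themselves.

        import software.uncharted.sparkpipe.ops.core.rdd.io

        // Read with at least 8 partitions; write gzip-compressed text.
        val readOptions  = Map(io.MIN_PARTITIONS -> "8")
        val writeOptions = Map(io.CODEC -> io.GZIP_CODEC)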

  6. object docs

    Stub object necessary due to https://issues.scala-lang.org/browse/SI-8124

    Documentation for ops.core.rdd.io can be found at software.uncharted.sparkpipe.ops.core.rdd.io

    Attributes: protected[this]

    See also: software.uncharted.sparkpipe.ops.core.rdd.io

  7. implicit def mutateContext(sqlc: SparkSession): SparkContext

    Translates a SparkSession into a SparkContext, so that RDD operations can be called with either

    sqlc

    A SparkSession in which to run operations

    returns

    The SparkContext underlying the given SparkSession
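
    Example (sketch; the app name is a placeholder): with this implicit in scope, a SparkSession can be supplied wherever a SparkContext is expected.

        import org.apache.spark.SparkContext
        import org.apache.spark.sql.SparkSession
        import software.uncharted.sparkpipe.ops.core.rdd.io.mutateContext

        val session = SparkSession.builder.appName("ctx-example").getOrCreate()

        // The implicit conversion applies here; this is equivalent
        // to writing session.sparkContext explicitly.
        val sc: SparkContext = session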

  8. implicit def mutateContextFcn[T](fcn: (SparkContext) ⇒ T): (SparkSession) ⇒ T

    Translates a function from a SparkContext into a function from a SparkSession, so that RDD operations can be run off a Pipe[SparkSession]

    T

    The return type of the function

    fcn

    The SparkContext-based function

    returns

    The same function, but working on a SparkSession.
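
    Example (sketch): lifting a hypothetical SparkContext-based op so it can be used where a SparkSession-based function is required, such as in a Pipe[SparkSession].

        import org.apache.spark.SparkContext
        import org.apache.spark.rdd.RDD
        import org.apache.spark.sql.SparkSession
        import software.uncharted.sparkpipe.ops.core.rdd.io.mutateContextFcn

        // A SparkContext-based op (an illustrative function, not part of this API)...
        val fromContext: SparkContext => RDD[Int] = sc => sc.parallelize(1 to 10)

        // ...implicitly converted to a SparkSession-based op.
        val fromSession: SparkSession => RDD[Int] = fromContext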

  9. def read(path: String, format: String = TEXT_FORMAT, options: Map[String, String] = Map[String, String]())(sc: SparkContext): RDD[String]

    Reads a file into an RDD

    path

    The location of the source data

    format

    The format in which to read the data. Currently, only "text" is supported.

    options

    A Map[String, String] of options. Currently, the only supported option is "minPartitions", which will set the minimum number of partitions into which the data is read.

    sc

    The spark context in which to read the data

    returns

    An RDD of the text of the source data, line by line
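
    Example (sketch; "data.txt" and the app name are placeholders):

        import org.apache.spark.sql.SparkSession
        import software.uncharted.sparkpipe.ops.core.rdd.io

        val sc = SparkSession.builder.appName("read-example").getOrCreate().sparkContext

        // Read the file as text with at least 4 partitions.
        val lines = io.read("data.txt", io.TEXT_FORMAT, Map(io.MIN_PARTITIONS -> "4"))(sc)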

  10. def write[T](path: String, format: String = TEXT_FORMAT, options: Map[String, String] = Map[String, String]())(input: RDD[T]): RDD[T]

    Writes an RDD

    T

    The type of data contained in the RDD

    path

    The location to which to write the data

    format

    The format in which to write the data. Currently, only "text" is supported.

    options

    A Map[String, String] of options. Currently, only the "codec" option is supported, for which valid values are "bzip2" and "gzip"; any other value will result in the default codec.

    input

    The RDD to write

    returns

    The input RDD
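
    Example (sketch; "out" and the app name are placeholders):

        import org.apache.spark.sql.SparkSession
        import software.uncharted.sparkpipe.ops.core.rdd.io

        val sc   = SparkSession.builder.appName("write-example").getOrCreate().sparkContext
        val data = sc.parallelize(Seq("a", "b", "c"))

        // Writes gzip-compressed text and returns `data` unchanged,
        // so the op can sit in the middle of a pipe.
        val same = io.write("out", io.TEXT_FORMAT, Map(io.CODEC -> io.GZIP_CODEC))(data)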
