Package

software.uncharted.sparkpipe.ops.core

dataframe

package dataframe

Common operations for manipulating dataframes

Linear Supertypes
AnyRef, Any

Value Members

  1. def addColumn[I1, I2, I3, I4, I5, O](columnName: String, columnFcn: (I1, I2, I3, I4, I5) ⇒ O, in1: String, in2: String, in3: String, in4: String, in5: String)(input: DataFrame)(implicit tagO: scala.reflect.api.JavaUniverse.TypeTag[O], tagI1: scala.reflect.api.JavaUniverse.TypeTag[I1], tagI2: scala.reflect.api.JavaUniverse.TypeTag[I2], tagI3: scala.reflect.api.JavaUniverse.TypeTag[I3], tagI4: scala.reflect.api.JavaUniverse.TypeTag[I4], tagI5: scala.reflect.api.JavaUniverse.TypeTag[I5]): DataFrame

    Take an existing DataFrame, and add a new column to it.

    columnName

    The name of the column to add

    columnFcn

    A function which generates the new column's values based on input columns

    in1

    the first input column

    in2

    the second input column

    in3

    the third input column

    in4

    the fourth input column

    in5

    the fifth input column

    input

    The existing DataFrame

    returns

    A new DataFrame with the named column added.

  2. def addColumn[I1, I2, I3, I4, O](columnName: String, columnFcn: (I1, I2, I3, I4) ⇒ O, in1: String, in2: String, in3: String, in4: String)(input: DataFrame)(implicit tagO: scala.reflect.api.JavaUniverse.TypeTag[O], tagI1: scala.reflect.api.JavaUniverse.TypeTag[I1], tagI2: scala.reflect.api.JavaUniverse.TypeTag[I2], tagI3: scala.reflect.api.JavaUniverse.TypeTag[I3], tagI4: scala.reflect.api.JavaUniverse.TypeTag[I4]): DataFrame

    Take an existing DataFrame, and add a new column to it.

    columnName

    The name of the column to add

    columnFcn

    A function which generates the new column's values based on input columns

    in1

    the first input column

    in2

    the second input column

    in3

    the third input column

    in4

    the fourth input column

    input

    The existing DataFrame

    returns

    A new DataFrame with the named column added.

  3. def addColumn[I1, I2, I3, O](columnName: String, columnFcn: (I1, I2, I3) ⇒ O, in1: String, in2: String, in3: String)(input: DataFrame)(implicit tagO: scala.reflect.api.JavaUniverse.TypeTag[O], tagI1: scala.reflect.api.JavaUniverse.TypeTag[I1], tagI2: scala.reflect.api.JavaUniverse.TypeTag[I2], tagI3: scala.reflect.api.JavaUniverse.TypeTag[I3]): DataFrame

    Take an existing DataFrame, and add a new column to it.

    columnName

    The name of the column to add

    columnFcn

    A function which generates the new column's values based on input columns

    in1

    the first input column

    in2

    the second input column

    in3

    the third input column

    input

    The existing DataFrame

    returns

    A new DataFrame with the named column added.

  4. def addColumn[I1, I2, O](columnName: String, columnFcn: (I1, I2) ⇒ O, in1: String, in2: String)(input: DataFrame)(implicit tagO: scala.reflect.api.JavaUniverse.TypeTag[O], tagI1: scala.reflect.api.JavaUniverse.TypeTag[I1], tagI2: scala.reflect.api.JavaUniverse.TypeTag[I2]): DataFrame

    Take an existing DataFrame, and add a new column to it.

    columnName

    The name of the column to add

    columnFcn

    A function which generates the new column's values based on input columns

    in1

    the first input column

    in2

    the second input column

    input

    The existing DataFrame

    returns

    A new DataFrame with the named column added.

  5. def addColumn[I, O](columnName: String, columnFcn: (I) ⇒ O, in: String)(input: DataFrame)(implicit tagO: scala.reflect.api.JavaUniverse.TypeTag[O], tagI: scala.reflect.api.JavaUniverse.TypeTag[I]): DataFrame

    Take an existing DataFrame, and add a new column to it.

    columnName

    The name of the column to add

    columnFcn

    A function which generates the new column's values based on input columns

    in

    the input column

    input

    The existing DataFrame

    returns

    A new DataFrame with the named column added.

  6. def addColumn[O](columnName: String, columnFcn: () ⇒ O)(input: DataFrame)(implicit tag: scala.reflect.api.JavaUniverse.TypeTag[O]): DataFrame

    Take an existing DataFrame, and add a new column to it.

    columnName

    The name of the column to add

    columnFcn

    A function which generates the new column's values; this overload takes no input columns

    input

    The existing DataFrame

    returns

    A new DataFrame with the named column added.
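
    The addColumn overloads above differ only in how many input columns feed columnFcn. The following is a minimal sketch of the two-input and zero-input forms, assuming a local SparkSession and sparkpipe on the classpath; the column names and data are hypothetical. Because the DataFrame sits in its own parameter list, an operation can also be wrapped as a DataFrame => DataFrame step, as the last lines show.

    ```scala
    import org.apache.spark.sql.{DataFrame, SparkSession}
    import software.uncharted.sparkpipe.ops.core.dataframe.addColumn

    // Local session for illustration only.
    val spark = SparkSession.builder().master("local[*]").appName("addColumn-sketch").getOrCreate()
    import spark.implicits._

    // Hypothetical input with two integer columns.
    val df = Seq((1, 2), (3, 4)).toDF("a", "b")

    // Two-input overload: the annotated lambda parameters fix I1 and I2.
    val withSum = addColumn("sum", (a: Int, b: Int) => a + b, "a", "b")(df)

    // Zero-input overload: the function takes no arguments.
    val withFlag = addColumn("flag", () => true)(df)

    // Wrap an operation so it can be reused as a DataFrame => DataFrame step.
    val addDoubled: DataFrame => DataFrame =
      (d: DataFrame) => addColumn("doubled", (a: Int) => a * 2, "a")(d)
    val withDoubled = addDoubled(df)
    ```

    Annotating the lambda's parameter types is what lets the compiler infer the I and O type parameters and find the implicit TypeTags.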

  7. def cache(frame: DataFrame): DataFrame

    cache() the specified DataFrame

    frame

    the DataFrame to cache()

    returns

    the input DataFrame, after calling cache()

  8. def castColumns(castMap: Map[String, String])(input: DataFrame): DataFrame

    Cast a set of columns to new types, replacing the original columns

    castMap

    a Map[String, String] of columnName => datatype

    input

    The existing DataFrame

    returns

    A new DataFrame with the cast columns
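
    A minimal sketch of castColumns, assuming a local SparkSession and sparkpipe on the classpath. The data is hypothetical, and the type strings ("int", "double") are an assumption: this page does not list the accepted names, so the sketch uses Spark SQL type names of the kind Column.cast(String) accepts.

    ```scala
    import org.apache.spark.sql.SparkSession
    import software.uncharted.sparkpipe.ops.core.dataframe.castColumns

    // Local session for illustration only.
    val spark = SparkSession.builder().master("local[*]").appName("castColumns-sketch").getOrCreate()
    import spark.implicits._

    // Hypothetical input where every value arrived as a string.
    val raw = Seq(("1", "2.5"), ("2", "4.0")).toDF("id", "score")

    // Assumed type names; check them against the Spark version in use.
    val typed = castColumns(Map("id" -> "int", "score" -> "double"))(raw)

    // Inspect the resulting schema.
    typed.printSchema()
    ```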

  9. def copyColumn(columnName: String, newColumnName: String)(input: DataFrame): DataFrame

    Takes a DataFrame and copies a column in it

    columnName

    the column to copy

    newColumnName

    the column to place the copy in

    input

    The existing DataFrame

    returns

    A new DataFrame with the copied column

  10. object docs

    Stub object necessary due to https://issues.scala-lang.org/browse/SI-8124

    Documentation for ops.core.dataframe can be found at software.uncharted.sparkpipe.ops.core.dataframe

    Attributes
    protected[this]
    See also

    software.uncharted.sparkpipe.ops.core.dataframe

  11. def dropColumns(colNames: String*)(input: DataFrame): DataFrame

    Remove columns from a DataFrame

    colNames

    the named columns to remove

    input

    the input DataFrame

    returns

    the resultant DataFrame, without the specified columns
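
    A minimal sketch combining copyColumn and dropColumns, assuming a local SparkSession and sparkpipe on the classpath; the column names and data are hypothetical. Note that dropColumns takes its column names as varargs.

    ```scala
    import org.apache.spark.sql.SparkSession
    import software.uncharted.sparkpipe.ops.core.dataframe.{copyColumn, dropColumns}

    // Local session for illustration only.
    val spark = SparkSession.builder().master("local[*]").appName("dropColumns-sketch").getOrCreate()
    import spark.implicits._

    // Hypothetical three-column input.
    val df = Seq((1, "x", true), (2, "y", false)).toDF("id", "label", "flag")

    // Copy "label" into a new column, then drop the original along with "flag".
    val copied = copyColumn("label", "labelCopy")(df)
    val trimmed = dropColumns("label", "flag")(copied)
    ```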

  12. package io

    Input/output operations for DataFrames, based on the sparkSession.read and DataFrame.write APIs

  13. def joinDataFrames(leftColumn: String, rightColumn: String)(leftInput: DataFrame, rightInput: DataFrame): DataFrame

    Inner join two data frames on the specified columns.

    leftColumn

    The join ID column of the first data frame

    rightColumn

    The join ID column of the second data frame

    leftInput

    The first data frame

    rightInput

    The second data frame

    returns

    The DataFrame produced by the join
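
    A minimal sketch of joinDataFrames, assuming a local SparkSession and sparkpipe on the classpath; the frames and column names are hypothetical. Unlike the other operations in this package, the second parameter list takes two DataFrames. The column layout of the result is not described on this page, so the sketch only counts the matching rows.

    ```scala
    import org.apache.spark.sql.SparkSession
    import software.uncharted.sparkpipe.ops.core.dataframe.joinDataFrames

    // Local session for illustration only.
    val spark = SparkSession.builder().master("local[*]").appName("join-sketch").getOrCreate()
    import spark.implicits._

    // Hypothetical inputs: only id 1 appears in both frames.
    val users = Seq((1, "ann"), (2, "bob")).toDF("userId", "name")
    val orders = Seq((1, 9.5), (3, 4.0)).toDF("customerId", "total")

    // Inner join on users.userId == orders.customerId.
    val joined = joinDataFrames("userId", "customerId")(users, orders)

    // Being an inner join, unmatched rows from either side are discarded.
    val matches = joined.count()
    ```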

  14. package numeric

    Numeric pipeline operations that operate on DataFrames.

  15. def renameColumns(nameMap: Map[String, String])(input: DataFrame): DataFrame

    Rename columns in a DataFrame

    nameMap

    a Map[String, String] from columns in the DataFrame to new names

    input

    the input DataFrame

    returns

    a new DataFrame with the renamed columns
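
    A minimal sketch of renameColumns, assuming a local SparkSession and sparkpipe on the classpath; the column names and data are hypothetical. The map's keys are existing column names and its values are the new names.

    ```scala
    import org.apache.spark.sql.SparkSession
    import software.uncharted.sparkpipe.ops.core.dataframe.renameColumns

    // Local session for illustration only.
    val spark = SparkSession.builder().master("local[*]").appName("rename-sketch").getOrCreate()
    import spark.implicits._

    // Hypothetical input with terse column names.
    val df = Seq((1, "x"), (2, "y")).toDF("a", "b")

    // Rename both columns in one step.
    val renamed = renameColumns(Map("a" -> "id", "b" -> "label"))(df)
    ```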

  16. def replaceColumn[I, O](columnName: String, columnFcn: (I) ⇒ O)(input: DataFrame)(implicit tagI: scala.reflect.api.JavaUniverse.TypeTag[I], tagO: scala.reflect.api.JavaUniverse.TypeTag[O]): DataFrame

    Takes a DataFrame and replaces a column in it using a transformation function

    columnName

    the column to replace

    columnFcn

    the transformation function applied to the column's values

    input

    The existing DataFrame

    returns

    A new DataFrame with the replaced column
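
    A minimal sketch of replaceColumn, assuming a local SparkSession and sparkpipe on the classpath; the column names and data are hypothetical. The function's input type I should match the column's existing type, and its output type O may differ.

    ```scala
    import org.apache.spark.sql.SparkSession
    import software.uncharted.sparkpipe.ops.core.dataframe.replaceColumn

    // Local session for illustration only.
    val spark = SparkSession.builder().master("local[*]").appName("replace-sketch").getOrCreate()
    import spark.implicits._

    // Hypothetical input with a lower-case name column.
    val df = Seq((1, "ann"), (2, "bob")).toDF("id", "name")

    // Replace "name" in place with its upper-cased values (String => String).
    val upper = replaceColumn("name", (s: String) => s.toUpperCase)(df)

    // Collect the transformed values for inspection.
    val names = upper.select("name").as[String].collect().sorted
    ```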

  17. package temporal

    Common pipeline operations for dealing with temporal data

  18. package text

    Common pipeline operations for dealing with textual data

  19. def toRDD(frame: DataFrame): RDD[Row]

    Convert a DataFrame to an RDD[Row]

    frame

    the DataFrame

    returns

    the underlying RDD[Row] from frame
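
    A minimal sketch of toRDD, assuming a local SparkSession and sparkpipe on the classpath; the column names and data are hypothetical. Each Row is read by field name, which avoids relying on column order.

    ```scala
    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.{Row, SparkSession}
    import software.uncharted.sparkpipe.ops.core.dataframe.toRDD

    // Local session for illustration only.
    val spark = SparkSession.builder().master("local[*]").appName("toRDD-sketch").getOrCreate()
    import spark.implicits._

    // Hypothetical two-column input.
    val df = Seq((1, "x"), (2, "y")).toDF("id", "label")

    // Drop down to the row-level RDD API.
    val rows: RDD[Row] = toRDD(df)

    // Extract the "id" field from every row.
    val ids = rows.map(row => row.getAs[Int]("id")).collect().sorted
    ```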
