Interface IDataFrame<IndexT, ValueT>

Interface that represents a dataframe. A dataframe contains an indexed sequence of data records. Think of it as a spreadsheet or CSV file in memory.

Each data record contains multiple named fields; the value of each field represents one row in a column of data. Each column of data is a named Series. You can think of a dataframe as a collection of named data series.

Type parameters

  • IndexT

    The type to use for the index.

  • ValueT

    The type to use for each row/data record.

Methods

__@iterator

  • __@iterator(): Iterator<ValueT>
  • Get an iterator to enumerate the rows of the dataframe. Enumerating the iterator forces lazy evaluation to complete. This function is automatically called by for...of.

    example
    
    for (const row of df) {
        // ... do something with the row ...
    }
    

    Returns Iterator<ValueT>

    An iterator for the rows in the dataframe.

after

  • after(indexValue: IndexT): IDataFrame<IndexT, ValueT>
  • Gets a new dataframe containing all rows after the specified index value (exclusive).

    example
    
    const df = new DataFrame({
         index: [0, 1, 2, 3], // This is the default index.
         values: [10, 20, 30, 40],
    });
    
    const lastHalf = df.after(1);
    expect(lastHalf.toArray()).to.eql([30, 40]);
    
    example
    
    const timeSeriesDf = ... a dataframe indexed by date/time ...
    
    // Get all rows after the specified date.
    const allRowsAfterStartDate = timeSeriesDf.after(new Date(2016, 5, 4));
    

    Parameters

    • indexValue: IndexT

      The index value after which to start the new dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe containing all rows after the specified index value.

aggregate

  • Aggregate the rows in the dataframe to a single result.

    example
    
    const dailySalesDf = ... daily sales figures for the past month ...
    const totalSalesForThisMonth = dailySalesDf.aggregate(
         0, // Seed - the starting value.
         (accumulator, row) => accumulator + row.SalesAmount // Aggregation function.
    );
    
    example
    
    const totalSalesAllTime = 500; // We'll seed the aggregation with this value.
    const dailySalesDf = ... daily sales figures for the past month ...
    const updatedTotalSalesAllTime = dailySalesDf.aggregate(
         totalSalesAllTime,
         (accumulator, row) => accumulator + row.SalesAmount
    );
    
    example
    
    var salesDataSummary = salesDataDf.aggregate({
         TotalSales: df => df.count(),
         AveragePrice: df => df.deflate(row => row.Price).average(),
         TotalRevenue: df => df.deflate(row => row.Revenue).sum(),
    });
    

    Type parameters

    • ToT

    Parameters

    • seedOrSelector: AggregateFn<ValueT, ToT> | ToT | IColumnAggregateSpec
    • Optional selector: AggregateFn<ValueT, ToT>

      Function that takes the seed and then each row in the dataframe and produces the aggregate value.

    Returns ToT

    Returns a new value that has been aggregated from the dataframe using the 'selector' function.
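
    The seed-plus-selector form behaves like a left fold over the rows. The sketch below models it on a plain array; aggregateRows and SalesRow are hypothetical names used only for illustration and are not part of the library.

```typescript
// Hypothetical row type for the illustration.
interface SalesRow {
    SalesAmount: number;
}

// Sketch of seed-plus-selector aggregation over a plain array of rows.
// The accumulator starts at the seed and is updated once per row.
function aggregateRows<ValueT, ToT>(
    rows: ValueT[],
    seed: ToT,
    selector: (accumulator: ToT, row: ValueT) => ToT
): ToT {
    let accumulator = seed;
    for (const row of rows) {
        accumulator = selector(accumulator, row);
    }
    return accumulator;
}

const dailySales: SalesRow[] = [{ SalesAmount: 100 }, { SalesAmount: 250 }, { SalesAmount: 50 }];
const addSale = (accumulator: number, row: SalesRow): number => accumulator + row.SalesAmount;

const totalForMonth = aggregateRows(dailySales, 0, addSale);    // Seeded with 0.
const updatedAllTime = aggregateRows(dailySales, 500, addSale); // Seeded with a running total.

console.log(totalForMonth);  // 400
console.log(updatedAllTime); // 900
```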

all

  • Evaluates a predicate function for every row in the dataframe to determine if some condition is true/truthy for all rows in the dataframe.

    example
    
    const everyoneIsNamedFred = df.all(row => row.CustomerName === "Fred"); // Check if all customers are named Fred.
    

    Parameters

    • predicate: PredicateFn<ValueT>

      Predicate function that receives each row. It should return true/truthy for a match, otherwise false/falsy.

    Returns boolean

    Returns true if the predicate has returned true or truthy for every row in the dataframe, otherwise returns false. Returns false for an empty dataframe.
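
    Note that the empty-dataframe rule differs from JavaScript's Array.prototype.every, which returns true for an empty array. The sketch below models the documented behaviour on a plain array; allRows and CustomerRow are hypothetical names for illustration only.

```typescript
// Hypothetical row type used only for this illustration.
interface CustomerRow {
    CustomerName: string;
}

// Sketch of the documented semantics on a plain array:
// true only when there is at least one row and every row matches.
function allRows<ValueT>(rows: ValueT[], predicate: (row: ValueT) => boolean): boolean {
    return rows.length > 0 && rows.every(predicate);
}

const isFred = (row: CustomerRow): boolean => row.CustomerName === "Fred";

const allFreds: CustomerRow[] = [{ CustomerName: "Fred" }, { CustomerName: "Fred" }];
const mixedNames: CustomerRow[] = [{ CustomerName: "Fred" }, { CustomerName: "Wilma" }];
const noRows: CustomerRow[] = [];

console.log(allRows(allFreds, isFred));   // true
console.log(allRows(mixedNames, isFred)); // false
console.log(allRows(noRows, isFred));     // false, whereas noRows.every(isFred) is true
```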

any

  • Evaluates a predicate function for every row in the dataframe to determine if some condition is true/truthy for any of the rows in the dataframe.

    If no predicate is specified then it simply checks if the dataframe contains more than zero rows.

    example
    
    const anyFreds = df.any(row => row.CustomerName === "Fred"); // Do we have any customers named Fred?
    
    example
    
    const anyCustomers = df.any(); // Do we have any customers at all?
    

    Parameters

    Returns boolean

    Returns true if the predicate has returned truthy for any row in the sequence, otherwise returns false. If no predicate is passed it returns true if the dataframe contains any rows at all. Returns false for an empty dataframe.

appendPair

  • appendPair(pair: [IndexT, ValueT]): IDataFrame<IndexT, ValueT>
  • Append a pair to the end of a dataframe. Doesn't modify the original dataframe! The returned dataframe is entirely new and contains rows from the original dataframe plus the appended pair.

    example
    
    const newIndex = ... index of the new row ...
    const newRow = ... the new data row to append ...
    const appendedDf = df.appendPair([newIndex, newRow]);
    

    Parameters

    • pair: [IndexT, ValueT]

      The index/value pair to append.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with the specified pair appended.

at

  • at(index: IndexT): ValueT | undefined
  • Get the row, if there is one, with the specified index.

    example
    
    const row = df.at(5); // Get the row at index 5 (with a default 0-based index).
    
    example
    
    const date = ... some date ...
    // Retrieve the row with the specified date from a time-series dataframe (assuming a date index has been applied).
    const row = df.at(date);
    

    Parameters

    • index: IndexT

      Index for which to retrieve the row.

    Returns ValueT | undefined

    Returns the row at the specified index in the dataframe, or undefined if there is no such index present in the dataframe.

bake

  • Forces lazy evaluation to complete and 'bakes' the dataframe into memory.

    example
    
    const bakedDf = df.bake();
    

    Returns IDataFrame<IndexT, ValueT>

    Returns a dataframe that has been 'baked': all lazy evaluation has completed.

before

  • before(indexValue: IndexT): IDataFrame<IndexT, ValueT>
  • Gets a new dataframe containing all rows up to the specified index value (exclusive).

    example
    
    const df = new DataFrame({
         index: [0, 1, 2, 3], // This is the default index.
         values: [10, 20, 30, 40],
    });
    
    const firstHalf = df.before(2);
    expect(firstHalf.toArray()).to.eql([10, 20]);
    
    example
    
    const timeSeriesDf = ... a dataframe indexed by date/time ...
    
    // Get all rows before the specified date.
    const allRowsBeforeEndDate = timeSeriesDf.before(new Date(2016, 5, 4));
    

    Parameters

    • indexValue: IndexT

      The index value at which to end the new dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe containing all rows up to (but not including) the specified index value.

between

  • between(startIndexValue: IndexT, endIndexValue: IndexT): IDataFrame<IndexT, ValueT>
  • Gets a new dataframe containing all rows between the specified index values (inclusive).

    example
    
    const df = new DataFrame({
         index: [0, 1, 2, 3, 4, 5], // This is the default index.
         values: [10, 20, 30, 40, 50, 60],
    });
    
    const middleSection = df.between(1, 4);
    expect(middleSection.toArray()).to.eql([20, 30, 40, 50]);
    
    example
    
    const timeSeriesDf = ... a dataframe indexed by date/time ...
    
    // Get all rows between the start and end dates (inclusive).
    const allRowsBetweenDates = timeSeriesDf.between(new Date(2016, 5, 4), new Date(2016, 5, 22));
    

    Parameters

    • startIndexValue: IndexT

      The index at which to start the new dataframe.

    • endIndexValue: IndexT

      The index at which to end the new dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe containing all values between the specified index values (inclusive).
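
    The boundary rules are easy to mix up: after and before exclude the given index value, while between includes both ends. The sketch below models them on a plain array of index/value pairs, assuming a numeric index sorted in ascending order. The helper names are hypothetical and this is not the library's implementation.

```typescript
// An indexed sequence modelled as [index, value] pairs, sorted by index.
type IndexedPair = [number, number];

const pairs: IndexedPair[] = [[0, 10], [1, 20], [2, 30], [3, 40], [4, 50], [5, 60]];

// Rows strictly after the index value (exclusive).
function afterSketch(rows: IndexedPair[], indexValue: number): IndexedPair[] {
    return rows.filter(pair => pair[0] > indexValue);
}

// Rows strictly before the index value (exclusive).
function beforeSketch(rows: IndexedPair[], indexValue: number): IndexedPair[] {
    return rows.filter(pair => pair[0] < indexValue);
}

// Rows from the start index to the end index (both inclusive).
function betweenSketch(rows: IndexedPair[], startIndexValue: number, endIndexValue: number): IndexedPair[] {
    return rows.filter(pair => pair[0] >= startIndexValue && pair[0] <= endIndexValue);
}

// Drop the index and keep only the values.
function valuesOf(rows: IndexedPair[]): number[] {
    return rows.map(pair => pair[1]);
}

console.log(valuesOf(afterSketch(pairs, 1)));      // [30, 40, 50, 60]
console.log(valuesOf(beforeSketch(pairs, 2)));     // [10, 20]
console.log(valuesOf(betweenSketch(pairs, 1, 4))); // [20, 30, 40, 50]
```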

bringToBack

  • bringToBack(columnOrColumns: string | string[]): IDataFrame<IndexT, ValueT>
  • Bring the column(s) with specified name(s) to the back of the column order, making it (or them) the last column(s) in the output dataframe.

    example
    const modifiedDf = df.bringToBack("NewLastColumn");
    
    example
    const modifiedDf = df.bringToBack(["NewSecondLastColumn", "NewLastColumn"]);
    

    Parameters

    • columnOrColumns: string | string[]

      Specifies the column or columns to bring to the back.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with 1 or more columns brought to the back of the column ordering.

bringToFront

  • bringToFront(columnOrColumns: string | string[]): IDataFrame<IndexT, ValueT>
  • Bring the column(s) with specified name(s) to the front of the column order, making it (or them) the first column(s) in the output dataframe.

    example
    const modifiedDf = df.bringToFront("NewFirstColumn");
    
    example
    const modifiedDf = df.bringToFront(["NewFirstColumn", "NewSecondColumn"]);
    

    Parameters

    • columnOrColumns: string | string[]

      Specifies the column or columns to bring to the front.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with 1 or more columns brought to the front of the column ordering.

cast

  • cast<NewValueT>(): IDataFrame<IndexT, NewValueT>
  • Cast the value of the dataframe to a new type. This operation has no effect but to retype the rows that the dataframe contains.

    example
    
    const castDf = df.cast();
    

    Type parameters

    • NewValueT

    Returns IDataFrame<IndexT, NewValueT>

    The same dataframe, but with the type changed.

concat

  • Concatenate multiple other dataframes onto this dataframe.

    example
    
    const concatenated = a.concat(b);
    
    example
    
    const concatenated = a.concat(b, c);
    
    example
    
    const concatenated = a.concat([b, c]);
    
    example
    
    const concatenated = a.concat(b, [c, d]);
    
    example
    
    const otherDfs = [... array of dataframes...];
    const concatenated = a.concat(otherDfs);
    

    Parameters

    • Rest ...dataframes: (IDataFrame<IndexT, ValueT> | IDataFrame<IndexT, ValueT>[])[]

      Multiple arguments. Each can be either a dataframe or an array of dataframes.

    Returns IDataFrame<IndexT, ValueT>

    Returns a single dataframe concatenated from multiple input dataframes.

count

  • count(): number
  • Count the number of rows in the dataframe.

    example
    
    const numRows = df.count();
    

    Returns number

    Returns the count of all rows.

defaultIfEmpty

  • defaultIfEmpty(defaultSequence: ValueT[] | IDataFrame<IndexT, ValueT>): IDataFrame<IndexT, ValueT>
  • Returns the specified default dataframe if the input dataframe is empty.

    example
    
    const emptyDataFrame = new DataFrame();
    const defaultDataFrame = new DataFrame([ { A: 1 }, { A: 2 }, { A: 3 } ]);
    expect(emptyDataFrame.defaultIfEmpty(defaultDataFrame)).to.eql(defaultDataFrame);
    
    example
    
    const nonEmptyDataFrame = new DataFrame([ { A: 100 }]);
    const defaultDataFrame = new DataFrame([ { A: 1 }, { A: 2 }, { A: 3 } ]);
    expect(nonEmptyDataFrame.defaultIfEmpty(defaultDataFrame)).to.eql(nonEmptyDataFrame);
    

    Parameters

    • defaultSequence: ValueT[] | IDataFrame<IndexT, ValueT>

      Default dataframe to return if the input dataframe is empty.

    Returns IDataFrame<IndexT, ValueT>

    Returns 'defaultSequence' if the input dataframe is empty.

deflate

  • Converts (deflates) a dataframe to a Series.

    example
    
    const series = df.deflate(); // Deflate to a series of objects.
    
    example
    
    const series = df.deflate(row => row.SomeColumn); // Extract a particular column.
    

    Type parameters

    • ToT

    Parameters

    Returns ISeries<IndexT, ToT>

    Returns a series that was created from the original dataframe.

detectTypes

  • Detect the frequency of the types of the values in the dataframe. This is a good way to understand the shape of your data.

    example
    
    const df = dataForge.readFileSync("./my-data.json").parseJSON();
    const dataTypes = df.detectTypes();
    console.log(dataTypes.toString());
    

    Returns IDataFrame<number, ITypeFrequency>

    Returns a dataframe with rows that conform to ITypeFrequency that describes the data types contained in the original dataframe.

detectValues

  • Detect the frequency of the values in the dataframe. This is a good way to understand the shape of your data.

    example
    
    const df = dataForge.readFileSync("./my-data.json").parseJSON();
    const dataValues = df.detectValues();
    console.log(dataValues.toString());
    

    Returns IDataFrame<number, IValueFrequency>

    Returns a dataframe with rows that conform to IValueFrequency that describes the values contained in the original dataframe.

distinct

  • Returns only the set of rows in the dataframe that are distinct according to some criteria. This can be used to remove duplicate rows from the dataframe.

    example
    
    // Remove duplicate rows by customer id. Will return only a single row per customer.
    const distinctCustomers = salesDf.distinct(sale => sale.CustomerId);
    

    Type parameters

    • ToT

    Parameters

    • Optional selector: SelectorFn<ValueT, ToT>

      User-defined selector function that specifies the criteria used to make comparisons for duplicate rows.

    Returns IDataFrame<IndexT, ValueT>

    Returns a dataframe containing only unique values as determined by the 'selector' function.
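
    The selector acts as a key: rows that produce the same key count as duplicates. The sketch below models this on a plain array and assumes that the first row seen for each key is the one kept; that choice, and the names distinctRows and SaleRow, are assumptions made for illustration.

```typescript
// Hypothetical row type for the illustration.
interface SaleRow {
    CustomerId: number;
    Amount: number;
}

// Sketch: keep one row per key, where the key comes from the selector.
// Assumption: the first row encountered for a key wins.
function distinctRows<ValueT, ToT>(rows: ValueT[], selector: (row: ValueT) => ToT): ValueT[] {
    const seenKeys = new Set<ToT>();
    const output: ValueT[] = [];
    for (const row of rows) {
        const key = selector(row);
        if (!seenKeys.has(key)) {
            seenKeys.add(key);
            output.push(row);
        }
    }
    return output;
}

const sales: SaleRow[] = [
    { CustomerId: 1, Amount: 10 },
    { CustomerId: 2, Amount: 20 },
    { CustomerId: 1, Amount: 30 }, // Duplicate customer, dropped.
];

const distinctCustomers = distinctRows(sales, sale => sale.CustomerId);
console.log(distinctCustomers.map(sale => sale.Amount)); // [10, 20]
```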

dropSeries

  • dropSeries<NewValueT>(columnOrColumns: string | string[]): IDataFrame<IndexT, NewValueT>
  • Create a new dataframe with the requested column or columns dropped.

    example
    const modifiedDf = df.dropSeries("SomeColumn");
    
    example
    const modifiedDf = df.dropSeries(["ColumnA", "ColumnB"]);
    

    Type parameters

    • NewValueT

    Parameters

    • columnOrColumns: string | string[]

      Specifies the column name (a string) or columns (array of strings) to drop.

    Returns IDataFrame<IndexT, NewValueT>

    Returns a new dataframe with a particular named column or columns removed.

endAt

  • endAt(indexValue: IndexT): IDataFrame<IndexT, ValueT>
  • Gets a new dataframe containing all rows up until and including the specified index value (inclusive).

    example
    
    const df = new DataFrame({
         index: [0, 1, 2, 3], // This is the default index.
         values: [10, 20, 30, 40],
    });
    
    const firstHalf = df.endAt(1);
    expect(firstHalf.toArray()).to.eql([10, 20]);
    
    example
    
    const timeSeriesDf = ... a dataframe indexed by date/time ...
    
    // Get all rows ending at a particular date.
    const allRowsUpToAndIncludingTheExactEndDate = df.endAt(new Date(2016, 5, 4));
    

    Parameters

    • indexValue: IndexT

      The index value at which to end the new dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe containing all rows up until and including the specified index value.

ensureSeries

  • Add a series to the dataframe, but only if it doesn't already exist.

    example
    
    const updatedDf = df.ensureSeries("ANewColumn", new Series([1, 2, 3]));
    
    example
    
    const updatedDf = df.ensureSeries("ANewColumn", df =>
         df.getSeries("AnExistingSeries").select(aTransformation)
    );
    
    example
    
    const modifiedDf = df.ensureSeries({ ANewColumn: new Series([1, 2, 3]), SomeOtherColumn: new Series([10, 20, 30]) });
    
    example
    
    const modifiedDf = df.ensureSeries({ ANewColumn: df => df.getSeries("SourceData").select(aTransformation) });
    

    Type parameters

    • SeriesValueT

    Parameters

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with the specified series added, if the series didn't already exist. Otherwise if the requested series already exists the same dataframe is returned.

except

  • except<InnerIndexT, InnerValueT, KeyT>(inner: IDataFrame<InnerIndexT, InnerValueT>, outerSelector?: SelectorFn<ValueT, KeyT>, innerSelector?: SelectorFn<InnerValueT, KeyT>): IDataFrame<IndexT, ValueT>
  • Creates a new dataframe by merging two input dataframes. The resulting dataframe contains only the rows from the 1st dataframe that don't appear in the 2nd dataframe. This is essentially subtracting the rows from the 2nd dataframe from the 1st and creating a new dataframe with the remaining rows.

    example
    
    const dfA = ...
    const dfB = ...
    const remainingDf = dfA.except(dfB);
    
    example
    
    // Find the list of customers who haven't bought anything recently.
    const allCustomers = ... list of all customers ...
    const recentCustomers = ... list of customers who have purchased recently ...
    const remainingCustomers = allCustomers.except(
         recentCustomers,
         customerRecord => customerRecord.CustomerId
    );
    

    Type parameters

    • InnerIndexT

    • InnerValueT

    • KeyT

    Parameters

    • inner: IDataFrame<InnerIndexT, InnerValueT>

      The inner dataframe to merge (the dataframe you call the function on is the 'outer' dataframe).

    • Optional outerSelector: SelectorFn<ValueT, KeyT>
    • Optional innerSelector: SelectorFn<InnerValueT, KeyT>

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe that contains only the rows from the 1st dataframe that don't appear in the 2nd dataframe.
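
    The subtraction can be pictured as filtering the outer rows by a set of keys taken from the inner rows. The sketch below models this on plain arrays; exceptRows and CustomerRecord are hypothetical names, and the sketch takes both selectors explicitly rather than modelling the optional defaults.

```typescript
// Hypothetical row type for the illustration.
interface CustomerRecord {
    CustomerId: number;
    Name: string;
}

// Sketch: keep the outer rows whose key does not appear among the inner rows.
function exceptRows<OuterT, InnerT, KeyT>(
    outer: OuterT[],
    inner: InnerT[],
    outerSelector: (row: OuterT) => KeyT,
    innerSelector: (row: InnerT) => KeyT
): OuterT[] {
    const innerKeys = new Set<KeyT>(inner.map(row => innerSelector(row)));
    return outer.filter(row => !innerKeys.has(outerSelector(row)));
}

const allCustomers: CustomerRecord[] = [
    { CustomerId: 1, Name: "Ann" },
    { CustomerId: 2, Name: "Bob" },
    { CustomerId: 3, Name: "Cy" },
];
const recentCustomers: CustomerRecord[] = [{ CustomerId: 2, Name: "Bob" }];

const byCustomerId = (record: CustomerRecord): number => record.CustomerId;
const remainingCustomers = exceptRows(allCustomers, recentCustomers, byCustomerId, byCustomerId);

console.log(remainingCustomers.map(record => record.Name)); // ["Ann", "Cy"]
```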

expectSeries

  • expectSeries<SeriesValueT>(columnName: string): ISeries<IndexT, SeriesValueT>
  • Verify the existence of a named column and extract the Series for it. Throws an exception if the requested column doesn't exist.

    example
    
    try {
         const series = df.expectSeries("SomeColumn");
         // ... do something with the series ...
    }
    catch (err) {
         // ... the dataframe doesn't contain the column "SomeColumn" ...
    }
    

    Type parameters

    • SeriesValueT

    Parameters

    • columnName: string

      Name of the column to extract.

    Returns ISeries<IndexT, SeriesValueT>

    Returns the Series for the column if it exists, otherwise it throws an exception.

fillGaps

  • fillGaps(comparer: ComparerFn<[IndexT, ValueT], [IndexT, ValueT]>, generator: GapFillFn<[IndexT, ValueT], [IndexT, ValueT]>): IDataFrame<IndexT, ValueT>
  • Fill gaps in a dataframe.

    example
    
    var sequenceWithGaps = ...
    
    // Predicate that determines if there is a gap.
    var gapExists = (pairA, pairB) => {
        // Returns true if there is a gap.
        return true;
    };
    
    // Generator function that produces new rows to fill the gap.
    var gapFiller = (pairA, pairB) => {
        // Create an array of index, value pairs that fill the gaps between pairA and pairB.
        return [
            newPair1,
            newPair2,
            newPair3,
        ];
    };
    
    var sequenceWithoutGaps = sequenceWithGaps.fillGaps(gapExists, gapFiller);
    

    Parameters

    • comparer: ComparerFn<[IndexT, ValueT], [IndexT, ValueT]>

      User-defined comparer function that is passed pairA and pairB, two consecutive rows. It should return truthy if there is a gap between the rows, or falsy if there is no gap.

    • generator: GapFillFn<[IndexT, ValueT], [IndexT, ValueT]>

      User-defined generator function that is passed pairA and pairB, two consecutive rows. It should return an array of pairs that fills the gap between the rows.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with gaps filled in.
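
    The comparer and generator work as a pair: the comparer is asked about each two consecutive rows, and the generator is called only where a gap was reported. The sketch below models that flow on a plain array of index/value pairs. fillGapsSketch is a hypothetical helper, and the policy of filling with the previous value is only one possible choice.

```typescript
// An [index, value] pair.
type Pair = [number, number];

// Sketch: walk consecutive pairs and splice in generated pairs wherever the comparer reports a gap.
function fillGapsSketch(
    pairs: Pair[],
    comparer: (pairA: Pair, pairB: Pair) => boolean,
    generator: (pairA: Pair, pairB: Pair) => Pair[]
): Pair[] {
    const output: Pair[] = [];
    pairs.forEach((pairA, position) => {
        output.push(pairA);
        const pairB = pairs[position + 1];
        if (pairB !== undefined && comparer(pairA, pairB)) {
            output.push(...generator(pairA, pairB));
        }
    });
    return output;
}

const sequenceWithGaps: Pair[] = [[1, 10], [2, 20], [5, 50]];

// A gap exists when the indexes are more than one step apart.
const gapExists = (pairA: Pair, pairB: Pair): boolean => pairB[0] - pairA[0] > 1;

// Fill each missing index by carrying the previous value forward.
const gapFiller = (pairA: Pair, pairB: Pair): Pair[] => {
    const newPairs: Pair[] = [];
    for (let index = pairA[0] + 1; index < pairB[0]; ++index) {
        newPairs.push([index, pairA[1]]);
    }
    return newPairs;
};

const sequenceWithoutGaps = fillGapsSketch(sequenceWithGaps, gapExists, gapFiller);
console.log(JSON.stringify(sequenceWithoutGaps)); // [[1,10],[2,20],[3,20],[4,20],[5,50]]
```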

first

  • first(): ValueT
  • Get the first row of the dataframe.

    example
    
    const firstRow = df.first();
    

    Returns ValueT

    Returns the first row of the dataframe.

forEach

  • Invoke a callback function for each row in the dataframe.

    example
    
    df.forEach(row => {
         // ... do something with the row ...
    });
    

    Parameters

    • callback: CallbackFn<ValueT>

      The callback function to invoke for each row.

    Returns IDataFrame<IndexT, ValueT>

    Returns the original dataframe with no modifications.

generateSeries

  • Generate new columns based on existing rows.

    This is equivalent to calling select to transform the original dataframe to a new dataframe with different columns, then using withSeries to merge the columns of both the new and original dataframes.

    example
    
    function produceNewColumns (inputRow) {
         const newColumns = {
             // ... specify new columns and their values based on the input row ...
         };
    
         return newColumns;
    };
    
    const dfWithNewSeries = df.generateSeries(row => produceNewColumns(row));
    
    example
    
    const dfWithNewSeries = df.generateSeries({
         NewColumnA: row => produceNewColumnA(row),
         NewColumnB: row => produceNewColumnB(row),
    })
    

    Type parameters

    • NewValueT

    Parameters

    • generator: SelectorWithIndexFn<any, any> | IColumnTransformSpec

      Generator function that transforms each row to produce 1 or more new columns. Or use a column spec that has fields for each column, the fields specify a generate function that produces the value for each new column.

    Returns IDataFrame<IndexT, NewValueT>

    Returns a new dataframe with 1 or more new columns.
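
    In row terms, the result keeps every original field and adds the generated ones. The sketch below models this on a plain array of row objects; generateColumns and PriceRow are hypothetical names, and the sketch covers only the generator-function form, not the column spec form.

```typescript
// Hypothetical row type for the illustration.
interface PriceRow {
    Price: number;
    Quantity: number;
}

// Sketch: merge the fields returned by the generator into a copy of each row.
function generateColumns<ValueT extends object, NewColumnsT extends object>(
    rows: ValueT[],
    generator: (row: ValueT) => NewColumnsT
): (ValueT & NewColumnsT)[] {
    return rows.map(row => ({ ...row, ...generator(row) }));
}

const priceRows: PriceRow[] = [
    { Price: 2, Quantity: 3 },
    { Price: 5, Quantity: 1 },
];

// Add a Revenue column computed from the existing columns.
const withRevenue = generateColumns(priceRows, row => ({ Revenue: row.Price * row.Quantity }));

console.log(withRevenue.map(row => row.Revenue)); // [6, 5]
console.log(withRevenue.map(row => row.Price));   // [2, 5]
```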

getColumnNames

  • getColumnNames(): string[]
  • Get the names of the columns in the dataframe.

    example
    
    console.log(df.getColumnNames());
    

    Returns string[]

    Returns an array of the column names in the dataframe.

getColumns

  • Retrieve the collection of all columns in the dataframe.

    example
    
    for (const column of df.getColumns()) {
         console.log("Column name: ");
         console.log(column.name);
    
         console.log("Data:");
         console.log(column.series.toArray());
    }
    

    Returns ISeries<number, IColumn>

    Returns a Series of the columns in the dataframe. Each column record holds the column name and the Series of values for that column.

getIndex

  • Get the index for the dataframe.

    example
    
    const index = df.getIndex();
    

    Returns IIndex<IndexT>

    The Index for the dataframe.

getSeries

  • getSeries<SeriesValueT>(columnName: string): ISeries<IndexT, SeriesValueT>
  • Extract a Series from a named column in the dataframe.

    example
    
    const series = df.getSeries("SomeColumn");
    

    Type parameters

    • SeriesValueT

    Parameters

    • columnName: string

      Specifies the name of the column that contains the Series to retrieve.

    Returns ISeries<IndexT, SeriesValueT>

    Returns the Series extracted from the named column in the dataframe.

groupBy

  • Collects rows in the dataframe into a Series of groups according to a user-defined selector function.

    example
    
    const salesDf = ... product sales ...
    const salesByProduct = salesDf.groupBy(sale => sale.ProductId);
    for (const productSalesGroup of salesByProduct) {
         // ... do something with each product group ...
         const productId = productSalesGroup.first().ProductId;
         const totalSalesForProduct = productSalesGroup.deflate(sale => sale.Amount).sum();
         console.log(totalSalesForProduct);
    }
    

    Type parameters

    • GroupT

    Parameters

    • selector: SelectorWithIndexFn<ValueT, GroupT>

      User-defined selector function that specifies the criteria to group by.

    Returns ISeries<number, IDataFrame<IndexT, ValueT>>

    Returns a Series of groups. Each group is a dataframe with rows that have been grouped by the 'selector' function.

groupSequentialBy

  • Collects rows in the dataframe into a Series of groups. Consecutive rows are placed in the same group when they are the same or when a user-defined selector function returns the same value for them.

    example
    
    // Some ultra simple stock trading strategy backtesting...
    const dailyStockPriceDf = ... daily stock price for a company ...
    const priceGroups = dailyStockPriceDf.groupSequentialBy(day => day.close > day.movingAverage);
    for (const priceGroup of priceGroups) {
         // ... do something with each stock price group ...
    
         const firstDay = priceGroup.first();
         if (firstDay.close > firstDay.movingAverage) {
             // This group of days has the stock price above its moving average.
             // ... maybe enter a long trade here ...
         }
         else {
             // This group of days has the stock price below its moving average.
             // ... maybe enter a short trade here ...
         }
    }
    

    Type parameters

    • GroupT

    Parameters

    Returns ISeries<number, IDataFrame<IndexT, ValueT>>

    Returns a Series of groups. Each group is a dataframe with rows that are the same or have been grouped by the 'selector' function.
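
    The difference from groupBy is that a group ends as soon as the selector result changes, so the same key can produce several groups. The sketch below models this on a plain array, assuming (as the method name suggests) that only consecutive rows with the same selector result share a group. groupSequentialRows and DayRow are hypothetical names.

```typescript
// Hypothetical row type for the illustration.
interface DayRow {
    close: number;
    movingAverage: number;
}

// Sketch: start a new group whenever the selector result differs from the previous row's result.
function groupSequentialRows<ValueT, GroupT>(rows: ValueT[], selector: (row: ValueT) => GroupT): ValueT[][] {
    const groups: ValueT[][] = [];
    let currentGroup: ValueT[] = [];
    let currentKey: GroupT | undefined = undefined;
    for (const row of rows) {
        const key = selector(row);
        if (currentGroup.length > 0 && key !== currentKey) {
            groups.push(currentGroup);
            currentGroup = [];
        }
        currentGroup.push(row);
        currentKey = key;
    }
    if (currentGroup.length > 0) {
        groups.push(currentGroup);
    }
    return groups;
}

const dailyPrices: DayRow[] = [
    { close: 11, movingAverage: 10 }, // above
    { close: 12, movingAverage: 10 }, // above
    { close: 9, movingAverage: 10 },  // below
    { close: 8, movingAverage: 10 },  // below
    { close: 13, movingAverage: 10 }, // above again, so a third group starts
];

const priceRuns = groupSequentialRows(dailyPrices, day => day.close > day.movingAverage);
console.log(priceRuns.map(run => run.length)); // [2, 2, 1]
```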

hasSeries

  • hasSeries(columnName: string): boolean
  • Determine if the dataframe contains a Series with the specified column name.

    example
    
    if (df.hasSeries("SomeColumn")) {
         // ... the dataframe contains a series with the specified column name ...
    }
    

    Parameters

    • columnName: string

      Name of the column to check for.

    Returns boolean

    Returns true if the dataframe contains the requested Series, otherwise returns false.

head

  • head(numValues: number): IDataFrame<IndexT, ValueT>
  • Get X rows from the start of the dataframe. Pass in a negative value to get all rows at the head except for X rows at the tail.

    example
    
    const sample = df.head(10); // Take a sample of 10 rows from the start of the dataframe.
    

    Parameters

    • numValues: number

      Number of rows to take.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe that has only the specified number of rows taken from the start of the original dataframe.
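
    The positive and negative cases follow the same convention as the end argument of Array.prototype.slice. The sketch below models the description on a plain array; headRows is a hypothetical helper and not the library's implementation.

```typescript
// Sketch: a positive count takes rows from the start,
// a negative count takes everything except that many rows at the tail.
function headRows<ValueT>(rows: ValueT[], numValues: number): ValueT[] {
    return rows.slice(0, numValues);
}

const sampleRows: number[] = [1, 2, 3, 4, 5];

console.log(headRows(sampleRows, 2));  // [1, 2]
console.log(headRows(sampleRows, -2)); // [1, 2, 3]
```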

inflateSeries

  • Inflate a named Series in the dataframe to 1 or more new series in the new dataframe.

    This is the equivalent of extracting the series using getSeries, transforming it with Series.select and then running Series.inflate to create a new dataframe, then merging each column of the new dataframe into the original dataframe using withSeries.

    example
    
    function newColumnGenerator (row) {
         const newColumns = {
             // ... create 1 field per new column ...
         };
    
         return newColumns;
    }
    
    const dfWithNewSeries = df.inflateSeries("SomeColumn", newColumnGenerator);
    

    Type parameters

    • NewValueT

    Parameters

    • columnName: string

      Name of the series to inflate.

    • Optional selector: SelectorWithIndexFn<IndexT, any>

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with a column inflated to 1 or more new columns.

insertPair

  • insertPair(pair: [IndexT, ValueT]): IDataFrame<IndexT, ValueT>
  • Insert a pair at the start of the dataframe. Doesn't modify the original dataframe! The returned dataframe is entirely new and contains rows from the original dataframe plus the inserted pair.

    example
    
    const newIndex = ... index of the new row ...
    const newRow = ... the new data row to insert ...
    const insertedDf = df.insertPair([newIndex, newRow]);
    

    Parameters

    • pair: [IndexT, ValueT]

      The index/value pair to insert.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with the specified pair inserted.

intersection

  • intersection<InnerIndexT, InnerValueT, KeyT>(inner: IDataFrame<InnerIndexT, InnerValueT>, outerSelector?: SelectorFn<ValueT, KeyT>, innerSelector?: SelectorFn<InnerValueT, KeyT>): IDataFrame<IndexT, ValueT>
  • Creates a new dataframe by merging two input dataframes. The resulting dataframe contains the intersection of rows from the two input dataframes. These are only the rows that appear in both dataframes.

    example
    
    const dfA = ...
    const dfB = ...
    const mergedDf = dfA.intersection(dfB);
    
    example
    
    // Merge two sets of customer records to find only the
    // customers that appear in both.
    const customerRecordsA = ...
    const customerRecordsB = ...
    const intersectionOfCustomerRecords = customerRecordsA.intersection(
         customerRecordsB,
         customerRecord => customerRecord.CustomerId
    );
    

    Type parameters

    • InnerIndexT

    • InnerValueT

    • KeyT

    Parameters

    • inner: IDataFrame<InnerIndexT, InnerValueT>

      The inner dataframe to merge (the dataframe you call the function on is the 'outer' dataframe).

    • Optional outerSelector: SelectorFn<ValueT, KeyT>
    • Optional innerSelector: SelectorFn<InnerValueT, KeyT>

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe that contains the intersection of rows from the two input dataframes.

join

  • join<KeyT, InnerIndexT, InnerValueT, ResultValueT>(inner: IDataFrame<InnerIndexT, InnerValueT>, outerKeySelector: SelectorFn<ValueT, KeyT>, innerKeySelector: SelectorFn<InnerValueT, KeyT>, resultSelector: JoinFn<ValueT, InnerValueT, ResultValueT>): IDataFrame<number, ResultValueT>
  • Creates a new dataframe by merging two input dataframes. The resulting dataframe contains only those rows that have matching keys in both input dataframes.

    example
    
    // Join together two sets of customers to find those
    // that have bought both product A and product B.
    const customerWhoBoughtProductA = ...
    const customerWhoBoughtProductB = ...
    const customersWhoBoughtBothProductsDf = customerWhoBoughtProductA.join(
             customerWhoBoughtProductB,
             customerA => customerA.CustomerId, // Join key.
             customerB => customerB.CustomerId, // Join key.
             (customerA, customerB) => {
                 return {
                     // ... merge the results ...
                 };
             }
         );
    

    Type parameters

    • KeyT

    • InnerIndexT

    • InnerValueT

    • ResultValueT

    Parameters

    • inner: IDataFrame<InnerIndexT, InnerValueT>

      The 'inner' dataframe to join (the dataframe you are calling the function on is the 'outer' dataframe).

    • outerKeySelector: SelectorFn<ValueT, KeyT>

      User-defined selector function that chooses the join key from the outer dataframe.

    • innerKeySelector: SelectorFn<InnerValueT, KeyT>

      User-defined selector function that chooses the join key from the inner dataframe.

    • resultSelector: JoinFn<ValueT, InnerValueT, ResultValueT>

      User-defined function that merges outer and inner values.

    Returns IDataFrame<number, ResultValueT>

    Returns the new merged dataframe.
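
    The two key selectors decide which rows pair up, and the result selector decides what each merged row looks like. The sketch below models this matching on plain arrays with a simple nested loop. joinRows and the row types are hypothetical names, and nothing here reflects how the library performs the join internally.

```typescript
// Hypothetical row types for the illustration.
interface BuyerOfA {
    CustomerId: number;
    Name: string;
}
interface BuyerOfB {
    CustomerId: number;
    Spend: number;
}

// Sketch: emit one merged row for every outer/inner pair whose keys are equal.
function joinRows<OuterT, InnerT, KeyT, ResultT>(
    outer: OuterT[],
    inner: InnerT[],
    outerKeySelector: (row: OuterT) => KeyT,
    innerKeySelector: (row: InnerT) => KeyT,
    resultSelector: (outerRow: OuterT, innerRow: InnerT) => ResultT
): ResultT[] {
    const results: ResultT[] = [];
    for (const outerRow of outer) {
        for (const innerRow of inner) {
            if (outerKeySelector(outerRow) === innerKeySelector(innerRow)) {
                results.push(resultSelector(outerRow, innerRow));
            }
        }
    }
    return results;
}

const boughtProductA: BuyerOfA[] = [
    { CustomerId: 1, Name: "Ann" },
    { CustomerId: 2, Name: "Bob" },
    { CustomerId: 3, Name: "Cy" },
];
const boughtProductB: BuyerOfB[] = [
    { CustomerId: 2, Spend: 40 },
    { CustomerId: 3, Spend: 15 },
    { CustomerId: 4, Spend: 99 },
];

const boughtBoth = joinRows(
    boughtProductA,
    boughtProductB,
    customerA => customerA.CustomerId, // Join key.
    customerB => customerB.CustomerId, // Join key.
    (customerA, customerB) => ({ Name: customerA.Name, Spend: customerB.Spend })
);

console.log(boughtBoth.map(row => row.Name)); // ["Bob", "Cy"]
```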

joinOuter

  • joinOuter<KeyT, InnerIndexT, InnerValueT, ResultValueT>(inner: IDataFrame<InnerIndexT, InnerValueT>, outerKeySelector: SelectorFn<ValueT, KeyT>, innerKeySelector: SelectorFn<InnerValueT, KeyT>, resultSelector: JoinFn<ValueT | null, InnerValueT | null, ResultValueT>): IDataFrame<number, ResultValueT>
  • Creates a new dataframe by merging two input dataframes. The resulting dataframe contains only those rows that are present in one or the other of the dataframes, not both.

    example
    
    // Join together two sets of customers to find those
    // that have bought either product A or product B, but not both.
    const customerWhoBoughtProductA = ...
    const customerWhoBoughtProductB = ...
    const customersWhoBoughtEitherProductButNotBothDf = customerWhoBoughtProductA.joinOuter(
             customerWhoBoughtProductB,
             customerA => customerA.CustomerId, // Join key.
             customerB => customerB.CustomerId, // Join key.
             (customerA, customerB) => {
                 return {
                     // ... merge the results ...
                 };
             }
         );
    

    Type parameters

    • KeyT

    • InnerIndexT

    • InnerValueT

    • ResultValueT

    Parameters

    • inner: IDataFrame<InnerIndexT, InnerValueT>

      The 'inner' dataframe to join (the dataframe you are calling the function on is the 'outer' dataframe).

    • outerKeySelector: SelectorFn<ValueT, KeyT>

      User-defined selector function that chooses the join key from the outer dataframe.

    • innerKeySelector: SelectorFn<InnerValueT, KeyT>

      User-defined selector function that chooses the join key from the inner dataframe.

    • resultSelector: JoinFn<ValueT | null, InnerValueT | null, ResultValueT>

      User-defined function that merges outer and inner values.

      Implementation from here:

      http://blogs.geniuscode.net/RyanDHatch/?p=116
      

    Returns IDataFrame<number, ResultValueT>

    Returns the new merged dataframe.

joinOuterLeft

  • joinOuterLeft<KeyT, InnerIndexT, InnerValueT, ResultValueT>(inner: IDataFrame<InnerIndexT, InnerValueT>, outerKeySelector: SelectorFn<ValueT, KeyT>, innerKeySelector: SelectorFn<InnerValueT, KeyT>, resultSelector: JoinFn<ValueT | null, InnerValueT | null, ResultValueT>): IDataFrame<number, ResultValueT>
  • Creates a new dataframe by merging two input dataframes. The resulting dataframe contains only those rows that are present either in both dataframes or only in the outer (left) dataframe.

    example
    
    // Join together two sets of customers to find those
    // that have bought either just product A or both product A and product B.
    const customerWhoBoughtProductA = ...
    const customerWhoBoughtProductB = ...
    const boughtJustAorAandB = customerWhoBoughtProductA.joinOuterLeft(
             customerWhoBoughtProductB,
             customerA => customerA.CustomerId, // Join key.
             customerB => customerB.CustomerId, // Join key.
             (customerA, customerB) => {
                 return {
                     // ... merge the results ...
                 };
             }
         );
    

    Type parameters

    • KeyT

    • InnerIndexT

    • InnerValueT

    • ResultValueT

    Parameters

    • inner: IDataFrame<InnerIndexT, InnerValueT>

      The 'inner' dataframe to join (the dataframe you are calling the function on is the 'outer' dataframe).

    • outerKeySelector: SelectorFn<ValueT, KeyT>

      User-defined selector function that chooses the join key from the outer dataframe.

    • innerKeySelector: SelectorFn<InnerValueT, KeyT>

      User-defined selector function that chooses the join key from the inner dataframe.

    • resultSelector: JoinFn<ValueT | null, InnerValueT | null, ResultValueT>

      User-defined function that merges outer and inner values.

      Implementation from here:

      http://blogs.geniuscode.net/RyanDHatch/?p=116
      

    Returns IDataFrame<number, ResultValueT>

    Returns the new merged dataframe.
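
    The left outer join described above can be illustrated with a minimal plain-array sketch. This is not the library's implementation: leftOuterJoinSketch, CustomerRow, boughtA and boughtB are hypothetical names, and the row order produced by joinOuterLeft itself may differ.

```typescript
// Plain-array sketch of a left outer join. All names are hypothetical.
interface CustomerRow {
    CustomerId: number;
    CustomerName: string;
}

function leftOuterJoinSketch<OuterT, InnerT, KeyT, ResultT>(
    outer: OuterT[],
    inner: InnerT[],
    outerKeySelector: (row: OuterT) => KeyT,
    innerKeySelector: (row: InnerT) => KeyT,
    resultSelector: (outerRow: OuterT | null, innerRow: InnerT | null) => ResultT
): ResultT[] {
    const results: ResultT[] = [];
    outer.forEach(outerRow => {
        const key = outerKeySelector(outerRow);
        const matches = inner.filter(innerRow => innerKeySelector(innerRow) === key);
        if (matches.length === 0) {
            // Present only in the outer (left) input: the inner side is null.
            results.push(resultSelector(outerRow, null));
        } else {
            // Present in both inputs: one result per matching inner row.
            matches.forEach(innerRow => {
                results.push(resultSelector(outerRow, innerRow));
            });
        }
    });
    return results;
}

const boughtA: CustomerRow[] = [
    { CustomerId: 1, CustomerName: "Ann" },
    { CustomerId: 2, CustomerName: "Bob" },
];
const boughtB: CustomerRow[] = [
    { CustomerId: 2, CustomerName: "Bob" },
    { CustomerId: 3, CustomerName: "Cat" },
];

const leftJoined = leftOuterJoinSketch(
    boughtA,
    boughtB,
    a => a.CustomerId,
    b => b.CustomerId,
    (a, b) => ({
        CustomerId: a !== null ? a.CustomerId : -1,
        BoughtB: b !== null, // False when there is no matching inner row.
    })
);
console.log(leftJoined);
```

    Customer 3 appears only in the inner input, so it is absent from the result. The merge function has to handle a null inner row, which is why resultSelector accepts nullable arguments.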

joinOuterRight

  • joinOuterRight<KeyT, InnerIndexT, InnerValueT, ResultValueT>(inner: IDataFrame<InnerIndexT, InnerValueT>, outerKeySelector: SelectorFn<ValueT, KeyT>, innerKeySelector: SelectorFn<InnerValueT, KeyT>, resultSelector: JoinFn<ValueT | null, InnerValueT | null, ResultValueT>): IDataFrame<number, ResultValueT>
  • Creates a new dataframe by merging two input dataframes. The resulting dataframe contains only those rows that are present either in both dataframes or only in the inner (right) dataframe.

    example
    
    // Join together two sets of customers to find those
    // that have bought either just product B or both product A and product B.
    const customerWhoBoughtProductA = ...
    const customerWhoBoughtProductB = ...
    const boughtJustBorAandB = customerWhoBoughtProductA.joinOuterRight(
             customerWhoBoughtProductB,
             customerA => customerA.CustomerId, // Join key.
             customerB => customerB.CustomerId, // Join key.
             (customerA, customerB) => {
                 return {
                     // ... merge the results ...
                 };
             }
         );
    

    Type parameters

    • KeyT

    • InnerIndexT

    • InnerValueT

    • ResultValueT

    Parameters

    • inner: IDataFrame<InnerIndexT, InnerValueT>

      The 'inner' dataframe to join (the dataframe you are calling the function on is the 'outer' dataframe).

    • outerKeySelector: SelectorFn<ValueT, KeyT>

      User-defined selector function that chooses the join key from the outer dataframe.

    • innerKeySelector: SelectorFn<InnerValueT, KeyT>

      User-defined selector function that chooses the join key from the inner dataframe.

    • resultSelector: JoinFn<ValueT | null, InnerValueT | null, ResultValueT>

      User-defined function that merges outer and inner values.

      Implementation from here:

      http://blogs.geniuscode.net/RyanDHatch/?p=116
      

    Returns IDataFrame<number, ResultValueT>

    Returns the new merged dataframe.

last

  • last(): ValueT
  • Get the last row of the dataframe.

    example
    
    const lastRow = df.last();
    

    Returns ValueT

    Returns the last row of the dataframe.

none

  • Evaluates a predicate function for every row in the dataframe to determine if some condition is true/truthy for none of the rows in the dataframe.

    If no predicate is specified then it simply checks if the dataframe contains zero rows.

    example
    
    const noFreds = df.none(row => row.CustomerName === "Fred"); // Do we have zero customers named Fred?
    
    example
    
    const noCustomers = df.none(); // Do we have zero customers?
    

    Parameters

    Returns boolean

    Returns true if the predicate has returned truthy for zero rows in the dataframe, otherwise returns false. Returns true for an empty dataframe.

orderBy

  • Sorts the dataframe in ascending order by a value defined by the user-defined selector function.

    example
    
    // Order sales by amount from least to most.
    const orderedDf = salesDf.orderBy(sale => sale.Amount);
    

    Type parameters

    • SortT

    Parameters

    • selector: SelectorWithIndexFn<ValueT, SortT>

      User-defined selector function that selects the value to sort by.

    Returns IOrderedDataFrame<IndexT, ValueT, SortT>

    Returns a new dataframe that has been ordered according to the value chosen by the selector function.

orderByDescending

  • Sorts the dataframe in descending order by a value defined by the user-defined selector function.

    example
    
    // Order sales by amount from most to least
    const orderedDf = salesDf.orderByDescending(sale => sale.Amount);
    

    Type parameters

    • SortT

    Parameters

    • selector: SelectorWithIndexFn<ValueT, SortT>

      User-defined selector function that selects the value to sort by.

    Returns IOrderedDataFrame<IndexT, ValueT, SortT>

    Returns a new dataframe that has been ordered according to the value chosen by the selector function.
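
    As a rough illustration of how the selector drives both orderBy and orderByDescending, here is a plain-array sketch. It is not the library's implementation: orderBySketch and SaleRow are hypothetical names, and the sketch only handles numeric sort values.

```typescript
// Plain-array sketch of ordering by a selected value. All names are hypothetical.
interface SaleRow {
    Id: number;
    Amount: number;
}

// Returns a sorted copy; the input array is left untouched.
function orderBySketch<RowT>(
    rows: RowT[],
    selector: (row: RowT) => number,
    descending: boolean
): RowT[] {
    const direction = descending ? -1 : 1;
    return rows.slice().sort((a, b) => direction * (selector(a) - selector(b)));
}

const salesRows: SaleRow[] = [
    { Id: 1, Amount: 30 },
    { Id: 2, Amount: 10 },
    { Id: 3, Amount: 20 },
];

const ascendingIds = orderBySketch(salesRows, sale => sale.Amount, false).map(sale => sale.Id);
const descendingIds = orderBySketch(salesRows, sale => sale.Amount, true).map(sale => sale.Id);
console.log(ascendingIds);  // [2, 3, 1]
console.log(descendingIds); // [1, 3, 2]
```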

parseDates

  • parseDates(columnNameOrNames: string | string[], formatString?: undefined | string): IDataFrame<IndexT, ValueT>
  • Parse a column with string values and convert it to a column with date values.

    example
    
    const withParsedColumn = df.parseDates("MyDateColumn");
    
    example
    
    const withParsedColumns = df.parseDates(["MyDateColumnA", "MyDateColumnB"]);
    

    Parameters

    • columnNameOrNames: string | string[]

      Specifies the column name or array of column names to parse.

    • Optional formatString: undefined | string

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with a particular named column parsed as dates.

parseFloats

  • parseFloats(columnNameOrNames: string | string[]): IDataFrame<IndexT, ValueT>
  • Parse a column with string values and convert it to a column with float values.

    example
    
    const withParsedColumn = df.parseFloats("MyFloatColumn");
    
    example
    
    const withParsedColumns = df.parseFloats(["MyFloatColumnA", "MyFloatColumnB"]);
    

    Parameters

    • columnNameOrNames: string | string[]

      Specifies the column name or array of column names to parse.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with a particular named column parsed as floats.

parseInts

  • parseInts(columnNameOrNames: string | string[]): IDataFrame<IndexT, ValueT>
  • Parse a column with string values and convert it to a column with int values.

    example
    
    const withParsedColumn = df.parseInts("MyIntColumn");
    
    example
    
    const withParsedColumns = df.parseInts(["MyIntColumnA", "MyIntColumnB"]);
    

    Parameters

    • columnNameOrNames: string | string[]

      Specifies the column name or array of column names to parse.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with a particular named column parsed as ints.

pivot

  • pivot<NewValueT>(columnOrColumns: string | Iterable<string>, valueColumnNameOrSpec: string | IPivotAggregateSpec, aggregator?: undefined | function): IDataFrame<number, NewValueT>
  • Reshape (or pivot) a dataframe based on column values. This is short-hand that combines grouping, aggregation and sorting.

    example
    
    // Simplest example.
    // Group by the values in 'PivotColumn'.
    // The unique set of values in 'PivotColumn' becomes the columns in the resulting dataframe.
    // The column 'ValueColumn' is averaged for each group and this becomes the
    // values in the new column.
    const pivottedDf = df.pivot("PivotColumn", "ValueColumn", values => values.average());
    
    example
    
    // Multi-value column example.
    // Similar to the previous example except now we are aggregating multiple value columns.
    // Each group has the average computed for 'ValueColumnA' and the sum for 'ValueColumnB'.
    const pivottedDf = df.pivot("PivotColumn", {
         "ValueColumnA": aValues => aValues.average(),
         "ValueColumnB":  bValues => bValues.sum(),
    });
    
    example
    
    // Full multi-column example.
    // Similar to the previous example, except now we are pivoting on multiple columns.
    // We now group by the 'PivotColumnA' and then by 'PivotColumnB', effectively creating a
    // multi-level group.
    const pivottedDf = df.pivot(["PivotColumnA", "PivotColumnB" ], {
         "ValueColumnA": aValues => aValues.average(),
         "ValueColumnB":  bValues => bValues.sum(),
    });
    
    example
    
    // To help understand the pivot function, let's look at what it does internally.
    // Take the simplest example:
    const pivottedDf = df.pivot("PivotColumn", "ValueColumn", values => values.average());
    
    // If we expand out the internals of the pivot function, it will look something like this:
    const expandedDf = df.groupBy(row => row.PivotColumn)
             .select(group => ({
                 PivotColumn: group.deflate(row => row.ValueColumn).average()
             }))
             .orderBy(row => row.PivotColumn);
    
    // You can see that pivoting a dataframe is the same as grouping, aggregating and sorting it.
    // Does pivoting seem simpler now?
    
    // It gets more complicated than that of course, because the pivot function supports multi-level nested
    // grouping and aggregation of multiple columns. So a full expansion of the pivot function is rather complex.
    

    Type parameters

    • NewValueT

    Parameters

    • columnOrColumns: string | Iterable<string>

      Column name whose values make the new DataFrame's columns.

    • valueColumnNameOrSpec: string | IPivotAggregateSpec

      Column name or column spec that defines the columns whose values should be aggregated.

    • Optional aggregator: undefined | function

    Returns IDataFrame<number, NewValueT>

    Returns a new dataframe that has been pivoted based on a particular column's values.
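
    The grouping, aggregation and sorting steps that pivot combines can be sketched with plain arrays. This is only an illustration of those three steps, not the library's implementation, and it does not reproduce the column layout of the dataframe that pivot returns. ReadingRow and averageByGroupSketch are hypothetical names.

```typescript
// Plain-array sketch of group, aggregate (average) and sort. All names are hypothetical.
interface ReadingRow {
    PivotColumn: string;
    ValueColumn: number;
}

function averageByGroupSketch(rows: ReadingRow[]): { key: string; average: number }[] {
    // Step 1: group rows by the pivot column, accumulating a total and a count.
    const sums: { [key: string]: { total: number; count: number } } = {};
    rows.forEach(row => {
        const entry = sums[row.PivotColumn] || { total: 0, count: 0 };
        entry.total += row.ValueColumn;
        entry.count += 1;
        sums[row.PivotColumn] = entry;
    });

    // Steps 2 and 3: sort the group keys, then compute the average for each group.
    return Object.keys(sums)
        .sort()
        .map(key => ({ key: key, average: sums[key].total / sums[key].count }));
}

const readings: ReadingRow[] = [
    { PivotColumn: "B", ValueColumn: 5 },
    { PivotColumn: "A", ValueColumn: 10 },
    { PivotColumn: "A", ValueColumn: 20 },
    { PivotColumn: "B", ValueColumn: 15 },
];

const groupAverages = averageByGroupSketch(readings);
console.log(groupAverages); // [{ key: "A", average: 15 }, { key: "B", average: 10 }]
```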

renameSeries

  • Create a new dataframe with 1 or more columns renamed.

    example
    
    const renamedDf = df.renameSeries({ OldColumnName: "NewColumnName" });
    
    example
    
    const renamedDf = df.renameSeries({
         Column1: "ColumnA",
         Column2: "ColumnB"
    });
    

    Type parameters

    • NewValueT

    Parameters

    • newColumnNames: IColumnRenameSpec

      A column rename spec - a JavaScript hash that maps existing column names to new column names.

    Returns IDataFrame<IndexT, NewValueT>

    Returns a new dataframe with specified columns renamed.

reorderSeries

  • reorderSeries<NewValueT>(columnNames: string[]): IDataFrame<IndexT, NewValueT>
  • Create a new dataframe with columns reordered. New column names create new columns (with undefined values); omitting existing column names causes those columns to be dropped.

    example
    const reorderedDf = df.reorderSeries(["FirstColumn", "SecondColumn", "etc"]);
    

    Type parameters

    • NewValueT

    Parameters

    • columnNames: string[]

      Specifies the new order for columns.

    Returns IDataFrame<IndexT, NewValueT>

    Returns a new dataframe with columns reordered according to the order of the array of column names that is passed in.

resetIndex

  • Resets the Index of the dataframe back to the default zero-based sequential integer index.

    example
    
    const dfWithResetIndex = df.resetIndex();
    

    Returns IDataFrame<number, ValueT>

    Returns a new dataframe with the Index reset to the default zero-based index.

reverse

  • Gets a new dataframe in reverse order.

    example
    
    const reversed = df.reverse();
    

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe that is the reverse of the original.

rollingWindow

  • Partition a dataframe into a Series of rolling data windows. Each value in the new series is a rolling chunk of data from the original dataframe.

    example
    
    const salesDf = ... // Daily sales data.
    const rollingWeeklySales = salesDf.rollingWindow(7); // Get rolling window over weekly sales data.
    console.log(rollingWeeklySales.toString());
    

    Parameters

    • period: number

      The number of data rows to include in each data window.

    Returns ISeries<number, IDataFrame<IndexT, ValueT>>

    Returns a new series, each value of which is a rolling chunk of the original dataframe.
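
    A rolling window advances one row at a time, so consecutive windows overlap. The plain-array sketch below illustrates the idea; it is not the library's implementation, rollingWindowSketch is a hypothetical name, and emitting only full-length windows is an assumption of the sketch.

```typescript
// Plain-array sketch of rolling windows. All names are hypothetical.
function rollingWindowSketch<RowT>(rows: RowT[], period: number): RowT[][] {
    const windows: RowT[][] = [];
    // Assumption: only windows containing exactly `period` rows are emitted.
    for (let start = 0; start + period <= rows.length; start += 1) {
        windows.push(rows.slice(start, start + period));
    }
    return windows;
}

const dailySales = [10, 20, 30, 40, 50];
const rollingWindows = rollingWindowSketch(dailySales, 3);
console.log(rollingWindows); // [[10, 20, 30], [20, 30, 40], [30, 40, 50]]

// A typical follow-up step: aggregate each window, here as a rolling total.
const rollingTotals = rollingWindows.map(chunk => chunk.reduce((sum, value) => sum + value, 0));
console.log(rollingTotals); // [60, 90, 120]
```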

select

  • Generates a new dataframe by repeatedly calling a user-defined selector function on each row in the original dataframe.

    example
    
    function transformRow (inputRow) {
         const outputRow = {
             // ... construct output row derived from input row ...
         };
    
         return outputRow;
    }
    
    const transformedDf = df.select(row => transformRow(row));
    

    Type parameters

    • ToT

    Parameters

    • selector: SelectorWithIndexFn<ValueT, ToT>

      A user-defined selector function that transforms each row to create the new dataframe.

    Returns IDataFrame<IndexT, ToT>

    Returns a new dataframe with each row transformed by the selector function.

selectMany

  • Generates a new dataframe by repeatedly calling a user-defined selector function on each row in the original dataframe.

    Similar to the select function, but in this case the selector function produces a collection of output rows that are flattened and merged to create the new dataframe.

    example
    
    function produceOutputRows (inputRow) {
         const outputRows = [];
         while (someCondition) {
             // ... generate zero or more output rows ...
             outputRows.push(... some generated row ...);
         }
         return outputRows;
    }
    
    const modifiedDf = df.selectMany(row => produceOutputRows(row));
    

    Type parameters

    • ToT

    Parameters

    • selector: SelectorWithIndexFn<ValueT, Iterable<ToT>>

      A user-defined selector function that transforms each row into a collection of output rows.

    Returns IDataFrame<IndexT, ToT>

    Returns a new dataframe where each row has been transformed into 0 or more new rows by the selector function.
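
    To make the flattening step concrete, here is a plain-array sketch in which each input row produces zero or more output rows. It is an illustration only, not the library's implementation. selectManySketch, OrderRow and OrderItemRow are hypothetical names.

```typescript
// Plain-array sketch of select-then-flatten. All names are hypothetical.
interface OrderRow {
    OrderId: number;
    Items: string[];
}

interface OrderItemRow {
    OrderId: number;
    Item: string;
}

function selectManySketch<InT, OutT>(rows: InT[], selector: (row: InT) => OutT[]): OutT[] {
    let output: OutT[] = [];
    rows.forEach(row => {
        // Each call may return zero, one or many rows; all are appended in order.
        output = output.concat(selector(row));
    });
    return output;
}

const orderRows: OrderRow[] = [
    { OrderId: 1, Items: ["apple", "pear"] },
    { OrderId: 2, Items: [] },
    { OrderId: 3, Items: ["plum"] },
];

const orderItems = selectManySketch<OrderRow, OrderItemRow>(orderRows, order =>
    order.Items.map(item => ({ OrderId: order.OrderId, Item: item }))
);
console.log(orderItems.length); // 3
```

    Order 2 has no items, so it contributes no rows to the output.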

sequentialDistinct

  • Eliminates adjacent duplicate rows.

    For each group of adjacent values that are equivalent, only the last index/row for the group is returned, thus adjacent equivalent rows are collapsed down to the last row.

    example
    
    const dfWithDuplicateRowsRemoved = df.sequentialDistinct(row => row.ColumnA);
    

    Type parameters

    • ToT

    Parameters

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with groups of adjacent duplicate rows collapsed to a single row per group.

serialize

  • serialize(): any
  • Serialize the dataframe to an ordinary JavaScript data structure. The resulting data structure is suitable for further serialization to JSON and can be used to transmit a DataFrame and its internal structure over the wire. Use the deserialize function to later reconstitute the serialized dataframe.

    example
    
    const jsDataStructure = df.serialize();
    const jsonData = JSON.stringify(jsDataStructure);
    console.log(jsonData);
    const deserializedJsDataStructure = JSON.parse(jsonData);
    const deserializedDf = DataFrame.deserialize(deserializedJsDataStructure); // Reconstituted.
    

    Returns any

    Returns a JavaScript data structure conforming to ISerializedDataFrame that represents the dataframe and its internal structure.

setIndex

  • setIndex<NewIndexT>(columnName: string): IDataFrame<NewIndexT, ValueT>
  • Set a named column as the Index of the dataframe.

    example
    
    const indexedDf = df.setIndex("SomeColumn");
    

    Type parameters

    • NewIndexT

    Parameters

    • columnName: string

      Name of the column to use as the new Index of the returned dataframe.

    Returns IDataFrame<NewIndexT, ValueT>

    Returns a new dataframe with the values of the specified column as the new Index.

skip

  • skip(numValues: number): IDataFrame<IndexT, ValueT>
  • Skip a number of rows in the dataframe.

    example
    
    const dfWithRowsSkipped = df.skip(10); // Skip 10 rows in the original dataframe.
    

    Parameters

    • numValues: number

      Number of rows to skip.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with the specified number of rows skipped.

skipUntil

  • Skips values in the dataframe until a condition evaluates to true or truthy.

    example
    
    const dfWithRowsSkipped = df.skipUntil(row => row.CustomerName === "Fred"); // Skip initial customers until we find Fred.
    

    Parameters

    • predicate: PredicateFn<ValueT>

      Return true/truthy to stop skipping rows in the original dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with all initial sequential rows removed until the predicate returned true/truthy.

skipWhile

  • Skips values in the dataframe while a condition evaluates to true or truthy.

    example
    
    const dfWithRowsSkipped = df.skipWhile(row => row.CustomerName === "Fred"); // Skip initial customers named Fred.
    

    Parameters

    • predicate: PredicateFn<ValueT>

      Returns true/truthy to continue to skip rows in the original dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with all initial sequential rows removed while the predicate returned true/truthy.
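
    The difference between skipWhile and skipUntil is easiest to see side by side. The plain-array sketch below is an illustration only, not the library's implementation. skipWhileSketch and skipUntilSketch are hypothetical names.

```typescript
// Plain-array sketch of skipWhile and skipUntil. All names are hypothetical.
function skipWhileSketch<RowT>(rows: RowT[], predicate: (row: RowT) => boolean): RowT[] {
    let start = 0;
    // Advance past the initial run of rows for which the predicate holds.
    while (start < rows.length && predicate(rows[start])) {
        start += 1;
    }
    return rows.slice(start);
}

// skipUntil is skipWhile with the predicate negated.
function skipUntilSketch<RowT>(rows: RowT[], predicate: (row: RowT) => boolean): RowT[] {
    return skipWhileSketch(rows, row => !predicate(row));
}

const customerNames = ["Fred", "Fred", "Amy", "Fred", "Bob"];

const afterInitialFreds = skipWhileSketch(customerNames, customer => customer === "Fred");
console.log(afterInitialFreds); // ["Amy", "Fred", "Bob"]

const fromBobOnwards = skipUntilSketch(customerNames, customer => customer === "Bob");
console.log(fromBobOnwards); // ["Bob"]
```

    Only the initial run of rows is affected: the later "Fred" survives skipWhileSketch because skipping stops at the first row that fails the predicate.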

startAt

  • startAt(indexValue: IndexT): IDataFrame<IndexT, ValueT>
  • Gets a new dataframe containing all rows starting at and after the specified index value.

    example
    
    const df = new DataFrame({
         index: [0, 1, 2, 3], // This is the default index.
         values: [10, 20, 30, 40],
    });
    
    const lastHalf = df.startAt(2);
    expect(lastHalf.toArray()).to.eql([30, 40]);
    
    example
    
    const timeSeriesDf = ... a dataframe indexed by date/time ...
    
    // Get all rows starting at (or after) a particular date.
    const allRowsFromStartDate = timeSeriesDf.startAt(new Date(2016, 5, 4));
    

    Parameters

    • indexValue: IndexT

      The index value at which to start the new dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe containing all rows starting at and after the specified index value.

subset

  • subset<NewValueT>(columnNames: string[]): IDataFrame<IndexT, NewValueT>
  • Create a new dataframe with just a subset of columns.

    example
    const subsetDf = df.subset(["ColumnA", "ColumnB"]);
    

    Type parameters

    • NewValueT

    Parameters

    • columnNames: string[]

      Array of column names to include in the new dataframe.

    Returns IDataFrame<IndexT, NewValueT>

    Returns a dataframe with a subset of columns from the original dataframe.

tail

  • tail(numValues: number): IDataFrame<IndexT, ValueT>
  • Get X rows from the end of the dataframe. Pass in a negative value to get all rows at the tail except X rows at the head.

    example
    
    const sample = df.tail(12); // Take a sample of 12 rows from the end of the dataframe.
    

    Parameters

    • numValues: number

      Number of rows to take.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe that has only the specified number of rows taken from the end of the original dataframe.
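
    The effect of positive and negative arguments can be sketched with plain arrays. This is an illustration of the behavior described above, not the library's implementation. tailSketch is a hypothetical name, and returning no rows for a count of zero is an assumption of the sketch.

```typescript
// Plain-array sketch of tail with positive and negative counts. All names are hypothetical.
function tailSketch<RowT>(rows: RowT[], numValues: number): RowT[] {
    if (numValues === 0) {
        return []; // Assumption of this sketch: a count of zero yields no rows.
    }
    // Positive: keep the last numValues rows.
    // Negative: keep everything except the first |numValues| rows.
    const toSkip = numValues > 0
        ? Math.max(rows.length - numValues, 0)
        : -numValues;
    return rows.slice(toSkip);
}

const monthNumbers = [1, 2, 3, 4, 5];

const lastTwo = tailSketch(monthNumbers, 2);
console.log(lastTwo); // [4, 5]

const allButFirstTwo = tailSketch(monthNumbers, -2);
console.log(allButFirstTwo); // [3, 4, 5]
```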

take

  • take(numRows: number): IDataFrame<IndexT, ValueT>
  • Take a number of rows in the dataframe.

    example
    
    const dfWithRowsTaken = df.take(15); // Take only the first 15 rows from the original dataframe.
    

    Parameters

    • numRows: number

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with only the specified number of rows taken from the original dataframe.

takeUntil

  • Takes values from the dataframe until a condition evaluates to true or truthy.

    example
    
    const dfWithRowsTaken = df.takeUntil(row => row.CustomerName === "Fred"); // Take all initial customers until we find Fred.
    

    Parameters

    • predicate: PredicateFn<ValueT>

      Return true/truthy to stop taking rows in the original dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with only the initial sequential rows taken until the predicate returned true/truthy.

takeWhile

  • Takes values from the dataframe while a condition evaluates to true or truthy.

    example
    
    const dfWithRowsTaken = df.takeWhile(row => row.CustomerName === "Fred"); // Take only initial customers named Fred.
    

    Parameters

    • predicate: PredicateFn<ValueT>

      Returns true/truthy to continue to take rows from the original dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with only the initial sequential rows that were taken while the predicate returned true/truthy.
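
    takeWhile and takeUntil mirror the skip variants: they keep the initial run of rows instead of dropping it. The plain-array sketch below is an illustration only, not the library's implementation. takeWhileSketch and takeUntilSketch are hypothetical names.

```typescript
// Plain-array sketch of takeWhile and takeUntil. All names are hypothetical.
function takeWhileSketch<RowT>(rows: RowT[], predicate: (row: RowT) => boolean): RowT[] {
    let end = 0;
    // Extend the kept range while the predicate holds for the next row.
    while (end < rows.length && predicate(rows[end])) {
        end += 1;
    }
    return rows.slice(0, end);
}

// takeUntil is takeWhile with the predicate negated; the matching row is excluded.
function takeUntilSketch<RowT>(rows: RowT[], predicate: (row: RowT) => boolean): RowT[] {
    return takeWhileSketch(rows, row => !predicate(row));
}

const visitCounts = [3, 5, 8, 2, 9];

const initialSmallCounts = takeWhileSketch(visitCounts, count => count < 6);
console.log(initialSmallCounts); // [3, 5]

const untilTwo = takeUntilSketch(visitCounts, count => count === 2);
console.log(untilTwo); // [3, 5, 8]
```

    The value 2 is also less than 6, but it is not included in initialSmallCounts because taking stops at the first row that fails the predicate.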

toArray

  • toArray(): ValueT[]
  • Extract rows from the dataframe as an array. Each element of the array is one row of the dataframe represented as a JavaScript object with the fields as the dataframe's columns. This forces lazy evaluation to complete.

    example
    const values = df.toArray();
    

    Returns ValueT[]

    Returns an array of the rows contained within the dataframe.

toCSV

  • toCSV(): string
  • Serialize the dataframe to the CSV data format.

    example
    
    const csvData = df.toCSV();
    console.log(csvData);
    

    Returns string

    Returns a string in the CSV data format that represents the dataframe.

toHTML

  • toHTML(): string
  • Serialize the dataframe to HTML.

    Returns string

    Returns a string in HTML format that represents the dataframe.

toJSON

  • toJSON(): string
  • Serialize the dataframe to the JSON data format.

    example
    
    const jsonData = df.toJSON();
    console.log(jsonData);
    

    Returns string

    Returns a string in the JSON data format that represents the dataframe.

toObject

  • toObject<KeyT, FieldT, OutT>(keySelector: function, valueSelector: function): OutT
  • Convert the dataframe to a JavaScript object.

    example
    
    const someObject = df.toObject(
         row => row.SomeColumn, // Specify the column to use for field names in the output object.
         row => row.SomeOtherColumn // Specify the column to use as the value for each field.
    );
    

    Type parameters

    • KeyT

    • FieldT

    • OutT

    Parameters

    • keySelector: function

      User-defined selector function that selects keys for the resulting object.

        • (value: ValueT): KeyT
        • Parameters

          • value: ValueT

          Returns KeyT

    • valueSelector: function

      User-defined selector function that selects values for the resulting object.

        • (value: ValueT): FieldT
        • Parameters

          • value: ValueT

          Returns FieldT

    Returns OutT

    Returns a JavaScript object generated from the dataframe by applying the key and value selector functions.
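
    The way the two selectors cooperate can be sketched with plain arrays. This is an illustration only, not the library's implementation. toObjectSketch and SettingRow are hypothetical names, and letting a later row overwrite an earlier row with the same key is simply how this sketch behaves.

```typescript
// Plain-array sketch of building an object from rows. All names are hypothetical.
interface SettingRow {
    Key: string;
    Value: number;
}

function toObjectSketch<RowT, FieldT>(
    rows: RowT[],
    keySelector: (row: RowT) => string,
    valueSelector: (row: RowT) => FieldT
): { [key: string]: FieldT } {
    const output: { [key: string]: FieldT } = {};
    rows.forEach(row => {
        // In this sketch a later row overwrites an earlier row with the same key.
        output[keySelector(row)] = valueSelector(row);
    });
    return output;
}

const settingRows: SettingRow[] = [
    { Key: "width", Value: 800 },
    { Key: "height", Value: 600 },
];

const settingsObject = toObjectSketch(settingRows, row => row.Key, row => row.Value);
console.log(settingsObject); // { width: 800, height: 600 }
```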

toPairs

  • toPairs(): Object[]
  • Retrieve the index/row pairs from the dataframe as an array. Each pair is [index, row]. This forces lazy evaluation to complete.

    example
    const pairs = df.toPairs();
    

    Returns Object[]

    Returns an array of pairs that contains the dataframe's rows. Each pair is a two element array that contains an index and a row.

toRows

  • toRows(): any[][]
  • Bake the dataframe to an array of rows where each row is an array of values in column order.

    example
    const rows = df.toRows();
    

    Returns any[][]

    Returns an array of rows. Each row is an array of values in column order.
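
    The difference between a row as an object and a row as an array of values in column order can be sketched as follows. This is an illustration only, not the library's implementation. toRowsSketch and RecordSketch are hypothetical names, and the column order is passed in explicitly here.

```typescript
// Plain-array sketch of converting records to arrays of values. All names are hypothetical.
type RecordSketch = { [column: string]: string | number };

function toRowsSketch(records: RecordSketch[], columnNames: string[]): (string | number)[][] {
    // For each record, pick out the values in the requested column order.
    return records.map(record => columnNames.map(columnName => record[columnName]));
}

const peopleRecords: RecordSketch[] = [
    { Name: "Ann", Age: 31 },
    { Name: "Bob", Age: 45 },
];

const peopleRows = toRowsSketch(peopleRecords, ["Name", "Age"]);
console.log(peopleRows); // [["Ann", 31], ["Bob", 45]]
```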

toString

  • toString(): string
  • Format the dataframe for display as a string. This forces lazy evaluation to complete.

    example
    
    console.log(df.toString());
    

    Returns string

    Generates and returns a string representation of the dataframe.

toStrings

  • toStrings(columnNames: string | string[] | IFormatSpec, formatString?: undefined | string): IDataFrame<IndexT, ValueT>
  • Convert a column of values of different types to a column of string values.

    example
    
    const withStringColumn = df.toStrings("MyDateColumn", "YYYY-MM-DD");
    
    example
    
    const withStringColumn = df.toStrings("MyFloatColumn", "0.00");
    

    Parameters

    • columnNames: string | string[] | IFormatSpec

      Specifies the column name or array of column names to convert to strings. Can also be a format spec that specifies which columns to convert and what their format should be.

    • Optional formatString: undefined | string

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with a particular named column converted to strings.

transformSeries

  • Transform one or more columns.

    This is equivalent to extracting a Series with getSeries, then transforming it with Series.select, and finally plugging it back in as the same column using withSeries.

    example
    
    const modifiedDf = df.transformSeries({
         AColumnToTransform: columnValue => transformRow(columnValue)
    });
    
    example
    
    const modifiedDf = df.transformSeries({
         ColumnA: columnValue => transformColumnA(columnValue),
         ColumnB: columnValue => transformColumnB(columnValue)
    });
    

    Type parameters

    • NewValueT

    Parameters

    • columnSelectors: IColumnTransformSpec

      Object with field names for each column to be transformed. Each field specifies a selector function that transforms that column.

    Returns IDataFrame<IndexT, NewValueT>

    Returns a new dataframe with 1 or more columns transformed.

truncateStrings

  • truncateStrings(maxLength: number): IDataFrame<IndexT, ValueT>
  • Produces a new dataframe with all string values truncated to the requested maximum length.

    example
    
    // Truncate all string columns to 100 characters maximum.
    const truncatedDf = df.truncateStrings(100);
    

    Parameters

    • maxLength: number

      The maximum length of the string values after truncation.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with all strings truncated to the specified maximum length.

union

  • Creates a new dataframe by merging two input dataframes. The resulting dataframe contains the union of rows from the two input dataframes, that is, the unique combination of rows from both dataframes. This is basically a concatenation followed by an elimination of duplicates.

    example
    
    const dfA = ...
    const dfB = ...
    const merged = dfA.union(dfB);
    
    example
    
    // Merge two sets of customer records that may contain the same
    // customer record in each set. This is basically a concatenation
    // of the dataframes and then an elimination of any duplicate records
    // that result.
    const customerRecordsA = ...
    const customerRecordsB = ...
    const mergedCustomerRecords = customerRecordsA.union(
         customerRecordsB,
         customerRecord => customerRecord.CustomerId
    );
    
    example
    
    // Note that you can achieve the exact same result as the previous
    // example by doing a DataFrame.concat followed by a DataFrame.distinct
    // on the dataframes.
    const customerRecordsA = ...
    const customerRecordsB = ...
    const mergedCustomerRecords = customerRecordsA
         .concat(customerRecordsB)
         .distinct(customerRecord => customerRecord.CustomerId);
    

    Type parameters

    • KeyT

    Parameters

    • other: IDataFrame<IndexT, ValueT>

      The other dataframe to merge.

    • Optional selector: SelectorFn<ValueT, KeyT>

    Returns IDataFrame<IndexT, ValueT>

    Returns the union of the two dataframes.
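
    The "concatenate, then eliminate duplicates" behavior can be sketched with plain arrays. This is an illustration only, not the library's implementation. unionSketch and CustomerRecord are hypothetical names, and keeping the first occurrence of each key is an assumption of the sketch.

```typescript
// Plain-array sketch of union as concat followed by distinct. All names are hypothetical.
interface CustomerRecord {
    CustomerId: number;
    Source: string;
}

function unionSketch<RowT, KeyT>(
    first: RowT[],
    second: RowT[],
    selector: (row: RowT) => KeyT
): RowT[] {
    const seenKeys: KeyT[] = [];
    return first.concat(second).filter(row => {
        const key = selector(row);
        if (seenKeys.indexOf(key) >= 0) {
            return false; // Duplicate key: drop this row.
        }
        seenKeys.push(key); // Assumption: the first occurrence of a key is kept.
        return true;
    });
}

const recordsA: CustomerRecord[] = [
    { CustomerId: 1, Source: "A" },
    { CustomerId: 2, Source: "A" },
];
const recordsB: CustomerRecord[] = [
    { CustomerId: 2, Source: "B" },
    { CustomerId: 3, Source: "B" },
];

const mergedRecords = unionSketch(recordsA, recordsB, record => record.CustomerId);
console.log(mergedRecords.map(record => record.CustomerId)); // [1, 2, 3]
```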

variableWindow

  • Partition a dataframe into a Series of variable-length data windows where the divisions between the data chunks are defined by a user-provided comparer function.

    example
    
    function rowComparer (rowA, rowB) {
         if (... rowA should be in the same data window as rowB ...) {
             return true;
         }
         else {
             return false;
         }
    };
    
    const variableWindows = df.variableWindow(rowComparer);
    

    Parameters

    • comparer: ComparerFn<ValueT, ValueT>

      Function that compares two adjacent data rows and returns true if they should be in the same window.

    Returns ISeries<number, IDataFrame<IndexT, ValueT>>

    Returns a new series, each value of which is a chunk of data from the original dataframe.
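
    The role of the comparer can be sketched with plain arrays: adjacent rows stay in the same window for as long as the comparer returns true. This is an illustration only, not the library's implementation. variableWindowSketch is a hypothetical name.

```typescript
// Plain-array sketch of variable-length windows. All names are hypothetical.
function variableWindowSketch<RowT>(
    rows: RowT[],
    comparer: (rowA: RowT, rowB: RowT) => boolean
): RowT[][] {
    const windows: RowT[][] = [];
    rows.forEach((row, index) => {
        if (index > 0 && comparer(rows[index - 1], row)) {
            // Same window as the previous row.
            windows[windows.length - 1].push(row);
        } else {
            // The comparer returned false (or this is the first row): start a new window.
            windows.push([row]);
        }
    });
    return windows;
}

const signalLevels = [1, 1, 2, 2, 2, 1];
const levelRuns = variableWindowSketch(signalLevels, (a, b) => a === b);
console.log(levelRuns); // [[1, 1], [2, 2, 2], [1]]
```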

where

  • Filter the dataframe using a user-defined predicate function.

    example
    
    const filteredDf = df.where(row => row.CustomerName === "Fred"); // Filter so we only have customers named Fred.
    

    Parameters

    • predicate: PredicateFn<ValueT>

      Predicate function to filter rows from the dataframe. Returns true/truthy to keep rows, or false/falsy to omit rows.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe containing only the rows from the original dataframe that matched the predicate.

window

  • Partition a dataframe into a Series of data windows. Each value in the new series is a chunk of data from the original dataframe.

    example
    
    const windows = df.window(2); // Get rows in pairs.
    const pctIncrease = windows.select(pair => (pair.last().SalesAmount - pair.first().SalesAmount) / pair.first().SalesAmount);
    console.log(pctIncrease.toString());
    
    example
    
    const salesDf = ... // Daily sales data.
    const weeklySales = salesDf.window(7); // Partition up into weekly data sets.
    console.log(weeklySales.toString());
    

    Parameters

    • period: number

      The number of rows to include in each data window.

    Returns ISeries<number, IDataFrame<IndexT, ValueT>>

    Returns a new series, each value of which is a chunk (data window) of the original dataframe.
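
    Unlike a rolling window, these windows do not overlap. The plain-array sketch below partitions the rows into consecutive chunks and repeats the percentage-increase calculation from the first example. It is an illustration only, not the library's implementation. windowSketch is a hypothetical name, and emitting a shorter final chunk when the rows do not divide evenly is an assumption of the sketch.

```typescript
// Plain-array sketch of non-overlapping windows. All names are hypothetical.
function windowSketch<RowT>(rows: RowT[], period: number): RowT[][] {
    if (period < 1) {
        throw new Error("period must be at least 1");
    }
    const chunks: RowT[][] = [];
    for (let start = 0; start < rows.length; start += period) {
        // Assumption: the final chunk may hold fewer than `period` rows.
        chunks.push(rows.slice(start, start + period));
    }
    return chunks;
}

const salesAmounts = [100, 110, 120, 90, 50];
const salesPairs = windowSketch(salesAmounts, 2);
console.log(salesPairs); // [[100, 110], [120, 90], [50]]

// Percentage increase from the first to the last value in each chunk.
const pctIncreases = salesPairs.map(pair => (pair[pair.length - 1] - pair[0]) / pair[0]);
console.log(pctIncreases);
```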

withIndex

  • withIndex<NewIndexT>(newIndex: Iterable<NewIndexT> | SelectorFn<ValueT, NewIndexT>): IDataFrame<NewIndexT, ValueT>
  • Apply a new Index to the dataframe.

    example
    
    const indexedDf = df.withIndex([10, 20, 30]);
    
    example
    
    const indexedDf = df.withIndex(df.getSeries("SomeColumn"));
    
    example
    
    const indexedDf = df.withIndex(row => row.SomeColumn);
    
    example
    
    const indexedDf = df.withIndex(row => row.SomeColumn + 20);
    

    Type parameters

    • NewIndexT

    Parameters

    • newIndex: Iterable<NewIndexT> | SelectorFn<ValueT, NewIndexT>

      The array or iterable to use as the new Index of the dataframe. Can also be a selector function that chooses the Index value for each row in the dataframe.

    Returns IDataFrame<NewIndexT, ValueT>

    Returns a new dataframe with the specified Index attached.
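    The two accepted forms of `newIndex` (an iterable of index values, or a selector run against each row) can be modelled with plain arrays of `[index, row]` pairs. This is a sketch only: `withIndexSketch` and `IndexSource` are hypothetical names, and pairing up to the shorter length when the iterable and the rows differ in size is an assumption, not documented behavior.

```typescript
// Plain-array model of re-indexing (hypothetical helper, not part of the library).
type IndexSource<T, I> = Iterable<I> | ((row: T) => I);

function withIndexSketch<T, I>(rows: T[], newIndex: IndexSource<T, I>): [I, T][] {
    if (typeof newIndex === "function") {
        // Selector form: compute the index value from each row.
        const selector = newIndex;
        return rows.map((row): [I, T] => [selector(row), row]);
    }
    // Iterable form: pair index values with rows by position.
    const indexValues = Array.from(newIndex);
    // Assumption: stop at the shorter of the two inputs.
    const count = Math.min(indexValues.length, rows.length);
    return rows.slice(0, count).map((row, i): [I, T] => [indexValues[i], row]);
}

const sourceRows = [{ SomeColumn: 1 }, { SomeColumn: 2 }, { SomeColumn: 3 }];
const byArray = withIndexSketch(sourceRows, [10, 20, 30]);
const bySelector = withIndexSketch(sourceRows, row => row.SomeColumn + 20);
console.log(byArray.map(pair => pair[0]));    // [ 10, 20, 30 ]
console.log(bySelector.map(pair => pair[0])); // [ 21, 22, 23 ]
```

    In both forms the rows are unchanged; only the index value attached to each row differs.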

withSeries

  • Create a new dataframe with a replaced or additional column specified by the passed-in series.

    example
    
    const modifiedDf = df.withSeries("ANewColumn", new Series([1, 2, 3]));
    
    example
    
    const modifiedDf = df.withSeries("ANewColumn", df =>
         df.getSeries("SourceData").select(aTransformation)
    );
    
    example
    
    const modifiedDf = df.withSeries({ ANewColumn: new Series([1, 2, 3]), SomeOtherColumn: new Series([10, 20, 30]) });
    
    example
    
    const modifiedDf = df.withSeries({ ANewColumn: df => df.getSeries("SourceData").select(aTransformation) });
    

    Type parameters

    • SeriesValueT

    Parameters

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe replacing or adding a particular named column.
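    The "replace or add a column, returning a new dataframe" behavior can be sketched on plain row objects. This is a simplified model, not the library's implementation: `withSeriesSketch`, `Row` and `ColumnSource` are hypothetical names, and the sketch assumes values line up with rows by position rather than by index.

```typescript
// Plain-object model of adding or replacing a column (hypothetical helper, not part of the library).
type Row = Record<string, unknown>;
type ColumnSource = unknown[] | ((rows: Row[]) => unknown[]);

function withSeriesSketch(rows: Row[], columnName: string, source: ColumnSource): Row[] {
    // A function source derives the column from the existing rows.
    const values = typeof source === "function" ? source(rows) : source;
    // Copy each row so the input rows are left untouched.
    // Assumption: values are matched to rows by position.
    return rows.map((row, i) => ({ ...row, [columnName]: values[i] }));
}

const original: Row[] = [{ SourceData: 1 }, { SourceData: 2 }, { SourceData: 3 }];
const withDoubled = withSeriesSketch(original, "Doubled", all =>
    all.map(row => (row.SourceData as number) * 2)
);
console.log(JSON.stringify(withDoubled));
// [{"SourceData":1,"Doubled":2},{"SourceData":2,"Doubled":4},{"SourceData":3,"Doubled":6}]
console.log("Doubled" in original[0]); // false
```

    Passing a name that already exists replaces that column in the copies, mirroring the "replaced or additional column" wording above.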

zip

  • zip<Index2T, Value2T, ResultT>(s2: IDataFrame<Index2T, Value2T>, zipper: Zip2Fn<ValueT, Value2T, ResultT>): IDataFrame<IndexT, ResultT>
  • zip<Index2T, Value2T, Index3T, Value3T, ResultT>(s2: IDataFrame<Index2T, Value2T>, s3: IDataFrame<Index3T, Value3T>, zipper: Zip3Fn<ValueT, Value2T, Value3T, ResultT>): IDataFrame<IndexT, ResultT>
  • zip<Index2T, Value2T, Index3T, Value3T, Index4T, Value4T, ResultT>(s2: IDataFrame<Index2T, Value2T>, s3: IDataFrame<Index3T, Value3T>, s4: IDataFrame<Index4T, Value4T>, zipper: Zip3Fn<ValueT, Value2T, Value3T, ResultT>): IDataFrame<IndexT, ResultT>
  • zip<ResultT>(...args: any[]): IDataFrame<IndexT, ResultT>
  • Merge together multiple dataframes to create a new dataframe. Preserves the index of the first dataframe.

    example
    
    function produceNewRow (rowA, rowB) {
          const outputRow = {
              ValueA: rowA.Value,
              ValueB: rowB.Value,
          };
          return outputRow;
    }
    
    const dfA = new DataFrame([ { Value: 10 }, { Value: 20 }, { Value: 30 }]);
    const dfB = new DataFrame([ { Value: 100 }, { Value: 200 }, { Value: 300 }]);
    const zippedDf = dfA.zip(dfB, produceNewRow);
    

    Type parameters

    • Index2T

    • Value2T

    • ResultT

    Parameters

    • s2: IDataFrame<Index2T, Value2T>
    • zipper: Zip2Fn<ValueT, Value2T, ResultT>

      User-defined zipper function that merges rows. It produces rows for the new dataframe based on rows from the input dataframes.

    Returns IDataFrame<IndexT, ResultT>

    Returns a single dataframe merged from multiple input dataframes.
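    The row-by-row merge performed by the two-input overload can be modelled with plain arrays. This sketch is not the library's implementation: `zipSketch` is a hypothetical helper, and stopping at the shorter input is an assumption to verify against the library.

```typescript
// Plain-array model of a two-input zip (hypothetical helper, not part of the library).
function zipSketch<A, B, R>(rowsA: A[], rowsB: B[], zipper: (a: A, b: B) => R): R[] {
    // Assumption: the output length is that of the shorter input.
    const length = Math.min(rowsA.length, rowsB.length);
    const output: R[] = [];
    for (let i = 0; i < length; i++) {
        // Merge the rows found at the same position in each input.
        output.push(zipper(rowsA[i], rowsB[i]));
    }
    return output;
}

const rowsA = [{ Value: 10 }, { Value: 20 }, { Value: 30 }];
const rowsB = [{ Value: 100 }, { Value: 200 }, { Value: 300 }];
const zipped = zipSketch(rowsA, rowsB, (rowA, rowB) => ({
    ValueA: rowA.Value,
    ValueB: rowB.Value,
}));
console.log(JSON.stringify(zipped));
// [{"ValueA":10,"ValueB":100},{"ValueA":20,"ValueB":200},{"ValueA":30,"ValueB":300}]
```

    The zipper here matches `produceNewRow` from the example above, so the output rows are the ones that example would be expected to produce.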

  • Type parameters

    • Index2T

    • Value2T

    • Index3T

    • Value3T

    • ResultT

    Parameters

    Returns IDataFrame<IndexT, ResultT>

  • Type parameters

    • Index2T

    • Value2T

    • Index3T

    • Value3T

    • Index4T

    • Value4T

    • ResultT

    Parameters

    Returns IDataFrame<IndexT, ResultT>

  • Type parameters

    • ResultT

    Parameters

    • Rest ...args: any[]

    Returns IDataFrame<IndexT, ResultT>

Generated using TypeDoc