Interface IDataFrame<IndexT, ValueT>

Interface that represents a dataframe. A dataframe contains an indexed sequence of data records. Think of it as a spreadsheet or CSV file in memory.

Each data record contains multiple named fields; the value of each field represents one row in a column of data. Each column of data is a named Series. You can think of a dataframe as a collection of named data series.
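
The record/column relationship described above can be pictured with a plain TypeScript stand-in. This is only a sketch: it does not use the library, and `SalesRow`, `SalesPair` and `column` are hypothetical names chosen for illustration.

```typescript
// Plain-TypeScript stand-in for a dataframe: an ordered list of [index, row] pairs.
// Illustration only; this is not how the library stores its data.
type SalesRow = { Date: string; Sales: number };
type SalesPair = [number, SalesRow];

const pairs: SalesPair[] = [
    [0, { Date: "2016-06-01", Sales: 10 }],
    [1, { Date: "2016-06-02", Sales: 20 }],
    [2, { Date: "2016-06-03", Sales: 30 }],
];

// Reading one named field from every row yields one column (a "series") of data.
function column<K extends keyof SalesRow>(data: SalesPair[], name: K): SalesRow[K][] {
    return data.map(pair => pair[1][name]);
}

const salesColumn = column(pairs, "Sales");
const dateColumn = column(pairs, "Date");
```

Each row carries every field, while each column is the same field read down all rows.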

Type parameters

  • IndexT

    The type to use for the index.

  • ValueT

    The type to use for each row/data record.

Methods

__@iterator

  • __@iterator(): Iterator<ValueT>
  • Get an iterator to enumerate the rows of the dataframe. Enumerating the iterator forces lazy evaluation to complete. This function is automatically called by for...of.

    example
    for (const row of df) {
        // ... do something with the row ...
    }
    

    Returns Iterator<ValueT>

    An iterator for the rows in the dataframe.

after

  • after(indexValue: IndexT): IDataFrame<IndexT, ValueT>
  • Gets a new dataframe containing all rows after the specified index value (exclusive).

    example
    const df = new DataFrame({
         index: [0, 1, 2, 3], // This is the default index.
         values: [10, 20, 30, 40],
    });
    
    const lastHalf = df.after(1);
    expect(lastHalf.toArray()).to.eql([30, 40]);
    
    example
    const timeSeriesDf = ... a dataframe indexed by date/time ...
    
    // Get all rows after the specified date.
    const allRowsAfterStartDate = timeSeriesDf.after(new Date(2016, 5, 4));
    

    Parameters

    • indexValue: IndexT

      The index value after which to start the new dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe containing all rows after the specified index value.

aggregate

  • Aggregate the rows in the dataframe to a single result.

    aggregate is similar to DataFrame.reduce but the parameters are reversed. Please use DataFrame.reduce in preference to aggregate.

    example
    const dailySalesDf = ... daily sales figures for the past month ...
    const totalSalesForThisMonth = dailySalesDf.aggregate(
         0, // Seed - the starting value.
         (accumulator, row) => accumulator + row.SalesAmount // Aggregation function.
    );
    
    example
    const totalSalesAllTime = 500; // We'll seed the aggregation with this value.
    const dailySalesDf = ... daily sales figures for the past month ...
    const updatedTotalSalesAllTime = dailySalesDf.aggregate(
         totalSalesAllTime,
         (accumulator, row) => accumulator + row.SalesAmount
    );
    
    example
    var salesDataSummary = salesDataDf.aggregate({
         TotalSales: df => df.count(),
         AveragePrice: df => df.deflate(row => row.Price).average(),
         TotalRevenue: df => df.deflate(row => row.Revenue).sum(),
    });
    

    Type parameters

    • ToT

    Parameters

    • seedOrSelector: AggregateFn<ValueT, ToT> | ToT | IColumnAggregateSpec
    • Optional selector: AggregateFn<ValueT, ToT>

      Function that takes the seed and then each row in the dataframe and produces the aggregate value.

    Returns ToT

    Returns a new value that has been aggregated from the dataframe using the 'selector' function.
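
    The seed-plus-selector form is a left fold over the rows. The sketch below shows that idea on a plain array; `aggregateRows` and `DailySale` are hypothetical names, and the code is not the library's implementation.

```typescript
type DailySale = { SalesAmount: number };

// Hypothetical stand-in for the seed + selector form of aggregate.
// The seed comes first, then the function that folds each row into the accumulator.
function aggregateRows<RowT, ToT>(
    rows: RowT[],
    seed: ToT,
    selector: (accumulator: ToT, row: RowT) => ToT
): ToT {
    let accumulator = seed;
    for (const row of rows) {
        accumulator = selector(accumulator, row);
    }
    return accumulator;
}

const dailySales: DailySale[] = [{ SalesAmount: 5 }, { SalesAmount: 7 }, { SalesAmount: 8 }];

// Seeded with 0: a plain total.
const totalSales = aggregateRows(dailySales, 0, (acc, row) => acc + row.SalesAmount);

// Seeded with an existing total: the new sales are added on top of it.
const updatedTotal = aggregateRows(dailySales, 500, (acc, row) => acc + row.SalesAmount);
```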

all

  • Evaluates a predicate function for every row in the dataframe to determine if some condition is true/truthy for all rows in the dataframe.

    example
    const everyoneIsNamedFred = df.all(row => row.CustomerName === "Fred"); // Check if all customers are named Fred.
    

    Parameters

    • predicate: PredicateFn<ValueT>

      Predicate function that receives each row. It should return true/truthy for a match, otherwise false/falsy.

    Returns boolean

    Returns true if the predicate has returned true or truthy for every row in the dataframe, otherwise returns false. Returns false for an empty dataframe.
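
    The empty-dataframe case is worth noting because it differs from JavaScript's Array.every, which returns true for an empty array. The sketch below mirrors the documented result on plain arrays; `allRows` and `NamedCustomer` are hypothetical names and this is not the library's code.

```typescript
type NamedCustomer = { CustomerName: string };

// Hypothetical helper mirroring the documented result of `all`:
// true only when there is at least one row and every row matches.
function allRows<RowT>(rows: RowT[], predicate: (row: RowT) => boolean): boolean {
    if (rows.length === 0) {
        return false; // Documented behaviour: false for an empty dataframe.
    }
    return rows.every(row => predicate(row));
}

const isFred = (row: NamedCustomer) => row.CustomerName === "Fred";
const freds: NamedCustomer[] = [{ CustomerName: "Fred" }, { CustomerName: "Fred" }];
const mixed: NamedCustomer[] = [{ CustomerName: "Fred" }, { CustomerName: "Wilma" }];
const nobody: NamedCustomer[] = [];

const allFreds = allRows(freds, isFred);
const notAllFreds = allRows(mixed, isFred);
const emptyResult = allRows(nobody, isFred);

// Array.prototype.every is vacuously true on an empty array.
const arrayEveryOnEmpty = nobody.every(row => isFred(row));
```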

any

  • Evaluates a predicate function for every row in the dataframe to determine if some condition is true/truthy for any of the rows in the dataframe.

    If no predicate is specified then it simply checks if the dataframe contains more than zero rows.

    example
    const anyFreds = df.any(row => row.CustomerName === "Fred"); // Do we have any customers named Fred?
    
    example
    const anyCustomers = df.any(); // Do we have any customers at all?
    

    Parameters

    • Optional predicate: PredicateFn<ValueT>

      Optional predicate function that receives each row. It should return true/truthy for a match, otherwise false/falsy.

    Returns boolean

    Returns true if the predicate has returned truthy for any row in the sequence, otherwise returns false. If no predicate is passed it returns true if the dataframe contains any rows at all. Returns false for an empty dataframe.

appendPair

  • appendPair(pair: [IndexT, ValueT]): IDataFrame<IndexT, ValueT>
  • Append a pair to the end of a dataframe. Doesn't modify the original dataframe! The returned dataframe is entirely new and contains rows from the original dataframe plus the appended pair.

    example
    const newIndex = ... index of the new row ...
    const newRow = ... the new data row to append ...
    const appendedDf = df.appendPair([newIndex, newRow]);
    

    Parameters

    • pair: [IndexT, ValueT]

      The index/value pair to append.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with the specified pair appended.
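
    The non-mutating behaviour described above can be sketched on a plain array of pairs. `appendPairTo` and `IndexedPair` are hypothetical names; the code illustrates the contract only and is not the library's implementation.

```typescript
type IndexedPair<IndexT, ValueT> = [IndexT, ValueT];

// Hypothetical stand-in: builds and returns a new array, leaving the input untouched.
function appendPairTo<IndexT, ValueT>(
    pairs: IndexedPair<IndexT, ValueT>[],
    pair: IndexedPair<IndexT, ValueT>
): IndexedPair<IndexT, ValueT>[] {
    return pairs.concat([pair]);
}

const originalPairs: IndexedPair<number, string>[] = [[0, "a"], [1, "b"]];
const appendedPairs = appendPairTo(originalPairs, [2, "c"]);
```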

at

  • at(index: IndexT): ValueT | undefined
  • Get the row, if there is one, with the specified index.

    example
    const row = df.at(5); // Get the row at index 5 (with a default 0-based index).
    
    example
    const date = ... some date ...
    // Retrieve the row with the specified date from a time-series dataframe (assuming a date index has been applied).
    const row = df.at(date);
    

    Parameters

    • index: IndexT

      Index for which to retrieve the row.

    Returns ValueT | undefined

    Returns the row from the specified index in the dataframe, or undefined if there is no such index present in the dataframe.

bake

  • Forces lazy evaluation to complete and 'bakes' the dataframe into memory.

    example
    const bakedDf = df.bake();
    

    Returns IDataFrame<IndexT, ValueT>

    Returns a dataframe that has been 'baked'; all lazy evaluation has completed.

before

  • before(indexValue: IndexT): IDataFrame<IndexT, ValueT>
  • Gets a new dataframe containing all rows up to the specified index value (exclusive).

    example
    const df = new DataFrame({
         index: [0, 1, 2, 3], // This is the default index.
         values: [10, 20, 30, 40],
    });
    
    const firstHalf = df.before(2);
    expect(firstHalf.toArray()).to.eql([10, 20]);
    
    example
    const timeSeriesDf = ... a dataframe indexed by date/time ...
    
    // Get all rows before the specified date.
    const allRowsBeforeEndDate = timeSeriesDf.before(new Date(2016, 5, 4));
    

    Parameters

    • indexValue: IndexT

      The index value at which to end the new dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe containing all rows up to (but not including) the specified index value.

between

  • between(startIndexValue: IndexT, endIndexValue: IndexT): IDataFrame<IndexT, ValueT>
  • Gets a new dataframe containing all rows between the specified index values (inclusive).

    example
    const df = new DataFrame({
         index: [0, 1, 2, 3, 4, 5], // This is the default index.
         values: [10, 20, 30, 40, 50, 60],
    });
    
    const middleSection = df.between(1, 4);
    expect(middleSection.toArray()).to.eql([20, 30, 40, 50]);
    
    example
    const timeSeriesDf = ... a dataframe indexed by date/time ...
    
    // Get all rows between the start and end dates (inclusive).
    const allRowsBetweenDates = timeSeriesDf.between(new Date(2016, 5, 4), new Date(2016, 5, 22));
    

    Parameters

    • startIndexValue: IndexT

      The index at which to start the new dataframe.

    • endIndexValue: IndexT

      The index at which to end the new dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe containing all values between the specified index values (inclusive).
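
    The range methods differ mainly in whether their bounds are exclusive (after, before) or inclusive (between). The sketch below shows those bounds on a plain array of [index, value] pairs. It assumes the index is sorted in ascending order, the helper names are hypothetical, and it is not the library's implementation.

```typescript
type NumberPair = [number, number]; // [index, value]

const indexedValues: NumberPair[] = [[0, 10], [1, 20], [2, 30], [3, 40], [4, 50], [5, 60]];

// Assumption: the index is sorted in ascending order.
// before: everything strictly below the end index (exclusive).
const rowsBefore = (data: NumberPair[], end: number) => data.filter(p => p[0] < end);

// after: everything strictly above the start index (exclusive).
const rowsAfter = (data: NumberPair[], start: number) => data.filter(p => p[0] > start);

// between: both bounds are included (inclusive).
const rowsBetween = (data: NumberPair[], start: number, end: number) =>
    data.filter(p => p[0] >= start && p[0] <= end);

const valuesOf = (data: NumberPair[]) => data.map(p => p[1]);

const beforeTwo = valuesOf(rowsBefore(indexedValues, 2));
const afterOne = valuesOf(rowsAfter(indexedValues, 1));
const oneToFour = valuesOf(rowsBetween(indexedValues, 1, 4));
```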

bringToBack

  • bringToBack(columnOrColumns: string | string[]): IDataFrame<IndexT, ValueT>
  • Bring the column(s) with specified name(s) to the back of the column order, making it (or them) the last column(s) in the output dataframe.

    example
    const modifiedDf = df.bringToBack("NewLastColumn");
    
    example
    const modifiedDf = df.bringToBack(["NewSecondLastColumn", "NewLastColumn"]);
    

    Parameters

    • columnOrColumns: string | string[]

      Specifies the column or columns to bring to the back.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with 1 or more columns brought to the back of the column ordering.

bringToFront

  • bringToFront(columnOrColumns: string | string[]): IDataFrame<IndexT, ValueT>
  • Bring the column(s) with specified name(s) to the front of the column order, making it (or them) the first column(s) in the output dataframe.

    example
    const modifiedDf = df.bringToFront("NewFirstColumn");
    
    example
    const modifiedDf = df.bringToFront(["NewFirstColumn", "NewSecondColumn"]);
    

    Parameters

    • columnOrColumns: string | string[]

      Specifies the column or columns to bring to the front.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with 1 or more columns brought to the front of the column ordering.

cast

  • cast<NewValueT>(): IDataFrame<IndexT, NewValueT>
  • Cast the value of the dataframe to a new type. This operation has no effect but to retype the rows that the dataframe contains.

    example
    const castDf = df.cast();
    

    Type parameters

    • NewValueT

    Returns IDataFrame<IndexT, NewValueT>

    The same dataframe, but with the type changed.

concat

  • Concatenate multiple other dataframes onto this dataframe.

    example
    const concatenated = a.concat(b);
    
    example
    const concatenated = a.concat(b, c);
    
    example
    const concatenated = a.concat([b, c]);
    
    example
    const concatenated = a.concat(b, [c, d]);
    
    example
    const otherDfs = [... array of dataframes...];
    const concatenated = a.concat(otherDfs);
    

    Parameters

    • Rest ...dataframes: (IDataFrame<IndexT, ValueT> | IDataFrame<IndexT, ValueT>[])[]

      Multiple arguments. Each can be either a dataframe or an array of dataframes.

    Returns IDataFrame<IndexT, ValueT>

    Returns a single dataframe concatenated from multiple input dataframes.
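
    The mixed argument forms shown in the examples (single dataframes, arrays of dataframes, or both) amount to flattening the argument list one level before joining the rows. A minimal sketch with a stand-in type follows; `RowFrame` and `concatFrames` are hypothetical names and this is not the library's code.

```typescript
// Hypothetical stand-in for a dataframe: just a wrapper around an array of rows.
type RowFrame = { rows: number[] };

function concatFrames(first: RowFrame, ...others: (RowFrame | RowFrame[])[]): RowFrame {
    let rows = first.rows.slice();
    for (const arg of others) {
        // Each argument is either one frame or an array of frames.
        const frames = Array.isArray(arg) ? arg : [arg];
        for (const frame of frames) {
            rows = rows.concat(frame.rows);
        }
    }
    return { rows };
}

const frameA: RowFrame = { rows: [1] };
const frameB: RowFrame = { rows: [2] };
const frameC: RowFrame = { rows: [3] };
const frameD: RowFrame = { rows: [4] };

const concatenatedFrame = concatFrames(frameA, frameB, [frameC, frameD]);
```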

count

  • count(): number
  • Count the number of rows in the dataframe.

    example
    const numRows = df.count();
    

    Returns number

    Returns the count of all rows.

defaultIfEmpty

  • defaultIfEmpty(defaultSequence: ValueT[] | IDataFrame<IndexT, ValueT>): IDataFrame<IndexT, ValueT>
  • Returns the specified default dataframe if the input dataframe is empty.

    example
    const emptyDataFrame = new DataFrame();
    const defaultDataFrame = new DataFrame([ { A: 1 }, { A: 2 }, { A: 3 } ]);
    expect(emptyDataFrame.defaultIfEmpty(defaultDataFrame)).to.eql(defaultDataFrame);
    
    example
    const nonEmptyDataFrame = new DataFrame([ { A: 100 }]);
    const defaultDataFrame = new DataFrame([ { A: 1 }, { A: 2 }, { A: 3 } ]);
    expect(nonEmptyDataFrame.defaultIfEmpty(defaultDataFrame)).to.eql(nonEmptyDataFrame);
    

    Parameters

    • defaultSequence: ValueT[] | IDataFrame<IndexT, ValueT>

      Default dataframe to return if the input dataframe is empty.

    Returns IDataFrame<IndexT, ValueT>

    Returns 'defaultSequence' if the input dataframe is empty.

deflate

  • Converts (deflates) a dataframe to a Series.

    example
    const series = df.deflate(); // Deflate to a series of objects.
    
    example
    const series = df.deflate(row => row.SomeColumn); // Extract a particular column.
    

    Type parameters

    • ToT

    Parameters

    • Optional selector: SelectorWithIndexFn<ValueT, ToT>

      Optional user-defined selector function that transforms each row to produce the series.

    Returns ISeries<IndexT, ToT>

    Returns a series that was created from the original dataframe.

detectTypes

  • Detect the frequency of the types of the values in the dataframe. This is a good way to understand the shape of your data.

    example
    const df = dataForge.readFileSync("./my-data.json").parseJSON();
    const dataTypes = df.detectTypes();
    console.log(dataTypes.toString());
    

    Returns IDataFrame<number, ITypeFrequency>

    Returns a dataframe with rows that conform to ITypeFrequency that describes the data types contained in the original dataframe.

detectValues

  • Detect the frequency of the values in the dataframe. This is a good way to understand the shape of your data.

    example
    const df = dataForge.readFileSync("./my-data.json").parseJSON();
    const dataValues = df.detectValues();
    console.log(dataValues.toString());
    

    Returns IDataFrame<number, IValueFrequency>

    Returns a dataframe with rows that conform to IValueFrequency that describes the values contained in the original dataframe.

distinct

  • Returns only the set of rows in the dataframe that are distinct according to some criteria. This can be used to remove duplicate rows from the dataframe.

    example
    // Remove duplicate rows by customer id. Will return only a single row per customer.
    const distinctCustomers = salesDf.distinct(sale => sale.CustomerId);
    
    example
    // Remove duplicate rows across multiple columns.
    const safeJoinChar = '$';
    const distinctCustomers = salesDf.distinct(sale => [sale.CustomerId, sale.MonthOfYear].join(safeJoinChar));
    

    Type parameters

    • ToT

    Parameters

    • Optional selector: SelectorFn<ValueT, ToT>

      User-defined selector function that specifies the criteria used to make comparisons for duplicate rows. Note that the selector determines the object used for the comparison. If the selector returns a new instance of an array or a JavaScript object, distinct will always include all rows, since the object instances are different even if the members are the same.

    Returns IDataFrame<IndexT, ValueT>

    Returns a dataframe containing only unique values as determined by the 'selector' function.
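
    The note about array and object keys follows from identity comparison: two freshly created arrays are never equal to each other, even with identical members. The sketch below reproduces that effect on plain arrays. It assumes keys are compared with strict equality, as the note implies; `distinctBy` and `MonthlySale` are hypothetical names and this is not the library's implementation.

```typescript
type MonthlySale = { CustomerId: number; MonthOfYear: number };

// Hypothetical stand-in that compares selector results with strict equality (===).
function distinctBy<RowT, KeyT>(rows: RowT[], selector: (row: RowT) => KeyT): RowT[] {
    const seen: KeyT[] = [];
    const output: RowT[] = [];
    for (const row of rows) {
        const key = selector(row);
        if (seen.indexOf(key) === -1) { // indexOf uses strict equality.
            seen.push(key);
            output.push(row);
        }
    }
    return output;
}

const monthlySales: MonthlySale[] = [
    { CustomerId: 1, MonthOfYear: 3 },
    { CustomerId: 1, MonthOfYear: 3 },
    { CustomerId: 2, MonthOfYear: 3 },
];

// A fresh array per row is never === to an earlier one, so nothing is removed.
const byArrayKey = distinctBy(monthlySales, sale => [sale.CustomerId, sale.MonthOfYear]);

// Joining the fields into a string gives a key that compares by value.
const byStringKey = distinctBy(monthlySales, sale => [sale.CustomerId, sale.MonthOfYear].join("$"));
```

    The join character should be one that cannot occur in the joined fields, otherwise different rows could produce the same key.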

dropSeries

  • dropSeries<NewValueT>(columnOrColumns: string | string[]): IDataFrame<IndexT, NewValueT>
  • Create a new dataframe with the requested column or columns dropped.

    example
    const modifiedDf = df.dropSeries("SomeColumn");
    
    example
    const modifiedDf = df.dropSeries(["ColumnA", "ColumnB"]);
    

    Type parameters

    • NewValueT

    Parameters

    • columnOrColumns: string | string[]

      Specifies the column name (a string) or columns (array of strings) to drop.

    Returns IDataFrame<IndexT, NewValueT>

    Returns a new dataframe with a particular named column or columns removed.

endAt

  • endAt(indexValue: IndexT): IDataFrame<IndexT, ValueT>
  • Gets a new dataframe containing all rows up until and including the specified index value (inclusive).

    example
    const df = new DataFrame({
         index: [0, 1, 2, 3], // This is the default index.
         values: [10, 20, 30, 40],
    });
    
    const firstHalf = df.endAt(1);
    expect(firstHalf.toArray()).to.eql([10, 20]);
    
    example
    const timeSeriesDf = ... a dataframe indexed by date/time ...
    
    // Get all rows ending at a particular date.
    const allRowsUpToAndIncludingTheExactEndDate = df.endAt(new Date(2016, 5, 4));
    

    Parameters

    • indexValue: IndexT

      The index value at which to end the new dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe containing all rows up until and including the specified index value.

ensureSeries

  • Add a series to the dataframe, but only if it doesn't already exist.

    example
    const updatedDf = df.ensureSeries("ANewColumn", new Series([1, 2, 3]));
    
    example
    const updatedDf = df.ensureSeries("ANewColumn", df =>
         df.getSeries("AnExistingSeries").select(aTransformation)
    );
    
    example
    const modifiedDf = df.ensureSeries({
         ANewColumn: new Series([1, 2, 3]),
         SomeOtherColumn: new Series([10, 20, 30])
    });
    
    example
    const modifiedDf = df.ensureSeries({
         ANewColumn: df => df.getSeries("SourceData").select(aTransformation)
    });
    

    Type parameters

    • SeriesValueT

    Parameters

    • columnNameOrSpec: string | IColumnGenSpec

      The name of the series to add or an IColumnGenSpec that specifies the columns to add.

    • Optional series: ISeries<IndexT, SeriesValueT> | SeriesSelectorFn<IndexT, ValueT, SeriesValueT>

      If columnNameOrSpec is a string that specifies the name of the series to add, this specifies the actual Series to add or a selector that generates the series given the dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with the specified series added, if the series didn't already exist. Otherwise if the requested series already exists the same dataframe is returned.
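
    The add-only-if-missing rule can be sketched with a plain dictionary of columns. `ColumnMap` and `ensureColumn` are hypothetical names; the code only illustrates the documented contract and is not the library's implementation.

```typescript
// Hypothetical stand-in: a dataframe reduced to a dictionary of named columns.
type ColumnMap = { [columnName: string]: number[] };

function ensureColumn(columns: ColumnMap, columnName: string, values: number[]): ColumnMap {
    if (Object.prototype.hasOwnProperty.call(columns, columnName)) {
        return columns; // Column already exists: the same object is returned unchanged.
    }
    return { ...columns, [columnName]: values }; // Otherwise a new object with the extra column.
}

const existingColumns: ColumnMap = { A: [1, 2, 3] };

const unchangedColumns = ensureColumn(existingColumns, "A", [9, 9, 9]);
const extendedColumns = ensureColumn(existingColumns, "B", [10, 20, 30]);
```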

except

  • except<InnerIndexT, InnerValueT, KeyT>(inner: IDataFrame<InnerIndexT, InnerValueT>, outerSelector?: SelectorFn<ValueT, KeyT>, innerSelector?: SelectorFn<InnerValueT, KeyT>): IDataFrame<IndexT, ValueT>
  • Creates a new dataframe by merging two input dataframes. The resulting dataframe contains only the rows from the 1st dataframe that don't appear in the 2nd dataframe. This is essentially subtracting the rows from the 2nd dataframe from the 1st and creating a new dataframe with the remaining rows.

    example
    const dfA = ...
    const dfB = ...
    const remainingDf = dfA.except(dfB);
    
    example
    // Find the list of customers who haven't bought anything recently.
    const allCustomers = ... list of all customers ...
    const recentCustomers = ... list of customers who have purchased recently ...
    const remainingCustomers = allCustomers.except(
         recentCustomers,
         customerRecord => customerRecord.CustomerId
    );
    

    Type parameters

    • InnerIndexT

    • InnerValueT

    • KeyT

    Parameters

    • inner: IDataFrame<InnerIndexT, InnerValueT>

      The inner dataframe to merge (the dataframe you call the function on is the 'outer' dataframe).

    • Optional outerSelector: SelectorFn<ValueT, KeyT>

      Optional user-defined selector function that selects the key from the outer dataframe that is used to match the two dataframes.

    • Optional innerSelector: SelectorFn<InnerValueT, KeyT>

      Optional user-defined selector function that selects the key from the inner dataframe that is used to match the two dataframes.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe that contains only the rows from the 1st dataframe that don't appear in the 2nd dataframe.
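
    The subtraction described above can be sketched as a key lookup: compute the keys of the inner rows, then keep the outer rows whose key is not among them. `exceptBy` and `CustomerRecord` are hypothetical names and the code is not the library's implementation.

```typescript
type CustomerRecord = { CustomerId: number; Name: string };

// Hypothetical stand-in: keep outer rows whose key does not appear among the inner keys.
function exceptBy<OuterT, InnerT, KeyT>(
    outer: OuterT[],
    inner: InnerT[],
    outerSelector: (row: OuterT) => KeyT,
    innerSelector: (row: InnerT) => KeyT
): OuterT[] {
    const innerKeys = inner.map(row => innerSelector(row));
    return outer.filter(row => innerKeys.indexOf(outerSelector(row)) === -1);
}

const everyCustomer: CustomerRecord[] = [
    { CustomerId: 1, Name: "Ann" },
    { CustomerId: 2, Name: "Bob" },
    { CustomerId: 3, Name: "Cat" },
];
const recentBuyers: CustomerRecord[] = [{ CustomerId: 2, Name: "Bob" }];

const lapsedCustomers = exceptBy(
    everyCustomer,
    recentBuyers,
    customer => customer.CustomerId,
    customer => customer.CustomerId
);
```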

expectSeries

  • expectSeries<SeriesValueT>(columnName: string): ISeries<IndexT, SeriesValueT>
  • Verifies the existence of a named column and extracts the Series for it. Throws an exception if the requested column doesn't exist.

    example
    try {
         const series = df.expectSeries("SomeColumn");
         // ... do something with the series ...
    }
    catch (err) {
         // ... the dataframe doesn't contain the column "SomeColumn" ...
    }
    

    Type parameters

    • SeriesValueT

    Parameters

    • columnName: string

      Name of the column to extract.

    Returns ISeries<IndexT, SeriesValueT>

    Returns the Series for the column if it exists, otherwise it throws an exception.

fillGaps

  • fillGaps(comparer: ComparerFn<[IndexT, ValueT], [IndexT, ValueT]>, generator: GapFillFn<[IndexT, ValueT], [IndexT, ValueT]>): IDataFrame<IndexT, ValueT>
  • Fill gaps in a dataframe.

    example
    var sequenceWithGaps = ...
    
    // Predicate that determines if there is a gap.
    var gapExists = (pairA, pairB) => {
        // Returns true if there is a gap.
        return true;
    };
    
    // Generator function that produces new rows to fill the gap.
    var gapFiller = (pairA, pairB) => {
        // Create an array of index, value pairs that fill the gaps between pairA and pairB.
        return [
            newPair1,
            newPair2,
            newPair3,
        ];
    };
    
    var sequenceWithoutGaps = sequenceWithGaps.fillGaps(gapExists, gapFiller);
    

    Parameters

    • comparer: ComparerFn<[IndexT, ValueT], [IndexT, ValueT]>

      User-defined comparer function that is passed pairA and pairB, two consecutive rows. It should return truthy if there is a gap between the rows, or falsy if there is no gap.

    • generator: GapFillFn<[IndexT, ValueT], [IndexT, ValueT]>

      User-defined generator function that is passed pairA and pairB, two consecutive rows, returns an array of pairs that fills the gap between the rows.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with gaps filled in.
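
    A concrete comparer and generator may make the contract clearer. The sketch below walks consecutive pairs of a plain array, asks the comparer whether a gap exists, and splices in whatever the generator returns. `fillGapsIn` is a hypothetical stand-in rather than the library's implementation, and carrying the earlier value forward is just one possible fill strategy.

```typescript
type GapPair = [number, number]; // [index, value]

// Hypothetical stand-in for fillGaps over a plain array of pairs.
function fillGapsIn(
    pairs: GapPair[],
    comparer: (pairA: GapPair, pairB: GapPair) => boolean,
    generator: (pairA: GapPair, pairB: GapPair) => GapPair[]
): GapPair[] {
    const output: GapPair[] = [];
    pairs.forEach((pair, position) => {
        output.push(pair);
        const next = pairs[position + 1];
        if (next !== undefined && comparer(pair, next)) {
            for (const filler of generator(pair, next)) {
                output.push(filler);
            }
        }
    });
    return output;
}

const sequenceWithGaps: GapPair[] = [[0, 10], [1, 20], [4, 50]];

// There is a gap when the two indexes are more than 1 apart.
const gapExists = (pairA: GapPair, pairB: GapPair) => pairB[0] - pairA[0] > 1;

// Fill each missing index by carrying the earlier value forward.
const gapFiller = (pairA: GapPair, pairB: GapPair): GapPair[] => {
    const fillers: GapPair[] = [];
    for (let index = pairA[0] + 1; index < pairB[0]; index++) {
        fillers.push([index, pairA[1]]);
    }
    return fillers;
};

const sequenceWithoutGaps = fillGapsIn(sequenceWithGaps, gapExists, gapFiller);
```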

filter

  • Filter the dataframe through a user-defined predicate function.

    This is the same concept as the JavaScript function Array.filter but filters a dataframe rather than an array.

    example
    // Filter so we only have sales figures greater than 100.
    const filtered = dataframe.filter(row => row.salesFigure > 100);
    console.log(filtered.toArray());
    

    Parameters

    • predicate: PredicateFn<ValueT>

      Predicate function to filter values from the dataframe. Returns true/truthy to keep elements, or false/falsy to omit elements.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe containing only the values from the original dataframe that matched the predicate.

first

  • first(): ValueT
  • Get the first row of the dataframe.

    example
    const firstRow = df.first();
    

    Returns ValueT

    Returns the first row of the dataframe.

flatMap

  • Transforms and flattens an input dataframe, generating a new dataframe. The transformer function is called for each value in the input dataframe and produces an array that is then flattened into the generated dataframe.

    This is the same concept as the JavaScript function Array.flatMap but maps over a dataframe rather than an array.

    example
    function transformer (input) {
         const output = [];
         while (someCondition) {
             // ... generate zero or more outputs from a single input ...
             output.push(... some generated value ...);
         }
         return output;
    }
    
    const transformed = dataframe.flatMap(transformer);
    console.log(transformed.toString());
    

    Type parameters

    • ToT

    Parameters

    • transformer: SelectorWithIndexFn<ValueT, Iterable<ToT>>

      A user-defined function that transforms each value into an array that is then flattened into the generated dataframe.

    Returns IDataFrame<IndexT, ToT>

    Returns a new dataframe generated by calling the transformer function over each element of the input.
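
    The transform-then-flatten step can be sketched on a plain array, where each input row may yield zero or more output rows. `flatMapRows` and `CustomerOrder` are hypothetical names and this is not the library's implementation.

```typescript
type CustomerOrder = { OrderId: number; Items: string[] };

// Hypothetical stand-in: call the transformer per row and flatten the results one level.
function flatMapRows<FromT, ToT>(rows: FromT[], transformer: (row: FromT) => ToT[]): ToT[] {
    let output: ToT[] = [];
    for (const row of rows) {
        output = output.concat(transformer(row));
    }
    return output;
}

const customerOrders: CustomerOrder[] = [
    { OrderId: 1, Items: ["pen", "ink"] },
    { OrderId: 2, Items: [] }, // Produces no output rows.
    { OrderId: 3, Items: ["pad"] },
];

// One output row per ordered item.
const lineItems = flatMapRows(customerOrders, order =>
    order.Items.map(item => ({ OrderId: order.OrderId, Item: item }))
);
```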

forEach

  • Invoke a callback function for each row in the dataframe.

    example
    df.forEach(row => {
         // ... do something with the row ...
    });
    

    Parameters

    • callback: CallbackFn<ValueT>

      The callback function to invoke for each row.

    Returns IDataFrame<IndexT, ValueT>

    Returns the original dataframe with no modifications.

generateSeries

  • Generate new columns based on existing rows.

    This is equivalent to calling select to transform the original dataframe to a new dataframe with different columns, then using withSeries to merge each of the new columns into the original dataframe.

    example
    function produceNewColumns (inputRow) {
         const newColumns = {
             // ... specify new columns and their values based on the input row ...
         };
    
         return newColumns;
    };
    
    const dfWithNewSeries = df.generateSeries(row => produceNewColumns(row));
    
    example
    const dfWithNewSeries = df.generateSeries({
         NewColumnA: row => produceNewColumnA(row),
         NewColumnB: row => produceNewColumnB(row),
    })
    

    Type parameters

    • NewValueT

    Parameters

    • generator: SelectorWithIndexFn<any, any> | IColumnTransformSpec

      Generator function that transforms each row to produce 1 or more new columns. Or use a column spec that has fields for each column, the fields specify a generate function that produces the value for each new column.

    Returns IDataFrame<IndexT, NewValueT>

    Returns a new dataframe with 1 or more new columns.
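
    On plain row objects, the effect is that each output row keeps its original fields and gains the generated ones. A minimal sketch, with hypothetical names (`generateColumns`, `PricedRow`) and no claim to match the library's implementation:

```typescript
type PricedRow = { Price: number; Quantity: number };

// Hypothetical stand-in: merge the generated fields into a copy of each row.
function generateColumns<RowT extends object, NewT extends object>(
    rows: RowT[],
    generator: (row: RowT) => NewT
): (RowT & NewT)[] {
    return rows.map(row => ({ ...row, ...generator(row) }));
}

const pricedRows: PricedRow[] = [
    { Price: 2, Quantity: 3 },
    { Price: 5, Quantity: 1 },
];

// Adds a Revenue column computed from the existing columns.
const rowsWithRevenue = generateColumns(pricedRows, row => ({ Revenue: row.Price * row.Quantity }));
```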

getColumnNames

  • getColumnNames(): string[]
  • Get the names of the columns in the dataframe.

    example
    console.log(df.getColumnNames());
    

    Returns string[]

    Returns an array of the column names in the dataframe.

getColumns

  • Retrieve the collection of all columns in the dataframe.

    example
    for (const column of df.getColumns()) {
         console.log("Column name: ");
         console.log(column.name);
    
         console.log("Data:");
         console.log(column.series.toArray());
    }
    

    Returns ISeries<number, IColumn>

    Returns a Series containing the columns in the dataframe.

getIndex

  • Get the index for the dataframe.

    example
    const index = df.getIndex();
    

    Returns IIndex<IndexT>

    The Index for the dataframe.

getSeries

  • getSeries<SeriesValueT>(columnName: string): ISeries<IndexT, SeriesValueT>
  • Extract a Series from a named column in the dataframe.

    example
    const series = df.getSeries("SomeColumn");
    

    Type parameters

    • SeriesValueT

    Parameters

    • columnName: string

      Specifies the name of the column that contains the Series to retrieve.

    Returns ISeries<IndexT, SeriesValueT>

    Returns the Series extracted from the named column in the dataframe.

groupBy

  • Collects rows in the dataframe into a Series of groups according to a user-defined selector function.

    example
    const salesDf = ... product sales ...
    const salesByProduct = salesDf.groupBy(sale => sale.ProductId);
    for (const productSalesGroup of salesByProduct) {
         // ... do something with each product group ...
         const productId = productSalesGroup.first().ProductId;
         const totalSalesForProduct = productSalesGroup.deflate(sale => sale.Amount).sum();
         console.log(totalSalesForProduct);
    }
    

    Type parameters

    • GroupT

    Parameters

    • selector: SelectorWithIndexFn<ValueT, GroupT>

      User-defined selector function that specifies the criteria to group by.

    Returns ISeries<number, IDataFrame<IndexT, ValueT>>

    Returns a Series of groups. Each group is a dataframe with rows that have been grouped by the 'selector' function.

groupSequentialBy

  • Collects rows in the dataframe into a Series of groups based on whether sequential rows are the same or according to a user-defined selector function.

    example
    // Some ultra simple stock trading strategy backtesting...
    const dailyStockPriceDf = ... daily stock price for a company ...
    const priceGroups = dailyStockPriceDf.groupSequentialBy(day => day.close > day.movingAverage);
    for (const priceGroup of priceGroups) {
         // ... do something with each stock price group ...
    
         const firstDay = priceGroup.first();
         if (firstDay.close > firstDay.movingAverage) {
             // This group of days has the stock price above its moving average.
             // ... maybe enter a long trade here ...
         }
         else {
             // This group of days has the stock price below its moving average.
             // ... maybe enter a short trade here ...
         }
    }
    

    Type parameters

    • GroupT

    Parameters

    • Optional selector: SelectorFn<ValueT, GroupT>

      Optional selector that specifies the criteria for grouping.

    Returns ISeries<number, IDataFrame<IndexT, ValueT>>

    Returns a Series of groups. Each group is a dataframe with rows that are the same or have been grouped by the 'selector' function.
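
    The sketch below illustrates sequential grouping on a plain array. It assumes, as the method name suggests, that a new group starts whenever the selected key changes from one row to the next, so the same key can produce several groups. `groupSequential` and `TradingDay` are hypothetical names and this is not the library's implementation.

```typescript
type TradingDay = { close: number; movingAverage: number };

// Hypothetical stand-in: start a new group each time the key differs from the previous row's key.
function groupSequential<RowT, KeyT>(rows: RowT[], selector: (row: RowT) => KeyT): RowT[][] {
    const groups: RowT[][] = [];
    let current: RowT[] | undefined;
    let lastKey: KeyT | undefined;
    for (const row of rows) {
        const key = selector(row);
        if (current === undefined || key !== lastKey) {
            current = [];
            groups.push(current);
            lastKey = key;
        }
        current.push(row);
    }
    return groups;
}

const tradingDays: TradingDay[] = [
    { close: 11, movingAverage: 10 }, // above
    { close: 12, movingAverage: 10 }, // above
    { close: 9, movingAverage: 10 },  // below
    { close: 11, movingAverage: 10 }, // above again: starts a third group
];

const priceRuns = groupSequential(tradingDays, day => day.close > day.movingAverage);
const runLengths = priceRuns.map(run => run.length);
```

    A non-sequential grouping of the same rows by the same key would give two groups (above and below) instead of three runs.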

hasSeries

  • hasSeries(columnName: string): boolean
  • Determine if the dataframe contains a Series with the specified column name.

    example
    if (df.hasSeries("SomeColumn")) {
         // ... the dataframe contains a series with the specified column name ...
    }
    

    Parameters

    • columnName: string

      Name of the column to check for.

    Returns boolean

    Returns true if the dataframe contains the requested Series, otherwise returns false.

head

  • head(numValues: number): IDataFrame<IndexT, ValueT>
  • Get X rows from the start of the dataframe. Pass in a negative value to get all rows at the head except for X rows at the tail.

    example
    const sample = df.head(10); // Take a sample of 10 rows from the start of the dataframe.
    

    Parameters

    • numValues: number

      Number of rows to take.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe that has only the specified number of rows taken from the start of the original dataframe.
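
    The positive and negative cases described above map neatly onto JavaScript's Array.slice, which makes for a compact illustration on a plain array. `headRows` is a hypothetical helper, not the library's implementation.

```typescript
// Hypothetical stand-in: slice(0, n) takes n rows from the start when n is positive,
// and drops |n| rows from the end when n is negative.
function headRows<RowT>(rows: RowT[], numValues: number): RowT[] {
    return rows.slice(0, numValues);
}

const sampleRows = [1, 2, 3, 4, 5];

const firstTwo = headRows(sampleRows, 2);
const allButLastTwo = headRows(sampleRows, -2);
const moreThanAvailable = headRows(sampleRows, 10);
```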

inflateSeries

  • Inflate a named Series in the dataframe to 1 or more new series in the new dataframe.

    This is the equivalent of extracting the series using getSeries, transforming them with Series.select and then running Series.inflate to create a new dataframe, then merging each column of the new dataframe into the original dataframe using withSeries.

    example
    function newColumnGenerator (row) {
         const newColumns = {
             // ... create 1 field per new column ...
         };
    
         return newColumns;
    }
    
    const dfWithNewSeries = df.inflateSeries("SomeColumn", newColumnGenerator);
    

    Type parameters

    • NewValueT

    Parameters

    • columnName: string

      Name of the series to inflate.

    • Optional selector: SelectorWithIndexFn<IndexT, any>

      Optional selector function that transforms each value in the column to new columns. If not specified it is expected that each value in the column is an object whose fields define the new column names.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with a column inflated to 1 or more new columns.

insertPair

  • insertPair(pair: [IndexT, ValueT]): IDataFrame<IndexT, ValueT>
  • Insert a pair at the start of the dataframe. Doesn't modify the original dataframe! The returned dataframe is entirely new and contains rows from the original dataframe plus the inserted pair.

    example
    const newIndex = ... index of the new row ...
    const newRow = ... the new data row to insert ...
    const insertedDf = df.insertPair([newIndex, newRow]);
    

    Parameters

    • pair: [IndexT, ValueT]

      The index/value pair to insert.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with the specified pair inserted.

intersection

  • intersection<InnerIndexT, InnerValueT, KeyT>(inner: IDataFrame<InnerIndexT, InnerValueT>, outerSelector?: SelectorFn<ValueT, KeyT>, innerSelector?: SelectorFn<InnerValueT, KeyT>): IDataFrame<IndexT, ValueT>
  • Creates a new dataframe by merging two input dataframes. The resulting dataframe contains the intersection of rows from the two input dataframes. These are only the rows that appear in both dataframes.

    example
    const dfA = ...
    const dfB = ...
    const mergedDf = dfA.intersection(dfB);
    
    example
    // Merge two sets of customer records to find only the
    // customers that appear in both.
    const customerRecordsA = ...
    const customerRecordsB = ...
    const intersectionOfCustomerRecords = customerRecordsA.intersection(
         customerRecordsB,
         customerRecord => customerRecord.CustomerId
    );
    

    Type parameters

    • InnerIndexT

    • InnerValueT

    • KeyT

    Parameters

    • inner: IDataFrame<InnerIndexT, InnerValueT>

      The inner dataframe to merge (the dataframe you call the function on is the 'outer' dataframe).

    • Optional outerSelector: SelectorFn<ValueT, KeyT>

      Optional user-defined selector function that selects the key from the outer dataframe that is used to match the two dataframes.

    • Optional innerSelector: SelectorFn<InnerValueT, KeyT>

      Optional user-defined selector function that selects the key from the inner dataframe that is used to match the two dataframes.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe that contains the intersection of rows from the two input dataframes.

join

  • join<KeyT, InnerIndexT, InnerValueT, ResultValueT>(inner: IDataFrame<InnerIndexT, InnerValueT>, outerKeySelector: SelectorFn<ValueT, KeyT>, innerKeySelector: SelectorFn<InnerValueT, KeyT>, resultSelector: JoinFn<ValueT, InnerValueT, ResultValueT>): IDataFrame<number, ResultValueT>
  • Creates a new dataframe by merging two input dataframes. The resulting dataframe contains only those rows that have matching keys in both input dataframes.

    example
    // Join together two sets of customers to find those
    // that have bought both product A and product B.
    const customerWhoBoughtProductA = ...
    const customerWhoBoughtProductB = ...
    const customersWhoBoughtBothProductsDf = customerWhoBoughtProductA.join(
             customerWhoBoughtProductB,
             customerA => customerA.CustomerId, // Join key.
             customerB => customerB.CustomerId, // Join key.
             (customerA, customerB) => {
                 return {
                     // ... merge the results ...
                 };
             }
         );
    

    Type parameters

    • KeyT

    • InnerIndexT

    • InnerValueT

    • ResultValueT

    Parameters

    • inner: IDataFrame<InnerIndexT, InnerValueT>

      The 'inner' dataframe to join (the dataframe you are calling the function on is the 'outer' dataframe).

    • outerKeySelector: SelectorFn<ValueT, KeyT>

      User-defined selector function that chooses the join key from the outer dataframe.

    • innerKeySelector: SelectorFn<InnerValueT, KeyT>

      User-defined selector function that chooses the join key from the inner dataframe.

    • resultSelector: JoinFn<ValueT, InnerValueT, ResultValueT>

      User-defined function that merges outer and inner values.

    Returns IDataFrame<number, ResultValueT>

    Returns the new merged dataframe.
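
    The matching rule described above can be sketched with plain arrays. This is only an illustration of inner-join semantics, not the library's implementation; the helper `innerJoin` and the sample data are hypothetical.

```typescript
// Hypothetical helper, not part of the library: a nested-loop inner join.
function innerJoin<OuterT, InnerT, KeyT, ResultT>(
    outer: OuterT[],
    inner: InnerT[],
    outerKeySelector: (row: OuterT) => KeyT,
    innerKeySelector: (row: InnerT) => KeyT,
    resultSelector: (outerRow: OuterT, innerRow: InnerT) => ResultT
): ResultT[] {
    const results: ResultT[] = [];
    for (const outerRow of outer) {
        for (const innerRow of inner) {
            // Emit a merged row only when the two join keys match.
            if (outerKeySelector(outerRow) === innerKeySelector(innerRow)) {
                results.push(resultSelector(outerRow, innerRow));
            }
        }
    }
    return results;
}

// Made-up sample data.
const boughtA = [
    { CustomerId: 1, QuantityA: 2 },
    { CustomerId: 2, QuantityA: 5 },
];
const boughtB = [
    { CustomerId: 2, QuantityB: 1 },
    { CustomerId: 3, QuantityB: 7 },
];

const boughtBoth = innerJoin(
    boughtA,
    boughtB,
    a => a.CustomerId, // Join key.
    b => b.CustomerId, // Join key.
    (a, b) => ({ CustomerId: a.CustomerId, QuantityA: a.QuantityA, QuantityB: b.QuantityB })
);
console.log(boughtBoth); // Only customer 2 appears in both inputs.
```

    Customers 1 and 3 are dropped because each has a key on one side only.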

joinOuter

  • joinOuter<KeyT, InnerIndexT, InnerValueT, ResultValueT>(inner: IDataFrame<InnerIndexT, InnerValueT>, outerKeySelector: SelectorFn<ValueT, KeyT>, innerKeySelector: SelectorFn<InnerValueT, KeyT>, resultSelector: JoinFn<ValueT | null, InnerValueT | null, ResultValueT>): IDataFrame<number, ResultValueT>
  • Creates a new dataframe by merging two input dataframes. The resulting dataframe contains the rows that are present in either one of the dataframes, or in both.

    example
    // Join together two sets of customers to find those
    // that have bought either product A or product B, or both.
    const customerWhoBoughtProductA = ...
    const customerWhoBoughtProductB = ...
    const customersWhoBoughtEitherProductDf = customerWhoBoughtProductA.joinOuter(
             customerWhoBoughtProductB,
             customerA => customerA.CustomerId, // Join key.
             customerB => customerB.CustomerId, // Join key.
             (customerA, customerB) => {
                 return {
                     // ... merge the results ...
                 };
             }
         );
    

    Type parameters

    • KeyT

    • InnerIndexT

    • InnerValueT

    • ResultValueT

    Parameters

    • inner: IDataFrame<InnerIndexT, InnerValueT>

      The 'inner' dataframe to join (the dataframe you are calling the function on is the 'outer' dataframe).

    • outerKeySelector: SelectorFn<ValueT, KeyT>

      User-defined selector function that chooses the join key from the outer dataframe.

    • innerKeySelector: SelectorFn<InnerValueT, KeyT>

      User-defined selector function that chooses the join key from the inner dataframe.

    • resultSelector: JoinFn<ValueT | null, InnerValueT | null, ResultValueT>

      User-defined function that merges outer and inner values.

      Implementation from here:

      http://blogs.geniuscode.net/RyanDHatch/?p=116

    Returns IDataFrame<number, ResultValueT>

    Returns the new merged dataframe.
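
    Because the result selector is typed to accept null on either side, it has to guard every field access. The sketch below illustrates this with plain arrays. It is not the library's implementation: the helper `outerJoin` and the sample data are hypothetical, and the row order the library produces may differ.

```typescript
// Hypothetical helper, not part of the library: a full outer join that
// passes null for whichever side has no matching row.
function outerJoin<OuterT, InnerT, KeyT, ResultT>(
    outer: OuterT[],
    inner: InnerT[],
    outerKeySelector: (row: OuterT) => KeyT,
    innerKeySelector: (row: InnerT) => KeyT,
    resultSelector: (outerRow: OuterT | null, innerRow: InnerT | null) => ResultT
): ResultT[] {
    const results: ResultT[] = [];
    for (const outerRow of outer) {
        const key = outerKeySelector(outerRow);
        const matches = inner.filter(innerRow => innerKeySelector(innerRow) === key);
        if (matches.length === 0) {
            results.push(resultSelector(outerRow, null)); // Present only in outer.
        }
        for (const innerRow of matches) {
            results.push(resultSelector(outerRow, innerRow)); // Present in both.
        }
    }
    for (const innerRow of inner) {
        const key = innerKeySelector(innerRow);
        if (!outer.some(outerRow => outerKeySelector(outerRow) === key)) {
            results.push(resultSelector(null, innerRow)); // Present only in inner.
        }
    }
    return results;
}

interface PurchaseA { CustomerId: number; QuantityA: number; }
interface PurchaseB { CustomerId: number; QuantityB: number; }

// Made-up sample data.
const purchasesA: PurchaseA[] = [{ CustomerId: 1, QuantityA: 2 }, { CustomerId: 2, QuantityA: 5 }];
const purchasesB: PurchaseB[] = [{ CustomerId: 2, QuantityB: 1 }, { CustomerId: 3, QuantityB: 7 }];

const boughtEither = outerJoin(
    purchasesA,
    purchasesB,
    a => a.CustomerId,
    b => b.CustomerId,
    (a, b) => ({
        // Either side may be null, so guard every field access.
        CustomerId: a !== null ? a.CustomerId : b !== null ? b.CustomerId : -1,
        QuantityA: a !== null ? a.QuantityA : 0,
        QuantityB: b !== null ? b.QuantityB : 0,
    })
);
console.log(boughtEither);
```

    A left outer join is this sketch without the second loop. A right outer join keeps the second loop and drops the outer-only branch of the first.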

joinOuterLeft

  • joinOuterLeft<KeyT, InnerIndexT, InnerValueT, ResultValueT>(inner: IDataFrame<InnerIndexT, InnerValueT>, outerKeySelector: SelectorFn<ValueT, KeyT>, innerKeySelector: SelectorFn<InnerValueT, KeyT>, resultSelector: JoinFn<ValueT | null, InnerValueT | null, ResultValueT>): IDataFrame<number, ResultValueT>
  • Creates a new dataframe by merging two input dataframes. The resulting dataframe contains only those rows that are present either in both dataframes or only in the outer (left) dataframe.

    example
    // Join together two sets of customers to find those
    // that have bought either just product A or both product A and product B.
    const customerWhoBoughtProductA = ...
    const customerWhoBoughtProductB = ...
    const boughtJustAorAandB = customerWhoBoughtProductA.joinOuterLeft(
             customerWhoBoughtProductB,
             customerA => customerA.CustomerId, // Join key.
             customerB => customerB.CustomerId, // Join key.
             (customerA, customerB) => {
                 return {
                     // ... merge the results ...
                 };
             }
         );
    

    Type parameters

    • KeyT

    • InnerIndexT

    • InnerValueT

    • ResultValueT

    Parameters

    • inner: IDataFrame<InnerIndexT, InnerValueT>

      The 'inner' dataframe to join (the dataframe you are calling the function on is the 'outer' dataframe).

    • outerKeySelector: SelectorFn<ValueT, KeyT>

      User-defined selector function that chooses the join key from the outer dataframe.

    • innerKeySelector: SelectorFn<InnerValueT, KeyT>

      User-defined selector function that chooses the join key from the inner dataframe.

    • resultSelector: JoinFn<ValueT | null, InnerValueT | null, ResultValueT>

      User-defined function that merges outer and inner values.

      Implementation from here:

      http://blogs.geniuscode.net/RyanDHatch/?p=116

    Returns IDataFrame<number, ResultValueT>

    Returns the new merged dataframe.

joinOuterRight

  • joinOuterRight<KeyT, InnerIndexT, InnerValueT, ResultValueT>(inner: IDataFrame<InnerIndexT, InnerValueT>, outerKeySelector: SelectorFn<ValueT, KeyT>, innerKeySelector: SelectorFn<InnerValueT, KeyT>, resultSelector: JoinFn<ValueT | null, InnerValueT | null, ResultValueT>): IDataFrame<number, ResultValueT>
  • Creates a new dataframe by merging two input dataframes. The resulting dataframe contains only those rows that are present either in both dataframes or only in the inner (right) dataframe.

    example
    // Join together two sets of customers to find those
    // that have bought either just product B or both product A and product B.
    const customerWhoBoughtProductA = ...
    const customerWhoBoughtProductB = ...
    const boughtJustBorAandB = customerWhoBoughtProductA.joinOuterRight(
             customerWhoBoughtProductB,
             customerA => customerA.CustomerId, // Join key.
             customerB => customerB.CustomerId, // Join key.
             (customerA, customerB) => {
                 return {
                     // ... merge the results ...
                 };
             }
         );
    

    Type parameters

    • KeyT

    • InnerIndexT

    • InnerValueT

    • ResultValueT

    Parameters

    • inner: IDataFrame<InnerIndexT, InnerValueT>

      The 'inner' dataframe to join (the dataframe you are calling the function on is the 'outer' dataframe).

    • outerKeySelector: SelectorFn<ValueT, KeyT>

      User-defined selector function that chooses the join key from the outer dataframe.

    • innerKeySelector: SelectorFn<InnerValueT, KeyT>

      User-defined selector function that chooses the join key from the inner dataframe.

    • resultSelector: JoinFn<ValueT | null, InnerValueT | null, ResultValueT>

      User-defined function that merges outer and inner values.

      Implementation from here:

      http://blogs.geniuscode.net/RyanDHatch/?p=116

    Returns IDataFrame<number, ResultValueT>

    Returns the new merged dataframe.

last

  • last(): ValueT
  • Get the last row of the dataframe.

    example
    const lastRow = df.last();
    

    Returns ValueT

    Returns the last row of the dataframe.

map

  • Transforms an input dataframe, generating a new dataframe. The transformer function is called for each element of the input and the collection of outputs creates the generated dataframe.

    This is the same concept as the JavaScript function Array.map but maps over a dataframe rather than an array.

    example
    function transformer (input) {
         const output = {
             // ... construct output from input ...
         };
    
         return output;
    }
    
    const transformed = dataframe.map(transformer);
    console.log(transformed.toString());
    

    Type parameters

    • ToT

    Parameters

    • transformer: SelectorWithIndexFn<ValueT, ToT>

      A user-defined transformer function that transforms each element from the input to generate the output.

    Returns IDataFrame<IndexT, ToT>

    Returns a new dataframe generated by calling the transformer function over each element of the input.

melt

  • melt<NewValueT>(idColumnOrColumns: string | Iterable<string>, valueColumnOrColumns: string | Iterable<string>): IDataFrame<IndexT, NewValueT>
  • Unpivot a DataFrame from wide to long format, optionally leaving identifiers set. This is a powerful function that combines grouping, aggregation and sorting.

    example
    // Use column in 'idColumnOrColumns' as the identity column.
    // The column name passed in 'valueColumnOrColumns' forms the 'variable' column
    // and the values are used to populate the 'value' column of the new dataframe.
    const moltenDf = df.melt("A", "B");
    
    example
    // Multiple value columns example.
    // Similar to the previous example except now the variable column will consist
    // of multiple values.
    const moltenDf = df.melt("A", ["B", "C"]);
    
    example
    // Multiple identity and value columns example.
    const moltenDf = df.melt(["A", "B"], ["C", "D"]);
    

    Type parameters

    • NewValueT

    Parameters

    • idColumnOrColumns: string | Iterable<string>

      Column(s) to use as identifier variables.

    • valueColumnOrColumns: string | Iterable<string>

      Column(s) to unpivot.

    Returns IDataFrame<IndexT, NewValueT>

    Returns a new dataframe that has been unpivoted based on a particular column's values.
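
    The wide-to-long reshaping described in the examples above can be sketched with plain arrays. This is an illustration only, not the library's implementation. The helper `meltRows` and the sample data are hypothetical, the output column names 'variable' and 'value' are taken from the first example, and the row order the library produces may differ.

```typescript
type MeltRow = { [column: string]: string | number };

// Hypothetical helper, not part of the library.
function meltRows(rows: MeltRow[], idColumns: string[], valueColumns: string[]): MeltRow[] {
    const molten: MeltRow[] = [];
    for (const row of rows) {
        for (const valueColumn of valueColumns) {
            const longRow: MeltRow = {};
            for (const idColumn of idColumns) {
                longRow[idColumn] = row[idColumn]; // Identifier columns are copied as-is.
            }
            longRow["variable"] = valueColumn;   // Name of the unpivoted column.
            longRow["value"] = row[valueColumn]; // Value that column held in this row.
            molten.push(longRow);
        }
    }
    return molten;
}

// Made-up sample data: one identity column (A) and two value columns (B, C).
const wideRows: MeltRow[] = [
    { A: "x", B: 1, C: 2 },
    { A: "y", B: 3, C: 4 },
];
const longRows = meltRows(wideRows, ["A"], ["B", "C"]);
console.log(longRows); // One output row per (input row, value column) pair.
```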

merge

  • merge<MergedValueT>(...otherDataFrames: IDataFrame<IndexT, any>[]): IDataFrame<IndexT, MergedValueT>
  • Merge one or more dataframes into this single dataframe. Rows are merged by index. Same-named columns in subsequent dataframes override columns in earlier dataframes.

    example
    const mergedDF = df1.merge(df2);
    
    const mergedManyDF = df1.merge(df2, df3, etc);
    

    Type parameters

    • MergedValueT

    Parameters

    • Rest ...otherDataFrames: IDataFrame<IndexT, any>[]

    Returns IDataFrame<IndexT, MergedValueT>

    The merged data frame.
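
    The override rule can be sketched with plain [index, row] pairs. This is an illustration only, not the library's implementation. The helper `mergeByIndex` and the sample data are hypothetical. Keeping rows whose index appears in only one input, and the output order, are assumptions of this sketch.

```typescript
type MergeRow = { [column: string]: string | number };
type IndexedRow = [number, MergeRow]; // [index, row] pair.

// Hypothetical helper, not part of the library.
function mergeByIndex(...frames: IndexedRow[][]): IndexedRow[] {
    const merged: IndexedRow[] = [];
    for (const frame of frames) {
        for (const [index, row] of frame) {
            const existing = merged.filter(pair => pair[0] === index)[0];
            if (existing) {
                // Same index: later columns overwrite earlier ones with the same name.
                existing[1] = { ...existing[1], ...row };
            } else {
                merged.push([index, { ...row }]);
            }
        }
    }
    return merged;
}

// Made-up sample data: both inputs have index 1 and a column named B.
const frame1: IndexedRow[] = [[0, { A: 1, B: 10 }], [1, { A: 2, B: 20 }]];
const frame2: IndexedRow[] = [[1, { B: 99, C: 5 }], [2, { C: 6 }]];

const mergedRows = mergeByIndex(frame1, frame2);
console.log(mergedRows); // At index 1, B comes from frame2.
```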

none

  • Evaluates a predicate function for every row in the dataframe to determine if some condition is true/truthy for none of the rows in the dataframe.

    If no predicate is specified then it simply checks if the dataframe contains zero rows.

    example
    const noFreds = df.none(row => row.CustomerName === "Fred"); // Do we have zero customers named Fred?
    
    example
    const noCustomers = df.none(); // Do we have zero customers?
    

    Parameters

    • Optional predicate: PredicateFn<ValueT>

      Optional predicate function that receives each row. It should return true/truthy for a match, otherwise false/falsy.

    Returns boolean

    Returns true if the predicate has returned truthy for zero rows in the dataframe, otherwise returns false. Returns false for an empty dataframe.

orderBy

  • Sorts the dataframe in ascending order by a value defined by the user-defined selector function.

    example
    // Order sales by amount from least to most.
    const orderedDf = salesDf.orderBy(sale => sale.Amount);
    

    Type parameters

    • SortT

    Parameters

    • selector: SelectorWithIndexFn<ValueT, SortT>

      User-defined selector function that selects the value to sort by.

    Returns IOrderedDataFrame<IndexT, ValueT, SortT>

    Returns a new dataframe that has been ordered according to the value chosen by the selector function.

orderByDescending

  • Sorts the dataframe in descending order by a value defined by the user-defined selector function.

    example
    // Order sales by amount from most to least
    const orderedDf = salesDf.orderByDescending(sale => sale.Amount);
    

    Type parameters

    • SortT

    Parameters

    • selector: SelectorWithIndexFn<ValueT, SortT>

      User-defined selector function that selects the value to sort by.

    Returns IOrderedDataFrame<IndexT, ValueT, SortT>

    Returns a new dataframe that has been ordered according to the value chosen by the selector function.

parseDates

  • parseDates(columnNameOrNames: string | string[], formatString?: undefined | string): IDataFrame<IndexT, ValueT>
  • Parse a column with string values and convert it to a column with date values.

    example
    const withParsedColumn = df.parseDates("MyDateColumn");
    
    example
    const withParsedColumns = df.parseDates(["MyDateColumnA", "MyDateColumnB"]);
    

    Parameters

    • columnNameOrNames: string | string[]

      Specifies the column name or array of column names to parse.

    • Optional formatString: undefined | string

      Optional formatting string for dates.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with a particular named column parsed as dates.

parseFloats

  • parseFloats(columnNameOrNames: string | string[]): IDataFrame<IndexT, ValueT>
  • Parse a column with string values and convert it to a column with float values.

    example
    const withParsedColumn = df.parseFloats("MyFloatColumn");
    
    example
    const withParsedColumns = df.parseFloats(["MyFloatColumnA", "MyFloatColumnB"]);
    

    Parameters

    • columnNameOrNames: string | string[]

      Specifies the column name or array of column names to parse.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with a particular named column parsed as floats.

parseInts

  • parseInts(columnNameOrNames: string | string[]): IDataFrame<IndexT, ValueT>
  • Parse a column with string values and convert it to a column with int values.

    example
    const withParsedColumn = df.parseInts("MyIntColumn");
    
    example
    const withParsedColumns = df.parseInts(["MyIntColumnA", "MyIntColumnB"]);
    

    Parameters

    • columnNameOrNames: string | string[]

      Specifies the column name or array of column names to parse.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with a particular named column parsed as ints.

pivot

  • pivot<NewValueT>(columnOrColumns: string | Iterable<string>, valueColumnNameOrSpec: string | IMultiColumnAggregatorSpec, aggregator?: undefined | function): IDataFrame<number, NewValueT>
  • Reshape (or pivot) a dataframe based on column values. This is a powerful function that combines grouping, aggregation and sorting.

    example
    // Simplest example.
    // Group by the values in 'PivotColumn'.
    // The column 'ValueColumn' is aggregated for each group and this becomes the
    // values in the output column.
    const pivottedDf = df.pivot("PivotColumn", "ValueColumn", values => values.average());
    
    example
    // Multiple input column example.
    // Similar to the previous example except now we are aggregating multiple input columns.
    // Each group has the average computed for 'ValueColumnA' and the sum for 'ValueColumnB'.
    const pivottedDf = df.pivot("PivotColumn", {
         ValueColumnA: aValues => aValues.average(),
         ValueColumnB:  bValues => bValues.sum(),
    });
    
    example
    // Multiple output column example.
    // Similar to the previous example except now we are aggregating multiple outputs for each input column.
    // This example produces an output dataframe with columns OutputColumnA, B, C and D.
    // OutputColumnA/B are the sum and average of ValueColumnA across each group as defined by PivotColumn.
    // OutputColumnC/D are the sum and average of ValueColumnB across each group as defined by PivotColumn.
    const pivottedDf = df.pivot("PivotColumn", {
         ValueColumnA: {
             OutputColumnA: aValues => aValues.sum(),
             OutputColumnB: aValues => aValues.average(),
         },
         ValueColumnB: {
             OutputColumnC: bValues => bValues.sum(),
             OutputColumnD: bValues => bValues.average(),
         },
    });
    
    example
    // Full multi-column example.
    // Similar to the previous example, but now we are pivoting on multiple columns.
    // We now group by 'PivotColumnA' and then by 'PivotColumnB', effectively creating a
    // multi-level nested group.
    const pivottedDf = df.pivot(["PivotColumnA", "PivotColumnB" ], {
         ValueColumnA: aValues => aValues.average(),
         ValueColumnB:  bValues => bValues.sum(),
    });
    
    example
    // To help understand the pivot function, let's expand it out and look at what it does internally.
    // Take the simplest example:
    const pivottedDf = df.pivot("PivotColumn", "ValueColumn", values => values.average());
    
    // If we expand out the internals of the pivot function, it will look something like this:
    const expandedPivottedDf = df.groupBy(row => row.PivotColumn)
             .select(group => ({
                 PivotColumn: group.first().PivotColumn,
                 ValueColumn: group.deflate(row => row.ValueColumn).average()
             }))
             .orderBy(row => row.PivotColumn);
    
    // You can see that pivoting a dataframe is the same as grouping, aggregating and sorting it.
    // Does pivoting seem simpler now?
    
    // It gets more complicated than that of course, because the pivot function supports multi-level nested
    // grouping and aggregation of multiple columns. So a full expansion of the pivot function is rather complex.
    

    Type parameters

    • NewValueT

    Parameters

    • columnOrColumns: string | Iterable<string>

      Column name whose values make the new DataFrame's columns.

    • valueColumnNameOrSpec: string | IMultiColumnAggregatorSpec

      Column name or column spec that defines the columns whose values should be aggregated.

    • Optional aggregator: undefined | function

      Optional function used to aggregate pivoted values.

    Returns IDataFrame<number, NewValueT>

    Returns a new dataframe that has been pivoted based on a particular column's values.
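
    The group, aggregate and sort steps of the simplest pivot can also be traced on concrete data with plain arrays. This is an illustration only, not the library's implementation; the helper `pivotAverage`, the `Sale` type and the sample data are hypothetical.

```typescript
interface Sale { Region: string; Amount: number; }

// Hypothetical helper, not part of the library: group rows by the pivot
// column, average the value column per group, then sort by group key.
function pivotAverage(rows: Sale[]): Sale[] {
    const groups: { [region: string]: number[] } = {};
    for (const row of rows) {
        if (!groups[row.Region]) {
            groups[row.Region] = [];
        }
        groups[row.Region].push(row.Amount);
    }
    return Object.keys(groups)
        .sort()
        .map(region => {
            const amounts = groups[region];
            const total = amounts.reduce((sum, amount) => sum + amount, 0);
            return { Region: region, Amount: total / amounts.length };
        });
}

// Made-up sample data.
const sales: Sale[] = [
    { Region: "West", Amount: 10 },
    { Region: "East", Amount: 4 },
    { Region: "West", Amount: 30 },
    { Region: "East", Amount: 6 },
];
const averageByRegion = pivotAverage(sales);
console.log(averageByRegion); // One row per region, sorted by region name.
```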

reduce

  • Reduces the values in the dataframe to a single result.

    This is the same concept as the JavaScript function Array.reduce but reduces a dataframe rather than an array.

    example
    const dailyRecords = ... daily records for the past month ...
    const totalSales = dailyRecords.reduce(
         (accumulator, row) => accumulator + row.salesAmount, // Reducer function.
         0  // Seed value, the starting value.
    );
    
    example
    const previousSales = 500; // We'll seed the reduction with this value.
    const dailyRecords = ... daily records for the past month ...
    const updatedSales = dailyRecords.reduce(
         (accumulator, row) => accumulator + row.salesAmount,
         previousSales
    );
    

    Type parameters

    • ToT

    Parameters

    • reducer: AggregateFn<ValueT, ToT>

      Function that takes the seed and then each value in the dataframe and produces the reduced value.

    • Optional seed: ToT

      Optional initial value, if not specified the first value in the dataframe is used as the initial value.

    Returns ToT

    Returns a value that has been reduced from the input dataframe by passing each element through the reducer function.
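
    Since the description compares this function to Array.reduce, the effect of the optional seed can be seen on a plain array. The sketch below uses only the standard JavaScript array method, not the dataframe itself, and the sample data is made up.

```typescript
interface DailyRecord { salesAmount: number; }

// Made-up sample data.
const dailyRecords: DailyRecord[] = [
    { salesAmount: 100 },
    { salesAmount: 250 },
    { salesAmount: 50 },
];

// With a seed: the accumulator starts at the seed value.
const totalSales = dailyRecords.reduce((accumulator, row) => accumulator + row.salesAmount, 0);

// With a non-zero seed: the reduction continues from a previous total.
const updatedSales = dailyRecords.reduce((accumulator, row) => accumulator + row.salesAmount, 500);

// Without a seed: the first element becomes the initial accumulator, so the
// accumulator has the same type as the elements. Array.reduce throws a
// TypeError when called on an empty array without a seed.
const amounts = dailyRecords.map(row => row.salesAmount);
const largestSale = amounts.reduce((accumulator, amount) => Math.max(accumulator, amount));

console.log(totalSales, updatedSales, largestSale);
```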

renameSeries

  • Create a new dataframe with 1 or more columns renamed.

    example
    const renamedDf = df.renameSeries({ OldColumnName: "NewColumnName" });
    
    example
    const renamedDf = df.renameSeries({
         Column1: "ColumnA",
         Column2: "ColumnB"
    });
    

    Type parameters

    • NewValueT

    Parameters

    • newColumnNames: IColumnRenameSpec

      A column rename spec - a JavaScript hash that maps existing column names to new column names.

    Returns IDataFrame<IndexT, NewValueT>

    Returns a new dataframe with specified columns renamed.

reorderSeries

  • reorderSeries<NewValueT>(columnNames: string[]): IDataFrame<IndexT, NewValueT>
  • Create a new dataframe with columns reordered. New column names create new columns (with undefined values), omitting existing column names causes those columns to be dropped.

    example
    const reorderedDf = df.reorderSeries(["FirstColumn", "SecondColumn", "etc"]);
    

    Type parameters

    • NewValueT

    Parameters

    • columnNames: string[]

      Specifies the new order for columns.

    Returns IDataFrame<IndexT, NewValueT>

    Returns a new dataframe with columns reordered according to the order of the array of column names that is passed in.
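
    The create-and-drop behaviour described above can be sketched on plain row objects. This is an illustration only, not the library's implementation; the helper `reorderColumns` and the sample data are hypothetical.

```typescript
type ReorderRow = { [column: string]: string | number | undefined };

// Hypothetical helper, not part of the library: rebuilds each row with
// exactly the requested columns, in the requested order.
function reorderColumns(rows: ReorderRow[], columnNames: string[]): ReorderRow[] {
    return rows.map(row => {
        const reordered: ReorderRow = {};
        for (const columnName of columnNames) {
            reordered[columnName] = row[columnName]; // undefined when the column is new.
        }
        return reordered;
    });
}

// Made-up sample data.
const originalRows: ReorderRow[] = [{ A: 1, B: 2, C: 3 }];

// "D" does not exist in the input and "B" is omitted from the list.
const reorderedRows = reorderColumns(originalRows, ["C", "A", "D"]);
console.log(Object.keys(reorderedRows[0])); // Column order now follows the list.
```

    In this sketch column D is created with undefined values and column B is dropped.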

resetIndex

  • Resets the Index of the dataframe back to the default zero-based sequential integer index.

    example
    const dfWithResetIndex = df.resetIndex();
    

    Returns IDataFrame<number, ValueT>

    Returns a new dataframe with the Index reset to the default zero-based index.

reverse

  • Gets a new dataframe in reverse order.

    example
    const reversed = df.reverse();
    

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe that is the reverse of the original.

rollingWindow

  • Partition a dataframe into a Series of rolling data windows. Each value in the new series is a rolling chunk of data from the original dataframe.

    example
    const salesDf = ... // Daily sales data.
    const rollingWeeklySales = salesDf.rollingWindow(7); // Get rolling window over weekly sales data.
    console.log(rollingWeeklySales.toString());
    

    Parameters

    • period: number

      The number of data rows to include in each data window.

    Returns ISeries<number, IDataFrame<IndexT, ValueT>>

    Returns a new series, each value of which is a rolling chunk of the original dataframe.
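
    The windowing can be sketched on a plain array. This is an illustration only, not the library's implementation. The helper `rollingWindows` and the sample data are hypothetical, and the sketch assumes that only full windows are produced.

```typescript
// Hypothetical helper, not part of the library: every window holds `period`
// consecutive rows and each window starts one row after the previous one.
function rollingWindows<T>(rows: T[], period: number): T[][] {
    const windows: T[][] = [];
    for (let start = 0; start + period <= rows.length; start++) {
        windows.push(rows.slice(start, start + period));
    }
    return windows;
}

// Made-up sample data: five days of sales.
const dailySales = [5, 7, 6, 9, 4];

const threeDayWindows = rollingWindows(dailySales, 3);
const threeDayTotals = threeDayWindows.map(
    window => window.reduce((sum, amount) => sum + amount, 0)
);
console.log(threeDayWindows); // Three overlapping windows of three rows each.
console.log(threeDayTotals);  // One aggregated value per window.
```

    Aggregating each window, as in the last step, is the usual way to turn the windows into a rolling total or average.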

round

  • round(numDecimalPlaces?: undefined | number): IDataFrame<IndexT, ValueT>
  • Produces a new dataframe with all number values rounded to the specified number of places.

    example
    const df = ... your data frame ...
    const rounded = df.round(); // Round numbers to two decimal places.
    
    example
    const df = ... your data frame ...
    const rounded = df.round(3); // Round numbers to three decimal places.
    

    Parameters

    • Optional numDecimalPlaces: undefined | number

      The number of decimal places, defaults to 2.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with all number values rounded to the specified number of places.

select

  • Transforms an input dataframe, generating a new dataframe. The transformer function is called for each element of the input and the collection of outputs creates the generated dataframe.

    select is an alias for DataFrame.map.

    This is the same concept as the JavaScript function Array.map but maps over a dataframe rather than an array.

    example
    function transformer (input) {
         const output = {
             // ... construct output from input ...
         };
    
         return output;
    }
    
    const transformed = dataframe.select(transformer);
    console.log(transformed.toString());
    

    Type parameters

    • ToT

    Parameters

    • transformer: SelectorWithIndexFn<ValueT, ToT>

      A user-defined transformer function that transforms each element from the input to generate the output.

    Returns IDataFrame<IndexT, ToT>

    Returns a new dataframe generated by calling the transformer function over each element of the input.

selectMany

  • Transforms and flattens an input dataframe, generating a new dataframe. The transformer function is called for each value in the input dataframe and produces an array that is then flattened into the generated dataframe.

    selectMany is an alias for DataFrame.flatMap.

    This is the same concept as the JavaScript function Array.flatMap but maps over a dataframe rather than an array.

    example
    function transformer (input) {
         const output = [];
         while (someCondition) {
             // ... generate zero or more outputs from a single input ...
             output.push(... some generated value ...);
         }
         return output;
    }
    
    const transformed = dataframe.selectMany(transformer);
    console.log(transformed.toString());
    

    Type parameters

    • ToT

    Parameters

    • transformer: SelectorWithIndexFn<ValueT, Iterable<ToT>>

      A user-defined function that transforms each value into an array that is then flattened into the generated dataframe.

    Returns IDataFrame<IndexT, ToT>

    Returns a new dataframe generated by calling the transformer function over each element of the input.
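
    The transform-then-flatten behaviour can be sketched on a plain array. This is an illustration only, not the library's implementation; the helper `flatMapRows`, the `Order` type and the sample data are hypothetical.

```typescript
// Hypothetical helper, not part of the library: each input row yields zero or
// more outputs, and all outputs are concatenated into a single flat list.
function flatMapRows<T, U>(rows: T[], transformer: (row: T) => U[]): U[] {
    const flattened: U[] = [];
    for (const row of rows) {
        for (const output of transformer(row)) {
            flattened.push(output);
        }
    }
    return flattened;
}

interface Order { OrderId: number; Items: string[]; }

// Made-up sample data: order 2 has no items.
const orders: Order[] = [
    { OrderId: 1, Items: ["apple", "pear"] },
    { OrderId: 2, Items: [] },
    { OrderId: 3, Items: ["plum"] },
];

// One output row per item.
const orderItems = flatMapRows(
    orders,
    order => order.Items.map(item => ({ OrderId: order.OrderId, Item: item }))
);
console.log(orderItems);
```

    Order 2 contributes no rows because its transformer output is an empty array.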

sequentialDistinct

  • Eliminates adjacent duplicate rows.

    For each group of adjacent values that are equivalent only returns the last index/row for the group, thus adjacent equivalent rows are collapsed down to the last row.

    example
    const dfWithDuplicateRowsRemoved = df.sequentialDistinct(row => row.ColumnA);
    

    Type parameters

    • ToT

    Parameters

    • Optional selector: SelectorFn<ValueT, ToT>

      Optional selector function to determine the value used to compare for equivalence.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with groups of adjacent duplicate rows collapsed to a single row per group.
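
    The collapsing of adjacent duplicates can be sketched on a plain array. This is an illustration only, not the library's implementation. The helper `collapseAdjacent` and the sample data are hypothetical, and the sketch keeps the last row of each run because that is what the description above states.

```typescript
// Hypothetical helper, not part of the library: keeps one row per run of
// adjacent rows whose selected values are equal.
function collapseAdjacent<T, K>(rows: T[], selector: (row: T) => K): T[] {
    const collapsed: T[] = [];
    for (let i = 0; i < rows.length; i++) {
        const isLastOfRun =
            i === rows.length - 1 || selector(rows[i]) !== selector(rows[i + 1]);
        if (isLastOfRun) {
            collapsed.push(rows[i]);
        }
    }
    return collapsed;
}

// Made-up sample data: the value 1 occurs in two separate runs.
const readings = [1, 1, 2, 2, 2, 1];
const collapsedReadings = collapseAdjacent(readings, value => value);
console.log(collapsedReadings);
```

    Only adjacent duplicates are removed. The final 1 is kept because it is not next to the earlier run of 1s.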

serialize

  • serialize(): any
  • Serialize the dataframe to an ordinary JavaScript data structure. The resulting data structure is suitable for further serialization to JSON and can be used to transmit a DataFrame and its internal structure over the wire. Use the deserialize function to later reconstitute the serialized dataframe.

    example
    const jsDataStructure = df.serialize();
    const jsonData = JSON.stringify(jsDataStructure);
    console.log(jsonData);
    const deserializedJsDataStructure = JSON.parse(jsonData);
    const deserializedDf = DataFrame.deserialize(deserializedJsDataStructure); // Reconstituted.
    

    Returns any

    Returns a JavaScript data structure conforming to ISerializedDataFrame that represents the dataframe and its internal structure.

setIndex

  • setIndex<NewIndexT>(columnName: string): IDataFrame<NewIndexT, ValueT>
  • Set a named column as the Index of the dataframe.

    example
    const indexedDf = df.setIndex("SomeColumn");
    

    Type parameters

    • NewIndexT

    Parameters

    • columnName: string

      Name of the column to use as the new Index of the returned dataframe.

    Returns IDataFrame<NewIndexT, ValueT>

    Returns a new dataframe with the values of the specified column as the new Index.

skip

  • skip(numValues: number): IDataFrame<IndexT, ValueT>
  • Skip a number of rows in the dataframe.

    example
    const dfWithRowsSkipped = df.skip(10); // Skip 10 rows in the original dataframe.
    

    Parameters

    • numValues: number

      Number of rows to skip.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with the specified number of rows skipped.

skipUntil

  • Skips values in the dataframe until a condition evaluates to true or truthy.

    example
    const dfWithRowsSkipped = df.skipUntil(row => row.CustomerName === "Fred"); // Skip initial customers until we find Fred.
    

    Parameters

    • predicate: PredicateFn<ValueT>

      Return true/truthy to stop skipping rows in the original dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with all initial sequential rows removed until the predicate returned true/truthy.

skipWhile

  • Skips values in the dataframe while a condition evaluates to true or truthy.

    example
    const dfWithRowsSkipped = df.skipWhile(row => row.CustomerName === "Fred"); // Skip initial customers named Fred.
    

    Parameters

    • predicate: PredicateFn<ValueT>

      Returns true/truthy to continue to skip rows in the original dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with all initial sequential rows removed while the predicate returned true/truthy.
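
    The difference between skipWhile and skipUntil can be sketched on a plain array. This is an illustration only, not the library's implementation. The helpers `skipWhileRows` and `skipUntilRows` and the sample data are hypothetical, and the sketch assumes the row that ends the skipping is itself kept.

```typescript
// Hypothetical helper, not part of the library: drops leading rows for as
// long as the predicate holds, then keeps everything that follows.
function skipWhileRows<T>(rows: T[], predicate: (row: T) => boolean): T[] {
    let start = 0;
    while (start < rows.length && predicate(rows[start])) {
        start++;
    }
    return rows.slice(start);
}

// Hypothetical helper: skipping until a condition holds is the same as
// skipping while the condition does not hold.
function skipUntilRows<T>(rows: T[], predicate: (row: T) => boolean): T[] {
    return skipWhileRows(rows, row => !predicate(row));
}

// Made-up sample data: Fred appears both at the start and later on.
const customerNames = ["Fred", "Fred", "Anna", "Fred", "Zoe"];

const afterLeadingFreds = skipWhileRows(customerNames, customer => customer === "Fred");
const fromAnnaOnwards = skipUntilRows(customerNames, customer => customer === "Anna");
console.log(afterLeadingFreds);
console.log(fromAnnaOnwards);
```

    Only the leading run is skipped. The later "Fred" is kept because skipping stops at the first row that fails the test.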

startAt

  • startAt(indexValue: IndexT): IDataFrame<IndexT, ValueT>
  • Gets a new dataframe containing all rows starting at and after the specified index value.

    example
    const df = new DataFrame({
         index: [0, 1, 2, 3], // This is the default index.
         values: [10, 20, 30, 40],
    });
    
    const lastHalf = df.startAt(2);
    expect(lastHalf.toArray()).to.eql([30, 40]);
    
    example
    const timeSeriesDf = ... a dataframe indexed by date/time ...
    
    // Get all rows starting at (or after) a particular date.
    const allRowsFromStartDate = timeSeriesDf.startAt(new Date(2016, 5, 4));
    

    Parameters

    • indexValue: IndexT

      The index value at which to start the new dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe containing all rows starting at and after the specified index value.

subset

  • subset<NewValueT>(columnNames: string[]): IDataFrame<IndexT, NewValueT>
  • Create a new dataframe with just a subset of columns.

    example
    const subsetDf = df.subset(["ColumnA", "ColumnB"]);
    

    Type parameters

    • NewValueT

    Parameters

    • columnNames: string[]

      Array of column names to include in the new dataframe.

    Returns IDataFrame<IndexT, NewValueT>

    Returns a dataframe with a subset of columns from the original dataframe.

summarize

  • Produces a summary of the dataframe.

    example
    const summary = df.summarize();
    console.log(summary);
    
    example
    const summary = df.summarize({ // Summarize using pre-defined functions.
         Column1: Series.sum,
         Column2: Series.average,
         Column3: Series.count,
    });
    console.log(summary);
    
    example
    const summary = df.summarize({ // Summarize using custom functions.
         Column1: series => series.sum(),
         Column2: series => series.std(),
         ColumnN: whateverFunctionYouWant,
    });
    console.log(summary);
    
    example
    const summary = df.summarize({ // Multiple output fields per column.
         Column1: {
             OutputField1: Series.sum,
             OutputField2: Series.average,
         },
         Column2: {
             OutputField3: series => series.sum(),
             OutputFieldN: whateverFunctionYouWant,
         },
    });
    console.log(summary);
    

    Type parameters

    • OutputValueT

    Parameters

    • Optional spec: IMultiColumnAggregatorSpec

      Optional parameter that specifies which columns to aggregate and how to aggregate them. Leave this out to produce a default summary of all columns.

    Returns OutputValueT

    An object with fields that summarize the values in the dataframe.

tail

  • tail(numValues: number): IDataFrame<IndexT, ValueT>
  • Get X rows from the end of the dataframe. Pass in a negative value to get all rows at the tail except X rows at the head.

    example
    const sample = df.tail(12); // Take a sample of 12 rows from the end of the dataframe.
    

    Parameters

    • numValues: number

      Number of rows to take.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe that has only the specified number of rows taken from the end of the original dataframe.
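
    The positive and negative cases described above can be pictured with a plain array. This is a sketch under the stated description only: tailRows is a hypothetical helper, not part of the library.

```javascript
// Hypothetical helper: models the described tail() behaviour on a plain array.
function tailRows(rows, numValues) {
    if (numValues >= 0) {
        // Positive (or zero): keep the last numValues rows.
        return rows.slice(Math.max(rows.length - numValues, 0));
    }
    // Negative: keep everything except the first |numValues| rows.
    return rows.slice(Math.abs(numValues));
}

const tailInput = [1, 2, 3, 4, 5];
console.log(tailRows(tailInput, 2));  // [ 4, 5 ]
console.log(tailRows(tailInput, -2)); // [ 3, 4, 5 ]
```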

take

  • take(numRows: number): IDataFrame<IndexT, ValueT>
  • Take a number of rows from the start of the dataframe.

    example
    const dfWithRowsTaken = df.take(15); // Take only the first 15 rows from the original dataframe.
    

    Parameters

    • numRows: number

      Number of rows to take.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with only the specified number of rows taken from the original dataframe.

takeUntil

  • Takes values from the dataframe until a condition evaluates to true or truthy.

    example
    const dfWithRowsTaken = df.takeUntil(row => row.CustomerName === "Fred"); // Take all initial customers until we find Fred.
    

    Parameters

    • predicate: PredicateFn<ValueT>

      Returns true/truthy to stop taking rows from the original dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with only the initial sequential rows taken until the predicate returned true/truthy.

takeWhile

  • Takes values from the dataframe while a condition evaluates to true or truthy.

    example
    const dfWithRowsTaken = df.takeWhile(row => row.CustomerName === "Fred"); // Take only initial customers named Fred.
    

    Parameters

    • predicate: PredicateFn<ValueT>

      Returns true/truthy to continue to take rows from the original dataframe.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with only the initial sequential rows that were taken while the predicate returned true/truthy.
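
    takeWhile and takeUntil both stop at the first row that flips the predicate, so later matching rows are never included. A plain-array sketch of that behaviour follows; takeWhileRows and takeUntilRows are hypothetical helpers, not library functions.

```javascript
// Hypothetical helper: keep leading rows while the predicate holds.
function takeWhileRows(rows, predicate) {
    const taken = [];
    for (const row of rows) {
        if (!predicate(row)) {
            break; // Stop at the first failing row; later rows are ignored.
        }
        taken.push(row);
    }
    return taken;
}

// Hypothetical helper: keep leading rows until the predicate holds.
function takeUntilRows(rows, predicate) {
    return takeWhileRows(rows, row => !predicate(row));
}

const customers = [
    { CustomerName: "Fred" },
    { CustomerName: "Fred" },
    { CustomerName: "Mary" },
    { CustomerName: "Fred" }, // Never reached by either call below.
];
const leadingFreds = takeWhileRows(customers, row => row.CustomerName === "Fred");
const beforeMary = takeUntilRows(customers, row => row.CustomerName === "Mary");
console.log(leadingFreds.length, beforeMary.length); // 2 2
```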

toArray

  • toArray(): ValueT[]
  • Extract rows from the dataframe as an array. Each element of the array is one row of the dataframe, represented as a JavaScript object whose fields are the dataframe's columns. This forces lazy evaluation to complete.

    example
    const values = df.toArray();
    

    Returns ValueT[]

    Returns an array of the rows contained within the dataframe.

toCSV

  • Serialize the dataframe to the CSV data format.

    example
    const csvData = df.toCSV();
    console.log(csvData);
    
    example
    const csvData = df.toCSV({ header: false });
    console.log(csvData);
    

    Parameters

    Returns string

    Returns a string in the CSV data format that represents the dataframe.

toHTML

  • toHTML(): string
  • Serialize the dataframe to HTML.

    Returns string

    Returns a string in HTML format that represents the dataframe.

toJSON

  • toJSON(): string
  • Serialize the dataframe to the JSON data format.

    example
    const jsonData = df.toJSON();
    console.log(jsonData);
    

    Returns string

    Returns a string in the JSON data format that represents the dataframe.

toJSON5

  • toJSON5(): string
  • Serialize the dataframe to the JSON5 data format.

    example
    const jsonData = df.toJSON5();
    console.log(jsonData);
    

    Returns string

    Returns a string in the JSON5 data format that represents the dataframe.

toObject

  • toObject<KeyT, FieldT, OutT>(keySelector: function, valueSelector: function): OutT
  • Convert the dataframe to a JavaScript object.

    example
    const someObject = df.toObject(
         row => row.SomeColumn, // Specify the column to use for field names in the output object.
         row => row.SomeOtherColumn // Specify the column to use as the value for each field.
    );
    

    Type parameters

    • KeyT

    • FieldT

    • OutT

    Parameters

    • keySelector: function

      User-defined selector function that selects keys for the resulting object.

        • (value: ValueT): KeyT
        • Parameters

          • value: ValueT

          Returns KeyT

    • valueSelector: function

      User-defined selector function that selects values for the resulting object.

        • (value: ValueT): FieldT
        • Parameters

          • value: ValueT

          Returns FieldT

    Returns OutT

    Returns a JavaScript object generated from the dataframe by applying the key and value selector functions.
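
    The effect of the two selectors can be sketched on a plain array of rows. rowsToObject is a hypothetical helper, and the sketch deliberately uses unique keys: how the library treats duplicate keys is not modelled here.

```javascript
// Hypothetical helper: builds an object from rows using key and value selectors.
function rowsToObject(rows, keySelector, valueSelector) {
    const output = {};
    for (const row of rows) {
        output[keySelector(row)] = valueSelector(row);
    }
    return output;
}

const settingRows = [
    { SomeColumn: "width", SomeOtherColumn: 640 },
    { SomeColumn: "height", SomeOtherColumn: 480 },
];
const settings = rowsToObject(
    settingRows,
    row => row.SomeColumn,      // Field names in the output object.
    row => row.SomeOtherColumn  // Field values in the output object.
);
console.log(settings); // { width: 640, height: 480 }
```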

toPairs

  • toPairs(): Object[]
  • Retrieve the index, row pairs from the dataframe as an array. Each pair is [index, row]. This forces lazy evaluation to complete.

    example
    const pairs = df.toPairs();
    

    Returns Object[]

    Returns an array of pairs that contains the dataframe's rows. Each pair is a two element array that contains an index and a row.

toRows

  • toRows(): any[][]
  • Bake the dataframe to an array of rows where each row is an array of values in column order.

    example
    const rows = df.toRows();
    

    Returns any[][]

    Returns an array of rows. Each row is an array of values in column order.
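
    The output shapes of toPairs and toRows are easy to confuse, so here is a plain-array sketch of both. The index, rows and column names below are made-up sample data, and nothing here calls the library itself.

```javascript
// Made-up sample data standing in for a dataframe's index, rows and column order.
const sampleIndex = [10, 20];
const sampleRows = [
    { A: 1, B: "x" },
    { A: 2, B: "y" },
];
const sampleColumnNames = ["A", "B"];

// toPairs-style shape: one [index, row] pair per row.
const pairs = sampleIndex.map((indexValue, i) => [indexValue, sampleRows[i]]);

// toRows-style shape: one array of values per row, in column order.
const valueRows = sampleRows.map(row => sampleColumnNames.map(name => row[name]));

console.log(pairs);     // [ [ 10, { A: 1, B: 'x' } ], [ 20, { A: 2, B: 'y' } ] ]
console.log(valueRows); // [ [ 1, 'x' ], [ 2, 'y' ] ]
```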

toString

  • toString(): string
  • Format the dataframe for display as a string. This forces lazy evaluation to complete.

    example
    console.log(df.toString());
    

    Returns string

    Generates and returns a string representation of the dataframe.

toStrings

  • toStrings(columnNames: string | string[] | IFormatSpec, formatString?: undefined | string): IDataFrame<IndexT, ValueT>
  • Convert a column of values of different types to a column of string values.

    example
    const withStringColumn = df.toStrings("MyDateColumn", "YYYY-MM-DD");
    
    example
    const withStringColumn = df.toStrings("MyFloatColumn", "0.00");
    

    Parameters

    • columnNames: string | string[] | IFormatSpec

      Specifies the column name or array of column names to convert to strings. Can also be a format spec that specifies which columns to convert and what their format should be.

    • Optional formatString: undefined | string

      Optional formatting string for dates and numbers.

      Numeral.js is used for number formatting. http://numeraljs.com/

      Moment is used for date formatting. https://momentjs.com/docs/#/parsing/string-format/

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with a particular named column converted to strings.

transformSeries

  • Transform one or more columns.

    This is equivalent to extracting a Series with getSeries, then transforming it with Series.select, and finally plugging it back in as the same column using withSeries.

    example
    const modifiedDf = df.transformSeries({
         AColumnToTransform: columnValue => transformRow(columnValue)
    });
    
    example
    const modifiedDf = df.transformSeries({
         ColumnA: columnValue => transformColumnA(columnValue),
         ColumnB: columnValue => transformColumnB(columnValue)
    });
    

    Type parameters

    • NewValueT

    Parameters

    • columnSelectors: IColumnTransformSpec

      Object with field names for each column to be transformed. Each field specifies a selector function that transforms that column.

    Returns IDataFrame<IndexT, NewValueT>

    Returns a new dataframe with 1 or more columns transformed.
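
    A column transform spec can be pictured as a per-column map over the rows. The sketch below works on a plain array of row objects: transformColumns is a hypothetical helper, and leaving rows untouched when they lack the named column is an assumption of this sketch.

```javascript
// Hypothetical helper: applies one selector per named column to every row.
function transformColumns(rows, columnSelectors) {
    return rows.map(row => {
        const output = { ...row }; // Copy so the input rows are not mutated.
        for (const columnName of Object.keys(columnSelectors)) {
            // Assumption: columns missing from a row are left alone.
            if (columnName in output) {
                output[columnName] = columnSelectors[columnName](output[columnName]);
            }
        }
        return output;
    });
}

const rawRows = [{ Name: " fred ", Age: "42" }];
const cleanedRows = transformColumns(rawRows, {
    Name: columnValue => columnValue.trim(),
    Age: columnValue => Number(columnValue),
});
console.log(cleanedRows); // [ { Name: 'fred', Age: 42 } ]
```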

truncateStrings

  • truncateStrings(maxLength: number): IDataFrame<IndexT, ValueT>
  • Produces a new dataframe with all string values truncated to the requested maximum length.

    example
    // Truncate all string columns to 100 characters maximum.
    const truncatedDf = df.truncateStrings(100);
    

    Parameters

    • maxLength: number

      The maximum length of the string values after truncation.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe with all strings truncated to the specified maximum length.

union

  • Creates a new dataframe by merging two input dataframes. The resulting dataframe contains the union of rows from the two input dataframes. These are the unique combination of rows in both dataframes. This is basically a concatenation and then elimination of duplicates.

    example
    const dfA = ...
    const dfB = ...
    const merged = dfA.union(dfB);
    
    example
    // Merge two sets of customer records that may contain the same
    // customer record in each set. This is basically a concatenation
    // of the dataframes and then an elimination of any duplicate records
    // that result.
    const customerRecordsA = ...
    const customerRecordsB = ...
    const mergedCustomerRecords = customerRecordsA.union(
         customerRecordsB,
         customerRecord => customerRecord.CustomerId
    );
    
    example
    // Note that you can achieve the exact same result as the previous
    // example by doing a DataFrame.concat followed by a DataFrame.distinct
    // that uses the same selector.
    const customerRecordsA = ...
    const customerRecordsB = ...
    const mergedCustomerRecords = customerRecordsA
         .concat(customerRecordsB)
         .distinct(customerRecord => customerRecord.CustomerId);
    

    Type parameters

    • KeyT

    Parameters

    • other: IDataFrame<IndexT, ValueT>

      The other dataframe to merge.

    • Optional selector: SelectorFn<ValueT, KeyT>

      Optional user-defined selector function that selects the value to compare to determine distinctness.

    Returns IDataFrame<IndexT, ValueT>

    Returns the union of the two dataframes.
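
    The "concatenate, then drop duplicates" description can be sketched on plain arrays. unionRows is a hypothetical helper; keeping the first occurrence of each key is an assumption of this sketch rather than documented behaviour.

```javascript
// Hypothetical helper: concatenates two row arrays and removes duplicates by key.
function unionRows(rowsA, rowsB, selector = row => row) {
    const seenKeys = new Set();
    const merged = [];
    for (const row of [...rowsA, ...rowsB]) {
        const key = selector(row);
        if (!seenKeys.has(key)) {
            // Assumption: the first row seen for a key is the one that is kept.
            seenKeys.add(key);
            merged.push(row);
        }
    }
    return merged;
}

const recordsA = [
    { CustomerId: 1, Name: "Fred" },
    { CustomerId: 2, Name: "Mary" },
];
const recordsB = [
    { CustomerId: 2, Name: "Mary" },
    { CustomerId: 3, Name: "Jo" },
];
const mergedRecords = unionRows(recordsA, recordsB, record => record.CustomerId);
console.log(mergedRecords.map(record => record.CustomerId)); // [ 1, 2, 3 ]
```

    Without a selector the sketch compares rows by identity, so two distinct objects with equal fields both survive; passing a key selector avoids that.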

variableWindow

  • Partition a dataframe into a Series of variable-length data windows where the divisions between the data chunks are defined by a user-provided comparer function.

    example
    function rowComparer (rowA, rowB) {
         if (... rowA should be in the same data window as rowB ...) {
             return true;
         }
         else {
             return false;
         }
    }
    
    const variableWindows = df.variableWindow(rowComparer);
    

    Parameters

    • comparer: ComparerFn<ValueT, ValueT>

      Function that compares two adjacent data rows and returns true if they should be in the same window.

    Returns ISeries<number, IDataFrame<IndexT, ValueT>>

    Returns a new series, each value of which is a chunk of data from the original dataframe.
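
    The comparer only ever sees two adjacent rows, and a new window starts whenever it returns false. A plain-array sketch of that grouping follows; variableWindows is a hypothetical helper and the sample rows are made up.

```javascript
// Hypothetical helper: groups adjacent rows into windows using a comparer.
function variableWindows(rows, comparer) {
    const windows = [];
    for (const row of rows) {
        const current = windows[windows.length - 1];
        if (current && comparer(current[current.length - 1], row)) {
            current.push(row);   // Same window as the previous row.
        } else {
            windows.push([row]); // Comparer said no (or first row): start a new window.
        }
    }
    return windows;
}

const statusRows = [
    { Status: "up" },
    { Status: "up" },
    { Status: "down" },
    { Status: "up" },
];
const statusWindows = variableWindows(
    statusRows,
    (rowA, rowB) => rowA.Status === rowB.Status
);
console.log(statusWindows.map(w => w.length)); // [ 2, 1, 1 ]
```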

where

  • Filter the dataframe through a user-defined predicate function.

    where is an alias for DataFrame.filter.

    This is the same concept as the JavaScript function Array.filter but filters a dataframe rather than an array.

    example
    // Filter so we only have sales figures greater than 100.
    const filtered = dataframe.where(row => row.salesFigure > 100);
    console.log(filtered.toArray());
    

    Parameters

    • predicate: PredicateFn<ValueT>

      Predicate function to filter values from the dataframe. Returns true/truthy to keep elements, or false/falsy to omit elements.

    Returns IDataFrame<IndexT, ValueT>

    Returns a new dataframe containing only the values from the original dataframe that matched the predicate.

window

  • Partition a dataframe into a Series of data windows. Each value in the new series is a chunk of data from the original dataframe.

    example
    const windows = df.window(2); // Get rows in pairs.
    const pctIncrease = windows.select(pair => (pair.last().SalesAmount - pair.first().SalesAmount) / pair.first().SalesAmount);
    console.log(pctIncrease.toString());
    
    example
    const salesDf = ... // Daily sales data.
    const weeklySales = salesDf.window(7); // Partition up into weekly data sets.
    console.log(weeklySales.toString());
    

    Parameters

    • period: number

      The number of rows to include in each data window.

    Returns ISeries<number, IDataFrame<IndexT, ValueT>>

    Returns a new series, each value of which is a chunk (data window) of the original dataframe.
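
    Fixed-size windows amount to slicing the rows into consecutive chunks of period rows. The sketch below reproduces the percentage-increase example on a plain array; windowRows is a hypothetical helper, and keeping a shorter final chunk is an assumption (the sample data divides evenly, so it does not arise here).

```javascript
// Hypothetical helper: splits rows into consecutive chunks of `period` rows.
function windowRows(rows, period) {
    const windows = [];
    for (let start = 0; start < rows.length; start += period) {
        // Assumption: a final chunk shorter than `period` is still emitted.
        windows.push(rows.slice(start, start + period));
    }
    return windows;
}

const dailySales = [
    { SalesAmount: 100 },
    { SalesAmount: 150 },
    { SalesAmount: 200 },
    { SalesAmount: 250 },
];
const salesPairs = windowRows(dailySales, 2); // Rows in pairs.
const pctIncrease = salesPairs.map(pair => {
    const first = pair[0].SalesAmount;
    const last = pair[pair.length - 1].SalesAmount;
    return (last - first) / first;
});
console.log(pctIncrease); // [ 0.5, 0.25 ]
```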

withIndex

  • withIndex<NewIndexT>(newIndex: Iterable<NewIndexT> | SelectorFn<ValueT, NewIndexT>): IDataFrame<NewIndexT, ValueT>
  • Apply a new Index to the dataframe.

    example
    const indexedDf = df.withIndex([10, 20, 30]);
    
    example
    const indexedDf = df.withIndex(df.getSeries("SomeColumn"));
    
    example
    const indexedDf = df.withIndex(row => row.SomeColumn);
    
    example
    const indexedDf = df.withIndex(row => row.SomeColumn + 20);
    

    Type parameters

    • NewIndexT

    Parameters

    • newIndex: Iterable<NewIndexT> | SelectorFn<ValueT, NewIndexT>

      The new array or iterable to be the new Index of the dataframe. Can also be a selector to choose the Index for each row in the dataframe.

    Returns IDataFrame<NewIndexT, ValueT>

    Returns a new dataframe with the specified Index attached.

withSeries

  • Create a new dataframe with a replaced or additional column specified by the passed-in series.

    example
    const modifiedDf = df.withSeries("ANewColumn", new Series([1, 2, 3]));
    
    example
    const modifiedDf = df.withSeries("ANewColumn", df =>
         df.getSeries("SourceData").select(aTransformation)
    );
    
    example
    const modifiedDf = df.withSeries({
         ANewColumn: new Series([1, 2, 3]),
         SomeOtherColumn: new Series([10, 20, 30])
    });
    
    example
    const modifiedDf = df.withSeries({
         ANewColumn: df => df.getSeries("SourceData").select(aTransformation)
    });
    

    Type parameters

    • OutputValueT

    • SeriesValueT

    Parameters

    • columnNameOrSpec: string | IColumnGenSpec

      The name of the column to add or replace, or an IColumnGenSpec that defines the columns to add.

    • Optional series: ISeries<IndexT, SeriesValueT> | SeriesSelectorFn<IndexT, ValueT, SeriesValueT>

      When columnNameOrSpec is a string that identifies the column to add, this specifies the Series to add to the dataframe or a function that produces a series (given a dataframe).

    Returns IDataFrame<IndexT, OutputValueT>

    Returns a new dataframe replacing or adding a particular named column.

zip

  • zip<Index2T, Value2T, ResultT>(s2: IDataFrame<Index2T, Value2T>, zipper: Zip2Fn<ValueT, Value2T, ResultT>): IDataFrame<IndexT, ResultT>
  • zip<Index2T, Value2T, Index3T, Value3T, ResultT>(s2: IDataFrame<Index2T, Value2T>, s3: IDataFrame<Index3T, Value3T>, zipper: Zip3Fn<ValueT, Value2T, Value3T, ResultT>): IDataFrame<IndexT, ResultT>
  • zip<Index2T, Value2T, Index3T, Value3T, Index4T, Value4T, ResultT>(s2: IDataFrame<Index2T, Value2T>, s3: IDataFrame<Index3T, Value3T>, s4: IDataFrame<Index4T, Value4T>, zipper: Zip3Fn<ValueT, Value2T, Value3T, ResultT>): IDataFrame<IndexT, ResultT>
  • zip<ResultT>(...args: any[]): IDataFrame<IndexT, ResultT>
  • Merge together multiple dataframes to create a new dataframe. Preserves the index of the first dataframe.

    example
    function produceNewRow (rowA, rowB) {
          const outputRow = {
              ValueA: rowA.Value,
              ValueB: rowB.Value,
          };
          return outputRow;
    }
    
    const dfA = new DataFrame([ { Value: 10 }, { Value: 20 }, { Value: 30 }]);
    const dfB = new DataFrame([ { Value: 100 }, { Value: 200 }, { Value: 300 }]);
    const zippedDf = dfA.zip(dfB, produceNewRow);
    

    Type parameters

    • Index2T

    • Value2T

    • ResultT

    Parameters

    • s2: IDataFrame<Index2T, Value2T>
    • zipper: Zip2Fn<ValueT, Value2T, ResultT>

      User-defined zipper function that merges rows. It produces rows for the new dataframe based on rows from the input dataframes.

    Returns IDataFrame<IndexT, ResultT>

    Returns a single dataframe merged from multiple input dataframes.
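
    The two-dataframe case can be pictured as walking both row sequences in step and handing each pair of rows to the zipper. This is a plain-array sketch only: zipRows is a hypothetical helper, and stopping at the shorter input is an assumption (the sample inputs have equal length).

```javascript
// Hypothetical helper: merges two row arrays position by position.
function zipRows(rowsA, rowsB, zipper) {
    // Assumption: stop at the end of the shorter input.
    const length = Math.min(rowsA.length, rowsB.length);
    const output = [];
    for (let i = 0; i < length; i++) {
        output.push(zipper(rowsA[i], rowsB[i]));
    }
    return output;
}

function produceNewRow(rowA, rowB) {
    return {
        ValueA: rowA.Value,
        ValueB: rowB.Value,
    };
}

const rowsA = [{ Value: 10 }, { Value: 20 }, { Value: 30 }];
const rowsB = [{ Value: 100 }, { Value: 200 }, { Value: 300 }];
const zippedRows = zipRows(rowsA, rowsB, produceNewRow);
console.log(zippedRows[0]); // { ValueA: 10, ValueB: 100 }
```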

  • Type parameters

    • Index2T

    • Value2T

    • Index3T

    • Value3T

    • ResultT

    Parameters

    Returns IDataFrame<IndexT, ResultT>

  • Type parameters

    • Index2T

    • Value2T

    • Index3T

    • Value3T

    • Index4T

    • Value4T

    • ResultT

    Parameters

    Returns IDataFrame<IndexT, ResultT>

  • Type parameters

    • ResultT

    Parameters

    • Rest ...args: any[]

    Returns IDataFrame<IndexT, ResultT>

Generated using TypeDoc