Class Dataset<T>`Abstract`

A Dataset manages the production and manipulation of tiles. Each plot has a single dataset; the dataset handles all transformations around data through batchwise operations.

Type Parameters

T extends Tile

Hierarchy

Dataset
- ArrowDataset
- QuadtileDataset

Constructors

constructor

new Dataset<T>(plot: Plot): Dataset<T>
Type Parameters
- T extends Tile<T>
Parameters
- plot: Plot
  
  The plot to which this dataset belongs.
Returns Dataset<T>
- Defined in src/Dataset.ts:58

Properties

_ix_seed

_ix_seed: number = 0

`Optional` _schema

_schema?: Schema<any>

`Private` extents

extents: Record<string, [number, number]> = {}

`Protected` plot

plot: Plot

`Abstract` promise

promise: Promise<void>

`Abstract` ready

ready: Promise<void>

`Abstract` root_tile

root_tile: T

`Optional` tileProxy

tileProxy?: TileProxy

transformations

transformations: Record<string, Transformation<T>> = {}

Accessors

`Abstract` extent

get extent(): Rectangle
Returns Rectangle
- Defined in src/Dataset.ts:46

highest_known_ix

get highest_known_ix(): number
The highest point index (ix) that deepscatter has seen so far. This is used to adjust opacity and point size.

Returns number
- Defined in src/Dataset.ts:68

table

get table(): Table<any>
Attempts to build an Arrow table from all record batches. If some batches have different transformations applied, this will throw an error.

Returns Table<any>
- Defined in src/Dataset.ts:127

Methods

add_label_identifiers

add_label_identifiers(ids: Record<string, number>, field_name: string, key_field?: string): void
Parameters
- ids: Record<string, number>
  
  A mapping from each id to look up to the value that should be set for it.
- field_name: string
  
  The name of the new field to create
- key_field: string = '_id'
  
  The column in the dataset to match them against.
Returns void
- Defined in src/Dataset.ts:416
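
As a rough illustration of the matching that these parameters describe, the sketch below builds a numeric column from an `ids` mapping by looking up each row's key field. The names `Row` and `labelColumn`, and the choice of `NaN` for rows with no match, are assumptions made for this example; they are not taken from the deepscatter source.

```typescript
// Hypothetical row shape for this sketch: string-valued key fields only.
type Row = Record<string, string>;

// For each row, look up row[keyField] in `ids` and store the mapped number.
// Rows whose key is absent from `ids` get NaN (an assumption of this sketch).
function labelColumn(
  rows: Row[],
  ids: Record<string, number>,
  keyField: string = '_id'
): Float32Array {
  const out = new Float32Array(rows.length);
  rows.forEach((row, i) => {
    const value = ids[row[keyField]];
    out[i] = value === undefined ? NaN : value;
  });
  return out;
}

const rows: Row[] = [{ _id: 'a' }, { _id: 'b' }, { _id: 'c' }];
const column = labelColumn(rows, { a: 1, c: 2 });
```

Here `column` has one entry per row, with the mapped value where the `_id` appears in the mapping.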

add_sparse_identifiers

add_sparse_identifiers(field_name: string, ids: PointUpdate): void
Parameters
- field_name: string
- ids: PointUpdate
Returns void
- Defined in src/Dataset.ts:396

add_tiled_column

add_tiled_column(field_name: string, buffer: Uint8Array): void
Parameters
- field_name: string
  
  the name of the column to create
- buffer: Uint8Array
  
  An Arrow IPC buffer that deserializes to a table with two columns, 'data' and '_tile'.
Returns void
- Defined in src/Dataset.ts:371

applyTransformationToPoint

applyTransformationToPoint(transformation: string, ix: number): Promise<StructRowProxy<any>>
Given an ix, apply a transformation to the point at that index and return the transformed point (the whole point, not just the transformed value). As a side effect, this applies the transformation to all other points in the same tile.
Parameters
- transformation: string
  
  The name of the transformation to apply
- ix: number
  
  The index of the point to transform
Returns Promise<StructRowProxy<any>>
- Defined in src/Dataset.ts:446
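
The side effect described above can be pictured with a toy model in which a transformation is computed for a whole tile at once and a single row is then read back. Everything here (`ToyTile`, `transformedPoint`, the `derived` cache) is a hypothetical stand-in for illustration, not deepscatter's implementation.

```typescript
type Point = Record<string, number>;
type PointFunction = (p: Readonly<Point>) => number;

// Hypothetical stand-in for a tile: a batch of points plus derived columns.
class ToyTile {
  points: Point[];
  derived: Record<string, Float32Array> = {};

  constructor(points: Point[]) {
    this.points = points;
  }

  // Compute the derived column for every point in the tile, at most once.
  apply(name: string, fn: PointFunction): void {
    if (this.derived[name] !== undefined) return;
    this.derived[name] = Float32Array.from(this.points, (p) => fn(p));
  }
}

// Return the whole point at `row`, with the transformed value attached.
function transformedPoint(
  tile: ToyTile,
  row: number,
  name: string,
  fn: PointFunction
): Point {
  tile.apply(name, fn); // side effect: the entire tile is transformed
  return { ...tile.points[row], [name]: tile.derived[name][row] };
}

const tile = new ToyTile([{ x: 1 }, { x: 2 }, { x: 3 }]);
const point = transformedPoint(tile, 2, 'double_x', (p) => p.x * 2);
```

After the call, `tile.derived.double_x` holds a value for all three points even though only one point was requested.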

delete_column_if_exists

delete_column_if_exists(name: string): void
Parameters
- name: string
Returns void
- Defined in src/Dataset.ts:180

domain

domain(dimension: string, max_ix?: number): [number, number]
Parameters
- dimension: string
- max_ix: number = 1e6
Returns [number, number]
- Defined in src/Dataset.ts:190

`Abstract` download_most_needed_tiles

download_most_needed_tiles(bbox: Rectangle, max_ix: number, queue_length: number): void
Parameters
- bbox: Rectangle
- max_ix: number
- queue_length: number
Returns void
- Defined in src/Dataset.ts:162

download_to_depth

download_to_depth(max_ix: number): Promise<void>
Parameters
- max_ix: number
Returns Promise<void>
- Defined in src/Dataset.ts:118

findPoint

findPoint(ix: number): StructRowProxy<any>[]
Returns
A StructRowProxy for the point with the given index.
Parameters
- ix: number
  
  The index of the point to get.
Returns StructRowProxy<any>[]
- Defined in src/Dataset.ts:465

findPointRaw

findPointRaw(ix: number): [Tile, StructRowProxy<any>, number][]
Finds the points and tiles that match the passed ix

Returns
A list of [tile, point] pairs that match the index.
Parameters
- ix: number
  
  The index of the point to get.
Returns [Tile, StructRowProxy<any>, number][]
- Defined in src/Dataset.ts:474

has_column

has_column(name: string): boolean
Returns
True if the column exists in the dataset, false otherwise.
Parameters
- name: string
  
  The name of the column to check for
Returns boolean
- Defined in src/Dataset.ts:173

map

map<U>(callback: ((tile: T) => U), after?: boolean): U[]
Map a function against all tiles. It is often useful simply to invoke Dataset.map(d => d) to get a list of all tiles in the dataset at any moment.

Returns
A list of the results of the function in an order determined by 'after.'
Type Parameters
- U
Parameters
- callback: ((tile: T) => U)
  
  A function to apply to each tile.
  - - (tile: T): U
    - Parameters
      
      tile: T
      
      Returns U
- after: boolean = false
  
  Whether to perform the function in bottom-up order
Returns U[]
- Defined in src/Dataset.ts:260

points

points(bbox: Rectangle, max_ix?: number): Generator<StructRowProxy<any>, void, unknown>
Parameters
- bbox: Rectangle
- max_ix: number = 1e99
Returns Generator<StructRowProxy<any>, void, unknown>
- Defined in src/Dataset.ts:230

register_transformation

register_transformation(name: string, pointFunction: PointFunction, prerequisites?: string[]): void
This allows creation of a new column in your chart.

A few things to be aware of: the point function may be run millions of times. For best performance, you should not wrap complicated logic in it; instead, generate any data structures outside the function.

- name: the name to identify the new column in the data.
- pointFunction: a function that runs on a single row of data. It accepts a single argument, the data point to be transformed. Technically this is a StructRowProxy on the underlying Arrow frame, but for most purposes you can treat it as a plain object. The point is read-only: you cannot change its attributes.

For example: suppose you have 'lat' and 'long' columns in your data and want to create a new set of geo coordinates for your data. You can run the following.

    {
      const scale = d3.geoMollweide().extent([-20, -20, 20, 20])
      scatterplot.register_transformation('mollweide_x', datum => {
        return scale([datum.long, datum.lat])[0]
      })
      scatterplot.register_transformation('mollweide_y', datum => {
        return scale([datum.long, datum.lat])[1]
      })
    }

Note some constraints: the scale is created outside the functions, to avoid the overhead of instantiating it every time; and the x and y coordinates are created separately with separate function calls, because it's not possible to assign to both x and y simultaneously.
Parameters
- name: string
- pointFunction: PointFunction
- prerequisites: string[] = []
Returns void
- Defined in src/Dataset.ts:102
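
The advice about building data structures outside the point function can be sketched without any external dependencies. The `register_transformation` below is a hypothetical stand-in that only records the registration, using the documented parameter order; the `category` field and the `categoryCodes` table are likewise invented for the example.

```typescript
type Datum = Readonly<Record<string, string | number>>;
type PointFunction = (datum: Datum) => number;

// Hypothetical stand-in: records registrations so the sketch is runnable.
const registry: Record<
  string,
  { fn: PointFunction; prerequisites: string[] }
> = {};

function register_transformation(
  name: string,
  pointFunction: PointFunction,
  prerequisites: string[] = []
): void {
  registry[name] = { fn: pointFunction, prerequisites };
}

// Build the lookup table once, outside the point function.
const categoryCodes = new Map<string, number>([
  ['news', 0],
  ['sports', 1],
  ['arts', 2],
]);

register_transformation('category_code', (datum) => {
  // Per point, only a cheap Map lookup runs; unknown categories map to -1.
  return categoryCodes.get(String(datum.category)) ?? -1;
});

const code = registry['category_code'].fn({ category: 'sports' });
const unknown = registry['category_code'].fn({ category: 'weather' });
```

Because `categoryCodes` is created once in the enclosing scope, the function body stays small no matter how many points it runs over.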

schema

schema(): Promise<Schema<any>>
Returns Promise<Schema<any>>
- Defined in src/Dataset.ts:357

visit

visit(callback: ((tile: T) => void), after?: boolean, filter?: ((t: T) => boolean)): void
Invoke a function on all tiles in the dataset that have been downloaded. The general architecture here is taken from the d3 quadtree functions. That's why, for example, it doesn't recurse.
Parameters
- callback: ((tile: T) => void)
  
  The function to invoke on each tile.
  - - (tile: T): void
    - Parameters
      
      tile: T
      
      Returns void
- after: boolean = false
  
  Whether to execute the visit in bottom-up order. Default false.
- filter: ((t: T) => boolean) = ...
  - - (t: T): boolean
    - Parameters
      
      t: T
      
      Returns boolean
Returns void
- Defined in src/Dataset.ts:279
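
A minimal sketch of a non-recursive traversal in this style, over a toy tile tree, is shown below. `ToyTile` and the free functions `visit` and `map` are assumptions for illustration. In particular, this sketch treats `filter` as deciding whether to descend into a tile's children, which may not match the library's behavior.

```typescript
interface ToyTile {
  key: string;
  children: ToyTile[];
}

// Iterative traversal using an explicit queue rather than recursion.
function visit(
  root: ToyTile,
  callback: (tile: ToyTile) => void,
  after: boolean = false,
  filter: (t: ToyTile) => boolean = () => true
): void {
  const queue: ToyTile[] = [root];
  const deferred: ToyTile[] = [];
  let current: ToyTile | undefined;
  while ((current = queue.shift()) !== undefined) {
    if (after) {
      deferred.push(current); // call back later, in reverse order
    } else {
      callback(current);
    }
    if (!filter(current)) continue; // assumption: filter prunes descent
    queue.push(...current.children);
  }
  // Bottom-up order: pop tiles in the reverse of the order they were seen.
  while ((current = deferred.pop()) !== undefined) {
    callback(current);
  }
}

// A map over tiles can be expressed in terms of visit.
function map<U>(
  root: ToyTile,
  callback: (tile: ToyTile) => U,
  after: boolean = false
): U[] {
  const results: U[] = [];
  visit(root, (tile) => { results.push(callback(tile)); }, after);
  return results;
}

const tree: ToyTile = {
  key: '0/0/0',
  children: [
    { key: '1/0/0', children: [{ key: '2/0/0', children: [] }] },
    { key: '1/1/0', children: [] },
  ],
};

const topDown = map(tree, (t) => t.key);
const bottomUp = map(tree, (t) => t.key, true);
const pruned: string[] = [];
visit(tree, (t) => { pruned.push(t.key); }, false, (t) => t.key !== '1/0/0');
```

With `after` set to true, every tile is reported only after all tiles below it, which is useful when a parent's result depends on its children.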

visit_full

visit_full(callback: ((tile: T) => Promise<void>), after?: boolean, starting_tile?: T, filter?: ((t: T) => boolean), updateFunction: ((tile: T, completed: any, total: any) => Promise<void>)): Promise<void>
Invoke a function on all tiles in the dataset, downloading those that aren't here yet. The general architecture here is taken from the d3 quadtree functions. That's why, for example, it doesn't recurse.
Parameters
- callback: ((tile: T) => Promise<void>)
  
  The function to invoke on each tile.
  - - (tile: T): Promise<void>
    - Parameters
      
      tile: T
      
      Returns Promise<void>
- after: boolean = false
  
  Whether to execute the visit in bottom-up order. Default false.
- starting_tile: T = null
- filter: ((t: T) => boolean) = ...
  - - (t: T): boolean
    - Parameters
      
      t: T
      
      Returns boolean
- updateFunction: ((tile: T, completed: any, total: any) => Promise<void>)
  - - (tile: T, completed: any, total: any): Promise<void>
    - Parameters
      
      tile: T
      
      completed: any
      
      total: any
      
      Returns Promise<void>
Returns Promise<void>
- Defined in src/Dataset.ts:322
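
The asynchronous variant can be sketched in the same way: each tile is loaded before the callback runs, and a progress function is called after each tile. `LazyTile`, its `load` method, and the way `completed` and `total` are counted are all assumptions made for this example, since the page does not document how progress is computed.

```typescript
// Hypothetical tile whose children only become known once it has loaded.
class LazyTile {
  key: string;
  loaded = false;
  children: LazyTile[] = [];
  private pending: LazyTile[];

  constructor(key: string, pending: LazyTile[] = []) {
    this.key = key;
    this.pending = pending;
  }

  // Simulated download; a real tile would fetch data here.
  async load(): Promise<void> {
    if (this.loaded) return;
    this.children = this.pending;
    this.loaded = true;
  }
}

async function visitFull(
  root: LazyTile,
  callback: (tile: LazyTile) => Promise<void>,
  updateFunction: (
    tile: LazyTile,
    completed: number,
    total: number
  ) => Promise<void> = async () => {}
): Promise<void> {
  const queue: LazyTile[] = [root];
  let completed = 0;
  let total = 1; // tiles discovered so far, including the root
  let current: LazyTile | undefined;
  while ((current = queue.shift()) !== undefined) {
    await current.load(); // fetch the tile if it is not here yet
    await callback(current);
    completed += 1;
    total += current.children.length;
    queue.push(...current.children);
    await updateFunction(current, completed, total);
  }
}

const root = new LazyTile('a', [new LazyTile('b'), new LazyTile('c')]);
const visited: string[] = [];
const progress: string[] = [];
const done = visitFull(
  root,
  async (t) => { visited.push(t.key); },
  async (_t, completed, total) => { progress.push(`${completed}/${total}`); }
);
```

The returned promise `done` resolves once every tile reachable from the root has been loaded and visited.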

`Static` from_arrow_table

from_arrow_table(table: Table<any>, plot: default<ArrowTile>): ArrowDataset
Generate an ArrowDataset from a single Arrow table.

Parameters
- table: Table<any>
  
  A single Arrow table
- plot: default<ArrowTile>
  
  The Scatterplot to use.
Returns ArrowDataset
- Defined in src/Dataset.ts:155

`Static` from_quadfeather

from_quadfeather(url: string, plot: Plot): QuadtileDataset
Parameters
- url: string
- plot: Plot
Returns QuadtileDataset
- Defined in src/Dataset.ts:135
