Tabular Data Abstract Syntax Tree format.
tdast is a specification for representing tabular data as an abstract syntax tree. It implements the unist spec.
This document may not be released. See releases for released documents.
This document defines a format for representing tabular data as an abstract syntax tree. This specification is written in the style of other syntax-tree specs, for potential incorporation into the ecosystem in the future. Development started in September 2020.
tdast extends unist, a format for syntax trees, and benefits from its ecosystem of utilities.
tdast relates to JavaScript in that it has an ecosystem of utilities for working with compliant syntax trees in JavaScript. However, tdast is not limited to JavaScript and can be used in other programming languages.
tdast relates to the unified project in that tdast syntax trees follow the unist spec and is compatible with utilities throughout its ecosystem.
tdast seeks to represent how tabular data from some source content (e.g. CSV, JSON), would be represented in a syntax tree. It implements the unist spec, meaning that the tree would track positional information of nodes that will be useful for remaining interoperable with unist utilities.
With associated utilities, tdast enables flexible ways to parse, transform, serialize tabular data in various content formats. CSV, JSON, HTML tables are immediate native candidates benefiting from tdast, and future content types can be supported by implementors.
While it is possible to perform data calculations in tdast with associated utilities, it is recommended that you work with more direct data representations of the source content for these purposes. tdast utilites to convert to/from JS objects exist for this purpose.
While tdast could be used to represent most tabular data, it is best used with simple cases of flat tabular data.
interface Parent extends UnistParent {
/** Array of child nodes */
children: Array<Row | Cell | Column>;
/** Additional data maybe attached. */
data?: Data;
}
Parent
(UnistParent
) is an abstract node containing other nodes (said to be children).
This node is not used directly in tdast and is defined as an abstract interface.
interface Literal extends UnistLiteral {
/** Primary data value of a literal node. Loosely typed to `any` for convenience. */
value: any;
/** Additional data maybe attached. */
data?: Data;
}
Literal
(UnistLiteral
) is an abstract node in containing a value.
This node is not used directly in tdast and is defined as an abstract interface.
interface Table extends Parent {
/** Table node type. */
type: 'table';
/** Can only contain Rows for children. */
children: Row[];
}
Table
(Parent
) represents the node that holds all tabular data with Row
nodes.
Table
can be used as the root of a tree, but never as a child.
interface Row extends Parent {
/** Row node type. */
type: 'row';
/** Index of Row in relation to other Rows in a Table. */
index: number;
/** Can only contain Cells or Columns for children. */
children: Array<Cell | Column>;
}
Row
(Parent
) holds literal nodes that contain data, such as Column
or Cell
nodes.
Row
contains an index
field that tracks its relative position with other rows under a Table
.
interface Cell extends Literal {
/** Cell node type. */
type: 'cell';
/** Tracks which Column the Cell belongs to */
columnIndex: number;
/** Tracks which Row the Cell belongs to */
rowIndex: number
}
Cell
(Literal
) represents the intersection of a Row
and Column
of a Table
. This intersection information is stored on the columnIndex
and rowIndex
properties.
Its primary data is stored on the value
property. Cell
can also contain attach additional data (not relevant to tdast) under the optional data
property.
interface Column extends Literal {
/** Column node type. */
type: 'column';
/** Display label of a column. */
label: string;
/** Index of Column in relation to other Columns in a Table. */
index: number;
/** Optional data type useful to determine data types of Cells matching the Column. */
dataType?: string;
}
Column
(Literal
) is usually reserved in the first Row
of a Table
.
Column
contains an index
field that tracks its relative position with other columns under a Table
. Its primary data is represented in the value
property. It should have a display label
specified as a string.
Column
can optionally specify the dataType
property, which informs the data types applied to Cell
s in a Row
.
ast
: a data structure representing source content as an abstract syntax tree.cell
: the intersection of a table row and column. A cell contains data.column
: a table column provides definitions for cells in subsequent rows. A column should define thedataType
of cells under it. The cardinality of a column is equal to the number of rows in a table.csv
: a common tabular data format that uses comma-separated values for delimiting values.dataType
: refers to the data type a table cell assumes. This is usually defined by a table column.row
: a table row contains cells. The cardinality of a row is equal to the number of columns in a table. ThedataType
of each cell should assume what is prescribed by the columns.table
: an arrangement of data in rows and columns.tdast
: represent tabular data as an abstract syntax tree with tdast.unist
: universal syntax tree that tdast is based on.
tdastscript
: utility to create tdast trees.tdast-util-from-array
: transform from JS array to tdasttdast-util-from-csv
: parse from CSV to tdast (RFC-4180 compliant)tdast-util-to-array
: transform tdast to JS array, which can be manipulated externally and read back withtdast-util-from-array
.tdast-util-to-csv
: serialize tdast to CSV (RFC-4180 compliant)tdast-util-to-hast-table
: transform tdast to a hast table nodetdast-util-to-html-table
: serialize tdast to a HTML tabletdast-util-to-json
: serialize tdast to JSON (array form)tdast-util-to-markdown-table
: serialize tdast to a markdown table
- unist: Universal Syntax Tree format
- tdastscript: utility to create tdast trees.