pub struct DataTable {
    pub table_id: TableId,
    pub col_row_id: VecDeque<RowId>,
    pub col_timelines: BTreeMap<Timeline, VecDeque<Option<i64>>>,
    pub col_entity_path: VecDeque<EntityPath>,
    pub columns: BTreeMap<ComponentName, DataCellColumn>,
}
A sparse table’s worth of data, i.e. a batch of events: a collection of DataRows.
This is the top-level layer in our data model.
Behind the scenes, a DataTable is organized in columns, where columns are represented by
sparse lists of DataCells.
Cells within a single list are likely to reference shared, contiguous heap memory.
Cloning a DataTable can be very costly depending on the contents.
§Field visibility
To facilitate destructuring (let DataTable { .. } = table), all the fields in DataTable are public.
Modifying any of these fields from outside this crate is considered undefined behavior. Use the appropriate getters and setters instead.
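As a rough illustration of what public fields make possible, the sketch below destructures a stand-in struct and moves two columns out of it without cloning. `MiniTable`, its field types, and `into_columns` are hypothetical simplifications made up for this example; they are not part of the crate.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for `DataTable`: same idea (all fields public),
/// far fewer fields and plain std types.
pub struct MiniTable {
    pub table_id: u64,
    pub col_row_id: VecDeque<u64>,
    pub col_entity_path: VecDeque<String>,
}

/// Consumes the table and returns its row IDs and entity paths.
/// Public fields let the caller move each column out without cloning.
pub fn into_columns(table: MiniTable) -> (VecDeque<u64>, VecDeque<String>) {
    // `..` ignores the fields we do not need (here: `table_id`).
    let MiniTable {
        col_row_id,
        col_entity_path,
        ..
    } = table;
    (col_row_id, col_entity_path)
}
```

Note that the sketch only reads and moves fields; it never writes to them, in line with the warning above.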
§Layout
A table is a collection of sparse rows, which are themselves collections of cells, where each cell can contain an arbitrary number of instances:
[
[[C1, C1, C1], [], [C3], [C4, C4, C4], …],
[None, [C2, C2], [], [C4], …],
[None, [C2, C2], [], None, …],
…
]
Consider this example:
let row0 = {
    let points: &[MyPoint] = &[[10.0, 10.0].into(), [20.0, 20.0].into()];
    let colors: &[_] = &[MyColor::from_rgb(128, 128, 128)];
    let labels: &[Label] = &[];
    DataRow::from_cells3(RowId::new(), "a", timepoint(1, 1), (points, colors, labels))?
};
let row1 = {
    let colors: &[MyColor] = &[];
    DataRow::from_cells1(RowId::new(), "b", timepoint(1, 2), colors)?
};
let row2 = {
    let colors: &[_] = &[MyColor::from_rgb(255, 255, 255)];
    let labels: &[_] = &[Label("hey".into())];
    DataRow::from_cells2(RowId::new(), "c", timepoint(2, 1), (colors, labels))?
};
let table = DataTable::from_rows(table_id, [row0, row1, row2]);
A table has no arrow representation nor datatype of its own, as it is merely a collection of independent rows.
The table above translates to the following, where each column is contiguous in memory:
┌──────────┬───────────────────────────────┬──────────────────────────────────┬───────────────────┬─────────────┬──────────────────────────────────┬────────────────────────┐
│ frame_nr ┆ log_time                      ┆ rerun.row_id                     ┆ rerun.entity_path ┆             ┆ rerun.components.Point2D         ┆ rerun.components.Color │
╞══════════╪═══════════════════════════════╪══════════════════════════════════╪═══════════════════╪═════════════╪══════════════════════════════════╪════════════════════════╡
│ 1        ┆ 2023-04-05 09:36:47.188796402 ┆ 1753004ACBF5D6E651F2983C3DAF260C ┆ a                 ┆ []          ┆ [{x: 10, y: 10}, {x: 20, y: 20}] ┆ [2155905279]           │
├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 1        ┆ 2023-04-05 09:36:47.188852222 ┆ 1753004ACBF5D6E651F2983C3DAF260C ┆ b                 ┆ -           ┆ -                                ┆ []                     │
├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2        ┆ 2023-04-05 09:36:47.188855872 ┆ 1753004ACBF5D6E651F2983C3DAF260C ┆ c                 ┆ [hey]       ┆ -                                ┆ [4294967295]           │
└──────────┴───────────────────────────────┴──────────────────────────────────┴───────────────────┴─────────────┴──────────────────────────────────┴────────────────────────┘
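The row-to-column translation shown above can be sketched with std types only. Everything in this block is a stand-in invented for illustration: `Cell`, `Row`, and `rows_to_columns` are hypothetical names, and component data is reduced to strings. The point it tries to show is the difference between a missing cell (`None`, printed as `-` in the table) and an empty cell (`Some(vec![])`, printed as `[]`).

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for a cell: a batch of instances, kept as strings.
type Cell = Vec<String>;
/// Hypothetical stand-in for a row: component name -> cell.
type Row = BTreeMap<String, Cell>;

/// Transposes rows into one sparse column per component.
/// `None` means "this row has no cell for that component", which is
/// different from `Some(vec![])`, an empty cell.
pub fn rows_to_columns(rows: &[Row]) -> BTreeMap<String, Vec<Option<Cell>>> {
    // Collect every component name that appears in at least one row.
    let names: BTreeSet<&String> = rows.iter().flat_map(|row| row.keys()).collect();

    names
        .into_iter()
        .map(|name| {
            // One entry per row, in row order.
            let column: Vec<Option<Cell>> =
                rows.iter().map(|row| row.get(name).cloned()).collect();
            (name.clone(), column)
        })
        .collect()
}
```

Every resulting column has exactly one entry per row, which is what lets the rows be reassembled later by index.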
§Example
let row0 = {
    let points: &[MyPoint] = &[MyPoint { x: 10.0, y: 10.0 }, MyPoint { x: 20.0, y: 20.0 }];
    let colors: &[_] = &[MyColor(0xff7f7f7f)];
    let labels: &[MyLabel] = &[];
    DataRow::from_cells3(
        RowId::new(),
        "a",
        timepoint(1, 1),
        (points, colors, labels),
    ).unwrap()
};
let row1 = {
    let colors: &[MyColor] = &[];
    DataRow::from_cells1(RowId::new(), "b", timepoint(1, 2), colors).unwrap()
};
let row2 = {
    let colors: &[_] = &[MyColor(0xff7f7f7f)];
    let labels: &[_] = &[MyLabel("hey".into())];
    DataRow::from_cells2(
        RowId::new(),
        "c",
        timepoint(2, 1),
        (colors, labels),
    ).unwrap()
};
let table_in = DataTable::from_rows(table_id, [row0, row1, row2]);
eprintln!("Table in:\n{table_in}");
let (schema, columns) = table_in.serialize().unwrap();
// eprintln!("{schema:#?}");
eprintln!("Wired chunk:\n{columns:#?}");
let table_out = DataTable::deserialize(table_id, &schema, &columns).unwrap();
eprintln!("Table out:\n{table_out}");
Fields§
table_id: TableId
Auto-generated TUID, uniquely identifying this batch of data and keeping track of the client’s wall-clock.
col_row_id: VecDeque<RowId>
The entire column of RowIds.
Keeps track of the unique identifier for each row that was generated by the clients.
col_timelines: BTreeMap<Timeline, VecDeque<Option<i64>>>
All the rows for all the time columns.
The times are optional since not all rows are guaranteed to have a timestamp for every single timeline (though it is highly likely to be the case in practice).
col_entity_path: VecDeque<EntityPath>
The entire column of EntityPaths.
The entity each row relates to, respectively.
columns: BTreeMap<ComponentName, DataCellColumn>
All the rows for all the component columns.
The cells are optional since not all rows will have data for every single component (i.e. the table is sparse).
Implementations§
impl DataTable
pub fn num_rows(&self) -> u32
pub fn to_rows(&self) -> impl ExactSizeIterator
Fails if any row has two or more cells that share the same component type.
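The failure condition can be pictured as a uniqueness check over the component names of a single row. The sketch below is an assumption about the shape of such a check, not the crate's implementation: `first_duplicate_component` is a hypothetical helper, and component types are reduced to string names.

```rust
use std::collections::BTreeSet;

/// Returns the first component name that appears more than once in a row,
/// or `None` if all cells have distinct component types.
pub fn first_duplicate_component<'a>(component_names: &[&'a str]) -> Option<&'a str> {
    let mut seen = BTreeSet::new();
    // `insert` returns `false` when the value was already present.
    component_names
        .iter()
        .copied()
        .find(|name| !seen.insert(*name))
}
```

A row with components `["Color", "Label", "Color"]` would be rejected under this check, while `["Color", "Label"]` would pass.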
pub fn timepoint_max(&self) -> TimePoint
Computes the maximum value for each and every timeline present across this entire table,
and returns the corresponding TimePoint.
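A minimal sketch of this per-timeline maximum, using std types in place of `Timeline` and `TimePoint`: timelines are plain `String` keys here, and the result is a map rather than a real `TimePoint`. Leaving out timelines that have no time in any row is an assumption of this sketch; the page does not say how the real method treats them.

```rust
use std::collections::{BTreeMap, VecDeque};

/// Computes, for each timeline, the maximum time found in its column.
/// Timelines whose column holds no time at all are left out of the result
/// (an assumption of this sketch).
pub fn timepoint_max(
    col_timelines: &BTreeMap<String, VecDeque<Option<i64>>>,
) -> BTreeMap<String, i64> {
    col_timelines
        .iter()
        .filter_map(|(timeline, times)| {
            // `flatten` skips the rows that have no time on this timeline.
            let max = times.iter().flatten().max()?;
            Some((timeline.clone(), *max))
        })
        .collect()
}
```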
pub fn compute_all_size_bytes(&mut self)
Compute and cache the total (heap) allocated size of each individual underlying
DataCell.
This does nothing for cells whose size has already been computed and cached before.
Beware: this is very costly!
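The compute-once-then-cache behavior described above can be sketched as follows. `CachedCell` is a hypothetical stand-in for `DataCell`, and the size formula (length times element size) is a deliberate simplification; the real cell type and its size accounting are not shown on this page.

```rust
/// Hypothetical stand-in for a cell that caches its own heap size.
pub struct CachedCell {
    pub values: Vec<u64>,
    /// `None` until the size has been computed once.
    pub size_bytes: Option<u64>,
}

impl CachedCell {
    /// Computes and caches the heap size; does nothing if already cached.
    /// Returns `true` if the size had to be computed.
    pub fn compute_size_bytes(&mut self) -> bool {
        if self.size_bytes.is_some() {
            return false;
        }
        // Simplification: counts used elements only, not spare capacity.
        let size = self.values.len() * std::mem::size_of::<u64>();
        self.size_bytes = Some(size as u64);
        true
    }
}

/// Walks every cell of every column; returns how many sizes were computed.
pub fn compute_all_size_bytes(columns: &mut [Vec<Option<CachedCell>>]) -> usize {
    columns
        .iter_mut()
        .flat_map(|column| column.iter_mut())
        .flatten() // skip missing cells
        .map(|cell| cell.compute_size_bytes())
        .filter(|computed| *computed)
        .count()
}
```

Calling `compute_all_size_bytes` a second time still visits every cell but computes nothing, which mirrors the "does nothing for cells whose size has already been computed" note.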
impl DataTable
pub fn serialize(
    &self
) -> Result<(Schema, Chunk<Box<dyn Array>>), DataTableError>
Serializes the entire table into an arrow payload and schema.
A serialized DataTable contains two kinds of columns: control & data.
- Control columns are those that drive the behavior of the storage systems. They are always present, always dense, and always deserialized upon reception by the server. Internally, time columns are (de)serialized separately from the rest of the control columns for efficiency and quality-of-life reasons, but they are control columns all the same.
- Data columns are the ones that hold component data. They are optional, potentially sparse, and never deserialized on the server-side (not by the storage systems, at least).
pub fn serialize_control_column<'a, C>(
    values: &'a VecDeque<C>
) -> Result<(Field, Box<dyn Array>), DataTableError>
Serializes a single control column: an iterable of dense arrow-like data.
pub fn serialize_primitive_column<T>(
    name: &str,
    values: &VecDeque<T>,
    datatype: Option<DataType>
) -> (Field, Box<dyn Array>)
where
    T: NativeType,
Serializes a single control column; optimized path for primitive datatypes.
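Arrow primitive arrays store their values in one contiguous buffer, whereas a `VecDeque` is a ring buffer whose contents may be split into two slices. The sketch below shows one way to bridge the two for plain numbers. It is an illustration under that assumption only: `to_contiguous` is a hypothetical helper, and nothing here claims to be how `serialize_primitive_column` is actually implemented.

```rust
use std::collections::VecDeque;

/// Copies a deque of plain numbers into one contiguous buffer.
/// A `VecDeque` is a ring buffer, so its contents may be split in two slices.
pub fn to_contiguous(values: &VecDeque<i64>) -> Vec<i64> {
    let (front, back) = values.as_slices();
    let mut buffer = Vec::with_capacity(values.len());
    // Each slice is copied in bulk; element order is preserved.
    buffer.extend_from_slice(front);
    buffer.extend_from_slice(back);
    buffer
}
```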
pub fn serialize_data_column(
    name: &str,
    column: &VecDeque<Option<DataCell>>
) -> Result<(Field, Box<dyn Array>), DataTableError>
Serializes a single data column.
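To give a feel for what serializing a sparse column of variable-length cells involves, here is a sketch of the list layout Arrow uses for such data: flattened values, per-row offsets, and a validity mask. The types are simplified stand-ins (`ListColumn` and `flatten_column` are hypothetical, offsets are `usize` rather than Arrow's integer offsets, and validity is a `Vec<bool>` rather than a bitmap). Whether `serialize_data_column` produces exactly this shape is an assumption.

```rust
/// Arrow-style list layout for a sparse column of variable-length cells.
#[derive(Debug, PartialEq)]
pub struct ListColumn {
    /// All instances of all cells, back to back.
    pub values: Vec<i64>,
    /// `offsets[i]..offsets[i + 1]` is the range of row `i` in `values`.
    pub offsets: Vec<usize>,
    /// `validity[i]` is `false` when row `i` has no cell at all.
    pub validity: Vec<bool>,
}

/// Flattens a sparse column into values, offsets, and validity.
pub fn flatten_column(column: &[Option<Vec<i64>>]) -> ListColumn {
    let mut values = Vec::new();
    let mut offsets = vec![0];
    let mut validity = Vec::with_capacity(column.len());
    for cell in column {
        if let Some(instances) = cell {
            values.extend_from_slice(instances);
        }
        // A missing cell and an empty cell both add zero values;
        // only the validity flag tells them apart.
        offsets.push(values.len());
        validity.push(cell.is_some());
    }
    ListColumn { values, offsets, validity }
}
```

For a column `[Some([7, 8, 9]), None, Some([]), Some([4])]` this yields values `[7, 8, 9, 4]`, offsets `[0, 3, 3, 3, 4]`, and validity `[true, false, true, true]`.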
pub fn serialize_primitive_deque_opt<T>(
    data: &VecDeque<Option<T>>
) -> PrimitiveArray<T>
where
    T: NativeType,
pub fn serialize_primitive_deque<T>(data: &VecDeque<T>) -> PrimitiveArray<T>
where
    T: NativeType,
impl DataTable
pub fn deserialize(
    table_id: TableId,
    schema: &Schema,
    chunk: &Chunk<Box<dyn Array>>
) -> Result<DataTable, DataTableError>
Deserializes an entire table from an arrow payload and schema.
impl DataTable
pub fn from_arrow_msg(msg: &ArrowMsg) -> Result<DataTable, DataTableError>
Deserializes the contents of an ArrowMsg into a DataTable.
pub fn to_arrow_msg(&self) -> Result<ArrowMsg, DataTableError>
Serializes the contents of a DataTable into an ArrowMsg.
Trait Implementations§
impl PartialEq for DataTable
impl SizeBytes for DataTable
fn heap_size_bytes(&self) -> u64
Returns the total size of self on the heap, in bytes.
fn total_size_bytes(&self) -> u64
Returns the total size of self in bytes, accounting for both stack and heap space.
fn stack_size_bytes(&self) -> u64
Returns the total size of self on the stack, in bytes.
impl StructuralPartialEq for DataTable
Auto Trait Implementations§
impl Freeze for DataTable
impl !RefUnwindSafe for DataTable
impl Send for DataTable
impl Sync for DataTable
impl Unpin for DataTable
impl !UnwindSafe for DataTable
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CheckedAs for T
fn checked_as<Dst>(self) -> Option<Dst>
where
    T: CheckedCast<Dst>,
impl<Src, Dst> CheckedCastFrom<Src> for Dst
where
    Src: CheckedCast<Dst>,
fn checked_cast_from(src: Src) -> Option<Dst>
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.