When you use TensorBay, you will come across two main dataset types: Datasets and Fusion Datasets. This page explains the concepts related to each.
A Dataset represents the most common kind of dataset, such as MNIST or THUCNews, and is the kind most often used in algorithm research. Its data files have no special correlation with one another, although they may come from the same sensor.
- Datasets: Common Datasets
A dataset contains a number of files, their annotations, and related information. The contents of a dataset are organized into a number of segments.
- Catalog & SubCatalog
A Catalog stores the annotation metadata of a dataset. It covers the annotations that occur across all data samples of the dataset.
A Catalog contains one or more SubCatalogs, each of which stores the metadata for exactly one annotation type.
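The relationship between a Catalog and its SubCatalogs can be sketched with plain Python dataclasses. This is a minimal illustration, not the TensorBay SDK: the class layout, the `add` method, the `categories` field, and the annotation type names `"classification"` and `"box2d"` are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SubCatalog:
    """Metadata for a single annotation type (illustrative, not the SDK class)."""

    annotation_type: str
    # Hypothetical example of metadata: the category names used by this type.
    categories: List[str] = field(default_factory=list)


@dataclass
class Catalog:
    """Holds at most one SubCatalog per annotation type."""

    subcatalogs: Dict[str, SubCatalog] = field(default_factory=dict)

    def add(self, subcatalog: SubCatalog) -> None:
        # Keying by annotation type keeps one SubCatalog per type.
        if subcatalog.annotation_type in self.subcatalogs:
            raise ValueError(
                f"duplicate annotation type: {subcatalog.annotation_type}"
            )
        self.subcatalogs[subcatalog.annotation_type] = subcatalog


catalog = Catalog()
catalog.add(SubCatalog("classification", ["cat", "dog"]))
catalog.add(SubCatalog("box2d", ["person", "car"]))
```

Keying the SubCatalogs by annotation type is one simple way to express the "one type per SubCatalog" rule; adding a second SubCatalog of the same type raises an error in this sketch.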
- Segment
A Segment is the level directly below Dataset in the TensorBay dataset structure. Each Segment contains a number of data samples. For example, all training samples can be organized into one segment.
- Data: Data Required for Algorithm Research
Data is the most basic level in the TensorBay dataset structure. A data object contains one dataset sample, annotations, and other information such as a timestamp.
The structure of a dataset looks like this:
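As a rough sketch of the three levels described above (Dataset, Segment, Data), the hierarchy can be modeled with plain Python dataclasses. These classes only mirror the concepts and are not the TensorBay SDK; the `create_segment` helper, the field names, the file paths, and the label values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Data:
    """One dataset sample plus its annotations and optional extras."""

    path: str
    annotations: Dict[str, object] = field(default_factory=dict)
    timestamp: Optional[float] = None


@dataclass
class Segment:
    """A named group of data samples, for example all training samples."""

    name: str
    data: List[Data] = field(default_factory=list)


@dataclass
class Dataset:
    """Top level: a named collection of segments."""

    name: str
    segments: Dict[str, Segment] = field(default_factory=dict)

    def create_segment(self, name: str) -> Segment:
        segment = Segment(name)
        self.segments[name] = segment
        return segment


# Hypothetical layout: one segment per split.
dataset = Dataset("MNIST")
train = dataset.create_segment("train")
train.data.append(Data("train/00001.png", {"classification": "5"}))
test = dataset.create_segment("test")
test.data.append(Data("test/00001.png", {"classification": "7"}))

# Walk the hierarchy from the dataset down to each sample.
for segment in dataset.segments.values():
    for sample in segment.data:
        print(dataset.name, segment.name, sample.path)
```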
A Fusion Dataset is composed of data with a complex structure or data collected from multiple sensors. The term usually refers to autonomous driving datasets such as nuScenes and KITTI-Tracking.
- Fusion Dataset
A fusion dataset contains data collected from multiple sensors, including information on frames, labels, sensors, and more. It may contain multiple FusionSegments.
- FusionSegment
A FusionSegment is the level directly below a fusion dataset. Each FusionSegment contains a number of Frames, which store the information from different sensors.
- Sensor
A Sensor is a device that collects the data in a FusionSegment. TensorBay currently supports four types of sensors. Each Sensor stores sensor-related information, including the sensor name, extrinsic parameters, and, for camera-type sensors only, intrinsic parameters.
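The information a Sensor stores can be sketched as follows. This is an illustrative model under stated assumptions, not the TensorBay SDK: the class and field names, the quaternion convention, and the numeric values are hypothetical, and the `kind` strings `"camera"` and `"lidar"` are placeholders rather than the list of supported sensor types.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Extrinsics:
    """Pose of the sensor relative to a reference frame (assumed layout)."""

    translation: Tuple[float, float, float]
    rotation: Tuple[float, float, float, float]  # quaternion (w, x, y, z)


@dataclass
class Intrinsics:
    """Pinhole camera parameters (assumed layout)."""

    fx: float
    fy: float
    cx: float
    cy: float


@dataclass
class Sensor:
    name: str
    kind: str  # placeholder strings such as "camera" or "lidar"
    extrinsics: Optional[Extrinsics] = None
    intrinsics: Optional[Intrinsics] = None

    def __post_init__(self) -> None:
        # Intrinsic parameters only make sense for camera-type sensors.
        if self.intrinsics is not None and self.kind != "camera":
            raise ValueError("intrinsics are only valid for camera-type sensors")


camera = Sensor(
    name="CAM_FRONT",
    kind="camera",
    extrinsics=Extrinsics((1.5, 0.0, 1.6), (1.0, 0.0, 0.0, 0.0)),
    intrinsics=Intrinsics(fx=1000.0, fy=1000.0, cx=640.0, cy=360.0),
)
lidar = Sensor(
    name="LIDAR_TOP",
    kind="lidar",
    extrinsics=Extrinsics((0.0, 0.0, 1.8), (1.0, 0.0, 0.0, 0.0)),
)
```

The check in `__post_init__` is one way to express the rule that only camera-type sensors carry intrinsic parameters.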
- Frame
A Frame is a component of a FusionSegment. It consists of the data collected at the same time by different sensors, which may be of different types.
- Data in a Fusion Dataset
Each data item in a Frame corresponds to a data file collected by one Sensor. Data in a fusion dataset is defined in the same way as data in a dataset.
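To make the Frame and FusionSegment relationship concrete, here is a small sketch in plain Python. It is not the TensorBay SDK: the classes, the `append_frame` helper, the sensor names, the file paths, and the timestamps are all hypothetical and chosen only to illustrate the structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Data:
    """One data file collected by one sensor."""

    path: str
    timestamp: float


@dataclass
class Frame:
    """Data collected at the same time, keyed by the sensor that produced it."""

    data_by_sensor: Dict[str, Data] = field(default_factory=dict)


@dataclass
class FusionSegment:
    name: str
    sensor_names: List[str] = field(default_factory=list)
    frames: List[Frame] = field(default_factory=list)

    def append_frame(self, frame: Frame) -> None:
        # Every data item in a frame must belong to a known sensor.
        unknown = set(frame.data_by_sensor) - set(self.sensor_names)
        if unknown:
            raise KeyError(f"unknown sensors: {sorted(unknown)}")
        self.frames.append(frame)


segment = FusionSegment("scene-0001", sensor_names=["CAM_FRONT", "LIDAR_TOP"])
frame = Frame(
    {
        "CAM_FRONT": Data("CAM_FRONT/0001.jpg", timestamp=0.05),
        "LIDAR_TOP": Data("LIDAR_TOP/0001.bin", timestamp=0.05),
    }
)
segment.append_frame(frame)
```

Keying each Frame by sensor name reflects the one-to-one link between a data item and the Sensor that collected it; the `Data` class here plays the same role as the data of an ordinary dataset.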