Create a Dataset by Filtering
Create a new dataset by filtering the data of an existing dataset. The data is not duplicated, which saves storage space. While filtering, you can also view data details, label distribution, and other information online with the visualization widgets.
- Select Create by Filter from the drop-down box on the TensorBay Dataset List page.
- Enter the Data List page, and select the dataset version you want to manage from the drop-down box on the left.
- Confirm the version to be managed. You can filter data by data name, segment, and annotation. The filtered data is shown in the Data List on the right.
- On the Data List page, you can also preview the data online and view the specific annotation status with the visualization on the right. For details, see the Data Visualization widget.
- View the distribution of labels and annotations on the right.
- Select Create a new dataset based on this result.
- Fill in the dataset name, select the storage location, and set the visibility (public or private). Select Create to complete the dataset creation.
- After the dataset is created, you can go to the Details page of the new dataset.
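The filtering idea in the steps above can be sketched in plain Python. This is only an illustration, not the TensorBay implementation or SDK: the `DataItem` record, `filter_items`, and `label_distribution` are hypothetical names, and the sample data is made up. The sketch shows how a filter on data name, segment, and annotation status can return indices into the existing data instead of copies, and how a label distribution can be counted over the result.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    """Hypothetical record standing in for one file in a dataset."""
    name: str
    segment: str
    labels: tuple = ()  # label categories; an empty tuple means "no annotation"


def filter_items(items, name_contains=None, segment=None, annotated=None):
    """Return the indices of matching items rather than copies of the items."""
    matches = []
    for index, item in enumerate(items):
        if name_contains is not None and name_contains not in item.name:
            continue
        if segment is not None and item.segment != segment:
            continue
        if annotated is not None and bool(item.labels) != annotated:
            continue
        matches.append(index)
    return matches


def label_distribution(items, indices):
    """Count how often each label occurs among the selected items."""
    return Counter(label for i in indices for label in items[i].labels)


# Made-up sample data for illustration only.
items = [
    DataItem("0001.png", "train", ("car", "person")),
    DataItem("0002.png", "train"),
    DataItem("0003.png", "test", ("car",)),
]

selected = filter_items(items, segment="train", annotated=True)
print(selected)
print(label_distribution(items, selected))
```

Because `selected` holds only indices, a dataset built from it could point back to the original items, which is one way to avoid duplicating data.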
For fusion and normal datasets, you can search by filters (segment name, annotation type, and whether the data is annotated) and also run a customized advanced search. To do so, provide a GitHub URL that points to a file; the file is then used to filter the data.
- On the Dataset Details page, click Manage Data and View Data to enter the Data List page.
- Click Advanced Search.
- A pop-up window for uploading the GitHub Repo Link appears. Paste the link of the file you want to upload into the window and click Search to start filtering. Note: The link must use the HTTPS protocol, which means that it must begin with “https”, and the address and revision must be separated by “:”.
- After completing the advanced search, the search results will be automatically saved in Search Records.
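The link rule in the note above can be checked before pasting a link into the window. The following is a minimal sketch under stated assumptions: the exact link layout is not spelled out here, so the code assumes the revision is whatever follows the last “:” in the link. The function name `parse_search_link` and the example URL are hypothetical.

```python
def parse_search_link(link):
    """Split a link of the assumed form 'https://<address>:<revision>'.

    Returns (address, revision) or raises ValueError if the link does not
    begin with https or has no ':'-separated revision.
    """
    if not link.startswith("https://"):
        raise ValueError("the link must begin with https")
    # Assumption: the revision follows the LAST colon in the link.
    address, _, revision = link.rpartition(":")
    # If the only colon is the one in 'https:', no revision was given.
    if address == "https" or not revision:
        raise ValueError("the address and revision must be separated by ':'")
    return address, revision


# Hypothetical example link; replace it with the link to your own file.
address, revision = parse_search_link(
    "https://github.com/example-user/example-repo/filter.py:main"
)
print(address)
print(revision)
```

A link that carries a port number (for example `https://host:8443/...`) would be misread by this simple split, so treat the check as a quick sanity test rather than a full validator.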