The transform function dynamically transforms the data object before accessing (so it is best used for data augmentation). The pre_transform function applies the transformation before saving the data objects to disk (so it is best used for heavy precomputation which needs to be done only once). The pre_filter function can manually filter out data objects before saving. Use cases may involve the restriction of data objects to a specific class.

In order to create a torch_geometric.data.InMemoryDataset, you need to implement four fundamental methods:

- raw_file_names(): A list of files in the raw_dir which needs to be found in order to skip the download.
- processed_file_names(): A list of files in the processed_dir which needs to be found in order to skip the processing.
- download(): Downloads raw data into raw_dir.
- process(): Processes raw data and saves it into the processed_dir.

You can find helpful methods to download and extract data in torch_geometric.data. The real magic happens in the body of process(). Here, we need to read and create a list of Data objects and save it into the processed_dir. Because saving a huge Python list is rather slow, we collate the list into one huge Data object via collate() before saving.
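The four methods form a caching protocol: if the raw files are already present, the download is skipped, and if the processed files are already present, the processing is skipped. The control flow can be sketched with the standard library alone; the class, file names, and contents below are illustrative stand-ins, not the real torch_geometric API.

```python
# Stdlib-only sketch of the four-method caching protocol described above.
# SketchDataset, raw.txt, and data.pkl are illustrative placeholders.
import os
import pickle


class SketchDataset:
    def __init__(self, root):
        self.raw_dir = os.path.join(root, "raw")
        self.processed_dir = os.path.join(root, "processed")
        os.makedirs(self.raw_dir, exist_ok=True)
        os.makedirs(self.processed_dir, exist_ok=True)

        # Skip the download if all raw files are already present.
        if not all(os.path.exists(os.path.join(self.raw_dir, f))
                   for f in self.raw_file_names()):
            self.download()
        # Skip the processing if all processed files are already present.
        if not all(os.path.exists(os.path.join(self.processed_dir, f))
                   for f in self.processed_file_names()):
            self.process()

        with open(os.path.join(self.processed_dir, "data.pkl"), "rb") as fh:
            self.data = pickle.load(fh)

    def raw_file_names(self):
        return ["raw.txt"]

    def processed_file_names(self):
        return ["data.pkl"]

    def download(self):
        # Stand-in for downloading raw data into raw_dir.
        with open(os.path.join(self.raw_dir, "raw.txt"), "w") as fh:
            fh.write("1 2 3")

    def process(self):
        # Read the raw file into a list of samples and save them as one
        # object -- the analogue of collating a list of Data objects
        # into a single big object before saving.
        with open(os.path.join(self.raw_dir, "raw.txt")) as fh:
            samples = [int(tok) for tok in fh.read().split()]
        with open(os.path.join(self.processed_dir, "data.pkl"), "wb") as fh:
            pickle.dump(samples, fh)
```

Instantiating the class a second time with the same root finds both the raw and the processed files on disk and skips straight to loading.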
We split up the root folder into two folders: the raw_dir, where the dataset gets downloaded to, and the processed_dir, where the processed dataset is saved. In addition, each dataset can be passed a transform, a pre_transform and a pre_filter function, which are None by default.
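The division of labor between the three hooks is about *when* they run: pre_filter and pre_transform run once before saving, while transform runs on every access. A minimal stdlib sketch, where TinyDataset and the integer items are hypothetical stand-ins rather than the torch_geometric API:

```python
# Sketch of transform / pre_transform / pre_filter timing.
# TinyDataset and the integer samples are illustrative placeholders.
class TinyDataset:
    def __init__(self, items, transform=None, pre_transform=None,
                 pre_filter=None):
        # pre_filter and pre_transform run once, before "saving to disk".
        if pre_filter is not None:
            items = [x for x in items if pre_filter(x)]
        if pre_transform is not None:
            items = [pre_transform(x) for x in items]
        self.items = items
        self.transform = transform

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        # transform runs dynamically on every access (data augmentation).
        x = self.items[idx]
        return x if self.transform is None else self.transform(x)
```

With pre_filter keeping even items, pre_transform adding one, and transform multiplying by ten, the stored items are precomputed once while the multiplication happens anew on each lookup.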
However, we give a brief introduction on what is needed to set up your own dataset. We provide two abstract classes for datasets: torch_geometric.data.Dataset and torch_geometric.data.InMemoryDataset. torch_geometric.data.InMemoryDataset inherits from torch_geometric.data.Dataset and should be used if the whole dataset fits into CPU memory. Following the torchvision convention, each dataset gets passed a root folder which indicates where the dataset should be stored.
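The root-folder convention amounts to two fixed subdirectories under root: one for downloads and one for the processed output. A small sketch, where dataset_dirs is a hypothetical helper rather than part of the library:

```python
# Sketch of the root-folder convention: downloads land in root/raw,
# the processed dataset in root/processed. dataset_dirs is hypothetical.
import os


def dataset_dirs(root):
    raw_dir = os.path.join(root, "raw")
    processed_dir = os.path.join(root, "processed")
    return raw_dir, processed_dir
```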
Although PyG already contains a lot of useful datasets, you may wish to create your own dataset with self-recorded or non-publicly available data. Implementing datasets by yourself is straightforward and you may want to take a look at the source code to find out how the various datasets are implemented.