datasets
AnomalyDataset(transform, samples, task='segmentation', valid_area_mask=None, crop_area=None)
Bases: Dataset
Anomaly Dataset.
Parameters:
- transform (alb.Compose) – Albumentations compose.
- task (str) – classification or segmentation.
- samples (DataFrame) – Pandas dataframe containing samples following the same structure created by make_anomaly_dataset.
- valid_area_mask (Optional[str]) – Optional path to the mask to use to filter out the valid area of the image. If None, the whole image is considered valid.
- crop_area (Optional[Tuple[int, int, int, int]]) – Optional tuple of 4 integers (x1, y1, x2, y2) to crop the image to the specified area. If None, the whole image is considered valid.
Source code in quadra/datasets/anomaly.py
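The crop_area convention above can be illustrated with a minimal sketch. It assumes x indexes columns, y indexes rows, and the end coordinates are exclusive; these are assumptions for illustration, and crop_to_area is a hypothetical helper, not part of quadra.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # toy stand-in for an H x W image array


def crop_to_area(image: Image, crop_area: Optional[Tuple[int, int, int, int]]) -> Image:
    """Return the sub-image selected by (x1, y1, x2, y2), or the image itself if None."""
    if crop_area is None:
        return image
    x1, y1, x2, y2 = crop_area
    # Assumption: x indexes columns, y indexes rows, end coordinates are exclusive.
    return [row[x1:x2] for row in image[y1:y2]]


# A 3 x 4 toy image whose pixel value encodes its (row, column) position.
image = [[10 * r + c for c in range(4)] for r in range(3)]
cropped = crop_to_area(image, (1, 0, 3, 2))
```

Here cropped keeps rows 0 to 1 and columns 1 to 2 of the toy image, while passing None leaves the image untouched.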
__getitem__(index)
Get the dataset item for the given index.
Parameters:
- index (int) – Index to get the item.
Returns:
- Dict[str, Union[str, Tensor]] – Dict of image tensor during training.
- Dict[str, Union[str, Tensor]] – Otherwise, Dict containing image path, target path, image tensor, label and transformed bounding box.
Source code in quadra/datasets/anomaly.py
__len__()
Get length of the dataset.
Source code in quadra/datasets/anomaly.py
ClassificationDataset(samples, targets, class_to_idx=None, resize=None, roi=None, transform=None, rgb=True, channel=3, random_padding=False, circular_crop=False)
Bases: ImageClassificationListDataset
Custom Classification Dataset.
Parameters:
- samples (List[str]) – List of paths to images.
- targets (List[Union[str, int]]) – List of targets.
- class_to_idx (Optional[Dict]) – Defaults to None.
- resize (Optional[int]) – Resize image to this size. Defaults to None.
- roi (Optional[Tuple[int, int, int, int]]) – Region of interest. Defaults to None.
- transform (Optional[Callable]) – Transform function. Defaults to None.
- rgb (bool) – Use RGB space.
- channel (int) – Number of channels. Defaults to 3.
- random_padding (bool) – Random padding. Defaults to False.
- circular_crop (bool) – Circular crop. Defaults to False.
Source code in quadra/datasets/classification.py
ImageClassificationListDataset(samples, targets, class_to_idx=None, resize=None, roi=None, transform=None, rgb=True, channel=3, allow_missing_label=False)
Bases: Dataset
Standard classification dataset.
Parameters:
- samples (List[str]) – List of paths to images to be read.
- targets (List[Union[str, int]]) – List of labels, one for every image in samples.
- class_to_idx (Optional[Dict]) – Mapping from classes to unique indexes. Defaults to None.
- resize (Optional[int]) – Integer specifying the size of a first optional resize keeping the aspect ratio: the smaller side of the image will be resized to resize, while the longer side will be resized keeping the aspect ratio. Defaults to None.
- roi (Optional[Tuple[int, int, int, int]]) – Optional ROI, with (x_upper_left, y_upper_left, x_bottom_right, y_bottom_right). Defaults to None.
- transform (Optional[Callable]) – Optional Albumentations transform. Defaults to None.
- rgb (bool) – If False, the image will be converted to grayscale.
- channel (int) – 1 or 3. If rgb is True, then channel will be set to 3.
- allow_missing_label (Optional[bool]) – If False, warn the user when the dataset contains missing labels.
Source code in quadra/datasets/classification.py
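The aspect-ratio rule described for resize can be sketched as follows. This is only an illustration of the arithmetic: aspect_ratio_size is a hypothetical helper, and rounding the longer side to the nearest integer is an assumption, not necessarily what quadra does.

```python
from typing import Tuple


def aspect_ratio_size(height: int, width: int, resize: int) -> Tuple[int, int]:
    """Return (new_height, new_width) with the smaller side equal to `resize`."""
    if height <= width:
        # Height is the smaller side: pin it to `resize` and scale the width.
        return resize, round(width * resize / height)
    # Width is the smaller side: pin it to `resize` and scale the height.
    return round(height * resize / width), resize


landscape = aspect_ratio_size(480, 640, 224)
portrait = aspect_ratio_size(640, 480, 224)
```

In both calls the 480-pixel side becomes 224 and the 640-pixel side is scaled by the same factor, so the aspect ratio is preserved up to rounding.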
MultilabelClassificationDataset(samples, targets, class_to_idx=None, transform=None, rgb=True)
Bases: torch.utils.data.Dataset
Custom MultilabelClassification Dataset.
Parameters:
- samples (List[str]) – List of paths to images.
- targets (np.ndarray) – Array of multiple targets per sample. The array must be a one-hot encoding with shape (n_samples, n_targets).
- class_to_idx (Optional[Dict]) – Defaults to None.
- transform (Optional[Callable]) – Transform function. Defaults to None.
- rgb (bool) – Use RGB space.
Source code in quadra/datasets/classification.py
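The (n_samples, n_targets) layout expected for targets can be sketched in plain Python. encode_multilabel and the class names are hypothetical and chosen only for illustration; since the dataset expects an np.ndarray, the nested list would still need converting (for example with np.asarray) before use.

```python
from typing import Dict, List


def encode_multilabel(labels_per_sample: List[List[str]], class_to_idx: Dict[str, int]) -> List[List[int]]:
    """Build one row per sample with a 1 in the column of every class it carries."""
    n_targets = len(class_to_idx)
    encoded = []
    for labels in labels_per_sample:
        row = [0] * n_targets
        for label in labels:
            row[class_to_idx[label]] = 1
        encoded.append(row)
    return encoded


# Hypothetical class names; a sample may carry one, several or no labels.
class_to_idx = {"scratch": 0, "dent": 1, "stain": 2}
targets = encode_multilabel([["scratch"], ["dent", "stain"], []], class_to_idx)
```

Each row has n_targets entries, and a sample with several labels simply has several ones in its row.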
PatchSklearnClassificationTrainDataset(data_path, samples, targets, class_to_idx=None, resize=None, transform=None, rgb=True, channel=3, balance_classes=False)
Bases: Dataset
Dataset used for patch sampling. It expects samples to be paths to h5 files containing all the information required to sample patches from images.
Parameters:
- data_path (str) – Base path to the dataset.
- samples (List[str]) – Paths to h5 files.
- targets (List[Union[str, int]]) – Labels associated with each sample.
- class_to_idx (Optional[Dict]) – Mapping between class and corresponding index.
- resize (Optional[int]) – If set, perform an aspect-ratio resize of the patch before the transformations.
- transform (Optional[Callable]) – Optional function applied to the image.
- rgb (bool) – If False, the image will be converted to grayscale.
- channel (int) – 1 or 3. If rgb is True, then channel will be set to 3.
- balance_classes (bool) – If True, the dataset will be balanced by duplicating samples of the minority class.
Source code in quadra/datasets/patch.py
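Balancing by duplication, as described for balance_classes, can be sketched as below. This is an illustrative sketch only: balance_by_duplication is a hypothetical helper, and repeating minority samples in a fixed cycle is an assumption (the library may, for instance, pick duplicates at random).

```python
from collections import defaultdict
from itertools import cycle, islice
from typing import Dict, List, Tuple


def balance_by_duplication(samples: List[str], targets: List[str]) -> Tuple[List[str], List[str]]:
    """Repeat samples of smaller classes until every class matches the largest one."""
    by_class: Dict[str, List[str]] = defaultdict(list)
    for sample, target in zip(samples, targets):
        by_class[target].append(sample)
    largest = max(len(group) for group in by_class.values())

    balanced_samples: List[str] = []
    balanced_targets: List[str] = []
    for target, group in by_class.items():
        # Cycle through the class samples so duplicates are spread evenly.
        balanced_samples.extend(islice(cycle(group), largest))
        balanced_targets.extend([target] * largest)
    return balanced_samples, balanced_targets


samples = ["a.h5", "b.h5", "c.h5", "d.h5"]
targets = ["good", "good", "good", "bad"]
balanced_samples, balanced_targets = balance_by_duplication(samples, targets)
```

After balancing, both classes contribute three entries, with the single "bad" sample repeated to match the majority class.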
SegmentationDataset(image_paths, mask_paths, batch_size=None, object_masks=None, resize=224, mask_preprocess=None, labels=None, transform=None, mask_smoothing=False, defect_transform=None)
Bases: torch.utils.data.Dataset
Custom SegmentationDataset class for loading images and masks.
Parameters:
- image_paths (List[str]) – List of paths to images.
- mask_paths (List[str]) – List of paths to masks.
- batch_size (Optional[int]) – Batch size.
- object_masks (Optional[List[Union[np.ndarray, Any]]]) – List of paths to object masks.
- resize (int) – Resize image to this size.
- mask_preprocess (Optional[Callable]) – Preprocess mask.
- labels (Optional[List[str]]) – List of labels.
- transform (Optional[albumentations.Compose]) – Transformations to apply to images and masks.
- mask_smoothing (bool) – Smooth mask.
- defect_transform (Optional[albumentations.Compose]) – Transformations to apply to images and masks for defects.
Source code in quadra/datasets/segmentation.py
SegmentationDatasetMulticlass(image_paths, mask_paths, idx_to_class, batch_size=None, transform=None, one_hot=False)
Bases: torch.utils.data.Dataset
Custom SegmentationDataset class for loading images and multilabel masks.
Parameters:
- image_paths (List[str]) – List of paths to images.
- mask_paths (List[str]) – List of paths to masks.
- idx_to_class (Dict) – Dict with the correspondence between mask index and classes: {1: class_1, 2: class_2, ..., N: class_N}.
- batch_size (Optional[int]) – Batch size.
- transform (Optional[albumentations.Compose]) – Transformations to apply to images and masks.
- one_hot (bool) – If True, return a binary mask (n_class x H x W), otherwise the labelled mask (H x W). SMP loss requires the second format.
Source code in quadra/datasets/segmentation.py
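The difference between the two mask formats selected by one_hot can be sketched with nested lists. to_one_hot is a hypothetical helper, and the layout is an assumption: one channel per entry of idx_to_class, with background (index 0) given no channel. The library may order channels differently or keep a background channel.

```python
from typing import Dict, List

LabelMask = List[List[int]]  # H x W mask, each pixel holds a class index


def to_one_hot(mask: LabelMask, idx_to_class: Dict[int, str]) -> List[LabelMask]:
    """Turn an H x W labelled mask into one binary H x W mask per class index."""
    # Assumption: channels follow the sorted class indexes; index 0 is background.
    return [
        [[1 if value == idx else 0 for value in row] for row in mask]
        for idx in sorted(idx_to_class)
    ]


# Hypothetical classes; 0 marks background pixels.
idx_to_class = {1: "scratch", 2: "dent"}
labelled = [[0, 1], [2, 2]]
binary = to_one_hot(labelled, idx_to_class)
```

The labelled mask stores one class index per pixel, whereas the binary form stacks one 0/1 mask per class.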
__getitem__(index)
Get image and mask.
Source code in quadra/datasets/segmentation.py
__len__()
Returns the dataset length.
Source code in quadra/datasets/segmentation.py
TwoAugmentationDataset(dataset, transform, strategy=AugmentationStrategy.SAME_IMAGE)
Bases: Dataset
Two Image Augmentation Dataset for use in self-supervised learning.
Parameters:
- dataset (Dataset) – A torch Dataset object.
- transform (Union[A.Compose, Tuple[A.Compose, A.Compose]]) – Albumentations transformations for each image. If a single transformation is given, it is applied to both images. If a tuple is given, the first transformation is applied to the first image and the second to the second image.
- strategy (AugmentationStrategy) – Defaults to AugmentationStrategy.SAME_IMAGE.
Source code in quadra/datasets/ssl.py
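How a single transformation versus a tuple of transformations produces the two views can be sketched with plain callables. two_views, flip and blur are hypothetical stand-ins operating on strings; real usage would pass Albumentations pipelines, and the effect of strategy is not modelled here.

```python
from typing import Callable, Tuple, Union

Transform = Callable[[str], str]


def two_views(image: str, transform: Union[Transform, Tuple[Transform, Transform]]) -> Tuple[str, str]:
    """Return two augmented views of the same image."""
    if isinstance(transform, tuple):
        # One transformation per view.
        first, second = transform
    else:
        # A single transformation is shared by both views.
        first = second = transform
    return first(image), second(image)


def flip(image: str) -> str:
    return f"flip({image})"


def blur(image: str) -> str:
    return f"blur({image})"


shared = two_views("img", flip)
separate = two_views("img", (flip, blur))
```

With real random augmentations, the shared case would still give two different views, because the same pipeline is sampled twice.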
TwoSetAugmentationDataset(dataset, global_transforms, local_transform, num_local_transforms)
Bases: Dataset
Two Set Augmentation Dataset for use in self-supervised learning (DINO).
Parameters:
- dataset (Dataset) – Base dataset.
- global_transforms (Tuple[A.Compose, A.Compose]) – Global transformations for each image.
- local_transform (A.Compose) – Local transformations for each image.
- num_local_transforms (int) – Number of local transformations to apply. In total you will have two + num_local_transforms transformations for each image. First element of the array will always return the original image.
  images[0] = global_transform0
  images[1] = global_transform1
  images[2:] = local_transform(s)(original_image)
  ...
Source code in quadra/datasets/ssl.py
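The layout of the returned views (two global views followed by num_local_transforms local views) can be sketched with plain callables. make_views and the lambda transforms are hypothetical stand-ins for illustration; real usage would pass Albumentations pipelines.

```python
from typing import Callable, List, Tuple

Transform = Callable[[str], str]


def make_views(
    image: str,
    global_transforms: Tuple[Transform, Transform],
    local_transform: Transform,
    num_local_transforms: int,
) -> List[str]:
    """Return the two global views followed by the local views of one image."""
    views = [global_transforms[0](image), global_transforms[1](image)]
    # The same local pipeline is applied once per requested local view.
    views.extend(local_transform(image) for _ in range(num_local_transforms))
    return views


views = make_views(
    "img",
    (lambda im: f"global0({im})", lambda im: f"global1({im})"),
    lambda im: f"local({im})",
    num_local_transforms=3,
)
```

The resulting list has 2 + num_local_transforms entries, with the global views always in the first two positions.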