A Python library for loading, batching, and preprocessing image and audio datasets, built around an object-oriented design for ML pipelines.
Dataset (ABC)
├── LabeledDataset (ABC)
│   ├── ImageDataset
│   └── AudioDataset
└── UnlabeledDataset (ABC)
    ├── UnlabeledImageDataset
    └── UnlabeledAudioDataset
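A minimal sketch of how an abstract hierarchy like the one above blocks incomplete instantiation. The method names (`__len__`, `__getitem__`, `label`) and the `ToyImageDataset` class are assumptions made for illustration; the library's real classes may declare a different interface.

```python
from abc import ABC, abstractmethod


class Dataset(ABC):
    """Hypothetical root class; the real Dataset may require other methods."""

    @abstractmethod
    def __len__(self) -> int: ...

    @abstractmethod
    def __getitem__(self, index: int): ...


class LabeledDataset(Dataset):
    """Still abstract: adds a requirement without implementing anything."""

    @abstractmethod
    def label(self, index: int): ...


class ToyImageDataset(LabeledDataset):
    """Concrete leaf standing in for ImageDataset, backed by in-memory lists."""

    def __init__(self, items, labels):
        self._items = list(items)    # private storage, never mutated from outside
        self._labels = list(labels)

    def __len__(self) -> int:
        return len(self._items)

    def __getitem__(self, index: int):
        return self._items[index], self._labels[index]

    def label(self, index: int):
        return self._labels[index]


# Abstract levels refuse to instantiate; only the complete leaf can be built.
try:
    LabeledDataset()
except TypeError as err:
    print("rejected:", err)

ds = ToyImageDataset(["img0", "img1"], [0, 1])
print(len(ds), ds[1])
```

Python raises `TypeError` when a class with unimplemented abstract methods is instantiated, so a subclass that forgets a required method fails at construction time rather than deep inside a training loop.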
Transform (ABC)
├── CenterCrop (image)
├── RandomCrop (image)
├── RandomFlip (image)
├── Padding (image)
├── MelSpectrogram (audio)
├── AudioRandomCrop (audio)
├── Resample (audio)
├── PitchShift (audio)
└── Pipeline (any → any)
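One way the callable-transform pattern behind this hierarchy could look. This is a sketch under assumptions: the `Toy*` classes are hypothetical stand-ins, images are plain nested lists rather than Pillow images, and the `(height, width)` argument order is a guess rather than the library's documented signature.

```python
from abc import ABC, abstractmethod


class Transform(ABC):
    """Hypothetical base: every transform is an object called like a function."""

    @abstractmethod
    def __call__(self, data): ...


class ToyCenterCrop(Transform):
    """Crops the central height x width window of a 2-D nested list."""

    def __init__(self, height, width):
        self._height = height
        self._width = width

    def __call__(self, image):
        rows, cols = len(image), len(image[0])
        top = max((rows - self._height) // 2, 0)
        left = max((cols - self._width) // 2, 0)
        window = image[top:top + self._height]
        return [row[left:left + self._width] for row in window]


class ToyFlip(Transform):
    """Always flips horizontally (a RandomFlip would do so with some probability)."""

    def __call__(self, image):
        return [row[::-1] for row in image]


class ToyPipeline(Transform):
    """Chains transforms; being a Transform itself, it can be nested."""

    def __init__(self, *transforms):
        self._transforms = transforms

    def __call__(self, data):
        for transform in self._transforms:
            data = transform(data)
        return data


image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
pipe = ToyPipeline(ToyCenterCrop(2, 2), ToyFlip())
print(pipe(image))  # [[6, 5], [10, 9]]
```

Because the pipeline satisfies the same interface as its parts, a pipeline can be passed anywhere a single transform is expected, which matches the "any → any" entry in the tree.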
from src.image_dataset import ImageDataset
from src.batch_loader import BatchLoader
from src.preprocessing import Pipeline, CenterCrop

ds = ImageDataset("data/", lazy=True)
train, test = ds.split(0.8)

loader = BatchLoader(train, batch_size=32)
pipe = Pipeline(CenterCrop(256, 256))

for batch in loader:
    out = [pipe(img) for img, label in batch]
• ABCs prevent incomplete instantiation
• All attributes private, exposed via @property
• split() shuffles indices, not data
• BatchLoader uses a generator (yield)
• Callable classes for all transforms
• CSV labels auto-cast: int → float → str
• Pillow for RGB-safe image loading
• librosa.load(sr=None) → (y, sr) tuple
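Three of the notes above (index-based splitting, generator batching, and label auto-casting) can be sketched with the standard library alone. The function names `split_indices`, `iter_batches`, and `auto_cast` are hypothetical, as are the `seed` parameter and the choice to yield a short final batch; the library's own `split()` and `BatchLoader` may behave differently.

```python
import random


def split_indices(n_items, train_fraction, seed=None):
    """Shuffle index positions only; the underlying samples are never moved."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    cut = int(n_items * train_fraction)
    return indices[:cut], indices[cut:]


def iter_batches(data, indices, batch_size):
    """Yield one batch at a time so only the current batch is materialized."""
    for start in range(0, len(indices), batch_size):
        chunk = indices[start:start + batch_size]
        yield [data[i] for i in chunk]  # the last batch may be shorter


def auto_cast(text):
    """Try int, then float, and fall back to the original string."""
    for caster in (int, float):
        try:
            return caster(text)
        except ValueError:
            continue
    return text


data = list("abcdefghij")
train_idx, test_idx = split_indices(len(data), 0.8, seed=0)
for batch in iter_batches(data, train_idx, batch_size=3):
    print(batch)

print([auto_cast(v) for v in ["3", "3.5", "cat"]])  # [3, 3.5, 'cat']
```

Shuffling indices instead of samples keeps the split cheap for lazily loaded data, since no image or audio file has to be read to decide which subset it belongs to.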