External

API details: a module for defining configuration and downloading data from sources such as Kaggle, fastai, and GitHub.

Extend Configuration



aiking_cfg

 aiking_cfg ()

Config object for fastai’s config.ini



aiking_path

 aiking_path (folder)

Path to folder in aiking_cfg
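
As a rough illustration of resolving a named folder from an INI-style config, the sketch below uses only the standard library's configparser. The sample keys, the paths, and the resolve_folder helper are hypothetical. They are not aiking_cfg or aiking_path, and the real config.ini layout may differ.

```python
import configparser
from pathlib import Path

# Hypothetical config contents, used only for this illustration.
SAMPLE_INI = """
[DEFAULT]
data = /tmp/aiking_home/data
model = /tmp/aiking_home/models
"""

def resolve_folder(cfg: configparser.ConfigParser, folder: str) -> Path:
    """Return the path stored under the key `folder` in the DEFAULT section."""
    return Path(cfg["DEFAULT"][folder])

cfg = configparser.ConfigParser()
cfg.read_string(SAMPLE_INI)
data_dir = resolve_folder(cfg, "data")
print(data_dir)
```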

Extend URL



URLs

 URLs ()

Global constants for dataset and model URLs.

Kaggle Dataset Utilities

A helper class for downloading Kaggle datasets into the required folder structure.

# !rm -rf '/AIKING_HOME/data/spooky.zip'


KAGGLEs

 KAGGLEs ()




download_kaggle2

 download_kaggle2 (url, dest, overwrite=False)

Download url to dest unless dest already exists and overwrite is False

# download_kaggle2("kaggle_competitions::spaceship-titanic", cfg.path('data')/"spaceship-titanic")
# !ls -l {cfg.path('data')/"spaceship-titanic"}
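
The commented call above passes an identifier of the form "kaggle_competitions::spaceship-titanic". The sketch below shows one way to split such a string into a source kind and a slug. The split_kaggle_id helper is hypothetical, since this page does not show how download_kaggle2 parses its argument.

```python
from typing import Tuple

def split_kaggle_id(url: str) -> Tuple[str, str]:
    """Split 'kind::slug' into (kind, slug); raise ValueError if malformed."""
    kind, sep, slug = url.partition("::")
    if not sep or not kind or not slug:
        raise ValueError(f"expected 'kind::slug', got {url!r}")
    return kind, slug

kind, slug = split_kaggle_id("kaggle_competitions::spaceship-titanic")
print(kind, slug)
```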


download_kaggle

 download_kaggle (url, dest, overwrite=False)

Download url to dest unless dest already exists and overwrite is False
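
The "skip unless overwrite" behaviour described in the docstring can be sketched without any network access. Here fetch_unless_exists and fake_fetch are hypothetical stand-ins, not the library's implementation. The fetcher is passed in as a parameter so that the example runs offline.

```python
import tempfile
from pathlib import Path

def fetch_unless_exists(url, dest, fetch, overwrite=False):
    """Call fetch(url, dest) unless dest exists and overwrite is False.

    Returns True if a fetch happened, False if it was skipped.
    """
    dest = Path(dest)
    if dest.exists() and not overwrite:
        return False
    dest.parent.mkdir(parents=True, exist_ok=True)
    fetch(url, dest)
    return True

# Stand-in fetcher: writes a placeholder file instead of downloading.
def fake_fetch(url, dest):
    Path(dest).write_text(f"downloaded from {url}")

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "spaceship-titanic" / "data.zip"
    results = [
        fetch_unless_exists("example://dataset", target, fake_fetch),
        fetch_unless_exists("example://dataset", target, fake_fetch),
        fetch_unless_exists("example://dataset", target, fake_fetch, overwrite=True),
    ]
print(results)
```

The second call is skipped because the file already exists. The third call fetches again because overwrite=True.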

Utilities for archive extraction



unzip_file

 unzip_file (dest, arch_path)
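
Since unzip_file has no docstring here, the following is only a guess at the general shape of such a helper, written with the standard library's zipfile module. The name unzip_sketch and the sample archive contents are assumptions made for illustration.

```python
import tempfile
import zipfile
from pathlib import Path

def unzip_sketch(dest, arch_path):
    """Extract every member of the zip at arch_path into dest (created if missing)."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(arch_path) as zf:
        zf.extractall(dest)
    return dest

# Self-contained demo: build a tiny archive in a temp dir, then extract it.
with tempfile.TemporaryDirectory() as tmp:
    arch = Path(tmp) / "sample.zip"
    with zipfile.ZipFile(arch, "w") as zf:
        zf.writestr("train.csv", "id,label\n1,0\n")
    out = unzip_sketch(Path(tmp) / "sample", arch)
    extracted = sorted(p.name for p in out.iterdir())
print(extracted)
```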


untar_data

 untar_data (url, archive=None, data=None, c_key='data',
             force_download=False)

Download url to fname if dest doesn’t exist, and extract to folder dest

Parameter       Type      Default  Details
url
archive         NoneType  None
data            NoneType  None
c_key           str       data
force_download  bool      False
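
To illustrate the "extract only if the destination is missing" idea behind the docstring, here is a minimal sketch that works on a local archive with shutil. It does not download anything, and extract_if_missing and its force flag are hypothetical names. How untar_data uses archive, data, and c_key is not shown on this page.

```python
import shutil
import tempfile
from pathlib import Path

def extract_if_missing(archive, dest, force=False):
    """Unpack archive into dest unless dest already exists; force re-extracts."""
    archive, dest = Path(archive), Path(dest)
    if dest.exists() and not force:
        return False          # reuse the existing extraction
    if dest.exists():
        shutil.rmtree(dest)   # force: drop the stale copy first
    shutil.unpack_archive(str(archive), str(dest))
    return True

# Demo: build a small .tar.gz locally, then extract it three times.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    src.mkdir()
    (src / "a.txt").write_text("hello")
    archive = shutil.make_archive(str(Path(tmp) / "sample"), "gztar", root_dir=str(src))
    dest = Path(tmp) / "extracted"
    first = extract_if_missing(archive, dest)
    second = extract_if_missing(archive, dest)
    forced = extract_if_missing(archive, dest, force=True)
    names = sorted(p.name for p in dest.iterdir())
print(first, second, forced, names)
```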


list_checked_data

 list_checked_data ()

Image Dataset Utilities (Bing / DuckDuckGo)

DuckDuckGo Downloader



search_images_ddg

 search_images_ddg (key, max_n=150)

Bing Downloader



search_images_bing

 search_images_bing (key, term, min_sz=128, max_images=150)

Dataset constructor



construct_image_dataset

 construct_image_dataset (clstypes, dest, key=None, loc=None, count=150,
                          engine='bing')

Utilities to review datasets folder



list_ds

 list_ds (loc=None)


get_ds

 get_ds (name, loc=None)
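
Neither function has a docstring on this page, so the sketch below only shows what "reviewing a datasets folder" could look like with pathlib. It assumes each dataset is a subdirectory of some location. list_datasets and get_dataset are hypothetical names, not the behaviour of list_ds or get_ds.

```python
import tempfile
from pathlib import Path

def list_datasets(loc):
    """Return the names of the subdirectories of loc, sorted alphabetically."""
    return sorted(p.name for p in Path(loc).iterdir() if p.is_dir())

def get_dataset(name, loc):
    """Return the path of dataset `name` under loc, or raise if it is absent."""
    path = Path(loc) / name
    if not path.is_dir():
        raise FileNotFoundError(f"no dataset named {name!r} under {loc}")
    return path

# Demo on a throwaway folder containing two datasets and one stray file.
with tempfile.TemporaryDirectory() as tmp:
    for ds in ("spooky", "spaceship-titanic"):
        (Path(tmp) / ds).mkdir()
    (Path(tmp) / "notes.txt").write_text("not a dataset")
    found = list_datasets(tmp)
    chosen = get_dataset("spooky", tmp).name
print(found, chosen)
```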

Create or Update Dataset folder



push_ds

 push_ds (url, dsname, name=None, subfolder='.')

Download url into the dsname dataset as name. Creates dsname if it does not exist.
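
A minimal sketch of the "download into a dataset folder, creating it if needed" idea, using urllib.request.urlretrieve from the standard library. The push_sketch function and its ds_root parameter are assumptions, since the real push_ds takes no root argument. The demo uses a local file:// URL so that it runs without network access.

```python
import tempfile
from pathlib import Path
from urllib.request import urlretrieve

def push_sketch(url, ds_root, dsname, name=None, subfolder="."):
    """Download url into ds_root/dsname/subfolder as name, creating folders as needed."""
    target_dir = Path(ds_root) / dsname / subfolder
    target_dir.mkdir(parents=True, exist_ok=True)
    # Fall back to the last URL path component when no name is given.
    fname = name or url.rstrip("/").rsplit("/", 1)[-1]
    target = target_dir / fname
    urlretrieve(url, str(target))
    return target

# Demo: "download" a local file through a file:// URL.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "train.csv"
    source.write_text("id,label\n1,0\n")
    saved = push_sketch(source.as_uri(), Path(tmp) / "datasets", "demo", subfolder="raw")
    saved_parts = saved.relative_to(tmp).parts
    content = saved.read_text()
print(saved_parts)
```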