data external

API details: Module for defining the configuration and downloading data from sources such as Kaggle, fastai, and GitHub.

Extend Configuration


aiking_cfg

 aiking_cfg ()

Config object for fastai’s config.ini


aiking_path

 aiking_path (folder)

Path to folder in aiking_cfg
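
As a rough illustration of the kind of lookup aiking_path describes, the sketch below resolves a named folder from an INI file using only the standard library. The helper name `folder_from_ini`, the assumption that keys live under the DEFAULT section, and the sample paths are all hypothetical; this is not the library's implementation.

```python
import configparser
import tempfile
from pathlib import Path


def folder_from_ini(ini_file, folder):
    """Return the path stored under `folder` in the DEFAULT section of `ini_file`.

    Hypothetical helper: the key layout is an assumption, not aiking's code.
    """
    parser = configparser.ConfigParser()
    parser.read(ini_file)
    return Path(parser["DEFAULT"][folder])


# Build a throwaway config.ini so the example runs without touching real settings.
with tempfile.TemporaryDirectory() as tmp:
    ini_file = Path(tmp) / "config.ini"
    ini_file.write_text(
        "[DEFAULT]\ndata = /tmp/example/data\narchive = /tmp/example/archive\n"
    )
    data_path = folder_from_ini(ini_file, "data")
    archive_path = folder_from_ini(ini_file, "archive")
```

With the real module, the equivalent call would presumably be aiking_path('data').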

Extend URL


URLs

 URLs ()

Global constants for dataset and model URLs.

Kaggle Dataset Utilities

A helper class for downloading Kaggle datasets into the required structure.


KAGGLEs

 KAGGLEs ()


download_kaggle

 download_kaggle (url, dest, overwrite=False)

Download url to dest unless dest already exists and overwrite is False
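
The skip-if-present behaviour described above can be sketched as follows. This is a minimal illustration, not the body of download_kaggle: the function name `download_unless_exists` and the injected `fetch` callable (standing in for the actual Kaggle download) are assumptions made so the example runs offline.

```python
import tempfile
from pathlib import Path


def download_unless_exists(url, dest, fetch, overwrite=False):
    """Call `fetch(url, dest)` unless `dest` exists and `overwrite` is False.

    `fetch` is a hypothetical callable standing in for the real Kaggle download.
    Returns True when a download happened, False when it was skipped.
    """
    dest = Path(dest)
    if dest.exists() and not overwrite:
        return False
    dest.parent.mkdir(parents=True, exist_ok=True)
    fetch(url, dest)
    return True


calls = []


def fake_fetch(url, dest):
    # Stand-in downloader: record the call and write a placeholder file.
    calls.append(url)
    Path(dest).write_text("placeholder")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "titanic" / "train.csv"
    first = download_unless_exists("example/titanic", target, fake_fetch)
    second = download_unless_exists("example/titanic", target, fake_fetch)
    forced = download_unless_exists("example/titanic", target, fake_fetch, overwrite=True)
```

Here the second call is skipped because the file already exists, while the third call downloads again because overwrite=True.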

Utilities for archive extraction


unzip_file

 unzip_file (dest, arch_path)
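
No description is given for unzip_file, so the following is only a guess at what a function with this signature might do: extract the zip archive at arch_path into dest. The name `unzip_to` and the sample archive are hypothetical, and only the standard `zipfile` module is used.

```python
import tempfile
import zipfile
from pathlib import Path


def unzip_to(dest, arch_path):
    """Extract the zip archive at `arch_path` into the folder `dest`."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(arch_path) as archive:
        archive.extractall(dest)
    return dest


with tempfile.TemporaryDirectory() as tmp:
    # Create a small archive to extract, so the example needs no external files.
    arch_path = Path(tmp) / "sample.zip"
    with zipfile.ZipFile(arch_path, "w") as archive:
        archive.writestr("images/readme.txt", "hello")
    out = unzip_to(Path(tmp) / "extracted", arch_path)
    extracted_text = (out / "images" / "readme.txt").read_text()
```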

untar_data

 untar_data (url, archive=None, data=None, c_key='data',
             force_download=False)

Download url to fname if dest doesn’t exist, and extract to folder dest

Parameter       Type      Default  Details
url
archive         NoneType  None
data            NoneType  None
c_key           str       data
force_download  bool      False

list_checked_data

 list_checked_data ()

Image Dataset Utilities (Bing / ddg)

## export
import os

# Default data location: $AIKING_HOME/data when AIKING_HOME is set, otherwise None.
LOC = None
if os.getenv("AIKING_HOME"):
    LOC = os.path.join(os.getenv("AIKING_HOME"), "data")
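
The same rule as in the cell above can be written as a small function, which makes the fallback easier to check. This is a sketch: the name `data_location` and the `env` parameter (a mapping used in place of the real environment) are introduced here for illustration only.

```python
import os


def data_location(env=None):
    """Return <AIKING_HOME>/data if AIKING_HOME is set in `env`, else None."""
    env = os.environ if env is None else env
    home = env.get("AIKING_HOME")
    return os.path.join(home, "data") if home else None


# Passing explicit mappings avoids depending on the caller's real environment.
with_home = data_location({"AIKING_HOME": "/opt/aiking"})
without_home = data_location({})
```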

Duck Duck Go Downloader


search_images_ddg

 search_images_ddg (key, max_n=150)

Bing Downloader


search_images_bing

 search_images_bing (key, term, min_sz=128, max_images=150)

Google Dataset Downloader

Dataset constructor


construct_image_dataset

 construct_image_dataset (clstypes, dest, key=None, loc=None, count=150,
                          engine='bing')
# aiking_path('data')

Utilities to review datasets folder


list_ds

 list_ds (loc=None)

get_ds

 get_ds (name, loc=None)
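
Neither function is documented here. The signatures suggest that list_ds enumerates the dataset folders under loc and that get_ds returns the one called name, but that reading is an assumption. Under it, a standard-library sketch with the hypothetical helpers `list_dataset_dirs` and `get_dataset_dir` could look like this:

```python
import tempfile
from pathlib import Path


def list_dataset_dirs(loc):
    """Return the sorted names of the sub-folders directly under `loc`."""
    return sorted(p.name for p in Path(loc).iterdir() if p.is_dir())


def get_dataset_dir(name, loc):
    """Return the path of dataset `name` under `loc`, or raise if it is missing."""
    path = Path(loc) / name
    if not path.is_dir():
        raise FileNotFoundError(f"No dataset named {name!r} under {loc}")
    return path


with tempfile.TemporaryDirectory() as tmp:
    for ds in ("cats", "dogs"):
        (Path(tmp) / ds).mkdir()
    # Plain files are ignored; only folders count as datasets in this sketch.
    (Path(tmp) / "notes.txt").write_text("not a dataset")
    names = list_dataset_dirs(tmp)
    dogs_name = get_dataset_dir("dogs", tmp).name
```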

Create or Update Dataset folder


push_ds

 push_ds (url, dsname, name=None, subfolder='.')

Download url into the dsname dataset as name. Creates dsname if it does not exist
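
The create-or-update behaviour can be sketched as below. This is an illustration under stated assumptions, not the implementation of push_ds: the function `push_to_dataset`, the injected `fetch` callable (mapping a URL to bytes), the explicit `root` folder, and the rule that a missing name falls back to the last URL segment are all hypothetical.

```python
import tempfile
from pathlib import Path


def push_to_dataset(url, dsname, fetch, root, name=None, subfolder="."):
    """Store the content behind `url` in dataset `dsname`, creating it if needed.

    `fetch` (hypothetical) maps a URL to bytes; `root` is the folder that holds
    all datasets. When `name` is None the last URL segment is used as file name.
    """
    target_dir = Path(root) / dsname / subfolder
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / (name or url.rstrip("/").split("/")[-1])
    target.write_bytes(fetch(url))
    return target


def fake_fetch(url):
    # Stand-in for a real HTTP download.
    return b"payload for " + url.encode()


with tempfile.TemporaryDirectory() as tmp:
    saved = push_to_dataset(
        "https://example.com/a.csv", "demo", fake_fetch, tmp, subfolder="raw"
    )
    saved_name = saved.name
    saved_parent = saved.parent.name
    saved_exists = saved.is_file()
    renamed = push_to_dataset(
        "https://example.com/a.csv", "demo", fake_fetch, tmp, name="b.csv"
    ).name
```

The first call creates the demo dataset and its raw subfolder; the second reuses the existing dataset and stores the file under the given name.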

Google Downloader

# from selenium import webdriver
#
# query = "dog"
# options = webdriver.ChromeOptions()
# options.add_argument("--headless")
# options.add_argument('--no-sandbox')
# options.add_argument('--disable-dev-shm-usage')
# try:
#     driver = webdriver.Remote(options=options)
# except Exception:
#     print("""Error initializing chromedriver.
#             Check if it's in your path by running `which chromedriver`""")
# !which chromedriver