Binary Data Helpers
- datarobot.helpers.binary_data_utils.get_encoded_image_contents_from_urls(urls, custom_headers=None, image_options=None, continue_on_error=False, n_threads=None)
Returns base64-encoded strings for the images located at the addresses passed in the input collection. The input collection should hold valid image URL addresses reachable from the location where the code is being executed. The method retrieves each image and applies the specified reformatting before converting its contents to a base64 string. Results are returned in the same order as specified in the input collection.
- Parameters:
- urls: Iterable
Iterable of URL addresses to download images from.
- custom_headers: dict
Dictionary containing custom headers to use when downloading files using a URL. Details on supported HTTP headers can be found in the RFC specification for headers: https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html. When specified, the passed values overwrite the default header values.
- image_options: ImageOptions class
Class holding parameters for use in image transformation and formatting.
- continue_on_error: bool
Whether processing should continue when retrieving the content of one of the rows fails (e.g., the file does not exist). If False, the error terminates the download of the remaining files. If True, processing continues and the failed file is skipped.
- n_threads: int or None
Number of threads to use for processing. If “None” is passed, the number of threads is determined automatically based on the number of available CPU cores. If this is not possible, 4 threads are used.
- Returns:
- List of base64 encoded strings representing reformatted images.
- Raises:
- ContentRetrievalTerminatedError:
The error is raised when the flag continue_on_error is set to False and processing has been terminated due to an exception while loading the contents of the file.
- Return type:
List[Optional[str]]
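The overwrite behavior described for custom_headers can be pictured as a dictionary merge. The following is a minimal sketch, not the library's implementation: DEFAULT_HEADERS and merge_headers are hypothetical names, and the default values are placeholders chosen for illustration only.

```python
# Hypothetical defaults; the library's real default headers are not listed here.
DEFAULT_HEADERS = {"User-Agent": "example-client", "Accept": "*/*"}


def merge_headers(custom_headers=None):
    """Return request headers where custom values replace defaults with the same key."""
    merged = dict(DEFAULT_HEADERS)  # copy so the defaults are never mutated
    if custom_headers:
        merged.update(custom_headers)  # custom values win on key collisions
    return merged


headers = merge_headers({"Accept": "image/png", "Authorization": "Bearer <token>"})
```

In this sketch, keys that are not overridden keep their default values, and keys that exist only in the custom dictionary are added.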
- datarobot.helpers.binary_data_utils.get_encoded_image_contents_from_paths(paths, image_options=None, continue_on_error=False, n_threads=None)
Returns base64-encoded strings for the images located at the paths passed in the input collection. The input collection should hold valid image paths reachable from the location where the code is being executed. The method retrieves each image and applies the specified reformatting before converting its contents to a base64 string. Results are returned in the same order as specified in the input collection.
- Parameters:
- paths: Iterable
Iterable with path locations to open images from
- image_options: ImageOptions class
Class holding parameters for image transformation and formatting
- continue_on_error: bool
Whether processing should continue when retrieving the content of one of the rows fails (e.g., the file does not exist). If False, the error terminates the processing of the remaining files. If True, processing continues and the failed file is skipped.
- n_threads: int or None
Number of threads to use for processing. If “None” is passed, the number of threads is determined automatically based on the number of available CPU cores. If this is not possible, 4 threads are used.
- Returns:
- List of base64 encoded strings representing reformatted images.
- Raises:
- ContentRetrievalTerminatedError:
The error is raised when the flag continue_on_error is set to False and processing has been terminated due to an exception while loading the contents of the file.
- Return type:
List[Optional[str]]
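The n_threads rule (explicit value, otherwise based on CPU cores, otherwise 4) can be sketched as below. This is an illustration under assumptions, not the library's code: resolve_thread_count is a hypothetical helper, and it assumes the automatic choice is simply the core count, which this document does not confirm.

```python
import os


def resolve_thread_count(n_threads=None, cpu_count=os.cpu_count, fallback=4):
    """Pick a worker count: explicit value, else CPU core count, else the fallback."""
    if n_threads is not None:
        return n_threads
    # os.cpu_count() returns None when the core count cannot be determined.
    return cpu_count() or fallback
```

The cpu_count argument exists only so the fallback path can be exercised without changing the machine the code runs on.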
- datarobot.helpers.binary_data_utils.get_encoded_file_contents_from_paths(paths, continue_on_error=False, n_threads=None)
Returns base64-encoded strings for the files located at the paths passed in the input collection. The input collection should hold valid file paths reachable from the location where the code is being executed. The method retrieves each file and converts its contents to a base64 string. Results are returned in the same order as specified in the input collection.
- Parameters:
- paths: Iterable
Iterable of path locations to open files from.
- continue_on_error: bool
Whether processing should continue when retrieving the content of one of the rows fails (e.g., the file does not exist). If False, the error terminates the processing of the remaining files. If True, processing continues and the failed file is skipped.
- n_threads: int or None
Number of threads to use for processing. If “None” is passed, the number of threads is determined automatically based on the number of available CPU cores. If this is not possible, 4 threads are used.
- Returns:
- List of base64 encoded strings representing files.
- Raises:
- ContentRetrievalTerminatedError:
The error is raised when the flag continue_on_error is set to False and processing has been terminated due to an exception while loading the contents of the file.
- Return type:
List[Optional[str]]
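The behavior described above (read each file, base64-encode it, keep input order, and either stop or skip on failure) can be sketched with the standard library alone. This is not the library's implementation: encode_files, encode_file, and RetrievalTerminated are hypothetical stand-ins, and using None for a skipped file is an assumption based on the List[Optional[str]] return type.

```python
import base64
from concurrent.futures import ThreadPoolExecutor


class RetrievalTerminated(Exception):
    """Hypothetical stand-in for ContentRetrievalTerminatedError."""


def encode_file(path):
    """Read one file and return its contents as a base64 string."""
    with open(path, "rb") as handle:
        return base64.b64encode(handle.read()).decode("ascii")


def encode_files(paths, continue_on_error=False, n_threads=4):
    """Encode files concurrently, returning results in input order."""

    def safe_encode(path):
        try:
            return encode_file(path)
        except OSError as exc:
            if continue_on_error:
                return None  # assumed marker for a skipped file
            raise RetrievalTerminated(f"could not read {path}") from exc

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # Executor.map yields results in the order of the inputs,
        # regardless of which worker finishes first.
        return list(pool.map(safe_encode, paths))
```

With continue_on_error=True the sketch returns a list as long as the input, so each result can still be matched to its path by position.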
- datarobot.helpers.binary_data_utils.get_encoded_file_contents_from_urls(urls, custom_headers=None, continue_on_error=False, n_threads=None)
Returns base64-encoded strings for the files located at the URL addresses passed in the input collection. The input collection should hold valid file URL addresses reachable from the location where the code is being executed. The method retrieves each file and converts its contents to a base64 string. Results are returned in the same order as specified in the input collection.
- Parameters:
- urls: Iterable
Iterable containing URL addresses to download files from.
- custom_headers: dict
Dictionary with headers to use when downloading files using a URL. Details on supported HTTP headers can be found in the RFC specification: https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html. When specified, the passed values overwrite the default header values.
- continue_on_error: bool
If a row encounters an error while retrieving content (e.g., the file does not exist), specifies whether the error terminates the download of the remaining files or the process continues. Skipped files will be marked as missing.
- n_threads: int or None
Number of threads to use for processing. If “None” is passed, the number of threads is determined automatically based on the number of available CPU cores. If this is not possible, 4 threads are used.
- Returns:
- List of base64 encoded strings representing files.
- Raises:
- ContentRetrievalTerminatedError:
The error is raised when the flag continue_on_error is set to False and processing has been terminated due to an exception while loading the contents of the file.
- Return type:
List[Optional[str]]
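Because the return type is List[Optional[str]], callers usually need to handle entries for files that were skipped. The sketch below shows one way to consume such a list. It assumes that a skipped file appears as None, which is inferred from the return type rather than stated explicitly, and decode_results is a hypothetical helper.

```python
import base64


def decode_results(encoded):
    """Turn base64 strings back into bytes, keeping None for missing entries."""
    return [None if item is None else base64.b64decode(item) for item in encoded]


# Example result shape: the second entry stands for a file that could not be retrieved.
sample = ["aGVsbG8=", None, "d29ybGQ="]
decoded = decode_results(sample)

# Positions line up with the input collection, so missing inputs can be traced back.
missing_positions = [i for i, item in enumerate(sample) if item is None]
```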
- class datarobot.helpers.image_utils.ImageOptions(should_resize=True, force_size=True, image_size=(224, 224), image_format=None, image_quality=75, image_subsampling=None, resample_method=1, keep_quality=True)
Image options class. Holds options related to image resizing and image reformatting.
- should_resize: bool
Whether input image should be resized to new dimensions.
- force_size: bool
Whether the image size should fully match the new requested size. If the original and new image sizes have different aspect ratios, specifying True will force a resize to exactly match the requested size. This may break the aspect ratio of the original image. If False, the resize method modifies the image to contain a thumbnail version of itself, no larger than the given size, that preserves the image’s aspect ratio.
- image_size: Tuple[int, int]
New image size (width, height). Both values (width, height) should be specified and contain a positive value. Depending on the value of force_size, the image will be resized exactly to the given image size or will be resized into a thumbnail version of itself, no larger than the given size.
- image_format: ImageFormat | str
The image format used to save the resulting image after transformations, for example ImageFormat.JPEG or ImageFormat.PNG. Supported values are in line with the formats supported by DataRobot. If None is passed, the original image format is preserved.
- image_quality: int or None
The image quality used when saving the image. When None is specified, no value is passed and the Pillow library uses its default.
- resample_method: ImageResampleMethod
The resampling method to use when resizing the image.
- keep_quality: bool
Whether the image quality is kept (when possible). If True, for JPEG images quality will be preserved. For other types, the value specified in image_quality will be used.
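The difference between force_size=True and force_size=False can be illustrated by computing the output dimensions for each mode. This is a simplified sketch, not the library's or Pillow's code: target_size is a hypothetical helper, and the exact rounding used by the real thumbnail operation may differ.

```python
def target_size(original, image_size=(224, 224), force_size=True):
    """Compute the output (width, height) for the two resize modes."""
    width, height = original
    max_w, max_h = image_size
    if force_size:
        # Exact match to the requested size; the aspect ratio may change.
        return (max_w, max_h)
    # Thumbnail mode: shrink both sides by one factor so the image fits
    # inside the requested box, and never enlarge a smaller image.
    scale = min(max_w / width, max_h / height, 1.0)
    return (max(1, round(width * scale)), max(1, round(height * scale)))


forced = target_size((448, 224))                        # (224, 224), ratio changed
thumbnail = target_size((448, 224), force_size=False)   # (224, 112), ratio kept
```

For a 448x224 input, forcing the size turns a 2:1 image into a square, while thumbnail mode keeps the 2:1 ratio and stays within the 224x224 box.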