from_files
- ethology.annotations.io.load_bboxes.from_files(file_paths, format, images_dirs=None)
Read input annotation files as a bboxes dataframe.
- Parameters:
file_paths (Path | str | list[Path | str]) – Path or list of paths to the input annotation files.
format (Literal["VIA", "COCO"]) – Format of the input annotation files.
images_dirs (Path | str | list[Path | str], optional) – Path or list of paths to the directories containing the images the annotations refer to.
- Returns:
Bounding boxes annotations dataframe. The dataframe is indexed by “annotation_id” and has the following columns: “image_filename”, “image_id”, “image_width”, “image_height”, “x_min”, “y_min”, “width”, “height”, “supercategory”, “category”. It also has the following attributes: “annotation_files”, “annotation_format”, “images_directories”. The “image_id” is assigned based on the alphabetically sorted list of unique image filenames across all input files. The “category_id” column is always a 0-based integer, except for VIA files where the values specified in the input file are retained.
- Return type:
pd.DataFrame
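The "image_id" rule described above can be illustrated with a minimal pandas-only sketch. It does not call ethology, and the filenames are hypothetical. It also assumes that IDs start at 0, which the description above does not state.

```python
import pandas as pd

# Hypothetical filenames as they might appear across two annotation files
# (not taken from any real dataset).
filenames = pd.Series(
    ["img_b.png", "img_a.png", "img_b.png", "img_c.png"],
    name="image_filename",
)

# Sort the unique filenames alphabetically and number them in that order.
# Assumption: numbering starts at 0; the start value is not specified above.
codes, sorted_unique = pd.factorize(filenames, sort=True)
image_ids = pd.Series(codes, name="image_id")

print(list(sorted_unique))  # ['img_a.png', 'img_b.png', 'img_c.png']
print(image_ids.tolist())   # [1, 0, 1, 2]
```

Both occurrences of "img_b.png" receive the same ID, which is the merging behaviour described in the Notes.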
Notes
We use image filenames to assign IDs to images, so if two images have the same name but appear in different input annotation files, they are assigned the same image ID and their annotations are merged.
If this behaviour is not desired, and you would rather assign different image IDs to images that share a name but appear in different input annotation files, you can either make the image filenames distinct before loading the data, or load each file as a separate dataframe and then concatenate them as desired.
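The second option (load separately, then concatenate) might look like the sketch below. The two dataframes are hypothetical stand-ins for the results of two separate from_files calls, with only a few columns shown. Offsetting the IDs is one possible way to keep them distinct, not something the library prescribes.

```python
import pandas as pd

# Hypothetical stand-ins for the dataframes returned by two separate
# from_files calls, one per annotation file (only two columns shown).
df_a = pd.DataFrame(
    {"image_filename": ["frame_01.png", "frame_02.png"], "image_id": [0, 1]},
    index=pd.Index([0, 1], name="annotation_id"),
)
df_b = pd.DataFrame(
    {"image_filename": ["frame_01.png"], "image_id": [0]},
    index=pd.Index([0], name="annotation_id"),
)

# Shift the image IDs of the second dataframe so they do not collide
# with those of the first.
df_b = df_b.assign(image_id=df_b["image_id"] + df_a["image_id"].max() + 1)

# Concatenate and rebuild a unique "annotation_id" index.
combined = pd.concat([df_a, df_b], ignore_index=True)
combined.index.name = "annotation_id"

print(combined["image_id"].tolist())  # [0, 1, 2]
```

Here "frame_01.png" appears twice with image IDs 0 and 2, so the annotations from the two files stay separate.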
See also
pandas.concat
Concatenate pandas objects along a particular axis.
pandas.DataFrame.drop_duplicates
Return DataFrame with duplicate rows removed.