ValidBboxesDataFrameCOCO#

class ethology.io.annotations.validate.ValidBboxesDataFrameCOCO(*args, **kwargs)[source]#

Bases: DataFrameModel

Class for COCO-exportable bounding box annotations dataframes.

The validation checks that the required columns exist and that their types are correct. It additionally checks that the index and the annotation_id column are equal.

idx#

Index of the dataframe. Should be greater than or equal to 0 and equal to the annotation_id column.

Type:

Index[int]

annotation_id#

Unique identifier for the annotation. Should be equal to the index.

Type:

int

image_id#

Unique identifier for each of the images.

Type:

int

image_filename#

Filename of the image.

Type:

str

image_width#

Width of each of the images.

Type:

int

image_height#

Height of each of the images.

Type:

int

bbox#

Bounding box coordinates as [xmin, ymin, width, height].

Type:

list[float]

area#

Bounding box area.

Type:

float

segmentation#

Bounding box segmentation masks as a list of lists of coordinates.

Type:

list[list[float]]

category#

Category of the annotation.

Type:

str

supercategory#

Supercategory of the annotation.

Type:

str

iscrowd#

Whether the annotation is a crowd. Should be 0 or 1.

Type:

int

Raises:

pa.errors.SchemaError – If the dataframe does not match the schema.

Notes

See COCO format documentation for more details.
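As a minimal sketch, a dataframe satisfying the schema described above could be built as follows (the column values are illustrative, not drawn from a real dataset; the actual validation is performed by this class via pandera):

```python
import pandas as pd

# Minimal dataframe with the columns required by the schema
# (values are illustrative).
df = pd.DataFrame(
    {
        "annotation_id": [0, 1],
        "image_id": [0, 0],
        "image_filename": ["frame_000.png", "frame_000.png"],
        "image_width": [1280, 1280],
        "image_height": [720, 720],
        "bbox": [[10.0, 20.0, 30.0, 40.0], [50.0, 60.0, 20.0, 20.0]],
        "area": [1200.0, 400.0],
        "segmentation": [
            [[10.0, 20.0, 40.0, 20.0, 40.0, 60.0]],
            [[50.0, 60.0, 70.0, 60.0, 70.0, 80.0]],
        ],
        "category": ["mouse", "mouse"],
        "supercategory": ["animal", "animal"],
        "iscrowd": [0, 0],
    }
)

# The index must equal the annotation_id column.
df.index = df["annotation_id"].to_numpy()

# Validating with ValidBboxesDataFrameCOCO would raise a SchemaError
# if any of the requirements above were violated.
print((df.index.to_numpy() == df["annotation_id"].to_numpy()).all())  # True
```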

Methods

build_schema_(**kwargs)

check_idx_and_annotation_id(df)

Check that the index and the annotation_id column are equal.

empty(*_args)

Create an empty DataFrame with the schema of this model.

example(cls, **kwargs)

Generate an example of a particular size.

get_metadata()

Provide metadata at the column and schema level.

map_df_columns_to_COCO_fields()

Map COCO-exportable dataframe columns to COCO fields.

pydantic_validate(schema_model)

Verify that the input is a compatible dataframe model.

strategy(cls, **kwargs)

Create a hypothesis strategy for generating a DataFrame.

to_json_schema()

Serialize schema metadata into json-schema format.

to_schema()

Create DataFrameSchema from the DataFrameModel.

to_yaml([stream])

Convert Schema to yaml using io.to_yaml.

validate(check_obj[, head, tail, sample, ...])

Validate a DataFrame based on the schema specification.

classmethod check_idx_and_annotation_id(df)[source]#

Check that the index and the annotation_id column are equal.

Parameters:

df (pd.DataFrame) – The dataframe to check.

Returns:

A boolean indicating whether the index and the annotation_id column are equal for all rows.

Return type:

bool
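A plain-pandas sketch of this check (not the class's actual implementation) is:

```python
import pandas as pd

def index_equals_annotation_id(df: pd.DataFrame) -> bool:
    """Return True if the index matches the annotation_id column row-wise."""
    return bool(
        (df.index.to_numpy() == df["annotation_id"].to_numpy()).all()
    )

# A dataframe whose index matches annotation_id, and one whose index does not.
ok = pd.DataFrame({"annotation_id": [0, 1, 2]}, index=[0, 1, 2])
bad = pd.DataFrame({"annotation_id": [0, 1, 2]}, index=[0, 1, 5])
```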

classmethod empty(*_args)#

Create an empty DataFrame with the schema of this model.

Return type:

DataFrame[Self]

classmethod example(cls, **kwargs)#

Generate an example of a particular size.

Parameters:

size – number of elements in the generated DataFrame.

Return type:

DataFrameBase[TypeVar(TDataFrameModel, bound= DataFrameModel)]

Returns:

DataFrame object.

classmethod get_metadata()#

Provide metadata at the column and schema level.

Return type:

Optional[dict]

static map_df_columns_to_COCO_fields()[source]#

Map COCO-exportable dataframe columns to COCO fields.

Returns:

A dictionary mapping each column in the COCO-exportable dataframe to the corresponding fields in the equivalent COCO file.

Return type:

dict
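The exact dictionary is defined by this method; the sketch below shows what such a mapping could plausibly look like, based on the COCO annotation format. The grouping into images/annotations/categories sections and the COCO field names shown are assumptions, not the method's guaranteed output:

```python
# Hypothetical sketch of the column-to-COCO-field mapping; the actual
# dictionary is the one returned by map_df_columns_to_COCO_fields().
df_columns_to_coco = {
    "annotation_id": ("annotations", "id"),
    "image_id": ("annotations", "image_id"),
    "image_filename": ("images", "file_name"),
    "image_width": ("images", "width"),
    "image_height": ("images", "height"),
    "bbox": ("annotations", "bbox"),
    "area": ("annotations", "area"),
    "segmentation": ("annotations", "segmentation"),
    "category": ("categories", "name"),
    "supercategory": ("categories", "supercategory"),
    "iscrowd": ("annotations", "iscrowd"),
}
```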

classmethod pydantic_validate(schema_model)#

Verify that the input is a compatible dataframe model.

Return type:

DataFrameModel

classmethod strategy(cls, **kwargs)#

Create a hypothesis strategy for generating a DataFrame.

Parameters:
  • size – number of elements to generate

  • n_regex_columns – number of regex columns to generate.

Returns:

A strategy that generates DataFrame objects.

classmethod to_json_schema()#

Serialize schema metadata into json-schema format.

Parameters:

dataframe_schema – schema to write to json-schema format.

Note

This function currently does not fully specify a pandera schema and is primarily used internally to render OpenAPI docs via the FastAPI integration.

classmethod to_schema()#

Create DataFrameSchema from the DataFrameModel.

Return type:

TypeVar(TSchema, bound= BaseSchema)

classmethod to_yaml(stream=None)#

Convert Schema to yaml using io.to_yaml.

classmethod validate(check_obj, head=None, tail=None, sample=None, random_state=None, lazy=False, inplace=False)#

Validate a DataFrame based on the schema specification.

Parameters:
  • check_obj (pd.DataFrame) – the dataframe to be validated.

  • head (Optional[int]) – validate the first n rows. Rows overlapping with tail or sample are de-duplicated.

  • tail (Optional[int]) – validate the last n rows. Rows overlapping with head or sample are de-duplicated.

  • sample (Optional[int]) – validate a random sample of n rows. Rows overlapping with head or tail are de-duplicated.

  • random_state (Optional[int]) – random seed for the sample argument.

  • lazy (bool) – if True, lazily evaluates the dataframe against all validation checks and raises a SchemaErrors exception. Otherwise, raises a SchemaError as soon as one occurs.

  • inplace (bool) – if True, applies coercion to the validated object in place; otherwise, validates a copy of the data.

Return type:

DataFrame[Self]

Returns:

The validated DataFrame.

Raises:

SchemaError – when DataFrame violates built-in or custom checks.
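How the head, tail and sample arguments restrict the rows that are checked can be sketched in plain pandas. This is a simplification of pandera's row-subsetting behaviour, and the helper name is hypothetical:

```python
import pandas as pd

def rows_to_validate(df, head=None, tail=None, sample=None, random_state=None):
    """Sketch: collect the row subsets selected by head/tail/sample,
    de-duplicating rows that overlap between subsets."""
    if head is None and tail is None and sample is None:
        # No subsetting requested: validate all rows.
        return df
    parts = []
    if head is not None:
        parts.append(df.head(head))
    if tail is not None:
        parts.append(df.tail(tail))
    if sample is not None:
        parts.append(df.sample(sample, random_state=random_state))
    subset = pd.concat(parts)
    # Drop rows selected by more than one of head/tail/sample.
    return subset[~subset.index.duplicated(keep="first")]

df = pd.DataFrame({"x": range(10)})
checked = rows_to_validate(df, head=3, tail=3)  # 6 distinct rows
```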