ValidBboxAnnotationsCOCO#

class ethology.validators.annotations.ValidBboxAnnotationsCOCO(*args, **kwargs)[source]#

Bases: DataFrameModel

Class for COCO-exportable bounding box annotations dataframes.

Validation checks that the required columns exist and that their types are correct. It additionally checks that the index and the annotation_id column are equal.
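As a sketch of what a conforming dataframe looks like (all values below are illustrative; in practice you would pass the dataframe to ValidBboxAnnotationsCOCO.validate to run the full schema checks):

```python
import pandas as pd

# Minimal illustrative dataframe with the columns this schema requires.
# Values (filenames, categories) are made up for the example; in practice
# you would call ValidBboxAnnotationsCOCO.validate(df).
df = pd.DataFrame(
    {
        "annotation_id": [0, 1],
        "image_id": [0, 0],
        "image_filename": ["frame_000.png", "frame_000.png"],
        "image_width": [1280, 1280],
        "image_height": [720, 720],
        "bbox": [[10.0, 20.0, 50.0, 40.0], [100.0, 60.0, 30.0, 30.0]],
        "area": [2000.0, 900.0],
        "segmentation": [
            [[10.0, 20.0, 60.0, 20.0, 60.0, 60.0, 10.0, 60.0]],
            [[100.0, 60.0, 130.0, 60.0, 130.0, 90.0, 100.0, 90.0]],
        ],
        "category": ["crab", "crab"],
        "supercategory": ["animal", "animal"],
        "iscrowd": [0, 0],
    }
)
# The index must equal the annotation_id column.
df.index = df["annotation_id"].to_numpy()

print((df.index.to_numpy() == df["annotation_id"].to_numpy()).all())  # True
```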

idx#

Index of the dataframe. Should be greater than or equal to 0 and equal to the annotation_id column.

Type:

Index[int]

annotation_id#

Unique identifier for the annotation. Should be equal to the index.

Type:

int

image_id#

Unique identifier for each of the images.

Type:

int

image_filename#

Filename of the image.

Type:

str

image_width#

Width of each of the images.

Type:

int

image_height#

Height of each of the images.

Type:

int

bbox#

Bounding box coordinates as xmin, ymin, width, height.

Type:

list[float]

area#

Bounding box area.

Type:

float

segmentation#

Bounding box segmentation masks as a list of lists of coordinates.

Type:

list[list[float]]

category#

Category of the annotation.

Type:

str

supercategory#

Supercategory of the annotation.

Type:

str

iscrowd#

Whether the annotation is a crowd. Should be 0 or 1.

Type:

int
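The bbox, area, and segmentation fields are related in the standard COCO way: area is typically the box area, and for a non-crowd annotation the segmentation can be a single polygon tracing the box. A sketch of that relationship (the helper name is our own, not part of ethology's API):

```python
def bbox_to_area_and_polygon(bbox):
    """Derive a COCO-style area and rectangular segmentation polygon
    from an [xmin, ymin, width, height] bounding box.

    Illustrative helper; not part of ethology's API.
    """
    xmin, ymin, w, h = bbox
    area = w * h
    # COCO polygons are flat [x1, y1, x2, y2, ...] lists; a box becomes
    # its four corners, and the segmentation field holds a list of polygons.
    polygon = [xmin, ymin, xmin + w, ymin, xmin + w, ymin + h, xmin, ymin + h]
    return area, [polygon]

area, segmentation = bbox_to_area_and_polygon([10.0, 20.0, 50.0, 40.0])
print(area)          # 2000.0
print(segmentation)  # [[10.0, 20.0, 60.0, 20.0, 60.0, 60.0, 10.0, 60.0]]
```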

Raises:

pa.errors.SchemaError – If the dataframe does not match the schema.

Notes

See COCO format documentation for more details.

Methods

build_schema_(**kwargs)

check_idx_and_annotation_id(df)

Check that the index and the annotation_id column are equal.

empty(*_args)

Create an empty DataFrame with the schema of this model.

example(cls, **kwargs)

Generate an example of a particular size.

get_metadata()

Provide metadata at the column and schema levels.

map_df_columns_to_COCO_fields()

Map COCO-exportable dataframe columns to COCO fields.

pydantic_validate(schema_model)

Verify that the input is a compatible dataframe model.

strategy(cls, **kwargs)

Create a hypothesis strategy for generating a DataFrame.

to_json_schema()

Serialize schema metadata into json-schema format.

to_schema()

Create DataFrameSchema from the DataFrameModel.

to_yaml([stream])

Convert Schema to yaml using io.to_yaml.

validate(check_obj[, head, tail, sample, ...])

Validate a DataFrame based on the schema specification.

class Config#

Bases: BaseConfig

add_missing_columns: bool = False#

add columns to dataframe if they are missing

coerce: bool = False#

coerce types of all schema components

description: Optional[str] = None#

arbitrary textual description

drop_invalid_rows: bool = False#

drop invalid rows on validation

dtype: Optional[PandasDtypeInputTypes] = None#

datatype of the dataframe. This overrides the data types specified in any of the fields.

from_format: Optional[Union[Format, Callable]] = None#

data format before validation. This option only applies to schemas used in the context of the pandera type constructor pa.typing.DataFrame[Schema](data). If None, assumes a data structure compatible with the pandas.DataFrame constructor.

from_format_kwargs: Optional[dict[str, Any]] = None#

a dictionary of keyword arguments to pass to the reader function that converts an object of type from_format into a pandera-validatable data structure. The reader function is implemented in the pandera.typing generic types via the from_format and to_format methods.

metadata: Optional[dict] = None#

a dictionary object to store key-value data at schema level

multiindex_coerce: bool = False#

coerce types of all MultiIndex components

multiindex_name: Optional[str] = None#

name of multiindex

multiindex_ordered: bool = True#

validate MultiIndex in order

multiindex_strict: StrictType = False#

make sure all specified columns are in the validated MultiIndex - if "filter", removes index levels not specified in the schema

multiindex_unique = None#

make sure the MultiIndex is unique along the list of columns

name: Optional[str] = 'ValidBboxAnnotationsCOCO'#

name of schema

ordered: bool = False#

validate columns order

strict: StrictType = False#

make sure all specified columns are in the validated dataframe - if "filter", removes columns not specified in the schema

title: Optional[str] = None#

human-readable label for schema

to_format: Optional[Union[Format, Callable]] = None#

data format to serialize into after validation. This option only applies to schemas used in the context of the pandera type constructor pa.typing.DataFrame[Schema](data). If None, returns a dataframe.

to_format_buffer: Optional[Union[str, Callable]] = None#

Buffer to be provided when to_format is a custom callable. See the docs for an example of how to implement a to_format function.

to_format_kwargs: Optional[dict[str, Any]] = None#

a dictionary of keyword arguments to pass to the writer function that converts the pandera-validatable object to the to_format type. The writer function is implemented in the pandera.typing generic types via the from_format and to_format methods.

unique: Optional[Union[str, list[str]]] = None#

make sure certain column combinations are unique

unique_column_names: bool = False#

make sure dataframe column names are unique

classmethod check_idx_and_annotation_id(df)[source]#

Check that the index and the annotation_id column are equal.

Parameters:

df (pd.DataFrame) – The dataframe to check.

Returns:

A boolean indicating whether the index and the annotation_id column are equal for all rows.

Return type:

bool
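In plain pandas terms, the check amounts to an element-wise comparison of the index against the annotation_id column (a sketch of the check's semantics, not ethology's implementation):

```python
import pandas as pd

# Two toy dataframes: one whose index matches annotation_id, one whose
# index does not.
ok = pd.DataFrame({"annotation_id": [0, 1, 2]}, index=[0, 1, 2])
bad = pd.DataFrame({"annotation_id": [0, 1, 2]}, index=[0, 1, 5])

def idx_matches_annotation_id(df):
    # Element-wise comparison, reduced to a single boolean.
    return bool((df.index.to_numpy() == df["annotation_id"].to_numpy()).all())

print(idx_matches_annotation_id(ok))   # True
print(idx_matches_annotation_id(bad))  # False
```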

classmethod empty(*_args)#

Create an empty DataFrame with the schema of this model.

Return type:

DataFrame[Self]

classmethod example(cls, **kwargs)#

Generate an example of a particular size.

Parameters:

size – number of elements in the generated DataFrame.

Return type:

DataFrameBase[TypeVar(TDataFrameModel, bound= DataFrameModel)]

Returns:

DataFrame object.

classmethod get_metadata()#

Provide metadata at the column and schema levels.

Return type:

Optional[dict]

static map_df_columns_to_COCO_fields()[source]#

Map COCO-exportable dataframe columns to COCO fields.

Returns:

A dictionary mapping each column in the COCO-exportable dataframe to the corresponding fields in the equivalent COCO file.

Return type:

dict
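The exact dictionary is defined by ethology; purely as an illustration of the shape such a mapping might take, a COCO file groups its fields under images, annotations, and categories sections. The mapping below is hypothetical (field names follow the standard COCO layout; the real return value of map_df_columns_to_COCO_fields may differ):

```python
# Hypothetical sketch of a column-to-COCO-field mapping; the real
# dictionary returned by map_df_columns_to_COCO_fields() may be
# structured differently. COCO section/field names follow the
# standard COCO annotation layout.
df_to_coco = {
    "annotation_id": ("annotations", "id"),
    "image_id": ("annotations", "image_id"),
    "image_filename": ("images", "file_name"),
    "image_width": ("images", "width"),
    "image_height": ("images", "height"),
    "bbox": ("annotations", "bbox"),
    "area": ("annotations", "area"),
    "segmentation": ("annotations", "segmentation"),
    "category": ("categories", "name"),
    "supercategory": ("categories", "supercategory"),
    "iscrowd": ("annotations", "iscrowd"),
}
print(df_to_coco["image_filename"])  # ('images', 'file_name')
```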

classmethod pydantic_validate(schema_model)#

Verify that the input is a compatible dataframe model.

Return type:

DataFrameModel

classmethod strategy(cls, **kwargs)#

Create a hypothesis strategy for generating a DataFrame.

Parameters:
  • size – number of elements to generate

  • n_regex_columns – number of regex columns to generate.

Returns:

a strategy that generates DataFrame objects.

classmethod to_json_schema()#

Serialize schema metadata into json-schema format.

Parameters:

dataframe_schema – schema to write to json-schema format.

Note

This function currently does not fully specify a pandera schema, and is primarily used internally to render OpenAPI docs via the FastAPI integration.

classmethod to_schema()#

Create DataFrameSchema from the DataFrameModel.

Return type:

TypeVar(TSchema, bound= BaseSchema)

classmethod to_yaml(stream=None)#

Convert Schema to yaml using io.to_yaml.

classmethod validate(check_obj, head=None, tail=None, sample=None, random_state=None, lazy=False, inplace=False)#

Validate a DataFrame based on the schema specification.

Parameters:
  • check_obj (pd.DataFrame) – the dataframe to be validated.

  • head (Optional[int]) – validate the first n rows. Rows overlapping with tail or sample are de-duplicated.

  • tail (Optional[int]) – validate the last n rows. Rows overlapping with head or sample are de-duplicated.

  • sample (Optional[int]) – validate a random sample of n rows. Rows overlapping with head or tail are de-duplicated.

  • random_state (Optional[int]) – random seed for the sample argument.

  • lazy (bool) – if True, lazily evaluates the dataframe against all validation checks and raises a SchemaErrors exception. Otherwise, raises a SchemaError as soon as one occurs.

  • inplace (bool) – if True, applies coercion to the object being validated; otherwise creates a copy of the data.

Return type:

DataFrame[Self]

Returns:

validated DataFrame

Raises:

SchemaError – when DataFrame violates built-in or custom checks.
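The head, tail, and sample parameters select subsets of rows to validate, with overlapping rows de-duplicated. The selection semantics can be sketched in plain pandas (an illustration of the parameter behavior, not ethology's or pandera's implementation):

```python
import pandas as pd

df = pd.DataFrame({"annotation_id": range(5)}, index=range(5))

# Validate the first 3 and last 3 rows of a 5-row dataframe; the
# overlapping middle row (index 2) is considered only once after
# de-duplication.
subset = pd.concat([df.head(3), df.tail(3)])
subset = subset[~subset.index.duplicated()]
print(list(subset.index))  # [0, 1, 2, 3, 4]
```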