ValidBboxAnnotationsCOCO#
- class ethology.validators.annotations.ValidBboxAnnotationsCOCO(*args, **kwargs)[source]#
Bases: DataFrameModel
Class for COCO-exportable bounding box annotations dataframes.
The validation checks that the required columns exist and that their types are correct. It additionally checks that the index and the annotation_id column are equal.
- idx#
Index of the dataframe. Should be greater than or equal to 0 and equal to the annotation_id column.
- Type:
Index[int]
- segmentation#
Bounding box segmentation masks as a list of lists of coordinates.
- Raises:
pa.errors.SchemaError – If the dataframe does not match the schema.
Notes
See COCO format documentation for more details.
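The index/annotation_id invariant described above can be illustrated with plain pandas (a minimal sketch; the real validation runs through pandera, and the column set shown here is incomplete):

```python
import pandas as pd

# Minimal frame satisfying the documented invariant: the default
# RangeIndex (0, 1, 2, ...) equals the annotation_id column.
df = pd.DataFrame(
    {
        "annotation_id": [0, 1, 2],
        # segmentation: list of lists of coordinates per bounding box
        "segmentation": [[[0.0, 0.0, 10.0, 0.0, 10.0, 5.0]]] * 3,
    }
)
print((df.index == df["annotation_id"]).all())  # True
```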
Methods
build_schema_(**kwargs)
check_idx_and_annotation_id(df): Check that the index and the annotation_id column are equal.
empty(*_args): Create an empty DataFrame with the schema of this model.
example(cls, **kwargs): Generate an example of a particular size.
get_metadata(): Provide metadata for columns and schema level.
map_df_columns_to_COCO_fields(): Map COCO-exportable dataframe columns to COCO fields.
pydantic_validate(schema_model): Verify that the input is a compatible dataframe model.
strategy(cls, **kwargs): Create a hypothesis strategy for generating a DataFrame.
to_json_schema(): Serialize schema metadata into json-schema format.
to_schema(): Create a DataFrameSchema from the DataFrameModel.
to_yaml([stream]): Convert Schema to yaml using io.to_yaml.
validate(check_obj[, head, tail, sample, ...]): Validate a DataFrame based on the schema specification.
- class Config#
Bases: BaseConfig
- add_missing_columns: bool = False#
add columns to dataframe if they are missing
- coerce: bool = False#
coerce types of all schema components
- description: Optional[str] = None#
arbitrary textual description
- drop_invalid_rows: bool = False#
drop invalid rows on validation
- dtype: Optional[PandasDtypeInputTypes] = None#
datatype of the dataframe. This overrides the data types specified in any of the fields.
- from_format: Optional[Union[Format, Callable]] = None#
data format before validation. This option only applies to schemas used in the context of the pandera type constructor
pa.typing.DataFrame[Schema](data). If None, assumes a data structure compatible with the pandas.DataFrame constructor.
- from_format_kwargs: Optional[dict[str, Any]] = None#
a dictionary of keyword arguments to pass into the reader function that converts the object of type from_format to a pandera-validate-able data structure. The reader function is implemented in the pandera.typing generic types via the from_format and to_format methods.
- metadata: Optional[dict] = None#
a dictionary object to store key-value data at schema level
- multiindex_coerce: bool = False#
coerce types of all MultiIndex components
- multiindex_name: Optional[str] = None#
name of multiindex
- multiindex_ordered: bool = True#
validate MultiIndex in order
- multiindex_strict: StrictType = False#
make sure all specified columns are in the validated MultiIndex - if "filter", removes indexes not specified in the schema
- multiindex_unique = None#
make sure the MultiIndex is unique along the list of columns
- name: Optional[str] = 'ValidBboxAnnotationsCOCO'#
name of schema
- ordered: bool = False#
validate columns order
- strict: StrictType = False#
make sure all specified columns are in the validated dataframe - if
"filter", removes columns not specified in the schema
- title: Optional[str] = None#
human-readable label for schema
- to_format: Optional[Union[Format, Callable]] = None#
data format to serialize into after validation. This option only applies to schemas used in the context of the pandera type constructor
pa.typing.DataFrame[Schema](data). If None, returns a dataframe.
- to_format_buffer: Optional[Union[str, Callable]] = None#
Buffer to be provided when to_format is a custom callable. See the docs for an example of how to implement a to_format function.
- to_format_kwargs: Optional[dict[str, Any]] = None#
a dictionary of keyword arguments to pass into the writer function that converts the pandera-validate-able object to type to_format. The writer function is implemented in the pandera.typing generic types via the from_format and to_format methods.
- unique: Optional[Union[str, list[str]]] = None#
make sure certain column combinations are unique
- unique_column_names: bool = False#
make sure dataframe column names are unique
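Several of these flags change how non-schema columns are handled. For instance, the strict = "filter" behaviour described above can be mimicked in plain pandas (a conceptual sketch, not pandera's implementation):

```python
import pandas as pd

# Conceptual sketch of strict="filter": columns not declared in the
# schema are dropped rather than raising an error.
schema_cols = ["annotation_id", "segmentation"]
df = pd.DataFrame(
    {"annotation_id": [0], "segmentation": [[]], "extra": ["dropped"]}
)
filtered = df[[c for c in df.columns if c in schema_cols]]
print(list(filtered.columns))  # ['annotation_id', 'segmentation']
```

With strict=True the extra column would instead raise a SchemaError, and with strict=False it would pass through untouched.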
- classmethod check_idx_and_annotation_id(df)[source]#
Check that the index and the annotation_id column are equal.
- Parameters:
df (pd.DataFrame) – The dataframe to check.
- Returns:
A boolean indicating whether the index and the annotation_id column are equal for all rows.
- Return type:
bool
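A hypothetical re-implementation of this check with plain pandas (the actual check is registered on the pandera model; the function name here is made up):

```python
import pandas as pd

def idx_equals_annotation_id(df: pd.DataFrame) -> bool:
    # True only if every row's index value matches its annotation_id
    return bool((df.index == df["annotation_id"]).all())

good = pd.DataFrame({"annotation_id": [0, 1, 2]})  # default index 0, 1, 2
bad = pd.DataFrame({"annotation_id": [5, 6, 7]})   # index is still 0, 1, 2
print(idx_equals_annotation_id(good), idx_equals_annotation_id(bad))  # True False
```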
- classmethod empty(*_args)#
Create an empty DataFrame with the schema of this model.
- Return type:
DataFrame[Self]
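Empty-frame construction can be approximated in plain pandas by declaring typed, zero-length columns (a sketch; the real method derives all columns and dtypes from the model's schema):

```python
import pandas as pd

# Zero-row frame with typed columns, mimicking what empty() produces
empty = pd.DataFrame(
    {
        "annotation_id": pd.Series(dtype="int64"),
        "segmentation": pd.Series(dtype="object"),
    }
)
print(empty.empty, list(empty.columns))  # True ['annotation_id', 'segmentation']
```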
- classmethod example(cls, **kwargs)#
Generate an example of a particular size.
- Parameters:
size – number of elements in the generated DataFrame.
- Return type:
DataFrameBase[TypeVar(TDataFrameModel, bound=DataFrameModel)]
- Returns:
DataFrame object.
- classmethod get_metadata()#
Provide metadata for columns and schema level
- static map_df_columns_to_COCO_fields()[source]#
Map COCO-exportable dataframe columns to COCO fields.
- Returns:
A dictionary mapping each column in the COCO-exportable dataframe to the corresponding fields in the equivalent COCO file.
- Return type:
dict
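Such a mapping is typically used to rename dataframe columns before export. A sketch with a made-up mapping (the column and field names below are hypothetical; the real ones come from map_df_columns_to_COCO_fields):

```python
import pandas as pd

# Hypothetical column-to-COCO-field mapping; the real mapping is
# returned by map_df_columns_to_COCO_fields().
mapping = {"annotation_id": "id", "category": "category_id"}
df = pd.DataFrame({"annotation_id": [0, 1], "category": [3, 3]})
coco_like = df.rename(columns=mapping)
print(list(coco_like.columns))  # ['id', 'category_id']
```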
- classmethod pydantic_validate(schema_model)#
Verify that the input is a compatible dataframe model.
- Return type:
- classmethod strategy(cls, **kwargs)#
Create a hypothesis strategy for generating a DataFrame.
- Parameters:
size – number of elements to generate
n_regex_columns – number of regex columns to generate.
- Returns:
a strategy that generates DataFrame objects.
- classmethod to_json_schema()#
Serialize schema metadata into json-schema format.
- Parameters:
dataframe_schema – schema to write to json-schema format.
Note
This function currently does not fully specify a pandera schema, and is primarily used internally to render OpenAPI docs via the FastAPI integration.
- classmethod to_schema()#
Create a DataFrameSchema from the DataFrameModel.
- Return type:
TypeVar(TSchema, bound=BaseSchema)
- classmethod to_yaml(stream=None)#
Convert Schema to yaml using io.to_yaml.
- classmethod validate(check_obj, head=None, tail=None, sample=None, random_state=None, lazy=False, inplace=False)#
Validate a DataFrame based on the schema specification.
- Parameters:
check_obj (pd.DataFrame) – the dataframe to be validated.
head (Optional[int]) – validate the first n rows. Rows overlapping with tail or sample are de-duplicated.
tail (Optional[int]) – validate the last n rows. Rows overlapping with head or sample are de-duplicated.
sample (Optional[int]) – validate a random sample of n rows. Rows overlapping with head or tail are de-duplicated.
random_state (Optional[int]) – random seed for the sample argument.
lazy (bool) – if True, lazily evaluates dataframe against all validation checks and raises a SchemaErrors. Otherwise, raise SchemaError as soon as one occurs.
inplace (bool) – if True, applies coercion to the object of validation, otherwise creates a copy of the data.
- Return type:
DataFrame[Self]
- Returns:
validated DataFrame
- Raises:
SchemaError – when the DataFrame violates built-in or custom checks.
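The head/tail de-duplication described for validate can be illustrated with plain pandas (a sketch of the row-selection behaviour only, not of the schema checks themselves):

```python
import pandas as pd

df = pd.DataFrame({"annotation_id": range(3)})
# head=2 covers rows 0-1 and tail=2 covers rows 1-2; the overlapping
# row 1 is kept only once after de-duplication.
subset = pd.concat([df.head(2), df.tail(2)]).drop_duplicates()
print(len(subset))  # 3
```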