.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/load_annotations_dataset.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_examples_load_annotations_dataset.py>`
        to download the full example code, or to run this example in your
        browser via Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_load_annotations_dataset.py:

Load bounding box COCO annotations into ``ethology``
====================================================

Load bounding box annotations in `COCO format
<https://cocodataset.org/#format-data>`_ as an ``ethology`` dataset and
inspect it using ``ethology`` and `movement
<https://movement.neuroinformatics.dev>`_ utilities.

.. GENERATED FROM PYTHON SOURCE LINES 11-13

Imports
-------

.. GENERATED FROM PYTHON SOURCE LINES 13-32

.. code-block:: Python

    import os
    from pathlib import Path

    import matplotlib.pyplot as plt
    import numpy as np
    import pooch
    import xarray as xr
    from movement.io import save_bboxes
    from movement.plots import plot_occupancy
    from movement.roi import PolygonOfInterest

    from ethology.io.annotations import load_bboxes

    # For interactive plots: install ipympl with `pip install ipympl` and
    # uncomment the following line in your notebook
    # %matplotlib widget

.. GENERATED FROM PYTHON SOURCE LINES 33-51

Download dataset
------------------

For this example, we will use the dataset from the `UAS Imagery of
Migratory Waterfowl at New Mexico Wildlife Refuges
<https://lila.science/datasets/uas-imagery-of-migratory-waterfowl/>`_.
This dataset is part of the Drones For Ducks project, which aims to develop
an efficient method to count and identify species of migratory waterfowl at
wildlife refuges across New Mexico.

The dataset consists of a set of drone images and corresponding bounding
box annotations, provided by both expert annotators and volunteers.

Since the dataset is not very large, we can download it as a zip file
directly from the URL provided on the dataset webpage. We use the `pooch
<https://www.fatiando.org/pooch>`_ library to download it to the
``.ethology`` cache directory.

.. GENERATED FROM PYTHON SOURCE LINES 51-75

.. code-block:: Python

    # Source of the dataset
    data_source = {
        "url": "https://storage.googleapis.com/public-datasets-lila/uas-imagery-of-migratory-waterfowl/uas-imagery-of-migratory-waterfowl.20240220.zip",
        "hash": "c5b8dfc5a87ef625770ac8f22335dc9eb8a67688b610490a029dae81815a9896",
    }

    # Define cache directory
    ethology_cache = Path.home() / ".ethology"
    ethology_cache.mkdir(exist_ok=True)

    # Download the dataset to the cache directory
    extracted_files = pooch.retrieve(
        url=data_source["url"],
        known_hash=data_source["hash"],
        fname="waterfowl_dataset.zip",
        path=ethology_cache,
        processor=pooch.Unzip(extract_dir=ethology_cache),
    )

    data_dir = ethology_cache / "uas-imagery-of-migratory-waterfowl"

.. GENERATED FROM PYTHON SOURCE LINES 76-77

We will focus on the annotations labelled by the experts.

.. GENERATED FROM PYTHON SOURCE LINES 77-83

.. code-block:: Python

    annotations_file = (
        data_dir / "experts" / "20230331_dronesforducks_expert_refined.json"
    )

.. GENERATED FROM PYTHON SOURCE LINES 84-90

Load annotations as an ``ethology`` dataset
--------------------------------------------

We can use the :func:`ethology.io.annotations.load_bboxes.from_files`
function to load the COCO file with the expert annotations as an
``ethology`` dataset.

.. GENERATED FROM PYTHON SOURCE LINES 90-96

.. code-block:: Python

    ds = load_bboxes.from_files(annotations_file, format="COCO")

    print(ds)
    print(ds.sizes)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <xarray.Dataset> Size: 353kB
    Dimensions:   (image_id: 12, space: 2, id: 722)
    Coordinates:
      * image_id  (image_id) int64 96B 0 1 2 3 4 5 6 7 8 9 10 11
      * space     (space) <U1 8B 'x' 'y'
    ...
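As a quick first check of the loaded dataset, we can count how many valid
annotations each image contains. The following is a minimal sketch using
plain ``xarray`` operations; it assumes that unused annotation slots along
the ``id`` dimension are padded with NaN positions (the region-of-interest
code below relies on the same convention).

.. code-block:: Python

    # Count valid (non-NaN) annotations per image: a slot is valid
    # if both its x and y coordinates are set
    valid = ~ds.position.isnull().any(dim="space")  # dims: (image_id, id)
    print(valid.sum(dim="id").to_series())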
Count annotations in a region of interest
-------------------------------------------

We can use ``movement``'s `Regions of Interest
<https://movement.neuroinformatics.dev>`_ to define a
:class:`movement.roi.PolygonOfInterest` and then count how many annotations
are within the polygon.

.. GENERATED FROM PYTHON SOURCE LINES 193-233

.. code-block:: Python

    # Define a polygon
    central_region = PolygonOfInterest(
        ((1000, 500), (1000, 3000), (4500, 3000), (4500, 500)),
        name="Central region",
    )

    # Colormap for the annotation categories
    # (definition assumed here; the original is set earlier in the example)
    cmap = plt.get_cmap("tab10")

    # Plot all annotations
    fig, ax = plt.subplots()
    sc = ax.scatter(
        ds.position.sel(space="x").values,
        ds.position.sel(space="y").values,
        s=3,
        c=ds.category.values,
        cmap=cmap,
    )
    ax.set_title("Annotations in polygon")
    ax.set_xlabel("x (pixels)")
    ax.set_ylabel("y (pixels)")
    ax.axis("equal")
    ax.invert_yaxis()

    # Plot ROI (region of interest) polygon on top
    central_region.plot(ax, facecolor="red", edgecolor="red", alpha=0.25)

    # Check the number of annotations in the polygon
    # Note: if position is NaN, ``.contains_point`` returns ``False``
    ds_in_region = central_region.contains_point(
        ds.position
    )  # shape: (n_images, n_max_annotations_per_image)
    n_annotations_in_region = ds_in_region.sum()
    n_annotations_total = (~ds.position.isnull().any(axis=1)).sum()
    fraction_in_region = n_annotations_in_region / n_annotations_total

    print(f"Total annotations: {n_annotations_total.item()}")
    print(f"Annotations in region: {n_annotations_in_region.item()}")
    print(f"Fraction of annotations in region: {fraction_in_region * 100:.2f}%")

.. image-sg:: /examples/images/sphx_glr_load_annotations_dataset_002.png
   :alt: Annotations in polygon
   :srcset: /examples/images/sphx_glr_load_annotations_dataset_002.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Total annotations: 2243
    Annotations in region: 1155
    Fraction of annotations in region: 51.49%

.. GENERATED FROM PYTHON SOURCE LINES 234-236

We can see that just over 50% of the annotations fall within the region of
interest defined by the polygon.

.. GENERATED FROM PYTHON SOURCE LINES 238-248

Transform dataset to a ``movement``-like dataset
-------------------------------------------------

We can take further advantage of ``movement`` utilities by transforming our
annotations dataset into a ``movement``-like dataset. To do this, we need
to rename the dataset dimensions, add a confidence array, and add a
``time_unit`` attribute. We additionally rename the ``individuals``
coordinate values to follow the ``movement`` naming convention.

.. GENERATED FROM PYTHON SOURCE LINES 248-277

.. code-block:: Python

    # Rename dimensions
    ds_as_movement = ds.rename({"image_id": "time", "id": "individuals"})

    # Rename "individuals" coordinate values to follow the
    # ``movement`` naming convention
    ds_as_movement["individuals"] = [
        f"id_{i.item()}" for i in ds_as_movement.individuals.values
    ]

    # Add confidence array with NaN values
    ds_as_movement["confidence"] = xr.DataArray(
        np.full(
            (
                ds_as_movement.sizes["time"],
                ds_as_movement.sizes["individuals"],
            ),
            np.nan,
        ),
        dims=["time", "individuals"],
    )

    # Add time_unit attribute
    ds_as_movement.attrs["time_unit"] = "frames"

    print(ds_as_movement)
    print(ds_as_movement.sizes)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <xarray.Dataset> Size: 433kB
    Dimensions:      (time: 12, space: 2, individuals: 722)
    Coordinates:
      * time         (time) int64 96B 0 1 2 3 4 5 6 7 8 9 10 11
      * space        (space) <U1 8B 'x' 'y'
    ...

The result now has the dimensions, coordinates and metadata expected of a
``movement`` bounding boxes dataset (see the `movement documentation
<https://movement.neuroinformatics.dev>`_).
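As a quick sanity check on the conversion, we can select a single image
(i.e. a single time point) and drop the NaN-padded individuals. This is a
minimal sketch; it assumes that padded entries are NaN in both spatial
coordinates.

.. code-block:: Python

    # Annotations of the first image, with NaN-padded individuals removed
    first_image = ds_as_movement.position.sel(time=0).dropna(
        dim="individuals", how="any"
    )
    print(f"Valid annotations in first image: {first_image.sizes['individuals']}")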
.. GENERATED FROM PYTHON SOURCE LINES 289-300

Plot occupancy map using ``movement``
--------------------------------------

We can now use the :func:`movement.plots.plot_occupancy` function to plot
the occupancy map of the annotations. This is a two-dimensional histogram
that shows, for each bin, the number of annotations falling within it.

To determine the number of bins along each dimension, we use the aspect
ratio of the images to define similarly sized (roughly square) bins, which
makes the occupancy map easier to interpret. Note that all images have the
same dimensions.

.. GENERATED FROM PYTHON SOURCE LINES 300-323

.. code-block:: Python

    # Determine aspect ratio of the images
    image_width = np.unique(ds["image_shape"].sel(space="x").values).item()
    image_height = np.unique(ds["image_shape"].sel(space="y").values).item()
    image_AR = image_width / image_height

    # Set number of bins along each dimension
    n_bins_x = 75
    n_bins_y = int(n_bins_x / image_AR)

    # Plot occupancy map
    fig, ax, hist = plot_occupancy(
        ds_as_movement.position,
        bins=[n_bins_x, n_bins_y],
    )

    fig.set_size_inches(10, 5)
    ax.set_xlim(0, image_width)
    ax.set_ylim(0, image_height)
    ax.set_xlabel("x (pixels)")
    ax.set_ylabel("y (pixels)")
    ax.axis("equal")
    ax.invert_yaxis()

.. image-sg:: /examples/images/sphx_glr_load_annotations_dataset_003.png
   :alt: load annotations dataset
   :srcset: /examples/images/sphx_glr_load_annotations_dataset_003.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 324-327

The occupancy map shows that bin counts range from a minimum of 0 to a
maximum of 5 annotations. We can confirm this by inspecting the outputs of
the :func:`movement.plots.plot_occupancy` function.

.. GENERATED FROM PYTHON SOURCE LINES 327-335

.. code-block:: Python

    bin_size_x = np.diff(hist["xedges"])[0].item()
    bin_size_y = np.diff(hist["yedges"])[0].item()

    print(f"Bin size (pixels): ({bin_size_x}, {bin_size_y})")
    print(f"Maximum bin count: {hist['counts'].max().item()}")
    print(f"Minimum bin count: {hist['counts'].min().item()}")

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Bin size (pixels): (72.77666666666667, 72.605)
    Maximum bin count: 5.0
    Minimum bin count: 0.0
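We can also locate the busiest bin. The sketch below assumes the ``counts``
array follows the ``numpy.histogram2d`` convention, with its first axis
indexed by ``xedges`` and its second by ``yedges``.

.. code-block:: Python

    # Index of the bin with the highest count, assuming counts[i, j]
    # corresponds to the bin spanning xedges[i:i+2] and yedges[j:j+2]
    ix, iy = np.unravel_index(np.argmax(hist["counts"]), hist["counts"].shape)
    x_centre = (hist["xedges"][ix] + hist["xedges"][ix + 1]) / 2
    y_centre = (hist["yedges"][iy] + hist["yedges"][iy + 1]) / 2
    print(f"Busiest bin is centred at ({x_centre:.0f}, {y_centre:.0f}) pixels")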
.. GENERATED FROM PYTHON SOURCE LINES 336-343

Visualise the dataset in the ``movement`` napari plugin
---------------------------------------------------------

We can export the ``movement``-like dataset in a format that we can
visualise in the `movement napari plugin
<https://movement.neuroinformatics.dev>`_. For example, we can use the
:func:`movement.io.save_bboxes.to_via_tracks_file` function, which saves
bounding box ``movement`` datasets as VIA-tracks files.

.. GENERATED FROM PYTHON SOURCE LINES 343-347

.. code-block:: Python

    save_bboxes.to_via_tracks_file(ds_as_movement, "waterfowl_dataset.csv")

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    PosixPath('waterfowl_dataset.csv')

.. GENERATED FROM PYTHON SOURCE LINES 348-355

You can now follow the `movement napari guide
<https://movement.neuroinformatics.dev>`_ to load the output VIA-tracks
file into ``napari``. To visualise the annotations over the corresponding
images, remember to first drag and drop the images directory onto the
``napari`` canvas. You will find the images for the experts' annotations
under the ``data_dir / "experts" / "images"`` directory.

.. GENERATED FROM PYTHON SOURCE LINES 355-358

.. code-block:: Python

    print(f"Images directory: {data_dir / 'experts' / 'images'}")

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Images directory: /home/runner/.ethology/uas-imagery-of-migratory-waterfowl/experts/images

.. GENERATED FROM PYTHON SOURCE LINES 359-363

The view in ``napari`` should look something like this:

.. image:: ../_static/examples/napari-annotations.jpg
   :alt: Bounding box annotations in napari

.. GENERATED FROM PYTHON SOURCE LINES 365-369

The bounding boxes are coloured by individual ID per image. Remember that
the individual IDs are not consistent across images, so it is clearer to
hide the tracks layer, as in the example screenshot above.

.. GENERATED FROM PYTHON SOURCE LINES 371-375

Clean-up
--------

To remove the output file we have just created, we can run the following
code.

.. GENERATED FROM PYTHON SOURCE LINES 375-378

.. code-block:: Python

    os.remove("waterfowl_dataset.csv")

.. rst-class:: sphx-glr-timing

**Total running time of the script:** (0 minutes 5.511 seconds)

.. _sphx_glr_download_examples_load_annotations_dataset.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/neuroinformatics-unit/ethology/gh-pages?filepath=notebooks/examples/load_annotations_dataset.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: load_annotations_dataset.ipynb <load_annotations_dataset.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: load_annotations_dataset.py <load_annotations_dataset.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: load_annotations_dataset.zip <load_annotations_dataset.zip>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_