Custom Dataset structure guide on setting Class IDs and num_classes #247

Open
niqbal996 opened this issue Sep 6, 2024 · 0 comments

Hello,

Thanks for your work and for this repository. I want to use it to train on my custom dataset, which is based on the Phenorob agricultural dataset. It has the following structure:

Soil: Category ID 0 (Stuff)
Crop: Category ID 1 (Thing)
Weed: Category ID 2 (Thing)

I want to detect all three classes using both semantic and panoptic segmentation.

In my YAML config file, I have set NUM_CLASSES to 3 under SEM_SEG_HEAD like this:

SEM_SEG_HEAD:
    NAME: "MaskFormerHead"
    IGNORE_VALUE: 255
    NUM_CLASSES: 3
    LOSS_WEIGHT: 1.0
    CONVS_DIM: 256
    MASK_DIM: 256
    NORM: "GN"

I am using the COCO panoptic evaluator and the semantic segmentation evaluator for evaluation. The semantic segmentation PNG label files use the following pixel-value mapping:

0 -> soil
1 -> crop
2 -> weed
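
One way to confirm this mapping is to count the pixel values that actually occur in a label file. The sketch below is only illustrative: `label_histogram` and `unexpected_labels` are hypothetical helper names, and it works on a flat sequence of integer pixel values, on the assumption that the PNG has already been decoded with an image library of your choice.

```python
from collections import Counter

# Expected pixel values, taken from the mapping above; 255 is the ignore value.
EXPECTED = {0: "soil", 1: "crop", 2: "weed"}
IGNORE_VALUE = 255


def label_histogram(pixels):
    """Count how often each pixel value occurs in a decoded label image."""
    return Counter(int(p) for p in pixels)


def unexpected_labels(pixels, expected=EXPECTED, ignore_value=IGNORE_VALUE):
    """Return the sorted pixel values that are neither expected nor ignored."""
    allowed = set(expected) | {ignore_value}
    return sorted(set(label_histogram(pixels)) - allowed)


# Toy data standing in for a flattened label PNG (assumption, not real data).
toy_pixels = [0, 0, 1, 2, 2, 255, 3]
print(label_histogram(toy_pixels))
print(unexpected_labels(toy_pixels))
```

Any value reported by `unexpected_labels` would point to a label file that does not follow the 0/1/2 mapping.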

And the categories in the json file are as below:

"categories": [
        {
            "color": [
                0,
                0,
                0
            ],
            "id": 0,
            "isthing": 0,
            "name": "soil",
            "supercategory": "soil"
        },
        {
            "color": [
                111,
                74,
                0
            ],
            "id": 1,
            "isthing": 1,
            "name": "crop",
            "supercategory": "crop"
        },
        {
            "color": [
                230,
                150,
                140
            ],
            "id": 2,
            "isthing": 1,
            "name": "weed",
            "supercategory": "weed"
        }
    ],
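
For comparison, the contiguous IDs can be derived directly from this category list instead of being written by hand. This is a minimal sketch under an assumption: every category (stuff and thing) gets one index in a single 0..N-1 range so that the semantic evaluator can see all classes. `build_id_maps` is a hypothetical helper, not part of any library.

```python
import json

# The categories from the JSON file above, reduced to the fields used here.
CATEGORIES_JSON = """
[
  {"id": 0, "isthing": 0, "name": "soil"},
  {"id": 1, "isthing": 1, "name": "crop"},
  {"id": 2, "isthing": 1, "name": "weed"}
]
"""


def build_id_maps(categories):
    """Map dataset category IDs to contiguous indices 0..N-1.

    Returns (all_ids, thing_ids): all_ids covers every category,
    thing_ids only the categories with isthing == 1.
    """
    ordered = sorted(categories, key=lambda c: c["id"])
    all_ids = {c["id"]: i for i, c in enumerate(ordered)}
    thing_ids = {c["id"]: all_ids[c["id"]] for c in ordered if c["isthing"]}
    return all_ids, thing_ids


categories = json.loads(CATEGORIES_JSON)
all_ids, thing_ids = build_id_maps(categories)
print(all_ids)    # {0: 0, 1: 1, 2: 2}
print(thing_ids)  # {1: 1, 2: 2}
```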

The dataset is registered in the following manner:

def get_metadata():  # function name assumed; the original snippet shows only the body
    meta = {}

    # Define classes and colors
    thing_classes = ["crop", "weed"]
    thing_colors = [(0, 0, 200), (200, 0, 0)]
    stuff_classes = ["soil"]
    stuff_colors = [(0, 0, 0)]

    meta["thing_classes"] = thing_classes
    meta["thing_colors"] = thing_colors
    meta["stuff_classes"] = stuff_classes
    meta["stuff_colors"] = stuff_colors

    # Map dataset IDs to contiguous IDs
    meta["thing_dataset_id_to_contiguous_id"] = {1: 1, 2: 2}  # 1 -> crop, 2 -> weed
    meta["stuff_dataset_id_to_contiguous_id"] = {0: 0}  # 0 -> soil

    # Set ignore label
    meta["ignore_label"] = 255

    # Additional metadata for visualization and evaluation:
    # the stuff entries are overwritten so that they cover all three classes
    # (soil, crop, weed), not only soil.
    meta["stuff_classes"] = stuff_classes + thing_classes
    meta["stuff_colors"] = stuff_colors + thing_colors
    meta["stuff_dataset_id_to_contiguous_id"].update(meta["thing_dataset_id_to_contiguous_id"])

    return meta

############################################################
from detectron2.data import DatasetCatalog, MetadataCatalog
from detectron2.data.datasets import load_sem_seg

# merge_to_panoptic and load_pheno_panoptic_json are defined elsewhere in my code.
DatasetCatalog.register(
    panoptic_name,
    lambda: merge_to_panoptic(
        load_pheno_panoptic_json(panoptic_json, image_root, panoptic_root, metadata),
        load_sem_seg(sem_seg_root, image_root, gt_ext='png', image_ext='png'),
    ),
)
MetadataCatalog.get(panoptic_name).set(
    panoptic_root=panoptic_root,
    image_root=image_root,
    panoptic_json=panoptic_json,
    sem_seg_root=sem_seg_root,
    json_file=instances_json,  # TODO rename
    evaluator_type="coco_panoptic_seg",
    label_divisor=1000,
    **metadata,
)
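
A quick consistency check between the metadata and `NUM_CLASSES` could look like the sketch below. It is not part of the repository: `check_metadata` is a hypothetical helper, and the dictionary is a hand-written copy of the merged metadata described in this issue, so it runs without any framework installed.

```python
NUM_CLASSES = 3  # value of SEM_SEG_HEAD.NUM_CLASSES in the config

# Metadata as it looks after the thing entries are merged into the stuff entries.
meta = {
    "stuff_classes": ["soil", "crop", "weed"],
    "stuff_colors": [(0, 0, 0), (0, 0, 200), (200, 0, 0)],
    "stuff_dataset_id_to_contiguous_id": {0: 0, 1: 1, 2: 2},
    "ignore_label": 255,
}


def check_metadata(meta, num_classes):
    """Return a list of human-readable problems; empty if none are found."""
    problems = []
    names = meta["stuff_classes"]
    id_map = meta["stuff_dataset_id_to_contiguous_id"]
    if len(names) != num_classes:
        problems.append(f"{len(names)} class names but NUM_CLASSES is {num_classes}")
    if len(meta["stuff_colors"]) != len(names):
        problems.append("stuff_colors and stuff_classes differ in length")
    if sorted(id_map.values()) != list(range(num_classes)):
        problems.append("contiguous IDs do not cover 0..NUM_CLASSES-1 exactly once")
    if meta["ignore_label"] in id_map.values():
        problems.append("ignore_label collides with a contiguous class ID")
    return problems


problems = check_metadata(meta, NUM_CLASSES)
print(problems)  # [] for the metadata shown here
```

An empty list only means these particular checks pass; it does not rule out a mismatch elsewhere, for example in the label files themselves.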

At the moment, I am only trying to check the semantic segmentation results, so I have enabled only semantic evaluation in the config file:

TEST:
    SEMANTIC_ON: True
    INSTANCE_ON: False
    PANOPTIC_ON: False
    OVERLAP_THRESHOLD: 0.8
    OBJECT_MASK_THRESHOLD: 0.8

I am really struggling to set up the experiment so that all three classes are detected. I would really appreciate it if you could point out any inconsistency in the dataset structure or class names above. I have tried different variations, but the predictions always seem to mix up the background/soil class with either crop or weed. It seems trivial, but I have been scratching my head over this for many hours. Thank you.
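
To pin down which classes are being mixed up, a per-class confusion matrix over ground-truth and predicted labels may help. The sketch below is a plain-Python illustration, not the evaluator used by the repository: `confusion_matrix` and `per_class_iou` are hypothetical helpers, the inputs are assumed to be flattened label maps with predictions already in the range 0..2, and the toy values are made up.

```python
def confusion_matrix(gt, pred, num_classes, ignore_value=255):
    """Rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        if g == ignore_value:
            continue  # ignored pixels do not count towards any class
        matrix[g][p] += 1
    return matrix


def per_class_iou(matrix):
    """Intersection over union per class, or None if the class never occurs."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        ious.append(tp / denom if denom else None)
    return ious


# Toy flattened label maps: 0 = soil, 1 = crop, 2 = weed, 255 = ignore.
gt = [0, 0, 0, 1, 1, 2, 2, 255]
pred = [0, 1, 1, 1, 1, 2, 0, 0]

matrix = confusion_matrix(gt, pred, num_classes=3)
ious = per_class_iou(matrix)
print(matrix)  # [[1, 2, 0], [0, 2, 0], [1, 0, 1]]
print(ious)    # [0.25, 0.5, 0.5]
```

Large off-diagonal entries in the soil row or column would show whether soil is being predicted as crop or weed, or the other way round.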
