
Colon10K Dataset

Explanation:

  • “image/”: contains all the images of this case
  • “<CASE-ID>-matchings.txt”: contains the ground truth in the form “<query_id> | positive intervals”
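As a rough illustration of how a matchings file might be read, the sketch below parses lines of the form “<query_id> | positive intervals”. The exact interval syntax is not specified here, so the code assumes each interval is a pair of integers (start, end) with inclusive endpoints, and it simply pairs up the integers found after the “|”. The function names `parse_matchings` and `is_positive` are hypothetical.

```python
import re
from typing import Dict, List, Tuple

Interval = Tuple[int, int]


def parse_matchings(text: str) -> Dict[str, List[Interval]]:
    """Parse '<query_id> | positive intervals' lines into a dictionary."""
    matchings: Dict[str, List[Interval]] = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # skip blank or malformed lines
        query_id, _, rest = line.partition("|")
        numbers = [int(n) for n in re.findall(r"\d+", rest)]
        # ASSUMPTION: intervals are consecutive (start, end) integer pairs,
        # whatever separator the file uses between the two numbers.
        matchings[query_id.strip()] = list(zip(numbers[0::2], numbers[1::2]))
    return matchings


def is_positive(intervals: List[Interval], index: int) -> bool:
    """Return True if a retrieved frame index lies in any positive interval."""
    # ASSUMPTION: interval endpoints are inclusive.
    return any(start <= index <= end for start, end in intervals)
```

A retrieved frame would then count as a correct match for a query when `is_positive(matchings[query_id], retrieved_index)` holds, under the assumptions above.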

01: 485 images | 02: 550 images | 03: 494 images | 04: 435 images | 05: 780 images | 06: 799 images | 07: 446 images | 08: 104 images | 09: 354 images | 10: 223 images | 11: 392 images | 12: 312 images | 13: 158 images | 14: 879 images | 15: 471 images | 16: 819 images | 17: 799 images | 18: 264 images | 19: 861 images | 20: 591 images

Camera Intrinsics: Pinhole fx=145.4410 fy=145.4410 cx=135.6993 cy=107.8946 width=270 height=216
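The intrinsics above can be used with the standard pinhole model, for example to project a 3D point in camera coordinates to a pixel or to back-project a pixel with known depth. The following is a minimal sketch in plain Python; the helper names (`project`, `backproject`, `horizontal_fov_deg`) are illustrative and not part of the dataset.

```python
import math

# Values copied from the camera intrinsics listed above.
FX, FY = 145.4410, 145.4410
CX, CY = 135.6993, 107.8946
WIDTH, HEIGHT = 270, 216

# 3x3 intrinsic matrix K in the usual pinhole convention.
K = [
    [FX, 0.0, CX],
    [0.0, FY, CY],
    [0.0, 0.0, 1.0],
]


def project(point):
    """Project a 3D point (x, y, z) in camera coordinates to pixel (u, v)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (FX * x / z + CX, FY * y / z + CY)


def backproject(u, v, depth):
    """Lift pixel (u, v) with known depth back to a 3D camera-frame point."""
    return ((u - CX) * depth / FX, (v - CY) * depth / FY, depth)


def horizontal_fov_deg():
    """Horizontal field of view, allowing for the off-center principal point."""
    left = math.atan(CX / FX)
    right = math.atan((WIDTH - CX) / FX)
    return math.degrees(left + right)
```

A point on the optical axis, such as `(0, 0, 1)`, projects to the principal point `(CX, CY)`.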

Video collection and image configuration:

  • The subsequences in Colon10K are cropped from full colonoscopies conducted by Dr. Sarah K. McGill, Assistant Professor of Medicine in the Division of Gastroenterology and Hepatology at UNC Hospital. The data copyright belongs to UNC Medical School.
  • Raw videos were captured with Olympus CF- and PCF-series colonoscopes. The raw image size is 1350×1080; we resized the images to 270×216 (a uniform 5× downscale) for Colon10K.
  • Raw images were captured by fisheye cameras. We undistorted them to pinhole camera images using MATLAB’s fisheye calibration.

Publication:

R. Ma, S. McGill, R. Wang, J. Rosenman, J. Frahm, Y. Zhang, and S. Pizer. Colon10K: A Benchmark for Place Recognition in Colonoscopy. To appear in the Proceedings of the IEEE International Symposium on Biomedical Imaging (2021).

Training hyperparameters of the neural networks presented in the above paper:

We mostly followed the parameters of Radenović et al., with the following modifications:

  • Number of positive matching pairs in each epoch = 2500
  • Pool size for hard-negative mining in each epoch (randomly selected) = 50000
  • Learning rate = 5e-7
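For reproducibility, the three modified values could be collected in a small configuration object, as sketched below. The class and field names are hypothetical and do not correspond to the options of any particular training script; all other settings would follow Radenović et al.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TrainingOverrides:
    """Hyperparameters that differ from Radenović et al. (names are illustrative)."""

    positive_pairs_per_epoch: int = 2500    # positive matching pairs per epoch
    hard_negative_pool_size: int = 50000    # randomly selected pool per epoch
    learning_rate: float = 5e-7


def overrides_as_dict() -> dict:
    """Return the overrides as a plain dict, e.g. to merge into a base config."""
    return asdict(TrainingOverrides())
```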

Historical frame recognition demo:

 

Loop closure of a sequence using our CNN image retriever and RNN-DP:

 

Reconstructions with (star) and without loop closure: