CholecTriplet2022: Show me a tool and tell me the triplet - an endoscopic vision challenge for surgical action triplet detection (2023-02-13)

TL;DR

This paper presents the CholecTriplet2022 challenge, which extends surgical action triplet modeling from recognition to detection. The task combines weakly supervised bounding-box localization of every visible surgical instrument (or tool), as the key actor, with the modeling of each tool activity as an ‹instrument, verb, target› triplet.
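
As a rough illustration of what one detection in this setting might carry, the sketch below pairs a triplet with a bounding box for the instrument. The class name, field names, box convention (normalized x, y, width, height), and example labels are assumptions made for illustration only; they are not the challenge's official output format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TripletDetection:
    """Hypothetical record for one detected tool activity in a video frame."""
    instrument: str   # the tool acting in the scene
    verb: str         # the action the tool performs
    target: str       # the anatomy or object acted upon
    box: Tuple[float, float, float, float]  # assumed (x, y, w, h), normalized to [0, 1]
    score: float      # assumed confidence in [0, 1]

    def triplet(self) -> Tuple[str, str, str]:
        # The recognition part of the output: <instrument, verb, target>.
        return (self.instrument, self.verb, self.target)


# Illustrative values only, not taken from the challenge data.
det = TripletDetection("grasper", "retract", "gallbladder",
                       (0.42, 0.31, 0.20, 0.15), 0.87)
print(det.triplet())
```

Keeping the box attached to the instrument reflects the description above, where the tool is the localized actor and the verb and target describe its activity.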

Authors

L. Maier-Hein

N. Navab

References (102 items)

Forest Graph Convolutional Network for Surgical Action Triplet Recognition in Endoscopic Videos

Computer vision in surgery: from potential to clinical value

Why Deep Surgical Models Fail?: Revisiting Surgical Action Triplet Recognition through the Lens of Robustness

AutoLaparo: A New Dataset of Integrated Multi-tasks for Image-guided Surgical Automation in Laparoscopic Hysterectomy

Data Splits and Metrics for Method Benchmarking on Surgical Action Triplet Datasets

CholecTriplet2021: A benchmark challenge for surgical action triplet recognition

SIRNet: Fine-Grained Surgical Interaction Recognition

MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection

Efficient Two-Stage Detection of Human-Object Interactions with a Novel Unary-Pairwise Transformer

Swin Transformer V2: Scaling Up Capacity and Resolution

2020 CATARACTS Semantic Segmentation Challenge

Comparative Validation of Machine Learning Algorithms for Surgical Workflow and Skill Analysis with the HeiChole Benchmark

Rendezvous: Attention Mechanisms for the Recognition of Surgical Action Triplets in Endoscopic Videos

Video Swin Transformer

Shallow Feature Matters for Weakly Supervised Object Localization

Emerging Properties in Self-Supervised Vision Transformers

The SARAS Endoscopic Surgeon Action Detection (ESAD) dataset: Challenges and methods

Learning Domain Adaptation with Model Calibration for Surgical Report Generation in Robotic Surgery

Temporal Memory Relation Network for Workflow Recognition From Surgical Video

MIcro-Surgical Anastomose Workflow recognition challenge report

Trans-SVNet: Accurate Phase Recognition from Surgical Videos via Hybrid Embedding Aggregation Transformer

QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information

End-to-End Human Object Interaction Detection with HOI Transformer

OperA: Attention-Regularized Transformers for Surgical Phase Recognition

Endoscopic Vision Challenge 2021

Surgical Visual Domain Adaptation: Results from the MICCAI 2020 SurgVisDom Challenge

Multi-task temporal convolutional networks for joint recognition of surgical phases and steps in gastric bypass procedures

Is Space-Time Attention All You Need for Video Understanding?

CholecSeg8k: A Semantic Segmentation Dataset for Laparoscopic Cholecystectomy Based on Cholec80

Attention-Driven Dynamic Graph Convolutional Network for Multi-label Image Recognition

Comparative validation of multi-instance instrument segmentation in endoscopy: Results of the ROBUST-MIS 2019 challenge

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

m2caiSeg: Semantic Segmentation of Laparoscopic Images using Convolutional Neural Networks

Proposing novel methods for gynecologic surgical action recognition on laparoscopic videos

Recognition of Instrument-Tissue Interactions in Endoscopic Videos via Action Triplets

Heidelberg colorectal data set for surgical data science in the sensor operating room

TeCNO: Surgical Phase Recognition with Multi-Stage Temporal Convolutional Networks

MOT20: A benchmark for multi object tracking in crowded scenes

Endoscopic Vision Challenge

Assisted phase and step annotation for surgical videos

2018 Robotic Scene Segmentation Challenge

CAI4CAI: The Rise of Contextual Artificial Intelligence in Computer-Assisted Interventions

Methods and open-source toolkit for analyzing and visualizing challenge results

BIAS: Transparent reporting of biomedical image analysis challenges

CaDIS: Cataract Dataset for Image Segmentation

2017 Robotic Instrument Segmentation Challenge

MOTS: Multi-Object Tracking and Segmentation

CATARACTS: Challenge on automatic tool annotation for cataRACT surgery

SlowFast Networks for Video Recognition

Weakly supervised convolutional LSTM approach for tool tracking in laparoscopic videos

Learning from a tiny dataset of manual annotations: a teacher/student approach for surgical phase recognition

Learning Human-Object Interactions by Graph Parsing Neural Networks

Toward a standard ontology of surgical process models

Monitoring tool usage in surgery videos using boosted convolutional and recurrent neural networks

Temporal coherence-based self-supervised learning for laparoscopic workflow analysis

Weakly-Supervised Learning for Tool Localization in Laparoscopic Videos

SV-RCNet: Workflow Recognition From Surgical Videos Using Recurrent Convolutional Network

Tool Detection and Operative Skill Assessment in Surgical Videos Using Region-Based Convolutional Neural Networks

Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification

Attention is All you Need

AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

Detecting and Recognizing Human-Object Interactions

Learning to Detect Human-Object Interactions

The TUM LapChole dataset for the M2CAI 2016 workflow challenge

Learning Models for Actions and Person-Object Interactions with Transfer to Question Answering

Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding

The Cityscapes Dataset for Semantic Urban Scene Understanding

Object Skeleton Extraction in Natural Images by Fusing Scale-Associated Deep Side Outputs

EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos

Deep Residual Learning for Image Recognition

HICO: A Benchmark for Recognizing Human-Object Interactions in Images

The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS)

Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting

LapOntoSPM: an ontology for laparoscopic surgeries and its application to surgical phase recognition

Visual Semantic Role Labeling

Automatic phase prediction from low-level surgical activities

A Novel Performance Evaluation Methodology for Single-Target Trackers

Learning Spatiotemporal Features with 3D Convolutional Networks

Long-term recurrent convolutional networks for visual recognition and description

ImageNet Large Scale Visual Recognition Challenge

Knowledge-Driven Formalization of Laparoscopic Surgeries for Rule-Based Intraoperative Context-Aware Assistance

Large-Scale Video Classification with Convolutional Neural Networks

Two-Stream Convolutional Networks for Action Recognition in Videos

Microsoft COCO: Common Objects in Context

Surgical process modelling: a review

UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild

Linking Top-Level Ontologies and Surgical Workflows

Structured recording of intraoperative surgical workflows

Towards automatic skill evaluation: Detection and segmentation of robot-assisted surgical motions

Deliberate Perioperative Systems Design Improves Operating Room Throughput

ENT-surgical workflow as an instrument to assess the efficiency of technological developments in medicine

The 2005 PASCAL Visual Object Classes Challenge

Challenge!

Instrument-tissue Interaction Quintuple Detection in Surgery Videos

VisDrone-MOT2020: The Vision Meets Drone Multiple Object Tracking Challenge Results

OR 2.0 Context-Aware Operating Theaters, Computer Assisted Robotic Endoscopy, Clinical Image-Based Procedures, and Skin Image Analysis

Action Recognition in Realistic Sports Videos

JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS): A Surgical Activity Dataset for Human Motion Modeling

Motif Discovery in OR Sensor Data with Application to Surgical Workflow Analysis and Activity Detection

Recognition of the Surgeon's Motions During Endoscopic Operation by Statistics Based Algorithm and Neural Networks Based ANARX Models

The PASCAL Visual Object Classes (VOC) Challenge