Paul Maria Scheikl

I am an applied scientist at Amazon Robotics in Berlin, working on the Vulcan project. I was a postdoc at Johns Hopkins University working under the guidance of Axel Krieger in the Laboratory for Computational Sensing and Robotics (LCSR). I received my PhD (Dr.-Ing.) from the Department of Artificial Intelligence in Biomedical Engineering at the Friedrich-Alexander University Erlangen-Nürnberg in Germany, where I was advised by Franziska Mathis-Ullrich.

Email  /  Scholar  /  Github  /  Doctoral Thesis

Research

My research interests lie at the intersection of robotics, computer vision, and machine learning. I am particularly interested in learning visuomotor policies for complex object manipulation tasks. During my doctoral studies, I worked on imitation learning, reinforcement learning, semantic segmentation, deformable object simulation, and sim-to-real transfer.

ImitateCholec: A Multimodal Dataset for Long-Horizon Imitation Learning in Robotic Cholecystectomy
Pascal Hansen, Ji Woong Brian Kim, Antony Goldenberg, Juo-Tung Chen, Yuanzhe Amos Li, Anton Deguet, Brandon White, De Ru Tsai, Richard Cha, Jeffrey Jopling, Paul Maria Scheikl, Axel Krieger
Scientific Data, 2026
code / data

Publicly available multimodal dataset for imitation learning of long-horizon surgical workflows. Provides over 18,000 demonstrations from 34 ex-vivo porcine cholecystectomies (≈20 hours), segmented into 17 distinct tasks during the clipping and cutting phase. Combines endoscopic video from multiple viewpoints with da Vinci Research Kit kinematics, and includes both optimal executions and recovery maneuvers for robust error recovery.

Surgical Gaussian Surfels: Highly Accurate Real-time Surgical Scene Rendering using Gaussian Surfels
Idris O. Sunmola, Zhenjun Zhao, Samuel Schmidgall, Yumeng Wang, Paul Maria Scheikl, Viet Pham, Axel Krieger
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2026
code

Real-time reconstruction of deformable tissue from monocular endoscopic video. Constrains 3D Gaussian primitives along the view-aligned axis to form surface-aligned elliptical splats, which improves tool-occlusion handling and preserves fine anatomical details better than prior 3DGS- and NeRF-based methods. A lightweight Fully Fused Deformation MLP predicts surfel motion fields up to 5x faster than a standard MLP.

SRT-H: A hierarchical framework for autonomous surgery via language-conditioned imitation learning
Ji Woong Kim, Juo-Tung Chen, Pascal Hansen, Lucy Xiaoyang Shi, Antony Goldenberg, Samuel Schmidgall, Paul Maria Scheikl, Anton Deguet, Brandon M White, De Ru Tsai, Richard Jaepyeong Cha, Jeffrey Jopling, Chelsea Finn, Axel Krieger
Science Robotics, 2025

Learns a hierarchical policy for autonomous surgery using language-conditioned imitation learning. The high-level policy receives image data and produces a natural language instruction. The low-level policy receives the language instruction and image data and generates robotic motions. The framework is evaluated on the da Vinci Research Kit on ex-vivo porcine organs for laparoscopic cholecystectomy.

Point Cloud Segmentation for Autonomous Clip Positioning in Laparoscopic Cholecystectomy on a Phantom
Balázs Gyenes, Nikolai Franke, Paul Maria Scheikl, Pit Henrich, Rayan Younis, Gerhard Neumann, Martin Wagner, Franziska Mathis-Ullrich
IEEE Robotics and Automation Letters (RA-L), 2025
code

Presents the first robotic system to demonstrate autonomous clip positioning on a physical phantom in laparoscopic surgery. The system segments a colorless point cloud from a single camera, extracts target positions for the clips using spline interpolation, and allows human operators to adjust the targets. The segmentation model is trained on only 60 hand-labeled real point clouds, reflecting data scarcity in the surgical domain. The system achieves a localization precision of 0.75 mm at a 95% success rate and executes autonomous clip positioning with a 100% success rate.

Autonomous Vision-Guided Resection of Central Airway Obstruction
Mariana E. Smith, Nural Yilmaz, Tanner Watts, Paul Maria Scheikl, Jiawei Ge, Anton Deguet, Alan Kuntz, Axel Krieger
Journal of Medical Robotics Research (JMRR), 2025

Vision-guided, autonomous approach for palliative resection of tracheal tumors. Models the tracheal surface with a fifth-degree polynomial to plan tool trajectories. A segmentation pipeline identifies the trachea and tumor boundaries.

LUDO: Low-Latency Understanding of Deformable Objects Using Point Cloud Occupancy Functions
Pit Henrich, Franziska Mathis-Ullrich, Paul Maria Scheikl
IEEE Transactions on Robotics (T-RO), 2025

Reconstructs a complete deformable object, including deformed internal structures, from a single point cloud observation. We use the reconstructed object to plan a robotic path to puncture internal regions of interest. Deformable object reconstruction eliminates the need for deformable object registration!

From Monocular Vision to Autonomous Action: Guiding Tumor Resection via 3D Reconstruction
Ayberk Acar, Mariana Smith, Lidia Al-Zogbi, Tanner Watts, Fangjie Li, Hao Li, Nural Yilmaz, Paul Maria Scheikl, Jesse F. d'Almeida, Susheela Sharma, Lauren Branscombe, Tayfun Efe Ertop, Robert J. Webster III, Ipek Oguz, Alan Kuntz, Axel Krieger, Jie Ying Wu
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025

A 3D mapping pipeline that uses only RGB images to create segmented point clouds of the target anatomy, removing the need for bulky depth cameras in space-limited surgical settings. Compares structure-from-motion algorithms for mapping central airway obstructions and shows that monocular reconstruction can match -- and in some metrics exceed -- RGB-D performance in a downstream tumor resection task.

SurgiPose: Estimating Surgical Tool Kinematics from Monocular Video for Surgical Robot Learning
Juo-Tung Chen, XinHao Chen, Ji Woong Kim, Paul Maria Scheikl, Richard Jaepyeong Cha, Axel Krieger
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025

Differentiable rendering approach to estimate surgical tool kinematics from monocular video, eliminating the need for ground-truth kinematic recordings. Tool trajectories and joint angles are inferred by minimizing the discrepancy between rendered and observed images. Imitation learning policies trained on estimated kinematics achieve success rates comparable to those trained on measured kinematics, opening a path to large-scale learning from online surgical videos.

Integrating Learned Retraction and Model-Based Resection Planning for Automated Central Airway Obstruction Removal
Tanner Watts, Mariana Smith, James Ferguson, Nural Yilmaz, Paul Maria Scheikl, Bao Thach, Brendan Burkhart, Tucker Hermans, Anton Deguet, Axel Krieger, Alan Kuntz
IEEE Transactions on Medical Robotics and Bionics (T-MRB), 2025

Couples a learned tissue retraction policy with a model-based resection planner for automated tumor removal in central airway obstruction surgery. The retraction network exposes the surgical target so the planner can compute resection trajectories on the visible tissue.

Segmentation of Image Observations for Articulated Endoscope Control
Pooja Rao, Paul Maria Scheikl, Georg Rauter, Pit Henrich, Franziska Mathis-Ullrich
Current Directions in Biomedical Engineering (CDBE), 2025

Image segmentation that produces visual observations to support feedback control of an articulated endoscope.

Lens Capsule Tearing in Cataract Surgery using Reinforcement Learning
Rebekka Charlotte Peter, Steffen Peikert, Ludwig Haide, Doan Xuan Viet Pham, Tahar Chettaoui, Eleonora Tagliabue, Paul Maria Scheikl, Johannes Fauser, Matthias Hillenbrand, Gerhard Neumann, Franziska Mathis-Ullrich
IEEE International Conference on Robotics and Automation (ICRA), 2024

Demonstrates simulation and policy learning of lens capsule tearing in cataract surgery using reinforcement learning.

Movement Primitive Diffusion: Learning Gentle Robotic Manipulation of Deformable Objects
Paul Maria Scheikl, Nicolas Schreiber, Christoph Haas, Niklas Freymuth, Gerhard Neumann, Rudolf Lioutikov, Franziska Mathis-Ullrich
IEEE Robotics and Automation Letters (RA-L), 2024
code / website

Combines the versatility of diffusion-based imitation learning with the high-quality motion generation capabilities of Probabilistic Dynamic Movement Primitives. Achieves gentle manipulation of deformable objects, while maintaining data efficiency critical for surgical applications where demonstration data is scarce. Evaluated in simulation and on real robotic hardware.

Registered and Segmented Deformable Object Reconstruction From a Single View Point Cloud
Pit Henrich, Balázs Gyenes, Paul Maria Scheikl, Gerhard Neumann, Franziska Mathis-Ullrich
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024

3D reconstruction and segmentation of deformable objects from a single view point cloud. Also introduces a simple sampling algorithm to generate better training data for occupancy learning.

A surgical activity model of laparoscopic cholecystectomy for co-operation with collaborative robots
Rayan Younis, Amine Yamlahi, Sebastian Bodenstedt, Paul Maria Scheikl, Anna Kisilenko, Marie Daum, André Schulze, Philipp A. Wise, Felix Nickel, Franziska Mathis-Ullrich, Lena Maier-Hein, Beat P. Müller-Stich, Stefanie Speidel, Marius Distler, Jürgen Weitz, Martin Wagner
Surgical Endoscopy, 2024

Defines a surgical process model that captures intraoperative activities -- actor, instrument, action, target -- in laparoscopic cholecystectomy. Activities, instrument presence, and surgical phases are annotated across 20 videos of human and ex-vivo porcine procedures (~25 hours). A Distilled-Swin transformer is trained on the dataset for automatic action triplet recognition, providing a foundation for context-aware collaborative surgical robots.

LapGym - An Open Source Framework for Reinforcement Learning in Robot-Assisted Laparoscopic Surgery
Paul Maria Scheikl, Balázs Gyenes, Rayan Younis, Christoph Haas, Gerhard Neumann, Martin Wagner, Franziska Mathis-Ullrich
Journal of Machine Learning Research (JMLR), 2023
code: lap_gym / sofa_env

Reinforcement learning framework for robot-assisted laparoscopic surgery. Builds on the open-source, fast, interactive FEM simulation backend SOFA. Supports deformable object manipulation, topological changes (cutting), grasping, and multiple visual observation modalities (RGB, depth, segmentation, point clouds).

Grounding Graph Network Simulators using Physical Sensor Observations
Jonas Linkerhägner, Niklas Freymuth, Paul Maria Scheikl, Franziska Mathis-Ullrich, Gerhard Neumann
International Conference on Learning Representations (ICLR), 2023

Integrates sensory information to ground Graph Network Simulators in real-world observations. Predicts the mesh state of deformable objects from point cloud data.

Sim-to-Real Transfer for Visual Reinforcement Learning of Deformable Object Manipulation for Robot-Assisted Surgery
Paul Maria Scheikl, Eleonora Tagliabue, Balázs Gyenes, Martin Wagner, Diego Dall'Alba, Paolo Fiorini, Franziska Mathis-Ullrich
IEEE Robotics and Automation Letters (RA-L), 2022

Trains a visuomotor policy for deformable object manipulation in simulation with reinforcement learning, then transfers the policy to the real world on the da Vinci Research Kit using unpaired image-to-image translation.

Cooperative assistance in robotic surgery through multi-agent reinforcement learning
Paul Maria Scheikl, Balázs Gyenes, Tornike Davitashvili, Rayan Younis, André Schulze, Beat P Müller-Stich, Gerhard Neumann, Martin Wagner, Franziska Mathis-Ullrich
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021

Learns decentralized policies with multi-agent reinforcement learning, without a human in the loop during training. Evaluates the learned policies in cooperation with a human surgeon for deformable object manipulation.

Deep learning for semantic segmentation of organs and tissues in laparoscopic surgery
Paul Maria Scheikl, Stefan Laschewski, Anna Kisilenko, Tornike Davitashvili, Benjamin Müller, Manuela Capek, Beat P Müller-Stich, Martin Wagner, Franziska Mathis-Ullrich
Current Directions in Biomedical Engineering (CDBE), 2020

Evaluates several architectures and training strategies for semantic segmentation of organs and tissues in laparoscopic surgery.


This website is based on code from Jon Barron.