Robotics 55
☆ Transformer-based Heuristic for Advanced Air Mobility Planning ATC
Safety is extremely important for urban flights of autonomous Unmanned Aerial
Vehicles (UAVs). Risk-aware path planning is one of the most effective methods
to guarantee the safety of UAVs. This type of planning can be represented as a
Constrained Shortest Path (CSP) problem, which seeks to find the shortest route
that meets a predefined safety constraint. Solving CSP problems is NP-hard,
presenting significant computational challenges. Although traditional methods
can accurately solve CSP problems, they tend to be very slow. Previously, we
introduced an additional safety dimension to the traditional A* algorithm,
known as ASD A*, to effectively handle CSP problems. Then, we developed a
custom learning-based heuristic using
transformer-based neural networks, which significantly reduced computational
load and enhanced the performance of the ASD A* algorithm. In this paper, we
expand our dataset to include more risk maps and tasks, improve the proposed
model, and increase its performance. We also introduce a new heuristic strategy
and a novel neural network, which enhance the overall effectiveness of our
approach.
comment: 2024 AIAA DATC/IEEE 43rd Digital Avionics Systems Conference (DASC)
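The idea of adding a safety dimension to a shortest-path search can be
sketched as follows. This is a minimal illustration, not the authors' ASD A*
implementation: the graph format, the function name
constrained_shortest_path, and the per-edge risk values are all assumptions
made for the example, and the default zero heuristic stands in for the
learned heuristic described in the abstract.

```python
import heapq


def constrained_shortest_path(graph, start, goal, risk_budget, heuristic=None):
    """Shortest path whose accumulated risk stays within risk_budget.

    graph: {node: [(neighbor, length, risk), ...]}  (hypothetical format)
    Returns (length, path) or None if no feasible path exists.
    """
    # A zero heuristic is always admissible; a learned one could be passed in.
    h = heuristic or (lambda node: 0.0)
    # Each search state carries accumulated risk as an extra dimension.
    frontier = [(h(start), 0.0, 0.0, start, (start,))]
    labels = {}  # node -> list of expanded (length, risk) labels
    while frontier:
        _, length, risk, node, path = heapq.heappop(frontier)
        if node == goal:
            return length, list(path)
        # Skip states dominated by a label that is no longer and no riskier.
        if any(l <= length and r <= risk for l, r in labels.get(node, [])):
            continue
        labels.setdefault(node, []).append((length, risk))
        for nxt, edge_len, edge_risk in graph.get(node, []):
            new_risk = risk + edge_risk
            if new_risk > risk_budget:
                continue  # violates the safety constraint
            new_len = length + edge_len
            heapq.heappush(
                frontier,
                (new_len + h(nxt), new_len, new_risk, nxt, path + (nxt,)),
            )
    return None


# Toy map: the short route A-B-D is risky, the long route A-C-D is safe.
toy_graph = {
    "A": [("B", 1, 5), ("C", 2, 1)],
    "B": [("D", 1, 5)],
    "C": [("D", 2, 1)],
    "D": [],
}
```

With a loose budget the search returns the short, risky route; tightening the
budget forces the longer, safer one, and an infeasible budget yields None.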
☆ Resolving Multiple-Dynamic Model Uncertainty in Hypothesis-Driven Belief-MDPs AAMAS 2025
When human operators of cyber-physical systems encounter surprising behavior,
they often consider multiple hypotheses that might explain it. In some cases,
taking information-gathering actions such as additional measurements or control
inputs given to the system can help resolve uncertainty and determine the most
accurate hypothesis. The task of optimizing these actions can be formulated as
a belief-space Markov decision process that we call a hypothesis-driven belief
MDP. Unfortunately, this problem suffers from the curse of history similar to a
partially observable Markov decision process (POMDP). To plan in continuous
domains, an agent needs to reason over countless possible
action-observation histories, each resulting in a different belief over the
unknown state. The problem is exacerbated in the hypothesis-driven context
because each action-observation pair spawns a different belief for each
hypothesis, leading to additional branching. This paper considers the case in
which each hypothesis corresponds to a different dynamic model in an underlying
POMDP. We present a new belief MDP formulation that: (i) enables reasoning over
multiple hypotheses, (ii) balances the goals of determining the (most likely)
correct hypothesis and performing well in the underlying POMDP, and (iii) can
be solved with sparse tree search.
comment: 8 pages, 4 figures, submitted to AAMAS 2025
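The core step of resolving uncertainty between hypotheses can be illustrated
with a plain Bayesian update over a discrete set of hypotheses. This is a
simplified sketch, not the paper's belief-MDP formulation: it ignores the
per-hypothesis state belief, and the function name and the example
likelihood values are assumptions.

```python
def update_hypothesis_belief(prior, likelihoods):
    """Bayes update over hypotheses.

    prior:       {hypothesis: P(h)}
    likelihoods: {hypothesis: P(observation | h, action)}  (assumed given)
    """
    unnormalized = {h: prior[h] * likelihoods[h] for h in prior}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under every hypothesis")
    return {h: w / total for h, w in unnormalized.items()}


# Two hypothetical dynamic models; the observation is 4x more likely under "fault".
posterior = update_hypothesis_belief(
    {"nominal": 0.5, "fault": 0.5},
    {"nominal": 0.2, "fault": 0.8},
)
```

An information-gathering action is useful to the extent that its possible
observations have very different likelihoods under the competing hypotheses.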
☆ Landing Trajectory Prediction for UAS Based on Generative Adversarial Network SC
Models for trajectory prediction are an essential component of many advanced
air mobility studies. These models help aircraft detect conflicts and plan
avoidance maneuvers, which is especially important in Unmanned Aircraft Systems
(UAS) landing management due to the congested airspace near vertiports. In this
paper, we propose a landing trajectory prediction model for UAS based on a
Generative Adversarial Network (GAN). The GAN is a well-established neural
network architecture that has been developed over many years. In previous
research, GANs have achieved state-of-the-art results in many generation tasks.
A GAN consists of a neural network generator and a neural network
discriminator. Because of the learning capacity of the neural networks, the
generator is capable of learning the features of the sample trajectories. The
generator takes the previous trajectory as input and outputs a randomly sampled
flight status. According to the experimental results, the proposed model
outputs more accurate predictions than the baseline method (GMR) on various
datasets. To evaluate the proposed model, we also create a real UAV landing
dataset that includes more than 2600 trajectories of drones controlled manually
by real pilots.
comment: 9 pages, AIAA SCITECH 2023
☆ 23 DoF Grasping Policies from a Raw Point Cloud ICRA
Coordinating the motion of robots with high degrees of freedom (DoF) to grasp
objects gives rise to many challenges. In this paper, we propose a novel
imitation learning approach to learn a policy that directly predicts 23 DoF
grasp trajectories from a partial point cloud provided by a single, fixed
camera. At the core of the approach is a second-order geometric-based model of
behavioral dynamics. This Neural Geometric Fabric (NGF) policy predicts
accelerations directly in joint space. We show that our policy is capable of
generalizing to novel objects, and combine our policy with a geometric fabric
motion planner in a loop to generate stable grasping trajectories. We evaluate
our approach on a set of three different objects, compare different policy
structures, and run ablation studies to understand the importance of different
object encodings for policy learning.
comment: IEEE International Conference on Robotics and Automation (ICRA)
Workshop on Geometric Representations 2023
☆ Learning Humanoid Locomotion with Perceptive Internal Model ICRA2025
In contrast to quadruped robots that can navigate diverse terrains using a
"blind" policy, humanoid robots require accurate perception for stable
locomotion due to their high degrees of freedom and inherently unstable
morphology. However, incorporating perceptual signals often introduces
additional disturbances to the system, potentially reducing its robustness,
generalizability, and efficiency. This paper presents the Perceptive Internal
Model (PIM), which relies on onboard, continuously updated elevation maps
centered around the robot to perceive its surroundings. We train the policy
using ground-truth obstacle heights surrounding the robot in simulation,
optimizing it based on the Hybrid Internal Model (HIM), and perform inference
with heights sampled from the constructed elevation map. Unlike previous
methods that directly encode depth maps or raw point clouds, our approach
allows the robot to perceive the terrain beneath its feet clearly and is less
affected by camera movement or noise. Furthermore, since depth map rendering is
not required in simulation, our method introduces minimal additional
computational costs and can train the policy in 3 hours on an RTX 4090 GPU. We
verify the effectiveness of our method across various humanoid robots, various
indoor and outdoor terrains, stairs, and various sensor configurations. Our
method can enable a humanoid robot to continuously climb stairs and has the
potential to serve as a foundational algorithm for the development of future
humanoid control methods.
comment: submitted to ICRA2025
☆ ETA-IK: Execution-Time-Aware Inverse Kinematics for Dual-Arm Systems
This paper presents ETA-IK, a novel Execution-Time-Aware Inverse Kinematics
method tailored for dual-arm robotic systems. The primary goal is to optimize
motion execution time by leveraging the redundancy of both arms, specifically
in tasks where only the relative pose of the robots is constrained, such as
dual-arm scanning of unknown objects. Unlike traditional inverse kinematics
methods that use surrogate metrics such as joint configuration distance, our
method incorporates direct motion execution time and implicit collisions into
the optimization process, thereby finding target joints that allow subsequent
trajectory generation to produce more efficient, collision-free motion. A
neural-network-based execution time approximator is employed to predict
time-efficient
joint configurations while accounting for potential collisions. Through
experimental evaluation on a system composed of a UR5 and a KUKA iiwa robot, we
demonstrate significant reductions in execution time. The proposed method
outperforms conventional approaches, showing improved motion efficiency without
sacrificing positioning accuracy. These results highlight the potential of
ETA-IK to improve the performance of dual-arm systems in applications where
efficiency and safety are paramount.
☆ Cross-layer Formal Verification of Robotic Systems
Robotic systems are widely used to interact with humans or to perform
critical tasks. As a result, it is imperative to provide guarantees about their
behavior. Due to the modularity and complexity of robotic systems, their design
and verification are often divided into several layers. However, some system
properties can only be investigated by considering multiple layers
simultaneously. We propose a cross-layer verification method to verify the
expected properties of concrete robotic systems. Our method verifies one layer
using abstractions of other layers. We propose two approaches: refining the
models of the abstract layers and refining the property under verification. A
combination of these two approaches seems to be the most promising to ensure
model genericity and to avoid the state-space explosion problem.
comment: In Proceedings FMAS2024, arXiv:2411.13215
☆ Synthesising Robust Controllers for Robot Collectives with Recurrent Tasks: A Case Study
When designing correct-by-construction controllers for autonomous
collectives, three key challenges are the task specification, the modelling,
and its use at practical scale. In this paper, we focus on a simple yet useful
abstraction for high-level controller synthesis for robot collectives with
optimisation goals (e.g., maximum cleanliness, minimum energy consumption) and
recurrence (e.g., re-establish contamination and charge thresholds) and safety
(e.g., avoid full discharge, mutually exclusive room occupation) constraints.
Due to technical limitations (related to scalability and using constraints in
the synthesis), we simplify our graph-based setting from a stochastic
two-player game into a single-player game on a partially observable Markov
decision process (POMDP). Robustness against environmental uncertainty is
encoded via partial observability. Linear-time correctness properties are
verified separately after synthesising the POMDP strategy. We contribute
at-scale guidance on POMDP modelling and controller synthesis for tasked robot
collectives exemplified by the scenario of battery-driven robots responsible
for cleaning public buildings with utilisation constraints.
comment: In Proceedings FMAS2024, arXiv:2411.13215
☆ Model Checking and Verification of Synchronisation Properties of Cobot Welding
This paper describes the use of model checking to verify synchronisation
properties of an industrial welding system consisting of a cobot arm and an
external turntable. The robots must move synchronously, but sometimes get out
of synchronisation, giving rise to unsatisfactory weld qualities in problem
areas, such as around corners. These mistakes are costly, since time is lost
both in the robotic welding and in manual repairs needed to improve the weld.
Verification of the synchronisation properties has shown that they are
fulfilled as long as assumptions of correctness made about parts outside the
scope of the model hold, indicating limitations in the hardware. These results
have indicated the source of the problem, and motivated a re-calibration of the
real-life system. This has drastically improved the welding results, and is a
demonstration of how formal methods can be useful in an industrial setting.
comment: In Proceedings FMAS2024, arXiv:2411.13215
☆ ROSMonitoring 2.0: Extending ROS Runtime Verification to Services and Ordered Topics
Formal verification of robotic applications presents challenges due to their
hybrid nature and distributed architecture. This paper introduces ROSMonitoring
2.0, an extension of ROSMonitoring designed to facilitate the monitoring of
both topics and services while considering the order in which messages are
published and received. The framework has been enhanced to support these novel
features for ROS1 environments, and partially for ROS2 environments, offering improved
real-time support, security, scalability, and interoperability. We discuss the
modifications made to accommodate these advancements and present results
obtained from a case study involving the runtime monitoring of specific
components of a fire-fighting Uncrewed Aerial Vehicle (UAV).
comment: In Proceedings FMAS2024, arXiv:2411.13215
☆ InCrowd-VI: A Realistic Visual-Inertial Dataset for Evaluating SLAM in Indoor Pedestrian-Rich Spaces for Human Navigation
Simultaneous localization and mapping (SLAM) techniques can be used to help
visually impaired people navigate, but the development of robust SLAM solutions
for crowded spaces is limited by the lack of realistic datasets. To address
this, we introduce InCrowd-VI, a novel visual-inertial dataset specifically
designed for human navigation in indoor pedestrian-rich environments. Recorded
using Meta Aria Project glasses, it captures realistic scenarios without
environmental control. InCrowd-VI features 58 sequences totaling 5 km of
trajectory length and 1.5 hours of recording time, including RGB, stereo
images, and IMU measurements. The dataset captures important challenges such as
pedestrian occlusions, varying crowd densities, complex layouts, and lighting
changes. Ground-truth trajectories, accurate to approximately 2 cm, are
provided in the dataset, originating from the Meta Aria project machine
perception SLAM service. In addition, a semi-dense 3D point cloud of scenes is
provided for each sequence. The evaluation of state-of-the-art visual odometry
(VO) and SLAM algorithms on InCrowd-VI revealed severe performance limitations
in these realistic scenarios, demonstrating the need and value of the new
dataset to advance SLAM research for visually impaired navigation in complex
indoor environments.
comment: 18 pages, 7 figures, 5 tables
☆ Convex Approximation of Probabilistic Reachable Sets from Small Samples Using Self-supervised Neural Networks
Probabilistic Reachable Sets (PRS) play a crucial role in many fields of
autonomous systems, yet efficiently generating PRS remains a significant
challenge. This paper presents a learning approach to generating 2-dimensional
PRS for states in a dynamic system. Traditional methods such as Hamilton-Jacobi
reachability analysis, Monte Carlo, and Gaussian process classification face
significant computational challenges or require detailed dynamics information,
limiting their applicability in realistic situations. Existing data-driven
methods may lack accuracy. To overcome these limitations, we propose leveraging
neural networks, commonly used in imitation learning and computer vision, to
imitate expert methods to generate PRS approximations. We trained the neural
networks using a multi-label, self-supervised learning approach. We selected
the fine-tuned convex approximation method as the expert to create expert PRS.
Additionally, we continued sampling from the distribution to obtain a diverse
array of sample sets. Given a small sample set, the trained neural networks can
replicate the PRS approximation generated by the expert method, while the
generation speed is much faster.
comment: 10 pages
☆ SplatR: Experience Goal Visual Rearrangement with 3D Gaussian Splatting and Dense Feature Matching
The Experience Goal Visual Rearrangement task stands as a foundational challenge
within Embodied AI, requiring an agent to construct a robust world model that
accurately captures the goal state. The agent uses this world model to restore
a shuffled scene to its original configuration, making an accurate
representation of the world essential for successfully completing the task. In
this work, we present a novel framework that leverages 3D Gaussian Splatting
as a 3D scene representation for the experience goal visual rearrangement task.
Recent advances in volumetric scene representation, such as 3D Gaussian
Splatting, offer fast rendering of high-quality, photo-realistic novel views. Our
approach enables the agent to have consistent views of the current and the goal
setting of the rearrangement task, which enables the agent to directly compare
the goal state and the shuffled state of the world in image space. To compare
these views, we propose to use a dense feature matching method with visual
features extracted from a foundation model, leveraging its advantages of a more
universal feature representation, which facilitates robustness and
generalization. We validate our approach on the AI2-THOR rearrangement
challenge benchmark and demonstrate improvements over the current
state-of-the-art methods.
☆ Continual Learning and Lifting of Koopman Dynamics for Linear Control of Legged Robots
The control of legged robots, particularly humanoid and quadruped robots,
presents significant challenges due to their high-dimensional and nonlinear
dynamics. While linear systems can be effectively controlled using methods like
Model Predictive Control (MPC), the control of nonlinear systems remains
complex. One promising solution is the Koopman Operator, which approximates
nonlinear dynamics with a linear model, enabling the use of proven linear
control techniques. However, achieving accurate linearization through
data-driven methods is difficult due to issues like approximation error, domain
shifts, and the limitations of fixed linear state-space representations. These
challenges restrict the scalability of Koopman-based approaches. This paper
addresses these challenges by proposing a continual learning algorithm designed
to iteratively refine Koopman dynamics for high-dimensional legged robots. The
key idea is to progressively expand the dataset and latent space dimension,
enabling the learned Koopman dynamics to converge towards accurate
approximations of the true system dynamics. Theoretical analysis shows that the
linear approximation error of our method converges monotonically. Experimental
results demonstrate that our method achieves high control performance on robots
like Unitree G1/H1/A1/Go2 and ANYmal D, across various terrains using simple
linear MPC controllers. This work is the first to successfully apply linearized
Koopman dynamics for locomotion control of high-dimensional legged robots,
enabling a scalable model-based control solution.
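For readers unfamiliar with the construction, a generic lifted linear model of
the kind used in Koopman-based control can be written as below. This is a
textbook-style sketch rather than the authors' specific formulation; the
lifting map \phi, the matrices A, B, C, and the dataset of N transitions are
illustrative assumptions.

```latex
\begin{aligned}
z_k &= \phi(x_k), \qquad z_k \in \mathbb{R}^{n_z},\; n_z \ge n_x \\
z_{k+1} &\approx A z_k + B u_k, \qquad \hat{x}_k = C z_k \\
(A, B) &= \arg\min_{A,B} \sum_{k=0}^{N-1}
  \left\| \phi(x_{k+1}) - A\,\phi(x_k) - B\,u_k \right\|_2^2
\end{aligned}
```

Because the lifted dynamics are linear in z_k, a standard linear MPC can be
applied to them; expanding the dataset (larger N) and the latent dimension
(larger n_z) are the two quantities the abstract describes growing over time.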
☆ Soft Manipulation Surface With Reduced Actuator Density For Heterogeneous Object Manipulation
Object manipulation in robotics faces challenges due to diverse object
shapes, sizes, and fragility. Gripper-based methods offer precision and low
degrees of freedom (DOF), but the gripper limits the kinds of objects that can
be grasped.
On the other hand, surface-based approaches provide flexibility for handling
fragile and heterogeneous objects but require numerous actuators, increasing
complexity. We propose new manipulation hardware that utilizes equally spaced
linear actuators placed vertically and connected by a soft surface. In this
setup, object manipulation occurs on the soft surface through coordinated
movements of the surrounding actuators. This approach requires fewer actuators
to cover a large manipulation area, offering a cost-effective solution with a
lower DOF compared to dense actuator arrays. It also effectively handles
heterogeneous objects of varying shapes and weights, even when they are
significantly smaller than the distance between actuators. This method is
particularly suitable for managing highly fragile objects in the food industry.
☆ Generalizing End-To-End Autonomous Driving In Real-World Environments Using Zero-Shot LLMs
Traditional autonomous driving methods adopt a modular design, decomposing
tasks into sub-tasks. In contrast, end-to-end autonomous driving directly
outputs actions from raw sensor data, avoiding error accumulation. However,
training an end-to-end model requires a comprehensive dataset; otherwise, the
model exhibits poor generalization capabilities. Recently, large language
models (LLMs) have been applied to enhance the generalization capabilities of
end-to-end driving models. Most studies explore LLMs in an open-loop manner,
where the output actions are compared to those of experts without direct
feedback from the real world, while others examine closed-loop results only in
simulations. This paper proposes an efficient architecture that integrates
multimodal LLMs into end-to-end driving models operating in closed-loop
settings in real-world environments. In our architecture, the LLM periodically
processes raw sensor data to generate high-level driving instructions,
effectively guiding the end-to-end model, even at a slower rate than the raw
sensor data. This architecture relaxes the trade-off between the latency and
inference quality of the LLM. It also allows us to choose from a wide variety
of LLMs to improve high-level driving instructions and minimize fine-tuning
costs. Consequently, our architecture reduces data collection requirements
because the LLMs do not directly output actions; we only need to train a simple
imitation learning model to output actions. In our experiments, the training
data for the end-to-end model in a real-world environment consists of only
simple obstacle configurations with one traffic cone, while the test
environment is more complex and contains multiple obstacles placed in various
positions. Experiments show that the proposed architecture enhances the
generalization capabilities of the end-to-end model even without fine-tuning
the LLM.
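The two-rate structure described above, a slow high-level planner guiding a
fast low-level policy, can be sketched as a simple loop. This is a synchronous
toy illustration under stated assumptions: slow_planner stands in for the
multimodal LLM, fast_policy stands in for the imitation-learning model, and
all names and the scalar observations are hypothetical.

```python
def run_closed_loop(observations, slow_planner, fast_policy, planner_period):
    """Refresh the high-level instruction every planner_period steps,
    while the low-level policy outputs an action at every step."""
    instruction = None
    actions = []
    for step, obs in enumerate(observations):
        if step % planner_period == 0:
            # Slow path: stand-in for an LLM call on raw sensor data.
            instruction = slow_planner(obs)
        # Fast path: runs every step with the most recent instruction.
        actions.append(fast_policy(obs, instruction))
    return actions


# Hypothetical stand-ins: obs is an obstacle-proximity score in [0, 1].
def toy_planner(obs):
    return "avoid" if obs > 0.5 else "go"


def toy_policy(obs, instruction):
    return instruction  # a real policy would map (obs, instruction) to controls


demo_actions = run_closed_loop([0.0, 0.2, 0.9, 0.8, 0.1], toy_planner, toy_policy, 2)
```

In a real system the planner would typically run asynchronously so that its
latency does not block the control loop; the fixed period here only conveys
that the instruction is held constant between planner updates.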
☆ Towards a Physics Engine to Simulate Robotic Laser Surgery: Finite Element Modeling of Thermal Laser-Tissue Interactions
Nicholas E. Pacheco, Kang Zhang, Ashley S. Reyes, Christopher J. Pacheco, Lucas Burstein, Loris Fichera
This paper presents a computational model, based on the Finite Element Method
(FEM), that simulates the thermal response of laser-irradiated tissue. This
model addresses a gap in the current ecosystem of surgical robot simulators,
which generally lack support for lasers and other energy-based end effectors.
In the proposed model, the thermal dynamics of the tissue are calculated as the
solution to a heat conduction problem with appropriate boundary conditions. The
FEM formulation allows the model to capture complex phenomena, such as
convection, which is crucial for creating realistic simulations. The accuracy
of the model was verified via benchtop laser-tissue interaction experiments
using agar tissue phantoms and ex-vivo chicken muscle. The results revealed an
average root-mean-square error (RMSE) of less than 2 degrees Celsius across
most experimental conditions.
comment: Submitted to the International Symposium on Medical Robotics 2025
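As a reference point, a heat conduction problem with a convective boundary
condition is commonly written in the form below. This is the standard
textbook statement, not necessarily the exact formulation used in the paper;
the symbols (density \rho, specific heat c, conductivity k, laser source term
Q, convection coefficient h, ambient temperature T_\infty) are assumptions
introduced for illustration.

```latex
\begin{aligned}
\rho c \,\frac{\partial T}{\partial t}
  &= \nabla \cdot \left( k \nabla T \right) + Q(\mathbf{x}, t)
  && \text{in } \Omega \\
-k \,\frac{\partial T}{\partial n}
  &= h \left( T - T_\infty \right)
  && \text{on } \partial\Omega
\end{aligned}
```

A finite element discretization approximates T over a mesh of \Omega and
integrates the resulting system of ordinary differential equations in time.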
☆ Simulation-Aided Policy Tuning for Black-Box Robot Learning
How can robots learn and adapt to new tasks and situations with little data?
Systematic exploration and simulation are crucial tools for efficient robot
learning. We present a novel black-box policy search algorithm focused on
data-efficient policy improvements. The algorithm learns directly on the robot
and treats simulation as an additional information source to speed up the
learning process. At the core of the algorithm, a probabilistic model learns
the dependence of the policy parameters and the robot learning objective not
only by performing experiments on the robot, but also by leveraging data from a
simulator. This substantially reduces interaction time with the robot. Using
this model, we can guarantee improvements with high probability for each policy
update, thereby facilitating fast, goal-oriented learning. We evaluate our
algorithm on simulated fine-tuning tasks and demonstrate the data-efficiency of
the proposed dual-information source optimization algorithm. In a real robot
learning experiment, we show fast and successful task learning on a robot
manipulator with the aid of an imperfect simulator.
☆ Formalizing Stateful Behavior Trees
Behavior Trees (BTs) are high-level controllers that are useful in a variety
of planning tasks and are gaining traction in robotic mission planning. As they
gain popularity in safety-critical domains, it is important to formalize their
syntax and semantics, as well as verify properties for them. In this paper, we
formalize a class of BTs we call Stateful Behavior Trees (SBTs) that have
auxiliary variables and operate in an environment that can change over time.
SBTs have access to persistent shared memory (often known as a blackboard) that
keeps track of these auxiliary variables. We demonstrate that SBTs are
equivalent in computational power to Turing Machines when the blackboard can
store mathematical (i.e., unbounded) integers. We further identify syntactic
assumptions where SBTs have computational power equivalent to finite state
automata, specifically where the auxiliary variables are of finitary types. We
present a domain specific language (DSL) for writing SBTs and adapt the tool
BehaVerify for use with this DSL. This new DSL in BehaVerify supports
interfacing with popular BT libraries in Python, and also provides generation
of Haskell code and nuXmv models, the latter of which is used for model
checking temporal logic specifications for the SBTs. We include examples and
scalability results where BehaVerify outperforms another verification tool by a
factor of 100.
comment: In Proceedings FMAS2024, arXiv:2411.13215
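The notion of a behavior tree with a blackboard can be illustrated with a
minimal sketch. This is not BehaVerify's DSL or the API of any BT library; the
class names, status strings, and the counter example are assumptions chosen to
show how auxiliary variables persist across ticks.

```python
SUCCESS, FAILURE, RUNNING = "success", "failure", "running"


class Action:
    """Leaf node: runs a function that may read and write the blackboard."""

    def __init__(self, fn):
        self.fn = fn

    def tick(self, blackboard):
        return self.fn(blackboard)


class Sequence:
    """Ticks children in order; stops at the first non-success status."""

    def __init__(self, *children):
        self.children = children

    def tick(self, blackboard):
        for child in self.children:
            status = child.tick(blackboard)
            if status != SUCCESS:
                return status
        return SUCCESS


class Selector:
    """Ticks children in order; stops at the first non-failure status."""

    def __init__(self, *children):
        self.children = children

    def tick(self, blackboard):
        for child in self.children:
            status = child.tick(blackboard)
            if status != FAILURE:
                return status
        return FAILURE


def below_limit(bb):
    return SUCCESS if bb["count"] < bb["limit"] else FAILURE


def increment(bb):
    bb["count"] += 1  # state persists on the blackboard between ticks
    return RUNNING


def done(bb):
    return SUCCESS


def run(tree, blackboard, max_ticks=100):
    """Tick the tree until it stops returning RUNNING."""
    for tick_count in range(1, max_ticks + 1):
        status = tree.tick(blackboard)
        if status != RUNNING:
            return status, tick_count
    return RUNNING, max_ticks


counter_tree = Selector(Sequence(Action(below_limit), Action(increment)), Action(done))
```

Because the counter here is an unbounded Python integer, the example hints at
why expressiveness depends on the types stored on the blackboard: restricting
variables to finite types bounds the reachable state space.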
☆ Verification of Behavior Trees with Contingency Monitors
Behavior Trees (BTs) are high-level controllers that have found use in a wide
range of robotics tasks. As they grow in popularity and usage, it is crucial to
ensure that the appropriate tools and methods are available for ensuring they
work as intended. To that end, we created a new methodology for generating
Runtime Monitors for BTs. These monitors can be used by the BT to take
corrective action when undesirable behavior is detected and are capable of
handling LTL
specifications. We demonstrate that in terms of runtime, the generated monitors
are on par with monitors generated by existing tools and highlight certain
features that make our method more desirable in various situations. We note
that our method allows for our monitors to be swapped out with alternate
monitors with fairly minimal user effort. Finally, our method ties in with our
existing tool, BehaVerify, allowing for the verification of BTs with monitors.
comment: In Proceedings FMAS2024, arXiv:2411.13215
☆ Grand Challenges in the Verification of Autonomous Systems
Kevin Leahy, Hamid Asgari, Louise A. Dennis, Martin S. Feather, Michael Fisher, Javier Ibanez-Guzman, Brian Logan, Joanna I. Olszewska, Signe Redfield
Autonomous systems use independent decision-making with only limited human
intervention to accomplish goals in complex and unpredictable environments. As
the autonomy technologies that underpin them continue to advance, these systems
will find their way into an increasing number of applications in an ever wider
range of settings. If we are to deploy them to perform safety-critical or
mission-critical roles, it is imperative that we have justified confidence in
their safe and correct operation. Verification is the process by which such
confidence is established. However, autonomous systems pose challenges to
existing verification practices. This paper highlights viewpoints of the
Roadmap Working Group of the IEEE Robotics and Automation Society Technical
Committee for Verification of Autonomous Systems, identifying these grand
challenges, and providing a vision for future research efforts that will be
needed to address them.
☆ MetaCropFollow: Few-Shot Adaptation with Meta-Learning for Under-Canopy Navigation
Autonomous under-canopy navigation faces additional challenges compared to
over-canopy settings, for example, the tight spacing between crop rows,
degraded GPS accuracy, and excessive clutter. Keypoint-based visual navigation
has been shown to perform well in these conditions; however, the differences
between agricultural environments in terms of lighting, season, soil, and crop
type mean that a domain shift will likely be encountered at some point during the
robot deployment. In this paper, we explore the use of Meta-Learning to
overcome this domain shift using a minimal amount of data. We train a
base-learner that can quickly adapt to new conditions, enabling more robust
navigation in low-data regimes.
☆ Path Tracking Hybrid A* For Autonomous Agricultural Vehicles
We propose a path-tracking Hybrid A* planner and a coupled hierarchical Model
Predictive Control (MPC) controller in scenarios involving the path smoothing
of agricultural vehicles. For agricultural vehicles following reference paths
on farmlands, especially during cross-furrow operations, a minimum deviation
from the reference path is desired, in addition to the curvature constraints
and body scale collision avoidance. Our contribution is threefold. (1) We
propose the path-tracking Hybrid A*, which satisfies nonholonomic constraints
and vehicle size collision avoidance, and devise new cost and heuristic
functions to minimize the degree of deviation. The path-tracking Hybrid A* can
function not only in offline smoothing but also in real-time adjustment when
confronted with unexpected obstacles. (2) We propose the hierarchical MPC to
safely track the smoothed trajectory, using the initial solution solved by
linearized MPC and nonlinear local adjustments around the initial solution. (3)
We carry out extensive simulations with baseline comparisons based on
real-world farm datasets to evaluate the performance of our algorithm.
☆ GPT versus Humans: Uncovering Ethical Concerns in Conversational Generative AI-empowered Multi-Robot Systems
Rebekah Rousi, Niko Makitalo, Hooman Samani, Kai-Kristian Kemell, Jose Siqueira de Cerqueira, Ville Vakkuri, Tommi Mikkonen, Pekka Abrahamsson
The emergence of generative artificial intelligence (GAI) and large language
models (LLMs) such as ChatGPT has enabled the realization of long-harbored
desires in software and robotic development. The technology, however, has
brought with
it novel ethical challenges. These challenges are compounded by the application
of LLMs in other machine learning systems, such as multi-robot systems. The
objective of the study was to examine novel ethical issues arising from the
application of LLMs in multi-robot systems. Unfolding ethical issues in GPT
agent behavior (deliberation of ethical concerns) was observed, and GPT output
was compared with human experts. The article also advances a model for ethical
development of multi-robot systems. A qualitative workshop-based method was
employed in three workshops for the collection of ethical concerns: two human
expert workshops (N=16 participants) and one GPT-agent-based workshop (N=7
agents; two teams of 6 agents plus one judge). Thematic analysis was used to
analyze the qualitative data. The results reveal differences between the
human-produced and GPT-based ethical concerns. Human experts placed greater
emphasis on new themes related to deviance, data privacy, bias and unethical
corporate conduct. GPT agents emphasized concerns present in existing AI ethics
guidelines. The study contributes to a growing body of knowledge in
context-specific AI ethics and GPT application. It demonstrates the gap between
human expert thinking and LLM output, while emphasizing new ethical concerns
emerging in novel technology.
comment: 51 pages, 10 figures
☆ A Simulated real-world upper-body Exoskeleton Accident and Investigation
This paper describes the enactment of a simulated (mock) accident involving
an upper-body exoskeleton and its investigation. The accident scenario is
enacted by role-playing volunteers, one of whom is wearing the exoskeleton.
Following the mock accident, investigators (also volunteers) interview both
the subject of the accident and relevant witnesses. The investigators then
consider the witness testimony alongside robot data logged by the ethical black
box, in order to address the three key questions: what happened?, why did it
happen?, and how can we make changes to prevent the accident happening again?
This simulated accident scenario is one of a series we have run as part of the
RoboTIPS project, with the overall aim of developing and testing both processes
and technologies to support social robot accident investigation.
☆ Contact Tooling Manipulation Control for Robotic Repair Platform
This paper delves into various robotic manipulation control methods designed
for dynamic contact tooling operations on a robotic repair platform. The
explored control strategies include hybrid position-force control, admittance
control, bilateral telerobotic control, virtual fixture, and shared control.
Each approach is elucidated and assessed in terms of its applicability and
effectiveness for handling contact tooling tasks in real-world repair
scenarios. The hybrid position-force controller is highlighted for its
proficiency in executing precise force-required tasks, but it is contingent
on an accurate model of the environment and a structured, static
environment. In contrast, for unstructured environments, bilateral
teleoperation control is investigated, revealing that the compliance with the
remote robot controller is crucial for stable contact, albeit at the expense of
reduced motion tracking performance. Moreover, advanced controllers for tooling
manipulation tasks, such as virtual fixture and shared control approaches, are
investigated for their potential applications.
comment: This paper was submitted to Waste Management Symposia 2024 (WM2024)
☆ Dual-Arm Telerobotic Platform for Robotic Hotbox Operations for Nuclear Waste Disposition in EM Sites
This paper introduces a dual-arm telerobotic platform designed to efficiently
and safely execute hot cell operations for nuclear waste disposition at EM
sites. The proposed system consists of a remote robot arm platform and a
teleoperator station, both integrated with a software architecture to control
the entire system. The dual-arm configuration of the remote platform enhances
versatility and task performance in complex and hazardous environments,
ensuring precise manipulation and effective handling of nuclear waste
materials. The integration of a teleoperator station enables a human teleoperator
to remotely control the entire system in real time, enhancing decision-making
capabilities, situational awareness, and dexterity. The control software plays
a crucial role in our system, providing a robust and intuitive interface for
the teleoperator. Test operation results demonstrate the system's effectiveness
in operating as a remote hotbox for nuclear waste disposition, showcasing its
potential applicability in real EM sites.
comment: This paper was submitted to Waste Management Symposia 2024 (WM2024)
☆ Dehazing-aided Multi-Rate Multi-Modal Pose Estimation Framework for Mitigating Visual Disturbances in Extreme Underwater Domain
Vidya Sudevan, Fakhreddine Zayer, Taimur Hassan, Sajid Javed, Hamad Karki, Giulia De Masi, Jorge Dias
This paper delves into the potential of DU-VIO, a dehazing-aided hybrid
multi-rate multi-modal Visual-Inertial Odometry (VIO) estimation framework,
designed to thrive in the challenging realm of extreme underwater environments.
The cutting-edge DU-VIO framework incorporates a GAN-based pre-processing
module and a hybrid CNN-LSTM module for precise pose estimation, using
visibility-enhanced underwater images and raw IMU data. Accurate pose
estimation is paramount for various underwater robotics and exploration
applications. However, underwater visibility is often compromised by suspended
particles and attenuation effects, rendering visual-inertial pose estimation a
formidable challenge. DU-VIO aims to overcome these limitations by effectively
removing visual disturbances from raw image data, enhancing the quality of
image features used for pose estimation. We demonstrate the effectiveness of
DU-VIO by calculating RMSE scores for translation and rotation vectors in
comparison to their reference values. These scores are then compared to those
of a base model using a modified AQUALOC Dataset. This study's significance
lies in its potential to revolutionize underwater robotics and exploration.
DU-VIO offers a robust solution to the persistent challenge of underwater
visibility, significantly improving the accuracy of pose estimation. This
research contributes valuable insights and tools for advancing underwater
technology, with far-reaching implications for scientific research,
environmental monitoring, and industrial applications.
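The RMSE evaluation mentioned in this abstract could be computed along the following lines. This is a minimal sketch, not the authors' evaluation code: the function name `rmse` is hypothetical, and it assumes the error of each sample is the Euclidean norm of the difference between the estimated and reference vectors, which the abstract does not specify.

```python
import numpy as np

def rmse(pred, ref):
    """Root mean square error between predicted and reference vectors.

    pred, ref: arrays of shape (n_samples, dim), e.g. translation vectors.
    Assumption: the per-sample error is the Euclidean norm of the difference.
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    err = np.linalg.norm(pred - ref, axis=1)  # one error value per sample
    return float(np.sqrt(np.mean(err ** 2)))
```

The same function can be applied separately to translation and rotation vectors, giving one score per quantity for comparison against a base model.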
☆ Learning Two-agent Motion Planning Strategies from Generalized Nash Equilibrium for Model Predictive Control
We introduce an Implicit Game-Theoretic MPC (IGT-MPC), a decentralized
algorithm for two-agent motion planning that uses a learned value function that
predicts the game-theoretic interaction outcomes as the terminal cost-to-go
function in a model predictive control (MPC) framework, guiding agents to
implicitly account for interactions with other agents and maximize their
reward. This approach applies to competitive and cooperative multi-agent motion
planning problems which we formulate as constrained dynamic games. Given a
constrained dynamic game, we randomly sample initial conditions and solve for
the generalized Nash equilibrium (GNE) to generate a dataset of GNE solutions,
computing the reward outcome of each game-theoretic interaction from the GNE.
The data is used to train a simple neural network to predict the reward
outcome, which we use as the terminal cost-to-go function in an MPC scheme. We
showcase emerging competitive and coordinated behaviors using IGT-MPC in
scenarios such as two-vehicle head-to-head racing and un-signalized
intersection navigation. IGT-MPC offers a novel method integrating machine
learning and game-theoretic reasoning into model-based decentralized
multi-agent motion planning.
comment: Submitted to 2025 Learning for Dynamics and Control Conference (L4DC)
☆ Breadboarding the European Moon Rover System: discussion and results of the analogue field test campaign
Cristina Luna, Augusto Gómez Eguíluz, Jorge Barrientos-Díez, Almudena Moreno, Alba Guerra, Manuel Esquer, Marina L. Seoane, Steven Kay, Angus Cameron, Carmen Camañes, Philipp Haas, Vassilios Papantoniou, Armin Wedler, Bernhard Rebele, Jennifer Reynolds, Markus Landgraf
This document compiles results obtained from the test campaign of the
European Moon Rover System (EMRS) project. The test campaign, conducted at the
Planetary Exploration Lab of DLR in Wessling, aimed to understand the scope of
the EMRS breadboard design, its strengths, and the benefits of the modular
design. The discussion of test results is based on rover traversal analyses,
robustness assessments, wheel deflection analyses, and the overall
transportation cost of the rover. This not only enables the comparison of
locomotion modes on lunar regolith but also facilitates critical
decision-making in the design of future lunar missions.
comment: 6 pages, 5 figures, conference International Conference on Space
Robotics
☆ Hybrid-Neuromorphic Approach for Underwater Robotics Applications: A Conceptual Framework
This paper introduces the concept of employing neuromorphic methodologies for
task-oriented underwater robotics applications. In contrast to the increasing
computational demands of conventional deep learning algorithms, neuromorphic
technology, leveraging spiking neural network architectures, promises
sophisticated artificial intelligence with significantly reduced computational
requirements and power consumption, emulating human brain operational
principles. Despite documented neuromorphic technology applications in various
robotic domains, its utilization in marine robotics remains largely unexplored.
Thus, this article proposes a unified framework for integrating neuromorphic
technologies for perception, pose estimation, and haptic-guided conditional
control of underwater vehicles, customized to specific user-defined objectives.
This conceptual framework stands to revolutionize underwater robotics,
enhancing efficiency and autonomy while reducing energy consumption. By
enabling greater adaptability and robustness, this advancement could facilitate
applications such as underwater exploration, environmental monitoring, and
infrastructure maintenance, thereby contributing to significant progress in
marine science and technology.
☆ Learning thin deformable object manipulation with a multi-sensory integrated soft hand
Robotic manipulation has made significant advancements, with systems
demonstrating high precision and repeatability. However, this remarkable
precision often fails to translate into efficient manipulation of thin
deformable objects. Current robotic systems lack imprecise dexterity, the
ability to perform dexterous manipulation through robust and adaptive behaviors
that do not rely on precise control. This paper explores the singulation and
grasping of thin, deformable objects. Here, we propose a novel solution that
incorporates passive compliance, touch, and proprioception into thin,
deformable object manipulation. Our system employs a soft, underactuated hand
that provides passive compliance, facilitating adaptive and gentle interactions
to dexterously manipulate deformable objects without requiring precise control.
The tactile and force/torque sensors equipped on the hand, along with a depth
camera, gather sensory data required for manipulation via the proposed slip
module. The manipulation policies are learned directly from raw sensory data
via model-free reinforcement learning, bypassing explicit environmental and
object modeling. We implement a hierarchical double-loop learning process to
enhance learning efficiency by decoupling the action space. Our method was
deployed on real-world robots and trained in a self-supervised manner. The
resulting policy was tested on a variety of challenging tasks that were beyond
the capabilities of prior studies, ranging from displaying suit fabric like a
salesperson to turning pages of sheet music for violinists.
comment: 19 pages
☆ Neuromorphic Attitude Estimation and Control
The real-world application of small drones is mostly hampered by energy
limitations. Neuromorphic computing promises extremely energy-efficient AI for
autonomous flight, but is still challenging to train and deploy on real robots.
In order to reap the maximal benefits from neuromorphic computing, it is
desired to perform all autonomy functions end-to-end on a single neuromorphic
chip, from low-level attitude control to high-level navigation. This research
presents the first neuromorphic control system using a spiking neural network
(SNN) to effectively map a drone's raw sensory input directly to motor
commands. We apply this method to low-level attitude estimation and control for
a quadrotor, deploying the SNN on a tiny Crazyflie. We propose a modular SNN,
separately training and then merging estimation and control sub-networks. The
SNN is trained with imitation learning, using a flight dataset of sensory-motor
pairs. Post-training, the network is deployed on the Crazyflie, issuing control
commands from sensor inputs at $500$Hz. Furthermore, for the training procedure
we augmented training data by flying a controller with additional excitation
and time-shifting the target data to enhance the predictive capabilities of the
SNN. On the real drone the perception-to-control SNN tracks attitude commands
with an average error of $3$ degrees, compared to $2.5$ degrees for the regular
flight stack. We also show the benefits of the proposed learning modifications
for reducing the average tracking error and reducing oscillations. Our work
shows the feasibility of performing neuromorphic end-to-end control, laying the
basis for highly energy-efficient and low-latency neuromorphic autopilots.
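The time-shifting of target data described in this abstract can be illustrated as follows. This is a minimal sketch under stated assumptions: the helper name `time_shift_pairs` is hypothetical, and it assumes the shift pairs each sensor input with the target recorded a fixed number of steps later, so that the network learns to predict ahead. The abstract does not give the shift size or direction.

```python
import numpy as np

def time_shift_pairs(inputs, targets, shift):
    """Pair inputs[t] with targets[t + shift] (assumed shift direction).

    inputs, targets: sequences of equal length, ordered in time.
    shift: non-negative number of time steps to look ahead.
    """
    inputs = np.asarray(inputs)
    targets = np.asarray(targets)
    if len(inputs) != len(targets):
        raise ValueError("inputs and targets must have the same length")
    if shift < 0 or shift >= len(inputs):
        raise ValueError("shift must satisfy 0 <= shift < len(inputs)")
    if shift == 0:
        return inputs, targets
    # Drop the last `shift` inputs and the first `shift` targets.
    return inputs[:-shift], targets[shift:]
```

With a 500 Hz control loop, a shift of one step would correspond to a 2 ms look-ahead under this assumption.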
☆ Cooperative Grasping and Transportation using Multi-agent Reinforcement Learning with Ternary Force Representation
Cooperative grasping and transportation require effective coordination to
complete the task. This study focuses on the approach leveraging force-sensing
feedback, where robots use sensors to detect forces applied by others on an
object to achieve coordination. Unlike explicit communication, it avoids delays
and interruptions; however, force sensing is highly sensitive to variations in
the grasping environment, such as changes in grasping force, grasping pose,
and object size and geometry, which can interfere with force signals and
undermine coordination. We propose
multi-agent reinforcement learning (MARL) with ternary force representation, a
force representation that remains consistent under variations in the
grasping environment. The simulation and real-world experiments
demonstrate the robustness of the proposed method to changes in grasping force,
object size and geometry, as well as the inherent sim2real gap.
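One plausible reading of a ternary force representation is a sign-with-deadband quantization of each measured force component. The sketch below illustrates that reading only: the function name `ternary_force`, the `deadband` threshold, and the mapping to -1, 0, and 1 are assumptions, since the abstract does not define the representation.

```python
import numpy as np

def ternary_force(force, deadband):
    """Quantize force components to -1, 0, or 1 (assumed mapping).

    force: array of force readings (any shape), e.g. in newtons.
    deadband: readings with magnitude <= deadband are treated as no force.
    """
    force = np.asarray(force, dtype=float)
    out = np.zeros(force.shape, dtype=int)
    out[force > deadband] = 1
    out[force < -deadband] = -1
    return out
```

Under this mapping, readings of 2 N and 20 N in the same direction produce the same symbol, which is one way a representation could stay consistent when grasping force or object size changes.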
☆ Joint-repositionable Inner-wireless Planar Snake Robot
Ayato Kanada, Ryo Takahashi, Keito Hayashi, Ryusuke Hosaka, Wakako Yukita, Yasutaka Nakashima, Tomoyuki Yokota, Takao Someya, Mitsuhiro Kamezaki, Yoshihiro Kawahara, Motoji Yamamoto
Bio-inspired multi-joint snake robots offer the advantages of terrain
adaptability due to their limbless structure and high flexibility. However, a
series of dozens of motor units in typical multiple-joint snake robots results
in a heavy body structure and hundreds of watts of high power consumption. This
paper presents a joint-repositionable, inner-wireless snake robot that enables
multi-joint-like locomotion using a low-powered underactuated mechanism. The
snake robot, consisting of a series of flexible passive links, can dynamically
change its joint coupling configuration by repositioning motor-driven joint
units along rack gears inside the robot. Additionally, a soft robot skin
wirelessly powers the internal joint units, avoiding the risk of wire tangling
and disconnection caused by the movable joint units. The combination of the
joint-repositionable mechanism and the wireless-charging-enabled soft skin
achieves a high degree of bending, along with a lightweight structure of 1.3 kg
and energy-efficient wireless power transmission of 7.6 watts.
☆ Hybrid Physics-ML Modeling for Marine Vehicle Maneuvering Motions in the Presence of Environmental Disturbances
A hybrid physics-machine learning modeling framework is proposed for the
maneuvering motions of surface vehicles to address modeling capability and
stability in the presence of environmental disturbances. From a deep learning
perspective, the framework is based on a variant of residual networks
with additional feature extraction. Initially, an imperfect physical model is
derived and identified to capture the fundamental hydrodynamic characteristics
of marine vehicles. This model is then integrated with a feedforward network
through a residual block. Additionally, feature extraction from trigonometric
transformations is employed in the machine learning component to account for
the periodic influence of currents and waves. The proposed method is evaluated
using real navigational data from the 'JH7500' unmanned surface vehicle. The
results demonstrate the robust generalizability and accurate long-term
prediction capabilities of the nonlinear dynamic model in specific
environmental conditions. This approach has the potential to be extended and
applied to develop a comprehensive high-fidelity simulator.
☆ Trajectory Tracking Using Frenet Coordinates with Deep Deterministic Policy Gradient
This paper studies the application of the DDPG algorithm in
trajectory-tracking tasks and proposes a trajectory-tracking control method
combined with the Frenet coordinate system. By converting the vehicle's position
and velocity information from the Cartesian coordinate system to the Frenet
coordinate system, this method can more accurately describe the vehicle's
deviation and travel distance relative to the center line of the road. The DDPG
algorithm adopts the Actor-Critic framework, uses deep neural networks for
strategy and value evaluation, and combines the experience replay mechanism and
target network to improve the algorithm's stability and data utilization
efficiency. Experimental results show that the DDPG algorithm based on the Frenet
coordinate system performs well in trajectory-tracking tasks in complex
environments, achieves high-precision and stable path tracking, and
demonstrates its application potential in autonomous driving and intelligent
transportation systems. Keywords: DDPG; path tracking; robot navigation
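The Cartesian-to-Frenet conversion of position described in this abstract can be sketched for a piecewise-linear center line. This is an illustrative sketch, not the paper's implementation: the function name `cartesian_to_frenet` is hypothetical, the center line is assumed to be a 2D polyline with no zero-length segments, the lateral offset is taken as positive to the left of the direction of travel, and velocity conversion is omitted.

```python
import numpy as np

def cartesian_to_frenet(point, centerline):
    """Return (s, d): arc length along the center line and signed lateral offset.

    point: (x, y) position.
    centerline: array of shape (n, 2) with n >= 2 polyline vertices.
    Assumptions: no zero-length segments; d > 0 means left of the path.
    """
    p = np.asarray(point, dtype=float)
    path = np.asarray(centerline, dtype=float)
    seg = path[1:] - path[:-1]                       # segment vectors
    seg_len = np.linalg.norm(seg, axis=1)
    s_start = np.concatenate(([0.0], np.cumsum(seg_len)[:-1]))
    rel = p - path[:-1]                              # point relative to segment starts
    # Projection parameter on each segment, clamped to the segment extent.
    t = np.clip(np.einsum("ij,ij->i", rel, seg) / seg_len ** 2, 0.0, 1.0)
    foot = path[:-1] + t[:, None] * seg              # closest point on each segment
    dist = np.linalg.norm(p - foot, axis=1)
    i = int(np.argmin(dist))                         # nearest segment
    # The 2D cross product gives the side of the path the point lies on.
    cross = seg[i, 0] * rel[i, 1] - seg[i, 1] * rel[i, 0]
    d = float(np.copysign(dist[i], cross))
    s = float(s_start[i] + t[i] * seg_len[i])
    return s, d
```

For a center line running east and then north, a point just left of the first segment yields a positive `d`, while a point to the right of the second segment yields a negative one. In a tracking controller these two values would typically serve as the travel distance and the deviation from the center line.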
☆ Image Compression Using Novel View Synthesis Priors
Real-time visual feedback is essential for tetherless control of remotely
operated vehicles, particularly during inspection and manipulation tasks.
Though acoustic communication is the preferred choice for medium-range
communication underwater, its limited bandwidth renders it impractical to
transmit images or videos in real-time. To address this, we propose a
model-based image compression technique that leverages prior mission
information. Our approach employs trained machine-learning-based novel view
synthesis models, and uses gradient descent optimization to refine latent
representations to help generate compressible differences between camera images
and rendered images. We evaluate the proposed compression technique using a
dataset from an artificial ocean basin, demonstrating superior compression
ratios and image quality over existing techniques. Moreover, our method
exhibits robustness to the introduction of new objects within the scene,
highlighting its potential for advancing tetherless remotely operated vehicle
operations.
comment: Preprint submitted to Ocean Engineering
☆ Data-Driven Multi-step Nonlinear Model Predictive Control for Industrial Heavy Load Hydraulic Robot
Automating complex industrial robots requires precise nonlinear control and
efficient energy management. This paper introduces a data-driven nonlinear
model predictive control (NMPC) framework to optimize control under multiple
objectives. To enhance the prediction accuracy of the dynamic model, we design
a single-shot multi-step prediction (SSMP) model based on long short-term
memory (LSTM) and multilayer perceptrons (MLP), which can directly obtain the
predictive horizon without iterative repetition and reduce computational
pressure. Moreover, we combine offline and online models to address
disturbances stemming from environmental interactions, similar to the
superposition of the robot's free and forced responses. The online model learns
the system's variations from the prediction mismatches of the offline model and
updates its weights in real time. The proposed hybrid predictive model
simplifies the relationship between inputs and outputs into matrix
multiplication, which can quickly obtain the derivative. Therefore, the
solution for the control signal sequence employs a gradient descent method with
an adaptive learning rate, allowing the NMPC cost function to be formulated as
a convex function incorporating critical states. The learning rate is
dynamically adjusted based on state errors to counteract the inherent
prediction inaccuracies of neural networks. The controller outputs the average
value of the control signal sequence instead of the first value. Simulations
and experiments on a 22-ton hydraulic excavator have validated the
effectiveness of our method, showing that the proposed NMPC approach can be
widely applied to industrial systems, including nonlinear control and energy
management.
☆ A Data-Driven Modeling and Motion Control of Heavy-Load Hydraulic Manipulators via Reversible Transformation
This work proposes a data-driven modeling approach and a corresponding hybrid
motion control framework for unmanned and automated operation of industrial
heavy-load hydraulic manipulators. Rather than directly using a neural network
black box, we construct a reversible nonlinear model by using a multilayer
perceptron to approximate dynamics in the physical integrator chain system after
reversible transformations. The reversible nonlinear model is trained offline
using supervised learning techniques, and the data are obtained from
simulations or experiments. The entire hybrid motion control framework consists
of a model inversion controller that compensates for the nonlinear dynamics and
a proportional-derivative controller that enhances robustness. Stability
is proved with Lyapunov theory. Co-simulation and experiments show the
effectiveness of the proposed modeling and hybrid control framework. With a
commercial 39-ton class hydraulic excavator for motion control tasks, the root
mean square trajectory tracking error decreases by at least 50%
compared to traditional control methods. In addition, by analyzing the system
model, the proposed framework can be rapidly applied to different control
plants.
☆ Arm Robot: AR-Enhanced Embodied Control and Visualization for Intuitive Robot Arm Manipulation
Embodied interaction has been introduced to human-robot interaction (HRI) as
a type of teleoperation, in which users control robot arms with bodily action
via handheld controllers or haptic gloves. Embodied teleoperation has made
robot control intuitive to non-technical users, but differences between humans'
and robots' capabilities, e.g., ranges of motion and response time, remain
challenging. In response, we present Arm Robot, an embodied robot arm
teleoperation system that helps users tackle human-robot discrepancies.
Specifically, Arm Robot (1) includes AR visualization as real-time feedback on
temporal and spatial discrepancies, and (2) allows users to change observing
perspectives and expand action space. We conducted a user study (N=18) to
investigate the usability of the Arm Robot and learn how users perceive the
embodiment. Our results show users could use Arm Robot's features to
effectively control the robot arm, providing insights for continued work in
embodied HRI.
☆ Spatiotemporal Tubes for Temporal Reach-Avoid-Stay Tasks in Unknown Systems
The paper considers the controller synthesis problem for general MIMO systems
with unknown dynamics, aiming to fulfill the temporal reach-avoid-stay task,
where the unsafe regions are time-dependent, and the target must be reached
within a specified time frame. The primary aim of the paper is to construct the
spatiotemporal tube (STT) using a sampling-based approach and thereby devise a
closed-form approximation-free control strategy to ensure that the system
trajectory reaches the target set while avoiding time-dependent unsafe sets.
The proposed scheme utilizes a novel method involving STTs to provide
controllers that guarantee both system safety and reachability. In our
sampling-based framework, we translate the requirements of STTs into a Robust
optimization program (ROP). To address the infeasibility of ROP caused by
infinite constraints, we utilize the sampling-based Scenario optimization
program (SOP). Subsequently, we solve the SOP to generate the tube and
closed-form controller for an unknown system, ensuring the temporal
reach-avoid-stay specification. Finally, the effectiveness of the proposed
approach is demonstrated through three case studies: an omnidirectional robot,
a SCARA manipulator, and a magnetic levitation system.
☆ A Novel Passive Occupational Shoulder Exoskeleton With Adjustable Peak Assistive Torque Angle For Overhead Tasks
Objective: Overhead tasks are a primary inducement to work-related
musculoskeletal disorders. Aiming to reduce shoulder physical loads, passive
shoulder exoskeletons are increasingly prevalent in industry due to their
light weight, affordability, and effectiveness. However, they can only handle
specific tasks and struggle to balance compactness with a sufficient range of
motion effectively. Method: We proposed a novel passive occupational shoulder
exoskeleton designed to handle various overhead tasks at different arm
elevation angles, ensuring sufficient ROM while maintaining compactness. By
formulating kinematic models and simulations, an ergonomic shoulder structure
was developed. Then, we presented a torque generator equipped with an
adjustable peak assistive torque angle to switch between low and high
assistance phases through a passive clutch mechanism. Ten healthy participants
were recruited to validate its functionality by performing the screwing task.
Results: Measured range of motion results demonstrated that the exoskeleton can
ensure a sufficient ROM in both sagittal (164$^\circ$) and horizontal
(158$^\circ$) flexion/extension movements. The experimental results of the
screwing task showed that the exoskeleton could reduce muscle activation (up to
49.6%), perceived effort and frustration, and provide an improved user
experience (scored 79.7 out of 100). Conclusion: These results indicate that
the proposed exoskeleton can guarantee natural movements and provide efficient
assistance during overhead work, and thus has the potential to reduce the risk
of musculoskeletal disorders. Significance: The proposed exoskeleton provides
insights into multi-task adaptability and efficient assistance, highlighting
the potential for expanding the application of exoskeletons.
♻ ☆ Geometric Static Modeling Framework for Piecewise-Continuous Curved-Link Multi Point-of-Contact Tensegrity Robots
Tensegrities synergistically combine tensile (cable) and rigid (link)
elements to achieve structural integrity, making them lightweight, packable,
and impact resistant. Consequently, they have high potential for locomotion in
unstructured environments. This research presents geometric modeling of a
Tensegrity eXploratory Robot (TeXploR) comprised of two semi-circular, curved
links held together by 12 prestressed cables and actuated with an internal mass
shifting along each link. This design allows for efficient rolling with
stability (e.g., tip-over on an incline). However, the unique design poses
static and dynamic modeling challenges given the discontinuous nature of the
semi-circular, curved links, two changing points of contact with the surface
plane, and instantaneous movement of the masses along the links. The robot is
modeled using a geometric approach where the holonomic constraints confirm the
experimentally observed four-state hybrid system, proving TeXploR rolls along
one link while pivoting about the end of the other. It also identifies the
quasi-static state transition boundaries that enable a continuous change in the
robot states via internal mass shifting. This is the first time in the literature that a
non-spherical two-point contact system is kinematically and geometrically
modeled. Furthermore, the static solutions are closed-form and do not require
numerical exploration of the solution. The MATLAB simulations are
experimentally validated on a tetherless prototype with mean absolute error of
4.36$^\circ$.
comment: This work is published on IEEE RA-L. Please refer to the published
article below: https://ieeexplore.ieee.org/document/10734217 L. Ervin and V.
Vikas, "Geometric Static Modeling Framework for Piecewise-Continuous
Curved-Link Multi Point-of-Contact Tensegrity Robots," in IEEE Robotics and
Automation Letters, vol. 9, no. 12, pp. 11066-11073, Dec. 2024, doi:
10.1109/LRA.2024.3486199
♻ ☆ Accelerating Gaussian Variational Inference for Motion Planning Under Uncertainty
This work addresses motion planning under uncertainty as a stochastic optimal
control problem. The path distribution induced by the optimal controller
corresponds to a posterior path distribution with a known form. To approximate
this posterior, we frame an optimization problem in the space of Gaussian
distributions, which aligns with the Gaussian Variational Inference Motion
Planning (GVIMP) paradigm introduced in \cite{yu2023gaussian}. In this
framework, the computation bottleneck lies in evaluating the expectation of
collision costs over a dense discretized trajectory and computing the marginal
covariances. This work exploits the sparse motion planning factor graph, which
allows for parallel computation of collision costs and Gaussian Belief Propagation
(GBP) marginal covariance computation, to introduce a computationally efficient
approach to solving GVIMP. We term the novel paradigm Parallel Gaussian
Variational Inference Motion Planning (P-GVIMP). We validate the proposed
framework on various robotic systems, demonstrating significant speed
acceleration achieved by leveraging Graphics Processing Units (GPUs) for
parallel computation. An open-sourced implementation is presented at
https://github.com/hzyu17/VIMP.
comment: 7 pages
♻ ☆ Probabilistically Correct Language-based Multi-Robot Planning using Conformal Prediction
This paper addresses task planning problems for language-instructed robot
teams. Tasks are expressed in natural language (NL), requiring the robots to
apply their capabilities at various locations and semantic objects. Several
recent works have addressed similar planning problems by leveraging pre-trained
Large Language Models (LLMs) to design effective multi-robot plans. However,
these approaches lack performance guarantees. To address this challenge, we
introduce a new distributed LLM-based planner, called S-ATLAS for Safe plAnning
for Teams of Language-instructed AgentS, that is capable of achieving
user-defined mission success rates. This is accomplished by leveraging
conformal prediction (CP), a distribution-free uncertainty quantification tool
in black-box models. CP allows the proposed multi-robot planner to reason about
its inherent uncertainty in a distributed fashion, enabling robots to make
individual decisions when they are sufficiently certain and seek help
otherwise. We show, both theoretically and empirically, that the proposed
planner can achieve user-specified task success rates, assuming successful plan
execution, while minimizing the overall number of help requests. We provide
comparative experiments against related works showing that our method is
significantly more computationally efficient and achieves lower help rates. The
advantage of our algorithm over baselines becomes more pronounced with
increasing robot team size.
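The act-or-ask-for-help behavior described in this abstract can be illustrated with generic split conformal prediction. This sketch is not S-ATLAS itself, which is distributed and plans over sequences of decisions. The function names, the nonconformity score (one minus the model's confidence in the correct option), and the rule "act only when the prediction set is a singleton" are all assumptions made for illustration.

```python
import math
import numpy as np

def conformal_threshold(cal_scores, alpha):
    """Split-conformal quantile of calibration nonconformity scores.

    cal_scores: scores 1 - p(correct option) on held-out calibration tasks.
    alpha: allowed error rate, so the target coverage is 1 - alpha.
    """
    scores = np.sort(np.asarray(cal_scores, dtype=float))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))   # rank of the conformal quantile
    if k > n:
        return float("inf")                # too few calibration points
    return float(scores[k - 1])

def prediction_set(probs, qhat):
    """Indices of options whose nonconformity score is within the threshold."""
    return [i for i, p in enumerate(probs) if 1.0 - p <= qhat]

def decide(probs, qhat):
    """Return the chosen option index, or None to request help."""
    candidates = prediction_set(probs, qhat)
    if len(candidates) == 1:
        return candidates[0]
    return None  # ambiguous or empty set: ask for help
```

A confident distribution over candidate actions produces a single-element set and the robot acts, while two near-tied options produce a larger set and trigger a help request. Lowering `alpha` raises the threshold, which enlarges the sets and increases how often help is requested.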
♻ ☆ VeriGraph: Scene Graphs for Execution Verifiable Robot Planning
Recent advancements in vision-language models (VLMs) offer potential for
robot task planning, but challenges remain due to VLMs' tendency to generate
incorrect action sequences. To address these limitations, we propose VeriGraph,
a novel framework that integrates VLMs for robotic planning while verifying
action feasibility. VeriGraph employs scene graphs as an intermediate
representation, capturing key objects and spatial relationships to improve plan
verification and refinement. The system generates a scene graph from input
images and uses it to iteratively check and correct action sequences generated
by an LLM-based task planner, ensuring constraints are respected and actions
are executable. Our approach significantly enhances task completion rates
across diverse manipulation scenarios, outperforming baseline methods by 58%
for language-based tasks and 30% for image-based tasks.
♻ ☆ M-SET: Multi-Drone Swarm Intelligence Experimentation with Collision Avoidance Realism
Chuhao Qin, Alexander Robins, Callum Lillywhite-Roake, Adam Pearce, Hritik Mehta, Scott James, Tsz Ho Wong, Evangelos Pournaras
Distributed sensing by cooperative drone swarms is crucial for several Smart
City applications, such as traffic monitoring and disaster response. Using an
indoor lab with inexpensive drones, a testbed supports complex and ambitious
studies on these systems while maintaining low cost, rigor, and external
validity. This paper introduces the Multi-drone Sensing Experimentation Testbed
(M-SET), a novel platform designed to prototype, develop, test, and evaluate
distributed sensing with swarm intelligence. M-SET addresses the limitations of
existing testbeds that fail to emulate collisions, thus lacking realism in
outdoor environments. By integrating a collision avoidance method based on a
potential field algorithm, M-SET ensures collision-free navigation and sensing,
further optimized via a multi-agent collective learning algorithm. Extensive
evaluation demonstrates accurate energy consumption estimation and a low risk
of collisions, providing a robust proof-of-concept. New insights show that
M-SET has significant potential to support ambitious research at minimal cost,
with simplicity and high sensing quality.
comment: 7 pages, 7 figures. This work has been accepted by the 2024 IEEE 49th
Conference on Local Computer Networks (LCN)
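As a rough illustration of the potential field idea mentioned in the abstract, the sketch below computes a 2-D steering force from the classic attractive/repulsive formulation. It is an assumption that M-SET uses anything of this form; the gains `k_att` and `k_rep`, the `influence` radius, and the function name are hypothetical.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]


def potential_field_force(pos: Vec, goal: Vec, obstacles: List[Vec],
                          k_att: float = 1.0, k_rep: float = 1.0,
                          influence: float = 2.0) -> Vec:
    """Return the net force on a drone at `pos`.

    The goal attracts linearly with distance; each obstacle (for example
    another drone) repels only while it is closer than `influence`.
    """
    fx = k_att * (goal[0] - pos[0])
    fy = k_att * (goal[1] - pos[1])
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < influence:
            # Repulsion grows sharply as the distance d shrinks.
            mag = k_rep * (1.0 / d - 1.0 / influence) / (d * d)
            fx += mag * dx / d
            fy += mag * dy / d
    return fx, fy


free_flight = potential_field_force((0.0, 0.0), (1.0, 0.0), [])
blocked = potential_field_force((0.0, 0.0), (1.0, 0.0), [(0.5, 0.0)])
```

With an obstacle directly between the drone and its goal, the x component of `blocked` turns negative, so the drone is pushed back rather than flown into the obstacle.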
♻ ☆ Exosense: A Vision-Based Scene Understanding System For Exoskeletons
Jianeng Wang, Matias Mattamala, Christina Kassab, Guillaume Burger, Fabio Elnecave, Lintong Zhang, Marine Petriaux, Maurice Fallon
Self-balancing exoskeletons are a key enabling technology for individuals
with mobility impairments. While the current challenges focus on
human-compliant hardware and control, unlocking their use for daily activities
requires a scene perception system. In this work, we present Exosense, a
vision-centric scene understanding system for self-balancing exoskeletons. We
introduce a multi-sensor visual-inertial mapping device as well as a navigation
stack for state estimation, terrain mapping and long-term operation. We tested
Exosense attached to both a human leg and Wandercraft's Personal Exoskeleton in
real-world indoor scenarios. This enabled us to test the system during typical
periodic walking gaits, as well as future uses in multi-story environments. We
demonstrate that Exosense can achieve an odometry drift of about 4 cm per meter
traveled, and construct terrain maps under 1 cm average reconstruction error.
It can also work in a visual localization mode in a previously mapped
environment, providing a step towards long-term operation of exoskeletons.
comment: 8 pages, 9 figures
♻ ☆ A Survey on Small-Scale Testbeds for Connected and Automated Vehicles and Robot Swarms
Armin Mokhtarian, Jianye Xu, Patrick Scheffe, Maximilian Kloock, Simon Schäfer, Heeseung Bang, Viet-Anh Le, Sangeet Ulhas, Johannes Betz, Sean Wilson, Spring Berman, Liam Paull, Amanda Prorok, Bassam Alrifaee
Connected and automated vehicles and robot swarms hold transformative
potential for enhancing safety, efficiency, and sustainability in the
transportation and manufacturing sectors. Extensive testing and validation of
these technologies is crucial for their deployment in the real world. While
simulations are essential for initial testing, they often have limitations in
capturing the complex dynamics of real-world interactions. This limitation
underscores the importance of small-scale testbeds. These testbeds provide a
realistic, cost-effective, and controlled environment for testing and
validating algorithms, acting as an essential intermediary between simulation
and full-scale experiments. This work serves to facilitate researchers' efforts
in identifying existing small-scale testbeds suitable for their experiments and
provide insights for those who want to build their own. In addition, it
delivers a comprehensive survey of the current landscape of these testbeds. We
derive 62 characteristics of testbeds based on the well-known sense-plan-act
paradigm and offer an online table comparing 23 small-scale testbeds based on
these characteristics. The online table is hosted on our designated public
webpage https://bassamlab.github.io/testbeds-survey, and we invite testbed
creators and developers to contribute to it. We closely examine nine testbeds
in this paper, demonstrating how the derived characteristics can be used to
present testbeds. Furthermore, we discuss three ongoing challenges concerning
small-scale testbeds that we identified, i.e., small-scale to full-scale
transition, sustainability, and power and resource management.
comment: 16 pages, 11 figures, 1 table. This work was accepted by the IEEE
Robotics & Automation Magazine
♻ ☆ Highly dynamic physical interaction for robotics: design and control of an active remote center of compliance
Robot interaction control is often limited to low dynamics or low
flexibility, depending on whether an active or passive approach is chosen. In
this work, we introduce a hybrid control scheme that combines the advantages of
active and passive interaction control. To accomplish this, we propose the
design of a novel Active Remote Center of Compliance (ARCC), which is based on
a passive and an active element that can be used to directly control the
interaction forces. We introduce surrogate models for a dynamic comparison
against purely robot-based interaction schemes. In a comparative validation,
ARCC drastically improves the interaction dynamics, leading to an increase in
the motion bandwidth of up to 31 times. We further introduce our control
approach as well as its integration into the robot controller. Finally, we
analyze ARCC on different industrial benchmarks such as peg-in-hole, top-hat
rail assembly, and contour following problems and compare it against the state
of the art to highlight its dynamics and flexibility. The proposed system is
especially suited to applications that require a low cycle time combined with
sensitive manipulation.
comment: 7 pages, 7 figures
♻ ☆ t-READi: Transformer-Powered Robust and Efficient Multimodal Inference for Autonomous Driving
Given the wide adoption of multimodal sensors (e.g., camera, lidar, radar) by
autonomous vehicles (AVs), deep analytics to fuse their outputs for robust
perception become imperative. However, existing fusion methods often make two
assumptions that rarely hold in practice: i) similar data distributions for all
inputs and ii) constant availability for all sensors. Because, for example,
lidars have various resolutions and failures of radars may occur, such
variability often results in significant performance degradation in fusion. To
this end, we present t-READi, an adaptive inference system that accommodates the
variability of multimodal sensory data and thus enables robust and efficient
perception. t-READi identifies variation-sensitive yet structure-specific model
parameters; it then adapts only these parameters while keeping the rest intact.
t-READi also leverages a cross-modality contrastive learning method to
compensate for the loss from missing modalities. Both functions are implemented
to maintain compatibility with existing multimodal deep fusion methods.
Extensive experiments demonstrate that, compared with status quo
approaches, t-READi not only improves the average inference accuracy by more
than 6% but also reduces the inference latency by almost 15x at the cost of
only 5% extra memory overhead in the worst case under realistic data and modal
variations.
comment: 14 pages, 16 figures
♻ ☆ OTO Planner: An Efficient Only Travelling Once Exploration Planner for Complex and Unknown Environments
Autonomous exploration in complex and cluttered environments is essential for
various applications. However, there are many challenges due to the lack of
global heuristic information. Existing exploration methods suffer from
repeated paths and considerable computational resource requirements in
large-scale environments. To address these issues, this letter proposes an
efficient exploration planner that reduces repeated paths in complex
environments; hence it is called the "Only Travelling Once Planner". OTO Planner
includes fast frontier updating, viewpoint evaluation and viewpoint refinement.
A selective frontier updating mechanism is designed, saving a large amount of
computational resources. In addition, a novel viewpoint evaluation system is
devised to reduce repeated paths using enclosed sub-region detection.
Furthermore, a viewpoint refinement approach is proposed to concentrate
redundant viewpoints, leading to smoother paths. We conduct extensive
simulation and real-world experiments to validate the proposed method. Compared
to the state-of-the-art approach, the proposed method reduces the exploration
time and movement distance by 10%-20% and improves the speed of frontier
detection by 6-9 times.
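For readers unfamiliar with frontier-based exploration, the sketch below shows the basic notion of a frontier on a 2-D occupancy grid: a free cell that borders unknown space. This is a plain full-grid scan, not the selective updating mechanism of OTO Planner; the cell encoding (`FREE`, `OCCUPIED`, `UNKNOWN`) and the function name are assumptions.

```python
from typing import List, Tuple

# Assumed cell encoding for this sketch.
FREE, OCCUPIED, UNKNOWN = 0, 1, -1


def find_frontiers(grid: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (row, col) of every free cell with a 4-connected unknown neighbor."""
    rows, cols = len(grid), len(grid[0])
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break  # one unknown neighbor is enough
    return frontiers


grid = [
    [FREE, FREE, UNKNOWN],
    [FREE, OCCUPIED, UNKNOWN],
    [FREE, FREE, FREE],
]
frontier_cells = find_frontiers(grid)
```

A scan like this touches every cell on each update, which is the kind of cost that selective or incremental frontier updating aims to avoid in large maps.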
♻ ☆ Learning Robust Grasping Strategy Through Tactile Sensing and Adaption Skill
Robust grasping is an essential task in robotics and requires tactile
feedback and reactive grasp adjustments. Previous research has extensively
combined tactile sensing with grasping, primarily relying on rule-based
approaches and frequently neglecting post-grasping difficulties such as
external disruptions or inherent uncertainties of the object's physics and
geometry. To address these limitations, this paper introduces a
human-demonstration-based adaptive grasping policy based on tactile sensing,
which aims to achieve robust gripping while
resisting disturbances to maintain grasp stability. Our trained model
generalizes to daily objects with seven different sizes, shapes, and textures.
Experimental results demonstrate that our method performs well in dynamic and
force interaction tasks and exhibits excellent generalization ability.
♻ ☆ FracGM: A Fast Fractional Programming Technique for Geman-McClure Robust Estimator
Robust estimation is essential in computer vision, robotics, and navigation,
aiming to minimize the impact of outlier measurements for improved accuracy. We
present a fast algorithm for Geman-McClure robust estimation, FracGM,
leveraging fractional programming techniques. This solver reformulates the
original non-convex fractional problem to a convex dual problem and a linear
equation system, iteratively solving them in an alternating optimization
pattern. Compared to graduated non-convexity approaches, this strategy exhibits
a faster convergence rate and better outlier rejection capability. In addition,
the global optimality of the proposed solver can be guaranteed under given
conditions. We demonstrate the proposed FracGM solver with Wahba's rotation
problem and 3-D point-cloud registration along with relaxation pre-processing
and projection post-processing. Compared to state-of-the-art algorithms, when
the outlier rates increase from 20% to 80%, FracGM shows 53% and 88% smaller
increases in rotation and translation error, respectively. In real-world
scenarios, FracGM achieves better results in 13 out of 18 outcomes, with a
19.43% improvement in
the computation time.
comment: 8 pages, 6 figures
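To show why the Geman-McClure estimator is fractional and robust to outliers, the sketch below evaluates one common form of the loss, rho(r) = c^2 r^2 / (c^2 + r^2). The exact parameterization used by FracGM may differ; the scale parameter `c` and the function names are assumptions, and no part of the FracGM solver itself is reproduced here.

```python
from typing import Iterable


def geman_mcclure(residual: float, c: float = 1.0) -> float:
    """One common form of the Geman-McClure loss: c^2 r^2 / (c^2 + r^2).

    Behaves like r^2 for small residuals and saturates below c^2 for
    large ones, so a single outlier cannot dominate the total cost.
    """
    r2 = residual * residual
    c2 = c * c
    return c2 * r2 / (c2 + r2)


def total_cost(residuals: Iterable[float], c: float = 1.0) -> float:
    """Sum of Geman-McClure losses over all residuals."""
    return sum(geman_mcclure(r, c) for r in residuals)


inlier_loss = geman_mcclure(0.01)    # close to the squared residual 0.0001
outlier_loss = geman_mcclure(1000.0)  # saturates just below c^2 = 1
```

Each term is a ratio of two functions of the residual, which is what makes the overall objective a sum-of-ratios (fractional) problem.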