Realistic hand manipulation is a key component of immersive virtual reality (VR), yet existing methods often rely on kinematic approaches or motion-capture datasets that omit crucial physical attributes such as contact forces and finger torques. As a result, these approaches tend to produce tight, one-size-fits-all grips rather than grips that reflect users' intended force levels.
We present ForceGrip, a deep learning agent that synthesizes realistic hand manipulation motions while faithfully reflecting the user's grip force intention. Instead of mimicking predefined motion datasets, ForceGrip is trained on generated scenarios that randomize object shapes, wrist movements, and trigger input flows, challenging the agent with a broad spectrum of physical interactions.
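As a rough illustration of what such scenario generation might look like, the sketch below samples a random primitive object, a smooth random wrist trajectory, and a piecewise-constant trigger input flow. The `sample_training_scenario` helper, the choice of primitives, and all value ranges are illustrative assumptions, not the paper's actual generator.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_training_scenario(horizon: int = 300):
    """Draw one randomized training scenario (illustrative parameterization)."""
    # Random object shape and size (assumed primitive set and size range).
    shape = rng.choice(["box", "sphere", "cylinder"])
    scale = rng.uniform(0.04, 0.12)  # object extent in meters

    # Random wrist trajectory: a smooth sum of low-frequency sinusoids.
    t = np.linspace(0.0, 1.0, horizon)
    freqs = rng.uniform(0.5, 2.0, size=3)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=3)
    wrist_pos = 0.05 * np.stack(
        [np.sin(2.0 * np.pi * f * t + p) for f, p in zip(freqs, phases)],
        axis=-1,
    )  # (horizon, 3) positional offsets around a rest pose

    # Random trigger input flow: piecewise-constant grip targets in [0, 1].
    n_seg = int(rng.integers(2, 6))
    cuts = np.sort(rng.integers(1, horizon, size=n_seg - 1))
    starts = np.concatenate(([0], cuts))
    ends = np.concatenate((cuts, [horizon]))
    levels = rng.uniform(0.0, 1.0, size=n_seg)
    trigger = np.concatenate(
        [np.full(e - s, lvl) for s, e, lvl in zip(starts, ends, levels)]
    )  # (horizon,) per-step trigger values

    return {"shape": shape, "scale": scale, "wrist_pos": wrist_pos, "trigger": trigger}
```

Because every episode draws a fresh scenario, the agent cannot overfit to a fixed set of demonstrated motions and must instead learn force control that generalizes across shapes, wrist dynamics, and input patterns.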
To effectively learn from these complex tasks, we employ a three-phase curriculum learning framework comprising Finger Positioning, Intention Adaptation, and Dynamic Stabilization. This progressive strategy ensures stable hand-object contact, adaptive force control based on user inputs, and robust handling under dynamic conditions. Additionally, a proximity reward function enhances natural finger motions and accelerates training convergence. Quantitative and qualitative evaluations reveal ForceGrip’s superior force controllability and plausibility compared to state-of-the-art methods.
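The exact form of the proximity reward is not given here; as a minimal sketch, one plausible formulation rewards fingertips for staying near the object surface, with a smooth exponential falloff. The nearest-surface-point formulation and the `sigma` length scale are assumptions for illustration only.

```python
import numpy as np

def proximity_reward(fingertips, surface_points, sigma=0.02):
    """Dense shaping reward that grows as fingertips approach the object.

    fingertips:     (F, 3) fingertip positions.
    surface_points: (S, 3) points sampled on the object's surface.
    sigma:          length scale in meters (assumed value).
    """
    # Pairwise distances, then each fingertip's nearest surface point.
    d = np.linalg.norm(fingertips[:, None, :] - surface_points[None, :, :], axis=-1)
    nearest = d.min(axis=1)  # (F,)
    # Exponential falloff keeps the reward smooth and bounded in (0, 1],
    # giving a dense learning signal even before any contact occurs.
    return float(np.exp(-nearest / sigma).mean())
```

A dense, bounded signal of this kind is consistent with the stated effect of the proximity reward: it guides fingers toward natural contact configurations and speeds up convergence relative to sparse contact-only rewards.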