| description abstract | The construction industry has long been plagued by low productivity and high injury and fatality rates. Robots have been envisioned to automate the construction process, thereby substantially improving construction productivity and safety. Despite this enormous potential, teaching robots to perform complex construction tasks is challenging. We present a generalizable framework that harnesses human teleoperation data to train construction robots to perform repetitive construction tasks. First, we develop a teleoperation method and interface to control robots on construction sites, serving as an intermediate solution toward full automation. Teleoperation data from human operators, along with context information from the job site, can be collected for robot learning. Second, we propose a new method for extracting keyframes from human operation data to reduce noise and redundancy in the training data, thereby improving robot learning efficacy. We then propose a hierarchical imitation learning method that incorporates these keyframes to train the robot to generate appropriate trajectories for construction tasks. Third, we model the robot’s visual observations of the working space in a compact latent space to improve learning performance and reduce computational load. To validate the proposed framework, we conduct experiments in which a robot learns to generate appropriate trajectories for excavation tasks from human operators’ teleoperation data. The results suggest that the proposed method outperforms state-of-the-art approaches, indicating its potential for practical application. |  |