BricksRL: A Platform for Democratizing Robotics and Reinforcement Learning Research and Education with LEGO

1 Universitat Pompeu Fabra, 2 ICREA, 3 PyTorch Team, 4 Meta

Abstract

We present BricksRL, a platform designed to democratize access to robotics for reinforcement learning research and education. BricksRL facilitates the creation, design, and training of custom LEGO robots in the real world by interfacing them with the TorchRL library for reinforcement learning agents. The integration of TorchRL with the LEGO hubs, via bidirectional Bluetooth communication, enables state-of-the-art reinforcement learning training on GPUs for a wide variety of LEGO builds. This offers a flexible and cost-efficient approach for scaling and also provides a robust infrastructure for robot-environment-algorithm communication. We present various experiments across tasks and robot configurations, providing building plans and training results. Furthermore, we demonstrate that inexpensive LEGO robots can be trained end-to-end in the real world to accomplish simple tasks, with training times typically under 120 minutes on a normal laptop. Moreover, we show how users can extend the platform's capabilities, exemplified by the successful integration of non-LEGO sensors. By enhancing accessibility to both robotics and reinforcement learning, BricksRL establishes a strong foundation for democratized robotic learning in research and educational settings.


Experiments

RoboArm

The videos display the evaluation of two SAC agents: one trained entirely in the real-world environment, RoboArm-v0 (left), and another trained in the simulation environment, RoboArmSim-v0 (right). Four goal positions, shown at the top of the video, were selected to evaluate the effectiveness of the trained policies; the agent must reach them to demonstrate task completion.

RoboArm Combined

RoboArm Mixed Observation

The video presents successive evaluation trials of the SAC agent in the RoboArm_mixed-v0 environment, which combines direct sensor readings of the robot arm angles with image inputs. The objective is to move the red ball, held by the robotic arm, to a randomly sampled target position indicated by a green dot in the image. The plot illustrates the training performance of the algorithms in the RoboArm_mixed-v0 environment. It displays both the reward achieved and the number of episode steps taken to reach the target location.
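To make the reaching objective concrete, here is a minimal sketch of a distance-based reward for this kind of task. It is an illustration under stated assumptions, not the reward used in BricksRL: the function name `reach_reward`, the default image size, and the success radius are all hypothetical, and the pixel positions of the red ball and the green target are assumed to have been extracted from the image already.

```python
import math


def reach_reward(ball_xy, target_xy, image_size=(64, 64), success_radius_px=3.0):
    """Return (reward, done) for a pixel-space reaching task.

    Hypothetical helper: the reward is the negative Euclidean distance
    between ball and target, normalized by the image diagonal so that it
    lies in [-1, 0]. The episode is done once the ball is within
    `success_radius_px` pixels of the target.
    """
    dx = ball_xy[0] - target_xy[0]
    dy = ball_xy[1] - target_xy[1]
    distance = math.hypot(dx, dy)

    # Normalize by the largest possible distance inside the image.
    diagonal = math.hypot(image_size[0], image_size[1])
    reward = -distance / diagonal

    done = distance <= success_radius_px
    return reward, done


# Example: ball 5 pixels away from the target in a 30x40 image.
reward, done = reach_reward((0, 0), (3, 4), image_size=(30, 40))
```

A dense reward of this shape would also produce the two quantities shown in the plot: a reward that increases toward zero and a number of episode steps that falls as the agent reaches the target sooner.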

Walker

This example features a DroQ agent trained in the Walker-v0 environment, which is directly implemented in the real world (left), and a DroQ agent trained in the WalkerSim-v0 simulation (right). The comparison demonstrates a successful simulation-to-reality (sim2real) transfer of the policy, where both agents are tasked with learning a forward-moving walking gait.

2Wheeler

The videos show the evaluation of a trained SAC agent in the RunAway-v0 environment (left) and a TD3 agent in the Spinning-v0 environment (right), each performing a simple yet distinct task. In RunAway-v0, the agent's objective is to maximize the distance measured by an ultrasonic sensor. In Spinning-v0, the agent's task is to turn the 2Wheeler left or right based on indicators provided to the agent.
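One plausible way to express the RunAway-v0 objective as a per-step reward is to reward the change in measured distance between consecutive steps. The sketch below is an assumption for illustration only: the function name `runaway_reward`, the millimetre units, and the 2000 mm sensor range are hypothetical and are not taken from the BricksRL implementation.

```python
def runaway_reward(prev_distance_mm, distance_mm, max_distance_mm=2000.0):
    """Reward the increase in measured distance, scaled to [-1, 1].

    Hypothetical helper: both readings are clamped to the assumed sensor
    range [0, max_distance_mm] before the difference is taken, so a
    reading beyond that range cannot inflate the reward.
    """
    prev = min(max(prev_distance_mm, 0.0), max_distance_mm)
    curr = min(max(distance_mm, 0.0), max_distance_mm)
    return (curr - prev) / max_distance_mm


# Moving away from an obstacle yields a positive reward,
# moving toward it a negative one.
forward = runaway_reward(500.0, 700.0)
backward = runaway_reward(700.0, 500.0)
```

Rewarding the difference rather than the raw reading means the agent gains nothing from standing still far away from an obstacle; whether that is desirable depends on how the task is defined.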

Building Instructions

2Wheeler & RoboArm

Video recording of the building instructions for the 2Wheeler and the RoboArm robot.

Because the original set required to build the robots has been discontinued, we have investigated alternatives. We believe that the Spike Prime Set, combined with the Spike Extension set, should be sufficient to build the robots or similar versions of them. These sets provide the components and versatility needed to replicate the essential features and functionality of the original designs.

Walker

Instructions for assembling the Walker robot and other models that can be created with this kit are available here.

Other Projects

For inspiration and further exploration, we provide additional robots and projects to extend the experiments in BricksRL: