Bleu Robotics

Many factories use robots, but each automation project is far more complex, and therefore more costly, than it might initially appear. Current manipulators are primarily designed to repeat trajectories without variation or adaptation. For each workstation, engineers have three levers for automation: (1) limiting variability (for example, in the position of objects to grasp) using accessories such as vibratory bowls and guides, (2) developing specialized vision-based algorithms to perceive this variability, and (3) developing task-specific grippers for each object (typically suction-based). Beyond hardware costs, this development process requires substantial engineering investment. This expense is only justified for tasks with high production volumes and few changes (product changes, process modifications, etc.). These automation costs explain why, to our knowledge, only a small percentage of tasks are fully automated in French and German factories.

Our analysis of industrial needs has clearly identified the value of "universal" robots that can be easily configured for many applications and replace a worker without requiring workstation modifications. The typical scenario we are targeting involves a bimanual wheeled robot (humanoid capabilities: mobility and bimanual manipulation) that learns a task either through teleoperation with a joystick or using manual grippers (see study 1), then reproduces it from demonstrations. A common factory example is machine loading: a worker must grasp a part from a bin or box, place it in a machine (e.g., a milling machine, laser engraver, or press), close the machine, press a button, wait for the result, then grasp the part, and position it in a new box. Typically, a factory operates several slightly different machines (different generations) and may change object types and sequences frequently based on orders (e.g., monthly). The critical scientific challenge is enabling robot training from demonstrations alone, as naturally as possible—without needing to manually sequence subtasks, scan objects for 3D meshes, train object classifiers, etc., as is currently required to automate a production line. In other words, the goal is to reduce automation costs to hardware and software costs only. This is fundamentally an imitation learning problem.

These tasks typically involve variability because they are designed for human workers (as opposed to fully automated production lines), who naturally adapt to variations: for example, the objects or machines to manipulate are not always in the same position or orientation. Furthermore, if the robot moves, perfect positioning relative to objects and machines cannot be guaranteed (unlike with a fixed robot). Cameras and computer vision are therefore necessary; simply "replaying" recorded trajectories is not enough.

Video (vidéo Poste Gravage laser.mp4): Loading a laser engraver with a bimanual mobile robot (Inria/Bleu Robotics experiment at an automotive subcontractor, 2025).

The hardware (humanoid robots or wheeled humanoid robots) is almost ready: humanoid robots that cost less than USD 30,000 can now be purchased, and competition between manufacturers keeps improving their capabilities.

What's missing is the right software to fulfill the needs of industrial tasks and, more generally, the vision of universal, flexible robots in factories. To achieve that, we need the right learning technology.

[Figure: bleu-pipeline.png]

We have identified five critical characteristics for the learning system:

  1. Demonstration duration, and therefore the number of demonstrations, must be minimized because recording demonstrations often means stopping the production line (which is very costly) and/or requires operators capable of performing the task. Tasks typically last 1–2 minutes, so 30 demonstrations already represent at least 30–60 minutes, excluding setup time. Most factory managers we contacted already consider this lengthy. As a consequence, we should aim to learn tasks from 10 to 50 demonstrations.
  2. Computation time for learning from data must also be minimized to reduce deployment time for new tasks. Waiting several hours to test the results of demonstrations is generally considered costly "downtime" that accumulates when tasks change frequently or when many tasks need to be learned.
  3. The robot must have two arms (a significant portion of tasks require both hands), a mobile torso, and the ability to move short distances (a few meters) to perform worker tasks without significantly modifying the organization. The goal is to make the robot's workspace, accounting for its mobility, resemble that of humans as closely as possible.
  4. Objects to manipulate cannot be known in advance and generally have shapes not found in standard datasets (e.g., internal engine parts, various mechanical components), making their automatic detection difficult.
  5. Industrial tasks are typically long sequences (at least 1–2 minutes, meaning at least 500–1000 time steps for a robot) with 5–10 different stages, distinguishing them from reactive control problems (e.g., trajectory following, walking, or maintaining a setpoint despite perturbations).
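The arithmetic behind items 1 and 5 can be made explicit with a minimal sketch. The function names are hypothetical, and the control rate is an assumption: it is simply the rate implied by the figures above (about 500 time steps per minute), not a stated property of the system.

```python
def demo_budget_minutes(n_demos, task_minutes):
    """Total recording time in minutes, excluding setup time."""
    return n_demos * task_minutes


def steps_per_task(task_minutes, control_hz):
    """Number of control time steps in one execution of the task."""
    return round(task_minutes * 60 * control_hz)


# Control rate implied by "1 minute is about 500 time steps"
# (an assumption derived from the text, not a measured value).
IMPLIED_HZ = 500 / 60

for n_demos in (10, 30, 50):
    low = demo_budget_minutes(n_demos, 1)   # 1-minute task
    high = demo_budget_minutes(n_demos, 2)  # 2-minute task
    print(f"{n_demos} demonstrations: {low}-{high} minutes of recording")

for minutes in (1, 2):
    print(f"{minutes}-minute task: {steps_per_task(minutes, IMPLIED_HZ)} time steps")
```

Under these assumptions, the upper end of the 10 to 50 demonstration range already costs between 50 and 100 minutes of recording for 1–2 minute tasks, which is why the number of demonstrations is treated as a primary constraint.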

An overview of the Abductive Algorithm principle

The learning algorithm developed at Inria that led to the creation of Bleu Robotics builds on our experience with learning for humanoid robots (Rouxel, Ferrari, et al. 2024; Rouxel et al. 2025). We frame the imitation problem as a goal-conditioned reinforcement learning problem.
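To illustrate what a goal-conditioned framing can look like in general, here is a minimal sketch that turns one demonstration into goal-conditioned transitions by treating states reached later in the same demonstration as goals. This is a generic hindsight-style relabeling and is not the algorithm described in this paper; the `Transition` type, the `relabel_demo` function, the sparse reward, and the tolerance are all hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Transition:
    state: tuple
    action: tuple
    next_state: tuple
    goal: tuple
    reward: float


def goal_reward(next_state, goal, tol=1e-6):
    """Sparse reward (assumption): 0 when the goal is reached, -1 otherwise."""
    return 0.0 if math.dist(next_state, goal) <= tol else -1.0


def relabel_demo(states, actions, horizon=10):
    """Build goal-conditioned transitions from one demonstration.

    states has T + 1 entries and actions has T entries. For each step t,
    every state reached within `horizon` steps after t is used as a goal.
    """
    transitions = []
    n_steps = len(actions)
    for t in range(n_steps):
        last_goal_index = min(t + horizon, n_steps)
        for g in range(t + 1, last_goal_index + 1):
            goal = states[g]
            reward = goal_reward(states[t + 1], goal)
            transitions.append(
                Transition(states[t], actions[t], states[t + 1], goal, reward)
            )
    return transitions


# Toy 1-D demonstration: 6 states, 5 actions (displacements between states).
states = [(i / 5,) for i in range(6)]
actions = [(states[t + 1][0] - states[t][0],) for t in range(5)]
data = relabel_demo(states, actions, horizon=3)
print(len(data), "goal-conditioned transitions from a single demonstration")
```

The point of the sketch is only that a single demonstration yields many (state, goal) training pairs, which is one reason a goal-conditioned formulation can be attractive when demonstrations are scarce.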