The Robots Build Now, Too

At Generalist, we’re working towards a future where robots can “just do anything,” and we’re excited to share a step in this direction.

One of our newest internal evaluation tasks is one-shot assembly. A person constructs a small structure, and the robot copies it. We’re evaluating our models on how well they can build Legos – end-to-end, from pixels to 100Hz actions. No task-specific engineering, no custom instructions: it sees what you build and replicates it.

Why this matters

We were inspired by your suggestions on our previous Lego throwing demos, and as far as we know this is the world’s first robot to assemble Legos with end-to-end visuomotor control. If you have tasks that you want to see robots do, we’d love to hear about them.

Note: there are expected bounds on how far what’s shown in the video generalizes. We’ve only tested model capabilities on 3-brick structures of two-by-four Lego bricks, with each brick in one of 4 colors. Calculating how many possibilities this presents is not easy. (If this is easy for you, please reach out for a job.) If we agree that there are 1,560 uncolored 3-brick combinations of two-by-four Lego bricks, then having 4 color options for each of the 3 bricks gives 4 × 4 × 4 × 1,560 = 99,840 possible combinations.
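For anyone who wants to check the arithmetic, here is a minimal Python sketch of the count above. It takes the 1,560 figure as given rather than deriving it, and it assumes each brick's color is chosen independently. The function and constant names are ours, purely for illustration.

```python
# Figures taken from the note above. The 1,560 count of uncolored
# 3-brick structures is assumed, not derived here.
UNCOLORED_STRUCTURES = 1_560
NUM_COLORS = 4
NUM_BRICKS = 3


def colored_combinations(uncolored: int, colors: int, bricks: int) -> int:
    """Count colored structures, assuming each brick independently
    takes one of `colors` colors."""
    return colors ** bricks * uncolored


total = colored_combinations(UNCOLORED_STRUCTURES, NUM_COLORS, NUM_BRICKS)
print(f"{total:,}")  # 99,840
```

Changing NUM_COLORS or NUM_BRICKS shows how quickly the space grows, though the uncolored count would also change with the number of bricks, so the 1,560 figure only applies to the 3-brick case.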