AlignIt

Model-Free Robot Arm Alignment

Precisely align robot grippers with objects using RGB(D) cameras and neural networks. No CAD models, no markers, no complex setup—just record, train, and align.

  • Inference speed: real-time
  • CAD models: none required (model-free)
  • Camera input: RGB(D)

See AlignIt in Action

From data collection to autonomous alignment

Teaching Procedure

Record demonstration data as the robot moves around the target object. The system automatically collects and labels alignment data for training.

Continuous Alignment

Real-time inference predicts relative poses to precisely align the gripper with the object, enabling autonomous manipulation with high accuracy.

Three Simple Steps

1. Record an Object

Position your robot near the target object. The system automatically records camera images and corresponding poses as you demonstrate the alignment task.
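As a rough sketch of what this automatic collection and labeling amounts to (the `get_image`/`get_pose` callbacks and both function names are hypothetical illustrations, not AlignIt's actual API):

```python
def record_demonstration(get_image, get_pose, n_samples):
    """Collect (image, pose) pairs; the robot's own kinematics label
    every sample, so no manual annotation is needed."""
    return [{"image": get_image(), "pose": get_pose()}
            for _ in range(n_samples)]

def label_relative_poses(samples, aligned_pose):
    """Relabel each sample with its translational offset from the final,
    aligned pose (rotation is handled analogously in practice)."""
    for s in samples:
        s["offset"] = [a - p for a, p in zip(aligned_pose, s["pose"])]
    return samples
```

Because every image is paired with the pose the robot was actually in, the offset to the aligned pose becomes a free training label.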

2. Train the Model

The AlignNet neural network learns the relationship between camera observations and the required alignment actions from your demonstration data.
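A hedged sketch of this step, assuming a plain MSE regression onto the labeled 9D relative pose (the actual AlignIt loss and training schedule may differ):

```python
import torch
import torch.nn as nn

def train_alignment_model(model, images, targets, epochs=100, lr=1e-2):
    """Fit a pose-regression model to (image, relative-pose) pairs.
    Illustrative sketch: assumes an MSE objective, which the real
    project may refine."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(images), targets)  # predicted vs labeled pose
        loss.backward()
        opt.step()
    return loss.item()
```

Any module that maps a batch of images to a `(batch, 9)` pose tensor can be trained this way.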

3. Align It

The trained model predicts relative poses in real time, guiding the gripper to the target object with sub-millimeter precision.

Key Features

Model-Free Approach

No need for CAD models, 3D reconstructions, or object markers. AlignIt learns directly from camera observations and robot poses.

Real-Time Performance

Fast neural network inference enables smooth, responsive alignment even on standard computing hardware.

RGB(D) Support

Works with standard RGB cameras or RGB-D sensors like Intel RealSense for enhanced depth perception.

Robot Agnostic

Compatible with UFactory xArm robots and extensible to other manipulators through a simple interface.

Deep Learning Based

Powered by AlignNet, a convolutional neural network that predicts 6-DOF relative poses.

EfficientNet / ResNet

Easy Data Collection

Automated spiral trajectory generation and data labeling make dataset creation fast and effortless.

Technical Highlights

The core technology behind AlignIt's precision

Neural Network

AlignNet Architecture

A custom convolutional neural network that processes RGB(D) images to predict 6-DOF relative transformations.

  • Configurable backbones (EfficientNet, ResNet)
  • Multi-view feature aggregation
  • Depth fusion for accuracy
  • 9D output (3D translation + 6D rotation)
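The shape of such a network can be sketched as follows. This is a minimal stand-in, not the real AlignNet: the tiny encoder replaces the configurable EfficientNet/ResNet backbone, and multi-view aggregation and depth fusion are omitted. The 6D-to-matrix conversion is the standard Gram-Schmidt construction used for continuous rotation representations.

```python
import torch
import torch.nn as nn

class AlignNetSketch(nn.Module):
    """Minimal stand-in: encode an RGB(D) image, regress a 9D pose
    (3D translation + 6D rotation representation)."""

    def __init__(self, in_channels: int = 3):
        super().__init__()
        # Tiny conv encoder standing in for a configurable backbone.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 9)  # 3 translation + 6 rotation values

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(x))

def rot6d_to_matrix(r6: torch.Tensor) -> torch.Tensor:
    """Gram-Schmidt: map a 6D rotation representation to a 3x3 matrix."""
    a1, a2 = r6[..., :3], r6[..., 3:]
    b1 = nn.functional.normalize(a1, dim=-1)
    b2 = nn.functional.normalize(
        a2 - (b1 * a2).sum(-1, keepdim=True) * b1, dim=-1)
    b3 = torch.cross(b1, b2, dim=-1)
    return torch.stack([b1, b2, b3], dim=-2)
```

The 6D representation avoids the discontinuities of Euler angles and quaternions, which is why pose-regression networks commonly prefer it.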

Data Engineering

Automatic Generation

Spiral trajectory generation creates diverse viewpoints around the target object for robust training.

  • Configurable cone angles & sweep ranges
  • Automatic pose labeling
  • Depth image recording
  • HuggingFace Datasets integration
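Assuming "spiral trajectory" means camera positions sweeping around a cone centred on the target, a minimal sketch (the function name and parameterisation are illustrative, not the project's API):

```python
import numpy as np

def spiral_viewpoints(radius, cone_angle_deg, sweep_deg, n_points):
    """Camera positions spiralling around the target at the origin:
    the tilt opens from 0 to the configured cone angle while the
    azimuth sweeps through the configured range."""
    tilt = np.radians(np.linspace(0.0, cone_angle_deg, n_points))
    azimuth = np.radians(np.linspace(0.0, sweep_deg, n_points))
    x = radius * np.sin(tilt) * np.cos(azimuth)
    y = radius * np.sin(tilt) * np.sin(azimuth)
    z = radius * np.cos(tilt)
    return np.stack([x, y, z], axis=1)  # (n_points, 3) positions
```

Pointing the camera at the origin from each of these positions yields the diverse viewpoints, and the commanded poses double as the training labels.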

Implementation

Inference & Control

Real-time alignment loop continuously predicts and executes corrective motions until convergence.

  • Configurable tolerance thresholds
  • Rotation matrix acceleration
  • Convergence detection
  • Servo control integration
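The loop above can be sketched as a damped predict-and-correct cycle. The callbacks `predict_offset`, `move_relative`, and `get_pose` are hypothetical stand-ins for model inference, the servo command, and pose readback:

```python
import numpy as np

def align_loop(predict_offset, move_relative, get_pose,
               trans_tol=1e-3, rot_tol_deg=1.0, max_iters=50, gain=0.5):
    """Predict a relative pose, command a damped corrective motion,
    and stop once both tolerances are met (convergence detection).
    Hedged sketch, not AlignIt's actual control code."""
    for i in range(max_iters):
        dt, drot_deg = predict_offset(get_pose())  # metres, degrees
        if np.linalg.norm(dt) < trans_tol and abs(drot_deg) < rot_tol_deg:
            return True, i  # converged within tolerance
        move_relative(gain * np.asarray(dt), gain * drot_deg)
    return False, max_iters
```

The gain below 1 trades speed for stability: each cycle removes only part of the predicted error, so noisy predictions cannot overshoot the target.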

Applications

Manufacturing

Precise part insertion, component assembly, and quality inspection tasks.

Bin Picking

Align grippers with objects in bins without needing explicit object models.

Lab Automation

Handle delicate samples, align with test fixtures, or perform liquid handling.

Research

Teach vision-based manipulation or explore learning-based robotics.

Technologies

xArm / UFactory
Intel RealSense
PyTorch
HuggingFace
MuJoCo
Python 3.8+

Get Started with AlignIt

Open-source and ready to use. Build your own vision-based alignment solution or collaborate with us for custom projects.

Apache 2.0 License • Spes Robotics