Human-like planning skills and dexterous manipulation have long posed challenges in the fields of robotics and artificial intelligence (AI). The task of reinterpreting calligraphy presents a formidable challenge, as it involves the decomposition of strokes and dexterous utensil control. Previous efforts have primarily focused on supervised learning of a single instrument, limiting the performance of robots in the realm of cross-domain text replication.
To address these challenges, we propose CalliRewrite: a coarse-to-fine approach for robot arms to discover and recover plausible writing orders from diverse calligraphy images without requiring labeled demonstrations. Our model achieves fine-grained control of various writing utensils. Specifically, an unsupervised image-to-sequence model decomposes a given calligraphy glyph to obtain a coarse stroke sequence. Using an RL algorithm, a simulated brush is fine-tuned to generate stylized trajectories for robotic arm control. Evaluation in simulation and physical robot scenarios reveals that our method successfully replicates unseen fonts and styles while achieving integrity in unknown characters.
We focus on unsupervised learning from plain images and on discovering dexterous control over various utensils. Our method employs a hierarchical structure comprising a CNN-encoded LSTM model that deduces stroke-level orders and a reinforcement learning (RL) pipeline that fine-tunes the coarse sequences into tool-aware stylized control, with the brush agent trained using the Soft Actor-Critic (SAC) algorithm.
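For reference, the standard maximum-entropy objective that SAC optimizes can be written as below. This is the generic textbook form, not the specific reward or constraint design used for the brush agent; here $r$ denotes a per-step reward, $\alpha$ the entropy temperature, $\rho_\pi$ the state-action distribution induced by the policy $\pi$, and $\mathcal{H}$ the entropy of the policy at a state.

```latex
J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
  \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]
```

The entropy term rewards the policy for staying stochastic, which encourages exploration of alternative brush motions during fine-tuning.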
An overview of the training process of the CalliRewrite pipeline. We build our coarse sequence extraction module on Mo et al. and propose tailored unsupervised loss functions for human-like glyph decomposition. In the second phase, we formulate the task as a constrained optimization problem and apply the SAC algorithm in our crafted environment to fine-tune dexterous control sequences. Coarse sequences are used as initial states to curtail ineffective exploration and speed up training.
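The idea of seeding the environment with a coarse sequence can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `StrokeRefineEnv`, the `(x, y, pressure)` control-point layout, and the `max_offset` bound are hypothetical and are not taken from the CalliRewrite implementation. The reward and the rendering of the brush are omitted entirely. The sketch only shows how an episode can start from the coarse stroke and restrict the agent to bounded corrections around it, which is one way to limit ineffective exploration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical control point layout: (x, y, pressure).
Point = Tuple[float, float, float]


@dataclass
class StrokeRefineEnv:
    """Toy environment: the agent corrects a coarse stroke one control point at a time."""

    coarse: List[Point]
    max_offset: float = 0.05  # assumed bound on each per-point correction

    def reset(self) -> Point:
        # Start the episode from the coarse sequence instead of a blank trajectory.
        self.refined = list(self.coarse)
        self.t = 0
        return self.refined[0]

    def step(self, action: Point):
        # Clip the residual so the refined point stays close to the coarse prior.
        clipped = tuple(
            max(-self.max_offset, min(self.max_offset, a)) for a in action
        )
        base = self.coarse[self.t]
        self.refined[self.t] = tuple(b + c for b, c in zip(base, clipped))
        self.t += 1
        done = self.t >= len(self.coarse)
        # Next observation is the upcoming coarse point, or the last refined one.
        obs = self.refined[-1] if done else self.coarse[self.t]
        return obs, done
```

A policy trained with any off-policy algorithm would then output the residual `action` at each step, and `env.refined` holds the adjusted trajectory once `done` is true.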