A novel method for generating artistic representations of Oracle Bone Inscriptions (OBI) has emerged, leveraging advancements in vector graphics and deep learning techniques. This approach combines differentiable vector graphics rendering, diffusion models, and text-to-image synthesis, marking a significant step in the fusion of traditional and modern design techniques.
The technique builds on differentiable vector graphics rendering. Rendering converts vector graphics, which are composed of paths and their control points, into raster images. Conventional rasterization pipelines such as OpenGL are efficient, but their discrete sampling discards gradient information, which makes gradient-based optimization difficult. In response, Li et al. introduced DIFFVG in 2020, a renderer that is differentiable end to end from vector path parameters to pixel space. Control points can then be optimized directly against losses computed by neural networks, which benefits applications such as font design and image vectorization.
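To see why discrete sampling blocks gradient-based optimization, consider a one-dimensional toy: a shape covers everything left of an edge at position `edge_x`, and a row of unit-width pixels is rendered from it. The sketch below is an illustration only, not DIFFVG's algorithm or API, and all function names are hypothetical. It compares one point sample per pixel with exact area coverage and estimates the gradient of a pixel loss with respect to `edge_x` by finite differences.

```python
def point_sampled_row(edge_x, width):
    """One sample at each pixel center: 1.0 if the center lies left of the edge."""
    return [1.0 if (i + 0.5) < edge_x else 0.0 for i in range(width)]


def area_coverage_row(edge_x, width):
    """Exact fraction of each unit pixel [i, i + 1) that lies left of the edge."""
    return [min(1.0, max(0.0, edge_x - i)) for i in range(width)]


def loss(row, target):
    """Sum of squared pixel differences."""
    return sum((r - t) ** 2 for r, t in zip(row, target))


def finite_diff_grad(render, edge_x, target, h=1e-4):
    """Central finite-difference estimate of d(loss)/d(edge_x)."""
    width = len(target)
    hi = loss(render(edge_x + h, width), target)
    lo = loss(render(edge_x - h, width), target)
    return (hi - lo) / (2.0 * h)


if __name__ == "__main__":
    target = area_coverage_row(5.3, 8)  # the edge we would like to reach
    print("point-sampled gradient:", finite_diff_grad(point_sampled_row, 2.6, target))
    print("area-coverage gradient:", finite_diff_grad(area_coverage_row, 2.6, target))
```

With the edge at 2.6, a small move flips no point sample, so the point-sampled gradient is zero and an optimizer receives no signal. The area-coverage gradient is negative, meaning that moving the edge to the right, toward the target, lowers the loss.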
The generative component is a diffusion model, a class of deep generative models inspired by non-equilibrium thermodynamics. These models work in two stages: a forward process that incrementally adds noise to data, and a reverse process in which a neural network is trained to remove that noise and reconstruct the original data. Standard diffusion models produce high-quality outputs but need many sampling steps, which makes them computationally intensive. Denoising Diffusion Implicit Models (DDIM) improve efficiency by using a non-Markovian forward process, which allows high-quality generation in fewer steps, an efficiency gain that later models such as Stable Diffusion build on.
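The two stages can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: it uses a linear noise schedule with the commonly used endpoints 1e-4 and 0.02, operates on a plain list of numbers rather than an image, and replaces the trained denoising network with the true noise so that the arithmetic can be checked. The function names are hypothetical.

```python
import math


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars, prod = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def forward_noise(x0, alpha_bar, eps):
    """Closed-form forward process: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, eps)]


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """Deterministic DDIM update (eta = 0) from step t to an earlier step."""
    a_t, b_t = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    # Estimate the clean sample implied by the predicted noise.
    x0_pred = [(x - b_t * e) / a_t for x, e in zip(x_t, eps_pred)]
    a_p, b_p = math.sqrt(alpha_bar_prev), math.sqrt(1.0 - alpha_bar_prev)
    # Re-noise that estimate to the earlier noise level.
    return [a_p * x0 + b_p * e for x0, e in zip(x0_pred, eps_pred)]


def sample_with_known_noise(x0, eps, alpha_bars, timesteps):
    """Run DDIM over a sparse, decreasing list of timesteps.

    The true noise eps stands in for a trained network's prediction.
    """
    x = forward_noise(x0, alpha_bars[timesteps[0]], eps)
    for i, t in enumerate(timesteps):
        is_last = i + 1 == len(timesteps)
        prev = 1.0 if is_last else alpha_bars[timesteps[i + 1]]
        x = ddim_step(x, eps, alpha_bars[t], prev)
    return x


if __name__ == "__main__":
    bars = make_alpha_bars(1000)
    x0, eps = [0.5, -1.0, 2.0], [0.3, -0.7, 1.1]
    # Five sampling steps instead of one thousand.
    print(sample_with_known_noise(x0, eps, bars, [999, 749, 499, 249, 0]))
```

Because the update jumps between arbitrary noise levels, the sampler can skip most of the schedule; with a perfect noise prediction the five-step run above recovers `x0` up to floating-point error.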
Diffusion models have in turn transformed text-to-image generation. Stable Diffusion, introduced by Rombach et al., generates high-quality images from textual descriptions. It encodes the text prompt into a semantic condition that guides the diffusion process in a compressed latent space, where a U-Net denoiser iteratively removes noise from a random sample until the result matches the description. Training such models is resource-intensive, however. To address this, Poole et al. proposed Score Distillation Sampling (SDS), which uses a pre-trained diffusion model to guide the optimization of another representation, such as a 3D model or a vector graphic. SDS can yield overly smooth images. To counteract this, the LucidDreamer work introduced Interval Score Matching (ISM), which matches scores between two interval steps in the diffusion process and produces outputs with richer detail.
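For reference, the SDS gradient is commonly written as follows, in the notation of Poole et al. rather than of this article. Here $\theta$ denotes the parameters being optimized (in this setting, vector control points), $x = g(\theta)$ is the differentiable render, $y$ is the text condition, $\hat{\epsilon}_\phi$ is the pre-trained noise predictor, and $w(t)$ is a timestep weighting.

```latex
x_t = \alpha_t\, x + \sigma_t\, \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)

\nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right]
```

The pre-trained model is only evaluated, not differentiated, so the gradient reaches $\theta$ through the renderer alone.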
The method relies on the Oracle Bone Font Glyph Dataset (OFGD), built from the “HanYi Chen Style Oracle Bone Inscriptions” font library, which contains 3,665 commonly used OBI characters with vector contour data in TrueType format. The original font files display consistently, but their limited control points can reduce precision. The glyphs are therefore preprocessed into cubic Bezier curves and stored in SVG format. This improves the geometric representation and benefits the subsequent font synthesis steps.
The cubic Bezier curves are generated by an algorithm that chooses control points adaptively, balancing fidelity against computational cost. It loads the vector contour data and recursively subdivides any curve segment that exceeds a defined threshold, which increases control point density while preserving the topological structure of the original glyph. The result is a refined set of Bezier curves with the flexibility and accuracy that subsequent transformations require.
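A minimal sketch of such a subdivision scheme follows. Several details are assumptions, since the article does not specify them: the threshold is applied to the length of a segment's control polygon, quadratic TrueType segments are first raised to cubic form by degree elevation, and each split is made at the parameter midpoint with de Casteljau's construction. Function names are hypothetical.

```python
import math


def quadratic_to_cubic(q0, q1, q2):
    """Degree-elevate a quadratic Bezier segment to an identical cubic one."""
    c1 = (q0[0] + 2.0 / 3.0 * (q1[0] - q0[0]), q0[1] + 2.0 / 3.0 * (q1[1] - q0[1]))
    c2 = (q2[0] + 2.0 / 3.0 * (q1[0] - q2[0]), q2[1] + 2.0 / 3.0 * (q1[1] - q2[1]))
    return (q0, c1, c2, q2)


def lerp(p, q, t):
    """Linear interpolation between two 2D points."""
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))


def split_cubic(curve, t=0.5):
    """Split a cubic Bezier into two cubics that trace the same shape."""
    p0, p1, p2, p3 = curve
    a, b, c = lerp(p0, p1, t), lerp(p1, p2, t), lerp(p2, p3, t)
    d, e = lerp(a, b, t), lerp(b, c, t)
    m = lerp(d, e, t)  # the point on the curve at parameter t
    return (p0, a, d, m), (m, e, c, p3)


def polygon_length(curve):
    """Length of the control polygon, an upper bound on the curve's arc length."""
    return sum(
        math.hypot(curve[i + 1][0] - curve[i][0], curve[i + 1][1] - curve[i][1])
        for i in range(3)
    )


def subdivide(curve, max_len, depth=0, max_depth=12):
    """Recursively split a segment until each piece is within the threshold."""
    if polygon_length(curve) <= max_len or depth >= max_depth:
        return [curve]
    left, right = split_cubic(curve)
    return (subdivide(left, max_len, depth + 1, max_depth)
            + subdivide(right, max_len, depth + 1, max_depth))


if __name__ == "__main__":
    cubic = quadratic_to_cubic((0.0, 0.0), (50.0, 100.0), (100.0, 0.0))
    pieces = subdivide(cubic, max_len=40.0)
    print(len(pieces), "segments,", 3 * len(pieces) + 1, "control points")
```

Splitting never moves the curve: each pair of child segments traces exactly the parent segment, so the outline and its topology are unchanged while long segments gain control points and short ones are left alone.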
The OBI-Designer framework generates artistic OBI characters in two stages, aiming to keep them readable while improving their aesthetic quality. The first stage combines DIFFVG differentiable rasterization with diffusion models. The second stage performs texture synthesis, using ControlNet and LoRA for contour refinement. Together, the two stages let the framework generate high-quality oracle bone script art characters efficiently. The glyph synthesis pipeline starts from a learnable set of control points and optimizes it against a given text prompt to produce semantically aligned glyphs.
To preserve the integrity of OBI glyphs during optimization, the method combines three loss functions: ISM loss, ACAP loss, and Tone loss. ISM loss uses DDIM inversion to generate stable pseudo-labels. ACAP loss limits geometric deformation by comparing the internal angles of triangles formed by the control points. Tone loss preserves the overall structure by constraining changes to low-frequency information. Together, these losses yield artistic characters that are both visually appealing and legible.
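The two geometric terms can be sketched as follows. This is an illustration under stated assumptions rather than the framework's implementation: triangles are supplied as index triples instead of being computed by a triangulation, angle differences are averaged as squared errors, and a simple box blur stands in for whatever low-pass filter the method actually uses. ACAP is commonly expanded as "as conformal as possible"; the function names are hypothetical.

```python
import math


def _angle(p, q, r):
    """Interior angle at vertex p of triangle (p, q, r), in radians."""
    v1 = (q[0] - p[0], q[1] - p[1])
    v2 = (r[0] - p[0], r[1] - p[1])
    norm = math.hypot(v1[0], v1[1]) * math.hypot(v2[0], v2[1])
    if norm == 0.0:
        return 0.0  # degenerate triangle; treated as a zero angle in this sketch
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    return math.acos(max(-1.0, min(1.0, cos)))


def triangle_angles(points, tri):
    """The three interior angles of the triangle given by an index triple."""
    a, b, c = (points[i] for i in tri)
    return [_angle(a, b, c), _angle(b, c, a), _angle(c, a, b)]


def acap_loss(original, deformed, triangles):
    """Mean squared change in interior angles between two point sets."""
    total, count = 0.0, 0
    for tri in triangles:
        before = triangle_angles(original, tri)
        after = triangle_angles(deformed, tri)
        for a0, a1 in zip(before, after):
            total += (a1 - a0) ** 2
            count += 1
    return total / count


def box_blur(image, radius):
    """Average each pixel with its neighbors; the window is clipped at borders."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [image[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


def tone_loss(image_a, image_b, radius=1):
    """Mean squared error between low-pass filtered versions of two images."""
    blur_a, blur_b = box_blur(image_a, radius), box_blur(image_b, radius)
    n = len(blur_a) * len(blur_a[0])
    return sum((pa - pb) ** 2
               for ra, rb in zip(blur_a, blur_b)
               for pa, pb in zip(ra, rb)) / n


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    tris = [(0, 1, 2), (1, 3, 2)]
    sheared = [(x + 0.8 * y, y) for x, y in pts]
    print("ACAP loss after shear:", acap_loss(pts, sheared, tris))
```

An angle-based term ignores translation and uniform scaling but penalizes shear, which is the kind of deformation that makes a glyph hard to recognize. The blurred comparison, in turn, tolerates fine detail changes while penalizing shifts in the overall distribution of ink.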
The texture synthesis stage applies a structure-preserving fusion method built on ControlNet and LoRA models. It transfers textures from artistic styles while keeping the glyphs clear and detailed. This multi-step process integrates structure and texture, producing artistic OBI characters suitable for modern applications.
The implications of this innovative approach extend beyond aesthetic considerations, suggesting a future where traditional art forms can be revitalized and adapted through advanced technologies. This intersection of cultural heritage and technological advancement showcases the potential for AI to redefine artistic expression.