
Carnegie Mellon Unveils Image2Gcode Framework, Directly Converts 2D Images to G-code

Carnegie Mellon University’s Image2Gcode framework streamlines additive manufacturing by generating printer-ready G-code from 2D images, improving path efficiency by 2.4%.

Researchers at Carnegie Mellon University have unveiled Image2Gcode, an innovative deep learning framework capable of generating printer-ready G-code directly from 2D images. This breakthrough eliminates the traditional reliance on computer-aided design (CAD) models and slicing software, streamlining the additive manufacturing process. The findings, published on arXiv, highlight a diffusion-transformer model that converts sketches or photographs into executable manufacturing instructions, establishing a direct connection between visual design and fabrication.

Traditional additive manufacturing workflows involve multiple steps, including CAD modeling, mesh conversion, and slicing, each requiring specialized knowledge and extensive iterations. This complexity often hinders design modification and accessibility. Image2Gcode simplifies this process by creating a direct visual-to-instruction pathway using a denoising diffusion probabilistic model (DDPM). The framework generates structured extrusion trajectories straight from an image, bypassing the need for intermediate STL and CAD files.
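To make that output concrete, here is a minimal sketch of how a per-layer trajectory of (x, y, extrusion) points could be serialized into printer-ready G-code; the function name, header commands, and fixed feed rate are illustrative assumptions rather than the paper's implementation.

```python
# Minimal sketch (assumed names and values, not the paper's code): serialize a
# generated per-layer trajectory of (x, y, extrusion) points into G-code moves.
def trajectory_to_gcode(points, feed_rate=1800):
    """points: list of (x_mm, y_mm, e_mm) tuples for one layer."""
    lines = ["G90", "G21", "M82"]                # absolute positioning, millimetres, absolute extrusion
    x0, y0, _ = points[0]
    lines.append(f"G0 X{x0:.3f} Y{y0:.3f}")      # travel move to the start of the path
    e_total = 0.0
    for x, y, e in points[1:]:
        e_total += e                             # cumulative extrusion in absolute E mode
        lines.append(f"G1 X{x:.3f} Y{y:.3f} E{e_total:.5f} F{feed_rate}")
    return "\n".join(lines)
```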

Users can input either hand-drawn sketches or photographs of objects. Image2Gcode effectively extracts visual features, interprets geometric boundaries, and synthesizes continuous extrusion paths. This approach not only accelerates prototyping and repair but also lowers the entry barrier for non-expert users. Researchers noted that the framework establishes a “direct and interpretable mapping from visual input to native toolpaths,” effectively bridging the gap between concept and execution in a single computational process.

Image2Gcode employs a pre-trained DinoV2-Small vision transformer, a self-supervised model for large-scale image representation learning, integrated with a 1D U-Net denoising architecture conditioned through multi-scale cross-attention. The DinoV2 encoder extracts hierarchical geometric information that guides the diffusion model in generating coherent G-code. Training was conducted on the Slice-100K dataset, which contains over 100,000 aligned STL–G-code pairs, allowing the model to learn the relationship between geometry and movement at the layer level.
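The conditioning idea can be sketched roughly as follows: noisy toolpath tokens act as queries that attend to frozen DinoV2 patch features inside a denoising block. The dimensions, module names, and the use of PyTorch's built-in multi-head attention are assumptions for illustration and may differ from the paper's exact architecture.

```python
# Hedged sketch of cross-attention conditioning; shapes and names are assumptions.
import torch
import torch.nn as nn

class CrossAttentionBlock(nn.Module):
    def __init__(self, path_dim=256, image_dim=384, heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(path_dim)
        self.to_kv = nn.Linear(image_dim, path_dim)   # project DinoV2-Small (384-d) patch features
        self.attn = nn.MultiheadAttention(path_dim, heads, batch_first=True)

    def forward(self, path_tokens, image_tokens):
        # path_tokens:  (B, N_points, path_dim)  noisy toolpath sequence
        # image_tokens: (B, N_patches, image_dim) frozen DinoV2 patch features
        kv = self.to_kv(image_tokens)
        attended, _ = self.attn(self.norm(path_tokens), kv, kv)
        return path_tokens + attended                 # residual connection
```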

The model was trained in PyTorch for 800 epochs with the AdamW optimizer, using a cosine noise schedule across 500 diffusion timesteps. At inference, the iterative denoising process produces valid, printer-ready G-code sequences that can be adjusted without retraining. Normalization across spatial and extrusion channels ensured stability and adaptability to various printer configurations.
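Under those reported hyperparameters, a standard epsilon-prediction DDPM training step might look roughly like the sketch below; the denoiser and conditioning tensors are placeholders, and the cosine schedule follows the common formulation from the diffusion literature rather than anything confirmed in the paper.

```python
# Hedged sketch of one DDPM training step with a cosine noise schedule over 500
# timesteps; the denoiser, data shapes, and optimizer setup are placeholders.
import math
import torch
import torch.nn.functional as F

T = 500                                          # diffusion timesteps, as reported
s = 0.008                                        # small offset used by the cosine schedule
steps = torch.arange(T + 1, dtype=torch.float64)
f = torch.cos(((steps / T) + s) / (1 + s) * math.pi / 2) ** 2
alphas_cumprod = (f / f[0]).clamp(min=1e-8)      # cumulative alpha-bar values

def training_step(denoiser, image_features, x0, optimizer):
    """x0: clean toolpath tensor (B, C, L); image_features: DinoV2 conditioning tokens."""
    B = x0.shape[0]
    t = torch.randint(0, T, (B,), device=x0.device)
    a_bar = alphas_cumprod.to(x0.device)[t].to(x0.dtype).view(B, 1, 1)
    noise = torch.randn_like(x0)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise   # forward (noising) process
    pred = denoiser(x_t, t, image_features)                # predict the injected noise
    loss = F.mse_loss(pred, noise)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```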

Initial evaluations on the Slice-100K validation set indicated that Image2Gcode produced geometrically consistent and manufacturable toolpaths. Prints generated from the model’s G-code exhibited strong interlayer bonding, accurate boundaries, and smooth surfaces comparable to those produced by traditional slicer software. The toolpaths were capable of replicating complex infill structures—such as rectilinear, honeycomb, and diagonal hatching—without the need for rule-based programming.

Real-world testing extended to photographs and hand-drawn sketches, which presented data distributions distinct from the synthetic training set. Preprocessing effectively extracted shape contours from these inputs, and Image2Gcode successfully generated coherent, printable paths. The geometrically faithful and functionally sound results validated the model’s capacity to adapt pretrained DinoV2 features for real-world applications.
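The exact preprocessing pipeline is not described in the article, but a contour-extraction pass of the kind mentioned could look like the following sketch, with OpenCV operations, thresholds, and the 224-pixel output size chosen purely for illustration.

```python
# Hypothetical contour-extraction preprocessing; thresholds and sizes are assumptions.
import cv2
import numpy as np

def extract_contour_image(path, size=224):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.GaussianBlur(img, (5, 5), 0)                  # suppress sensor or pencil noise
    edges = cv2.Canny(img, 50, 150)                         # detect shape boundaries
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    canvas = np.zeros_like(img)
    cv2.drawContours(canvas, contours, -1, 255, thickness=2)
    return cv2.resize(canvas, (size, size))                 # encoder-ready input image
```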

A quantitative analysis demonstrated a 2.4% reduction in mean travel distance compared to heuristic slicer baselines, indicating improved path efficiency without compromising print quality or mechanical strength. This suggests that the model captures geometric regularities that support optimized motion planning.
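For reference, the travel-distance metric implied here can be approximated by summing the length of non-extruding moves in a G-code file, as in this simplified sketch, which ignores arcs, relative positioning, and Z moves.

```python
# Simplified, assumed travel-distance metric: total length of non-extruding XY moves.
import math
import re

def travel_distance(gcode_text):
    x = y = 0.0
    last_e = 0.0
    travel = 0.0
    for line in gcode_text.splitlines():
        if not line.startswith(("G0", "G1")):
            continue
        coords = dict(re.findall(r"([XYE])([-\d.]+)", line))
        nx = float(coords.get("X", x))
        ny = float(coords.get("Y", y))
        e = float(coords.get("E", last_e))
        if e <= last_e:                        # no filament pushed: count as a travel move
            travel += math.hypot(nx - x, ny - y)
        x, y, last_e = nx, ny, e
    return travel
```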

However, challenges remain. The toolpath synthesis process currently operates within a 2D slice framework and does not account for interlayer dependencies or internal cavities that require coordinated 3D path planning. The authors propose expanding the model toward hierarchical 3D generation, where a coarse global model defines key cross-sections that Image2Gcode would refine layer-by-layer. This could enhance control over fabrication outcomes by incorporating parameters for infill density, mechanical performance, and material usage.

Looking ahead, integrating Image2Gcode with AI-driven manufacturing frameworks, such as LLM-3D Print—a multi-agent system for adaptive process control and defect detection—could further extend its capabilities. By linking the diffusion model to language-based interfaces, users could specify goals—like minimizing print time or improving surface finish—that the system would translate into optimized G-code generation.

By combining diffusion-based synthesis, pretrained visual perception, and parameter normalization, Image2Gcode sets a new standard for intent-aware additive manufacturing. This data-driven architecture links design, perception, and execution, diminishing reliance on manual modeling and paving the way for fully digital workflows where sketches and photographs can be seamlessly transformed into printed components.


