Aniket Roy Reveals Resource-Constrained Image Generation Techniques in PhD Research

Aniket Roy, a PhD graduate of Johns Hopkins, presents FeLMi and DiffNat, methods aimed at more efficient image generation in low-resource settings.

In a recent interview, Aniket Roy, who recently completed his PhD at Johns Hopkins University, discussed his research on generative models for computer vision tasks. Working under Bloomberg Distinguished Professor Rama Chellappa, Roy focuses on making image generation more efficient and adaptable, especially in resource-constrained environments.

Roy’s PhD research spans generative AI, multimodal learning, and few-shot learning. He has developed methods that let models learn new concepts or carry out intricate visual tasks with minimal data and computational resources. His work addresses longstanding challenges such as data scarcity and personalized image synthesis, with the aim of making advanced vision systems more practical for real-world applications.

One significant contribution from Roy is FeLMi, a few-shot learning framework that utilizes uncertainty-guided hard mixup strategies. This innovation improves robustness when working with a limited number of labeled samples. Another noteworthy project is Cap2Aug, which employs textual descriptions to guide synthetic image generation, effectively enhancing visual diversity and bridging the gap between real and generated data.
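
To make the idea of uncertainty-guided hard mixup more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not FeLMi's actual procedure: it assumes that "hard" samples are the ones whose predicted class probabilities have the highest entropy, and that each hard sample is blended with a randomly chosen partner using a Beta-distributed weight, as in standard mixup. The function names (`entropy`, `mixup`, `hard_mixup`) and the parameters `k`, `alpha`, and `seed` are hypothetical.

```python
import math
import random


def entropy(probs):
    """Shannon entropy of a class-probability vector; higher means less certain."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def mixup(x_a, y_a, x_b, y_b, lam):
    """Blend two feature vectors and their one-hot labels with weight lam."""
    x = [lam * a + (1 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


def hard_mixup(features, labels, probs, k=2, alpha=0.4, seed=0):
    """Mix the k most uncertain samples with random partners (assumed selection rule)."""
    rng = random.Random(seed)
    # Rank samples from most to least uncertain according to prediction entropy.
    ranked = sorted(range(len(features)), key=lambda i: entropy(probs[i]), reverse=True)
    hard = ranked[:k]
    mixed = []
    for i in hard:
        # Pick any other sample as the mixing partner.
        j = rng.choice([m for m in range(len(features)) if m != i])
        lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
        mixed.append(mixup(features[i], labels[i], features[j], labels[j], lam))
    return hard, mixed
```

Because the labels are blended with the same weight as the features, each mixed label still sums to one, so the synthetic samples can be used as soft targets alongside the few real labeled examples.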

In addition to these frameworks, Roy developed DiffNat, a regularization method that improves the perceptual quality of images generated by diffusion models. By applying a kurtosis-concentration loss, DiffNat encourages generated images to exhibit more natural texture statistics, a crucial element in enhancing visual realism for downstream vision tasks.
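
The following Python sketch shows one way a kurtosis-based regularizer of this kind could be computed. It rests on assumptions the article does not spell out: that the image is split into band-pass (detail) versions, that kurtosis is measured in each band, and that the penalty is the spread between the largest and smallest value. The 2x2 Haar-style filters and the names `kurtosis`, `detail_bands`, and `kurtosis_spread` are illustrative stand-ins, not DiffNat's implementation.

```python
def kurtosis(values):
    """Pearson kurtosis (fourth central moment over squared variance)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2) if m2 > 0 else 0.0


def detail_bands(img):
    """Three Haar-style detail bands computed on non-overlapping 2x2 blocks."""
    band_1, band_2, band_3 = [], [], []
    for r in range(0, len(img) - 1, 2):
        for c in range(0, len(img[0]) - 1, 2):
            a, b = img[r][c], img[r][c + 1]
            e, f = img[r + 1][c], img[r + 1][c + 1]
            band_1.append((a + b - e - f) / 2)  # difference between rows
            band_2.append((a - b + e - f) / 2)  # difference between columns
            band_3.append((a - b - e + f) / 2)  # diagonal difference
    return [band_1, band_2, band_3]


def kurtosis_spread(img):
    """Assumed penalty: gap between the largest and smallest band kurtosis."""
    values = [kurtosis(band) for band in detail_bands(img)]
    return max(values) - min(values)
```

In a training setup, a term like `kurtosis_spread` would be added to the usual diffusion objective with a small weight, so that a smaller spread across bands is rewarded. A real implementation would use differentiable tensor operations rather than Python lists.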

Roy has also worked on personalizing generative models. He introduced DuoLoRA, a framework for efficient control over content and style that allows fine-tuning without retraining the entire model. The approach extends to zero-shot settings, letting users customize objects during generation through textual input alone. His MultiLFG framework builds on this by using wavelet-domain representations to fuse multiple concepts within diffusion models accurately and without additional training.
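
For readers unfamiliar with the low-rank adaptation (LoRA) idea behind this kind of fine-tuning, the sketch below shows the generic technique in plain Python. It is not DuoLoRA itself, whose details the article does not describe. The assumption is the standard LoRA form, in which a frozen weight matrix W is adjusted by a scaled product of two small trainable matrices B and A. The helper names `matmul`, `lora_weight`, and `trainable_params` are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w, b, a, alpha, rank):
    """Effective weight W + (alpha / rank) * B @ A; W itself is left unchanged."""
    delta = matmul(b, a)
    scale = alpha / rank
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, rank):
    """Number of adapter parameters: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


# Toy example: a 2x2 frozen weight with a rank-1 adapter.
w = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0], [2.0]]
a = [[3.0, 4.0]]
adapted = lora_weight(w, b, a, alpha=1.0, rank=1)
```

For a 768 by 768 layer with rank 4, `trainable_params(768, 768, 4)` gives 6,144 adapter parameters, compared with 589,824 in the full matrix, which is why adapters of this kind avoid a full retraining run.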

Among the projects Roy found particularly engaging is DiffNat, which was published in Transactions on Machine Learning Research (TMLR) in 2025. The project targets the perceptual quality of images generated by diffusion models, a problem that has persisted despite significant advances in generative AI. Roy’s method improves the statistical consistency of generated images and also includes a condition-agnostic perceptual guidance strategy that boosts image fidelity without additional training.

The transition from academic research to practical applications is a key focus for Roy as he embarks on a new chapter at NEC Laboratories America as a Research Scientist. He aims to develop new generative model methodologies while exploring their interactions with multimodal systems. His interests lie at the intersection of generative models, vision-language-action models, and embodied AI, with the broader goal of enhancing intelligent systems that can proficiently understand and generate visual information.

Reflecting on his journey, Roy’s fascination with computer vision and machine learning was ignited during his undergraduate studies. The immediate visual impact of signal and image processing algorithms captivated him, fostering a deep curiosity about how machines can emulate human visual perception. His intellectual curiosity was further nurtured by mentorship from Dr. Kuntal Ghosh, who inspired him to approach complex problems with scientific rigor.

Visa issues prevented Roy from attending the recent AAAI Doctoral Consortium in person, but the event was still fruitful for him. A colleague presented his research poster, which sparked discussions with fellow researchers and yielded constructive feedback and potential collaborations. Roy expressed appreciation for the platform, calling it a valuable avenue for sharing early-stage ideas and engaging with the academic community.

Beyond his research endeavors, Roy finds joy in music, stand-up comedy, and travel. He considers exploring diverse cultures a refreshing escape and is also a budding poet who combines humor and storytelling through his performances. This creative outlet contrasts with his rigorous analytical research, allowing him to maintain a well-rounded perspective on life and work.

As Roy moves forward, he remains committed to advancing the capabilities of generative models and their applications, striving to contribute to the scientific understanding of intelligent systems that can interact effectively with the visual world.

Written by AiPressa Staff
