The human brain is an incredibly complex organ, and researchers have been striving to understand how it represents the world around us. One approach has been to reconstruct visual experiences from brain activity, which can provide insights into how the brain and computer vision models are connected. In a newly published research paper, a team of scientists has proposed a new method based on a diffusion model to reconstruct images from human brain activity obtained via functional magnetic resonance imaging (fMRI).
Diffusion models (DMs) are deep generative models that have achieved state-of-the-art performance in several tasks involving conditional image generation, image super resolution, image colorization, and other related tasks. Recently, researchers have introduced latent diffusion models (LDMs) that have further reduced computational costs by utilizing the latent space generated by their autoencoding component, enabling more efficient computations in the training and inference phases.
The proposed method relies on a specific LDM termed Stable Diffusion, which reduces the computational cost of DMs while preserving their high generative performance. The team characterized the inner mechanisms of the LDM by studying how its different components relate to distinct brain functions. They demonstrated that their method can reconstruct high-resolution images with high fidelity in a straightforward fashion without the need for any additional training and fine-tuning of complex deep-learning models.
Unlike previous studies of image reconstruction, this method only requires simple linear mappings from fMRI to latent representations within LDMs. Additionally, the team provided a quantitative interpretation of different LDM components from a neuroscientific perspective, demonstrating the emergence of semantic information expressed by the conditional text while maintaining the appearance of the original image.
This study proposes a promising method for reconstructing images from human brain activity and provides a new framework for understanding DMs. It offers a novel approach to investigating the inner workings of the human brain and its connection to computer vision models.
Comentarios