New Apple model could boost Apple Vision Pro graphics

A team of Apple researchers has developed a new framework that enables high-resolution 3D scene rendering with far greater efficiency. Here are the details of the new study.

A bit of context

In a new study titled Less Gaussians, Texture More: 4K Feed-Forward Textured Splatting, a group of researchers from Apple and Hong Kong University propose a new framework, aptly called LGTM.

In the study, the researchers explain that as resolution increases, existing feed-forward 3D Gaussian Splatting methods quickly become too expensive to run, making high-resolution scenes increasingly impractical.

Feed-forward 3D Gaussian Splatting, in a nutshell, is a way for an AI model to quickly turn one or a few images into a 3D scene that can be viewed from new angles.

In fact, we recently covered SHARP, an open-source model developed by Apple that uses feed-forward 3D Gaussian Splatting to create 3D views from a single 2D image, and it yields impressive results:

New paper from Apple – Sharp Monocular View Synthesis in Less than a Second

Mescheder et al. @ Apple just released a very impressive paper (congrats! 🎉🥳). You give it an image and it generates a really great looking 3d Gaussian representation. Uses depth pro. It’s really good.… pic.twitter.com/XSZCZA8iio

— Tim Davison ᯅ (@timd_ca) December 16, 2025

Feed-forward 3D Gaussian Splatting differs from per-scene optimization approaches, which build each scene individually, step by step. Although per-scene approaches usually take longer to process, they can generally produce more stable results.

So, while those older approaches can spend more time fitting a specific scene, feed-forward methods are much faster, though existing versions become difficult to scale to higher resolutions.

LGTM

To address this problem, the researchers propose the LGTM framework, which “decouples geometric complexity from rendering resolution.”

In other words, it separates the structure of a scene from its visual detail, so the system can keep the geometry simple while using textures to add high-resolution detail.

Importantly, LGTM isn’t a standalone model. Instead, it builds on existing feed-forward methods, enhancing how they represent detail by layering texture predictions on top of their geometry.

They did this in two steps:

  1. They had the model learn the scene’s structure from low-resolution images, then checked the output against high-resolution ground truth. This forced the model to learn how to produce geometry that still looked correct, even when rendered at 2K or 4K, avoiding gaps or artifacts.
  2. They introduced a second network focused on appearance. It takes high-resolution images and learns detailed textures for each geometric element, effectively layering fine visual detail on top of the simpler geometry from the first model.
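To make the idea of layering texture on top of simple geometry more concrete, here is a toy sketch of a single textured primitive. It is not the paper's formulation: the function names, the nearest-texel lookup, and the isotropic falloff are all assumptions chosen for illustration. The point is only that one geometric element can show several colors when its color comes from a small texture instead of a single value.

```python
import math

def gaussian_weight(u, v):
    """Opacity falloff of an isotropic 2D Gaussian in the primitive's local coordinates."""
    return math.exp(-0.5 * (u * u + v * v))

def sample_texture(texture, u, v, extent=2.0):
    """Nearest-texel lookup; local (u, v) in [-extent, extent] maps onto the texel grid.

    `extent` is a hypothetical parameter: how far from the center the texture reaches.
    """
    rows, cols = len(texture), len(texture[0])
    # Normalize to [0, 1) and clamp so points at the edge still hit a valid texel.
    s = min(max((u + extent) / (2 * extent), 0.0), 1.0 - 1e-9)
    t = min(max((v + extent) / (2 * extent), 0.0), 1.0 - 1e-9)
    return texture[int(t * rows)][int(s * cols)]

def shade(texture, u, v):
    """Return (color, opacity): color from the texture, opacity from the Gaussian."""
    return sample_texture(texture, u, v), gaussian_weight(u, v)

# A 2x2 grayscale checkerboard attached to one primitive.
checker = [[0.0, 1.0],
           [1.0, 0.0]]

left = shade(checker, -1.0, -1.0)   # falls in the dark texel
right = shade(checker, 1.0, -1.0)   # falls in the bright texel, same opacity
print(left, right)
```

Both sample points sit at the same distance from the center, so they get the same opacity but different colors. A primitive with a single color would need to be split into several smaller ones to show the same pattern.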

The result is a framework that can upgrade existing systems to generate detailed 4K scenes without the quadratic explosion in compute needs that has made earlier feed-forward methods impractical at higher resolutions.
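A rough back-of-the-envelope sketch shows why resolution is so costly. It assumes a pixel-aligned design, where the model predicts one Gaussian per input pixel; that is a common pattern in feed-forward methods, but it is an assumption here, not a detail taken from the study. The square resolutions and the function name are likewise illustrative.

```python
def pixel_aligned_gaussians(width, height, views=1):
    """Number of Gaussians if the model predicts one per input pixel (assumption)."""
    return width * height * views

# Illustrative square resolutions; real inputs are not necessarily square.
resolutions = {
    "512": (512, 512),
    "1K": (1024, 1024),
    "2K": (2048, 2048),
    "4K": (4096, 4096),
}

base = pixel_aligned_gaussians(512, 512)
for name, (w, h) in resolutions.items():
    n = pixel_aligned_gaussians(w, h)
    print(f"{name}: {n:,} Gaussians ({n // base}x the 512 baseline)")
```

Under that assumption, every doubling of the image side quadruples the number of primitives, so going from 512 to 4K multiplies the count by 64. Keeping the Gaussian count fixed and moving detail into textures avoids that growth in geometry.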

What this could mean for products such as the Apple Vision Pro

Currently, Apple Vision Pro has two displays with about 23 million pixels in total, meaning each eye gets more pixels than a 4K TV.
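The comparison is easy to check. This small calculation assumes the roughly 23 million pixels are split evenly between the two displays and takes "4K TV" to mean the 3840 x 2160 UHD format; the constant names are made up for the example.

```python
VISION_PRO_TOTAL_PIXELS = 23_000_000   # approximate total quoted above
UHD_4K_PIXELS = 3840 * 2160            # consumer "4K" UHD resolution

per_eye = VISION_PRO_TOTAL_PIXELS // 2  # assumes an even split between the eyes
ratio = per_eye / UHD_4K_PIXELS

print(f"Per eye: {per_eye:,} pixels")
print(f"4K UHD:  {UHD_4K_PIXELS:,} pixels")
print(f"Ratio:   {ratio:.2f}x")
```

That works out to about 11.5 million pixels per eye against roughly 8.3 million for a 4K TV, or close to 1.4 times as many.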

As the study shows, feed-forward 3D Gaussian Splatting struggles at those resolutions. The displays can handle it, but generating the scene quickly and accurately becomes a computational bottleneck.

LGTM could help address that on the Apple Vision Pro, which in turn could offer smoother performance and sharper visuals in situations where feed-forward 3D Gaussian Splatting is required.

In practice, this could translate into more opportunities to enjoy detailed, immersive environments or more realistic passthrough experiences, while keeping processing demand in check.

To see LGTM in action, check out the project page. It showcases methods such as NoPoSplat, DepthSplat, and Flash3D, with and without LGTM, across both single-view and two-view inputs.

Browsing through the sample videos and images, it’s easy to see how LGTM helps produce results that are much richer in detail (particularly in textures and text) and closer to the ground truth images (labeled as GT in the image samples).
