The dual 12-megapixel cameras on the iPhone XS models have specs virtually unchanged from the first iPhone X: two vertically aligned 12-megapixel cameras (an f1.8 wide-angle lens and an f2.4 2x telephoto lens) and a 7-megapixel forward-facing camera as part of the TrueDepth module and its passel of sensor technologies. However, all of it is backed by new image sensors, new lenses, and a brand-new image signal processor that comes as part of the new A12 Bionic chipset.
Like me, Souza marveled over the updated portrait mode capabilities that let you adjust the background blur effect after you take the photo, with either the front camera or the rear dual-camera system. He told me he didn’t think consumers would take notice of the f-stop numbers on the interface.
Later, on the phone, Souza said, “I think they’ll use it and not really understand it.” But he added that consumers will understand the results and see how “when they go in one direction, everything other than what’s in focus gets less in focus and the other way things get more in focus.”
“I love this decision by the team to honor art of photography and the work that went into characterizing how great lenses work,” said Apple senior vice president of worldwide marketing Phil Schiller when I asked about the decision to include f-stop numbers in the depth editor interface.
Schiller, along with Graham Townsend, Apple’s senior director of camera hardware, and Sebastien Marineau-Mes, Apple’s vice president of software, sat down late on the afternoon of iPhone XS launch day to peel away the veil of secrecy surrounding at least one part of Apple’s iPhone technology matrix: how they design and develop their photo and video capture hardware and software.
The numbers consumers will see on these phones and through the photo editing app are not just an old-school nod to how f-stops and aperture control work on DSLR cameras. Schiller told me Apple engineered an exact model of how a physical lens at those aperture numbers would behave.
In a physical camera, a higher-numbered f-stop represents a smaller aperture opening and a deeper depth of field. A wide-open setting of f1.4 keeps the front of a subject’s face in focus while the background goes fuzzy; a setting of f16, by contrast, puts almost everything, front and back, in focus.
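The relationship described above is textbook thin-lens optics, and it can be checked with a few lines of code. The sketch below uses the standard blur-circle (circle of confusion) formula, not Apple’s unpublished model; the 85 mm focal length and the distances are arbitrary examples:

```python
# Diameter of the blur circle for an out-of-focus point, using the
# standard thin-lens formula: c = f^2 * |s2 - s1| / (N * s2 * (s1 - f)).
# This is generic optics, not Apple's lens model.

def blur_circle_mm(f_mm, f_number, focus_m, point_m):
    """Blur-circle diameter (mm) for a point at point_m meters
    when the lens is focused at focus_m meters."""
    s1 = focus_m * 1000.0  # focus distance, mm
    s2 = point_m * 1000.0  # out-of-focus point, mm
    return (f_mm ** 2) * abs(s2 - s1) / (f_number * s2 * (s1 - f_mm))

# An 85 mm lens focused at 2 m, background point at 5 m:
wide_open = blur_circle_mm(85, 1.4, 2.0, 5.0)      # f1.4: large blur circle
stopped_down = blur_circle_mm(85, 16.0, 2.0, 5.0)  # f16: tiny blur circle
```

Because the blur diameter scales as 1/N, stopping down from f1.4 to f16 shrinks the background blur by a factor of about 11, which is exactly the soft-to-sharp transition the depth slider mimics.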
The first expression of this kind of photography on smartphones appeared in 2016 with the iPhone 7 Plus and its portrait mode, which used the two images grabbed by its dual-lens system (and some algorithmic magic) to create a background-blur, or bokeh, effect. This alone was a radical innovation for amateur iPhone photographers, transforming mundane portraits into studio-quality images. Even so, like virtually all other smartphone-based portrait-mode photography that followed, it was a two-plane version of the depth effect: the images held the foreground object in focus and blurred the back plane.
Samsung was the first to introduce adjustable blur that could be used during photography and in post-processing, but Samsung’s Live Focus still sees the image as two planes.
Like a lens
What’s clear from using the new iPhone XS and XS Max is that the depth slider captures almost unlimited planes between the foreground and background. Apple calls this “lens modeling.”
“We turned the model of a lens into math and apply it to that image,” explained Marineau-Mes. “No one else is doing what this does. Others just do ‘blur background.’” And the post-processing works equally well whether you’re taking a selfie with the iPhone XS’s single 7-megapixel front-facing camera, a portrait with the dual-lens system on the iPhone XS or XS Max, or a photo with the single 12-megapixel rear camera on the iPhone XR.
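Apple hasn’t published what “the model of a lens into math” looks like, but the general idea of depth-driven synthetic blur can be sketched in toy form: give every pixel a blur radius proportional to its distance from the focal plane, scaled down as the simulated f-number rises. Everything below (the `lens_blur_1d` helper, the constant `k`, the ten-sample 1-D “image”) is invented for illustration, not Apple’s algorithm:

```python
import numpy as np

# Toy 1-D depth-driven blur: each sample is box-blurred with a radius
# proportional to how far its depth sits from the focal plane, divided
# by the simulated f-number (bigger f-number -> less blur).

def lens_blur_1d(signal, depth, focus_depth, f_number, k=8.0):
    out = np.empty(len(signal), dtype=float)
    for i in range(len(signal)):
        r = int(k * abs(depth[i] - focus_depth) / f_number)  # blur radius
        lo, hi = max(0, i - r), min(len(signal), i + r + 1)
        out[i] = signal[lo:hi].mean()  # simple box blur of that radius
    return out

signal = np.array([0., 0., 10., 10., 0., 0., 5., 5., 0., 0.])
depth  = np.array([1., 1., 1.,  1.,  1., 1., 4., 4., 4., 4.])  # meters
# Focus on the 1 m plane at a simulated f1.4: foreground stays sharp,
# the 4 m "background" samples smear heavily.
f14 = lens_blur_1d(signal, depth, focus_depth=1.0, f_number=1.4)
f16 = lens_blur_1d(signal, depth, focus_depth=1.0, f_number=16.0)
```

Sliding the depth editor would then amount to re-running the model with a different `f_number`, which is why the effect can span “almost unlimited planes” rather than a binary foreground/background split.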
Put simply, Apple is employing three distinctly different depth-information-capturing technologies to drive the same depth editing result. Townsend described it to me as using three different sources of information: the dot-based depth sensor in the TrueDepth module, the dual-lens stereo imagery of the 12-megapixel cameras on the back of both the XS and XS Max, and an almost entirely algorithmic solution on the XR.
Apple’s depth editing is all the more remarkable because it lets you adjust the aperture in post without touching the exposure. In traditional and DSLR photography, every adjustment of the f-stop has to be met with a correlative adjustment of the exposure settings: a smaller aperture lets in less light, while a wider one blows out the exposure unless you increase the shutter speed.
However, sliding the depth editor back and forth on an iPhone XS image adjusts that blur exponentially while somehow maintaining your original exposure. It’s a heavy lift computationally, but Marineau-Mes said they do it all in real-time.
Seeking professional quality
Souza, who had been test-driving the iPhone XS at Washington, D.C.’s Natural History Museum, described the depth edit feature to me as “pretty, pretty nice.”
When he tried portrait mode on the life-size early-human heads in the dioramas at the museum — even through the display glass — the results were impressive. “To be able to change the f-stop and get your pinpoint focus… I was using the stage lighting [one of the settings in portrait mode] to darken the background, yet the eyes are still sharp as a tack,” he said.
“I would compare it to… I use Canon, a Canon 85 mm lens. I use it as a widest aperture. That’s the effect you’re getting,” Souza added.
An admitted Apple and iPhone enthusiast (other than early flip phones, the 63-year-old photographer said he has never owned a different brand of phone), Souza said that while he almost invariably used a DSLR when photographing Obama (“the images were going in the National Archive,” he explained), he always had the iPhone (models 5 through 7) to shoot more casual photos. He noted, “I have hundreds of snow pictures, pictures of [the] Rose Garden, nobody in it, [taken] with the iPhone.”
Obviously, professional photographers know the limits of smartphone photography. Even with multiple lenses, telephoto capability, and post-processing, it’s hard to replace what you can do with full-frame 35 mm sensors and a 55 mm or larger lens.
But that hasn’t stopped Apple from trying. Apple’s multi-pronged effort to put pro-level photographic capabilities in the hands of millions of iPhone users starts with addressing image capture (both photo and video) as a system.
“We’re not like a hardware company; we’re not like a software company. We’re a system company,” Townsend said, emphasizing what’s become the hallmark of Apple’s success across a wide array of consumer electronics categories: the ability to control the full stack, from design and development through hardware and software and virtually everything in between.
“We have the privilege of being able to design everything from the photon first entering the lens right through to the format of the captured file. Only Apple is able to customize and match together,” Townsend added.
Part of the reason Apple does this, especially with a system as intricate as imaging, is that components that work well individually do not always work well together. Apple’s penchant for bespoke components from third-party partners is well known, but it goes further than that.
To manufacture something like the dual-camera system, Apple has to ensure that the two lenses are at a precise point and tilt.
“We have tight specs,” Schiller said, smiling.
If a partner says it can’t reproduce Apple’s spec, Apple brings in manufacturing-process experts dedicated to camera production. They work with the manufacturers, Apple has said, to re-engineer custom versions of their equipment.
“One of the really big things we aim for is the first phone, the 1 millionth phone, and the 10 millionth phone we want that experience to be as close as we can humanly manage, and we put a lot of effort, and it’s not something we talk a lot about… but it’s really important to us that there’s no big variation in performance between any phone anywhere [in] the world,” Townsend said.
Internally, Apple’s hardware and software teams have been meeting regularly to define the imaging features they’ll deliver through fresh silicon like the A12 Bionic (which has taken three years to develop) and on new hardware like the iPhone XS.
“The architectural decision to deliver HDR [High Dynamic Range] has to come at the beginning of conversation of chip architecture,” Townsend noted.
Even so, different components can arrive at different times. So, there’s the agreed-upon architecture plan, and then there are the on-the-fly adjustments.
If the A12’s development started three years ago and two or three generations of iPhones and camera systems are delivered in that time, adjustments have to be made to match new lenses and image sensors with the image signal processor (ISP). Fortunately, Marineau-Mes says his team can still program the ISP, which is part of the A12’s chipset, to match the lenses and sensors found on the iPhone XS, XS Max, and XR.
And there can be tradeoffs. “We’ll do this in the lens, it’s going to give us a better image, but we have to do this in the ISP,” said Sebastien Marineau-Mes.
That kind of cross-department coordination proved crucial in the development of Smart HDR, the second jewel in Apple’s image processing crown.
In traditional HDR, a pair of images at different exposures is used to capture detail in the dark and over-bright areas of a photo. Depending on the disparity between the two, even the best HDR images are likely to lose some detail, or maybe a lot, or the final combined image will show significant noise, especially in action shots. Additionally, HDR can introduce a bit of shutter lag, which makes motion photography almost impossible.
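The classic two-exposure merge can be sketched in a few lines. The naive weighting scheme below is mine, not Apple’s: each pixel leans toward whichever exposure recorded it closest to mid-gray, so shadow detail comes from the longer exposure and highlight detail from the shorter one.

```python
import numpy as np

# Naive two-exposure HDR merge (illustrative, not Smart HDR): weight
# each pixel by how close its value sits to mid-gray (0.5), then blend.
# Clipped values (0.0 or 1.0) get near-zero weight and are ignored.

def merge_pair(short_exp, long_exp):
    def weight(img):
        return 1.0 - np.abs(img - 0.5) * 2.0 + 1e-6  # peak at mid-gray
    ws, wl = weight(short_exp), weight(long_exp)
    return (ws * short_exp + wl * long_exp) / (ws + wl)

# Pixel values in [0, 1]: shadows crushed in the short frame,
# highlights blown out in the long frame.
short_exp = np.array([0.0, 0.0, 0.4, 0.6])
long_exp  = np.array([0.2, 0.3, 1.0, 1.0])
merged = merge_pair(short_exp, long_exp)
```

In the merged result the first two pixels recover the long exposure’s shadow detail (0.2 and 0.3) while the last two keep the short exposure’s unclipped highlights (0.4 and 0.6), which is exactly the trade the ghosting and noise problems above make hard to pull off on moving subjects.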
In my tests with the iPhone XS and XS Max, Smart HDR produced high-quality images in challenging situations that had stumped even the year-old iPhone X. There are, as Schiller said during the keynote speech, trillions of operations occurring with each photo to make this possible, but it starts with the ISP and its high readout capabilities.
As Marineau-Mes explained to me, the camera starts by capturing two image frames at different exposures in one-thirtieth of a second. That information is passed on to the software pipe and the A12’s neural engine, which starts analyzing the images.
Smart HDR doesn’t stop there. “Raw material is captured at 30 fps (that’s a pair every thirtieth of a second), and the fusing happens in a few hundred milliseconds,” Marineau-Mes pointed out.
Inside the A12 chip is the neural engine that analyzes frames not just for exposure but for discrete image elements. It’s identifying facial features and looking for motion. If the system detects motion, it looks for the frame with the sharpest image of the motion and adds it to the image. Similarly, an image with red-eye is not just fixed but replaced with a frame where the eye isn’t red or with the reference eye color from the frame without red-eye.
Earlier in the day, Apple had shown me a photo of a dreadlocked man standing in a lake. He’s dramatically backlit, though I could easily see his torso. His hair is mid head-toss, so his dreads flare out and water is captured flying off into the air. If I were shooting the image with a DSLR, I’d set the shutter speed to at least 1/500 of a second but keep the aperture somewhat closed, maybe f11, to try to maintain some of the image depth. I’d also have to raise the ISO level to pull in enough light, which would probably introduce a lot of grain.
This perfectly frozen and exposed photo, however, was taken with an iPhone XS.
“We set a reference frame and fuse in information from multiple frames,” Marineau-Mes said. The image I saw was a composite of multiple frames. Some of those frames contained pieces of what would become the final image, like the perfectly sharp hair and water.
“As you stack the frames, if you have the same image, you have lower and lower noise and better and better detail,” he explained.
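The noise claim is simple statistics: averaging N frames of independent sensor noise cuts its standard deviation by roughly the square root of N. A synthetic demonstration (the flat “scene” and noise level are made up; this shows the principle, not Apple’s fusion pipeline):

```python
import numpy as np

# Averaging N independently noisy frames of the same scene reduces
# random noise by ~1/sqrt(N): 16 frames -> roughly 4x less noise.

rng = np.random.default_rng(0)
true_scene = np.full(10_000, 0.5)  # a flat mid-gray "image"

def noisy_frame():
    return true_scene + rng.normal(0.0, 0.1, true_scene.shape)

one_frame = noisy_frame()
stacked = np.mean([noisy_frame() for _ in range(16)], axis=0)

noise_single = np.std(one_frame - true_scene)   # ~0.10
noise_stacked = np.std(stacked - true_scene)    # ~0.025
```

The catch, and the reason the neural engine matters, is that the frames must be aligned and the scene static per pixel; fusing a moving subject without registration would smear detail instead of recovering it.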
It takes an incredibly powerful ISP and neural engine backed by an equally powerful GPU and CPU to do all this processing, Marineau-Mes said.
All that heavy lifting starts before you even press the iPhone camera app’s virtual shutter button. Schiller said that what users see on their iPhone XS, XS Max, and XR screens is not dramatically different from the final image.
“You need for that to happen,” said Schiller. “It needs to feel real-time for the user.”
When I asked what all this gathered information meant for file size, they told me that Apple’s HEIF format results in higher quality but smaller file sizes.
Sometimes Apple’s engineers arrive at better image technology almost by accident. Last year, Apple introduced flicker detection, which identifies light-source refresh frequencies and tries to reduce flicker in still and video imagery. While incandescent and fluorescent lights have consistent refresh frequencies, which makes it easy to figure out exposure times, modern energy-saving LEDs operate at all different frequencies, especially the ones that change hue, Townsend explained. This year, Apple engineers widened the range of recognized frequencies to further cut down on flicker. In doing so, they realized they could now also immediately identify when the sun is in the picture (“The sun doesn’t flicker,” Townsend noted) and instantly adjust the white balance for the natural light.
“Our engineers were kind of working at that, and they spotted this extra information. So, this is the bonus that we get from the flicker detect,” Townsend said.
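Apple hasn’t published how its flicker detector works, but the general technique, sampling scene brightness over time and hunting for a dominant AC frequency, is easy to sketch. The `dominant_flicker_hz` helper, the sample rate, and the threshold below are all invented for illustration:

```python
import numpy as np

# Generic flicker detection via FFT (not Apple's implementation):
# mains-powered lights flicker at twice the line frequency (100 or
# 120 Hz); sunlight has no AC component at all.

def dominant_flicker_hz(brightness, sample_rate_hz, threshold=0.05):
    spectrum = np.abs(np.fft.rfft(brightness - brightness.mean()))
    freqs = np.fft.rfftfreq(len(brightness), d=1.0 / sample_rate_hz)
    peak = int(spectrum.argmax())
    if spectrum[peak] / len(brightness) < threshold:
        return None  # no significant flicker -- e.g., the sun
    return float(freqs[peak])

rate = 2000  # samples per second, one second of data
t = np.arange(0, 1.0, 1.0 / rate)
led_light = 0.5 + 0.2 * np.sin(2 * np.pi * 120 * t)  # 120 Hz flicker
sunlight = np.full_like(t, 0.5)                      # steady illumination
```

Running the detector on `led_light` recovers 120 Hz, while `sunlight` returns nothing, which is the “bonus” signal Townsend describes: an empty spectrum is itself evidence that the scene is lit by the sun.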
Video and new frontiers
All of these image capture gymnastics extend to video as well, where the same frame-to-frame analysis is happening in real-time to produce video with more details in high and low light.
It occurred to me that with all that intelligence, Apple could probably apply the depth editor to video, adding the professional polish of a defocused background to everyday video shoots, but when I asked Schiller about it, he would only say that Apple does not comment on future plans.
Video or stills, the end result is a new high-water mark for Apple and, perhaps, smartphone camera photography in general. The company gets emails, Townsend told me, where people say, “I can’t believe I took this picture.” It’s an indication that Apple’s achieving a larger goal.
“We make cameras for real people in real situations,” Townsend said. “They’re not on tripods; they’re not in the labs. They want their picture to be a beautiful picture without thinking very much about it.”
Back on the phone with Souza, whose book on the photographic contrast between the Obama and Trump administrations, Shade, will be on sale starting in October, he told me he’s long been impressed with the iPhone’s closeup shot capabilities. “I’m continually amazed how close you can get to your subject with an iPhone. The minimum focusing distance is better on an iPhone [than] a DSLR unless you have a macro [lens] with you.”
That morning, Souza had printed out a pair of his iPhone XS shots. He said he thought they looked like they’d been captured with a DSLR.
I emailed Souza one last question: If he were asked by a future president to photograph the administration, would he use an iPhone for official photos or still rely on his DSLR?
“DSLR,” he wrote back, perhaps crushing a few Apple dreams, “though [I] would still use iPhone for some Instagram posts as I did during the Obama administration.”
That calculation, switching between the always-in-your-pocket iPhone and an expensive DSLR camera, surely is one that Apple must hope Souza and other photographers won’t always have to make.