Applications of Multi-Bucket Sensors to
Computational Photography

Gordon Wan Mark Horowitz Marc Levoy
Stanford Computer Graphics Laboratory Technical Report 2012-2

HDR photography using 4 buckets per pixel. (a) Each of the three colored bars depicts a single exposure. For each of the time slices within the exposure, color denotes which bucket the electrons are stored in at the conclusion of that slice. In non-interleaved HDR (top bar), four images are captured sequentially. In the original time-interleaved HDR (middle bar), four images are captured in a time-interleaved manner. For these two protocols, the four images are read out directly from the buckets at the conclusion of the exposure. In contrast, in photon-efficient HDR (bottom bar), four non-destructive readouts are performed at the conclusion of the exposure: bucket 4 alone, bucket 4 + bucket 3, and so on, thereby producing images with exposure times of T, 2T, 4T, and 8T. These additions are performed at all pixels in parallel in the analog domain. The images can then be combined digitally off-chip to produce effective exposure times at each pixel ranging from T to 8T. The use of non-destructive readout and analog addition allows us to achieve a total capture time of only 8T, by contrast with the first two protocols, which are based on a sequence of exposures of lengths T, 2T, 4T, and 8T and therefore require a total capture time of 15T, worsening motion blur. This is one advantage of our approach.

(b) Still life with moving metronome (at center). The images labeled T, 2T, 4T, and 8T are the four captured images, with crops shown at bottom. At center is the synthesized HDR photograph. The four windows separated by black lines in the images correspond to pixels with slightly different designs. Since capture of the four images is finely interleaved in time, there are no motion differences between them, and no alignment is necessary before HDR synthesis. This is a second advantage of our approach, which can be extended to the capture of HDR video. [Please watch the video.]
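To make the readout arithmetic in (a) concrete, here is a minimal Python sketch of the photon-efficient protocol under a simple Poisson photon model. The bucket integration times follow the caption (bucket 4 holds T; the running sums yield exposures T, 2T, 4T, 8T), but the array names, flux values, and noise model are our own assumptions for illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1.0                                        # base exposure time (arbitrary units)
flux = rng.uniform(10.0, 500.0, size=(4, 4))   # hypothetical per-pixel photon rates

# Integration time accumulated in each bucket via fine time slices:
# bucket 4 holds T, bucket 3 holds T, bucket 2 holds 2T, bucket 1 holds 4T,
# so total capture time is T + T + 2T + 4T = 8T (vs. 15T for sequential capture).
bucket_times = [(4, T), (3, T), (2, 2 * T), (1, 4 * T)]
buckets = {b: rng.poisson(flux * t).astype(float) for b, t in bucket_times}

# Four non-destructive readouts: bucket 4 alone, then running analog sums,
# yielding images with effective exposure times T, 2T, 4T, and 8T.
readouts, running = [], np.zeros_like(flux)
for b, _ in bucket_times:
    running = running + buckets[b]
    readouts.append(running.copy())

for exp_t, img in zip((1, 2, 4, 8), readouts):
    print(f"exposure {exp_t}T: mean signal {img.mean():.1f} e-")
```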
Abstract
Many computational photography techniques take the form, "Capture a burst of images varying camera setting X (exposure, gain, focus, lighting), then align and combine them to produce a single photograph exhibiting better Y (dynamic range, signal-to-noise ratio, depth of field)." Unfortunately, these techniques may fail on moving scenes because the images are captured sequentially: objects are in different positions in each image, and robust local alignment is difficult to achieve. To overcome this limitation, we propose using multi-bucket sensors, which allow the images to be captured in a time-slice-interleaved fashion. This interleaving produces images with nearly identical positions for moving objects, making alignment unnecessary. To test our proposal, we have designed and fabricated a 4-bucket, VGA-resolution CMOS image sensor, and we have applied it to high dynamic range (HDR) photography. Our sensor permits four different exposures to be captured at once with no motion difference between the exposures. Also, since our protocol employs non-destructive analog addition of time slices, it requires less total capture time than capturing a burst of images, thereby reducing total motion blur. Finally, we apply our multi-bucket sensor to several other computational photography applications, including flash/no-flash, multi-flash, and flash matting.
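Off-chip, the four cumulative readouts can be merged into a single HDR image. The sketch below shows one common merge rule (our assumption for illustration, not necessarily the paper's exact combination): per pixel, keep the longest unsaturated exposure and normalize by its exposure time.

```python
import numpy as np

def merge_hdr(readouts, exposures, full_well=4095.0):
    """Per pixel, keep the longest unsaturated exposure, normalized by its
    exposure time.  One common merge rule; the paper's exact combination
    may differ."""
    radiance = readouts[0].astype(float) / exposures[0]  # shortest exposure
    for img, t in zip(readouts[1:], exposures[1:]):
        ok = img < full_well            # pixel still unsaturated here
        radiance[ok] = img[ok] / t      # longer exposure -> better SNR
    return radiance

# Tiny demo with synthetic cumulative readouts (exposure times T, 2T, 4T, 8T):
truth = np.array([[5.0, 50.0], [500.0, 5000.0]])     # scene radiance
exposures = [1, 2, 4, 8]
readouts = [np.clip(truth * t, 0, 4095.0) for t in exposures]
print(merge_hdr(readouts, exposures))  # recovers truth, except clipped pixels
```

Because the readouts are captured in one finely interleaved exposure, the four images agree on the positions of moving objects, so this merge needs no alignment or deghosting step.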
Paper
Adobe Acrobat PDF (3 MB)

Downloadable Video
AVI Video (19 MB)


Citation

Gordon Wan, Mark Horowitz, and Marc Levoy. Applications of Multi-Bucket Sensors to Computational Photography. Stanford Computer Graphics Laboratory Technical Report 2012-2, November 2012.