Compressive Light Field Photography Using Overcomplete Dictionaries and Optimized Projections

Recently we have shown that light-field photography images can be interpreted as limited-angle cone-beam tomography acquisitions. Here, we apply this property to develop a direct-space tomographic refocusing formulation that allows one to refocus both unfocused and focused light-field images. We cast the reconstruction as a convex optimization problem, thus enabling the use of various regularization terms to help suppress artifacts, and of a wide class of existing advanced tomographic algorithms. This formulation also supports super-resolved reconstructions and the correction of the optical system's limited frequency response (point spread function). We validate this method with numerical and real-world examples.

1. Introduction

Portable plenoptic cameras offer unique features that do not exist in traditional cameras, such as the possibility to refocus on different planes of the scene after acquisition [1, 2], and the ability to extract depth information about the photographed objects [3]. These features come at the cost of a greatly reduced spatial resolution. As a result, researchers and developers have invested considerable effort in developing custom and sometimes very specialized algorithms for computational photography techniques, in particular for super-resolution applications.

Higher light-field refocusing quality and resolution are prerequisites for important applications that can benefit from the ability to refocus on multiple distances after acquisition, depth estimation and volume rendering, all from a single light-field. Examples of these applications can be found in fields ranging from industrial quality control and automation processes to autonomous driving. As light-field imaging is increasingly adopted by these applications, greater importance will be given to the quality and richness of details of the refocused images.

Many approaches attempt to obtain spatially super-resolved light-fields from lower resolution acquisitions [4–9]. The resulting super-resolved light-fields contain finely detailed structures that can deliver higher quality depth estimation (distance from the camera) of the objects in the scene, and higher quality image reconstruction with traditional refocusing algorithms. These approaches need strong a priori information to work properly, like a map of the distances of all the objects in the scene from the camera, or knowledge of their composing structures, which limits their applicability.

A different approach is taken in [10] for a compound-eye imaging system, and in [11] for the light-field microscope, where the authors use the redundancy of the acquired data to super-resolve the refocused images, instead of creating new information in the data. In both [10] and [11], the image formation process is modelled as a linear process, and the high-resolution image reconstruction is formulated as the solution to a minimization problem. The limited frequency response of the optical system, modelled as a point spread function (PSF), is taken into account in the image formation model.

As specified in [11], the traditional light-field acquisition scheme, called unfocused light-field (ULF), does not allow super-resolving images at the acquisition focal distance. In [12] the authors propose a deconvolution-like solution to the problem, while in [13, 14] the authors propose a new acquisition scheme, known as focused light-field (FLF). Traditionally, the ULF and FLF have been treated as two different acquisition setups that needed different algorithms for similar tasks like refocusing. In the ULF case, notable examples of refocusing approaches are: the integration method [15], the Fourier-space method based on the Fourier slice theorem [16], and a volumetric filtered version of the latter [17]. In the FLF case, we have mainly the patch method [14], and a more sophisticated super-resolution method that explicitly takes into account the positioning of the pixels on the sensor [18]. Recently, in [19], the authors have demonstrated that the algorithms developed for one camera setup can be applied to the other setup. However, the resulting image resolutions and quality depend on the chosen algorithm.

In a recent paper, we described a framework in which a light-field photography image can be interpreted as a limited-angle cone-beam tomography acquisition [20]. The framework presents an in-depth analysis of how different parametrizations of the ray-tracing model can be formulated so that the well-known back-projection methods for cone-beam tomography can be used to solve the refocusing problem.

In this paper, we propose to replace the back-projection approach introduced earlier by an algebraic model of the cone-beam tomography problem. This new formulation exposes a wide range of more sophisticated tomographic reconstruction techniques, introducing a number of powerful modelling and computational formulations for light-field reconstruction. Specifically:

  • We extend the formulation by unifying the handling of unfocused light-field (ULF) and focused light-field (FLF) images.

  • We formulate the refocusing reconstruction as a convex optimization problem, introduce additional regularization terms to cope with artifacts, and show how advanced iterative algorithms can be used to solve the problem.

  • We show how super-resolved refocused images can be computed.

  • We correct for the limited frequency response of the optical system (point spread function).

Each of these extensions contributes to a measurable improvement of the refocusing reconstruction quality.

1.1. Detailed content description

The introduction of tomographic projection operations naturally allows expressing the image formation process as a linear system, similarly to [10, 11], where the forward operator is composed, for both geometries, of: a tomographic forward-projection operator; a down-sampling operator in the case of super-resolved images; and a blurring operator. The blurring operator takes into account the effects of Fraunhofer diffraction on the lenslets [10, 12] and the Nyquist limitation on the refocusing resolution [16].

We propose the use of regularizations like Total Variation minimization (TV-min) or the stationary wavelet transform (SWT) [21, 22], instead of the traditional smoothness constraints [23]. These advanced constraints allow better suppression of artifacts related to the super-resolution reconstruction, while enforcing prior knowledge in the reconstruction, like sharpness of the in-focus objects, or its compressibility in the wavelet domain. We also demonstrate the use of both established and recent customizable algorithms, which are widely used in the tomography community and include the Simultaneous Iterative Reconstruction Technique (SIRT) [24, 25], and Chambolle-Pock (CP) algorithm instances [26, 27]. The use of consolidated algorithms allows re-using existing high-performance solutions, while the use of CP algorithm instances provides tailored algorithms for the most popular regularization functionals in a straightforward manner.

The rest of the article is structured as follows. In section 2 we introduce the used formalism, explain the geometric adjustments needed for the ULF and FLF cases, construct the minimization problem, and present the used reconstruction techniques. In section 3 we present the results of the reconstruction methods detailed in section 2 on simulated and real-world examples, and compare them against a representative selection of existing methods. In section 4 we conclude the paper.

2. Methods

In this section we introduce and discuss the mathematical tools used in the paper. In section 2.2 we establish the relationship between the projection geometry, the optical system's point spread function (PSF) and the image formation. Then in section 2.3 we express refocusing as a convex optimization problem. Finally, in section 2.4 we introduce the regularization terms used in this article.

2.1. Conventions

We schematically represent the two typical plenoptic acquisition setups in Fig. 1. They are composed of a main lens (ML), a micro-lens array (MLA), and a sensor. The camera array setup is not considered explicitly here, but it is equivalent to the setup in Fig. 1(a), with the only difference being that it lacks the ML [20].

We adopt the conventions and representations from [20]. In both Figs. 1(a) and 1(b) the ML divides the space into the object space, which is the photographed scene (left side), and the image space, which is the inside of the camera (right side).

figure: Fig. 1

Fig. 1 Comparison of the two light-field setups, and their naming conventions: (a) unfocused light-field (ULF); (b) focused light-field (FLF).


The focal lengths of the ML and of the lenslets in the MLA are called f_1 and f_2 respectively, and the distance between the natural focal plane in the object space and the ML is called z_0. In the ULF case (Fig. 1(a)), the MLA-to-sensor distance is called z_2 and is usually set to f_2, while the ML-to-MLA distance is z_1. Given the fact that f_2 << z_1, the ML is considered in this case to be at the optical infinity of the lenslets in the MLA. In the FLF case, we can have two configurations: "Keplerian" and "Galilean" [28]. For simplicity, throughout this paper we only work with a Keplerian setup (represented in Fig. 1(b)), but our approach works equally well for the Galilean setup. In Keplerian mode, the image from the ML is formed in front of the MLA, at the distance z_1, and the ML-to-MLA distance is equal to z_1 + a, with a > 0. In Galilean mode, the image from the ML is virtual and is formed behind the MLA, which is again placed at a distance of z_1 + a from the ML, but with a < 0. For an FLF in Keplerian configuration, the MLA-to-sensor distance is called b and satisfies the thin-lens equation: 1/f_2 = 1/a + 1/b.

In the ULF configuration, the coordinates (u, v) define the positions on the main lens, while the coordinates (s, t) define the positions of each lenslet on the MLA. For the pixels behind each lenslet, we use coordinates (σ, τ), whose origin is at the centre of the lenslet footprint. In this setup, there is a one-to-one correspondence with the coordinates (u, v) on the main lens, given by:

We refer to the collection of pixels behind a lenslet as the micro-image. The pixels sharing the same (σ, τ) position under each lenslet form a sub-aperture (or pinhole) image. This is due to the fact that their intensities originate from the same point (u, v), as if there were a pinhole in that position. The corresponding pinhole is indicated by the yellow apertures in Fig. 1(a).

Compared to the ULF case, we showed in [20] that in the FLF case we observe a shift in the (s, t) coordinates of each sub-aperture image, which depends on its associated (σ, τ) coordinates:

where s_MLA and t_MLA are the coordinates of the centre of the lenslets on the MLA, while the (u, v) coordinates become:

2.2. Forward model

In [20] we derived the tomographic volumetric forward-projection and back-projection formulas for the ULF, which relate lines in the considered volume of the photographed scene to points in the light-field and vice versa. The complete forward-projection formula is:

(7)

$$ L(s_i, t_i, u, v) = \int_{\Omega_o} \delta\!\left( \frac{z}{z_0} s_o + \left(1 - \frac{z}{z_0}\right) u - M s_i ,\; \frac{z}{z_0} t_o + \left(1 - \frac{z}{z_0}\right) v - M t_i \right) E(s_o, t_o, z) \, \mathrm{d}s_o \, \mathrm{d}t_o \, \mathrm{d}z , $$

where M = z_0 / z_1 is the magnification of the recorded pixels in the photographed scene, (s_o, t_o) are the transverse coordinates in the object space, (s_i, t_i) are the transverse coordinates in the image space, L(s_i, t_i, u, v) is the computed light-field, and E(s_o, t_o, z) is the function describing the photographed scene, with support Ω_o = Ω_{s_o} × Ω_{t_o} × Ω_z, where Ω_{s_o} = [−S_o, S_o], Ω_{t_o} = [−T_o, T_o], and Ω_z = [−Z, Z]. The back-projection (integration refocusing) formula is instead:

(8)

$$ E(s_o, t_o, z) = \int_{\Omega_i} \delta\!\left( \frac{z}{z_0} s_o + \left(1 - \frac{z}{z_0}\right) u - M s_i ,\; \frac{z}{z_0} t_o + \left(1 - \frac{z}{z_0}\right) v - M t_i \right) L(s_i, t_i, u, v) \, \mathrm{d}s_i \, \mathrm{d}t_i \, \mathrm{d}u \, \mathrm{d}v , $$

where E(s_o, t_o, z) is the produced focal stack, and L(s_i, t_i, u, v) is the recorded light-field, with support Ω_i = Ω_{s_i} × Ω_{t_i} × Ω_u × Ω_v, where Ω_{s_i} = [−S_i, S_i], Ω_{t_i} = [−T_i, T_i], Ω_u = [−U, U], and Ω_v = [−V, V]. |U| and |V| are the largest sampled coordinates on the ML.

Equations (7) and (8) are also valid for the FLF case when considering the coordinates (s_i, t_i) on the plane at distance z_1 inside the camera. From Eqs. (3) and (4), we deduce that each recorded sub-aperture image presents a displacement equal to Δs_i = (a/b) Δσ and Δt_i = (a/b) Δτ, respectively. Using Eqs. (5) and (6), we can compute the displacements of the recorded sub-aperture images in the image space, as functions of the (u, v) coordinates:

In the object space, the shifts become:

(11)

$$ \Delta s_{o, \Delta u} = M \, \Delta s_{i, \Delta u} = M \frac{a}{z_1 + a} \Delta u , $$

(12)

$$ \Delta t_{o, \Delta v} = M \, \Delta t_{i, \Delta v} = M \frac{a}{z_1 + a} \Delta v . $$

The resulting forward-projection equation for the FLF case is simply:

(13)

$$ L(s_i, t_i, u, v) = \int_{\Omega_o} \delta\!\left( \frac{z}{z_0} s_o + \left(1 - \frac{z}{z_0} + \frac{M a}{z_1 + a}\right) u - M s_i ,\; \frac{z}{z_0} t_o + \left(1 - \frac{z}{z_0} + \frac{M a}{z_1 + a}\right) v - M t_i \right) E(s_o, t_o, z) \, \mathrm{d}s_o \, \mathrm{d}t_o \, \mathrm{d}z , $$

and the back-projection equation is:

(14)

$$ E(s_o, t_o, z) = \int_{\Omega_i} \delta\!\left( \frac{z}{z_0} s_o + \left(1 - \frac{z}{z_0} + \frac{M a}{z_1 + a}\right) u - M s_i ,\; \frac{z}{z_0} t_o + \left(1 - \frac{z}{z_0} + \frac{M a}{z_1 + a}\right) v - M t_i \right) L(s_i, t_i, u, v) \, \mathrm{d}s_i \, \mathrm{d}t_i \, \mathrm{d}u \, \mathrm{d}v . $$

From a tomographic perspective, in the object space, both Eq. (7) for the ULF and Eq. (13) for the FLF compute the line integral for the pixel (s_o, t_o, z_0) of a projection with its source in (u, v, 0), in a cone-beam acquisition geometry. The only difference between the two resides in the definition of the pixel (s_o, t_o, z_0), which is shifted by the amounts (Δs_{o,u}, Δt_{o,v}, 0) in the FLF configuration. Due to this, they can both be re-interpreted as the application of the integral forward-projection operator A[·(s_o, t_o, z)](s_i, t_i, u, v) : ℝ³ → ℝ⁴ (which we call A(s_i, t_i, u, v; s_o, t_o, z), with a small abuse of notation) to the function E(s_o, t_o, z), transforming Eqs. (7) and (13) into:

(15)

$$ L(s_i, t_i, u, v) = A\!\left[ E(s_o, t_o, z) \right](s_i, t_i, u, v) , $$

where the integration is over the coordinates (s_o, t_o, z), and the details of the geometry are absorbed into the operator A(s_i, t_i, u, v; s_o, t_o, z). Equations (8) and (14) are the adjoint operations of Eqs. (7) and (13), respectively. So, Eqs. (8) and (14) can also both be re-interpreted as the application of the integral back-projection operator A*[·(s_i, t_i, u, v)](s_o, t_o, z) : ℝ⁴ → ℝ³ (or A*(s_o, t_o, z; s_i, t_i, u, v), as before) to the function L(s_i, t_i, u, v):

(16)

$$ E(s_o, t_o, z) = A^{*}\!\left[ L(s_i, t_i, u, v) \right](s_o, t_o, z) , $$

where the integration is over the coordinates (s_i, t_i, u, v), and A* is the adjoint of A. The forward-projection matrix A is the matrix representation of the operator A(s_i, t_i, u, v; s_o, t_o, z), where the rows are selected by the set of indexes {s_i, t_i, u, v}, and the columns are selected by the set of indexes {s_o, t_o, z}. The back-projection matrix A^T is then the matrix representation of the operator A*(s_o, t_o, z; s_i, t_i, u, v), where the rows are selected by the set of indexes {s_o, t_o, z}, and the columns are selected by the set of indexes {s_i, t_i, u, v}. The matrices A and A^T are sparse, and have real non-negative values [29].

figure: Fig. 2

Fig. 2 Projection geometry of the two light-field acquisition geometries: (a) unfocused light-field (ULF); (b) focused light-field (FLF). At the distance z_0 all the sub-aperture images coincide on the (s, t) axes for the ULF case, while they experience shifts Δs = (a/b) Δσ and Δt = (a/b) Δτ for the FLF case.


In this article, we restrict ourselves to refocusing at a single distance, as opposed to reconstructing an entire volume from one light-field image. This turns Eq. (7) into the projection from a slice at position z̄ in the volume to the light-field:

(17)

$$ L(s_i, t_i, u, v) = \int_{\Omega_{o,\bar{z}}} \delta\!\left( \alpha s_o + (1 - \alpha) u - M s_i ,\; \alpha t_o + (1 - \alpha) v - M t_i \right) E(s_o, t_o, \bar{z}) \, \mathrm{d}s_o \, \mathrm{d}t_o , $$

and Eq. (8) into the projection from the light-field to that slice at position z̄:

(18)

$$ E(s_o, t_o) = \int_{\Omega_i} \delta\!\left( \alpha s_o + (1 - \alpha) u - M s_i ,\; \alpha t_o + (1 - \alpha) v - M t_i \right) L(s_i, t_i, u, v) \, \mathrm{d}s_i \, \mathrm{d}t_i \, \mathrm{d}u \, \mathrm{d}v , $$

where α = z̄ / z_0, and the support Ω_{o,z̄} is the slice at position z = z̄. In Fig. 2(a) we show the geometric representation of Eqs. (17) and (18) for α = 1. For α ≠ 1, the dotted lines remain the same, but the image grid is shifted to the corresponding z̄ = α z_0 position.

For the FLF case, Eq. (13) becomes:

(19)

$$ L(s_i, t_i, u, v) = \int_{\Omega_{o,\bar{z}}} \delta\!\left( \alpha s_o + \left(1 - \alpha + \frac{M a}{z_1 + a}\right) u - M s_i ,\; \alpha t_o + \left(1 - \alpha + \frac{M a}{z_1 + a}\right) v - M t_i \right) E(s_o, t_o, \bar{z}) \, \mathrm{d}s_o \, \mathrm{d}t_o , $$

and Eq. (14) becomes:

(20)

$$ E(s_o, t_o) = \int_{\Omega_i} \delta\!\left( \alpha s_o + \left(1 - \alpha + \frac{M a}{z_1 + a}\right) u - M s_i ,\; \alpha t_o + \left(1 - \alpha + \frac{M a}{z_1 + a}\right) v - M t_i \right) L(s_i, t_i, u, v) \, \mathrm{d}s_i \, \mathrm{d}t_i \, \mathrm{d}u \, \mathrm{d}v . $$

Figure 2(b) gives a representation of the projection geometry of the different sub-aperture images for the FLF case.

Without loss of generality, we can again identify A_z as the matrix representation of both forward-projection operators in Eqs. (17) and (19), and A_z^T as the matrix representation of both back-projection operators in Eqs. (18) and (20).

If we now consider the discretization of the functions L(s_i, t_i, u, v) and E_z(s_o, t_o) as the vectors b and x_z respectively, the image formation process for one slice is represented by the following linear system:

(21)

$$ A_z \bar{x}_z = b , $$

where x̄_z is the vector representation of the considered slice in the photographed scene, and b is the vector representation of the recorded light-field. In Eq. (21) we assume that all the recorded light comes from the chosen slice at distance z̄.

Any object outside the slab defined by the acquisition depth-of-field (DoF) around z̄ is not reconstructed correctly and appears out of focus in the refocused images. The acquisition DoF is the sum of two factors: a geometric and a diffractive component. The former was also considered in [20] from the point of view of a tomographic acquisition. It is linearly proportional to the lenslet diameter and inversely proportional to the ML numerical aperture (NA). The latter instead has an inverse quadratic proportionality to the ML NA, and is proportional to the detected photon wavelength. For a more in-depth explanation and derivation of the DoF we refer to [30]. For translucent objects, this means that if they are in focus, their shape is correctly refocused, but the objects outside of the DoF that are visible through them are imaged out of focus.

In the case of super-resolved reconstructions, we assume that the refocused slice in the photographed scene has a smaller pixel size than the recorded light-field in the (s, t) coordinates. Equation (21) can be modified to remove this discrepancy, by adding a down-scaling operation after the projection by the matrix A_z. The down-scaling matrix U_n bins the values of n × n neighbouring pixels into a single value, where n is the ratio between the recorded light-field resolution and the slice resolution in the (s, t) coordinates. The resulting image formation process from Eq. (21) becomes:

(22)

$$ U_n A_z \bar{x}_z = b . $$
The optical system's limited frequency response is also introduced, in the form of a convolution matrix P_z (a Toeplitz matrix) applied to the binned images [31]. This matrix performs the convolution of the binned images with the total PSF of the optical system. This PSF is itself the convolution of the PSFs generated by the lenses' limited size and by the Nyquist limitation on the resolution. The ML and lenslet PSFs correspond to the Airy functions defined by their size and the captured light wavelength [2, 11, 32]. They do not depend on z, and they are applied over the (s_i, t_i) and (u, v) coordinates respectively. The Nyquist limitation on the resolution corresponds to the integration over the portion of the ML corresponding to a given sub-aperture image [16]. It has the shape of a flat disk whose width depends on the refocusing distance z̄, the acquisition focal distance z_0, and the (u, v) resolution, and it is applied over the (s_i, t_i) coordinates.

By introducing the optical system's limited frequency response and the super-resolution capability, the complete image formation process for one slice from Eq. (22) becomes:

(23)

$$ P_z U_n A_z \bar{x}_z = b , $$

where U_n is the binning matrix for the chosen super-resolution factor n, and P_z is the convolution matrix that applies the optical system's PSF.
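For illustration, the following is a minimal numpy/scipy sketch (not the implementation used in this work) of how the binning operator U_n and the PSF convolution P_z of Eq. (23) can act on a projected slice; the image, the super-resolution factor, and the Gaussian stand-in for the true (Airy and disk) PSF are all toy assumptions.

```python
# Minimal sketch (not the implementation used in this work) of the operators of
# Eq. (23): U_n bins n x n neighbouring pixels, P_z convolves with the total PSF.
import numpy as np
from scipy.signal import fftconvolve

def bin_n(img, n):
    """U_n: bin n x n neighbouring pixels of the projected slice into one value."""
    h, w = img.shape
    return img.reshape(h // n, n, w // n, n).sum(axis=(1, 3))

def apply_psf(img, psf):
    """P_z: convolve the binned image with the total PSF of the optical system."""
    return fftconvolve(img, psf, mode="same")

# Toy example: a 64 x 64 projected slice, super-resolution factor n = 2, and a
# small Gaussian kernel standing in for the true (Airy + disk) PSF.
x_proj = np.random.rand(64, 64)                 # stands in for A_z x_z
g = np.exp(-0.5 * (np.arange(-3, 4)) ** 2)
psf = np.outer(g, g); psf /= psf.sum()
b_model = apply_psf(bin_n(x_proj, 2), psf)      # P_z U_n A_z x_z of Eq. (23)
print(b_model.shape)                            # (32, 32)
```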

The matrices A_z and A_z^T can be computed explicitly by using the sets of unit vectors E_o = {e_{o,1}, e_{o,2}, …} and L_i = {l_{i,1}, l_{i,2}, …} respectively. The unit vectors E_o each correspond to a volume in the (s_o, t_o, z) coordinates, where every value is zero, except for the selected point at position (s_{o,k}, t_{o,k}, z_k), which is one. The same goes for the unit vectors L_i in the coordinates (s_i, t_i, u, v). These vectors are used in Eqs. (17)–(20) to compute a volume in the adjoint space. The resulting volumes are the columns of A_z and the rows of A_z^T when using the unit vectors E_o, while they are the rows of A_z and the columns of A_z^T when using the unit vectors L_i. While this procedure can give mathematical insight into the projection operations, for practical applications it is better to use existing efficient tomographic implementations, like the ASTRA toolbox [33].
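As a toy illustration of this unit-vector procedure, the sketch below (assuming numpy, and a simple binning function as a stand-in for the forward-projector) builds the matrix of a linear operator column by column and checks the adjoint relation; it is not how the matrices are built in practice, where the ASTRA toolbox is used instead.

```python
# Toy sketch of the unit-vector construction described above, assuming numpy.
# A simple 2 x 2 binning function stands in for the forward-projector of
# Eq. (17); probing it with unit vectors e_o yields the columns of its matrix.
import numpy as np

def forward(vol):                        # stand-in forward operator (4x4 -> 2x2)
    return vol.reshape(2, 2, 2, 2).sum(axis=(1, 3))

n_obj, n_data = 16, 4                    # flattened sizes of the two spaces
A = np.zeros((n_data, n_obj))
for k in range(n_obj):
    e_o = np.zeros(n_obj)
    e_o[k] = 1.0                         # unit vector in the (s_o, t_o, z) space
    A[:, k] = forward(e_o.reshape(4, 4)).ravel()   # k-th column of A_z

# The back-projection matrix is then simply A.T, and <A x, l> == <x, A^T l>:
x = np.random.rand(n_obj); l = np.random.rand(n_data)
assert np.isclose((A @ x) @ l, x @ (A.T @ l))
```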

2.3. Refocusing as a minimization problem

Refocusing is the procedure of retrieving a photograph E_z(s_o, t_o) for a given refocusing distance z̄ = α z_0, from the light-field acquired at z_0. It was proven in [20] that the simple back-projection of the sub-aperture images corresponds to the integration method for non-super-resolved photographs described in [1]. According to Eq. (21), we obtain the stack of refocused images x̂_z from Eqs. (18) and (20), as follows:

(24)

$$ \hat{x}_z = A_z^T b , $$

where b is again the vector representation of the recorded light-field. For super-resolved images, based on the forward model from Eq. (22), we update Eq. (24) to:

(25)

$$ \hat{x}_z = A_z^T U_n^T b , $$

where U_n^T is the up-sampling matrix, the transpose of the U_n down-sampling matrix.
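The back-projection of Eq. (24) reduces, in pixel units, to the classic shift-and-sum integration: each sub-aperture image is translated proportionally to its (u, v) coordinate and the results are averaged. The following is a hedged numpy/scipy sketch of that operation; the light-field array layout, its indexing convention, and the per-view slope are illustrative assumptions, and the exact proportionality constant follows from Eq. (18) and the chosen units.

```python
# Hedged numpy/scipy sketch of Eq. (24) as shift-and-sum refocusing; the array
# layout lf[u, v, s, t], the pixel units and the per-view slope are assumptions.
import numpy as np
from scipy.ndimage import shift as nd_shift

def refocus_shift_and_sum(lf, slope):
    """lf: (Nu, Nv, Ns, Nt) light-field; slope: pixels of shift per (u, v) step."""
    n_u, n_v, n_s, n_t = lf.shape
    out = np.zeros((n_s, n_t))
    for iu in range(n_u):
        for iv in range(n_v):
            du = (iu - (n_u - 1) / 2) * slope      # signed offset of this view
            dv = (iv - (n_v - 1) / 2) * slope      # from the main-lens centre
            out += nd_shift(lf[iu, iv], (du, dv), order=1, mode="nearest")
    return out / (n_u * n_v)

lf = np.random.rand(5, 5, 64, 64)                  # toy 5 x 5 sub-aperture images
photo = refocus_shift_and_sum(lf, slope=0.5)       # refocus away from z_0
```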

The Fourier slice theorem method from [16] uses an approximation Ã_z^T to the back-projection matrix A_z^T which is computationally less expensive, resulting in a very similar refocusing problem:

When refocusing to only one slice, the method from [17] is equivalent to the one from [16], in the sense that it performs a Gaussian re-sampling of the Fourier space instead of using Kaiser-Bessel functions. This results in only a slightly different approximation matrix Ã_z^T for Eq. (26).

In this article, we formulate the refocusing reconstruction as a convex minimization problem, similarly to the high resolution image reconstruction from [10, 11]. We minimize the residual on the captured light-field, using Eq. (23):

(27)

$$ \hat{x}_z = \operatorname*{arg\,min}_{x} \left\{ \| P_z U_n A_z x - b \|_2^2 \right\} \quad \text{subject to: } x \geq 0 , $$
where the non-negativity constraint imposes the fact that there cannot be negative light scattering. This formulation can be solved by existing iterative algorithms like SIRT and Conjugate Gradient Least-Squares (CGLS) [25]. The SIRT implementation for the solution of Eq. (27) can be found in Appendix A.
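As a minimal illustration, Eq. (27) can also be handed to an off-the-shelf bounded least-squares solver; in the sketch below (assuming scipy) a random sparse matrix stands in for P_z U_n A_z and the non-negativity constraint is expressed through the solver bounds. This is only a sketch, not the SIRT/CGLS implementation used in this work.

```python
# Sketch only: Eq. (27) solved with a generic bounded least-squares routine,
# assuming scipy; the random sparse matrix stands in for P_z U_n A_z.
import numpy as np
from scipy import sparse
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)
A_tilde = sparse.random(200, 400, density=0.05, random_state=0)  # toy P_z U_n A_z
x_true = np.abs(rng.standard_normal(400))                        # toy slice
b = A_tilde @ x_true                                             # toy light-field

res = lsq_linear(A_tilde, b, bounds=(0.0, np.inf), max_iter=200)
x_hat = res.x                                                    # refocused slice
```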

2.4. Regularization

The recent introduction of reconstruction algorithm classes like Chambolle-Pock [26, 27] offers the ability to generate algorithm instances that solve arbitrary problems characterized by convex and possibly non-smooth terms, in a straightforward manner. This allows us to surpass the implementations from [23] and [10], and to augment Eq. (27) with non-smooth convex terms like the following:

(28)

$$ \hat{x}_z = \operatorname*{arg\,min}_{x} \left\{ \| P_z U_n A_z x - b \|_2^2 + \lambda \| O x \|_1 \right\} \quad \text{subject to: } x \geq 0 , $$
where λ is a weighting parameter, and O is an operator.

The l_1-norm promotes sparsity [34], and we can choose an operator O that returns a sparse representation of the solution vector x when it presents certain characteristics. This can be used to impose prior knowledge on the reconstruction. For instance, natural images present a sparse wavelet decomposition, while this is not the case for white noise [34]. Thus, reducing the l_1-norm of the wavelet decomposition of natural images reduces the impact of white noise and possibly other sources of noise, like pixel value quantization and pixel surface integration.

Another popular choice for O is the modulus of the point-wise gradient, whose l_1-norm is known as the Total Variation: TV(·) = ‖ |∇(·)| ‖_1. The TV term promotes isotropic gradient sparsity in the reconstructed images, thus favouring flat regions with sharp boundaries [27].

In this article, we propose to use either the TV term or the translation-invariant (stationary) wavelet transform (SWT) [21, 22]. The SWT is an overcomplete wavelet decomposition that allows removing the well-known block artifacts that characterize high compression levels of the traditional discrete wavelet transform (DWT), while preserving its sparsity properties. The CP implementation for the solution of Eq. (28) with a TV regularization term can be found in Appendix A.
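The two regularization functionals can be evaluated with a few lines of code; the sketch below (assuming numpy and the PyWavelets package) computes the isotropic TV of an image and the l_1-norm of its Haar SWT decomposition, with forward differences and a two-level decomposition as illustrative choices.

```python
# Sketch of the two regularization functionals of Eq. (28), assuming numpy and
# PyWavelets; forward differences and a 2-level Haar SWT are illustrative choices.
import numpy as np
import pywt

def tv_norm(img):
    """Isotropic total variation: l1-norm of the gradient-magnitude image."""
    gx = np.diff(img, axis=0, append=img[-1:, :])
    gy = np.diff(img, axis=1, append=img[:, -1:])
    return np.sum(np.sqrt(gx ** 2 + gy ** 2))

def swt_l1(img, wavelet="haar", level=2):
    """l1-norm of the overcomplete stationary wavelet decomposition."""
    coeffs = pywt.swt2(img, wavelet, level=level)
    return sum(np.abs(c).sum()
               for approx, details in coeffs
               for c in (approx,) + tuple(details))

img = np.random.rand(64, 64)
print(tv_norm(img), swt_l1(img))
```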

2.5. Practical limitations

The model presented in section 2.2 suffers from a few practical limitations. Three main limitations are: the lenslets are usually not perfectly aligned with the underlying sensor pixel grid; the ML does not have real pinholes, but integrates over a finite region; and the ML causes geometric distortion.

The first limitation implies a mismatch of the (σ, τ) coordinates between the pixels under each lenslet. As a result, none of the presented methods, including the ones from the literature, can be directly applied to the collected data. This is routinely solved by a re-interpolation of the pixel grid from the area underneath each lenslet (the lenslet footprint). The resulting re-sampled grid has the desired matching of (σ, τ) coordinates, but it suffers from a small penalty in directional resolution, as large as half of the (u, v) pixel-size.

The second corresponds to the Nyquist limitation on the resolution discussed in section 2.2. It results in a disk blur dependent on the refocusing distance from the original focal distance z_0, and it can be mitigated through the inclusion of such a blurring term in the projection model, as discussed in section 2.2.

The third limitation causes straight lines to be reconstructed as curved lines. This is a well-known problem in computer vision, and it is routinely solved by a checkerboard calibration of the camera, which returns a map of its distortion. Once the camera has been calibrated, the acquired images can be rectified through a common image distortion correction procedure. Reliable software for carrying out this procedure already exists; an example is the "Light Field Toolbox for Matlab" (LFT: http://dgd.vision/Tools/LFToolbox/) from Donald G. Dansereau.
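As a generic illustration of this rectification step (not based on the LFToolbox mentioned above), the sketch below assumes OpenCV and placeholder intrinsics and distortion coefficients such as a checkerboard calibration would provide; each sub-aperture image would be undistorted in this way before refocusing.

```python
# Generic rectification sketch, assuming OpenCV (not the LFToolbox mentioned
# above); the intrinsics K and distortion coefficients are placeholders that a
# checkerboard calibration would provide, and the input image is synthetic.
import numpy as np
import cv2

img = np.zeros((480, 640, 3), dtype=np.uint8)    # placeholder sub-aperture image
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])                  # placeholder camera matrix
dist = np.array([-0.2, 0.05, 0.0, 0.0, 0.0])     # placeholder k1, k2, p1, p2, k3
rectified = cv2.undistort(img, K, dist)
```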

For a more in-depth analysis of the lenslet alignment with the underlying sensor, and of the geometric distortion introduced by the ML, we refer to [35].

figure: Fig. 3

Fig. 3 Description of the logos synthetic case: (a) phantom used to simulate light-field data, having a VOXEL logo at the acquisition focal distance z = 100 mm, and two CWI logos at the distances of 110 mm and 90 mm; (b) simulated data, with two sub-aperture images shown for clarity.


3. Reconstructions

We now address the reconstruction performance and applicability of the proposed methods. We first employ a known phantom from [20] (the "logos" test case) to test the reconstruction fidelity of the refocused objects, for different up-scaling factors (Fig. 3). For brevity we only reconstruct the logos example for a ULF camera geometry. We then demonstrate the ability of our approach to reconstruct real data for both ULF and FLF camera configurations.

figure: Fig. 4

Fig. 4 Raw data from the Tarot Cards dataset. The data has been downscaled by a factor 4 in the (s, t) coordinates, and blurred with a known PSF.


For the ULF case, we first show reconstructions of two versions of the "Tarot Cards" dataset from the Stanford Light-Field Archive (http://lightfield.stanford.edu/lfs.html), originally acquired by Andrew Adams (data: Fig. 4). The two versions are used for different purposes: one is downscaled and blurred to simulate an MLA-based acquisition setup, and to assess the PSF correction and super-resolution performance; the other is the original full-resolution version, which is used to show that the developed methods can work with full-resolution gantry setups. To further test the ULF case, we then show reconstructions of the "Tea" dataset, acquired with the setup in Fig. 5(a) using the Lytro Illum plenoptic camera (data: Fig. 6).

figure: Fig. 5

Fig. 5 Experimental setups used for the acquisition of the ULF and FLF datasets "Tea", "Letters", and "Flower": (a) ULF setup, using the Lytro Illum plenoptic camera; (b) FLF setup, built at Imagine Optic in Bordeaux (France).


For the FLF case, we show the reconstruction of two datasets called "Letters" (micro-images: Fig. 7(a), sub-aperture images: Fig. 7(b)) and "Flower" (micro-images: Fig. 8(b), sub-aperture images: Fig. 8(c)), which we acquired experimentally with the dedicated setup in Fig. 5(b). The Letters dataset depicts vector art, which is easily recognizable by the human eye and is favoured by the TV-min constraint. The Flower dataset depicts natural imagery (https://github.com/pratulsrinivasan/Local_Light_Field_Synthesis), which is more complex and is favoured by the SWT-min constraint.

figure: Fig. 7

Fig. 7 Raw data for the Letters case: (a) Micro-image representation; (b) Sub-aperture representation. In both cases, two zoomed insets are shown for clarity. The observed inversion between (a) and (b) is due to the fact that FLF data mixes spatial information with angular information at the micro-images [36], and that the lenslets perform an image-space inversion of the images that form on their object-space focal plane.


figure: Fig. 8

Fig. 8 Raw data for the Flower case: (a) Original (high resolution) image used for this test case; (b) Micro-image representation; (c) Sub-aperture representation. In both cases, two zoomed insets are shown for clarity.


3.1. Data description

The logos scene is composed of three different objects positioned at 90, 100 and 110 mm respectively from the ML, as displayed in Fig. 3(a). The dataset is composed of 16 × 16 sub-aperture images, each composed of 512 × 256 pixels. The sub-aperture images for this case are presented in Fig. 3(b). The diameter d_1 of the main lens is 10 mm, while each lenslet has a diameter d_2 = 16 μm. The distance z_2 is 50 μm, and the detector pixel size 1 μm. The ML-to-MLA distance z_1 is 25 mm, so that the focal plane in the object space is at 100 mm (|M| = 4). In the creation of this dataset we applied the two PSFs due to Fraunhofer diffraction for the ML and the lenslets in the MLA. These PSFs were generated by computing the Fraunhofer diffraction pattern predicted for the lens aperture (d_1 and d_2 for the ML and MLA respectively), detected light wavelength, and lens-to-target distance (z_1 and z_2 for the ML and MLA respectively). This diffraction pattern was then integrated in steps equal to the underlying pixel size (16 μm and 1 μm for the ML and MLA respectively).
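For illustration, the Airy pattern of a circular aperture can be computed directly from the Fraunhofer diffraction formula; the sketch below (assuming numpy/scipy) uses the MLA parameters of this test case and an assumed wavelength of 550 nm, and samples the pattern on the detector pixel grid without the final pixel-area integration step.

```python
# Sketch of the Airy (Fraunhofer) diffraction pattern of a circular aperture,
# assuming numpy/scipy; MLA parameters from this test case (d_2 = 16 um,
# z_2 = 50 um, 1 um pixels) and an assumed wavelength of 550 nm.
import numpy as np
from scipy.special import j1

def airy_psf(diameter, distance, wavelength, pixel_size, n_pix=15):
    ax = (np.arange(n_pix) - n_pix // 2) * pixel_size
    r = np.hypot(*np.meshgrid(ax, ax))            # radial detector coordinate
    x = np.pi * diameter * r / (wavelength * distance)
    x[x == 0] = 1e-12                             # avoid 0/0 at the centre
    psf = (2.0 * j1(x) / x) ** 2
    return psf / psf.sum()

psf_mla = airy_psf(diameter=16e-6, distance=50e-6,
                   wavelength=550e-9, pixel_size=1e-6)
```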

To enable the evaluation of the up-sampling refocusing reconstruction quality for the logos case, we also created two additional versions of this test case, where we binned (down-sampled) the sub-aperture images along the (s, t) coordinates by 2 and 4. The reconstruction of these down-sampled datasets was performed at the original dataset (s, t) resolution, turning it into a super-resolved reconstruction problem with corresponding up-sampling factors of 2 and 4, respectively. The resulting up-sampled refocused images could then be compared to the original (high resolution) phantom. In general, this means that for increasing sampling factors, the available data (s, t) resolution becomes worse, less data is available, and more operations are needed to produce images with the original (s, t) resolution. This process also resulted in the re-scaling of the expected lenslet sizes, detector pixel sizes, and PSF widths, according to the sampling factors.

The Tarot Cards dataset was acquired using a gantry setup (analogous to a camera array setup), which was built using Lego Mindstorms, and a Canon Digital Rebel XTi with a Canon 10–22 mm lens. The cropped and registered dataset from the archive comes as a collection of 17 × 17 sub-aperture images, with a relatively high (s, t) resolution (1024 × 1024). The focal length of the main lens f_1 was set to 20 mm. We binned the sub-aperture images to a resolution of 256 × 256, and blurred the light-field in the (u, v) coordinates using a different PSF per color channel. The three PSFs have been computed as the result of the Fraunhofer diffraction effects due to the lenslets, with the following parameters: lenslet diameter d_2 = 20 μm, distance z_2 = 37 μm, and detector pixel size 1.4 μm. The parameters for the lenslets are the same as the ones in a well-known plenoptic camera, commercially available under the name of Lytro ILLUM. The modified Tarot Cards dataset is displayed in Fig. 4.

The Tea dataset (Fig. 6) was acquired using a Lytro Illum plenoptic camera, with the setup in Fig. 5(a). Since the Lytro Illum is a proprietary camera, we do not have the complete acquisition parameters. The raw data and metadata were extracted using the proprietary software "Lytro power tools", property of Lytro. This software already performs the rectification and lenslet re-sampling procedures discussed in section 2.5.

The Letters and Flower datasets (Figs. 7 and 8) correspond to printed vector art and natural imagery, which was then photographed with the dedicated setup in Fig. 5(b), in the laboratories of Imagine Optic (Bordeaux, France). In this setup, we used the AC254-200-A-ML ML and MLA300-14AR-M MLA from Thorlabs, and the Stingray F-145 sensor from Allied Vision. The ML has focal distance f_1 = 200 mm, and the MLA is composed of 30 × 30 lenslets with f_2 = 18.6 mm, diameter d_2 = 300 μm, and built-in f-number equal to 62. The detector pixel size is 6.45 μm. We chose a z_1 distance of 400 mm, which gave a z_0 distance of 400 mm and |M| = 1. We also chose a distance b = 20.93 mm, which corresponds to a distance a = 167.4 mm. The field-of-view (FoV) of the scene is 7.2 mm × 4.8 mm, while the DoF is 2.5 mm [30]. The degradation of the micro-image quality observed in both Figs. 7 and 8 is due to the PSF of the imaging system, and in particular to the large lenslet f-number, which negatively correlates with the imaging system's spatial resolution [30].
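As a quick consistency check (assuming only the thin-lens relation of section 2.1), the quoted a follows from the chosen b and f_2:

```python
# Consistency check of the quoted FLF geometry via the thin-lens relation
# 1/f_2 = 1/a + 1/b from section 2.1 (values as stated above, in mm).
f2 = 18.6                           # lenslet focal length
b = 20.93                           # MLA-to-sensor distance
a = 1.0 / (1.0 / f2 - 1.0 / b)
print(round(a, 1))                  # ~167 mm, matching a = 167.4 mm up to rounding
```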

For the reconstruction of all these test cases, the chosen regularization weight λ values have been selected empirically by trial and error. In the case of multi-channel data (e.g. RGB), we performed a single-channel reconstruction, where each channel was reconstructed separately. However, the same λ value was used for the reconstruction of all the channels.

3.2. Performance analysis

To evaluate the reconstruction fidelity of the logos test case, we can directly compare the refocused regions with the phantom, and compute the root mean square error (RMSE) of each refocused image using the formula:

(29)

$$ \mathrm{RMSE}(x_z) = \sqrt{ \frac{ \sum_{n \in N} \left( (x_z)_n - (\hat{x}_z)_n \right)^2 }{ N } } , $$
where the summation is over the selected N pixels, and the vector x̂_z represents the expected solution at the given depth z. Lower values of the RMSE correspond to better refocusing performance.
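A direct transcription of Eq. (29), assuming numpy, with an optional mask selecting the N evaluated pixels:

```python
# Direct transcription of Eq. (29), assuming numpy; mask selects the N pixels.
import numpy as np

def rmse(x_rec, x_ref, mask=None):
    if mask is None:
        mask = np.ones(x_rec.shape, dtype=bool)
    diff = x_rec[mask] - x_ref[mask]
    return np.sqrt(np.mean(diff ** 2))
```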

3.3. Logos refocusing

The logos test case is a benchmark on the impact of both the PSF correction and the regularization, with respect to different up-sampling parameters. We compare the refocusing reconstruction results between a few different reconstruction approaches: integration (also known as "shift&sum"), the mathematically equivalent tomographic simple back-projection, the Fourier slice theorem, SIRT with and without PSF correction, and CP using the Least Squares (l_2) data-divergence term and TV penalization (abbreviated to CP-LS-TV), both with and without PSF correction. The refocusing was performed on the CWI logo at distance 90 mm from the ML, corresponding to α = 0.9 (as defined in section 2.3).

figure: Fig. 9

Fig. 9 Performance comparison of different refocusing approaches for the "logos" synthetic test case: (a) Phantom; (b) Integration; (c) Back-projection; (d) Fourier slice theorem; (e) SIRT without PSF; (f) Chambolle-Pock without PSF, with l_2 data-divergence term and TV regularization term with λ = 1; (g) SIRT with PSF; (h) Chambolle-Pock using PSF, l_2 data-divergence term, and TV regularization term with λ = 1.


As mentioned in section 2.3, the choice of the simple Fourier slice theorem implementation from [16], over the filtered version from [17], is due to the fact that the two versions, for planar refocusing, only differ by the introduction of a filtering step that is equivalent to a Gaussian resampling, instead of the Kaiser-Bessel resampling proposed in [16]. This means that on a noiseless phantom, the two methods should behave very similarly.

In Fig. 9 we present all the different reconstructions against the all-in-focus phantom, at the original phantom resolution (sampling factor = 1: down-sampling of sub-aperture images = 1, up-sampling of reconstruction = 1). It is apparent from the comparison of Figs. 9(b) to 9(d) against Figs. 9(e) to 9(h) that iterative methods can define sharper transitions than traditional one-step refocusing methods, even in the presence of blurring. Then, from the comparison between Figs. 9(e) to 9(f) and Figs. 9(g) to 9(h), we see that the PSF correction greatly improves the recovery of sharp boundaries. From the comparison between Fig. 9(g) and Fig. 9(h), by looking close to the edge of the CWI target in the left inset of each figure, it is possible to notice that the additional regularization term helps in reducing the wavy type of artifacts due to the PSF reconstruction.

figure: Fig. 10

Fig. 10 Root-Mean-Square-Errors (RMSEs) and computational time for each reconstruction approach from Fig. 9, for the down-sampling factors 1, 2, and 4 of the sub-aperture images (the (s, t) coordinates of the light-field): (a) RMSE; (b) Computational time; (c) Scaling of sub-aperture image resolution and reconstruction up-sampling factor against the sampling factor.


Figure 10(a) quantifies the performance of the different methods, by plotting the RMSEs of each reconstruction method for the down-sampling factors 1, 2 and 4 of the sub-aperture images in the original light-field, and the corresponding up-sampling of the refocusing reconstructions. As also observed for Fig. 9, the one-step refocusing methods present similar reconstruction accuracy across the board, with the difference between the integration and the back-projection approach only being due to the different interpolation method used in their implementations. Among the iterative methods, the PSF correction can improve the reconstruction accuracy by almost an order of magnitude at the highest resolutions, while it becomes less important with higher detector binning factors, because the PSF effect tends to be substantially reduced [10]. This can be observed in detail for the sampling factor equal to 2, where the SIRT and CP-LS-TV reconstructions without PSF correction improve their refocusing performance with respect to the original dataset, because of a reduction in size of the lenslet PSF. The choice of the correct regularization (in this case TV-min) improves the result over non-regularized reconstructions, and it becomes relatively more important at higher sampling factors and with smaller PSFs.

Finally, in Fig. 10(b) we can observe the computational time required by every tested method at each down-sampling factor. Even if the focus of this article is not on the computational performance of the proposed method, we believe that it is worth giving an understanding of the performance penalties associated with each addressed aspect. These tests were carried out on a machine equipped with an Intel i7-6700K CPU, 64GB of DDR4 memory, and a NVIDIA GeForce GTX 970 GPU. The fastest method is the back-projection because it is performed using the ASTRA toolbox, which uses GPU acceleration. All the iterative methods also use the GPU acceleration offered by ASTRA for the forward and back-projection, but perform the rest of the operations on the CPU. This imposes data transfers across the PCIe bus (from GPU to CPU and vice versa) at each iteration. We expect to see substantial improvements by running these refocusing routines entirely on the GPU, by both avoiding bottlenecks due to the data transfers, and allowing acceleration of every other operation, including the PSF application. For the same number of iterations, the addition of the TV minimization term has a negligible impact on the refocusing performance. The PSF addition causes the highest performance penalty, which nevertheless depends on the size of the PSF: for increasing down-scaling factors, it has a progressively lower impact. The Fourier method also shows better performance with higher down-sampling factors, because the initial 4D-FFT is performed on smaller data sizes [16].

3.4. Tarot cards refocusing

The Tarot Cards test case shows the super-resolution and artifact correction potential of the proposed methods on real data, and in particular on a well understood and widely used dataset.

figure: Fig. 11

Fig. 11 Performance comparison of different refocusing approaches, at different levels of up-scaling (super-resolution) for the Tarot Cards case. The first row shows reconstructions at up-sampling = 1, and the second row at up-sampling = 4. In the columns, instead, we have the different approaches: (a) Back-projection; (b) Fourier slice theorem; (c) SIRT with PSF; (d) Chambolle-Pock using PSF, l_2 data-divergence term, and SWT Haar regularization with λ = 3.


We first analyze the results of the downscaled and blurred dataset. In Fig. 11, we can observe the refocusing performance of the most significant reconstruction methods, for the two different up-sampling parameters of 1 and 4. For this test case, we chose to use: back-projection, Fourier slice theorem, SIRT with PSF, and CP-LS with SWT penalization based on Haar wavelets (CP-LS-SWT) and PSF. Once more, the back-projection and Fourier based methods deliver comparable refocusing and feature-resolving performance. In Fig. 11(c), for up-scaling equal to 4, the SIRT reconstruction presents a better feature-resolving performance than the one-step approaches, but also visible artifacts in the reconstruction. The SWT penalized reconstruction from Fig. 11(d) visibly preserves the higher feature-resolving performance of the SIRT approach, while strongly reducing the said artifacts. The choice of the SWT penalization term over the TV term can be understood through the same reasoning detailed in section 2.4. The Tarot Cards dataset presents a variety of features, including shaded regions. These regions break the assumption of sharp boundaries associated with TV minimization.

figure: Fig. 12

Fig. 12 Performance comparison of different refocusing approaches, for the full-resolution Tarot Cards case. We show one of the sub-aperture images at the side of the different reconstruction approaches: (a) Central sub-aperture image; (b) Back-projection; (c) SIRT; (d) Chambolle-Pock using l_2 data-divergence term, and SWT Haar regularization with λ = 3.


We now consider the reconstruction of the unmodified Tarot Cards dataset. It can be deduced from Fig. 12 that the approaches presented in this article allow correct reconstruction of high resolution gantry data. By comparing the results of the different methods in Fig. 12, no difference is visible between the reconstructions. This is due to the fact that, in this case, the back-projection method already returns a correct reconstruction. In fact, the blur visible in the reconstructions is also present in the sub-aperture images (Fig. 12(a)).

These high resolution gantry data reconstructions can be used to quantify the quality of the super-resolved refocusing reconstructions that used the modified Tarot Cards dataset. The measured RMSE values for the reconstructions in Fig. 11, against Fig. 12(b), are the following: 0.0363353, 0.103504, 0.0382393, 0.0169782 for the back-projection, Fourier, SIRT with PSF, and CP-LS-SWT with PSF respectively. These results confirm the visual assessment carried out earlier in this section.

3.5. Tea refocusing

The Lytro Illum plenoptic camera is a popular commercially available compound camera that allows easy acquisition and extraction of single-shot light-fields, while maintaining a portable format.

figure: Fig. 13

Fig. 13 Performance comparison of different refocusing approaches, at different levels of up-scaling (super-resolution) for the Tea case. The first row shows reconstructions at up-sampling = 1, and the second row at up-sampling = 4. In the columns, instead, we have the different approaches: (a) Back-projection; (b) Fourier slice theorem; (c) SIRT with PSF; (d) Chambolle-Pock using PSF, l_2 data-divergence term, and SWT Haar regularization with λ = 1.


In Fig. 13, we show the reconstruction of the Lytro Illum data acquired with the setup in Fig. 5(a). As was done for the Tarot Cards dataset, we performed reconstructions with up-sampling of 1 and 4, for the following methods: back-projection, Fourier slice theorem, SIRT with PSF, and CP-LS with SWT penalization based on Haar wavelets (CP-LS-SWT) and PSF.

This case shows that the methods developed in this article can be applied to the data acquired with this popular camera. It can be seen from the leaf and castle insets that once again both iterative approaches present a better feature-resolving performance than the one-step approaches. Moreover, the SWT regularization is well behaved for the refocusing of natural images, and it helps in reducing artifacts.

In this case, we also focus our attention on the out-of-focus regions of the refocused image. In particular, we refer to the bottom inset, which magnifies an edge from the background. The super-resolved back-projected reconstruction (bottom of Fig. 13(a)) of this region is similar to the corresponding Fourier version (bottom of Fig. 13(b)), but it also presents traces of a wavy pattern. The iterative methods used in Figs. 13(c) and 13(d) present a stronger artifact because they use the same back-projection operation as in Fig. 13(a), but iterate it multiple times. The observed artifact is due to inaccuracies in the numerical interpolation of the used back-projection implementation, which is more apparent for higher relative distances from the acquisition focal distance. In fact, this effect is not visible for the Tarot Cards test case (Fig. 12), and it deviates from the expected bokeh effect for out-of-focus regions. To remove it, the back-projection operation could be substituted by a Fourier type of back-projection.

3.6. Letters and Flower refocusing

The Letters and Flower test cases assess the ability of our approach to handle real data produced with the FLF camera geometry. We show that every aspect of our approach, including super-resolution and regularization, can be carried over to this case.

figure: Fig. 14

Fig. 14 Performance comparison of different refocusing approaches, at different levels of up-scaling (super-resolution) for the Letters case. The first row shows reconstructions at up-sampling = 1, and the second row at up-sampling = 2. In the columns, instead, we have the different approaches: (a) Back-projection; (b) SIRT without PSF; (c) Chambolle-Pock with PSF, l_2 data-divergence term, TV regularization, and λ = 100.


In Fig. 14, we present the reconstruction of the FLF data from Fig. 7 for the following approaches: back-projection, SIRT without PSF, and CP-LS-TV with PSF. For this case, the choice of the TV term is justified by the properties of the imaged objects: the printed letters, with sharp boundaries over a plain background, explicitly satisfy the prior knowledge of the TV minimization. We present reconstructions with up-sampling equal to both 1 and 2. From Fig. 14, it is clearly visible that the use of super-sampling and regularization techniques can improve the reconstruction of the smallest features.

The PSF used for this case is the result of both the theoretically predicted PSF due to Fraunhofer diffraction of the lenslets, and the Nyquist limitation on the resolution for the given refocusing distance (as seen in section 2.2). To compute the Nyquist PSF size, we used a computed refocusing distance z̄ ≈ 430 mm, an acquisition focus distance z_0 = 400 mm, and a (u, v) resolution of 0.175 mm. Figure 14(c), despite looking overall sharper than Fig. 14(b), still presents some blur. This is due to the fact that the PSFs were only theoretically computed, based on the estimated distances z̄ and z_0, and the measured b (which in turn determines a, and the (u, v) resolution). For higher reconstruction accuracy, a thorough calibration of the acquisition setup is required (section 2.5), including the experimental determination of the camera PSF.

figure: Fig. 15

Fig. 15 Performance comparison of different refocusing approaches, at different levels of up-scaling (super-resolution) for the Flower case. The first row shows reconstructions at up-sampling = 1, and the second row at up-sampling = 2. In the columns, instead, we have the different approaches: (a) Expected reconstruction for an ideal camera with infinite-bandwidth optical system response; (b) Back-projection; (c) SIRT without PSF; (d) Chambolle-Pock with PSF, l_2 data-divergence term, SWT regularization, and λ = 5.


In Fig. 15, we nowadays the reconstruction of the FLF information from Fig. 8. Similarly to what was done for the Messages test example, we used the following: back-projection, SIRT without PSF and CP-LS-SWT with PSF. The notable departure from the Letters case is the option of the SWT regularization term, over the Television term. Equally previously mentioned, natural imagery by and large breaks the assumption of flat and sharp regions in the reconstructed images.

In Fig. 15(a), we can see the expected reconstruction with an ideal camera (with infinite bandwidth). However, in practice, as stated in [16], the acquired light-field is convolved with a low pass filter, which has the form of a 4D sinc function. This results in lower than ideal performance for each reconstruction method, but an increasing similarity to Fig. 15(a) indicates increasing reconstruction quality. Moreover, the images in Fig. 15(a) were obtained by downscaling the high resolution image from Fig. 8(a), and not by taking a picture of a printed version, as was done for Figs. 8(b) and 8(c). As a result, especially the higher resolution version represents a purely ideal case, because it was not upsampled from its lower resolution version.

Despite requiring a more thorough calibration of the images and PSF, also in this case the proposed methods improve on the back-projection approach. In particular, the super-resolution reconstruction using the theoretical PSF and SWT regularization allows visibly recovering more of the expected features than the other considered approaches.

4. Conclusions and outlook

The proposed tomography based light-field refocusing method is defined in direct space, as opposed to previously proposed Fourier space tomography methods. We have shown that this formulation is applicable to both the ULF and FLF camera configurations, and that it allows both easily introducing advanced forward operator modeling and advanced regularization techniques. This is due to the fact that it reduces the different camera geometries to minor differences in the projection implementation, resulting in a unified refocusing strategy for all the available plenoptic acquisition schemes. One of the main results of this aspect is that it provides greater and wider applicability for future developments in the field. As an example, we showed that the introduction of regularization terms in the refocusing reconstruction could be applied to both the ULF and FLF cases.

The TV and SWT regularization techniques also proved to be powerful tools for reducing the artifacts due to super-resolution reconstructions and PSF corrections. Newer and more advanced regularizations could help to further improve the reconstruction quality and push towards higher super-resolution refocusing.

The proposed approach will allow, in the future, the introduction of the optical system's aberration modeling in the forward projection operator, in the same fashion as for the PSF. This would result in yet higher refocusing accuracy and super-resolution. We believe that further effort could be devoted to determining the theoretically best achievable super-resolution for the light-field data and geometry at hand, once the PSF and aberrations have been taken into account. On another note, this article further confirms that computational improvements can help to minimize the impact of physical limitations imposed by the manufacturing of smaller optics and detectors with smaller pixel sizes.

Finally, we showed that the proposed method allows tuning the refocusing problem based on the available prior knowledge, but also that multiple advanced algorithms could be used to solve these problems. This means that the newest developments in reconstruction algorithms could be ported to light-field refocusing from other fields like computed tomography, alongside the newest developments in convex optimization and regularization.

Appendix A

SIRT and CP-LS-TV refocusing implementations

Here we first report the operations needed to implement a SIRT algorithm that solves Eq. (27). Each iteration step can be computed as follows:

(30)

\[
\begin{aligned}
x_z^{(k+1)} &= \mathrm{pos}\!\left( x_z^{(k)} + D_2\, \tilde{A}_z^T D_1 \left( b - \tilde{A}_z x_z^{(k)} \right) \right) \\
\tilde{A}_z &= P_z U_n A_z; \qquad \tilde{A}_z^T = A_z^T U_n^T P_z^T \\
D_1 &= \mathrm{diag}\!\left( \frac{1}{|\tilde{A}_z|\,\mathbf{1}} \right); \qquad
D_2 = \mathrm{diag}\!\left( \frac{1}{|\tilde{A}_z^T|\,\mathbf{1}} \right)
\end{aligned}
\]

where the function pos(·) clips all negative values to 0, k indicates the iteration step, 1 denotes vectors of ones of the same sizes as the vectors x and b respectively, and the matrices D_1 and D_2 are diagonal matrices used to rescale the modified forward and backward operators Ã_z and Ã_z^T respectively.
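As an illustration of Eq. (30), the following sketch shows one possible implementation of the SIRT loop. It is a minimal example under stated assumptions, not the implementation used in this work: the hypothetical callables `A_fwd` and `A_bwd` stand in for the modified forward and backward projectors Ã_z and Ã_z^T (for instance, matrix-free GPU projectors such as those of [33]), and the diagonals of D_1 and D_2 are estimated by projecting all-ones vectors, which is valid when the projector coefficients are non-negative.

```python
import numpy as np

def sirt_refocus(A_fwd, A_bwd, b, n_iter=100):
    """Minimal SIRT sketch for Eq. (30).

    A_fwd(x) and A_bwd(y) are assumed to apply the modified forward and
    backward projectors (called A~_z and A~_z^T in the text); b is the
    vectorized light-field data. Illustrative only, not the authors' code.
    """
    x = np.zeros_like(A_bwd(b))

    # Row/column sums of |A~_z| obtained by projecting all-ones vectors
    # (valid because the projection coefficients are non-negative).
    d1 = 1.0 / np.maximum(A_fwd(np.ones_like(x)), 1e-8)   # diagonal of D_1
    d2 = 1.0 / np.maximum(A_bwd(np.ones_like(b)), 1e-8)   # diagonal of D_2

    for _ in range(n_iter):
        residual = b - A_fwd(x)
        x = x + d2 * A_bwd(d1 * residual)   # rescaled SIRT update
        x = np.maximum(x, 0.0)              # pos(.) non-negativity clip
    return x
```

The small clipping constant only guards against division by zero for rays or voxels that are never sampled; in practice the two projectors would be supplied by a tomography toolbox.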

For the solution of Eq. (28), we use the Chambolle-Pock (CP) family of algorithms, which are called primal-dual algorithms because they solve two conjugate problems at the same time: the primal, defined by Eq. (28), and its dual, which is a maximization problem. If the solution exists and is unique, it will be the same for both problems once the algorithm has reached convergence [27, 37]. We report here a preconditioned version of the CP algorithm with an l2 data divergence term (LS) and TV penalization:

(31)

\[
\begin{aligned}
& x_z^{(0)} = 0,\quad \bar{x}^{(0)} = 0,\quad p_d^{(0)} = 0,\quad p_{tv}^{(0)} = 0,\quad
D_2 = \mathrm{diag}\!\left( \frac{1}{|\tilde{A}_z^T|\,\mathbf{1} + 4\lambda} \right) \\
& \text{for } l := [0, L) \\
&\quad p_d^{(l+1)} = \frac{p_d^{(l)} + D_1 \left( \tilde{A}_z \bar{x}^{(l)} - b \right)}{\mathrm{diag}(\mathbf{1}) + D_1} \\
&\quad p_{tv}^{(l+1)} = \frac{p_{tv}^{(l)} + \tfrac{1}{2}\, \nabla \bar{x}^{(l)}}{\max\!\left( 1,\, \left| p_{tv}^{(l)} + \tfrac{1}{2}\, \nabla \bar{x}^{(l)} \right| \right)} \\
&\quad x_z^{(l+1)} = \mathrm{pos}\!\left( x^{(l)} - D_2\, \tilde{A}_z^T p_d^{(l+1)} + \lambda D_2\, \mathrm{div}\!\left( p_{tv}^{(l+1)} \right) \right) \\
&\quad \bar{x}^{(l+1)} = x^{(l+1)} + \left( x^{(l+1)} - x^{(l)} \right)
\end{aligned}
\]

where p_d and p_tv are respectively the detector and gradient terms of the dual problem, the function div(·) computes the divergence of its input, the matrices Ã_z, Ã_z^T and D_1 are the same as for the SIRT algorithm, the matrix D_2 has been adapted to also contain the TV term rescaling, and x̄ is a modified solution x_z that helps improve the convergence rate of the algorithm [27].
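A corresponding sketch of the preconditioned CP-LS-TV loop in Eq. (31) is given below, under the same assumptions as the SIRT sketch: `A_fwd` and `A_bwd` are hypothetical matrix-free projectors, the refocused slice is treated as a 2D image, and the forward-difference gradient and divergence operators (`grad2d`, `div2d`) are illustrative choices rather than the exact discretization used in this work.

```python
import numpy as np

def grad2d(u):
    """Forward-difference gradient, returned as a (2, H, W) array."""
    g = np.zeros((2,) + u.shape)
    g[0, :-1, :] = u[1:, :] - u[:-1, :]
    g[1, :, :-1] = u[:, 1:] - u[:, :-1]
    return g

def div2d(p):
    """Divergence, the negative adjoint of grad2d."""
    d = np.zeros(p.shape[1:])
    d[:-1, :] += p[0, :-1, :]; d[1:, :] -= p[0, :-1, :]
    d[:, :-1] += p[1, :, :-1]; d[:, 1:] -= p[1, :, :-1]
    return d

def cp_ls_tv(A_fwd, A_bwd, b, img_shape, lam=0.1, n_iter=200):
    """Illustrative preconditioned Chambolle-Pock loop for Eq. (31)."""
    x = np.zeros(img_shape)
    x_bar = np.zeros(img_shape)
    p_d = np.zeros_like(b)               # dual variable of the data (LS) term
    p_tv = np.zeros((2,) + img_shape)    # dual variable of the TV term

    # D_1 as in the SIRT sketch; D_2 also contains the TV rescaling (+ 4*lambda).
    d1 = 1.0 / np.maximum(A_fwd(np.ones(img_shape)), 1e-8)
    d2 = 1.0 / np.maximum(A_bwd(np.ones_like(b)) + 4.0 * lam, 1e-8)

    for _ in range(n_iter):
        # dual update for the l2 data term
        p_d = (p_d + d1 * (A_fwd(x_bar) - b)) / (1.0 + d1)
        # dual update for the TV term, with pixel-wise projection onto the unit ball
        q = p_tv + 0.5 * grad2d(x_bar)
        p_tv = q / np.maximum(1.0, np.sqrt((q ** 2).sum(axis=0)))
        # primal update with non-negativity constraint
        x_new = np.maximum(x - d2 * A_bwd(p_d) + lam * d2 * div2d(p_tv), 0.0)
        # over-relaxation of the primal variable
        x_bar = x_new + (x_new - x)
        x = x_new
    return x
```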

Funding

European Union's Horizon 2020 research and innovation programme (VOXEL H2020-FETOPEN-2014-2015-RIA GA 665207).

References

1. R. Ng, "Digital calorie-free field photography," Ph.D. thesis, Stanford University (2006).

two. E. Y. Lam, "Computational photography with plenoptic camera and calorie-free field capture: tutorial," Periodical of the Optical Society of America A 32, 2021–2032 (2015). [CrossRef]

iii. M. West. Tao, South. Hadap, J. Malik, and R. Ramamoorthi, "Depth from combining defocus and correspondence using light-field cameras," in), Proceedings of the IEEE International Conference on Calculator Vision, (IEEE, 2013), pp. 673–680.

4. T. E. Bishop, S. Zanetti, and P. Favaro, "Light field super resolution," in), 2009 IEEE International Conference on Computational Photography (ICCP), (IEEE, 2009), pp. 1–9.

five. S. Wanner and B. Goldluecke, "Spatial and angular variational super-resolution of 4D calorie-free fields," Lecture Notes in Calculator Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 7576 LNCS, 608–621 (2012).

6. Thou. Marwah, G. Wetzstein, Y. Bando, and R. Raskar, "Compressive light field photography using overcomplete dictionaries and optimized projections," ACM Transactions on Graphics 32, ane (2013). [CrossRef]

7. Y. Wang, Grand. Hou, Z. Sun, Z. Wang, and T. Tan, "A simple and robust super resolution method for calorie-free field images," in 2016 IEEE International Conference on Image Processing (ICIP), (IEEE, 2016), pp. 1459–1463. [CrossRef]

8. Y. Yoon, H.-G. Jeon, D. Yoo, J.-Y. Lee, and I. S. Kweon, "Low-cal-Field Image Super-Resolution Using Convolutional Neural Network," IEEE Signal Processing Messages 24, 848–852 (2017). [CrossRef]

ix. M. Rossi and P. Frossard, "Geometry-Consequent Low-cal Field Super-Resolution via Graph-Based Regularization," IEEE Transactions on Image Processing 27, 4207–4218 (2018). [CrossRef] [PubMed]

10. West. Due south. Chan, E. Y. Lam, M. K. Ng, and G. Y. Mak, "Super-resolution reconstruction in a computational compound-center imaging arrangement," Multidimensional Systems and Indicate Processing xviii, 83–101 (2007). [CrossRef]

11. M. Broxton, L. Grosenick, Southward. Yang, N. Cohen, A. Andalman, K. Deisseroth, and M. Levoy, "Wave eyes theory and iii-D deconvolution for the lite field microscope," Eyes Express 21, 25418–25439 (2013). [CrossRef] [PubMed]

12. Due south. A. Shroff and Thou. Berkner, "Image formation assay and high resolution image reconstruction for plenoptic imaging systems," Applied Eyes 52, D22 (2013). [CrossRef] [PubMed]

13. A. Lumsdaine and T. Georgiev, "The focused plenoptic photographic camera," in), IEEE International Conference on Computational Photography (ICCP), (IEEE, 2009), pp. ane–viii.

14. T. G. Georgiev and A. Lumsdaine, "Focused plenoptic camera and rendering," Periodical of Electronic Imaging 19, 021106 (2010). [CrossRef]

fifteen. R. Ng, M. Levoy, G. Duval, M. Horowitz, and P. Hanrahan, "Calorie-free field photography with a hand-held plenoptic camera," Stanford UniversityTechnical Written report CSTR2005– (2005).

16. R. Ng, "Fourier slice photography," ACM Transactions on Graphics 24, 735 (2005). [CrossRef]

17. D. K. Dansereau, O. Pizarro, and South. B. Williams, "Linear Volumetric Focus for Light Field Cameras," ACM Transactions on Graphics 34, one–20 (2015). [CrossRef]

18. T. Georgiev, G. Chunev, and A. Lumsdaine, "Superresolution with the focused plenoptic photographic camera," in Computational Imaging IX,C. A. Bouman, I. Pollak, and P. J. Wolfe, eds. (SPIE, 2011). [CrossRef]

19. C. Herzog, O. de La Rochefoucauld, M. Dovillaire, Ten. Granier, F. Harms, Ten. Levecq, E. Longo, L. Mignard-Debise, and P. Zeitoun, "Comparison of reconstruction approaches for plenoptic imaging systems," in Unconventional Optical Imaging, C. Fournier, M. P. Georges, and G. Popescu, eds. (SPIE, 2018), May, p. 104. [CrossRef]

xx. N. Viganò, H. Der Sarkissian, C. Herzog, O. de la Rochefoucauld, R. van Liere, and 1000. J. Batenburg, "Tomographic approach for the quantitative scene reconstruction from calorie-free field images," Optics Express 26, 22574 (2018). [CrossRef] [PubMed]

21. Thousand. Lang, H. Guo, J. E. Odegard, C. S. Burrus, and R. O. Wells Jr., "Racket reduction using an undecimated discrete wavelet transform," IEEE Point Processing Letters three, 10–12 (1996). [CrossRef]

22. A. Thou. Louis, P. Maass, and A. Rieder, Wavelets: Theory and Applications, Pure and Applied Mathematics(Wiley, 1997).

23. S. Bakery and T. Kanade, "Limits on super-resolution and how to pause them," IEEE Transactions on Pattern Analysis and Machine Intelligence 24, 1167–1183 (2002). [CrossRef]

24. A. C. Kak and Thousand. Slaney, Principles of Computerized Tomographic Imaging (IEEE, 1988).

25. A. V. der Sluis and H. V. der Vorst, "SIRT-and CG-type methods for the iterative solution of sparse linear least-squares problems," Linear Algebra and its Applications 130, 257–303 (1990).

26. A. Chambolle and T. Pock, "A first-guild primal-dual algorithm for convex problems with applications to imaging," Journal of Mathematical Imaging and Vision 40, 120–145 (2010). [CrossRef]

27. Due east. Y. Sidky, J. H. Jørgensen, and X. Pan, "Convex optimization trouble prototyping for image reconstruction in computed tomography with the Chambolle-Pock algorithm," Physics in Medicine and Biology 57, 3065–3091 (2012). [CrossRef] [PubMed]

28. S. Zhu, A. Lai, 1000. Eaton, P. Jin, and L. Gao, "On the fundamental comparison betwixt unfocused and focused light field cameras," Applied Eyes 57, A1 (2018). [CrossRef] [PubMed]

29. T. M. Buzug, Computed Tomography(Springer-Verlag, 2008).

30. M. Martínez-Corral and B. Javidi, "Fundamentals of 3d imaging and displays: a tutorial on integral imaging, light-field, and plenoptic systems," Advances in Optics and Photonics 10, 512 (2018). [CrossRef]

31. One thousand. H. Golub and C. F. Van Loan, Matrix Computations, vol. 28 (Johns Hopkins University, 1996).

32. J. West. Goodman, Introduction to Fourier Optics, Electrical and Figurer Engineering: Communications and Betoken Processing (McGraw-Hill, 1996), 2nd ed.

33. W. J. Palenstijn, Grand. J. Batenburg, and J. Sijbers, "Performance improvements for iterative electron tomography reconstruction using graphics processing units (GPUs)," Periodical of Structural Biological science 176, 250–253 (2011). [CrossRef] [PubMed]

34. G. Kutyniok, Compressed Sensing(Cambridge University, 2012).

35. D. Chiliad. Dansereau, O. Pizarro, and S. B. Williams, "Decoding, Scale and Rectification for Lenselet-Based Plenoptic Cameras," in), 2013 IEEE Conference on Reckoner Vision and Design Recognition, (IEEE, 2013), pp. 1027–1034. [CrossRef]

36. A. Lumsdaine, T. Grand. Georgiev, and G. Chunev, "Spatial analysis of detached plenoptic sampling," Proceedings of SPIE 8299, 829909 (2012). [CrossRef]

37. Due south. Boyd and L. Vandenberghe, Convex Optimization(Cambridge University, 2004). [CrossRef]

0 Response to "Compressive Light Field Photography Using Overcomplete Dictionaries and Optimized Projections"

Post a Comment

Iklan Atas Artikel

Iklan Tengah Artikel 1

Iklan Tengah Artikel 2

Iklan Bawah Artikel