Study of B → πℓν and B → ρℓν Decays and Determination of | V ub |

We present an analysis of exclusive charmless semileptonic $B$-meson decays based on 377 million $B\bar{B}$ pairs recorded with the BABAR detector at the $\Upsilon(4S)$ resonance. We select four event samples corresponding to the decay modes $B^0 \to \pi^- \ell^+ \nu$, $B^+ \to \pi^0 \ell^+ \nu$, $B^0 \to \rho^- \ell^+ \nu$, and $B^+ \to \rho^0 \ell^+ \nu$, and find the measured branching fractions to be consistent with isospin symmetry. Assuming isospin symmetry, we combine the two $B \to \pi\ell\nu$ samples, and similarly the two $B \to \rho\ell\nu$ samples, and measure the branching fractions $\mathcal{B}(B^0 \to \pi^- \ell^+ \nu) = (1.41 \pm 0.05 \pm 0.07) \times 10^{-4}$ and $\mathcal{B}(B^0 \to \rho^- \ell^+ \nu) = (1.75 \pm 0.15 \pm 0.27) \times 10^{-4}$, where the errors are statistical and systematic. We compare the measured distribution in $q^2$, the momentum transfer squared, with predictions for the form factors from QCD calculations and determine the CKM matrix element $|V_{ub}|$. Based on the measured partial branching fraction for $B \to \pi\ell\nu$ in the range $q^2 < 12$ GeV$^2$ and the most recent LCSR calculations we obtain $|V_{ub}| = (3.78 \pm 0.13\,^{+0.55}_{-0.40}) \times 10^{-3}$, where the errors refer to the experimental and theoretical uncertainties. From a simultaneous fit to the data over the full $q^2$ range and the FNAL/MILC lattice QCD results, we obtain $|V_{ub}| = (2.95 \pm 0.31) \times 10^{-3}$ from $B \to \pi\ell\nu$, where the error is the combined experimental and theoretical uncertainty.


I. INTRODUCTION
The elements of the Cabibbo-Kobayashi-Maskawa (CKM) quark-mixing matrix are fundamental parameters of the Standard Model (SM) of electroweak interactions. With the increasingly precise measurements of decay-time-dependent CP asymmetries in B-meson decays, in particular sin(2β) [1,2], improved measurements of the magnitude of V ub and V cb will allow for more stringent experimental tests of the SM mechanism for CP violation [3]. This is best illustrated in terms of the unitarity triangle, the graphical representation of one of the unitarity conditions for the CKM matrix, for which the length of the side that is opposite to the angle β is proportional to the ratio |V ub |/|V cb |. The best method to determine |V ub | and |V cb | is to measure semileptonic decay rates for B → X c ℓν and B → X u ℓν (X c and X u refer to hadronic states with or without charm), which are proportional to |V cb | 2 and |V ub | 2 , respectively.
There are two methods to extract these two CKM elements from B decays, one based on inclusive and the other on exclusive semileptonic decays. Exclusive decays offer better kinematic constraints and thus more effective background suppression than inclusive decays, but the lower branching fractions result in lower event yields. Since the experimental and theoretical techniques for these two approaches are different and largely independent, they can provide important cross checks of our understanding of the theory and the measurements. An overview of the determination of |V ub | using both approaches can be found in a recent review [4].
In this paper, we present a study of four exclusive charmless semileptonic decay modes, B 0 → π − ℓ + ν, B + → π 0 ℓ + ν, B 0 → ρ − ℓ + ν, and B + → ρ 0 ℓ + ν [5], and a determination of |V ub |. Here ℓ refers to a charged lepton, either e + or µ + , and ν refers to a neutrino, either ν e or ν µ . This analysis represents an update of an earlier measurement [6] that was based on a significantly smaller data set. For the current analysis, the signal yields and background suppression have been improved and the systematic uncertainties have been reduced through the use of improved reconstruction and signal extraction methods, combined with more detailed background studies.
The principal experimental challenge is the separation of the B → X u ℓν from the dominant B → X c ℓν decays, for which the inclusive branching fraction is a factor of 50 larger. Furthermore, the isolation of individual exclusive charmless decays from all other B → X u ℓν decays is difficult, because the exclusive branching ratios are typically only 10% of B(B → X u ℓν) = (2.29 ± 0.34) × 10 −3 [7], the inclusive branching fraction for charmless semileptonic B decays.
The reconstruction of signal decays in e + e − → Υ (4S) → BB events requires the identification of three types of particles, the hadronic state X u producing one or two charged and/or neutral final state pions, the charged lepton, and the neutrino. The presence of the neutrino is inferred from the missing momentum and energy in the whole event.
The event yields for each of the four signal decay modes are extracted from a binned maximum-likelihood fit to the three-dimensional distributions of the variables m ES , the energy-substituted B-meson mass, ∆E, the difference between the reconstructed and the expected B-meson energy, and q 2 , the momentum transfer squared from the B meson to the final-state hadron. The measured differential decay rates in combination with recent form-factor calculations are used to determine |V ub |. By measuring both B → πℓν and B → ρℓν decays simultaneously, we reduce the sensitivity to the cross feed between these two decay modes and some of the background contributions.
The most promising decay mode for a precise determination of |V ub |, both experimentally and theoretically, is the B → πℓν decay for which a number of measurements exist. The first measurement of this type was performed by the CLEO Collaboration [8]. In addition to the earlier BABAR measurement mentioned above [6], there is a more recent BABAR measurement [9] in which somewhat looser criteria on the neutrino selection were applied, resulting in a larger signal sample but also substantially higher backgrounds. These analyses also rely on the measurement of the missing energy and momentum of the whole event to reconstruct the neutrino, without explicitly reconstructing the second B-meson decay in the event, but are based on smaller data sets than the one presented here. Recently a number of measurements of both B → πℓν and B → ρℓν decays were published, in which the BB events were tagged by a fully reconstructed hadronic or semileptonic decay of the second B meson in the event [10,11]. These analyses have led to a simpler and more precise reconstruction of the neutrino and very low backgrounds. However, this is achieved at the expense of much smaller signal samples, which limit the statistical precision of the form-factor measurement.

A. Overview
The advantage of charmless semileptonic decays over charmless hadronic decays of the B meson is that the leptonic and hadronic components of the matrix element factorize. The hadronic matrix element is difficult to calculate, since it must take into account physical mesons, rather than free quarks. Therefore higher-order perturbative corrections and non-perturbative long-distance hadronization processes cannot be ignored. To overcome these difficulties, a set of Lorentz-invariant form factors has been introduced that give a global description of these QCD processes.
A variety of theoretical predictions for these form factors exist. They are based on QCD calculations, such as lattice QCD and sum rules, in addition to quark models. We will make use of a variety of these calculations to assess their impact on the determination of |V ub | from measurements of the decay rates.
The $V-A$ structure of the hadronic current is invoked, along with the knowledge of the transformation properties of the final-state meson, to formulate these form factors. They are functions of $q^2 = m_W^2$, the mass squared of the virtual $W$,
$$ q^2 = m_W^2 = (P_\ell + P_\nu)^2 = (P_B - P_X)^2 = M_B^2 + m_X^2 - 2 M_B E_X . $$
Here $P_\ell$ and $P_\nu$ refer to the four-momenta of the charged lepton and the neutrino, $M_B$ and $P_B$ to the mass and the four-momentum of the $B$ meson, and $m_X$ and $E_X$ are the mass and energy (in the $B$-meson rest frame) of the final-state meson $X_u$. We distinguish two main categories of exclusive semileptonic decays: decays to pseudoscalar mesons, $B \to \pi\ell\nu$ or $B^+ \to \eta\ell^+\nu$, and decays to vector mesons, $B \to \rho\ell\nu$ or $B^+ \to \omega\ell^+\nu$. Figure 1 shows the phase space for $B \to \pi\ell\nu$ and $B \to \rho\ell\nu$ decays in terms of $q^2$ and $E_\ell$, the energy of the charged lepton in the $B$-meson rest frame. The difference between the distributions is due to the different spin structure of the decays.

B Decays to Pseudoscalar Mesons: B → πℓν
For decays to a final-state pseudoscalar meson, the hadronic matrix element is usually written in terms of two form factors, $f_+(q^2)$ and $f_0(q^2)$ [12,13],
$$ \langle \pi(P_\pi) | \bar{u}\gamma^\mu b | B(P_B) \rangle = f_+(q^2) \left[ P_B^\mu + P_\pi^\mu - \frac{M_B^2 - m_\pi^2}{q^2}\, q^\mu \right] + f_0(q^2)\, \frac{M_B^2 - m_\pi^2}{q^2}\, q^\mu , $$
where $P_\pi$ and $P_B$ are the four-momenta of the final-state pion and the parent $B$ meson, and $m_\pi$ and $M_B$ are their masses. This expression can be simplified for leptons with small masses, such as electrons and muons, because in the limit of $m_\ell \ll M_B$ the second term can be neglected. We are left with a single form factor $f_+(q^2)$ and the differential decay rate becomes
$$ \frac{d\Gamma(B^0 \to \pi^- \ell^+ \nu)}{dq^2\, d\cos\theta_{W\ell}} = |V_{ub}|^2\, \frac{G_F^2\, p_\pi^3}{32\pi^3}\, \sin^2\theta_{W\ell}\, |f_+(q^2)|^2 , $$
where $G_F$ is the Fermi constant, $p_\pi$ is the momentum of the pion in the rest frame of the $B$ meson, and $q^2$ varies from zero to $q^2_{\max} = (M_B - m_\pi)^2$. The decay rate depends on the third power of the pion momentum, suppressing the rate at high $q^2$. The rate also depends on $\sin^2\theta_{W\ell}$, where $\theta_{W\ell}$ is the angle of the charged-lepton momentum in the $W$ rest frame with respect to the direction of the $W$ boost from the $B$ rest frame. The combination of these two factors leads to a lepton-momentum spectrum that is peaked well below the kinematic limit (see Figure 1).
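The kinematic statements above (the range of $q^2$ and the $p_\pi^3$ suppression at high $q^2$) can be checked with a short numerical sketch. It assumes the two-body relation $q^2 = M_B^2 + m_X^2 - 2 M_B E_X$ in the $B$ rest frame; the mass values and the function names are illustrative choices, not taken from the analysis.

```python
import math

# Approximate masses in GeV (assumed values for illustration).
M_B = 5.280
M_PI = 0.13957

def q2_from_energy(e_x, m_x=M_PI, m_b=M_B):
    """q^2 = M_B^2 + m_X^2 - 2 M_B E_X, with E_X measured in the B rest frame."""
    return m_b**2 + m_x**2 - 2.0 * m_b * e_x

def q2_max(m_x=M_PI, m_b=M_B):
    """Largest q^2, reached when the hadron is at rest in the B rest frame."""
    return (m_b - m_x) ** 2

def hadron_momentum(q2, m_x=M_PI, m_b=M_B):
    """Hadron momentum in the B rest frame for a given q^2."""
    e_x = (m_b**2 + m_x**2 - q2) / (2.0 * m_b)
    return math.sqrt(max(e_x**2 - m_x**2, 0.0))

def rate_shape(q2, f_plus):
    """Unnormalized dGamma/dq^2, proportional to p_pi^3 |f_+(q^2)|^2."""
    return hadron_momentum(q2) ** 3 * abs(f_plus(q2)) ** 2

if __name__ == "__main__":
    print("q2_max [GeV^2]:", q2_max())
    for q2 in (0.0, 8.0, 16.0, 24.0):
        print(q2, hadron_momentum(q2), rate_shape(q2, lambda _: 1.0))
```

With a constant form factor the printed shape falls steeply toward $q^2_{\max}$, which illustrates the phase-space suppression alone; any realistic $f_+(q^2)$ rises with $q^2$ and partly compensates it.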

B Decays to Vector Mesons: B → ρℓν
For decays with a vector meson in the final state, the polarization vector $\epsilon$ of the vector meson plays an important role. The hadronic current is written in terms of four form factors, of which only three ($A_i$ with $i = 0, 1, 2$) are independent [12,13]. Here $m_\rho$ and $P_\rho$ refer to the vector-meson mass and four-momentum. Again, a simplification can be made for low-mass charged leptons. The term with $q^\mu$ can be neglected, so there are effectively only three form factors for electrons and muons: the axial-vector form factors, $A_1(q^2)$ and $A_2(q^2)$, and the vector form factor, $V(q^2)$. Instead of using these form factors, the full differential decay rate is usually expressed in terms of the helicity amplitudes corresponding to the three helicity states of the $\rho$ meson,
$$ H_\pm(q^2) = (M_B + m_\rho)\, A_1(q^2) \mp \frac{2 M_B\, p_\rho}{M_B + m_\rho}\, V(q^2) , $$
$$ H_0(q^2) = \frac{M_B + m_\rho}{2 m_\rho \sqrt{q^2}} \left[ (M_B^2 - m_\rho^2 - q^2)\, A_1(q^2) - \frac{4 M_B^2\, p_\rho^2}{(M_B + m_\rho)^2}\, A_2(q^2) \right] , $$
where $p_\rho$ is the momentum of the final-state $\rho$ meson in the $B$ rest frame. While $A_1$ dominates the three helicity amplitudes, $A_2$ contributes only to $H_0$, and $V$ contributes only to $H_\pm$. Thus the differential decay rate can be written as
$$ \frac{d\Gamma(B \to \rho\ell\nu)}{dq^2\, d\cos\theta_{W\ell}} = |V_{ub}|^2\, \frac{G_F^2\, p_\rho\, q^2}{128\pi^3 M_B^2} \left[ \sin^2\theta_{W\ell}\, |H_0|^2 + (1 - \cos\theta_{W\ell})^2\, \frac{|H_+|^2}{2} + (1 + \cos\theta_{W\ell})^2\, \frac{|H_-|^2}{2} \right] . $$
The $V-A$ nature of the charged weak current leads to a dominant contribution from $H_-$ and a distribution of events characterized by a forward peak in $\cos\theta_{W\ell}$ and high lepton momenta (see Figure 1).

C. Form-Factor Calculations and Models
The q 2 dependence of the form factors can be extracted from the data. Since the differential decay rates are proportional to the product of |V ub | 2 and the form-factor terms, we need at least one point in q 2 at which the form factor is predicted in order to extract |V ub | from the measured branching fractions.
These calculations will also be used to simulate the kinematics of the signal decay modes and thus might impact the detection efficiency and thereby the branching-fraction measurement. The two QCD calculations result in predictions for different regions of phase space. The lattice calculations are only available in the high-q 2 region, while light-cone sum rules (LCSR) provide information near q 2 = 0. Interpolations between these two regions can be constrained by unitarity and analyticity requirements [24,25]. Figure 2 shows the q 2 distributions for B → πℓν and B → ρℓν decays for various form-factor calculations. The uncertainties in these predictions are not indicated. For B → πℓν decays they are largest at low q 2 for lattice QCD (LQCD) predictions and largest at high q 2 for LCSR calculations. Estimates of the uncertainties of the calculations are currently not available for B → ρℓν decays.
Figure 2: q 2 distributions for B → πℓν and B → ρℓν decays, based on form-factor predictions from the ISGW2 model [14], LCSR calculations (LCSR 1 [15] and LCSR 2 [19] for B → πℓν and LCSR [17] for B → ρℓν) and the HPQCD [23] lattice calculation. The extrapolations of the QCD predictions to the full q 2 range are marked as dashed lines.
The Isgur-Scora-Grinstein-Wise model (ISGW2) [14] is a constituent quark model with relativistic corrections. Predictions extend over the full $q^2$ range; they are normalized at $q^2 \approx q^2_{\max}$. The form factors are parameterized as
$$ f(q^2) = f(q^2_{\max}) \left( 1 + \frac{1}{6N}\, \xi^2\, (q^2_{\max} - q^2) \right)^{-N} , $$
where $\xi$ is the charge radius of the final-state meson, and $N = 2$ ($N = 3$) for decays to pseudoscalar (vector) mesons. The uncertainties of the predictions by this model are difficult to quantify.
QCD light-cone sum-rule calculations are nonperturbative and combine the idea of QCD sum rules with twist expansions performed to $O(\alpha_s)$. These calculations provide estimates of various form factors at low to intermediate $q^2$, for both pseudoscalar and vector decays. The overall normalization is predicted at low $q^2$ with typical uncertainties of 10-13% [15,17].
Lattice QCD calculations can potentially provide heavy-to-light-quark form factors from first principles. Unquenched lattice calculations, in which quark-loop effects in the QCD vacuum are incorporated, are now available for the B → πℓν form factors from the Fermilab/MILC [22] and the HPQCD [23] Collaborations. Both calculations account for three dynamical quark flavors, the mass-degenerate u and d quarks and a heavier s quark, but they differ in the way the b quark is simulated. Predictions for f 0 (q 2 ) and f + (q 2 ) are shown in Figure 3. The two lattice calculations agree within the stated uncertainties, which are significantly smaller than those of earlier quenched approximations.
Figure 3: Predictions for f 0 (q 2 ) and f + (q 2 ) from the Fermilab/MILC [22] and HPQCD [23] Collaborations (data points with combined statistical and systematic errors) and LCSR calculations [15] (solid black lines). The dashed lines indicate the extrapolations of the LCSR predictions to q 2 > 16 GeV 2 .

D. Form-Factor Parameterizations
Neither the lattice nor the LCSR QCD calculations predict the form factors over the full q 2 range. Lattice calculations are restricted to small hadron momenta, i.e., to q 2 ≥ q 2 max /2, while LCSR work best at small q 2 . If the q 2 spectrum is well measured, the shape of the form factors can be constrained, and the QCD calculations provide the normalization necessary to determine |V ub |.
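A number of parameterizations of the pseudoscalar form factor $f_+(q^2)$ are available in the literature. The following four parameterizations are commonly used. All of them include at least one pole term at $q^2 = m_{B^*}^2$, with $m_{B^*} = 5.325$ GeV $< M_B + m_\pi$.
1. Becirevic-Kaidalov (BK) [26]:
$$ f_+(q^2) = \frac{f_+(0)}{(1 - q^2/m_{B^*}^2)\,(1 - \alpha_{BK}\, q^2/m_{B^*}^2)} , $$
where $f_+(0)$ sets the normalization and $\alpha_{BK}$ defines the shape.
2. Ball-Zwicky (BZ) [15,16]:
$$ f_+(q^2) = f_+(0) \left[ \frac{1}{1 - q^2/m_{B^*}^2} + \frac{r_{BZ}\, q^2/m_{B^*}^2}{(1 - q^2/m_{B^*}^2)\,(1 - \alpha_{BZ}\, q^2/m_{B^*}^2)} \right] , $$
where $f_+(0)$ is the normalization, and $\alpha_{BZ}$ and $r_{BZ}$ determine the shape. This is an extension of the BK ansatz, related by the simplification $\alpha_{BK} = \alpha_{BZ} = r_{BZ}$. This ansatz was used to extend the LCSR predictions to higher $q^2$, as shown in Figure 3.
3. Boyd, Grinstein, Lebed (BGL) [24,25]:
$$ f_+(q^2) = \frac{1}{P(q^2)\, \phi(q^2, q_0^2)} \sum_{k=0}^{k_{\max}} a_k(q_0^2)\, [z(q^2, q_0^2)]^k , $$
$$ z(q^2, q_0^2) = \frac{\sqrt{m_+^2 - q^2} - \sqrt{m_+^2 - q_0^2}}{\sqrt{m_+^2 - q^2} + \sqrt{m_+^2 - q_0^2}} , \qquad (12) $$
where $m_\pm = M_B \pm m_\pi$ and $q_0^2$ is a free parameter [27]. The so-called Blaschke factor $P(q^2) = z(q^2, m_{B^*}^2)$ accounts for the pole at $q^2 = m_{B^*}^2$, and $\phi(q^2, q_0^2)$ is an arbitrary analytic function [28] whose choice only affects the particular values of the series coefficients $a_k$. In this expansion in the variable $z$, the shape is given by the values of $a_k$, with truncation at $k_{\max} = 2$ or 3. The expansion parameters are constrained by unitarity, $\sum_k a_k^2 \le 1$. Becher and Hill [25] have pointed out that due to the large $b$-quark mass, this bound is far from being saturated. Assuming that the ratio $\Lambda/m_b$ is less than 0.1, the heavy-quark bound is approximately 30 times more constraining than the bound from unitarity alone, $\sum_k a_k^2 \sim (\Lambda/m_b)^3 \approx 0.001$. For more details we refer to the literature [24,25].
4. Bourrely, Caprini, Lellouch (BCL) [29]:
$$ f_+(q^2) = \frac{1}{1 - q^2/m_{B^*}^2} \sum_{k=0}^{k_{\max}} b_k(q_0^2) \left\{ [z(q^2, q_0^2)]^k - (-1)^{k - k_{\max} - 1}\, \frac{k}{k_{\max} + 1}\, [z(q^2, q_0^2)]^{k_{\max} + 1} \right\} , $$
where the variable $z$ is defined as in Eq. 12 with free parameter $q_0^2$ [27]. In this expansion the shape is given by the values of $b_k$, with truncation at $k_{\max} = 2$ or 3. The BCL parameterization exhibits the QCD scaling behavior $f_+(q^2) \propto 1/q^2$ at large $q^2$.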
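The relation between the pole-type parameterizations, and the role of the variable $z$, can be illustrated numerically. The sketch below assumes the standard literature forms of the BK, BZ and BCL parameterizations; the normalization `f0 = 0.26` and the BCL coefficient are arbitrary illustrative numbers, the masses are approximate, and the function names are hypothetical. It is not the fit code used in the analysis.

```python
import math

# Approximate masses in GeV; M_BSTAR is the pole mass quoted in the text.
M_B = 5.280
M_PI = 0.13957
M_BSTAR = 5.325

def f_plus_bk(q2, f0, alpha):
    """Becirevic-Kaidalov form: B* pole times an effective second pole."""
    x = q2 / M_BSTAR**2
    return f0 / ((1.0 - x) * (1.0 - alpha * x))

def f_plus_bz(q2, f0, alpha, r):
    """Ball-Zwicky form; it reduces to BK when r == alpha."""
    x = q2 / M_BSTAR**2
    return f0 * (1.0 / (1.0 - x) + r * x / ((1.0 - x) * (1.0 - alpha * x)))

def z_variable(q2, q2_0):
    """Conformal variable z(q^2, q0^2) with m_+ = M_B + m_pi."""
    m_plus2 = (M_B + M_PI) ** 2
    a = math.sqrt(m_plus2 - q2)
    b = math.sqrt(m_plus2 - q2_0)
    return (a - b) / (a + b)

def f_plus_bcl(q2, b_coeffs, q2_0):
    """BCL series with coefficients b_0..b_kmax, so K = kmax + 1 terms."""
    big_k = len(b_coeffs)
    z = z_variable(q2, q2_0)
    series = sum(
        b * (z**k - (-1) ** (k - big_k) * (k / big_k) * z**big_k)
        for k, b in enumerate(b_coeffs)
    )
    return series / (1.0 - q2 / M_BSTAR**2)

if __name__ == "__main__":
    for q2 in (0.0, 10.0, 20.0):
        bk = f_plus_bk(q2, 0.26, 0.52)
        bz = f_plus_bz(q2, 0.26, 0.52, 0.52)
        print(q2, bk, bz, z_variable(q2, 0.0))
```

Setting `r == alpha` in the BZ function reproduces the BK values, which is the simplification $\alpha_{BK} = \alpha_{BZ} = r_{BZ}$ mentioned in the list. The printed $z$ values stay well inside the unit circle over the physical $q^2$ range, which is why a truncation at low order can be adequate.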
The BK and BZ parameterizations are intuitive and have few free parameters. Fits to the previous BABAR form-factor measurements using these parameterizations have shown that they describe the data quite well [9]. The BGL and BCL parameterizations are based on fundamental theoretical concepts like analyticity and unitarity. The z-expansion avoids ad hoc assumptions about the number of poles and pole masses, and it can be adapted to the precision of the data.
The data used in this analysis were recorded with the BABAR detector at the PEP-II energy-asymmetric e + e − collider operating at the Υ (4S) resonance. A sample of 377 million Υ (4S) → BB events, corresponding to an integrated luminosity of 349 fb −1 , was collected. An additional sample of 35.1 fb −1 was recorded at a center-of-mass (c.m.) energy approximately 40 MeV below the Υ (4S) resonance, i.e., just below the threshold for BB production. This off-resonance data sample is used to subtract the non-BB contributions from the data collected at the Υ (4S) resonance. The principal source of these hadronic non-BB events is e + e − annihilation in the continuum to qq pairs, where q = u, d, s, c refers to quarks. The relative normalization of the off-resonance and on-resonance data samples is derived from luminosity measurements, which are based on the number of detected µ + µ − pairs and the QED cross section for e + e − → µ + µ − production, adjusted for the small difference in c.m. energy. The systematic error on the relative normalization is estimated to be 0.25%.
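As a rough illustration of the relative normalization, the following sketch scales an off-resonance yield to the on-resonance luminosity. It assumes that the continuum cross section scales as $1/s$ and uses an assumed on-resonance c.m. energy of 10.58 GeV; the function names and the example yield are hypothetical, and the actual procedure is based on the measured µ + µ − rates.

```python
# Integrated luminosities quoted in the text, in fb^-1.
LUMI_ON = 349.0
LUMI_OFF = 35.1

# Assumed c.m. energies in GeV: the Upsilon(4S) peak and a point 40 MeV below it.
SQRT_S_ON = 10.58
SQRT_S_OFF = SQRT_S_ON - 0.040

def off_resonance_scale(lumi_on=LUMI_ON, lumi_off=LUMI_OFF,
                        sqrt_s_on=SQRT_S_ON, sqrt_s_off=SQRT_S_OFF):
    """Factor that scales off-resonance yields to the on-resonance sample.

    Assumes a continuum cross section proportional to 1/s, so the
    on-resonance cross section is smaller by (sqrt_s_off / sqrt_s_on)^2.
    """
    lumi_ratio = lumi_on / lumi_off
    energy_correction = (sqrt_s_off / sqrt_s_on) ** 2
    return lumi_ratio * energy_correction

def scaled_continuum(n_off, rel_syst=0.0025):
    """Expected continuum yield on resonance and its normalization systematic."""
    expected = n_off * off_resonance_scale()
    return expected, expected * rel_syst

if __name__ == "__main__":
    print("scale factor:", off_resonance_scale())
    print("1000 off-resonance events ->", scaled_continuum(1000.0))
```

The energy correction is below one percent here, so the scale factor is dominated by the luminosity ratio of roughly ten.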

B. BABAR Detector
The BABAR detector and event reconstruction are described in detail elsewhere [30,31]. The momenta and angles of charged particles are measured in a tracking system consisting of a five-layer silicon vertex tracker (SVT) and a 40-layer drift chamber (DCH) filled with a helium-isobutane gas mixture. Charged particles of different masses are distinguished by their ionization energy loss in the tracking devices and by a ring-imaging Cerenkov detector (DIRC). Electromagnetic showers from electrons and photons are measured in a finely segmented CsI(Tl) calorimeter (EMC). These detector components are embedded in the 1.5-T magnetic field of the solenoid. The magnet flux return steel is segmented and instrumented (IFR) with planar resistive plate chambers and limited streamer tubes, which detect particles penetrating the magnet coil and steel.
The efficiency for the reconstruction of charged particles inside the fiducial volume of the tracking system exceeds 96% and is well reproduced by MC simulation. An effort has been made to minimize fake charged tracks, caused by multiple counting of a single low-energy track curling in the DCH, split tracks, or background-generated tracks. The average uncertainty in the track-reconstruction efficiency is estimated to range from 0.25% to 0.5% per track.
To remove beam-generated background and noise in the EMC, photon candidates are required to have an energy of more than 50 MeV and a shower shape that is consistent with an electromagnetic shower. The photon efficiency and its uncertainty are evaluated by comparing τ ± → π ± ν to τ ± → ρ ± ν samples and by studying e + e − → µ + µ − (γ) events.
Electron candidates are selected on the basis of the ratio of the energy detected in the EMC and the track momentum, the EMC shower shape, the energy loss in the SVT and DCH, and the angle of the Cerenkov photons reconstructed in the DIRC. The energy of electrons is corrected for bremsstrahlung detected as photons emitted close to the electron direction. Muons are identified by using a neural network that combines the information from the IFR with the measured track momentum and the energy deposition in the EMC.
The electron and muon identification efficiencies and the probabilities to misidentify a pion, kaon, or proton as an electron or muon are measured as a function of the laboratory momentum and angles using high-purity samples of particles selected from data. These measurements are performed separately for positive and negative leptons. For the determination of misidentification probabilities, knowledge of the inclusive momentum spectra of positive and negative hadrons, and the measured fractions of pions, kaons and protons and their misidentification rates is used.
Within the acceptance of the SVT, DCH and EMC defined by the polar angle in the laboratory frame, −0.72 < cos θ lab < 0.92, the average electron efficiency for laboratory momenta above 0.5 GeV is 93%, largely independent of momentum. The average hadron misidentification rate is less than 0.2%. Within the same polar-angle acceptance, the average muon efficiency rises with laboratory momentum to reach a plateau of about 70% above 1.4 GeV. The muon efficiency varies between 50% and 80% as a function of the polar angle. The average hadron misidentification rate is 2.5%, varying by about 1% as a function of momentum and polar angle.
Neutral pions are reconstructed from pairs of photon candidates that are detected in the EMC and assumed to originate from the primary vertex. Photon pairs with an invariant mass within 17.5 MeV of the nominal π 0 mass are considered π 0 candidates. The overall detection efficiency, including solid angle restrictions, varies between 55% and 65% for π 0 energies in the range of 0.2 to 2.5 GeV.
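The π 0 selection described above amounts to a mass window on photon pairs. A minimal sketch, assuming photons are given as massless four-vectors (E, px, py, pz) in GeV that already point back to the primary vertex; the shower-shape requirement is omitted, and the function names and the π 0 mass value are illustrative assumptions.

```python
import math
from itertools import combinations

M_PI0 = 0.13498        # nominal pi0 mass in GeV (approximate, assumed)
MASS_WINDOW = 0.0175   # half-width of the mass window in GeV, from the text
MIN_PHOTON_E = 0.050   # minimum photon energy in GeV, from the text

def diphoton_mass(g1, g2):
    """Invariant mass of two photons given as (E, px, py, pz)."""
    e = g1[0] + g2[0]
    px = g1[1] + g2[1]
    py = g1[2] + g2[2]
    pz = g1[3] + g2[3]
    m2 = e * e - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))

def pi0_candidates(photons):
    """All photon pairs passing the energy cut and the pi0 mass window."""
    good = [g for g in photons if g[0] > MIN_PHOTON_E]
    return [
        (g1, g2)
        for g1, g2 in combinations(good, 2)
        if abs(diphoton_mass(g1, g2) - M_PI0) < MASS_WINDOW
    ]

if __name__ == "__main__":
    # Symmetric decay of a 1 GeV pi0 plus one unrelated and one soft photon.
    s = M_PI0                      # sin(theta/2) for two 0.5 GeV photons
    c = math.sqrt(1.0 - s * s)
    photons = [
        (0.5, 0.5 * s, 0.0, 0.5 * c),
        (0.5, -0.5 * s, 0.0, 0.5 * c),
        (0.5, 0.0, 0.5, 0.0),
        (0.03, 0.0, 0.03, 0.0),
    ]
    print(len(pi0_candidates(photons)), "candidate(s)")
```

Because every pair is tested, a single photon can appear in several candidates; resolving such overlaps is left to later stages of a selection and is not addressed in this sketch.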

C. Monte Carlo Simulation
We assume that the Υ (4S) resonance decays exclusively to BB pairs [32] and that the non-resonant cross section for e + e − → qq is 3.4 nb, compared to the Υ (4S) peak cross section of 1.05 nb. We use Monte Carlo (MC) techniques [33] to simulate the production and decay of BB and qq pairs and the detector response [34], to estimate signal and background efficiencies, and to extract the expected signal and background distributions. The size of the simulated sample of generic BB events exceeds the BB data sample by about a factor of three, while the MC samples for inclusive and exclusive B → X u ℓν decays exceed the data samples by factors of 15 or larger. The MC sample for qq events is comparable in size to the qq data sample recorded at the Υ (4S) resonance.
Information extracted from studies of selected data control samples on efficiencies and resolution is used to improve the accuracy of the simulation. Specifically, comparisons of data with the MC simulations reveal small differences in the tracking efficiencies and calorimeter resolution. We apply corrections to account for these differences. The MC simulations include radiative effects such as bremsstrahlung in the detector material and initial-state and final-state radiation [35]. Adjustments are made to take into account the small variations of the beam energies over time.
For this analysis, no attempt is made to reconstruct K 0 L interacting in the EMC or IFR. Since a K 0 L deposits only a small fraction of its energy in the EMC, K 0 L production can have a significant impact on the energy and momentum balance of the whole event and thereby the neutrino reconstruction. It is therefore important to verify that the production rate of neutral kaons and their interactions in the detector are well reproduced.
From detailed studies of large data and MC samples of D 0 → K 0 L π + π − and D 0 → K 0 S π + π − decays, corrections to the simulation of the K 0 L detection efficiency and energy deposition in the EMC are determined. The MC simulation reproduces the efficiencies well for K 0 L laboratory momenta above 0.7 GeV. At lower momenta, the difference between MC and data increases significantly; in this range the MC efficiencies are reduced by randomly eliminating a fraction of the associated EMC showers. The energy deposited by K 0 L in the EMC is significantly underestimated by the simulation for momenta up to 1.5 GeV. At higher momenta the differences decrease. Thus the simulated energies are scaled by factors varying between 1.20 and 1.05 as a function of momentum. Furthermore, assuming equal inclusive production rates for K 0 L and K 0 S , we verify the production rate as a function of momentum by comparing data and MC simulated K 0 S momentum spectra. We observe differences at small momenta; below 0.4 GeV the data rate is lower by as much as (22 ± 7)% compared to the MC simulation. To account for this difference, we reduce the rate of low-momentum K 0 L in the simulation by randomly transforming the excess K 0 L into a fake photon, i.e., we replace the energy deposited in the EMC by the total K 0 L energy and set the mass to zero. Thus we correct the overall energy imbalance created by the excess in K 0 L production.
For reference, the values of the branching fractions, lifetimes, and parameters most relevant to the MC simulation are presented in Tables I and II.
The simulation of inclusive charmless semileptonic decays B → X u ℓν is based on predictions of a heavy-quark expansion (HQE) (valid to O(α s ) [36]) for the differential decay rates. This calculation produces a smooth hadronic mass spectrum. The hadronization of X u with masses above 2m π is performed by JETSET [37]. To describe the dynamics of the b quark inside the B meson we use HQE parameters extracted from global fits to moments of inclusive lepton-energy and hadron-mass distributions in B → X c ℓν decays and moments of inclusive photon-energy distributions in B → X s γ decays [38]. The specific values of the HQE parameters in the shape-function scheme are m b = 4.631 ± 0.034 GeV and µ 2 π = 0.184 ± 0.36 GeV 2 ; they have a correlation of ρ = −0.27. Samples of exclusive semileptonic decays involving low-mass charmless mesons (π, ρ, ω, η, η ′ ) are simulated separately and then combined with samples of decays to non-resonant and higher-mass resonant states, so that the cumulative distributions of the hadron mass, the momentum transfer squared, and the lepton momentum reproduce the HQE predictions. The generated distributions are reweighted to accommodate variations due to specific choices of the parameters for the inclusive and exclusive decays. The overall normalization is adjusted to reproduce the measured inclusive B → X u ℓν branching fraction.
For the generation of decays involving charmless pseudo-scalar mesons we choose two approaches. For the signal decay B → πℓν we use the ansatz by Becirevic and Kaidalov [26] for the q 2 dependence, with the single parameter α BK set to the value determined in a previous BABAR analysis [9] of B → πℓν decays, α BK = 0.52±0.06. For decays to η and η ′ we use the form factor parameterization of Ball and Zwicky with specific values reported in [18].
Decays involving charmless vector mesons (ρ, ω) are generated based on form factors determined from LCSR by Ball and Zwicky [17]. We use the parameterization proposed by the authors to describe the q 2 dependence of the form factors in terms of a modified pole ansatz using up to three independent parameters r 1 , r 2 and m fit . Table III shows the suggested values for these parameters. m fit refers to an effective pole mass that accounts for contributions from higher-mass B mesons with J P = 1 − , and r 1 and r 2 give the relative scale of the two pole terms.
Table III: Form-factor parameters [15,17] for decays to pseudo-scalar mesons η and η ′ (f+) and vector mesons ρ and ω (A1, A2, V ).
For the simulation of the dominant B → X c ℓν decays, we have chosen a variety of models. For B → Dℓν and B → D * ℓν decays we use parameterizations [39,40] of the form factors based on heavy quark effective theory (HQET). In the limit of negligible lepton masses, decays to pseudoscalar mesons are described by a single form factor for which the q 2 dependence is given by a slope parameter. We use the world average [41], updated for recent precise measurements by the BABAR Collaboration [42,43]. Decays to vector mesons are described by three form factors, of which the axial vector form factor dominates. In the limit of heavy quark symmetry, their q 2 dependence can be described by three parameters: ρ 2 D * , R 1 , and R 2 . We use the most precise BABAR measurement [44] of these parameters.
For the generation of the semileptonic decays to D * * resonances (four L = 1 states), we use the ISGW2 [14] model. At present, the sum of the branching fractions for these four decay modes is measured to be 1.7%, but so far only the decays D * * → Dπ and D * * → D * π have been reconstructed, while the total individual branching fractions for these four states remain unknown. Since the measured inclusive branching fraction for B → X c ℓν exceeds the sum of the measured branching fractions of all exclusive semileptonic decays by about 1.0%, and since non-resonant B → D ( * ) πℓν decays have not been observed [45], we assume that the missing decays are due to B → D * * ℓν, involving hadronic decays of the D * * mesons that have not yet been measured. To account for the observed deficit, we increase the B → D * * ℓν branching fractions by 60% and inflate the errors by a factor of three.

IV. EVENT RECONSTRUCTION AND CANDIDATE SELECTION
In the following, we describe the selection and kinematic reconstruction of signal candidates, the definition of the various background classes, and the application of neural networks to suppress these backgrounds.

A. Signal-Candidate Selection
Signal candidates are selected from events having at least four charged tracks. The reconstruction of the four signal decay modes, B 0 → π − ℓ + ν, B + → π 0 ℓ + ν, B 0 → ρ − ℓ + ν and B + → ρ 0 ℓ + ν, requires the identification of a charged lepton, the reconstruction of the hadronic state consisting of one or more charged or neutral pions, and the reconstruction of the neutrino from the missing energy and missing momentum of the whole event.

Lepton and Hadron Selection
Candidates for leptons, both e ± and µ ± , are required to have high c.m. momenta, p * ℓ ≥ 1.0 GeV for the B → πℓν, and p * ℓ ≥ 1.8 GeV for the B → ρℓν sample. This requirement significantly reduces the background from hadrons that are misidentified as leptons, and also removes a large fraction of true leptons from secondary decays or photon conversions, and from B → X c ℓν decays.
To suppress Bhabha scattering and two-photon processes in which an electron or a photon from initial-state or final-state radiation interacts in the material of the detector and generates additional charged tracks and photons at small angles to the beam axis, we require $\xi_z < 0.65$ for events with a candidate electron. Here
$$ \xi_z = \frac{\sum_i p_i^z}{\sum_i E_i} , $$
where the sum runs over all charged particles in the event and $p_i^z$ and $E_i$ are their longitudinal momentum components and energies measured in the laboratory frame.
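The requirement on ξ z can be sketched in a few lines. The sketch assumes ξ z is the ratio of the summed longitudinal momenta to the summed energies of the charged particles, as the description above suggests; the input format and function names are hypothetical.

```python
def xi_z(charged):
    """Longitudinal momentum balance of the charged particles in the lab frame.

    `charged` is an iterable of (E, pz) pairs in GeV, one per charged particle.
    """
    charged = list(charged)
    sum_pz = sum(pz for _, pz in charged)
    sum_e = sum(e for e, _ in charged)
    return sum_pz / sum_e

def passes_xi_z_cut(charged, has_electron_candidate, cut=0.65):
    """Apply the xi_z requirement only to events with a candidate electron."""
    if not has_electron_candidate:
        return True
    return xi_z(charged) < cut

if __name__ == "__main__":
    forward_event = [(1.0, 0.9), (1.0, 0.8)]     # tracks collimated along +z
    balanced_event = [(1.0, 0.5), (1.0, -0.3)]   # roughly balanced in z
    print(xi_z(forward_event), passes_xi_z_cut(forward_event, True))
    print(xi_z(balanced_event), passes_xi_z_cut(balanced_event, True))
```

Events whose charged tracks are strongly collimated along the beam direction give values close to one and are rejected, while the same event without an electron candidate is kept.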
For the reconstruction of the signal hadron, we consider all charged tracks that are not consistent with a signal lepton and not identified as a kaon. Neutral pions are reconstructed from pairs of photon candidates and the π 0 c.m. momentum is required to exceed 0.2 GeV. Candidate ρ ± → π ± π 0 or ρ 0 → π + π − decays are required to have a two-pion mass within one full width of the nominal ρ mass, 0.650 < M ππ < 0.850 GeV. To reduce the combinatorial background, we also require that the c.m. momentum of one of the pions exceed 0.4 GeV, and that the c.m. momentum of the other pion be larger than 0.2 GeV.
Each charged lepton candidate is combined with a hadron candidate to form a so-called Y candidate of charge zero or one. At this stage in the analysis we allow for more than one candidate per event. Two or three charged tracks associated with the Y candidate are fitted to a common vertex. This vertex fit must yield a χ 2 probability of at least 0.1%. To remove background from J/ψ → ℓ + ℓ − decays, we reject a Y candidate if the invariant mass of the lepton and any oppositely charged track in the event is consistent with this decay.
To further reduce backgrounds without significant signal losses, we impose additional restrictions on the c.m. momenta of the lepton and hadron candidates by requiring at least one of the following conditions to be satisfied, for B → πℓν GeV. These additional requirements on the lepton and hadron c.m. momenta primarily reject background candidates that are inconsistent with the phase space of the signal decay modes.
If a Y candidate originates from a signal decay mode, the cosine of the angle between the momentum vectors of the B meson and the Y candidate, cos θ BY , can be calculated as

$$\cos\theta_{BY} = \frac{2 E^*_B E^*_Y - M_B^2 - M_Y^2}{2\, p^*_B\, p^*_Y},$$

and the condition | cos θ BY | ≤ 1.0 should be fulfilled. Here M Y , E * Y , and p * Y denote the invariant mass, c.m. energy, and c.m. momentum of the Y candidate, and M B is the B-meson mass.
The energy E * B and momentum p * B of the B meson are not measured event-by-event. Specifically, E * B = √ s/2 is given by the average c.m. energy of the colliding beams, and the B momentum is derived as

$$p^*_B = \sqrt{E^{*2}_B - M_B^2}.$$

To allow for the finite resolution in this variable, we impose the requirement −1.2 < cos θ BY < 1.1.
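A short sketch of the cos θ BY calculation follows. It assumes the standard relation obtained by requiring the neutrino to be massless, cos θ BY = (2 E*_B E*_Y − M_B² − M_Y²)/(2 p*_B p*_Y); the B mass value and all function names are illustrative assumptions.

```python
import math

M_B = 5.279  # GeV, approximate B-meson mass (assumed value for illustration)


def b_momentum(sqrt_s, m_b=M_B):
    """B momentum in the c.m. frame from E*_B = sqrt(s)/2."""
    e_b = sqrt_s / 2.0
    return math.sqrt(e_b ** 2 - m_b ** 2)


def cos_theta_by(e_y, p_y, sqrt_s, m_b=M_B):
    """cos(theta_BY) from the c.m. energy and momentum of the Y candidate."""
    e_b = sqrt_s / 2.0
    p_b = b_momentum(sqrt_s, m_b)
    m_y2 = e_y ** 2 - p_y ** 2  # invariant mass squared of the Y candidate
    return (2.0 * e_b * e_y - m_b ** 2 - m_y2) / (2.0 * p_b * p_y)


def passes_cos_theta_by(value):
    """Loose window quoted in the text, allowing for finite resolution."""
    return -1.2 < value < 1.1
```

For a correctly reconstructed signal decay with a single massless neutrino the function returns the true cosine, which can be checked by constructing E*_Y from a chosen angle.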

Neutrino Reconstruction
The neutrino four-momentum, P ν ≃ (E miss , p miss ), is inferred from the difference between the net four-momentum of the colliding-beam particles, P e + e − = (E e + e − , p e + e − ), and the sum of the measured four-vectors of all detected particles in the event,

$$(E_{\rm miss}, \vec p_{\rm miss}) = (E_{e^+e^-}, \vec p_{e^+e^-}) - \Big(\sum_i E_i, \sum_i \vec p_i\Big), \tag{15}$$

where E i and p i are the energy and three-momentum of the i th track or EMC shower, measured in the laboratory frame. The energy calculation depends on the correct mass assignments for charged tracks. For this reason we choose to calculate the missing momentum and energy in the laboratory frame rather than in the rest frame of the Υ (4S). By doing so, we keep this uncertainty confined to the missing energy.
If all particles in the event, except the neutrino, are well measured, P ν ≃ (E miss , p miss ) is a good approximation. However, particles that are undetected because of inefficiency or acceptance losses, in particular K L mesons and additional neutrinos, or spurious tracks or photons that do not originate from the BB event, impact the accuracy of this approximation. To reduce the effect of losses due to the limited detector acceptance, we require that the polar angle of the missing momentum in the laboratory frame be in the range 0.3 < θ miss < 2.2 rad. We also require the missing momentum in the laboratory frame to exceed 0.5 GeV.
For the rejection of background events and signal decays that are poorly reconstructed as well as events with more than one missing particle, we make use of the missing mass squared of the whole event,

$$m^2_{\rm miss} = E^2_{\rm miss} - |\vec p_{\rm miss}|^2.$$

For a correctly reconstructed event with a single semileptonic B decay, m 2 miss should be consistent with zero within measurement errors.
Failure to detect one or more particles in the event creates a substantial tail at large positive values. Since the resolution in m 2 miss increases linearly with E miss , we use the variable m 2 miss /2E miss ≃ E miss − p miss as a discriminator and require m 2 miss /2E miss < 2.5 GeV.
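The neutrino reconstruction and the associated requirements can be sketched as follows. The sketch assumes m²_miss = E²_miss − |p_miss|² and represents four-vectors as (E, px, py, pz) tuples in the laboratory frame; the function names and the data layout are hypothetical.

```python
import math


def missing_four_momentum(beam, particles):
    """Beam four-momentum minus the sum over all detected particles.

    beam and each entry of particles are (E, px, py, pz) tuples in GeV.
    """
    return tuple(beam[k] - sum(p[k] for p in particles) for k in range(4))


def neutrino_variables(p_miss4):
    """Return (p_miss, theta_miss, m2_miss / (2 E_miss))."""
    e, px, py, pz = p_miss4
    p = math.sqrt(px * px + py * py + pz * pz)
    if p <= 0.0 or e <= 0.0:
        raise ValueError("missing energy and momentum must be positive")
    theta = math.acos(max(-1.0, min(1.0, pz / p)))
    return p, theta, (e * e - p * p) / (2.0 * e)


def passes_neutrino_requirements(p_miss4):
    """Polar angle, momentum, and m2_miss/(2 E_miss) requirements from the text."""
    p, theta, m2_over_2e = neutrino_variables(p_miss4)
    return 0.3 < theta < 2.2 and p > 0.5 and m2_over_2e < 2.5
```

Dividing m²_miss by 2E_miss gives (E_miss − p_miss)(E_miss + p_miss)/(2E_miss), which reduces to E_miss − p_miss when the two are close, as stated above.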

Variables Used for Signal Extraction
The kinematic consistency of the candidate decay with a signal B decay is ascertained using two variables, the beam-energy substituted B mass m ES , and the difference between the reconstructed and expected energy of the B candidate ∆E. In the laboratory frame, they are defined as

$$\Delta E = \frac{P_B \cdot P_{e^+e^-} - s/2}{\sqrt{s}}$$

and

$$m_{ES} = \sqrt{\frac{(s/2 + \vec p_B \cdot \vec p_{e^+e^-})^2}{E^2_{e^+e^-}} - \vec p_B^{\,2}},$$

where √ s is the total c.m. energy, and P B = (E B , p B ) and P e + e − denote the four-momenta of the B meson and the colliding beam particles, respectively. The B-meson momentum vector p B is determined from the measured three-momenta of the decay products, and P e + e − is derived from the calibration and monitoring of the energies and angles of the stored beams.

We extract the signal yields by a fit to the two-dimensional ∆E − m ES distributions in bins of the momentum transfer squared q 2 . We define a region in the ∆E − m ES plane that contains almost all of the signal events and leaves sufficient phase space to constrain the different background contributions. This fit region is defined as

$$|\Delta E| < 0.95\ {\rm GeV}, \qquad 5.095 < m_{ES} < 5.295\ {\rm GeV}. \tag{19}$$

Only candidates that fall inside the fit region are considered in the analysis. We also define a smaller region where the signal contribution is much enhanced relative to the background. This signal region is defined as

$$-0.15 < \Delta E < 0.25\ {\rm GeV}, \qquad 5.255 < m_{ES} < 5.295\ {\rm GeV}.$$

The signal region is chosen to be slightly asymmetric in ∆E to avoid sizable B → X c ℓν background, which peaks near −0.2 GeV. In the following, we refer to the phase space outside the signal region, but inside the fit region, as the side bands.
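The two fit variables and the region definitions can be illustrated with a short sketch. It assumes the commonly used laboratory-frame expressions ∆E = (P_B · P_e+e− − s/2)/√s and m_ES = sqrt((s/2 + p_B · p_e+e−)²/E²_e+e− − p_B²), and it takes the signal-region boundaries from the figure caption quoted later in this section; the function names are hypothetical.

```python
import math


def delta_e_and_mes(p_b4, p_ee4):
    """Return (Delta E, m_ES) from lab-frame four-vectors (E, px, py, pz) in GeV."""
    e_b, bx, by, bz = p_b4
    e0, x0, y0, z0 = p_ee4
    s = e0 * e0 - (x0 * x0 + y0 * y0 + z0 * z0)
    dot3 = bx * x0 + by * y0 + bz * z0
    delta_e = (e_b * e0 - dot3 - s / 2.0) / math.sqrt(s)
    mes2 = ((s / 2.0 + dot3) / e0) ** 2 - (bx * bx + by * by + bz * bz)
    return delta_e, math.sqrt(max(mes2, 0.0))


def region(delta_e, mes):
    """Classify a candidate as 'signal', 'sideband', or 'outside' the fit region."""
    in_fit = abs(delta_e) < 0.95 and 5.095 < mes < 5.295
    if not in_fit:
        return "outside"
    in_signal = -0.15 < delta_e < 0.25 and 5.255 < mes < 5.295
    return "signal" if in_signal else "sideband"
```

In the special case where the beam system is at rest, the expressions reduce to ∆E = E_B − √s/2 and m_ES = sqrt(s/4 − p_B²), which provides a simple cross-check.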
As a measure of the momentum transfer squared q 2 we adopt the mass squared of the virtual W , i.e., the invariant mass squared of the four-vector sum of the reconstructed lepton and neutrino,

$$q^2_{\rm raw} = \left[(E_\ell, \vec p_\ell) + (E_{\rm miss}, \vec p_{\rm miss})\right]^2.$$

The resolution in q 2 raw is dominated by the measurement of the missing energy, which tends to have a poorer resolution than the measured missing momentum, because the missing momentum is a vector sum and contributions from particle losses (or additional tracks and EMC showers) do not add linearly as is the case for E miss . Thus for the definition of q 2 raw it is advantageous to replace E miss by p miss , the absolute value of the measured missing momentum,

$$P_\nu \simeq (p_{\rm miss}, \vec p_{\rm miss}).$$

The resolution of q 2 raw can be further improved by scaling p miss by a factor of α, chosen such that ∆E of the B candidate is forced to zero,

$$\vec p_\nu = \alpha\, \vec p_{\rm miss},$$

and substituting p ν for p miss to obtain q 2 corr . Any candidates for which this q 2 correction yields unphysical values, q 2 corr < 0 GeV 2 , are rejected. This is the case for about 1% of the background not associated with semileptonic decays. The quantity q 2 corr is used as the measured q 2 throughout this analysis.
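The q 2 calculation with E miss replaced by p miss can be sketched as below. The scale factor α is treated as an externally supplied input, since its explicit expression is not reproduced here; with α = 1 the function returns q 2 raw. The function names and the tuple layout are hypothetical.

```python
import math


def q2_from_missing_momentum(lepton4, p_miss3, alpha=1.0):
    """Invariant mass squared of lepton + neutrino, with E_nu = |p_nu|.

    lepton4: (E, px, py, pz) of the lepton in GeV.
    p_miss3: (px, py, pz) of the missing momentum in GeV.
    alpha:   scale factor applied to the missing momentum (assumed input).
    """
    e_l, lx, ly, lz = lepton4
    nx, ny, nz = (alpha * c for c in p_miss3)
    e_nu = math.sqrt(nx * nx + ny * ny + nz * nz)  # massless neutrino
    e = e_l + e_nu
    px, py, pz = lx + nx, ly + ny, lz + nz
    return e * e - (px * px + py * py + pz * pz)


def is_physical(q2):
    """Candidates with negative q2 are rejected."""
    return q2 >= 0.0
```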
The q 2 resolution is critical for the measurements of the form factors. Figure 4 shows the correlation between the true q 2 and the reconstructed q 2 corr for simulated samples of B → πℓν and B → ρℓν candidates passing the entire event selection, which is described below. Correctly reconstructed signal events and combinatorial signal events, for which the hadron has been incorrectly selected, are shown. For correctly reconstructed signal decays the resolution improves with higher q 2 and can be well described by the sum of two Gaussian resolution functions, see Table IV. In the signal region, the widths of the core resolution are in the range 0.18 − 0.34 GeV 2 , and the tails can be approximated by a second Gaussian function with widths in the range 0.6 − 0.8 GeV 2 . As expected, the resolution is significantly worse in the larger fit region. Combinatorial signal events contribute primarily at high q 2 . We rely on the Monte Carlo simulation to reproduce the resolution in the reconstructed q 2 corr variable.

Signal and Background Sources
A variety of processes contribute to the four samples of selected candidates for the charmless semileptonic decay modes. We divide the signal and background for each of the four candidate samples into a set of sources based on the origin of the charged lepton candidate.
• Signal: We differentiate four classes of signal events; for all of them the lepton originates from a signal decay under study:
  - True signal: the hadron originates from the signal decay under study;
  - Combinatorial signal: the hadron is incorrectly selected, in many cases from decay products of the second B meson in the event;
  - Isospin-conjugate signal: the lepton originates from the isospin conjugate of the signal decay;
  - Cross-feed signal: the lepton originates from another signal decay mode, for instance B → ρℓν in a B → πℓν sample.
• Continuum background: We differentiate two classes of continuum backgrounds:
  - True leptons: the lepton candidate originates from a leptonic or semileptonic decay of a hadron produced in e + e − → qq (mostly cc);
  - Fake leptons: the lepton candidate is a misidentified hadron; this is a sizable contribution to the muon sample.
• B → X u ℓν background: We differentiate two different sources of B → X u ℓν background:
  - Exclusive B → X u ℓν decays involving a single hadron with mass below 1 GeV: decays that are not analyzed as signal (such as B + → ωℓ + ν);
  - Inclusive B → X u ℓν decays: decays involving more than one hadron or a single hadron with mass above 1 GeV.
• BB background: We differentiate three classes of BB background, excluding B → X u ℓν decays:
  - Primary leptons, i.e., B → X c ℓν decays: the lepton originates from a charm semileptonic B decay, either B → Dℓν, B → D * ℓν, or B → D ( * ) (nπ)ℓν with n ≥ 1 additional pions; this class is dominated by B → D * ℓν decays; the largest contributions involve hadrons that do not originate from the semileptonic decay;
  - Secondary leptons: the lepton originates from the decay of a particle other than a B meson, for instance charm mesons, τ leptons, J/ψ , or from photon conversions;
  - Fake leptons: the lepton candidate is not a lepton, but a misidentified charged hadron; this background is dominated by fake muons.
Given that the secondary-lepton and fake-lepton BB background contributions are relatively small in this analysis, we combine them into one class (other BB).
For intermediate values of q 2 (in the range 4 < q 2 < 20 GeV 2 ), B → X c ℓν decays are by far the dominant background, whereas continuum background contributes mostly at low and high q 2 . The B → X u ℓν decays have much smaller branching fractions, but their properties are very similar to the signal decays and thus they are difficult to discriminate against. They contribute mostly at high q 2 , where they are the dominant background.

TABLE IV: Description of the q 2 resolution in terms of a sum of two Gaussian resolution functions for true signal decays in the ∆E − m ES fit region and in the signal region, integrated over q 2 ; µ1, σ1 and µ2, σ2 denote the means and the widths of the two Gaussian functions, and the last column lists the fraction of the events characterized by the narrower resolution function.

Neural Networks
To separate signal events from the background sources, continuum events, non-signal B → X u ℓν decays and the remaining BB events, we employ a neural-network technique based on a multi-layer perceptron (MLP) [46]. We have set up a network structure with seven input neurons and one hidden layer with three neurons and have adopted the method introduced by Broyden, Fletcher, Goldfarb, and Shanno [47] to train the network. Some of the input variables are used as part of the event preselection that is designed to reduce the BB and continuum backgrounds by cutting out regions where the signal contribution is small or where there are spikes in distributions, which the neural network may not deal with effectively. The following variables are input to the neural networks:
• R2, the second normalized Fox-Wolfram moment [48] determined from all charged and neutral particles in the event; we require R2 < 0.5;
• $L2 = \sum_i p^*_i \cos^2\theta^*_i$, where the sum runs over all tracks in the event excluding the Y candidate, and p * i and θ * i refer to the c.m. momenta and the angles measured with respect to the thrust axis of the Y candidate; we set a loose restriction, L2 < 3.0 GeV.
The first three input variables are sensitive to the topological difference between the jet-like continuum events and the more spherical BB events. Restrictions on these variables do not bias the q 2 distribution significantly. The restrictions placed on cos θ BY , m 2 miss /(2E miss ), and θ miss do not significantly bias the q 2 distribution either. However, the variable cos θ W ℓ is correlated with the lepton momentum and thereby q 2 . To ensure that the selection does not adversely affect the measurement of the q 2 spectrum, we have chosen rather moderate restrictions on cos θ W ℓ . Figure 5 shows the ∆E and m ES distributions for samples of B 0 → π − ℓ + ν and B 0 → ρ − ℓ + ν candidates (integrated over q 2 ) that have been preselected by the criteria described above. The stacked histograms show the signal and background contributions compared to the data, prior to the fit. The three dominant backgrounds are B → X c ℓν decays (including B → Dℓν, B → D * ℓν and B → D ( * ) (nπ)ℓν), qq continuum and B → X u ℓν decays. The signal contributions are very small by comparison and difficult to observe.
The neural networks are trained separately for the three background categories and for different q 2 intervals. We introduce six bins in q 2 for B → πℓν and three bins for B → ρℓν. The bin sizes are 4 GeV 2 for B → πℓν and 8 GeV 2 for B → ρℓν, except for the last bin, which extends to the kinematic limit of 26.4 GeV 2 and 20.3 GeV 2 , respectively. Thus in total we train 3×(2×6+2×3) = 54 neural networks. Since we aim for a good signal-to-background ratio in the region where most of the signal is located, we do not train the neural network with events in the whole fit region, but in an extended signal region, −0.25 < ∆E < 0.35 GeV, 5.240 < m ES < 5.295 GeV.
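The q 2 binning and the count of neural networks follow directly from the numbers above, as the following sketch shows. The helper names are hypothetical; only the bin widths, kinematic limits, and multiplicities are taken from the text.

```python
import bisect


def q2_bin_edges(width, n_bins, q2_max):
    """Equal-width bins, except that the last bin extends to the kinematic limit."""
    return [i * width for i in range(n_bins)] + [q2_max]


def q2_bin(q2, edges):
    """Index of the q2 bin, or None if q2 lies outside the allowed range."""
    if q2 < edges[0] or q2 > edges[-1]:
        return None
    return min(bisect.bisect_right(edges, q2) - 1, len(edges) - 2)


PI_EDGES = q2_bin_edges(4.0, 6, 26.4)   # B -> pi l nu, in GeV^2
RHO_EDGES = q2_bin_edges(8.0, 3, 20.3)  # B -> rho l nu, in GeV^2

# Three background categories, two pi modes with six bins each,
# and two rho modes with three bins each.
N_NETWORKS = 3 * (2 * (len(PI_EDGES) - 1) + 2 * (len(RHO_EDGES) - 1))
```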
For the training of the neural networks we use MC simulated events containing correctly reconstructed signal decays and the following simulated background samples:

[Figure caption, beginning not recovered] candidates after the preselection, i.e., prior to the neural-network application. The stacked histograms show the predicted signal and background contributions prior to the fit. The expected signal distribution (with arbitrary normalization) is indicated as a magenta dashed histogram.
The training of the neural networks and the subsequent background reduction is performed sequentially for the three background samples. We use subsamples of typically less than half the total MC samples for training and validation of the neural networks. Of these subsamples, one half of the events is used as training sample, and the other half for validation.
Studies of the neural-network performance for the B → X u ℓν background indicate that the separation of this background from the signal is very difficult because of the similarity in the shape of the distributions, especially for the B + → π 0 ℓ + ν and the B → ρℓν samples. Given these difficulties, we use the B → X u ℓν neural network only for the B 0 → π − ℓ + ν sample, and only for q 2 > 12 GeV 2 , where the B → X u ℓν background becomes significant. Figure 6 shows, for the sample of B 0 → π − ℓ + ν candidates, the distributions of the seven input variables to the neural networks. The distributions are shown sequentially after application of the preselection, the qq neural network and the B → X c ℓν neural network to illustrate the change in the sample composition. Figures 7 to 9 show the distributions of the three neural-network discriminators for the B 0 → π − ℓ + ν sample in four of the six q 2 bins. Figures 10 and 11 show the distributions of the two neural-network discriminators for the B 0 → ρ − ℓ + ν sample in all three q 2 bins. The discriminator cuts are chosen to minimize the total error on the signal yield for each channel, using the sum in quadrature of the error obtained from the maximum-likelihood fit described in Section VI and the estimated total systematic error of the partial signal branching fraction in each q 2 bin (see Section VII). The data-MC agreement is reasonably good for the input distributions and the neural-network discriminators. One should keep in mind that at this stage the distributions are taken directly from the simulation, without any adjustments or fit.  Table V shows the selection efficiencies for the four signal samples compared to the efficiencies for the dominant background sources for these samples. The total signal efficiencies are typically 6 − 7% for B → πℓν decays and roughly 1 − 2.5% for B → ρℓν decays in the fit region. 
The dominant BB and qq backgrounds are suppressed by factors of order 10 4 and 10 5 , respectively.

Candidate Multiplicity
After the neural-network selection there are on average 1.14 candidates per event in the B 0 → π − ℓ + ν sample, 1.46 in the B + → π 0 ℓ + ν sample, 1.30 in the B 0 → ρ − ℓ + ν sample, and 1.17 in the B + → ρ 0 ℓ + ν sample. We observe fewer candidates for decay modes without neutral pions in the final state. For all four samples, the observed candidate multiplicity is well reproduced by MC simulation.

[Figure caption] The B → X c ℓν neural-network discriminators for B 0 → π − ℓ + ν candidates in the signal region, −0.15 < ∆E < 0.25 GeV; 5.255 < m ES < 5.295 GeV. The distributions are shown for four different q 2 bins, columns from left to right: 0 < q 2 < 4 GeV 2 , 4 < q 2 < 8 GeV 2 , 12 < q 2 < 16 GeV 2 , q 2 > 20 GeV 2 . Top row: Discriminator distributions for signal (magenta, dashed) and B → X c ℓν background (blue, solid), normalized to the same area. The arrows indicate the chosen cuts. Bottom row: Discriminator distributions for data compared with MC-simulated signal and background contributions. For a legend see Figure 5.
In case of multiple candidates for a given decay mode, we select the one with the highest probability of the vertex fit for the Y candidate. Since this is not an option for B + → π 0 ℓ + ν decays, we select the photon pair with an invariant mass closest to the π 0 mass. Simulations of signal events indicate that this procedure selects the correct signal decay in 60 − 65% of the cases. By this selection we do not allow a single event to contribute more than one candidate to a given decay-mode sample, though we do allow an event to contribute candidates to more than one decay-mode sample.
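The best-candidate selection can be sketched as follows. Candidates are represented as dictionaries with hypothetical keys ("event", "vertex_prob", "m_gg"), and the π 0 mass value is an approximate number assumed for illustration.

```python
PI0_MASS = 0.13498  # GeV, approximate neutral-pion mass (assumed value)


def select_best_candidates(candidates, use_pi0_mass=False):
    """Keep one candidate per event for a given decay-mode sample.

    By default the candidate with the highest Y vertex-fit probability is
    kept. With use_pi0_mass=True (the B+ -> pi0 l nu sample) the candidate
    whose photon-pair mass is closest to the pi0 mass is kept instead.
    """
    best = {}
    for cand in candidates:
        if use_pi0_mass:
            score = -abs(cand["m_gg"] - PI0_MASS)
        else:
            score = cand["vertex_prob"]
        event = cand["event"]
        if event not in best or score > best[event][0]:
            best[event] = (score, cand)
    return [cand for _, cand in best.values()]
```

The function is applied separately to each decay-mode sample, so an event may still contribute candidates to more than one sample.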

V. DATA-MONTE CARLO COMPARISONS
The determination of the number of signal events relies heavily on the MC simulation to correctly describe the distributions for signal and background sources. Therefore a significant effort has been devoted to detailed comparisons of data and MC distributions for samples that have been selected to enhance a given source of background. Though we record data below BB threshold (off-resonance data) the total luminosity of this sample is only about 10% of the Υ (4S) data sample (on-resonance data), and thus we need to rely on MC simulation to predict the shapes of these background distributions.
To study the simulation of qq events, we scale the MC sample to match the integrated luminosity of the off-resonance data. The study is performed separately for samples with electrons and muons. This background contains events with true leptons from leptonic or semileptonic decays of hadrons, as well as hadrons misidentified as leptons. The muon sample is dominated by misidentified hadrons, whereas the electron sample contains small contributions from Dalitz pairs and photon conversions, as well as some residual background from non-qq processes. We observe a clear difference in the normalization, not only in the relatively small event sample passing the neural-network selection, but also for the much larger sample available before the neural-network suppression. To correct for this difference, we apply additional scale factors to the simulated qq samples; they are different for electrons and muons.
In addition to correcting the normalization, we also examine the shapes of the m ES , ∆E, and q 2 distributions that are used to extract the signal yield. Since the size of the off-resonance data set is small, we study samples with a looser selection, namely we bypass the qq neural-network discrimination. The comparison of these qq-enriched samples reveals small differences between data and simulation. We derive linear corrections from the bin-by-bin ratios and apply these corrections to the m ES , ∆E, and q 2 distributions. Figures 12 and 13 show a comparison of the rescaled and corrected qq MC samples with the off-resonance data for the ∆E, m ES , and q 2 distributions. Within the relatively large statistical errors of the off-resonance data the simulation agrees well with the data. The uncertainties in the shape of the simulated distributions will be assessed as a systematic uncertainty.
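A linear shape correction derived from bin-by-bin ratios can be sketched as below. The sketch uses an unweighted least-squares straight-line fit to the data/MC ratios, which is an assumption; the fit procedure actually used is not specified in the text, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Unweighted least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def linear_shape_correction(centers, data, mc):
    """Fit a line to data/MC ratios and apply it to the MC histogram.

    centers: bin centers; data, mc: bin contents of equal length.
    Returns (corrected MC contents, intercept, slope).
    """
    points = [(c, d / m) for c, d, m in zip(centers, data, mc) if m > 0]
    intercept, slope = fit_line([c for c, _ in points], [r for _, r in points])
    corrected = [m * (intercept + slope * c) for c, m in zip(centers, mc)]
    return corrected, intercept, slope
```

In practice the ratios would carry the sizable statistical errors of the off-resonance data, so a weighted fit would be a natural refinement.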

B. B → Xcℓν Enhanced Sample
The overall dominant background source in this analysis is B → X c ℓν decays. Therefore it is important to verify that these decays are correctly simulated. This has been done in two ways, a) by relaxing the B → X c ℓν suppression to obtain a charm-enhanced sample, and b) by reconstructing a specific decay mode, such as B 0 → D * − ℓ + ν, in the same way we reconstruct the signal decays, and comparing the kinematic distributions with MC simulations (see Section V C).
We select a charm-enhanced sample by inverting the cut on the B → X c ℓν neural-network discriminator. Figures 14 and 15 show the ∆E and m ES distributions in the signal region and the side bands, as well as the q 2 distribution in the signal region. All distributions show good agreement in shape; the absolute yields differ at a level that is expected, considering that the MC distributions have not been adjusted.
To study the Monte Carlo simulation of the neutrino reconstruction employed in this analysis, we use a control sample of exclusively reconstructed B 0 → D * − ℓ + ν decays with D * − → D 0 π − s and D 0 → K + π − . Since the B 0 → D * − ℓ + ν decay rate exceeds the rate for B + → ρ 0 ℓ + ν by a factor of about 30 (including the D 0 branching fraction), this control sample represents a high-statistics and high-purity sample of exclusive semileptonic decays. Except for the low-momentum pion (π − s ), this final state has the same number of tracks, and very similar kinematics, as the B + → ρ 0 ℓ + ν signal decay. Furthermore, since about 50% of the B → X c ℓν background in the B → πℓν and B → ρℓν samples comes from B 0 → D * − ℓ + ν decays, this B 0 → D * − ℓ + ν sample can provide important tests of the shapes of the distributions that are used to discriminate the B → X c ℓν background from signal.
Moreover, the distributions of the primary background suppression variables, in particular R2, L2, cos θ BY , θ miss , and m 2 miss /(2E miss ), are relatively insensitive to the specific semileptonic decay mode. Likewise, the resolutions for the fit variables ∆E, m ES , and q 2 are dominated by the resolution of the reconstructed neutrino, and thus depend little on the decay mode under study.
The reconstruction of the D * − from its decay products is straightforward. Except for the selection of the D * − , we apply the same preselection as for the signal charmless decays. We require the K + π − invariant mass to be within 17 MeV of the nominal D 0 mass, and restrict the mass difference, ∆m D * = m Dπ − m D , to 0.1432 < ∆m D * < 0.1478 GeV. The number of events in this data control sample exceeds the MC prediction by 3.8 ± 1.7%, a result consistent with the uncertainties in the efficiency for the very-low-momentum charged pion from the D * − → D 0 π − decay. We correct the MC yield and sequentially place requirements on the same seven variables we use in the neural networks to both the data and MC samples. We compare the step-by-step reduction in the number of events; the largest difference is 0.9 ± 0.7%, for the cut on cos θ W ℓ . For all other critical requirements the agreement is better than 0.5% and one standard deviation. The remaining background is at the level of 10%.
We have compared the MC-generated distributions for the control sample with the selected B 0 → D * − ℓ + ν data sample and find very good agreement for the basic event variables, i.e., the multiplicity of charged particles and photons, and the total charge per event, indicating that the efficiency losses are well reproduced by the simulation. The distributions of the topological event variables R2 and L2 match well. Figure 16 shows the distributions of the variables critical for the neutrino reconstruction, p miss , m 2 miss /(2E miss ), cos θ BY , and θ miss ; they are also well reproduced. Figure 17 shows distributions of ∆E and m ES for events in the signal region and in the side bands. Again, the agreement between data and the MC simulation is reasonable.
We have also compared the q 2 distributions of the simulation and the data control sample and find good agreement for both the raw and the corrected spectra, as illustrated in Figure 18. After corrections, no events appear above the kinematic limit of 10.7 GeV 2 . The q 2 corr resolution function can be described by the sum of two Gaussian resolution functions, with widths of 0.27 GeV 2 and 0.67 GeV 2 , close to the values obtained for events in the fit region for the signal B → πℓν and B → ρℓν decays, respectively.

A. Overview
We determine the yields for the signal decay modes, B 0 → π − ℓ + ν, B + → π 0 ℓ + ν, B 0 → ρ − ℓ + ν, and B + → ρ 0 ℓ + ν, by performing a maximum-likelihood fit to the three-dimensional ∆E − m ES − q 2 distributions for the four selected data samples corresponding to the four exclusive decay modes. The fit technique employed in this analysis is an extended binned maximum-likelihood fit that accounts for the statistical fluctuations not only of the data samples but also of the MC samples by allowing the MC-simulated distributions to fluctuate in each bin according to the statistical uncertainty given by the number of events in the bin. This method was introduced by Barlow and Beeston [49].
The parameters of the fit are scale factors for the signal and background yields of the four selected event samples. We use the following nomenclature for the fit parameters: p source j , where the superscript source denotes the fit source (signal or background type) and the subscript j labels the q 2 corr bin (if no subscript is given, the same fit parameter is used across all q 2 bins). Predictions for the shape of the ∆E−m ES distributions are taken from simulation of both signal and the various background sources, separately for each bin in q 2 . The branching fractions for the four signal decays are obtained by multiplying the fitted values of the scale factors with the branching fractions that are implemented in the MC simulation.
The choice of a two-dimensional distribution in ∆E and m ES is mandated because the two variables are correlated for both signal, in particular the combinatorial signal events, and for some of the background sources. Since it would be difficult to determine reliable analytic expressions for these two-dimensional distributions, a binned maximum-likelihood method is used, with the bin sizes chosen to obtain a good signal and background separation while retaining adequate statistics in all bins. The bin sizes are small in the region where most of the signal is located and larger in the side bands. There are 47 ∆E − m ES bins for each bin in q 2 corr . Figure 19 shows the ∆E − m ES distribution for signal events and the binning used in the fit. As mentioned in Section IV B 2, for the two B → πℓν samples the q 2 range 0 < q 2 corr < 26.4 GeV 2 is divided into six bins, and for the two B → ρℓν samples the range 0 < q 2 corr < 20.3 GeV 2 is divided into three bins.

B. Fit Method
Since the MC samples available to create the probability density functions (PDFs) for the individual sources that are input to the fits are rather limited in size, it is necessary to take into account the statistical uncertainties, given by the number of events generated for each bin. For this reason we have adopted a generalized binned maximum-likelihood fit method. The MC samples that are used to define the PDFs are to a good approximation statistically independent of those used to train the neural networks for background suppression, since for the latter relatively small subsamples of the full MC samples have been used.
As mentioned above, the data are divided into n bins in a three-dimensional array in ∆E − m ES − q 2 corr . If d i is the number of selected events in bin i for a given single data sample corresponding to candidates for a specific decay mode, and a ji is the number of MC events from source j in this bin, then

$$N_D = \sum_{i=1}^{n} d_i, \qquad N_j = \sum_{i=1}^{n} a_{ji},$$

where N D is the total number of events in the data sample, and N j is the total number in the MC sample for source j. We assume that there are m different MC-generated source distributions that add up to describe the data. The predicted number of events in each bin f i (P j ) can be written in terms of the strength of the individual contributions P j (j = 1, .., m) as

$$f_i = N_D \sum_{j=1}^{m} \frac{P_j}{N_j}\, w_{ji}\, a_{ji} = \sum_{j=1}^{m} p_j\, w_{ji}\, a_{ji},$$

with p j = N D P j /N j . In each bin, the weights w ji account for the relative normalization of the samples and various other corrections.
Since the MC samples are limited in size, the generated numbers of events a ji have statistical fluctuations relative to the value A ji expected for infinite statistics, and thus the more correct prediction for each bin is

$$f_i = \sum_{j=1}^{m} p_j\, w_{ji}\, A_{ji}.$$

If we assume Poisson statistics for both the data and MC samples, the total likelihood function L is the combined probability for the observed d i and a ji [49],

$$\ln L = \sum_{i=1}^{n} \left(d_i \ln f_i - f_i\right) + \sum_{i=1}^{n} \sum_{j=1}^{m} \left(a_{ji} \ln A_{ji} - A_{ji}\right). \tag{27}$$

The first sum has the usual form associated with the uncertainty of the data and the second term refers to the MC statistics and is not dependent on data. There are (n + 1) × m unknown parameters that need to be determined: the m relative normalization factors p j , which are of interest to the signal extraction, and n × m values A ji . The problem can be significantly simplified. The n×m quantities A ji can be determined by solving n simultaneous equations for A ji of the form

$$\frac{d_i}{1 - t_i} = f_i = \sum_{j=1}^{m} \frac{p_j\, w_{ji}\, a_{ji}}{1 + p_j\, w_{ji}\, t_i},$$

with A ji = a ji /(1+p j w ji t i ) and t i = 1−d i /f i (for d i = 0 we define t i = 1). At every step in the minimization of −2lnL these n independent equations need to be solved. This procedure results not only in the determination of the parameters p j , but also in improved estimates for the various contributions A ji in each bin.

For fits to the individual data samples corresponding to the four signal decay modes, there is a specific likelihood function (Eq. 27). To perform a simultaneous fit to all four data samples, the log-likelihood function is the sum of the individual ones, labeled here by the sample index h. Some of the parameters p j may be shared among the four likelihood functions,

$$\ln L = \sum_{h=1}^{4} \left[ \sum_{i} \left(d_{hi} \ln f_{hi} - f_{hi}\right) + \sum_{i} \sum_{j} \left(a_{hji} \ln A_{hji} - A_{hji}\right) \right].$$
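The per-bin solution and the resulting log-likelihood can be sketched as follows. The sketch assumes the form of the likelihood given by Barlow and Beeston [49], ln L = Σ_i (d_i ln f_i − f_i) + Σ_i Σ_j (a_ji ln A_ji − A_ji), and the per-bin condition d_i/(1 − t_i) = Σ_j p_j w_ji a_ji/(1 + p_j w_ji t_i). It uses simple bisection, requires positive p_j w_ji, and omits the special treatment needed when the strongest source has no MC events in a bin. All function names are hypothetical.

```python
import math


def solve_t(d_i, a_i, pw_i, iterations=100):
    """Solve d/(1-t) = sum_j pw_j a_j / (1 + pw_j t) for one bin by bisection."""
    if d_i == 0:
        return 1.0  # convention for empty data bins quoted in the text
    active = [k for k, a in zip(pw_i, a_i) if a > 0]
    if not active:
        raise ValueError("bin has data but no MC events in any source")
    lo, hi = -1.0 / max(active), 1.0

    def g(t):
        # Monotonically increasing in t on (lo, hi) for positive pw_j.
        mc_sum = sum(k * a / (1.0 + k * t) for k, a in zip(pw_i, a_i) if a > 0)
        return d_i / (1.0 - t) - mc_sum

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def log_likelihood(p, d, a, w):
    """ln L for scale factors p[j], data d[i], MC counts a[j][i], weights w[j][i]."""
    total = 0.0
    for i, d_i in enumerate(d):
        a_i = [a[j][i] for j in range(len(p))]
        pw_i = [p[j] * w[j][i] for j in range(len(p))]
        t = solve_t(d_i, a_i, pw_i)
        big_a = [aj / (1.0 + k * t) if aj > 0 else 0.0 for k, aj in zip(pw_i, a_i)]
        f_i = sum(k * aa for k, aa in zip(pw_i, big_a))
        if d_i > 0:
            total += d_i * math.log(f_i)
        total -= f_i
        for aj, aa in zip(a_i, big_a):
            if aj > 0:
                total += aj * math.log(aa)
            total -= aa
    return total
```

A minimizer would call log_likelihood repeatedly while varying p; for a simultaneous fit, the values for the individual samples are summed, with shared parameters passed to each call.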

C. Fit Parameters and Inputs
The fits can be performed separately for each of the four data samples or combined for all four data samples, and where possible, with common fit parameters. The nominal fit in this analysis is a simultaneous fit of all four data samples: B 0 → π − ℓ + ν, B + → π 0 ℓ + ν, B 0 → ρ − ℓ + ν, and B + → ρ 0 ℓ + ν. A signal decay in one data sample may contribute to the background in another sample, and therefore these sources share a common fit parameter. For example, the scale factor for the B 0 → ρ − ℓ + ν signal in the B 0 → ρ − ℓ + ν sample is also applied in the B 0 → π − ℓ + ν sample, where it represents cross-feed background.
We impose isospin invariance for the signal decay modes,

$$\Gamma(B^0 \to \pi^- \ell^+ \nu) = 2\,\Gamma(B^+ \to \pi^0 \ell^+ \nu), \qquad \Gamma(B^0 \to \rho^- \ell^+ \nu) = 2\,\Gamma(B^+ \to \rho^0 \ell^+ \nu).$$

The yields of the true and combinatorial signal decays as well as isospin-conjugate decays are related to the same branching fraction and therefore share the same fit parameter. The B → X u ℓν background, which contains exclusive and non-resonant decays, is scaled by two parameters, one for low and intermediate q 2 (q 2 < 20 GeV 2 ) and one for high q 2 (q 2 > 20 GeV 2 ), for the fits to the B → πℓν samples. Because of the large correlation between B → X u ℓν background and B → ρℓν signal (> 90% for both B → ρℓν modes), we rely on MC simulation for the B → X u ℓν background and keep it fixed in the fits to the B → ρℓν samples. The BB background is split into two sources. Among the B → X c ℓν decays, we treat the dominant decay mode, B → D * ℓν, as a separate source and combine the other semileptonic decays (B → Dℓν, B → D ( * ) (nπ)ℓν) and the remaining (or "other") BB background (secondary leptons and fake leptons) into a single source. The continuum qq background sources containing true and fake leptons are combined into one fit source and scaled by a single fit parameter.
The complete list of fit sources and corresponding fit parameters is given in Table VI. The π ↔ ρ cross feed is a free fit parameter in the four-mode fit; for one-mode fits, it is fixed to the value obtained from the four-mode fit. In the four-mode fit, all background sources that are not fixed are fit separately for each signal mode, since the different hadrons of the signal decays lead to different combinatorial backgrounds.

D. Fit Results
The fits are performed both separately and simultaneously for the four signal decay modes, B 0 → π − ℓ + ν, B + → π 0 ℓ + ν, B 0 → ρ − ℓ + ν, and B + → ρ 0 ℓ + ν.

TABLE VI: List of fit parameters representing scale factors for the different signal samples and background sources. Parameters with index j are free parameters in the fit, one for each q 2 bin j. The π ↔ ρ cross-feed parameter is free only in the four-mode fit; for one-mode fits, it is fixed to the values obtained from the four-mode fit. There are independent scale factors for qq background, B → D * ℓν decays and for all other background sources from BB events for all four signal modes (subscripts π ± , π 0 , ρ ± , ρ 0 ). For the B → πℓν decays, the B → X u ℓν background is fit in two q 2 intervals (index k = 1, 2); for the B → ρℓν decays it is fixed.
for each bin in q 2 corr . As a measure of the goodness-of-fit we use χ 2 per degree of freedom; all fits have values in the range 1.05 − 1.11 (for details see Table VII).
The scale factors for the signal contributions, which are determined by the fits, can be translated to numbers of background-subtracted signal events for the four signal decays. These signal yields are listed in Table VII with errors that are a combination of the statistical uncertainties of the data and MC samples and the uncertainties of the fitted yields of the various backgrounds. For each signal decay mode, the table specifies the number of true and combinatorial signal decays. Their relative fraction is taken from simulation. This fraction is larger for decays with a π 0 in the final state. For all signal modes, the fraction of combinatorial signal events is small at low q 2 , increases with q 2 , and at the highest q 2 it is similar to or exceeds the one of true signal decays. This leads to larger errors in the measurement of q 2 , m ES and ∆E.
In Table XVII in Appendix XI B the correlation matrix of the four-mode fit is presented. We observe correlations of about 40 − 60% between the qq and the other BB backgrounds and between the B → D * ℓν and the other BB backgrounds for all signal modes. For B → πℓν, the correlation between the B → X u ℓν background and the signal at high q 2 is also sizable (≃ 60%). For B → ρℓν, this correlation is larger than 90%, which is why we choose to fix the B → X u ℓν background normalization for these two samples. As a test, we let the B → X u ℓν background normalization in the B → ρℓν modes vary as free parameter in the four-mode fit. This results in a B → X u ℓν contribution that is lower by a factor of 0.85 ± 0.15 for B 0 → ρ − ℓ + ν and 0.90 ± 0.14 for B + → ρ 0 ℓ + ν and an increase of the B → ρℓν signal yields by 10% in the first two q 2 bins and by 15% in the last q 2 bin. These changes are covered by the systematic uncertainties due to the B → X u ℓν background stated in Section VII.
To cross-check the results of the nominal four-mode fit, we also perform fits for each signal mode separately. The contributions from the other signal decay modes are fixed to the result obtained from the four-mode fit. Since the shape of the π ↔ ρ cross-feed contribution is very similar to the other B → X u ℓν background, we fix its normalization to the one obtained from the four-mode fit. A comparison of the results of the one-mode fits with the combined four-mode fit shows agreement within the fit errors of the B 0 → π − ℓ + ν and B + → π 0 ℓ + ν modes and the B 0 → ρ − ℓ + ν and B + → ρ 0 ℓ + ν modes in all q 2 bins.
The partial branching fractions for the different q 2 corr bins are derived as the products of the fitted signal scale factors and the signal decay branching fractions used in the simulation. The total branching fraction integrated over the entire q 2 range and its error are calculated as the sum of all partial branching fractions, taking into account the correlations of the fitted yields in different q 2 bins. The branching fraction for B 0 decays, B 0 signal , is related to the fitted signal yields, N 0 signal , in the following way,
$$ \mathcal{B}^0_{\rm signal} = \frac{N^0_{\rm signal}}{4\, f_{00}\, \epsilon^0_{\rm signal}\, N_{B\bar{B}}} , $$
where N BB is the total number of BB pairs, f 00 = 0.484 ± 0.006 [41] denotes the fraction of B 0 B̄ 0 events produced in Υ (4S) decays and ε 0 signal is the total signal efficiency (averaged over the electron and muon samples) as predicted by the MC simulation. The factor of four accounts for the fact that each event contains two B mesons, and that the branching fraction is quoted for a single charged lepton, not for the sum of the decays to electrons and muons. The branching fraction results are presented in Section VIII.
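As a numerical illustration of this conversion, the following minimal sketch assumes that the branching fraction is the fitted yield divided by 4 f 00 ε N BB , in line with the factor of four explained above. The yield and the efficiency are hypothetical placeholders, not values from this analysis; only f 00 and the number of BB pairs are taken from the text.

```python
# Sketch: convert a fitted signal yield into a B0 branching fraction.
# Assumed relation: BF = N_signal / (4 * f00 * efficiency * N_BB).

def branching_fraction(n_signal, efficiency, n_bb, f00=0.484):
    """Branching fraction for a single charged-lepton flavour."""
    if efficiency <= 0 or n_bb <= 0:
        raise ValueError("efficiency and n_bb must be positive")
    return n_signal / (4.0 * f00 * efficiency * n_bb)

N_BB = 377e6          # number of BB pairs quoted in the abstract
n_signal = 10000.0    # hypothetical background-subtracted yield
efficiency = 0.10     # hypothetical total signal efficiency

bf = branching_fraction(n_signal, efficiency, N_BB)
print(f"B = {bf:.3e}")
```

Because the efficiency enters in the denominator, a relative uncertainty on the efficiency translates directly into the same relative uncertainty on the branching fraction.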

E. Fit Validation and Consistency
The fit procedure is validated in several ways. First, the implementation of the Barlow-Beeston fit technique, which allows statistical fluctuations of the MC distributions to be incorporated, is checked by verifying the consistency of the fit variations with the statistical error of the input distributions. Second, a large number of simulated experiments are generated based on random samples drawn from the three-dimensional histograms used in the standard fit. Specifically, we create 500 sets of distributions by fluctuating each simulated source distribution bin-by-bin using Poisson statistics. For each of the sets, we add the source distributions to make up the total distribution that corresponds to the data distribution ("toy data"), which is then fitted by the standard procedure. In addition, we create independent fluctuations for the distributions that make up the source PDFs for the fit, in the same way as for the toy data described above. For a compilation of these 500 "toy experiments", we study the distributions of the deviation of the fit result from the input value divided by the fit error. These distributions show no significant bias for any of the free parameters and confirm that the errors are correctly estimated.
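The toy-experiment check can be illustrated with a much simplified sketch. It assumes a single template with a hypothetical one-dimensional binning and a single scale factor estimated from the total yield. It is not the Barlow-Beeston fit used in the analysis, and it does not fluctuate the template itself. The pull is the deviation of the fitted scale factor from its input value (1.0) divided by the fit error.

```python
import math
import random
import statistics

def poisson(rng, mean):
    """Draw a Poisson-distributed integer (Knuth's method, fine for modest means)."""
    limit = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def run_toys(template, n_toys=500, seed=1):
    """Return mean and width of the pull distribution for n_toys toy data sets."""
    rng = random.Random(seed)
    total = sum(template)
    pulls = []
    for _ in range(n_toys):
        toy = [poisson(rng, mu) for mu in template]   # bin-by-bin fluctuation
        n = sum(toy)
        scale = n / total               # estimate of the scale factor
        error = math.sqrt(n) / total    # its statistical error
        pulls.append((scale - 1.0) / error)
    return statistics.mean(pulls), statistics.stdev(pulls)

template = [40.0, 55.0, 70.0, 50.0, 35.0]   # hypothetical expected bin contents
mean_pull, width = run_toys(template)
print(f"pull mean = {mean_pull:.2f}, pull width = {width:.2f}")
```

A pull mean compatible with zero indicates an unbiased estimate, and a pull width compatible with one indicates correctly estimated errors.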
Additional fits are performed to check the consistency of the data. For instance, the data samples are divided into subsamples, i.e., the electron sample separated from the muon sample or the data separated into different run periods. These subsamples are fitted separately; the results agree within the statistical uncertainties.

VII. SYSTEMATIC UNCERTAINTIES
Many sources of systematic uncertainties have been assessed for the measurement of the exclusive branching fractions as a function of q 2 . Since this analysis does not depend only on the reconstruction of the charged lepton and hadron from the signal decay mode, but also on the measurement of all remaining tracks and photons in the event, the uncertainties in the detection efficiencies of all particles as well as the uncertainties in the background yields and shapes enter into the systematic errors.
Tables VIII and IX summarize the systematic uncertainties for B → πℓν and B → ρℓν for the four-mode fit. In Appendix XI A the systematic error tables for the one-mode fits are presented. The individual sources are, to a good approximation, uncorrelated and can therefore be added in quadrature to obtain the total systematic errors for each decay mode. In the following, we discuss the assessment of the systematic uncertainties in detail.
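For uncorrelated sources, the quadrature sum mentioned above amounts to the following minimal sketch. The source names and values are hypothetical placeholders, not entries from Tables VIII and IX.

```python
import math

def total_systematic(contributions):
    """Quadrature sum of uncorrelated relative uncertainties (in %)."""
    return math.sqrt(sum(value * value for value in contributions.values()))

# Hypothetical example values in %, chosen only to make the arithmetic obvious.
example = {"tracking": 1.0, "photon": 2.0, "lepton_id": 2.0}
total = total_systematic(example)
print(f"total systematic uncertainty: {total:.1f}%")
```

If the sources were correlated, the full covariance matrix would have to be used instead of this simple sum.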
For the estimation of the systematic errors of the fitted branching fractions, we compare the differential branching fractions obtained from the nominal fit with results obtained after changes to the MC simulation that reflect the uncertainty in the parameters that impact the detector efficiency and resolution or the simulation of signal and background processes. For instance, we vary the tracking efficiency, reprocess the MC samples, reapply the fit to the data, and take the difference compared to the results obtained with the nominal MC simulation as an estimate of the systematic error. The sources of systematic errors are not identical for all four signal decay modes, and the size of their impact on the event yields depends on the sample composition and q 2 .

A. Detector Effects
Uncertainties in the reconstruction efficiencies for charged and neutral particles and in the rate of tracks and photons from beam background, fake tracks, failures in the matching of EMC clusters to charged tracks, and showers split off from hadronic interactions, undetected K L , and additional neutrinos, all contribute to the quality of the neutrino reconstruction and impact the variables that are used in the preselection and the neural networks. For all these effects the uncertainties in the efficiencies and resolution have been derived independently from comparisons of data and MC simulation for selected control samples.

Track, Photon, and Neutral-Pion Reconstruction
We evaluate the impact of uncertainties in the tracking efficiency by randomly eliminating tracks with a probability that is given by the uncertainty ranging from 0.25% to 0.5% per track, as measured with data control samples.
Similarly, we evaluate the uncertainty due to photon efficiency by eliminating photons at random with an energy-dependent probability, ranging from 0.7% per photon above 1 GeV to 1.8% at lower energies. This estimate includes the uncertainty in the π 0 efficiency for signal decays with a π 0 , since photons originating from the signal hadron are also eliminated.
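The random elimination of tracks and photons can be sketched as follows. The function names are hypothetical, and the step function for the photon probability is an assumption: the text only quotes the values above 1 GeV and at lower energies.

```python
import random

def drop_randomly(objects, probability, rng):
    """Remove each object independently with the given probability."""
    return [obj for obj in objects if rng.random() >= probability]

def photon_drop_probability(energy_gev):
    """Step function between the two quoted values (an assumption)."""
    return 0.007 if energy_gev > 1.0 else 0.018

def drop_photons(energies_gev, rng):
    """Remove photons with an energy-dependent probability."""
    return [e for e in energies_gev
            if rng.random() >= photon_drop_probability(e)]

rng = random.Random(42)
tracks = list(range(100_000))             # hypothetical track list
kept = drop_randomly(tracks, 0.005, rng)  # upper end of the quoted range
lost_fraction = 1.0 - len(kept) / len(tracks)
print(f"fraction of tracks removed: {lost_fraction:.4f}")
```

The modified sample is then passed through the full selection and fit, and the change in the result is taken as the systematic uncertainty.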

Lepton Identification
The average uncertainties in the identification of electrons and muons have been assessed to be 1.4% and 3%, respectively. The uncertainty in the misidentification of hadrons as electrons or muons is about 15%.

K 0 L Production and Interactions
Events containing a K 0 L have a significant impact on the neutrino reconstruction, because only a small fraction of the K 0 L energy is deposited in the electromagnetic calorimeter. Based on detailed studies of data control samples of D 0 → K 0 π + π − decays and inclusive K 0 S samples in data and MC, corrections to the efficiency, shower deposition and the production rates have been derived and applied to the simulation as a function of the K 0 L momentum and angles (see Section III). To determine the systematic uncertainties in the MC simulations we vary the scale factors within their statistical and systematic uncertainties. The average uncertainty of the energy deposition in the EMC due to K 0 L interactions is estimated to be 7.5%. Above 0.7 GeV, the K 0 L detection efficiency is well reproduced by the simulation, with an estimated average uncertainty of 2%. At lower momenta, the simulation is corrected to match the data, and the uncertainty increases to 25% below 0.4 GeV.
The production rates for K 0 S in data and MC agree within errors, except for momenta below 0.4 GeV where  the data spectrum is low by 22 ± 7% compared to the MC simulation and a correction is applied. To assess the impact of the uncertainty of the correction procedure, the size of the correction is varied by its estimated uncertainty.

Signal Form Factors
To assess the impact of the form-factor (FF) uncertainty on the shape of the simulated signal distributions, we vary the B → π form factor within the uncertainty of the previous BABAR measurement [9] and the B → ρ form factors within the uncertainties of the LCSR calculation assessed by Ball and Zwicky [17]. For the latter we assume uncertainties on the form factors A 1 , A 2 and V of 10% at q 2 = 0. They rise linearly to 13% at q 2 = 14 GeV 2 and are extrapolated up to the kinematic endpoint. We add the uncertainties due to the three form factors in quadrature. For B → πℓν, the form-factor uncertainty is small, since we extract the signal in six bins of q 2 . In contrast, for B → ρℓν the form-factor uncertainty is one of the dominant sources of systematic error. This is partly due to the stricter requirement on the lepton momentum, p * ℓ > 1.8 GeV, which is imposed to suppress the large B → X c ℓν background. We refrain from using the difference between LCSR and ISGW2 as systematic uncertainty, but this difference is comparable to the estimate we obtain from the uncertainties in the LCSR calculation.
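The assumed B → ρ form-factor uncertainty and the quadrature combination can be sketched as follows. The linear extrapolation beyond 14 GeV 2 follows the description above; the per-form-factor shifts of the fitted result are hypothetical placeholders.

```python
import math

def ff_relative_uncertainty(q2):
    """Relative form-factor uncertainty: 10% at q2 = 0, rising linearly to
    13% at q2 = 14 GeV^2, with the same slope extrapolated beyond."""
    return 0.10 + 0.03 * q2 / 14.0

def combine_in_quadrature(shifts):
    """Combine the shifts obtained by varying A1, A2 and V separately."""
    return math.sqrt(sum(s * s for s in shifts.values()))

# Hypothetical shifts (in %) of a fitted partial branching fraction.
shifts = {"A1": 3.0, "A2": 4.0, "V": 12.0}
print(f"uncertainty at q2 = 7 GeV^2: {ff_relative_uncertainty(7.0):.3f}")
print(f"combined shift: {combine_in_quadrature(shifts):.1f}%")
```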

B → Xuℓν Background
The B → X u ℓν background contribution is composed of the sum of exclusive decays, B + → ωℓ + ν, B + → ηℓ + ν, and B + → η ′ ℓ + ν decays, and the remaining resonant and non-resonant B → X u ℓν decays that make up the total B → X u ℓν branching fraction. We estimate the total error of the B → X u ℓν background composition by repeating the fit with branching fractions for various exclusive and non-resonant decays varied independently within their current measurement errors. The uncertainty of the branching fraction for non-resonant decays is dominant; it is equal to the error on the total B → X u ℓν branching fraction, B(B → X u ℓν) = (2.33 ± 0.22) × 10 −3 [41].
In addition, the analysis is sensitive to the mass and composition of the charmless hadronic states. We assess the uncertainty of the predictions by varying the QCD parameters that define the mass, the lepton spectrum, and the q 2 distributions predicted by calculations [36] based on HQE. We vary the shape-function (SF) parameters m b and µ 2 π within the uncertainties (error ellipse) given in Ref. [38]. For the two B → ρℓν samples, the B → X u ℓν background is large compared to the signal and very difficult to separate. Consequently, the fit shows very high correlations between the fitted yields for signal and this background. We therefore choose to fix the background yields and shapes to those provided by the simulation, and account for the uncertainty by assessing the sensitivity of the fitted signal yield to variations of the B → X u ℓν branching fraction and the shapes of the background distributions, corresponding to the estimated error of the shape-function parameters. The resulting estimated errors are the two dominant contributions to the systematic errors of the B → ρℓν partial and total branching fractions.
TABLE VIII: Systematic errors in % for B(B 0 → π − ℓ + ν) from the four-mode fit for bins in q 2 and the total q 2 range. The total errors are derived from the individual contributions taking into account the complete covariance matrix.

B → Xcℓν Background
The systematic error related to the shapes of the B → X c ℓν background distributions is dominated by the [...] Table I. Since we scaled up the four B → D * * ℓν branching fractions to take into account the unknown D * * partial branching fractions, the errors were increased by a factor of three relative to the published values.
TABLE IX: Systematic errors in % for B(B 0 → ρ − ℓ + ν) from the four-mode fit for three bins in q 2 and the total q 2 range. The total errors are derived from the individual contributions taking into account the complete covariance matrix.
To evaluate the effect of uncertainties in the form-factor parameters for the dominant B → D * ℓν component, we repeat the fit with ±1σ variations in each of the three form-factor parameters, ρ 2 D * , R 1 and R 2 . The impact of the form factor for the B → Dℓν background is evaluated by varying the parameter ρ 2 D within its uncertainty.

Continuum Background
In Section V A, we have described the correction of the simulated shapes of the m ES , ∆E, and q 2 distributions for the continuum using linear functions derived from comparison with off-resonance data. The uncertainties of the fitted slopes of these correction functions are used to evaluate the errors due to modeling of the shape of the continuum background distributions. They represent a sizable contribution to the systematic error, which is mainly due to the low statistics of the off-resonance data sample.

Final-State Radiation and Bremsstrahlung
The kinematics of the signal decays are corrected for radiative effects such as final-state radiation and bremsstrahlung in detector material.
In the MC simulation, final-state radiation (FSR) is modeled using PHOTOS [35], which is based on O(α) calculations but includes multiple-photon emission from the electron. We have studied the effects of FSR on the q 2 dependence of the measured signal and background yields by comparing events generated with and without PHOTOS. The observed change is largest, up to 5%, for electron momenta of about 0.6 GeV (i.e. well below our cut-off at 1 GeV for B → πℓν and 1.8 GeV for B → ρℓν). Comparisons of the PHOTOS simulation with semi-analytical calculations [50] show excellent agreement. Allowing for the fact that non-leading terms from possible electromagnetic corrections to the strong interactions of the quarks in the initial and final state have not been calculated to any precision [51], we adopt an uncertainty in the PHOTOS calculations of 20%.
The uncertainty of the bremsstrahlung correction is determined by the uncertainty of the amount of detector material in the inner detector. We have adopted as the systematic uncertainty due to bremsstrahlung the impact of a change in the thickness of the detector material by 0.14% radiation lengths, the estimated uncertainty in the thickness of inner detector and the beam vacuum pipe. As for final-state radiation, the uncertainty in the effective radiator thickness impacts primarily the electron spectrum.
The uncertainties due to final-state radiation and bremsstrahlung combined amount to far less than 1% for most of the q 2 range.

Number of BB Events
The determination of the on-resonance luminosity and the number of BB events is described in detail elsewhere [52]. The uncertainty of the total number of BB pairs is estimated to be 1.1%.
Since we combine fits to decays of charged and neutral B mesons and make use of isospin relations, the B-meson lifetimes enter into the four-mode fit. We use the PDG [7] value for the B 0 lifetime, τ 0 = 1.530 ± 0.009 ps, and the lifetime ratio, τ + /τ 0 = 1.071 ± 0.009. These uncertainties lead to a systematic error of 0.3% for B → πℓν and 0.7% for B → ρℓν decays.

VIII. RESULTS
Based on the signal yields obtained in the four-mode fit, integrated over the full q 2 range (see Table VII), we derive the following total branching fractions, constrained by the isospin relations stated in Eqs. 30,
$$ \mathcal{B}(B^0 \to \pi^- \ell^+ \nu) = (1.41 \pm 0.05 \pm 0.07) \times 10^{-4} , $$
$$ \mathcal{B}(B^0 \to \rho^- \ell^+ \nu) = (1.75 \pm 0.15 \pm 0.27) \times 10^{-4} . $$
Here and in the following, the first error reflects the statistical (fit) error and the second the estimated systematic error. The total branching fractions obtained from the single-mode fits for the charged and neutral B → πℓν samples are listed in Table X; those for the charged and neutral B → ρℓν samples are listed in Table XI.
The single-mode fits result in higher values for B(B 0 → ρ − ℓ + ν) and B(B + → ρ 0 ℓ + ν) than the average branching fraction obtained from the four-mode fit. This may be explained by different treatments of the isospin-conjugate signal and the π ↔ ρ cross feed in the single- and four-mode fits. In contrast to the four-mode fit, the isospin-conjugate signal contribution in the single-mode fits is not constrained by the isospin-conjugate mode. In addition, the four-mode fit uses the same fit parameter for the signal and the cross feed from the signal mode into other modes, which leads to a slight decrease in the B → ρℓν branching fraction compared to the single-mode fits. Since the ρ → π cross feed is significantly larger than the π → ρ cross feed, the effect on the B → ρℓν results is larger than for B → πℓν.
Both the B → πℓν and the B → ρℓν results are consistent within errors with the isospin relations. By extracting the signal in several q 2 bins we also measure the q 2 spectra of B → πℓν and B → ρℓν decays. These spectra need to be corrected for effects such as detector resolution, bremsstrahlung, and final-state radiation.

A. Partial Branching Fractions
We correct the measured q 2 spectra for resolution, radiative effects and bremsstrahlung by applying an unfolding technique that is based on singular-value decomposition of the detector response matrix [53]. The detector response matrix in the form of a two-dimensional histogram of the reconstructed versus the true q 2 values (see Figure 4) is used as input to the unfolding algorithm. This algorithm contains a regularization term to suppress spurious oscillations originating from statistical fluctuations. To find the best choice of the regularization parameter κ we have studied the systematic bias on the partial branching fractions compared to the statistical uncertainty as a function of κ using a set of simulated distributions. The data samples in this analysis are large enough that no severe distortions due to statistical fluctuations are expected. We choose the largest possible value of κ, i.e., we set κ equal to the number of q 2 bins, to minimize a potential bias.
The ∆B/∆q 2 distributions resulting from the unfolding procedure are presented in Figure 24 for B → πℓν and in Figure 25 for B → ρℓν. Tables X and XI list the partial branching fractions ∆B for B → πℓν and B → ρℓν, respectively.

B. Form-factor Shape
For B → πℓν decays, we extract the shape of the form factor f + (q 2 ) directly from data. For B → ρℓν decays, we restrict ourselves to the measurement of the q 2 dependence, since the current experimental precision is not adequate to extract the three different form factors involved.
Several parameterizations of f + (q 2 ) are used to interpolate between results of various form-factor calculations or to extrapolate these calculations from a partial to the whole q 2 range. The four most common parameterizations, the BK [26], BZ [15], BGL [24,25] and BCL [29] parameterizations, have been introduced in Section II. For the BGL and BCL parameterizations, we consider a linear (k max = 2) and a quadratic (k max = 3) ansatz.
We perform χ 2 fits to the measured q 2 spectrum to determine the free parameters for each of these parameterizations. The fit employs the following χ 2 definition, with integration of the fit function over the q 2 bins,
$$ \chi^2 = \sum_{i,j=1}^{N_{\rm bins}} \Delta_i \, V^{-1}_{ij} \, \Delta_j , $$
where V −1 i,j is the inverse covariance matrix of the partial-branching-fraction measurements. ∆ k for q 2 bin k is defined as
$$ \Delta_k = \left( \frac{\Delta\mathcal{B}}{\Delta q^2} \right)^{\rm data}_k - \frac{C}{\Delta q^2_k} \int_{\Delta q^2_k} p_\pi^3(q^2)\, |f_+(q^2; \alpha)|^2 \, dq^2 , $$
where α denotes the set of parameters for a chosen parameterization of f + (q 2 ), and C = |V ub | 2 τ 0 G 2 F /(24π 3 ) is an overall normalization factor whose value is irrelevant for these fits since the data can only constrain the shape of the form factor, but not its normalization.
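A χ 2 of this kind, a quadratic form in the bin-wise residuals weighted by the inverse covariance matrix, can be sketched as follows. The residual is taken here as the measured value minus the bin average of the fit function, which is an assumption about the exact form. The binning, the measurements, and the linear "model" are hypothetical and only serve to make the arithmetic transparent.

```python
def chi2(residuals, inv_cov):
    """Return sum_ij residuals[i] * inv_cov[i][j] * residuals[j]."""
    n = len(residuals)
    return sum(residuals[i] * inv_cov[i][j] * residuals[j]
               for i in range(n) for j in range(n))

def bin_average(func, lo, hi, steps=200):
    """Average of func over [lo, hi] using the midpoint rule."""
    width = (hi - lo) / steps
    total = sum(func(lo + (k + 0.5) * width) for k in range(steps))
    return total * width / (hi - lo)

def bin_residuals(measured, edges, model):
    """Measured value minus the bin-averaged model, for each bin."""
    return [m - bin_average(model, lo, hi)
            for m, lo, hi in zip(measured, edges[:-1], edges[1:])]

# Hypothetical two-bin example with a diagonal covariance matrix.
edges = [0.0, 2.0, 4.0]
measured = [2.0, 5.0]
model = lambda q2: 1.0 + q2           # bin averages: 2.0 and 4.0
inv_cov = [[1.0, 0.0], [0.0, 0.25]]   # inverse of diag(1, 4)

value = chi2(bin_residuals(measured, edges, model), inv_cov)
print(f"chi2 = {value:.3f}")
```

In a real fit the model would be the predicted spectrum for a given parameter set α, and the χ 2 would be minimized with respect to α.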
In Table XII and Figure 24 we present the results of these fits to the B → πℓν samples. All parameterizations describe the data well, with χ 2 probabilities ranging from 10% to 18%. Thus, within the current experimental precision, all parameterizations are valid choices, and the central values for |V ub |f + (0) agree with each other. We choose the quadratic BGL parameterization as the default, though even a linear parameterization results in a very good fit to the data. The error band represents the uncertainties of the fit to data, based on the quadratic BGL parameterization (solid line in Figure 24). It has been computed using standard error propagation, taking the correlation between the fit parameters into account.
We compare the measured q 2 spectra with the shapes predicted by form-factor calculations based on lattice QCD [23], light-cone sum rules [15,19], and the ISGW2 [14] relativistic quark model. Among the available calculations for B → πℓν decays, the HPQCD lattice calculation agrees best with the data. It should be noted that the LQCD predictions are only valid for q 2 > 16 GeV 2 , the earlier LCSR calculation (LCSR 1) for q 2 < 16 GeV 2 , and the more recent LCSR calculation (LCSR 2) for q 2 < 12 GeV 2 ; their extrapolation is impacted by sizable uncertainties.
In Table XI and Figure 25 we present the results of the fits to the B → ρℓν samples. The LCSR calculation and the ISGW2 model are in good agreement with the data. However, the errors of the measured B → ρℓν partial branching fractions are relatively large, at the level of 15-30%, depending on the q 2 interval.
It should be noted that the theoretical calculations differ most for low and high q 2 . In these regions of phase space, the measurements are impacted significantly by higher levels of backgrounds, specifically continuum events at low q 2 and other B → X u ℓν decays that are difficult to separate from the signal modes at higher q 2 . These two background sources have been examined in detail, and the uncertainties in their normalization and shape are included in the systematic uncertainties. For the inclusive B → X u ℓν background, the q 2 and the hadronic mass spectra are derived from theoretical predictions that depend on non-perturbative parameters that are not well measured [38]. For B → ρℓν the correlation between the signal and the B → X u ℓν background is so large that they cannot both be fitted simultaneously. Thus the B → X u ℓν background scale factor and shape are fixed to the MC predictions, which have large uncertainties. MC studies indicate that this may introduce a bias affecting the signal yield. The stated errors account for this potential bias.
TABLE X: Partial and total branching fractions (corrected for radiative effects) for B 0 → π − ℓ + ν and B + → π 0 ℓ + ν decays obtained from the single-mode fits and B → πℓν decays from the four-mode fit with statistical (fit), systematic and total errors. The branching fraction for B + → π 0 ℓ + ν has been scaled by twice the lifetime ratio of neutral and charged B mesons. All branching fractions and associated errors are given in units of 10 −4 .
FIG. 24: [...] The fit results for the BZ and BCL parameterizations are barely visible, since they overlap almost completely with the BGL result. The shaded band illustrates the uncertainty of the quadratic BGL fit to data. Right: shape comparisons of the data to various B → πℓν form-factor predictions (LCSR 1 [15], LCSR 2 [19], HPQCD [23], ISGW2 [14]), which have been normalized to the measured total branching fraction. The extrapolations of the QCD predictions to the full q 2 range are marked as dashed lines.
FIG. 25: [...] [17] and from the ISGW2 quark model [14].

C. Determination of |Vub|
We choose two different approaches to determine the magnitude of the CKM matrix element V ub .
First, we use the traditional method to derive |V ub |. As in previous publications [6,[8][9][10][11], we combine the measured partial branching fractions with integrals of the form-factor calculations over a certain q 2 range using the relation
$$ |V_{ub}| = \sqrt{ \frac{\Delta\mathcal{B}(q^2_{\min}, q^2_{\max})}{\tau_0 \, \Delta\zeta(q^2_{\min}, q^2_{\max})} } , $$
where τ 0 = (1.530 ± 0.009) ps is the B 0 lifetime and ∆ζ is defined as
$$ \Delta\zeta(q^2_{\min}, q^2_{\max}) = \frac{G_F^2}{24\pi^3} \int_{q^2_{\min}}^{q^2_{\max}} p_\pi^3(q^2)\, |f_+(q^2)|^2 \, dq^2 . $$
The values of ∆ζ are derived from theoretical form-factor calculations for different q 2 ranges. Table XIII summarizes the ∆ζ values, the partial branching fractions and the |V ub | results. For B → ρℓν, values of ∆ζ are taken from the LCSR calculation in the range q 2 < 16 GeV 2 and the quark model predictions of ISGW2 over the full q 2 range. The results are also presented in Table XIII. Estimates of the uncertainties for ∆ζ are not given in Refs. [17] and [14].
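The traditional extraction can be sketched numerically, assuming the usual relation |V ub | = sqrt(∆B / (τ 0 ∆ζ)) with ∆ζ in units of ps −1 . The ∆B and ∆ζ inputs below are hypothetical placeholders, not entries of Table XIII; only the lifetime is taken from the text.

```python
import math

TAU_0_PS = 1.530   # B0 lifetime in ps, as quoted in the text

def vub(delta_b, delta_zeta_inv_ps, tau0_ps=TAU_0_PS):
    """|Vub| from a partial branching fraction and the normalized rate."""
    return math.sqrt(delta_b / (tau0_ps * delta_zeta_inv_ps))

def vub_relative_error(delta_b_relative_error):
    """|Vub| scales as sqrt(delta_b), so its relative error is halved."""
    return 0.5 * delta_b_relative_error

delta_b = 0.9e-4     # hypothetical partial branching fraction
delta_zeta = 4.0     # hypothetical normalized rate in ps^-1

v = vub(delta_b, delta_zeta)
print(f"|Vub| = {v:.2e}")
print(f"relative error for a 6% error on delta_b: {vub_relative_error(0.06):.3f}")
```

The square-root dependence is the reason why the experimental uncertainty on |V ub | is half the relative uncertainty of the partial branching fraction, while the theoretical uncertainty on ∆ζ enters in the same way.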
Second, we perform a simultaneous fit to the most recent lattice results and BABAR data to make best use of the available information on the form factor from data (shape) and theory (shape and normalization). A fit of this kind was first presented by the FNAL/MILC Collaboration [22] using the earlier BABAR results on B 0 → π − ℓ + ν decays [9].
To perform this fit, we translate the f + (q 2 ) predictions from LQCD to 1/(τ 0 |V ub | 2 ) ∆B/∆q 2 . We simultaneously fit this distribution and the ∆B/∆q 2 distribution from data as a function of q 2 . We use the BGL form-factor parameterization as the fit function, with the additional normalization parameter a norm = τ 0 |V ub | 2 , which allows us to determine |V ub | from the relative normalization of data and LQCD predictions.
The χ 2 for this fit is given by where Here (∆B/∆q 2 ) data is the measured spectrum, f lat + (q 2 l ) are the form-factor predictions from LQCD, and (V data ij ) −1 and (V lat ij ) −1 are the corresponding inverse covariance matrices for (∆B/∆q 2 ) data and G 2 F /(24π 3 )p 3 π (q 2 l )|f lat + (q 2 l )| 2 , respectively. The set of free parameters α of the fit function g(q 2 ; α) contains the coefficients a k of the BGL parameterization and the normalization parameter a norm .
From the FNAL/MILC [22] lattice calculations, we use only subsets with six, four or three of the twelve predictions at different values of q 2 , since neighboring points are very strongly correlated. All chosen subsets of LQCD points contain the point at lowest q 2 . It has been checked that alternative choices of subsets give compatible results. From the HPQCD [23] lattice calculations, we use only the point at lowest q 2 since the correlation matrix for the four predicted points is not available. For comparison, we also perform the corresponding fit using only the point at lowest q 2 from FNAL/MILC. The data, the lattice predictions, and the fitted functions are shown in Figure 26. Table XIV shows the numerical results of the fit.
For the nominal fit we use the subset with four FNAL/MILC points and assume a quadratic BGL parameterization. We refer to this fit as 3+1-parameter BGL fit (three coefficients a k and the normalization parameter a norm ). As can be seen in Table XII for the fit to data alone, the data are well described by a linear function with the normalization a 0 and a slope a 1 /a 0 . This indicates that most of the variation of the form factor is due to well-understood QCD effects that are parameterized by the functions P(q 2 ) and φ(q 2 , q 2 0 ) in the BGL parameterization. If we include a curvature term in the fit, the slope a 1 /a 0 = −0.82±0.29 is fully consistent with the linear fit; the curvature a 2 /a 0 is negative and consistent with zero. Since the z distribution is almost linear, we also perform a linear fit (2+1-parameter BGL fit) for comparison. The results of the linear fits are also shown in Table XIV.
The simultaneous fits provide very similar results, both for the BGL expansion coefficients, which determine the shape of the spectrum, and for |V ub |. The fitted values for the form-factor parameters are very similar to those obtained from the fits to data alone. This is not surprising, since the data dominate the fit results. Unfortunately the decay rate is lowest and the experimental errors are largest at large q 2 , where the lattice calculation can make predictions. We obtain from these simultaneous fits
$$ |V_{ub}| = (2.95 \pm 0.31) \times 10^{-3} , $$
where the error is the combined experimental and theoretical uncertainty. The coefficients a k are significantly smaller than 1, as predicted. The sum of the squares of the first two coefficients, $\sum_{k=0}^{1} a_k^2 = (0.85 \pm 0.20) \times 10^{-3}$, is consistent with the tighter bounds set by Becher and Hill [25].
Since the total error of 10% on |V ub | results from the simultaneous fit to data and LQCD predictions, it is nontrivial to separate the error into contributions from experiment and theory. We have estimated that the error contains contributions of 3% from the branching-fraction measurement, 5% from the shape of the q 2 spectrum determined from data, and 8.5% from the form-factor normalization obtained from theory.
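As a consistency check of this decomposition, the three estimated contributions can be added in quadrature. This sketch assumes that they are independent, which the simultaneous fit does not guarantee.

```python
import math

# Estimated contributions to the |Vub| error in %, as quoted in the text.
contributions = {
    "branching fraction": 3.0,
    "q2 shape from data": 5.0,
    "form-factor normalization": 8.5,
}

total = math.sqrt(sum(v * v for v in contributions.values()))
print(f"quadrature sum: {total:.1f}%")
```

The result is close to the total error of 10% quoted for the simultaneous fit, with the theoretical normalization as the largest single contribution.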
We study the effect of variations of the isospin relations imposed in the combined four-mode fit as stated in Eqs. 30. These relations are not expected to be exact, though the comparison of the single-mode fit results provides no indication for isospin breaking. The isospin-breaking effects are primarily due to π 0 − η and ρ 0 − ω mixing in B + → π 0 ℓ + ν and B + → ρ 0 ℓ + ν decays, respectively. They are expected to increase the branching fractions of the B + relative to the B 0 meson. Given the masses and widths of the mesons involved, the impact of π 0 − η mixing is expected to be smaller than that of ρ 0 − ω mixing.
Detailed calculations have been performed to correct form-factor measurements and to extract V us from semileptonic decays of charged and neutral kaons [54]. These calculations account for isospin breaking due to π 0 − η mixing and should also be applicable to B + → π 0 ℓ + ν decays. For B + → π 0 ℓ + ν decays the effect is expected to be smaller by a factor of three, i.e., the predicted increase is (1.5 ± 0.2)% [55]. For B + → ρ 0 ℓ + ν decays, calculations have not been carried out to the same precision. Based on the change in the π + π − rate at the peak of the ρ mass distribution, the branching fraction is predicted to increase by as much as 34% [56]. However, an integration over the resonances weighted by the proper Breit-Wigner function and taking into account the masses and finite ρ and ω widths results in a much smaller effect, an increase in the π + π − branching fraction of 6% [57].
We have assessed the impact of changes in the ratios of the branching fractions for charged and neutral B mesons on the extraction of the differential decay rates due to adjustments of the MC default branching fractions of the B + decays in the combined four-mode fit. For a 1.5% increase in the B + → π 0 ℓ + ν branching fraction, the fitted B → πℓν partial branching fraction decreases by 0.5%, while the B → ρℓν rate increases by less than 0.1%. A 6% increase in the B + → ρ 0 ℓ + ν branching fraction results in a decrease of the B → ρℓν rate by 3.1% and a 0.14% increase for the fitted B → πℓν rate. We observe a partial compensation to the change in the simulated B + → π 0 ℓ + ν rate due to changes in the B + → ρ 0 ℓ + ν background contribution, and vice versa. The observed changes in the fitted yields depend linearly on the imposed branching-fraction changes and are independent of q 2 .
For a 1.5% variation of the B + → π 0 ℓ + ν branching fraction, the value for |V ub | extracted from the measured B → πℓν spectrum decreases by 0.2%. A +6% variation of the B + → ρ 0 ℓ + ν branching fraction increases the value of |V ub | extracted from the same measured spectrum by 0.3%.

IX. CONCLUSIONS
In summary, we have measured the exclusive branching fractions B(B 0 → π − ℓ + ν) and B(B 0 → ρ − ℓ + ν) as a function of q 2 and have determined |V ub | using recent form-factor calculations. We measure the total branching fractions, based on samples of charged and neutral B mesons and isospin constraints, to be

$$\mathcal{B}(B^0 \to \pi^- \ell^+ \nu) = (1.41 \pm 0.05 \pm 0.07) \times 10^{-4},$$
$$\mathcal{B}(B^0 \to \rho^- \ell^+ \nu) = (1.75 \pm 0.15 \pm 0.27) \times 10^{-4},$$

where the first error is the statistical uncertainty of the fit employed to determine the signal and background yields and the second is the systematic uncertainty. The separate measurements of the branching fractions for charged and neutral B mesons are consistent within errors with the assumed isospin relations.

We have assessed the sensitivity of the combined branching-fraction measurements to isospin violations due to π 0 − η and ρ 0 − ω mixing in B + decays. Based on the best estimates currently available, the impact on the branching fractions is small compared to the total systematic errors. We refrain from applying corrections, given the uncertainties in the size of the effects.
The measured branching fraction for B → πℓν is more precise than any previous measurement and agrees well with the current world average B(B 0 → π − ℓ + ν) = (1.36 ± 0.05 ± 0.05) × 10 −4 [41]. The branching fraction for B → ρℓν is also the most precise single measurement to date based on a large signal event sample, although the Belle Collaboration has reported a systematic error smaller by a factor of two, based on a small signal sample of hadronically tagged events [11]. The B → ρℓν branching fraction presented here is significantly lower, by about 2.5σ, than the current world average B(B 0 → ρ − ℓ + ν) = (2.77 ± 0.18 ± 0.16) × 10 −4 [41]. The dominant uncertainty of this B → ρℓν measurement is due to the limited knowledge of the normalization and shape of the irreducible background from other B → X u ℓν decays.
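The size of these differences can be checked with a simple pull calculation. The sketch below is a cross-check, not the procedure used in the analysis. It assumes that the statistical and systematic errors of this measurement and of the world average are uncorrelated and can be added in quadrature, which is only approximate if the world average shares inputs with this result. The helper name `pull` is hypothetical, and the measured values are the combined branching fractions quoted in the abstract.

```python
import math


def pull(value_a, errors_a, value_b, errors_b):
    """Difference of two results in units of the combined error.

    Assumption: all error components are uncorrelated, so they are
    summed in quadrature.
    """
    combined = math.sqrt(sum(e ** 2 for e in (*errors_a, *errors_b)))
    return (value_a - value_b) / combined


# Branching fractions in units of 1e-4: (value, (stat, syst)).
rho_pull = pull(1.75, (0.15, 0.27), 2.77, (0.18, 0.16))
pi_pull = pull(1.41, (0.05, 0.07), 1.36, (0.05, 0.05))

if __name__ == "__main__":
    print(f"B -> rho l nu pull: {rho_pull:+.1f} sigma")
    print(f"B -> pi  l nu pull: {pi_pull:+.1f} sigma")
```

Under these assumptions the B → ρℓν result lies roughly 2.6 standard deviations below the world average, in line with the "about 2.5σ" quoted above, while the B → πℓν result differs by less than half a standard deviation.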
Within the sizable errors, the measured q 2 spectrum for B → ρℓν agrees well with the predictions from light-cone sum rules [17] and the ISGW2 [14] quark model. Neither of these calculations includes an estimate of its uncertainties. Significant improvements in the measurements of the form factors in B decays to vector mesons will require much cleaner data samples and a considerably better understanding of other B → X u ℓν decays.
We determine the CKM matrix element |V ub | using two different approaches. First, we use the traditional method to derive |V ub | by combining the measured partial branching fractions with the form-factor predictions based on different QCD calculations. The results, presented in Table XIII, agree within the sizable uncertainties of the form-factor predictions. For this approach we quote as a result the value of $|V_{ub}| = (3.78 \pm 0.13\,^{+0.55}_{-0.40}) \times 10^{-3}$, based on the most recent LCSR calculation for q 2 < 12 GeV 2 . Second, we extract |V ub | from simultaneous fits to data and lattice predictions using the quadratic BGL parameterization for the whole q 2 range. These fits to data and the two most recent lattice calculations by the FNAL/MILC [22] and HPQCD [23] Collaborations agree very well. We quote as a result the fitted value of |V ub | = (2.95 ± 0.31) × 10 −3 , based on the normalization predicted by the FNAL/MILC Collaboration. The total error of 10% is dominated by the theory error of 8.5%. This value of |V ub | is smaller by one standard deviation compared to the results of a combined fit to earlier BABAR measurements and the same recent FNAL/MILC lattice calculations [22].
The values of |V ub | presented here appear to be sensitive to the q 2 range for which theory predictions and the measured spectrum can be compared. LCSR calculations are restricted to low values of q 2 and result in values of |V ub | in the range of (3.63 − 3.78) × 10 −3 with theoretical uncertainties of $^{+16}_{-11}\%$ and experimental errors of 3 − 4%. LQCD predictions are available for q 2 > 16 GeV 2 and result in |V ub | in the range of (2.95 − 3.21) × 10 −3 and experimental errors of 5 − 6% for both the traditional method and the simultaneous fit to LQCD predictions and the measured spectrum. This fit combines the measured shape of the spectrum over the full q 2 range with the lattice QCD form-factor predictions at high q 2 and results in a reduced theoretical uncertainty of 8.5%, as compared to $^{+17}_{-11}\%$ for the traditional method.

Both |V ub | values quoted as results are also lower than most determinations of |V ub | based on inclusive B → X u ℓν decays, which are typically in the range (4.0 − 4.5) × 10 −3 . These inclusive measurements are very sensitive to the mass of the b quark, which is extracted from fits to moments of inclusive B → X c ℓν and B → X s γ decay distributions [38] and depends on higher-order QCD corrections. Estimated theoretical uncertainties are typically 6%.
To permit more stringent tests of the CKM framework and its consistency with the standard model of electroweak interactions, further reductions in the experimental and theoretical uncertainties will be necessary. For B → πℓν decays this will require a reduction in the statistical errors and improved detector hermeticity to more effectively reconstruct the neutrino, which will reduce backgrounds from all sources. Further improvements in the precision of lattice and other QCD calculations will also be beneficial.

X. ACKNOWLEDGMENTS
We would like to thank A. Khodjamirian, A. Kronfeld, P. Mackenzie, T. Mannel, J. Shigemitsu, and R. Van de Water for their help with theoretical form-factor calculations. We are grateful for the extraordinary contributions of our PEP-II colleagues in achieving the excellent luminosity and machine conditions that have made this work possible. The success of this project also relies critically on the expertise and dedication of the computing organizations that support BABAR.

… $q_0^2 = -0.65\,m_-^2$, as proposed in Ref. [22]. For the BCL parameterization, we choose $q_0^2 = (M_B + m_\pi)(\sqrt{M_B} - \sqrt{m_\pi})^2$, as proposed in Ref. [29]. For the B-meson and pion masses, we use $M_B = 5.279$ GeV and $m_\pi = 0.1396$ GeV.
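As a numerical illustration of the BCL choice of q 2 0 quoted above, the sketch below evaluates it for the stated masses. It also evaluates the conformal variable z(q 2 , q 2 0 ) in its commonly used form, with t + = (M B + m π ) 2 . That form of z is an assumption here, since its definition is not reproduced in this section, and the names `Q2_0_BCL`, `T_PLUS`, `T_MINUS` and `z` are illustrative.

```python
import math

M_B = 5.279    # B-meson mass in GeV, as quoted in the text
M_PI = 0.1396  # pion mass in GeV, as quoted in the text

T_PLUS = (M_B + M_PI) ** 2   # pair-production threshold, GeV^2
T_MINUS = (M_B - M_PI) ** 2  # kinematic endpoint of the q^2 spectrum, GeV^2

# BCL choice: q0^2 = (M_B + m_pi) * (sqrt(M_B) - sqrt(m_pi))^2
Q2_0_BCL = (M_B + M_PI) * (math.sqrt(M_B) - math.sqrt(M_PI)) ** 2


def z(q2, q2_0=Q2_0_BCL):
    """Conformal variable z(q^2, q0^2).

    Assumption: the standard mapping
    z = (sqrt(t+ - q^2) - sqrt(t+ - q0^2)) / (sqrt(t+ - q^2) + sqrt(t+ - q0^2)).
    """
    a = math.sqrt(T_PLUS - q2)
    b = math.sqrt(T_PLUS - q2_0)
    return (a - b) / (a + b)


if __name__ == "__main__":
    print(f"q0^2 (BCL) = {Q2_0_BCL:.2f} GeV^2")
    print(f"z(0)       = {z(0.0):+.3f}")
    print(f"z(t_minus) = {z(T_MINUS):+.3f}")
```

With this choice of q 2 0 the physical range 0 ≤ q 2 ≤ (M B − m π ) 2 maps onto an interval symmetric about z = 0, with |z| below about 0.28. This is why this choice is convenient for a truncated expansion in z.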
Table XVII shows the full correlation matrix for all signal and background fit parameters in the four-mode maximum-likelihood fit used to determine the signal yields, described in Section VI. This appendix also contains all statistical, systematic and total correlation and covariance matrices for the B → πℓν and B → ρℓν ∆B/∆q 2 measurements. The total correlation matrix is shown before and after unfolding of the q 2 spectrum. All covariance matrices are shown after q 2 unfolding. The total covariance matrix for B → πℓν in Table XXVIII is used in the form-factor fits described in Eq. 32 or 36.