Dynamical grooming of QCD jets

We propose a new class of infrared-collinear (IRC) and Sudakov safe observables with an associated jet grooming technique that removes dynamically soft and large angle branches. It is based on identifying the hardest branch in the Cambridge/Aachen re-clustering sequence and discarding prior splittings that occur at larger angles. This leads to a dynamically generated cut-off on the phase space of the tagged splitting that is encoded in a Sudakov form factor. In this exploratory study we focus on the mass and momentum sharing distributions of the tagged splitting which we analyze analytically to modified leading logarithmic accuracy and compare to Monte-Carlo simulations.


I. INTRODUCTION
Jets, or collimated sprays of particles originating from the fragmentation of energetic quarks and gluons, are among the most prominent features of high-energy particle collisions. The analysis of jet observables is crucial to study the theory of strong interactions, QCD, in the perturbative regime, including the running of the strong coupling constant $\alpha_s$. These phenomena also play an important role in constraining the background in searches for heavy particles, including the Higgs boson [1] and particles beyond the Standard Model [2].
In the case of high-energy hadronic collisions, however, the observables are strongly affected by a wide range of processes that are hard to account for in perturbation theory and conventional resummation techniques. These include radiation from outside of the jet (non-global logarithms) [3] and non-perturbative effects such as hadronization and underlying event activity. In the last decade, tackling these challenges has led to an improved analytical understanding of jet substructure, see [4][5][6] for recent reviews, coinciding with the maturing of fast and versatile jet reclustering procedures [7][8][9].
In this context, several jet grooming techniques designed to reduce the jet's sensitivity to non-local and non-perturbative physics have been developed. Such techniques have further evolved toward being easier to interpret in terms of perturbative QCD [10,11]. Representative examples are the modified Mass Drop Tagger (mMDT) grooming [10] and SoftDrop (SD) grooming [12], which provide a two-parameter algorithm to determine the first branching in an angular-ordered tree that is deemed to be sufficiently perturbative. Given the $i$-th primary emission off an angular-ordered jet (corresponding to the $i$-th branch of a Cambridge/Aachen reclustering along the leading flow of energy), where $p_{T,i1} > p_{T,i2}$ are the energies of the two splitting products, one removes such emissions until one identifies the first whose momentum sharing fraction satisfies the condition $z > z_{\rm cut}(\theta/R)^{\beta}$. Grooming recursively along the primary and secondary emission branches strongly reduces non-perturbative effects in specific cases [13]. Other techniques, such as trimming [14], recluster the jet with a smaller cone size and remove substructures below a certain energy cut-off. Grooming techniques have also proven promising to study the internal structure of quenched jets in the context of heavy-ion collisions [15][16][17][18], see also [19].
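The SoftDrop condition quoted above can be illustrated with a minimal sketch. It assumes the primary declustering history is already available as a list of $(z, \theta)$ pairs ordered from large to small angle, with $z$ the momentum fraction of the softer branch; the function name `soft_drop_index` and the toy numbers are hypothetical and not taken from any library.

```python
from typing import List, Optional, Tuple

# Each primary splitting is (z, theta): momentum fraction of the softer branch
# and opening angle, listed from large to small angle (C/A declustering order).
Splitting = Tuple[float, float]


def soft_drop_index(splittings: List[Splitting], z_cut: float, beta: float,
                    R: float) -> Optional[int]:
    """Index of the first splitting with z > z_cut * (theta / R)**beta, if any."""
    for i, (z, theta) in enumerate(splittings):
        if z > z_cut * (theta / R) ** beta:
            return i
    return None  # every emission failed the condition


# Hypothetical toy jet with R = 0.8 (values chosen only for illustration).
toy_jet = [(0.02, 0.7), (0.06, 0.4), (0.30, 0.2), (0.10, 0.05)]

print(soft_drop_index(toy_jet, z_cut=0.1, beta=0.0, R=0.8))  # mMDT-like setting
print(soft_drop_index(toy_jet, z_cut=0.1, beta=1.0, R=0.8))  # angle-dependent cut
```

With $\beta > 0$ the threshold relaxes at small angles, so a softer emission can be accepted earlier in the sequence than with $\beta = 0$.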
Furthermore, substructure techniques probe our knowledge of the multi-particle regime of QCD. Thus, designing new observables often goes hand in hand with a grooming scheme that permits a direct comparison to experimental data. For example, the momentum sharing variable $z_g$ of the first accepted emission in SoftDrop with $\beta = 0$, which coincides with mMDT grooming [10], turns out to be an ultraviolet fixed point [20], so that its distribution does not depend on the strong coupling constant. Other examples, such as the groomed jet mass [21,22], have been calculated to next-to-leading logarithmic accuracy.
Despite the many successes, current jet substructure techniques are often quite simple but lack an internal "logic" that would allow one to estimate the most natural choice for the grooming parameters. These procedures are sensitive to the choice of parameters, e.g. $z_{\rm cut}$ and $\beta$ in the case of SD, and their optimal values, in terms of resilience to underlying event or other distortions, can possibly depend on the jet $p_T$, underlying event activity and other unknown parameters. Clearly, if $z_{\rm cut} \ll 1$ the sensitivity to non-perturbative infrared effects is enhanced. Moreover, from an analytical point of view, their inclusion generates new scales on the level of jet substructure observables that complicate the understanding of the different contributing modes. This arises because the intrinsic jet scale fluctuates on a jet-by-jet basis. One would therefore wish for a method that aligns more closely with the intrinsic properties of a given jet without the need for fine tuning.
In this work, we aim to alleviate some of these shortcomings. We consider a class of observables based on selecting, or tagging, the hardest splitting in an angular ordered shower, where the "hardness" is characterized by a pseudo energy correlation variable as follows. Given the $i$-th $1 \to 2$ splitting in the C/A [8] re-clustering sequence, the variable that measures the "hardness" of a jet, or in other words defines the "hardest" splitting within a jet, is defined as
$$\kappa^{(a)} = \frac{1}{p_T}\,\max_{i \,\in\, \text{C/A seq.}} \left[ z_i (1 - z_i)\, p_{T,i} \left( \frac{\theta_i}{R} \right)^{a} \right], \tag{1}$$
where $p_T$ and $R$ are the energy and radius of the jet respectively, $z_i$ is the momentum sharing fraction, $p_{T,i}$ the energy of the parent, $\theta_i$ the relative angle of the splitting, and $a$ is a free parameter whose physical interpretation will be discussed below. Note that for $a = 2$ we would select the splitting with the shortest formation time, $t_f^{-1} \sim \kappa^{(2)} p_T$. We refer to this case as TimeDrop in what follows. Alternatively, for $a = 1$ we tag the branching with the largest relative transverse momentum, $k_t \sim \kappa^{(1)} p_T$, and name this option $k_t$Drop. Finally, $a = 0$ corresponds to the splitting with the most symmetric momentum sharing and is called zDrop in what follows. In fact, $a = 0$ leads to collinear sensitivity, see App. A, and we will rather use $a = 0.1$ for all practical purposes below.

Figure 1. Dynamical grooming applied to an angular ordered tree. The splitting represented by the thick (black) line has the largest $\kappa^{(a)}$ in the tree. In tagging mode, observables are calculated using the kinematic variables $z_g$ and $\theta_g$ of the tagged splitting. In grooming mode, softer splittings that appear earlier in the tree, i.e. at larger angles, are discarded and the jet kinematics is adjusted accordingly.
Having identified a genuinely hard branching in the shower, we suggest two strategies.
• In tagging mode, the kinematics of the hardest splitting informs the observable one wishes to compute. This will be the main focus of this paper.
• In grooming mode, one discards all emissions taking place prior to the hard splitting in the reclustering sequence. This procedure can easily be iterated along all the branches of the jet. We will pursue this strategy for other groomed observables in an upcoming publication.
The main advantage of this method is that it autogenerates the conditions for tagging or grooming on a jet-by-jet basis. While a similar strategy is also pursued within e.g. jet pruning [23], the procedure in our case is simpler to implement and closer in spirit to the physics of color coherence. In fact, softer emissions in the C/A sequence prior to the hardest one can be considered as radiation off the total charge. Our procedure only depends on one parameter which defines what we mean by the hardest emission inside a jet in contrast to most other techniques that involve two (extrinsic) parameters. Hence, we refer to this procedure as dynamical grooming. A schematic illustration of how to dynamically groom an angular ordered shower can be found in Fig. 1.
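The tagging and grooming modes described above can be sketched in a few lines. The example assumes that the primary C/A declustering sequence has already been extracted as a list ordered from large to small angle, and that the hardness of each step is $z_i(1-z_i)\,(p_{T,i}/p_T)\,(\theta_i/R)^a$ as described in the text; the class and function names, as well as the toy numbers, are hypothetical. A realistic grooming step would also remove the discarded constituents and recompute the jet kinematics, which is not attempted here.

```python
from typing import List, NamedTuple


class Splitting(NamedTuple):
    z: float      # momentum sharing fraction of the splitting
    theta: float  # opening angle of the splitting
    pt: float     # transverse momentum of the parent


def hardness(s: Splitting, a: float, jet_pt: float, R: float) -> float:
    """Hardness z(1-z) * (pt_parent / pt_jet) * (theta / R)**a of one splitting."""
    return s.z * (1.0 - s.z) * (s.pt / jet_pt) * (s.theta / R) ** a


def tag_hardest(seq: List[Splitting], a: float, jet_pt: float, R: float) -> int:
    """Tagging mode: index of the splitting with the largest hardness."""
    if not seq:
        raise ValueError("empty declustering sequence")
    return max(range(len(seq)), key=lambda i: hardness(seq[i], a, jet_pt, R))


def groom(seq: List[Splitting], a: float, jet_pt: float, R: float) -> List[Splitting]:
    """Grooming mode (sketch): drop all splittings before the tagged one."""
    return seq[tag_hardest(seq, a, jet_pt, R):]


# Hypothetical primary branch of a jet with pT = 500 GeV and R = 0.8,
# ordered from large to small angle.
toy = [
    Splitting(0.05, 0.60, 500.00),
    Splitting(0.15, 0.30, 475.00),
    Splitting(0.40, 0.05, 403.75),
    Splitting(0.20, 0.01, 242.25),
]

for name, a in (("TimeDrop", 2.0), ("ktDrop", 1.0), ("zDrop", 0.1)):
    print(name, "tags declustering step", tag_hardest(toy, a, 500.0, 0.8))
```

For this toy input the three settings tag three different steps of the same sequence, which illustrates how the single parameter $a$ changes what is meant by the hardest splitting.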
This paper is structured as follows. In Sec. II, we discuss vetoed showers and introduce the probability for a splitting to be the hardest. In Sec. III, we employ the derived Sudakov form factor to compute a family of observables based on the tagged splitting. We then perform analytical calculations in the modified leading-logarithmic approximation (MLLA) for the tagged mass and $z$ distributions in Sec. III A and Sec. III B. Finally, Sec. IV is dedicated to Monte-Carlo simulations of proton-proton collisions. We generate and analyze the Lund planes for the tagged splittings within the different dynamical grooming settings and present a systematic study of the impact of non-perturbative phenomena on the tagged mass and $z$ distributions. We end with a short discussion and outlook in Sec. V.

II. VETOED SHOWERS AND TAGGING
Dynamically grooming a jet amounts to a certain reorganization of the conventional parton shower, where we will assume angular ordering. Given a specific 1 → 2 splitting in the jet history, the procedure forces the emissions taking place both before, i.e. at angles larger than the selected splitting, and after, i.e. at smaller angles, not to allow for a harder emission. The reorganization depends on the properties of a given jet allowing for a procedure that is more adapted to account for jet-by-jet fluctuations.
The "hardness" variable $\kappa^{(a)}$ is easily accessible in experimental data as well as in Monte-Carlo showers, and will be used in Sec. IV. In order to tackle the problem analytically, in a transparent manner and up to the required logarithmic precision, some simplifications are in order. To next-to-leading logarithmic accuracy in an angular ordered shower, it can be shown that the hardest splitting takes place off the leading particle in the jet or, in other words, on the primary Lund plane [24,25]. In this case, we neglect the energy depletion of the leading jet, $p_{T,i} \simeq p_T$, and hence the explicit dependence on momenta cancels out in Eq. (1). Accordingly, for a collinear safe definition, i.e. for $a > 0$ in Eq. (1), the hardness of a tagged splitting is given by
$$\kappa^{(a)} = z(1-z)\left(\frac{\theta}{R}\right)^{a}. \tag{2}$$
Whenever it is obvious from the context, we will simply write $\kappa \equiv \kappa^{(a)}$. The central quantity in our computations is the following Sudakov form factor,
$$\Delta(\kappa|a) = \exp\left[ -\int_0^R \frac{d\theta}{\theta} \int_0^1 dz\, P(z)\, \frac{\alpha_s(k_t^2)}{\pi}\, \Theta\!\left( z(1-z)\left(\frac{\theta}{R}\right)^{a} - \kappa \right) \right], \tag{3}$$
where $P(z)$ is the splitting function and the Heaviside function vetoes emissions with a $\kappa^{(a)}$ larger than the measured, or tagged, emission. Note that the angular integral, spanning between 0 and $R$, enforces this veto all over the primary Lund plane of the jet. Finally, $\alpha_s(k_t^2)$ is the strong coupling constant evaluated at $k_t = z(1-z)\theta p_T$, the transverse momentum generated at the splitting.
Such a form factor arises as a remainder contribution from the vetoed showers occurring before and after the hard emission. A similar construction was previously used as a method to match parton showers with next-to-leading order contributions, where the hardest emission was the one with the largest $k_t \sim \kappa^{(1)}$ [26]. Here, we will proceed with a more direct line of reasoning to derive the relevant probability distribution. Taking the derivative and then integrating over $\kappa$ in Eq. (3) leads to the following identity, [Eq. (4)] where $\Delta(\kappa) \equiv \Delta(\kappa|a)$. Clearly, we have that $\Delta(\infty) = 1$.
For $a > 0$, the Sudakov form factor given in Eq. (3) is infrared and collinear finite. Therefore, we can safely take the limit $\kappa \to 0$, resulting in $\Delta(0) = 0$. This is no longer the case for collinear unsafe observables, explicitly for $a = 0$, where we have to introduce a non-perturbative cut-off scale to regulate the integrals. We will treat this particular case in more detail in App. A, and focus in the remainder of the paper on the collinear safe taggers. Then, for collinear safe observables we can construct a normalized probability distribution of the splitting with the largest $\kappa^{(a)}$ in an angular ordered shower. Owing to the fact that [Eq. (5)] it follows that [Eq. (6)] where [Eq. (7)] is the probability that a splitting with momentum sharing fraction $z$ at angle $\theta$ yields the largest $\kappa^{(a)}$ in the shower, i.e. that it is the hardest splitting.
In order to gain analytic insight, we will work for simplicity in the modified leading-logarithmic approximation (MLLA), which assumes angular ordering and where the one-loop Altarelli-Parisi splitting function can be approximated as [Eq. (8)] where the second term comes from approximating the finite part of the splitting function by […]. For fixed coupling and using the approximation in Eq. (8), the integrals in Eq. (3) can be done analytically. The final expression for the Sudakov form factor in the MLLA is then [Eq. (9)] where $\bar\alpha \equiv \alpha_s C_F/\pi$. Only the first two terms are relevant to leading-logarithmic (LL) accuracy. For the numerical evaluations, we will keep the term of order one to ensure exact normalization of our observables.
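As a cross-check of the structure of the form factor, its double-logarithmic limit can be worked out by hand. The sketch below is not a transcription of the original formulas. It assumes that the veto integral has the measure $(\alpha_s/\pi)\,P(z)\,dz\,d\theta/\theta$ over the primary Lund plane, that the coupling is fixed, that only the soft-singular part $P(z) \to 2C_F/z$ is kept, and that $z(1-z) \approx z$.

```latex
\begin{align}
\ln \Delta(\kappa|a)\Big|_{\rm DLA}
  &= -2\bar\alpha \int_0^R \frac{d\theta}{\theta} \int_0^1 \frac{dz}{z}\,
     \Theta\!\left( z \left(\tfrac{\theta}{R}\right)^{a} - \kappa \right) \\
  &= -2\bar\alpha \int_0^{\frac{1}{a}\ln\frac{1}{\kappa}} dt\,
     \left( \ln\frac{1}{\kappa} - a\,t \right),
     \qquad t \equiv \ln\frac{R}{\theta}, \\
  &= -\frac{\bar\alpha}{a}\,\ln^2\kappa ,
  \qquad \bar\alpha \equiv \frac{\alpha_s C_F}{\pi}.
\end{align}
```

Under these assumptions the exponent depends on the coupling only through the ratio $\bar\alpha/a$, it vanishes at $\kappa = 1$, and it diverges for $a \to 0$ at fixed $\kappa < 1$, in line with the collinear sensitivity of the $a = 0$ case noted in the text.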

III. COMPUTING TAGGED OBSERVABLES
Since our procedure exploits the properties of the hardest branching in the jet shower, we will be interested in observables that are of the same form as Eq. (2). Generically, such distributions are then given by [Eq. (10)] where $\sigma|_a$ indicates that the cross section is measured given a tagged splitting with the largest $\kappa^{(a)}$ in the parton shower. The normalized distribution of the jet "hardness" is then written as [Eq. (11)] Hence, the first argument of the function $H(b|a)$ refers to the observable $\kappa^{(b)}$ that is measured on the kinematics of the tagged splitting, while the second defines what we mean by tagging the hardest splitting, i.e. we identify the splitting with the largest $\kappa^{(a)}$. Whenever it is obvious from the context, we will simply denote $\kappa^{(b)} \equiv \kappa$. Without loss of generality, we work in the MLL approximation and treat $z \ll 1$. Then, assuming $b > 0$, we evaluate the $\delta$-function to obtain [Eq. (12)] where now the argument of the running coupling is […]. It is straightforward to check that Eq. (12) is normalized to unity. After further simplifications, assuming $a \neq b$, the final expression reads [Eq. (13)] where we introduced the notation $\bar P(z) \equiv zP(z)$ and where now […]. It is clear from Eq. (13) that, except when $b = 0$, $H(b|a)$ is an infrared and collinear safe quantity since it admits a Taylor expansion in the coupling constant. For $b = 0$, the $x$ integration goes from 0 to $\kappa$ and the integrand exhibits an essential singularity at 0 that is regulated by the Sudakov form factor. The resulting integral is an asymptotic series, and hence finite, although the perturbative expansion is divergent term by term. In this case, $H(0|a)$ is said to be Sudakov safe [20,27].
Remarkably, in the double logarithmic approximation (DLA), where we drop the second term in Eq. (8), and with fixed coupling, we obtain [Eq. (14)] which turns out to be invariant under the transformation [Eq. (15)] where the equality only holds at DLA. This immediately singles out $a = b$ as a "fixed point" of this class of observables. The dual distributions correspond to different tagging modes, and in the limiting case $a \to 0$ this amounts to strong and weak grooming in the left- and right-hand side of Eq. (15), respectively. Below we will come back to the meaning of these observations. We plot the normalized distribution $H(b|a)$ as a function of the variable $\kappa^{(b)}$ in Fig. 2 for three values of $a$, corresponding (from left to right) to zDrop ($a \approx 0$), $k_t$Drop ($a = 1$) and TimeDrop ($a = 2$). For each grooming setting we plot the distributions for different variables $\kappa^{(b)}$ with $0.1 < b < 2$. To observe the approximate duality of Eq. (15) it is sufficient to notice that, in the central panel ($k_t$Drop), the distributions reach their maximal values at $b = a$ and start decreasing for $b > a$.
In order to discuss the qualitative features of the spectrum at this level it is sufficient to focus on the case $b > a > 0$. The asymptotic behavior of the $\kappa$-distribution, i.e. for $\ln^2\kappa \gg (\bar\alpha a)^{-1}$, reads [Eq. (16)] We observe two qualitative features. First, the distribution peaks at $\ln\kappa \sim (\bar\alpha a/b^2)^{-1/2}$. Hence, the peak of the distribution shifts to larger values of $\kappa$ with decreasing $a$. Second, from Eq. (16), we see that the distribution flattens as $a$ decreases. The opposite case, i.e. $a > b > 0$, can be found in a similar way, e.g. using Eq. (15) at DLA. These features are seen in Fig. 2. As a result, the limits $a \to \infty$ and $a \to 0$ exhibit similar behavior despite their different grooming modes.
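The fixed-coupling DLA behavior discussed above can be explored numerically with a short sketch. It does not reproduce the original Eq. (14). It assumes a DLA Sudakov factor $\exp[-(\bar\alpha/a)\ln^2\kappa]$, the soft limit $P(z) \to 2C_F/z$ and $z(1-z) \approx z$, under which the angular integral reduces to a difference of error functions that is invariant under $a \to b^2/a$. The value of `ABAR` and the function names are illustrative choices.

```python
import math

ABAR = 0.05  # illustrative fixed value of alpha_s * C_F / pi (assumption)


def h_dla(L: float, b: float, a: float, abar: float = ABAR) -> float:
    """DLA estimate of H(b|a) at L = ln(1/kappa^(b)), for a > 0 and b > 0."""
    if L <= 0.0:
        return 0.0
    if math.isclose(a, b):
        # Limit b -> a of the expression below.
        return 2.0 * abar / a * L * math.exp(-abar / a * L * L)
    prefactor = math.sqrt(math.pi * abar * a) / (b - a)
    return prefactor * (math.erf(math.sqrt(abar / a) * L)
                        - math.erf(math.sqrt(abar * a) * L / b))


def integrate(f, lo: float, hi: float, n: int = 40000) -> float:
    """Composite trapezoidal rule on [lo, hi] with n intervals."""
    step = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * step)
    return total * step


if __name__ == "__main__":
    b = 2.0  # mass-like observable
    for a in (0.1, 1.0, 2.0):
        norm = integrate(lambda L: h_dla(L, b, a), 0.0, 400.0)
        dual = h_dla(5.0, b, b * b / a)
        print(f"a={a}: norm={norm:.4f}  H(L=5)={h_dla(5.0, b, a):.4f}  dual={dual:.4f}")
```

Within these assumptions the distribution integrates to one in $\ln\kappa$, the values at $a$ and $b^2/a$ coincide, and the $a = b$ branch is the smooth limit of the general expression.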
Turning now to the special case $a = b > 0$, we evaluate Eq. (12), which gives [Eq. (17)] where now $k_t = z^{1-\frac{1}{a}}\,\kappa^{\frac{1}{a}}\,Q$. The remaining integral in Eq. (17) is regulated from below and is finite. In the MLLA and at fixed coupling we get [Eq. (18)] It turns out that the distribution measured this way corresponds to the plain distribution to LL accuracy, i.e. the distribution of the observables without any grooming, see Eq. (20) for a concrete example. Hence, having $b \sim a$ corresponds to a low degree of grooming, such that the distribution closely resembles the plain one, while $b \gg a$ results in strong grooming.
We now proceed to consider in more detail two important observables in jet physics, namely the mass and the momentum sharing fraction $z$ of the tagged, hard splitting. As a further example, we will consider the tagged $k_t$ distribution in App. B.

A. Tagged mass distribution (b = 2)
The case $b = 2$ is related to the mass of a given splitting or, in other words, to the virtuality $m^2$ of the parent particle that decayed. Defining the rescaled variable $\rho \equiv m^2/(p_T R)^2$, the normalized distribution is simply $H(2|a) \equiv d\ln\sigma/d\ln\rho$. Using Eq. (12), we find at MLL accuracy, [Eq. (19)] where in this case $\kappa \equiv \kappa^{(2)}$. The result of numerically solving Eq. (19) is displayed in Fig. 3 for the most representative values of $a$, i.e. TimeDrop ($a = 2$), $k_t$Drop ($a = 1$) and zDrop ($a \approx 0$). We observe how the qualitative features generically discussed in the previous section are manifest: as $a$ decreases the $\rho$ distribution flattens while the peak shifts to larger values of $\rho$.
The case where $b = a = 2$ is of special interest. Using Eq. (18) we obtain [Eq. (20)] which remarkably reproduces the result for the plain mass at leading logarithmic accuracy. This is not surprising since to this level of accuracy the plain mass is determined by the hardest splitting.
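The correspondence with the plain mass can be made explicit at double-logarithmic level. The following sketch is an illustration under stated assumptions rather than a reproduction of Eqs. (18) and (20): it assumes fixed coupling, the soft limit $P(z) \to 2C_F/z$, and a DLA Sudakov factor $\Delta(\kappa|a) = \exp[-(\bar\alpha/a)\ln^2\kappa]$.

```latex
\begin{align}
H(a|a)\Big|_{\rm DLA}
  &= 2\bar\alpha \int_0^R \frac{d\theta}{\theta}
     \int_0^1 \frac{dz}{z}\; \Delta(\kappa|a)\;
     \kappa\,\delta\!\left( \kappa - z \left(\tfrac{\theta}{R}\right)^{a} \right)
   = \frac{2\bar\alpha}{a}\,\ln\frac{1}{\kappa}\;
     e^{-\frac{\bar\alpha}{a}\ln^2\kappa} , \\[4pt]
\frac{\rho}{\sigma}\frac{d\sigma}{d\rho}\bigg|_{a=b=2}
  &= \bar\alpha\,\ln\frac{1}{\rho}\;
     e^{-\frac{\bar\alpha}{2}\ln^2\rho} .
\end{align}
```

The second line has the familiar form of the leading-logarithmic plain jet mass distribution at fixed coupling, a single-emission factor $\bar\alpha \ln(1/\rho)$ multiplied by the Sudakov suppression $\exp[-(\alpha_s C_F/2\pi)\ln^2\rho]$.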

B. Tagged z distribution (b = 0)
As a second example, we consider the tagged $z$-distribution, with $b = 0$ so that $\kappa^{(0)} = z(1-z) \approx z$. Since we are now dealing with a potentially infrared unsafe but Sudakov safe [20] observable, see Eq. (10), some care is required. However, defining $H(0|a) \equiv d\ln\sigma/d\ln z$, it is straightforward to derive [Eq. (21)] After fixing the coupling and at MLLA, Eq. (21) transforms into [Eq. (22)] The resulting tagged $z$-distributions obtained by numerically solving Eq. (22) are displayed in Fig. 4 for $2 > a > 0$.
The origin of the main features observed in Fig. 4 can be understood analytically by resorting to the DLA, where [Eq. (23)] This distribution is cut off at a characteristic value of $z$, namely [Eq. (24)] For $a \gg \bar\alpha$, this opens a wide range $z_{\rm cut} < z < 0.5$ where the distribution falls off as $z^{-1}$ and is modulated by $\bar\alpha/a$. However, for $a \gtrsim 0$ and $z_{\rm cut} \approx 1$, we find that [Eq. (25)] i.e. the distribution grows slowly with $z$. These features are roughly reproduced in Fig. 4, where the drop-off for the $k_t$Drop case is clearly visible around $z \sim 0.02$. In this context it is interesting to notice that the cut-off in $z$, Eq. (24), is dynamically generated and is a measure of $\alpha_s/a$. This is quite different from SD (mMDT) grooming with $\beta = 0$, where the cut-off is simply given by the input to the algorithm. Although the distribution is modulated by the same ratio, dynamical grooming opens up the possibility to probe the splitting function down to low $z$.
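The dynamically generated cut-off can be illustrated with a small numerical sketch. It is an estimate under assumptions, not the original Eqs. (23) and (24): with fixed coupling, $P(z) \to 2C_F/z$ and a DLA Sudakov factor $\exp[-(\bar\alpha/a)\ln^2\kappa]$, the angular integral gives $z\,d\sigma/(\sigma\,dz) = \sqrt{\pi\bar\alpha/a}\;\mathrm{erfc}\big(\sqrt{\bar\alpha/a}\,\ln(1/z)\big)$, which switches off around $z_{\rm cut} \sim e^{-\sqrt{a/\bar\alpha}}$. The value of `ABAR` and the function names are hypothetical.

```python
import math

ABAR = 0.05  # illustrative fixed value of alpha_s * C_F / pi (assumption)


def z_cut(a: float, abar: float = ABAR) -> float:
    """Parametric estimate of the dynamically generated cut-off in z."""
    return math.exp(-math.sqrt(a / abar))


def h_z_dla(z: float, a: float, abar: float = ABAR) -> float:
    """DLA estimate of the tagged z-distribution, (z / sigma) d(sigma)/dz."""
    x = math.sqrt(abar / a) * math.log(1.0 / z)
    return math.sqrt(math.pi * abar / a) * math.erfc(x)


for name, a in (("zDrop", 0.1), ("ktDrop", 1.0), ("TimeDrop", 2.0)):
    zc = z_cut(a)
    print(f"{name:8s} a={a:<4} z_cut={zc:.4f} "
          f"H(z=0.5)={h_z_dla(0.5, a):.3f} H(z_cut)={h_z_dla(zc, a):.3f}")
```

In this estimate the plateau height and the position of the cut-off both depend on the coupling only through $\bar\alpha/a$, so a larger $a$ lowers the plateau while extending it to smaller $z$.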

IV. MONTE-CARLO STUDIES AND RESILIENCE TO NON-PERTURBATIVE EFFECTS
In this Section, we complement our analytical studies by using PYTHIA8 [28] to simulate di-jet events in proton-proton collisions at $\sqrt{s} = 13$ TeV. For each event, particles are clustered into anti-$k_T$ jets [9] with $R = 0.8$ and re-clustered with Cambridge/Aachen using FastJet 3.1 [29]. The analysis is performed on jets with $p_T > 450$ GeV/c. Further, the sensitivity to non-perturbative phenomena such as the underlying event (multi-parton interactions and initial state radiation) and hadronization is explored.
We plot the kinematics of the tagged emissions on the primary Lund plane for the three main choices of $a$ in Eq. (1), corresponding to TimeDrop ($a = 2$), $k_t$Drop ($a = 1$) and zDrop ($a \approx 0$), in Fig. 5. It is clear from these figures that the condition on the hardest branch in each of these three cases corresponds to suppressing the phase space at large formation times (alternatively, small virtualities), small $k_t$'s or small momentum fractions $z$, respectively. It is important to point out that there are no sharp cuts in the kinematical plane, in contrast to other existing grooming algorithms, such as trimming, filtering, pruning and SoftDrop. This remarkable feature arises due to the fact that the hardest emission, which can be thought of as a proxy of the realistic jet scale, fluctuates on a jet-by-jet basis. Nevertheless, a dynamical cut is generated which can be estimated by solving $\Delta(\kappa|a) = 1/2$. Up to DLA we find [Eq. (26)] or, in terms of $k_t = z\theta/R$, [Eq. (27)] This defines a straight line boundary that is clear from the MC simulations of the Lund planes in Fig. 5. Parametrically, the point where the critical line crosses the y-axis can be estimated within the fixed coupling approximation to scale as [Eq. (28)]

As another illustration of dynamical grooming, we plot in Fig. 6 the distribution of the number $i$ of the tagged, hardest branching. Although IRC unsafe, it is useful to investigate the location of the tagged branching in the C/A sequence. The larger the power $a$, the narrower and more peaked around $i = 1$, i.e. the first C/A de-clustering step, the distribution is. This is quite natural since $a = \infty$ corresponds to an angular-ordered Sudakov form factor. In the opposite limit, $a \to 0$, the distribution widens and peaks around $i \gg 1$. More precisely, the average $i$ in each grooming setting is $\approx 2$ (TimeDrop), 3 ($k_t$Drop) and 5 (zDrop).
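The straight-line shape of the dynamical boundary can be seen from a short estimate. The sketch below is an assumption-based illustration and not the original Eqs. (26) to (28): it takes a fixed coupling and a DLA Sudakov factor $\Delta(\kappa|a) = \exp[-(\bar\alpha/a)\ln^2\kappa]$, and uses $\kappa = z(\theta/R)^a$ together with $k_t = z\theta/R$ as in the text.

```latex
\begin{align}
\Delta(\kappa_c|a) = \tfrac{1}{2}
  \;&\Longrightarrow\;
  \ln\frac{1}{\kappa_c} = \sqrt{\frac{a \ln 2}{\bar\alpha}} , \\[4pt]
k_t = \kappa \left(\frac{\theta}{R}\right)^{1-a}
  \;&\Longrightarrow\;
  \ln k_t = -\sqrt{\frac{a \ln 2}{\bar\alpha}} + (a - 1)\,\ln\frac{R}{\theta} .
\end{align}
```

In the $(\ln R/\theta,\ \ln k_t)$ plane this is a line of slope $a - 1$: horizontal for $a = 1$ (a cut on $k_t$), rising with unit slope for $a = 2$ (constant $k_t\theta$), and approaching slope $-1$ for $a \to 0$ (constant $z$). Its intercept at $\theta = R$ scales parametrically as $\ln k_t \sim -\sqrt{a/\bar\alpha}$.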
The $\rho$-distribution is presented in Fig. 7. As anticipated, both TimeDrop and $k_t$Drop exhibit a plain-mass-like shape, while SoftDrop and zDrop deliver an almost flat distribution. The sensitivity to non-perturbative physics is similar in every scenario, especially for values of $\rho > 10^{-3}$. It is worth noticing that while zDrop is remarkably robust against the underlying event, $k_t$Drop performs better with respect to hadronization. Therefore, we expect that a compromise to reduce the sensitivity to both mechanisms simultaneously could be […]

In the top panel of Fig. 8, the $z$ distribution of the tagged splitting for different grooming procedures at partonic level and without underlying event is displayed. For completeness, we show the results obtained with SoftDrop for $z_{\rm cut} = 0.1$ and two different choices of $\beta = 0, 1$. We find an excellent agreement between the qualitative features of the analytic estimate for the dynamical grooming family, as shown in Fig. 4, and the more realistic scenario provided by full-fledged Monte-Carlo simulations. It is worth noticing the different behavior of SoftDrop ($\beta = 0$) and zDrop even though they use the same variable for tagging, i.e. the momentum sharing. Regarding the impact of non-perturbative effects, in the central and bottom panels we evaluate the role of the underlying event and hadronization, respectively. We would like to highlight the resilience of $k_t$Drop to hadronization effects, an imprint of its effectiveness in selecting the most perturbative splitting. For the other cases, an overall similar performance to SoftDrop is found.

V. CONCLUSIONS
In this work we have proposed a new set of jet substructure observables defined by the "hardest" splitting in a C/A re-clustered jet. We explore three representative definitions of "hardness" in terms of formation time (TimeDrop), relative transverse momentum of the splitting ($k_t$Drop) and momentum sharing (zDrop). For a splitting tagged by any of these three choices, its kinematics serve to compute any observable, such as its mass or momentum sharing fraction. Other observables, such as the groomed angular distribution, can be derived in a completely analogous fashion.
We have developed an analytical framework that gives a good qualitative understanding of the features seen in full Monte-Carlo simulations. The key object in these calculations is the Sudakov form factor, Eq. (3), that vetoes all primary emissions in the full angular range of the jet. While many contemporary grooming procedures involve two parameters, our approach relies on the intrinsically generated jet scale whose proxy is the "hardness" defined via the continuous parameter $a$. The amount of grooming is related to how different the observable is from the variable that defines the hardness, i.e. to how $b$ compares with $a$, where $a = b$ results in plain distributions. We also found that, in contrast to SD/mMDT where $z_{\rm cut}$ is a parameter of the grooming, a similar cut-off scale is naturally generated by the strong QCD dynamics, $z_{\rm cut} \sim e^{-\sqrt{a/\bar\alpha}}$. The observables discussed in this work are IRC safe except for the $z$-distribution, which turns out to be Sudakov safe.
So far we have only investigated observables exploiting the tagged hardest splitting inside a jet. In addition to the remarkable features of the analytic distributions, our Monte-Carlo studies indicate that these observables are quite resilient to non-perturbative effects, including both hadronization and underlying event, for a large part of the distributions. We find it particularly interesting to note that even with relatively mild grooming, $b \gtrsim a$, the mass distribution is robust in the region of its peak (this is also the case for the $k_t$ distribution). We propose to study such observables experimentally as they represent, perhaps, the closest realization of perturbative parton dynamics in fully fledged jet observables.
In this work we have deliberately avoided studying the grooming mode in more detail, where branches that violate the ordering set by the hardest branching would be removed, leading to modifications of the jet kinematics. This procedure naturally lends itself to an interpretation of removing radiation sensitive to the total color charge of the jet. It could easily be implemented in a recursive fashion along all the primary and secondary branches/planes of the jet. This will be studied in more detail in an upcoming paper.

Figure 9. The tagged $\kappa^{(1)}$-distribution, $\frac{\kappa^{(1)}}{\sigma}\frac{d\sigma}{d\kappa^{(1)}}$, for fixed coupling as given by Eq. (B1) for $2 > a > 0$.

[…] $dz\,\frac{\alpha_s(z m^2)}{2\pi}\,P(z)\,\Delta(\kappa|0)$, (A3) where the second condition on the integral comes about by demanding that $k_t^2 > Q_0^2$. We notice a strong shape sensitivity to the ratio $Q^2/Q_0^2$. The normalization factor appears from the unitarity condition, and reads […] where the last line was obtained in DLA for fixed coupling.