Multi-Higgs Probes of the Dark Sector

In the Higgs portal framework, cascade decays of the dark sector fields naturally produce multi-Higgs final states along with dark matter. Heavier dark states commonly couple more strongly to dark matter than light fields, including the 125 GeV Higgs boson, do. In this case, dark matter production is often accompanied by Higgs production, so that multi-Higgs final states play an essential role in probing the hidden sector. We study 2 and 3 Higgs final states with missing energy using a multivariate analysis with Boosted Decision Trees. We find that the di-Higgs channel is quite promising for the $\bar b b + \gamma \gamma$ and $\bar b b + \bar \ell \ell$ decay modes. The tri-Higgs final state with missing energy, on the other hand, appears to be beyond the reach of the LHC in analogous channels. This may change when fully hadronic Higgs decays are considered.


Introduction
The scalar sector of the Standard Model (SM) remains relatively little explored compared to the gauge sector. In particular, no Higgs boson self-interaction has been measured. It is a well-motivated and experimentally viable possibility that the Higgs field plays the role of a portal into the hidden sector, which may contain dark matter and possibly other cosmologically relevant fields [1][2][3]. Exploring this portal requires Higgs coupling measurements as well as a search for extra states. The Higgs self-coupling can be determined via Higgs boson pair production, one of the central objectives of the future (HL-)LHC searches; cf. e.g. [4] for a recent summary. Higgs portal dark matter can manifest itself in di-Higgs or multi-Higgs final states with large missing energy, which arises from the emission of undetected dark matter. Analogous signals arise in a multitude of New Physics models, as studied in [5][6][7][8][9][10][11][12][13][14][15][16]. LHC searches for di-Higgs + $\slashed{E}_T$ have been presented in [17][18][19].
Within a similar framework, a study of di-Higgs + $\slashed{E}_T$ production with the $4b + \slashed{E}_T$ final state has been performed in [20]. It utilises a jet substructure technique to reconstruct the boosted Higgs bosons. While this method is efficient for heavy dark states, it is less reliable in the intermediate mass range. In our work, we consider both 2 and 3 Higgs final states, with the Higgs bosons subsequently decaying into $\bar b b$, $\gamma\gamma$ and $WW$. These have the advantage of being cleaner channels with a lower background. Instead of using a traditional cut-based analysis, we employ (as done for $4b + \slashed{E}_T$ in [20]) a multivariate technique with Boosted Decision Trees, which leads to much better sensitivity to New Physics. Further probes of the model are provided by monojet + $\slashed{E}_T$ searches [21]. The analysis of [21] is based on rectangular cuts and as such leads to lower sensitivity than the present study does.
To set the framework for our study, we introduce a simplified model in which cascade decays of heavier states produce dark matter and the Higgses. This model is motivated by non-Abelian gauged hidden sectors in which gauge symmetry is broken completely by VEVs of a minimal set of dark Higgs multiplets [22]. Such models automatically lead to stable dark matter which consists of a subset of gauge fields (and possibly extra scalars). Smaller groups like U(1) and SU(2) [23,24] do not allow for the needed cascade decays, while SU(3) and larger groups have all the necessary features. Further options, e.g. when part of the gauge group condenses, have been explored in [25,26].
This paper is structured as follows. In Section 2, we motivate our study and introduce the simplified model. In Section 3, we perform an LHC study of the di-Higgs final state with missing energy using multivariate analysis with Boosted Decision Trees and present the resulting sensitivity to model parameters. In Section 4, our findings on the tri-Higgs final state with missing energy are summarized. Section 5 concludes our study.
2 Motivation: multi-Higgs states and missing energy from dark gauge sectors

Dark Higgsed gauge sectors
The presence of stable states which can play the role of dark matter is a common feature of dark sectors with spontaneously broken gauge symmetry. In particular, breaking SU(N) completely with the minimal number of dark Higgs fields automatically leads to stable dark matter. To be specific, let us briefly summarize the main relevant features of the SU(3) example [22] (see also [27,28]). The set-up contains two dark Higgs triplets $\phi_i$ ($i = 1, 2$), which break the symmetry completely. The Lagrangian and the potential for the dark Higgses are given in [22]. In the unitary gauge, $\phi_1$, $\phi_2$ are parametrized in terms of real VEVs $v_i$ and real scalar fields $\varphi_i$. For simplicity, we assume an unbroken CP symmetry in the scalar sector, i.e. we assume that the couplings are real and $v_4 = 0$. As explained in [22,30], the model then has an unbroken U(1) $\times\, Z_2$ global symmetry. Its $Z_2 \times Z_2$ subgroup leads to stability of multicomponent DM. The parities of the fields under $Z_2 \times Z_2$ transformations are summarized in Table 1.
The lightest states with non-trivial parities cannot decay into Standard Model particles and are viable DM candidates. The hidden vector and scalar fields can mix with other fields of the same $Z_2 \times Z_2$ parity.
In this example, the heavier vector states decay into lighter ones with scalar emission. The scalar couplings to the vectors depend on the VEVs of the triplets as well as their mixing with the SM Higgs. The CP-even scalars are all expected to mix, and their mass eigenstates include the 125 GeV Higgs $h$, a heavier scalar $H$ and two more heavy scalars which play no significant role in our discussion. Given that $H$-emission is often kinematically forbidden, cascade decays of the heavy states naturally produce multi-Higgs $h$ signals.

Simplified model
In our study, only the main features of this set-up play a role. Hence, it is convenient to introduce a simplified model which inherits the salient features of the Higgs portal framework. Consider an extension of the SM by three fields: a Higgs-like scalar $H$ with mass $m_H$, a stable vector field $A_L$ with mass $m_{A_L}$ and a heavier unstable vector field $A_H$ with mass $m_{A_H}$. Dark matter is assumed to be composed of $A_L$, while the 125 GeV Higgs $h$ and the heavier Higgs $H$ are mixtures of the SM Higgs doublet and the hidden sector singlet, characterized by the mixing angle $\theta$.
The interactions of the prototype fields are listed in [22]. The result depends on which of the triplets the SM Higgs mixes with predominantly. For our purposes, it is convenient to parametrize the couplings of $h$, $H$ to SM matter and SM gauge fields, as well as their couplings to $A_L$, $A_H$, in terms of mixing angles. In addition to the masses, the New Physics parameters are: the dark gauge coupling $g$, the Higgs mixing angle $\theta$ and an angle $\delta$ that sets the relative strength of the scalar coupling to the dark vectors. Further relevant couplings include a term with coefficient $\kappa_{112}$, which governs $H \to hh$, and a $\sigma$-term. Here $\kappa_{112}$ is not fixed by the other model parameters, so that we can take BR($H \to hh$) to be a free variable. The $\sigma$-term accounts for decay of the heavy gauge boson into dark matter and SM states. The di-Higgs + $\slashed{E}_T$ final state is generated through the diagram in Fig. 1, left. We consider the regime where $H$ is produced on-shell and decays into a pair of $A_H$ with a significant branching fraction. These subsequently decay into $A_L$'s and $h$'s as long as kinematically allowed, while the $A_L$ pair escapes undetected, thereby producing the missing energy signal.
In the simplest model, $A_H$ decays exclusively into $A_L$ and $h$. On the other hand, BR($H \to A_H A_H$) varies. The partial widths for the $H$ decay into SM fermions and SM vector bosons and for the $H$ decay into hidden vectors follow from the couplings of the simplified model. One sees that $\cos\delta$ is required to be small in order to get a significant BR($H \to A_H A_H$). A large portion of parameter space is excluded by the invisible $h$-decay constraint BR($h \to$ inv) < 0.1, which also forces $\cos\delta \ll 1$. In the SU(3) model, the smallness of $\cos\delta$ can be attributed to the Higgs mixing predominantly with one of the triplets, namely $\phi_1$, such that $H$ has a large $\phi_1$ component (see Table 3 of [22]). Finally, we note that the chosen values of $\sin\theta$ and $m_H$ are consistent with both the LHC and electroweak precision measurements [31,32].
In this work, we remain agnostic as to the DM production mechanism and do not impose a constraint on the thermal DM annihilation cross-section. On the other hand, the direct DM detection cross-section is automatically suppressed: the $h$-mediated contribution is small due to $\cos^2\delta \ll 1$, while the $H$-mediated contribution is suppressed by $\sin^2\theta$ and $1/m_H^4$. Thus, in what follows we may focus on the collider aspects of the model as long as the parameter values are in the ballpark of those considered in this section.
A layer of complexity can be added to the simplified model by including an extra heavy gauge boson $A'_H$. Indeed, dark sectors with a symmetry group larger than SU(2) contain multiple heavy bosons with different masses. The relevant couplings take the form $\lambda_{abi}\, A^a_\mu A^{b\,\mu}\, \Phi_i$, where $A^a = A'_H, A_H, A_L$ and $\Phi_i = h, H$. Depending on the couplings $\lambda_{abi}$ and the masses, $H$ may predominantly decay into a pair of $A'_H$, which then decay into $A_H$ and $A_L$ with $h$-emission. This can lead, for example, to a $3h$ and dark matter final state as shown in Fig. 1, right. Such exotic processes may provide an additional handle on the dark sector properties.
3 LHC search for di-Higgs production with missing energy
Certain aspects of collider phenomenology of related models have been studied before [6,20,21,33-37] through different production channels. In this work, we focus on production of the heavy CP-even scalar ($H$) through gluon fusion. It subsequently decays via the hidden sector vector fields into 125 GeV Higgs bosons ($h$) along with two dark matter particles. This can result in various final states depending on the decay mode of the 125 GeV Higgs boson. The dominant decay of the Higgs to $\bar b b$ has been studied extensively in [20]. However, such a multi-b-jet final state is plagued with hadronic backgrounds and one needs a very clear understanding of the $W$+jets and QCD backgrounds at high luminosity in order to isolate the signal events. Using the jet substructure technique to reconstruct the Higgs bosons is an efficient tool for a heavy $H$, while in the mass range of interest to us it is less reliable. Instead, we explore a cleaner channel where one of the 125 GeV Higgses decays into a pair of photons. The photon identification efficiency is quite good and, even though the corresponding branching ratio is small, an enhanced cut efficiency makes this channel a significant one. Thus, our signal region consists of two b-jets, two photons and large missing transverse energy. The LHC collaborations have studied the two b-jets + 2 photons signal region quite extensively in the context of BSM Higgs searches [38]. However, this does not include large missing energy, which we use as an additional feature to suppress the SM background.
The dominant SM background to the signal arises from $t\bar t h$, $b\bar b h$, $Zh$, $\gamma\gamma$ + jets, $t\bar t\gamma\gamma$ and $ZZ\gamma\gamma$ production. We have simulated all these processes along with our signal for some representative benchmark points at the 14 TeV LHC. The parton level events have been generated using MadGraph5 [39,40]. We use the NNPDF parton distribution function [41,42] for our computation. These events are then passed through PYTHIA8 [43,44] for decay, showering and hadronisation. For processes with additional jets at the parton level, MLM matching [45,46] was performed through the MadGraph5-PYTHIA8 interface. The complete event information is further passed through Delphes3 [47][48][49] for detector simulation. The jets are constructed via FastJet [50] following the anti-$k_t$ algorithm [51].

Problems with the cut-based analysis
We use the following kinematic cuts to achieve a good signal to background ratio:
• C1: The final state must contain two b-jets with $p_T^b > 25$ GeV and two photons with $p_T^\gamma > 20$ GeV. The pseudorapidity of both the b-jets and the photons must lie within $|\eta| < 2.5$. There should be no charged leptons with $p_T > 20$ GeV and $|\eta| < 2.5$.
• C2: The missing transverse energy $\slashed{E}_T$ must be larger than 120 GeV.
• C5: The invariant mass of the photon pair must lie within a 5 GeV window of the 125 GeV Higgs, $|m_{\gamma\gamma} - 125~\mathrm{GeV}| < 5$ GeV.
• C6: The angular separation between the $bb$ pair and the photon pair must be large, $\Delta\phi(bb, \gamma\gamma) > 2.0$.
In Tables 2 and 3 we show our results obtained from the cut-based analysis. Clearly, the chosen cuts are effective in reducing the SM background, but they also decrease the signal cross-section to the extent that any possible signal could only be observed at the high-luminosity LHC. The 14 TeV LHC is projected to accumulate an integrated luminosity of 3000 fb$^{-1}$, which appears insufficient to reach discovery-level signal significance in most of the parameter space. We have also tried to soften the cuts to increase the signal rate, but that eventually results in a worse signal to background ratio. Our conclusion is that imposing rectangular cuts does not lead to good discovery prospects.
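The cuts that survive in the list above (C1, C2, C5 and C6) can be sketched as a simple event filter. This is only an illustration: the event record layout (`bjets`, `photons`, `leptons`, `met`, `m_gamgam`, `phi_bb`, `phi_gamgam`) is a hypothetical format, not the one used in the analysis, and "two b-jets" and "two photons" are read here as exactly two, which is an assumption.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation wrapped into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


def passes_cuts(event):
    """Apply cuts C1, C2, C5 and C6 to a hypothetical event record.

    event keys (all assumed): 'bjets', 'photons', 'leptons' are lists of
    dicts with 'pt' [GeV] and 'eta'; 'met' and 'm_gamgam' are in GeV;
    'phi_bb' and 'phi_gamgam' are the azimuthal angles of the two systems.
    """
    def central(objs, pt_min):
        return [o for o in objs if o["pt"] > pt_min and abs(o["eta"]) < 2.5]

    # C1: two b-jets (pT > 25 GeV), two photons (pT > 20 GeV), lepton veto
    if len(central(event["bjets"], 25.0)) != 2:
        return False
    if len(central(event["photons"], 20.0)) != 2:
        return False
    if central(event["leptons"], 20.0):
        return False
    # C2: large missing transverse energy
    if event["met"] <= 120.0:
        return False
    # C5: diphoton mass within 5 GeV of 125 GeV
    if abs(event["m_gamgam"] - 125.0) >= 5.0:
        return False
    # C6: bb and diphoton systems well separated in azimuth
    return delta_phi(event["phi_bb"], event["phi_gamgam"]) > 2.0
```

Because the cuts are applied sequentially, counting how many events survive each `return False` branch would give a cut-flow table of the usual kind.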

Multivariate analysis
Evidently, the cut-based analysis is not sensitive enough to probe the present scenario at the 14 TeV LHC even at high luminosity. Thus, we next explore the possibility of improving the analysis with machine learning techniques, namely Gradient Boosted Decision Trees (BDT) [52]. This method of data analysis is used quite extensively in LHC searches to good effect [20,53-57]. We have chosen the XGBoost [52] toolkit for the gradient boosting analysis. The decision trees are built on a set of kinematic variables (cf. [38]).
We have chosen 1000 trees, maximum depth 4 and learning rate 0.01 for our analysis. We combine data on these kinematic variables for our signal events with all the background events in one data file. All the events are required to have at least two b-jets ($p_T^b > 25$ GeV), at least two photons ($p_T^\gamma > 20$ GeV) and no charged leptons with $p_T > 20$ GeV. We have combined the background events after properly weighting them according to their cross-sections subject to these cuts. As a result, more importance is given to the dominant backgrounds while training the classifier. We also make sure that there are enough signal events to match the total weight of the background events. We take 80% of our data for training and 20% for testing.
Table 4: AUC for the benchmark points (cf. Table 3) along with the required luminosity to achieve a 3σ signal significance in the $2b + 2\gamma + \slashed{E}_T$ channel at the 14 TeV LHC, assuming BR($H \to A_H A_H$) = 1 or 0.7 (in parentheses).
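The sample preparation described above (cross-section weighting of the backgrounds, matching the total signal weight to the total background weight, and an 80/20 split) can be sketched in plain Python. The function names and the sample dictionary layout are hypothetical, and reading "enough signal events to match the total weight" as a per-event signal weight is an assumption. The keys in `BDT_PARAMS` mirror the parameter names of XGBoost's scikit-learn style interface; the original setup may have used a different interface.

```python
import random

# Hyperparameters quoted in the text (key names are an assumption).
BDT_PARAMS = {"n_estimators": 1000, "max_depth": 4, "learning_rate": 0.01}


def background_weights(samples):
    """Per-event weight for each background process.

    samples: {name: (cross_section_after_preselection_fb, n_simulated_events)}
    With weight = sigma / N, the summed weight of a process equals its
    cross-section, so dominant backgrounds carry more weight in training.
    """
    return {name: xsec / n for name, (xsec, n) in samples.items()}


def signal_weight(samples, n_signal_events):
    """Per-event signal weight making total signal weight = total background weight."""
    total_background = sum(xsec for xsec, _ in samples.values())
    return total_background / n_signal_events


def train_test_split(events, train_fraction=0.8, seed=0):
    """Shuffle reproducibly and split into training and testing subsets."""
    shuffled = list(events)
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]
```

For example, with two hypothetical backgrounds of 2.0 fb and 0.5 fb, a signal sample of 250 events would receive a per-event weight of 0.01 so that both classes carry a total weight of 2.5.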

Results
The x-axis indicates the fraction of identified signal events after imposing the BDT classifier, while the y-axis (Purity) indicates the ratio of identified signal events to the total number of identified events (signal plus background) after imposing the classifier. The area under the curve (AUC) is a good indicator of the BDT performance. For this benchmark point, AUC = 0.90.
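The two axes defined above can be computed from classifier scores as in the following minimal sketch. It assumes that each event carries a score, a label (1 for signal, 0 for background) and an optional weight, that the threshold is scanned event by event from the highest score downwards, and that tied scores need no special treatment. The trapezoidal area is one possible convention, not necessarily the one behind the quoted AUC.

```python
def efficiency_purity_curve(scores, labels, weights=None):
    """Return (efficiency, purity) points for a decreasing score threshold.

    efficiency = identified signal / all signal
    purity     = identified signal / (identified signal + identified background)
    """
    if weights is None:
        weights = [1.0] * len(scores)
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    total_signal = sum(w for w, y in zip(weights, labels) if y == 1)
    signal = background = 0.0
    points = []
    for i in order:
        if labels[i] == 1:
            signal += weights[i]
        else:
            background += weights[i]
        points.append((signal / total_signal, signal / (signal + background)))
    return points


def area_under_curve(points):
    """Trapezoidal area, extending the first purity value down to efficiency 0."""
    area = 0.0
    prev_x, prev_y = 0.0, points[0][1]
    for x, y in points:
        area += 0.5 * (y + prev_y) * (x - prev_x)
        prev_x, prev_y = x, y
    return area
```

A classifier that ranks every signal event above every background event gives an area of 1 under this convention, while overlapping score distributions lower it.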
Our multivariate analysis results are quite promising and represent a significant improvement over those of the cut-based analysis. The AUC indicator is close to 0.9 for all the benchmark points, which shows that the BDT classifier is efficient in distinguishing the signal from the background in all considered cases. In Table 4, we present the AUC for the benchmark points along with the required luminosity to achieve a signal significance of 3σ.

Multivariate analysis for 2b-jets + $2\ell$ + $\slashed{E}_T$ in the final state
Another promising channel for the di-Higgs and dark matter search is provided by the $\bar b b$ and $WW^*$ decay modes of the Higgses. Tagging on leptonic decays of the $W$'s, one obtains a clean final state with 2b-jets + $2\ell$ + $\slashed{E}_T$. Here we take $\ell = e, \mu$. The challenge is to suppress the $t\bar t$ + jets background, which has a much larger cross-section. The kinematic variables used in the multivariate analysis include:
15. Separation between the two b-jets, $\Delta R_{bb}$.
16. Azimuthal separation between the b-jet pair and the dilepton system, $\Delta\phi(bb, \bar\ell\ell)$.
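The two separation variables listed above can be computed as follows. This is a minimal sketch with hypothetical helper names. It assumes the standard definitions $\Delta R = \sqrt{\Delta\eta^2 + \Delta\phi^2}$ and takes the azimuthal angle of a two-object system from the vector sum of the transverse momenta.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation wrapped into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation in the (eta, phi) plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def system_phi(objects):
    """Azimuthal angle of the summed transverse momentum.

    objects: iterable of (pt, phi) pairs, e.g. the two b-jets or the two leptons.
    """
    px = sum(pt * math.cos(phi) for pt, phi in objects)
    py = sum(pt * math.sin(phi) for pt, phi in objects)
    return math.atan2(py, px)
```

The variable $\Delta\phi(bb, \bar\ell\ell)$ is then `delta_phi(system_phi(bjets), system_phi(leptons))`, with each argument given as a list of `(pt, phi)` pairs.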

Results
We collect information on these kinematic variables subject to the following preliminary cuts: the final state is required to have at least two b-jets with $p_T > 25$ GeV, at least two charged leptons with $p_T > 20$ GeV and $\slashed{E}_T > 50$ GeV. The final results are summarised in Table 5. In Fig. 4, we present the required integrated luminosity for a 3σ signal significance in both channels.
The results can also be presented in terms of the exclusion limits on the model. In particular, the branching ratio for the decay $H \to A_H A_H$ can be severely constrained with 3 ab$^{-1}$ of data. The constraint can be as strong as BR($H \to A_H A_H$) < 7% for $m_H < 400$ GeV. Substantial values of BR($H \to A_H A_H$) can be probed up to $m_H \sim 600-650$ GeV. We also see that the $2b + 2\gamma + \slashed{E}_T$ channel performs slightly better than $2b + 2\ell + \slashed{E}_T$ does.
Table 5: AUC for the benchmark points (cf. Table 3) along with the required integrated luminosity to achieve a 3σ signal significance in the $2b + 2\ell + \slashed{E}_T$ channel at the 14 TeV LHC, assuming BR($H \to A_H A_H$) = 1 or 0.7 (in parentheses). BP5 and BP6 require very large integrated luminosity and are thus not shown.
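One way to translate a signal yield into a limit on BR($H \to A_H A_H$) is sketched below. It rests on two assumptions that are not taken from the analysis: the significance is estimated as $Z = S/\sqrt{S+B}$, and the number of selected signal events scales linearly with the branching ratio while the background and the classifier working point stay fixed. The function names and the input yields are hypothetical.

```python
import math


def significance(signal, background):
    """Simple estimate Z = S / sqrt(S + B) (an assumed definition)."""
    return signal / math.sqrt(signal + background)


def branching_ratio_reach(signal_at_br1, background, z_target=2.0):
    """Smallest branching ratio giving Z = z_target.

    With x = BR * S1, solving x / sqrt(x + B) = Z for x gives
    x = (Z^2 + Z * sqrt(Z^2 + 4 B)) / 2, and BR = x / S1.
    """
    z2 = z_target ** 2
    x = 0.5 * (z2 + z_target * math.sqrt(z2 + 4.0 * background))
    return x / signal_at_br1
```

For instance, `branching_ratio_reach(100.0, 300.0)` returns the branching ratio at which 100 selected signal events (at BR = 1) over 300 background events would drop to a 2σ effect. Returned values above 1 mean that the channel has no sensitivity at the chosen level.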

Multivariate analysis for ≥ 3b-jets + $2\ell$ + $\slashed{E}_T$ in the final state
In the presence of additional heavy particles in the dark sector, more exotic final states can be produced. In particular, cascade decays can lead to 3 or more Higgses $h$ in the final state. For example, the heavy Higgs $H$ can decay into a pair of $A'_H$, each of which decays either into a heavier $A_H$ with $h$ or a lighter $A_L$ with $h$. Subsequently, $A_H$ decays into $A_L$ and a Higgs, thereby generating a multi-Higgs final state. The channel with four Higgses suffers from severe kinematic suppression, while the tri-Higgs one could potentially be interesting.
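Whether such a cascade can proceed through on-shell two-body decays is a matter of simple mass thresholds. The sketch below writes the heaviest dark vector as $A'_H$ (heavier than $A_H$) and checks each step for a given mass spectrum. The function names are hypothetical and the masses in the usage note are illustrative inputs.

```python
M_HIGGS = 125.0  # GeV, mass of the SM-like Higgs h


def cascade_thresholds(m_H, m_AHp, m_AH, m_AL, m_h=M_HIGGS):
    """Which two-body decays of the tri-Higgs cascade are kinematically open.

    m_H: heavy scalar, m_AHp: A'_H, m_AH: A_H, m_AL: dark matter A_L (all in GeV).
    """
    return {
        "H -> A'_H A'_H": m_H > 2.0 * m_AHp,
        "A'_H -> A_H h": m_AHp > m_AH + m_h,
        "A_H -> A_L h": m_AH > m_AL + m_h,
        "A'_H -> A_L h": m_AHp > m_AL + m_h,
    }


def tri_higgs_on_shell(m_H, m_AHp, m_AH, m_AL, m_h=M_HIGGS):
    """True if every step of the 3h + dark matter cascade is on-shell."""
    return all(cascade_thresholds(m_H, m_AHp, m_AH, m_AL, m_h).values())
```

For example, `tri_higgs_on_shell(600.0, 290.0, 150.0, 20.0)` is true, whereas lowering the scalar mass to 500 GeV closes the first step. Since two Higgs bosons are emitted along one leg, $A'_H$ must be heavier than $2 m_h = 250$ GeV even for very light dark matter.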
Multi-Higgs production can be efficient if the decays happen on-shell. This implies that $A'_H$ must be heavier than 250 GeV and thus $m_H > 500$ GeV. Given a tri-Higgs final state, there are a number of options to consider for Higgs decay. If all three decay into $\bar b b$ pairs, the signal extraction is marred by a large QCD multi-jet background and possible misidentification of light flavor jets as b-jets. For the current study, we choose instead two $\bar b b$ pairs and leptonically decaying $WW^*$ as our final state. It has the advantage of being relatively clean and, in addition, we may recycle many of the background calculations done in the previous section. We perform a multivariate analysis with kinematic variables including:
14. Separation between the two charged leptons, $\Delta R_{\bar\ell\ell}$.
15. Separation between the two b-jets, $\Delta R_{b_1 b_2}$.
16. Azimuthal separation between the b-jet pair and the dilepton system, $\Delta\phi(b_1 b_2, \bar\ell\ell)$.
We collect information on these kinematic variables imposing the following preliminary cuts: the final state is required to have at least 3 b-jets with $p_T > 25$ GeV and at least two charged leptons with $p_T > 20$ GeV. We construct all possible b-jet pairs in the final state, compute the corresponding invariant masses, and then identify the pair that has $m_{bb}^{\rm inv}$ closest to 125 GeV. We call the harder b-jet in this pair $b_1$ and the other one $b_2$.
Note that we are not using the third b-jet kinematics directly in our multivariate analysis. Including it adds a few more variables to the list: $p_T^{b_3}$, $\Delta\phi(b_3, \bar\ell\ell)$ and $\Delta\phi(b_1 b_2, b_3)$. However, we have verified that these do not improve our results.
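The pairing procedure just described can be sketched as follows. The jet representation (dicts with `pt`, `eta`, `phi` and an optional `mass`) and the function names are hypothetical; the logic is the one stated in the text: form all b-jet pairs, keep the pair with invariant mass closest to 125 GeV, and order it by transverse momentum.

```python
import itertools
import math


def four_vector(jet):
    """(E, px, py, pz) from pt, eta, phi and an optional mass (default 0)."""
    px = jet["pt"] * math.cos(jet["phi"])
    py = jet["pt"] * math.sin(jet["phi"])
    pz = jet["pt"] * math.sinh(jet["eta"])
    mass = jet.get("mass", 0.0)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def invariant_mass(jet_a, jet_b):
    """Invariant mass of a jet pair."""
    e, px, py, pz = (a + b for a, b in zip(four_vector(jet_a), four_vector(jet_b)))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def select_higgs_pair(bjets, m_h=125.0):
    """Return (b1, b2): the pair with mass closest to m_h, harder jet first."""
    best = min(
        itertools.combinations(bjets, 2),
        key=lambda pair: abs(invariant_mass(*pair) - m_h),
    )
    b1, b2 = sorted(best, key=lambda jet: jet["pt"], reverse=True)
    return b1, b2
```

The remaining b-jet, if needed, is whichever jet is not in the returned pair; with more than three b-jets the same function still picks the single best pair.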
The sensitivity of this signal region is weaker compared to the previous two. Although the BDT classifier is quite efficient in isolating the signal events, the few remaining background events have a large enough cross-section to suppress the signal significance. This is due to the small signal cross-section to start with, which is further reduced by detector simulations and applying the BDT classifier.
To give an example, consider the parameter set $m_H = 600$ GeV, $m_{A'_H} = 290$ GeV, $m_{A_H} = 150$ GeV, $m_{A_L} = 20$ GeV. The production cross section for the required final state is 0.107 fb. Optimizing the signal to background ratio using our multivariate analysis, we find that a 1σ signal significance is achieved with 3000 fb$^{-1}$ of integrated luminosity, while the 2σ level requires 12000 fb$^{-1}$. Clearly, this is beyond the reach of the LHC.
Although our result for the 3b-jets + $2\ell$ + $\slashed{E}_T$ channel is negative, a fully b-jet + $\slashed{E}_T$ final state may be more promising. It requires a dedicated analysis, which we reserve for future work.
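The numbers in this example follow a simple scaling, sketched below under the assumption that the significance grows as the square root of the integrated luminosity (signal and background yields both proportional to luminosity, systematic uncertainties neglected). The function names are hypothetical.

```python
def expected_events(cross_section_fb, luminosity_fb_inv):
    """Number of produced events, N = sigma * L (before detector effects and selection)."""
    return cross_section_fb * luminosity_fb_inv


def required_luminosity(lumi_ref_fb_inv, z_ref, z_target):
    """Luminosity needed for z_target, given z_ref at lumi_ref, assuming Z ~ sqrt(L)."""
    return lumi_ref_fb_inv * (z_target / z_ref) ** 2
```

With the quoted cross section, `expected_events(0.107, 3000.0)` gives about 321 produced events at 3000 fb$^{-1}$, and `required_luminosity(3000.0, 1.0, 2.0)` reproduces the 12000 fb$^{-1}$ quoted for the 2σ level. The same scaling would put a 3σ effect at 27000 fb$^{-1}$.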

Summary and Conclusions
Within the Higgs portal framework, cascade decays in the dark sector naturally lead to multi-Higgs final states with missing energy. In this work, we have introduced a simplified model which captures the main features of realistic hidden sectors containing dark matter as well as further heavier states.
We have focused on 2 and 3 Higgs final states, with the Higgs bosons subsequently decaying into $\bar b b$, $\gamma\gamma$ and $WW$. Using multivariate analysis with Boosted Decision Trees, we find that the $2b + 2\gamma + \slashed{E}_T$ and $2b + 2\ell + \slashed{E}_T$ channels are promising in the context of the 14 TeV LHC with 3 ab$^{-1}$ integrated luminosity. In particular, light dark matter $A_L$ with mass 100 GeV can be probed efficiently for the dark Higgs ($H$) mass below 600 GeV and its mixing angle with the SM Higgs $\sin\theta \sim 0.2-0.3$. In this region, 3σ and higher signal significance can be achieved. The result can also be translated into a bound on the dark Higgs decay into the heavier partners of dark matter $A_H$, with sensitivity to BR($H \to A_H A_H$) reaching 7% in the best case scenario.
The 3 Higgs final state, on the other hand, appears far less promising, at least for the decay channels considered. Fully hadronic Higgs decays may change this situation, but require a dedicated background study and detector simulation.