Determination of the QCD coupling from the static energy and the free energy

We present two determinations of the strong coupling \(\alpha_s\). The first one is from the static energy at three-loop accuracy, and may be considered an update of earlier determinations by some of us. The new analysis includes new lattice data at smaller lattice spacings, and reaches distances as short as \(0.0237\,{\rm fm}\). We present a comprehensive and detailed estimate of the error sources that contribute to the uncertainty of the final result, $\alpha_s(M_Z)=0.11660^{+0.00110}_{-0.00056}$. The second determination is based on lattice data for the singlet free energy at finite temperature up to distances as small as \(0.0081\,{\rm fm}\), from which we obtain $\alpha_s(M_Z)=0.11638^{+0.0009 5}_{-0.00087}$.


I. INTRODUCTION
A precise determination of the strong coupling α s is of key importance both for the theory of strong interactions, which is determined by such a fundamental parameter, and for investigations of new physics beyond the standard model. Cross section calculations at the Large Hadron Collider, for example, suffer from the uncertainties related to our limited knowledge of α s . The last decades have witnessed an impressive effort in the α s extraction from an ample range of physical observables with a broad sweep of different methods. Notwithstanding all these efforts, the Particle Data Book (PDG) world average value for α s in 2018, α s ðM Z Þ ¼ 0.1181 AE 0.0011, has an overall uncertainty that has almost doubled with respect to the PDG average in 2013, α s ðM Z Þ ¼ 0.1185 AE 0.0006 [1]. This is due to the fact that the uncertainty of several determinations that enter the average is dominated by errors of theoretical origin, which are often difficult to precisely assess. It is important therefore to use a variety of different ways to extract α s and to validate each extraction at the best of the state of the art.
In this paper we aim at an improved determination of α s both by updating earlier extractions from the static energy and by proposing a new method that involves the singlet free energy.
The static energy is an observable up to an additive constant, and it is a function of the distance r between the static quark and the static antiquark. In the limit of massless dynamical quarks, it depends only on α s . It can be calculated on the lattice for any distance r ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi , where x, y and z are integer multiples of the lattice spacing. For short distance r it can be calculated in perturbation theory in QCD as a function of α s using nonrelativistic QCD effective field theories. Notably, perturbation theory is accurate for this quantity at three loops, and the tree-level result is already sensitive to α s . The comparison between the perturbative expression and lattice data at distances small enough to be accessible to perturbation theory is a good way to obtain a precise determination of α s . For this endeavor it is crucial to have lattice data covering an interval of sufficiently small distances in order to be sensitive to the minute details of the curvature of the static energy, namely, with sufficiently fine lattice spacing. At the present day, these lattices still pose a major challenge. In this paper we use lattices [2] with an extraordinarily fine lattice spacing a ¼ 0.0249 fm to achieve a systematically improved extraction of α s .
Additionally we exploit a new idea. One reason for which it is challenging to reach such fine lattice spacings in lattice QCD simulations with dynamical quarks is that one has to simultaneously maintain the control over finite volume effects arising from the propagation of the lightest hadronic modes, namely, the Goldstone bosons, at the pion scale. A lattice simulation at high enough temperature avoids this infrared problem, and thus enables reaching much finer lattice spacings using smaller volumes. We use finite temperature lattices with unprecedentedly fine lattice spacing a ¼ 0.00848 fm [3]. The singlet static free energy is again a function of the static quark-antiquark distance and has been calculated on the lattice [3] and perturbatively [4]. The comparison between the two offers a novel and independent method to extract a precise determination of α s .
The rest of the paper is organized as follows. In Sec. II we give an account of the gauge ensembles and lattice correlators that we use and discuss the relevant systematic effects in the lattice data. In Sec. III after briefly recalling the procedure used in Ref. [5] to extract α s from the static energy, we compare between the lattice data and the weak-coupling result, discuss our estimates of systematic uncertainties, and finally obtain an updated extraction of α s . In Sec. IV we discuss the singlet free energy at finite temperature, and outline the relation with the static potential and static energy at zero temperature. We discuss the relevant scale hierarchies, compare with the lattice, and proceed to extract α s at even shorter distances. Section V contains some discussion of our results, as well as a comparison with previous related works. Finally, in Sec. VI we present a short summary of the main results and conclude. In Appendix A we discuss the gauge ensembles used in this study. In Appendix B we discuss discretization artifacts at short distances in detail. In Appendix C we list the coefficients appearing in the perturbative results used in this paper.

II. LATTICE SETUP
We require the lattice result of the static energy at the smallest available distances in order to compare with the weak-coupling calculations. For this reason we employ the 2 þ 1 flavor ensembles using the highly improved staggered quark (HISQ) action [6] that have been generated for studies of the QCD equation of state [2,7]. The spatial volume is given by V 3 ¼ ðaN σ Þ 3 , and the physical length of the Euclidean time direction is aN τ . We summarize the zero temperature ensembles 1 with N τ ¼ 64 in Table I, which correspond to two different pion masses in the continuum limit. Namely, these ensembles have different light sea quark masses, m l ¼ m s =20 or m l ¼ m s =5. The strange quark mass m s is at its physical value, namely, such that the mass of the hypothetical η ss meson is reproduced as m η ss ¼ 686 MeV. The lattice spacing a has been fixed by the r 1 scale, which is defined by the equation see Refs. [2,7] for the details. Here and in the following EðrÞ denotes the QCD static energy calculated on the lattice. We use the value r 1 ¼ 0.3106ð17Þ fm determined from the pion decay constant. Since the quark mass dependence of r 1 =a is small, the m l ¼ m s =20 and m l ¼ m s =5 results can be combined to produce a parametrization of r 1 =a as a function of the bare lattice gauge coupling β [2]. The tunneling between different topological sectors is suppressed for these gauge ensembles. Nevertheless, no statistically significant difference has been observed for bulk observables other than the light quark condensate [2] in different topological sectors. For the distances we consider in this study the static energy [8] does not vary significantly between the different topological sectors. In addition, we use the finite temperature results of the singlet free energy with N σ =N τ ¼4 and N τ ¼12, or 16. These ensembles correspond to the thermal QCD medium at temperatures T ¼ 1=ðaN τ Þ. The finite temperature ensembles The lattice observables (gluon action density, and sea quark condensates) contributing to the equation of state have divergent, additive contributions, which depend only on the lattice spacing. Renormalization group invariant observables are obtained by subtracting the T ¼ 0 observables from the T > 0 observables at each lattice spacing. have been generated using lattice parameters that would correspond to the same two pion masses (at zero temperature) in the continuum limit as the zero temperature ensembles [3]. For these we use the same parametrization of r 1 =a. We give an account of these ensembles in Appendix A.
The static energy can be calculated using Wilson line correlators in Coulomb gauge or Wilson loops. We use the former because it is more convenient for practical reasons. For additional cross-checks we also studied Wilson loops for β ¼ 7.825. When extracting the static energy from Wilson loops the spatial lines have to be smeared in order to obtain a reasonable signal-to-noise ratio and suppress excited states. We used one, two, and five levels of spatial smearing using the HYP (hypercubic smearing) algorithm [9] in our calculations. In the studies of the equation of state [2,7], the static energy was extracted from two-exponential fits to the Wilson line correlators using a fixed range of Euclidean time τ=a that was chosen independently from r=a. This previous calculation is not adequate for our analysis. Here we improve the extraction of the static energy in the following way.
(i) We explicitly include the correlation matrix of the static energy in the fits to determine α s . For this reason we have to recalculate the static energy and preserve its statistical correlations.
(ii) We explicitly distinguish at short distances between data with distinct path geometries instead of averaging over them. This increases the number of data for each β compared to the earlier analysis [5]. (iii) At small τ the correlators of Wilson lines are contaminated by excited states, while at large τ the statistical fluctuations are large, and the effective masses sometimes do not show a clear plateau; see Fig. 1 for some of the worst-case scenarios. Therefore, a careful choice of the lower end, τ min , and upper end, τ max , of the fit interval is needed to obtain reliable results of the static energy. The corresponding fit interval ½τ min ∶τ max depends on the distance r.
We used a data-driven approach to determine suitable fit windows and estimated the systematic errors due to this choice of the fit interval. The procedure of extracting the static energy is demonstrated in Fig. 1, where the solid lines show the static energy and the dashed lines the corresponding total uncertainty. We summarize the fit windows used for obtaining the ground state in Appendix A. At short distances, namely, for r ≲ 0.14 fm, we observe the nonmonotonic behavior of the effective mass at small τ=a, typically, within τ ≲ 0.14 fm. This nonmonotonic behavior is not restricted to the Wilson line correlator in Coulomb Black open squares indicate the mean and statistical error of the effective mass. Although, we can usually identify an unambiguous plateau in the window 7 ≤ τ=a ≤ 11, we have to restrict the fit to narrower windows in multiple cases that are similar to those shown above. There is no evident pattern regarding β, r=a or r for the appearance of these unstable plateaus. The gray solid line indicates the central value of the ground state fit; the dashed gray lines indicate the one sigma interval for the total error that includes the systematic errors due to the dependence on the fit interval.
gauge, but also appears in Wilson loops with spatially smeared spatial Wilson lines; see Fig. 2. For this reason, these nonmonotonicities are due to the gauge ensembles with improved gauge action instead of being due to the details of the interpolating operator. The effective masses obtained from correlators of Wilson lines in Coulomb gauge and from Wilson loops with HYP smeared spatial lines converge to the same plateau at large τ.
The current and previous results for the static energy are consistent within errors. For the given reasons the new result is more precise at short distances and has a better estimate of the errors. It supersedes the previous calculation [2,7].
The last major source of systematic uncertainties of the lattice data is due to discretization artifacts at short distances, which are caused by the breaking of rotational symmetry from O(3) in the continuum to the cubic group W 3 on the lattice. Even for an improved gauge action these artifacts are significantly larger than any of the other uncertainties of the lattice result. The artifacts express themselves in the form of a variation of the lattice static energy around the continuum static energy at any given distance r. The sign and the size of this variation depends on the geometry of the shortest paths connecting the two sites at distance r on the lattice. This variation is typically large for on-axis distances, i.e., r=a ¼ ðn; 0; 0Þ, while being smaller for off-axis distances, and shrinks to the point of numerical irrelevance as the distances become larger than r=a ≳ 5. These lattice artifacts could be understood qualitatively in leading-order perturbation theory, i.e. at tree level. The analysis presented in Appendix B shows that the large artifacts at r=a ¼ 1 are about 8% for the Wilson action and 4% for the Lüscher-Weisz action that we use. The tree-level discretization effects decrease rapidly at larger distances, and become very small for r=a ≥ ffiffi ffi 8 p . One can use the tree-level results to reduce lattice artifacts also in the interacting theory. This procedure is called the tree-level correction and discussed in Appendix B.
The statistical errors of the static energy obtained on the lattice increase when increasing the separation r. For r=a ≥ ffiffi ffi 8 p the statistical errors in the static energy are typically larger than the discretization errors after tree-level correction. Thus for these distances discretization errors can be neglected. For ffiffi ffi 5 p ≤ r=a ≤ ffiffi ffi 8 p the statistical and discretization errors are comparable. In this region one can deal with the discretization errors by estimating them and combining with the statistical errors. For very short distances r=a < ffiffi ffi 5 p the discretization effects are much larger that the statistical errors and should be estimated carefully and corrected for, before the lattice results can be compared to the weak-coupling calculations. The corresponding procedure is described in Appendix B.

III. EXTRACTING α s FROM THE STATIC ENERGY
The static energy of a quark-antiquark pair is directly accessible in lattice calculations. Our lattice calculation of the static energy for N f ¼ 2 þ 1 light flavors has been discussed in Sec. II. In the weak-coupling regime the static energy is known up to three loops, i.e., up to order α 4 s in the small α s expansion. The resummation of ln α s terms is also known at subleading order. Hence, since realistic lattice calculations for N f ¼ 2 þ 1 are available at short enough distances where the small α s expansion is reliable, α s can be extracted by comparing the lattice data for the static energy with its perturbative expression for the case of three (approximately) massless quarks.
This program was initially carried out in Ref. [10] and considerably improved in Ref. [5], to a large extent due to the incorporation of shorter distance data on finer lattices. Here, we further improve the α s extraction by including three even finer lattices. The abundance of data at short distances allows us to perform the analysis omitting the data with r=a < ffiffi ffi 5 p that are susceptible to significant discretization artifacts. It also permits us to quantify the impact of using such data with or without appropriate corrections. Rather than making a separate analysis for each lattice spacing and averaging the final results, here we make a global fit to all lattice data, leaving as a free parameter a normalization constant for each lattice spacing. This approach is well motivated from the earlier observation that no remaining lattice spacing dependence can be resolved from the three lattice spacings in Ref. [5] and that the scale uncertainties for all considered lattice spacings are similar. We also take into account the statistical correlations between the fluctuations of data at different distances on the lattice. Eventually the effect of these correlations turns out to be much smaller than the statistical uncertainty, even though it reduces the uncertainty of the extracted value of α s .
We focus on data for r ≲ 0.14 fm or r ≲ 0.45r 1 ; see Fig. 3. The shortest distance that we can access, namely, due to a single lattice spacing 2 on our finest lattice (at zero temperature) is 0.0237 fm. In Ref. [5] only data with 0.057 fm ≲ r ≲ 0.16 fm were used for the extraction. For that range it was shown that the small α s expansion was reliable. Since we use only shorter distances in this paper, the perturbative expansion remains reliable. In Ref. [5], separate analyses with data for each gauge ensemble were performed in the range r=a ≥ ffiffi ffi 2 p . In this work, thanks to the finer lattices and the combined analysis, we can afford to vary the minimal distance r=a and quantify the corresponding uncertainty. Data on coarser lattices in the same r range as well as data on finer lattices may be excluded due to restrictions of the r=a range in the analysis. This is shown with more detail in Fig. 3. It was also shown in Ref. [5] that there is no sensitivity to nonperturbative contributions giving rise to powerlike corrections. This is important, since sea quark mass effects, finite volume effects, or effects due to an incorrect expectation value of the topological charge would make themselves noticeable in such contributions, 3 if they were not negligibly small.
To obtain the perturbative result for the static energy we follow Ref. [5]. We use the perturbative result for the force, FðrÞ ¼ dE=dr, and integrate it to obtain the energy. The constant contribution obtained this way is irrelevant and subsumed into the normalization constants that we have to fix anyway when comparing the weak-coupling results to the lattice data at each β. A choice of the renormalization scale ν (also called the soft scale; see below) proportional to 1=r avoids large logarithms of the form lnðνrÞ [5]. We call the choice ν ¼ 1=r our standard choice for the renormalization scale. For this choice we have [5] FðrÞ where the first three terms correspond to tree-level, oneloop, and two-loop order, respectively; the fourth term corresponds to three-loop order. The strong coupling, α s , is understood, here and in the following, as renormalized in the MS scheme. At any time the strong coupling can be traded for the corresponding QCD scale. At three flavors in the MS scheme the QCD scale is Λ MS . The coefficients in Eq. (2) are defined in Ref. [13] and reproduced in . Gray symbols either indicate data that were not used in the fits with a given minimal r=a, but are below the respective maximal r=r 1 , or that are above the maximal r=r 1 that we included in the fits. Appendix C. At three loops there is a contribution proportional to ln α s as well as a nonlogarithmic term. The origin of these terms can be understood using the effective field theory approach, namely, potential nonrelativistic QCD (pNRQCD) [14,15]. The term, pointed out in Ref. [16], proportional to ln α s comes from the energy scale ∼α s =r, which is called the ultrasoft scale to distinguish it from the energy scale 1=r, which is referred to as the soft scale.
Equation (2) is accurate at next-to-next-to-next-to-leading order (N 3 LO).
For sufficiently small values of the coupling, i.e., at small enough distances the ultrasoft logarithm lnðC A α s ð1=rÞ=2Þ eventually becomes large and can be resummed. This can be done using the renormalization group equations in pNRQCD [17][18][19]. The expression for the force after performing the resummation of the ultrasoft logarithms reads [5] We choose as a standard value for the ultrasoft scale The coefficients in Eq. (3) are defined in Ref. [13] and reproduced in Appendix C. At leading order in ln α s we can set μ us to 1=r and Eq. (3) reduces to Eq. (2). The first four terms of Eq. (3) include all terms of the form α 3þn s ln n α s ; i.e., it is accurate at next-tonext-to-leading logarithmic order. Keeping all terms in Eq. (3), we obtain an expression that is accurate at threeloop order with leading ultrasoft resummation.
In Fig. 4, we show the two-loop contribution to the force, i.e., the third term of Eq. (2), along with different α 4 s contributions as a function of the distance r. Here and in the rest of the paper we use, if not differently specified, the standard scales ν ¼ 1=r and μ us ¼ C A α s ð1=rÞ=ð2rÞ. One can see that the α 4 s ln α s contribution is never larger than the nonlogarithmic α 4 s contribution in the entire distance range used in our study, and both contributions are smaller than the two-loop contribution. If we resum the ultrasoft logarithms [i.e., by replacing the last line of Eq. (2) with the fourth and fifth lines of Eq. (3)], the corresponding term becomes slightly larger in absolute value; see the dot-dashed line in Fig. 4, but it is still significantly smaller than the two-loop contribution. Therefore, as already noticed in [5], the leading ultrasoft logarithms can be counted as being of the same order as the other nonlogarithmic three-loop terms, while subleading logarithms may be included in the uncertainty due to the missing four-loop contributions. In Fig. 4, we see also that there is a partial cancellation between the soft and ultrasoft contributions at order α 4 s . This cancellation, however, may be accidental and may not happen at higher orders. Our final result will come from comparing lattice data with perturbation theory at three-loop order with leading ultrasoft resummation, and we use the difference between the N 3 LO expression of Eq. (2) and the three-loop expression with leading ultrasoft resummation from Eq. (3) to estimate the size of unknown higher-loop contributions. Other ways to estimate the higherloop contributions are discussed in the following and enter eventually in our final error assessment.
We proceed as follows. On the lattice data side, we put all six data sets (see Table I) together by adding an arbitrary subtraction constant to each set that is determined by the fit. On the continuum side, we generate a grid of α s ðM Z ; N f ¼ 5Þ input values ranging from 0.1140 to 0.1200 in steps of 0.0001. We use the perturbative result for the force, 4 given in Eq.  2)]. A factor α s ð1=rÞ 3 has been dropped from all terms. We use the standard value μ us ¼ C A α s ð1=rÞ=ð2rÞ for the ultrasoft scale. We have taken Λ 4 We use perturbative running and decoupling to convert each grid value of α s ðM Z ; N f ¼ 5Þ to the corresponding Λ N f ¼3 MS value, which we use in the analytical expressions of the force, Eqs. (2) and (3). The details of the running and decoupling are the same as in Ref. [5]. Namely, we use four-loop running, with the charm quark mass equal to 1.6 GeV and the bottom quark mass equal to 4.7 GeV. The effects of higher-order terms in the running are negligible with the current accuracies. ultrasoft resummation expression, for the reasons discussed above. For each of the α s input values we calculate the static energy EðrÞ by numerically integrating Eq. (3) and fit the lattice data to it, thus determining the subtraction constants. Hence, the subtraction constants for different sets are obtained in independent fits, since the different ensembles are statistically independent. Of course the subtraction constants depend on the input value of α s . For the determination of the subtraction constants we consider fits with or without the full correlation matrix, which does not lead to statistically distinguishable results. The smallest χ 2 =d:o:f. determines the preferred value of α s ðM Z Þ by the lattice data. We calculate the χ 2 =d:o:f. with or without the full correlation matrix. Typically, the minimum of the χ 2 =d:o:f. varies by at most δ corr ¼ AE0.0001 upon inclusion of the full correlation matrix, which is always smaller than the statistical error. We consider this difference as part of the statistical error that is already accounted for. The subtraction constants in the correlated fit are also consistent with the subtraction constants in the uncorrelated fits within fractions of the statistical error. We repeat this procedure for several values of maxðrÞ, vary the smallest distances minðr=aÞ in units of the lattice spacing considered in the analysis, and vary the treatment of discretization artifacts at the shortest distances. Eventually we repeat the entire procedure for expressions with different values of the soft scale, and for the two-loop and three-loop (without resumming the ultrasoft contribution) expressions.
Discretization artifacts are significant at distances r=a ≲ 3, with increasing importance at smaller and smaller distances as detailed in Sec. II. We use different strategies to handle the residual discretization artifacts; see Appendix B for details.
(i) Ignoring them; i.e., we engage in no further corrections beyond the tree-level correction. In this case we restrict the fit to r=a ≥ ffiffi ffi 8 p and omit r=a ¼ ffiffiffiffiffi 12 p ; (ii) correcting for them, by following a strategy similar to Ref. [5], i.e., nonperturbative correction (see Appendix B); (iii) accounting for them, by enlarging the statistical errors to fully cover the discretization artifacts, i.e., treelevel correction and systematic error estimates. This approach is suitable for r=a ≥ ffiffi ffi 5 p .
First, we analyze the data for r=a ≥ ffiffi ffi 8 p without considering the discretization artifacts beyond the tree-level correction. In these fits we omit the data with r=a ¼ ffiffiffiffiffi 12 p due to their exceptionally large discretization artifacts. We generally obtain χ 2 =d:o:f. between 0.3 and 0.6 for the best fits, which rises steeply outside a narrow window around the central value of α s . We estimate the systematic error due to discretization artifacts from the difference to fits with r=a ≥ ffiffi ffi 8 p with tree-level correction or with nonperturbative correction as being typically about δ lat ¼ AE0.00021. Table II clearly shows that any errors due to discretization artifacts are safely contained in the statistical uncertainty. We point out that the analysis for maxðrÞ ≤ 0.12 fm and r=a ≥ ffiffi ffi 8 p does not depend on gauge ensembles with the smaller sea quark mass, i.e., uses exclusively ensembles with m l ¼ m s =5. Second, for 1 < r=a < ffiffi ffi 8 p the discretization artifacts are too large to be ignored. The data can still be used after correcting for the discretization artifacts through the nonperturbative correction. The χ 2 =d:o:f. is about the same, unless data at large distances are discarded. In this case, the χ 2 =d:o:f. becomes even smaller, due to the conservative error estimates for the corrections. We do not observe any larger deviation from the results for r=a ≥ ffiffi ffi 8 p , and, hence, estimate that the uncertainty due to the corrections is generally about the same as for r=a ≥ ffiffi ffi 8 p , namely, δ lat ¼ AE0.00021. The discretization errors are safely contained in the statistical uncertainty. The results for fits in four representative fit intervals 5 using nonperturbatively corrected discretization errors are summarized in Table III. In each fit interval all results are consistent. We find slightly lower central values for fits with only ten data or less, which, however, are never statistically significant. We show the tree-level or nonperturbatively corrected lattice data together with the three-loop with leading ultrasoft resummation result with standard scales in Fig. 5. Given the large variation over the r range fine details are difficult to resolve. It can be seen that the fit to the data with large r=a misses the data with r=a ¼ 1, i.e., the first data for each β, unless the and different treatment of discretization artifacts, here, nonperturbative correction or only tree-level correction. We list uncorrelated and correlated χ 2 =d:o:f. in columns 4 and 5. In column 9 we list the perturbative error for the corresponding fit window. The last column displays the outcome for α s at two-loop order. The tree-level corrected calculations have fewer degrees of freedom, since the data with r=a ¼ ffiffiffiffiffi 12 p have been excluded; see the text. describes the data at shorter distances once the nonperturbative corrections are implemented. On the other hand, without the nonperturbative corrections the short distance data are off. We show in Fig. 6 the differences between the normalized lattice data and the three-loop with leading ultrasoft resummation result with standard scales, either using nonperturbatively corrected data, or data with only treelevel correction. In the considered range the nonperturbative correction is irrelevant.
We show the differences between the normalized, nonperturbatively corrected lattice data and the three-loop with leading ultrasoft resummation result with standard scales in Fig. 7 for fits with r=a ≥ 1. The difference is generally smaller than 0.007=r 1 (about 4 MeV) and fluctuates around 0 with no apparent trend. At distances smaller than 0.15r 1 , the difference tends to be marginally below 0 for maxðrÞ ¼ 0.055 fm and slightly above 0 TABLE III. Representative fits using nonperturbatively corrected data in different intervals. We list uncorrelated and correlated χ 2 =d:o:f. in columns 4 and 5. In column 9 we list the aggregate perturbative error for the corresponding fit window. The last column displays the outcome for α s at two-loop order. . The three-loop with leading ultrasoft resummation result with standard scales is shown as lines for the α s ðM Z Þ grid values corresponding to the best fit (0.1167), and the values which are one standard deviation lower or higher (α s ¼ 0.1162 or 0.1172). The color palette corresponds to different values of a=r 1 . Right: The normalized lattice data and the weak-coupling result for the static energy multiplied by r. Here the black symbols correspond to the uncorrected data, while the color symbols correspond to nonperturbatively corrected data. The color palette corresponds to different values of a=r 1 . otherwise. The central value of α s is 0.00027 lower for fits restricted to the shortest distance interval (maxðrÞ ¼ 0.055 fm) than for maxðrÞ ¼ 0.131 fm. This difference may be indicative of the imperfect removal of the discretization artifacts, but is still compatible with the estimate of the discretization error using minðr=aÞ ¼ ffiffi ffi 8 p , and is more than covered by statistical uncertainties.
We perform another set of tests to investigate the systematic uncertainties of the lattice result. Namely, we test for a bias due to combining multiple lattice spacings into a single analysis that assumes that the data are within uncertainties consistent with the continuum limit. We separately analyze the data at the finest lattice spacing, β ¼ 8.4. Here we also explore whether using data with r=a ≥ ffiffi ffi 5 p or ffiffi ffi 8 p with errors that are inflated by estimates of the discretization artifacts has a significant impact on our results. We use the same uncertainty estimate for the influence of the discretization artifacts that we obtained previously, since it is larger than any estimates that we would obtain by comparing the results in this set of tests. Table IV shows that the separate analysis for the finest lattice spacing is consistent within statistical errors or within the even smaller estimate of the uncertainties due to the treatment of discretization artifacts. While the preferred central value in this analysis appears to be marginally higher than in the analysis that combines multiple spacings, this is not relevant for our final result.
To rule out the possibility that the nonperturbative correction of the lattice data at short distances introduces a systematic bias in the determination of α s we also performed fits without implementing these corrections, but rather using these corrections as systematic error on the short distance data. This fit resulted in α s ¼ 0.  Table III, which is compatible with the other values within errors. An even better agreement is obtained for the larger values of maxðrÞ. Thus, we do not find any indication for a possible bias in α s toward smaller values due to the nonperturbative correction, because the result for α s without the nonperturbative corrections is never larger than with the nonperturbative correction.
In order to estimate the perturbative error of the small α s expansion, i.e., the error made by neglecting contributions beyond three loops, we proceed similarly to Ref. [5] using three methods to quantify the uncertainty of the perturbative calculation. Namely, we add and subtract a term with the typical size of the soft four-loop contribution, we analyze the dependence on the soft scale ν, and we use the difference between the three-loop order with leading ultrasoft resummation and the three-loop result. We estimate the perturbative error in each method as the difference between α s ðM Z Þ calculated in that way and the α s ðM Z Þ value obtained from the three-loop with leading ultrasoft resummation expression at standard scales. In Ref. [5], the error due to the soft scale variation was estimated by taking ν ¼ ffiffi ffi 2 p =r and ν ¼ 1= ffiffi ffi 2 p r. In this study, we enlarge the range of scale variation by considering ν ¼ 2=r and ν ¼ 1=ð2rÞ. In doing so, we realize that the ν dependence of the extracted α s is not monotonous: it has a minimum close to ν ¼ 1= ffiffi ffi 2 p r; see Fig. 8. For the lowest soft scale ν ¼ 1=ð2rÞ the use of perturbation theory becomes inadequate much earlier than for other choices. Namely, for maxðrÞ ≳ 0.1 fm, the fits do not describe the data in a satisfying manner, and χ 2 =d:o:f. increases much more rapidly for ν ¼ 1=ð2rÞ than for the other choices of the soft scale. This is expected on the grounds of the observation that the 2014 analysis showed that the scale ν ¼ 1=ð ffiffi ffi 2 p rÞ was adequate up to about r ¼ r 1 =2. If we exclude those larger distances the upper perturbative error is always given in terms of the soft scale variation to ν ¼ 2=r. We observe that for maxðrÞ ≳ 0.09 fm the soft scale variation to ν ¼ 1=ð ffiffi ffi 2 p rÞ contributes to a larger estimate of the lower perturbative error than the soft higher-order terms, in line with the 2014 results. However, for smaller distances maxðrÞ ≲ 0.09 fm the estimate of the lower perturbative error due to inclusion of soft higher-order terms exceeds the estimate of the lower perturbative error due to soft scale TABLE IV. Fits using only the finest lattice spacing, β ¼ 8.4, and different treatment of discretization artifacts, i.e., with or without systematic error estimates and the tree-level correction. We list uncorrelated and correlated χ 2 =d:o:f. in columns 5 and 6. In column 10 we list the perturbative error for the corresponding fit window. For the lattice error due to discretization artifacts we use the results of the analysis with multiple lattice spacings. The tree-level corrected calculations have fewer degrees of freedom, since the data with r=a ¼ variation. We take the most extreme among the error estimates from the soft contribution as the upper and lower perturbative errors due to the soft contribution. We also estimate the perturbative error by evaluating the difference between the three-loop with leading ultrasoft resummation result and the fixed-order three-loop result. The fixed-order three-loop result always yields a lower central value than the three-loop with leading ultrasoft resummation result. In order to be conservative, we apply the difference as a symmetric error. This has an even larger effect on the lower perturbative error than the soft scale variation or the soft higher-order terms. Moreover, the soft contribution to the lower perturbative error decreases for smaller values of maxðrÞ more rapidly than the ultrasoft contribution. We stress that the lower perturbative error is always smaller than the statistical error for r ≲ 0.1 fm. We add this symmetric perturbative error due to (inclusion or not of) ultrasoft resummation to the perturbative error due to the soft contribution in quadrature. The reasoning for adding these errors in quadrature is that it is unknown whether there is a similar partial compensation between soft and ultrasoft contributions at higher orders as it happens at three loops, and as such, we have to treat the corresponding uncertainties as independent. Lastly, we compare the threeloop with leading ultrasoft resummation and the two-loop results [first three lines of Eq. (2)]. As can be read off from Tables II and III, the difference to the two-loop result never exceeds þ0.00025 and decreases for smaller values of maxðrÞ; i.e., it is smaller than the statistical errors and smaller than the other effects due variation of soft scale, soft higher-order terms, or variation of the ultrasoft resummation. The χ 2 =d:o:f. does not change significantly between using the two-loop or three-loop with leading ultrasoft resummation results. Hence, we confirm the criterion for having the lattice data in the perturbative regime. We observe that the smaller maxðrÞ is the smaller the variation of the central value of α s between fits with different forms of the weak-coupling results becomes. Table III shows clearly that, for a given minðr=aÞ, the perturbative errors are dramatically reduced at smaller distances, as expected, while the statistical error increases as fewer data are used to constrain the fits. Let us summarize the considerations of the preceding paragraphs. We have to use maxðrÞ ≲ 0.1 fm to perform the full scale variation and keep the perturbative uncertainties fully under control. We should ideally use significantly more than ten data point to limit the impact of the imperfectly treated discretization artifacts. Given the considerations of the preceding paragraphs, we take the result for 1 ≤ r=a ≤ 5 and maxðrÞ ¼ 0.073 fm, namely, α s ðM Z Þ ¼ 0.11660 as our final result, which corresponds to r 1 Λ  α s ð1=r; N f ¼ 3Þ corresponding to the largest distance maxðrÞ ¼ 0.073 fm used in this analysis yields α s ð1= maxðrÞ; N f ¼ 3Þ ¼ 0.2499 and safely permits the scale variation by the factor 2 or even more. Therefore, the final result and full error budget of our zero temperature lattice calculation are given as We have added the statistical error and the lattice discretization error of the static energy, the total error of the r 1 scale, and the perturbative error in quadrature. In order to compare the current analysis to the previous analysis [5], we use the smaller window ½1=ð ffiffi ffi 2 p rÞ; ffiffi ffi 2 p =r for the variation of the soft scale ν, and do not account for the uncertainty arising from the difference between resumming or not the leading ultrasoft logarithms to obtain δ ffiffi In this case, the perturbative error is not dominant anymore. Thus, the presented analysis has approximately halved the uncertainties of Ref. [5]. Nevertheless our final errors are only 10% (upper error) and 30% (lower error) smaller than the ones in [5], since we have accounted for the other possible sources of uncertainty listed above. The central values of Eq. (4) and of the final result in Ref. [5] coincide.

IV. EXTRACTING α s FROM THE SINGLET FREE ENERGY
In this section, we consider the extraction of the strong coupling from the singlet free energy at nonzero temperature, as it is expected that at small distances medium effects are small. We define the singlet free energy in terms of the correlation function of two thermal Wilson lines in Coulomb gauge At distances much smaller than the inverse temperature rT ≪ 1, we can write using pNRQCD [4] F S ðr; TÞ ¼ V s ðr; μ us Þ þ δF S ðr; T; μ us Þ; ð11Þ where μ us is the ultrasoft scale. The form of the thermal correction depends on the scale hierarchy. One could consider the case 1=r ≫ α s =r ≫ T ≫ m D ∼ gT or the case 1=r ≫ T ≫ m D ∼ gT ≫ α s =r. In the former case μ us ∼ α s =r and δF S ðr; T; μ us Þ ¼ δE US ðμ us Þ þ ΔF S ðr; TÞ with E US ðμ us Þ being the ultrasoft contribution to the static energy in the vacuum. In the latter case μ us ∼ T and δF s ðr; T; μ us Þ has been calculated to order g 5 ; i.e., see Eqs. (16)- (19) in Ref. [3]. In the latter case the cancellation of the ultrasoft factorization scale dependence cannot be verified because of the unknown g 6 contribution to δF s ðr; T; μ us Þ. Since V s ðr; μ us Þ has a term ∼α 3 s lnðμ us rÞ=r, however, the difference between the T ¼ 0 static energy and singlet free energy, EðrÞ − F S ðr; TÞ should have a term ∼α 3 s lnðrTÞ=r. This complicates the extraction of the strong coupling from the singlet free energy. The matching between NRQCD and pNRQCD also induces a term ∼g 6 T in F S ðr; TÞ for both scale hierarchies [4], which also needs to be considered.
The singlet free energy has been studied on the lattice in Ref. [3] using a wide temperature range and several lattice spacings, i.e., several temporal extents N τ . The shortest distance that we can access, due to a single lattice spacing on our finest lattice at T > 0, is 0.00814 fm. For our analysis the relevant data correspond to N τ ¼ 10, 12 and 16, since rT has to be small. From the analysis of Ref. [3] we know that thermal effects are small for rT ≲ 0.3. To understand the temperature dependence of the singlet free energy in more detail we show the lattice results for the difference 6 between the singlet free energy at T > 0 and the static energy at T ¼ 0 for β ¼ 8.4 (corresponding to the finest zero temperature lattice) in Fig. 9. For other β values the results are similar.
From the figure we see that for r=a ≤ ffiffi ffi 6 p the difference approaches a constant proportional to the temperature according to the above expectations and no temperature effects beyond this constant can be seen in this range within the errors of the lattice results. Therefore, we conclude that in this regime the scale hierarchy 1=r ≫ α s =r ≫ T is appropriate. We treat the finite temperature data in this range as if at zero temperature and fit them with the threeloop with leading ultrasoft resummation result for the static energy with standard scales. Alternatively one could use the N τ ¼ 12 data in the range where thermal effects appear to be small together with the two-loop result for the static energy, where the problem of the ultrasoft scale dependence does not enter at all. This, however, would lead to a larger theoretical uncertainty.
For distances r=a > ffiffi ffi 6 p we see some temperature dependence. For the lattice data with N τ ¼ 12 this temperature dependence is to some extent captured by the known g 5 result for δF S ðr; TÞ, but for N τ ¼ 10 and 16 it is off. On the one hand, for N τ ¼ 16, i.e., for a lower temperature, the temperature effects are larger than what is expected based on the g 5 result. Since the deviation of the temperature effects from a constant or from the g 5 result is quite similar to the size of the typical discretization artifacts or the statistical errors of the data for r=a > ffiffi ffi 6 p , it is unclear whether this is truly an effect due to the finite temperature. On the other hand, for N τ ¼ 10, where the same temperature corresponds to a coarser lattice, the temperature effects are in the opposite direction and may be caused by temperature-dependent discretization errors [3].
The known g 5 result consists of two contributions with opposite sign but similar magnitude for the temperatures under consideration, and, therefore, the overall temperature effect is small due to cancellations. As these two contributions are due to either nonstatic or static Matsubara modes at orders g 4 or g 5 , respectively, it is clear that this is an accidental cancellation in the temperature window of our lattice simulations. Furthermore, there is no reason to , or between r=a < ffiffi ffi 8 p and r=a ≥ ffiffi ffi 8 p , respectively. 6 The discretization artifacts between both quantities cancel exactly at tree level. Nonetheless, we still use the tree-level corrected distance in Fig. 9, in order to permit the visual distinction between data that are inequivalent in the full QCD result. Moreover, we matched the perturbative result to the lattice data at r=a ¼ 1 using a constant that mimics the effect of the unknown g 6 T term. expect that discretization artifacts in both contributions are similar enough to achieve a cancellation between them. In fact, it has been shown that the Oða 2 Þ discretization errors appear to be more pronounced in the difference E − F S than in E or F S individually [3].
For these reasons, we conclude that for r=a > ffiffi ffi 6 p we do not have sufficient understanding of thermal effects to use F S for extraction of α s with fully controlled uncertainties. Nonetheless, the cancellations that appear to be at work both in the g 5 result and in the lattice data suggest that a fit of the lattice data with the zero temperature result may still be possible as a cross-check. Hence, we again treat the finite temperature data in this range as if at zero temperature and fit them with the three-loop with leading ultrasoft resummation result with standard scales.
In order to determine α s from the singlet free energy at T > 0 we proceed as follows. We analyze the T ¼ 0 static energy result (N τ ¼ 64) in the same r=a intervals for which we expect that temperature effects in the singlet free energy are under control or are small due to accidental cancellations. We use the same weak-coupling result, namely, the three-loop with leading ultrasoft resummation result with standard scales to obtain α s from the singlet free energy. We estimate the uncertainty due to discretization artifacts to be the same as in the zero temperature analysis, i.e., δ lat ¼ AE0.00021. Similarly, we estimate the uncertainty due to temperature effects from the difference of the central values of α s for any combination of N τ values in each fit interval.
First, we perform fits in the range where temperature effects are expected to be constant and the scale hierarchy 1=r ≫ α s =r ≫ T ≫ m D ∼ gT applies. We report the result of representative fits with r=a ≤ 2 in Table V, which corresponds to rT ≲ 0.17 or 0.13 for N τ ¼ 12 or 16, respectively. The central values of α s in fits to the singlet free energy with N τ ¼ 12 or 16 bracket the result for the static energy at zero temperature, with deviations being smaller than any other uncertainties. Hence, we conclude that we do not resolve nonconstant T > 0 effects as expected. As for the T ¼ 0 analysis, we see a marginal decrease of the central value of α s upon restriction to shorter distances. This decrease is about the same magnitude as in the T ¼ 0 analysis, but is delayed to significantly shorter distances, which supports our interpretation that it is due to imperfections of the nonperturbative correction procedure. Within the smallest statistical uncertainties or within the even smaller estimates of the lattice discretization uncertainties, all of these results are consistent with the zero temperature analysis of the previous section, while the individual estimates of perturbative errors are systematically smaller; compare with Table III. Next, we perform fits in the range where temperature effects beyond a constant appear to be small due to accidental cancellations. We assume the same scale hierarchy 1=r ≫ α s =r ≫ T ≫ m D ∼ gT, noting that it is at present not verifiable if the ultrasoft factorization scale dependence cancels in this case. Namely, we perform the fits with r=a ≤ 3 and report the results in Table V as well. 7 This corresponds to rT ≲ 0.25 or 0.19 for N τ ¼ 12 or 16 respectively. χ 2 =d:o:f. is practically unchanged for N τ ¼ 12, but increases slightly for N τ ¼ 16. Remarkably, the agreement with the previous T ¼ 0 analysis, using up to r=a ≤ 5, does not change significantly as maxðrÞ becomes larger. This suggests that the differences between the results are not caused by effects of the finite temperature, but rather by the more severe influence of the imperfections of the nonperturbative correction procedure in the T ¼ 0 analysis with r=a ≤ 2 or r=a ≤ 3. Hence, the estimate of the δ T>0 errors for r > 0.05 fm cannot be separated from the δ lat errors due to lattice discretization uncertainties.
Let us summarize the considerations of the preceding paragraphs. For T > 0 we have to use r=a ≲ 2 to guarantee the cancellation of ultrasoft factorization scale dependence, although we do not see any indication that it does not work for somewhat larger distances, i.e., r=a ≤ 3. The weakcoupling picture suggests that cancellation between different medium effects is responsible for the small thermal modification of the singlet free energy. While the medium effects according to the weak-coupling picture are not unambiguously resolved in the data, the data seem to be affected by an accidental cancellation in the temperature window under consideration. Perturbative uncertainties are dramatically reduced at smaller distances maxðrÞ. In Given the considerations of the preceding paragraphs, we take the N τ ¼ 12 result for 1 ≤ r=a ≤ 2 and maxðrÞ ¼ 0.030 fm, namely, α s ðM Z Þ ¼ 0.11638 as our final result, which corresponds to r 1 Λ N f ¼3 MS ¼ 0.4900. Furthermore, it has a largely independent error budget. We note that the finite temperature result for these distances does not depend on gauge ensembles corresponding to the larger sea quark mass, i.e., uses exclusively ensembles with m l ¼ m s =20. The scale uncertainty is the same as in the T ¼ 0 analysis, δ scale ¼ AE1.7 MeV for Λ N f ¼3 MS , and δ scale ¼ AE0.00010 for α s ðM Z ; N f ¼ 5Þ. The value α s ð1=r; N f ¼ 3Þ corresponding to the largest distance maxðrÞ ¼ 0.030 fm used in this analysis yields α s ð1= maxðrÞ; N f ¼ 3Þ ¼ 0.1820 and safely permits the scale variation by a factor of 2 or even more. Therefore, the final result and full error budget of our finite temperature lattice calculation are given as We have added the statistical error, the lattice discretization error of the static energy, the finite temperature error due to using the singlet free energy, the total error of the r 1 scale, and the perturbative error in quadrature. . The three-loop with leading ultrasoft resummation with standard scales is shown for the α s ðM Z Þ grid values corresponding to the best fits for the r and rT intervals as indicated. The vertical lines of the same color indicate maxðrÞ of the fits. 7 Using an even larger fit range up to r=a ≤ ffiffiffiffiffi 12 p leads to the same conclusions; i.e., the nonconstant thermal effects are still numerically irrelevant.

V. DISCUSSION
In Secs. III and IV, we have presented two extractions of the strong coupling constant from two different observables, the static energy on zero temperature lattices, and the singlet free energy on finite temperature lattices.
Both results, i.e., α s ðM Z ;N f ¼ 5Þ ¼ 0.11660 þ0.00110 −0.00056 , and α s ðM Z ; N f ¼ 5Þ ¼ 0.11638 þ0.00095 −0.00087 , are in excellent agreement. This is despite the fact that they are based on several statistically independent sets of gauge ensembles corresponding to the QCD vacuum or the quark-gluon plasma and have been obtained using two different sea quark masses. The systematic uncertainties due to the lattice discretization and due to the lattice scale are not independent, but do not play a significant role in either of the two error budgets. Moreover, the two results are obtained in different r ranges, ½0.0237 fm; 0.0734 fm or ½0.0081 fm; 0.0301 fm, which overlap only in a narrow window. As such, we may even consider the perturbative uncertainties of the two analyses as practically independent. We stress that every single fit in our analysis is contained already within the purely statistical errors of either the final T ¼ 0 or the final T > 0 results as long as the soft scale is at ν ¼ 1=r.
There has been an observation in pure Yang-Mills theory that the approach to the continuum limit of r 0 Λ N f ¼0 MS extracted in the QQ-scheme 8 appears to change if data with α 3 QQ ≲ 0.01 are considered [20]. Namely, shorter distances corresponding to smaller α QQ seemed to be related to smaller r 0 Λ N f ¼0 MS in the continuum limit, with about 10% lower continuum value in the pure Yang-Mills theory analysis. We show such a comparison for our data in Fig. 11. The fits with maxðrÞ ≲ 0.05 fm are safely in the α 3 QQ ≲ 0.01 range, and are consistent with either a stable value or a modest drop. To account for possible artifacts in our analysis due to the use of small r=a, we have included the systematic error estimate δ lat ¼ AE0.00021.
Our result from the static energy supersedes the result of the previous analysis [5], α s ðM Z ; N f ¼ 5Þ ¼ 0.1166 þ0.0012 −0.0008 , obtained using three gauge ensembles with light quark masses m l ¼ m s =20. Although it reproduces the previous central value, it includes more conservative estimates of the known sources of systematic error. Nevertheless it succeeds in reducing the uncertainties through the use of finer lattice spacings.
In Ref. [12] the strong coupling was calculated from the static energy using two different analyses of spatially smeared Wilson loop data on three gauge ensembles with a pion mass of about 300 MeV, which are both consistent with our results. The presented two-step analysis (obtaining the continuum limit first and then extracting the strong coupling) yielded α s ðM Z ; N f ¼ 5Þ ¼ 0.1166 þ0.0021 −0.0020 , while the second analysis using a single global fit yielded a higher value α s ðM Z ; N f ¼ 5Þ ¼ 0.1179 þ0.0015 −0.0014 . Our result is slightly smaller than the PDG average [1] α s ðM Z ; N f ¼ 5Þ ¼ 0.1181ð11Þ as well as the recent FLAG average [21] α s ðM Z ; N f ¼ 5Þ ¼ 0.11823ð81Þ. It is also smaller than the result of the ALPHA collaboration [22] α s ðM Z ; N f ¼ 5Þ ¼ 0.11852ð84Þ, which uses step scaling and the Schrödinger functional method. These three values are, however, correlated. Our result is smaller than the results of the HPQCD collaboration using pseudoscalar quarkonium correlators or small Wilson loops [23,24] with three or four sea quark flavors, namely, α s ðM Z ;N f ¼ 5Þ ¼0.1183ð7Þ, or α s ðM Z ;N f ¼ 5Þ ¼ 0.11822ð74Þ, respectively. Our result agrees with the very recent lattice determination in terms of pseudoscalar quarkonium correlators [25], α s ðM Z ; N f ¼ 5Þ ¼ 0.1159ð12Þ, or the preceding analysis of charmonium correlators [26], α s ðM Z ; N f ¼ 5Þ ¼ 0.11622ð83Þ, or the analyses of higher quarkonium moments [27] α s ðM Z ; N f ¼ 5Þ ¼ 0.1176ð26Þ. Moreover, our result is consistent within errors with the recent lattice determinations from the hadronic vacuum polarization [28] Table IV. The data with 0.0697 fm ≤ r are selected from our T ¼ 0 analysis with r=a ≥ ffiffi ffi 8 p ; compare with Table II. All other data sets are with r=a ≥ 1. All individual results are in agreement with our final results, Eqs. (6) and (14), which are displayed as overlapping bands with the corresponding full error budgets. from the gauge-fixed gluon propagator in Landau gauge [29], α s ðM Z ; N f ¼ 5Þ ¼ 0.1172ð11Þ.
Our result is also consistent with phenomenological estimates based on the bottomonium spectrum in NRQCD [30], α s ðM Z ; N f ¼ 5Þ ¼ 0.1178ð51Þ, or quarkonia spectra in pNQRCD [31], α s ðM Z ; N f ¼ 5Þ ¼ 0.1195ð53Þ, which have errors comparable to an earlier extraction from radiative bottomonium decays [32], α s ðM Z ; N f ¼ 5Þ ¼ 0.119 þ0.006 −0.005 . Finally, a recent study of the static energy using a subset of our data in a much larger r window, with an assumption similar to the one of Ref. [12] and approximations to the weak-coupling result that prevent the quantification of the uncertainties, finds α s ðM Z ; N f ¼ 5Þ ¼ 0.1168 [33], which agrees with our result.

VI. CONCLUSIONS
In this paper, we have calculated the static energy at zero temperature and the singlet free energy at finite temperature in 2 þ 1 flavor QCD using the HISQ action, and determined the strong coupling. We improved and extended the previous 2 þ 1 flavor HISQ analysis published in Ref. [5]. We included three finer lattice spacings, overhauled the analysis strategy, treated systematic uncertainties more conservatively, and restricted the analysis to much shorter distances that are deeper in the perturbative regime. Our main results are given in Eqs. (4)- (7). Moreover, for the first time we used the singlet free energy on finite temperature lattices to reach much smaller distances than ever used for the analysis of the static energy, obtaining an independent result with a complementary error budget given in Eqs. (12)- (15). Both results agree without even having to invoke the systematic errors of the weak-coupling calculation.
Both of our results for the central value of α s are lower than a few lattice QCD determinations as well as the PDG average. However, the result from the static energy is compatible within errors. Furthermore, our results agree with some other lattice determinations of α s based on moments of quarkonium correlators [25]. Finally, we stress that the extraction of α s from the static energy is the only lattice extraction based, in the continuum part, on an observable known up to three loops and accurate at order α 4 s . This is due to the fact that the static energy is sensitive to α s already at leading order. (C2PAP) in the project "Static quark correlators in lattice QCD at nonzero temperature" (pr83pu) as well as on the SuperMUC at the Leibniz-Rechenzentrum (LRZ) in the project "Static quark-antiquark energy at zero and finite temperature" (pr48le). We thank the Institute for Nuclear Theory at the University of Washington for its kind hospitality and stimulating research environment. This research was supported in part by the INT's U.S. Department of Energy Grant No. DE-FG02-00ER41132. This research was supported by the Munich Institute for Astro-and Particle Physics (MIAPP) of the DFG cluster of excellence "ORIGINS." The lattice QCD calculations have been performed using the publicly available MILC code. The data analysis was performed using the R-base and nlme packages [34,35].

APPENDIX A: GAUGE ENSEMBLES AND CORRELATORS
In this appendix, we discuss additional ensembles that have not been summarized in Table I, since they are used only for auxiliary calculations, and fits to the primary lattice correlators. Further zero temperature ensembles that have only been used for developing a procedure to treat the discretization artifacts are summarized in Table VI. They also have been generated by the HotQCD collaboration [7,36] using the rational hybrid Monte Carlo (RHMC) algorithm and use 2 þ 1 flavors with the strange quark mass at its physical value and the mass of the two degenerate light quarks at 5% of the strange quark mass.
We have also used the gauge configurations generated for the study of quark number susceptibilities at high temperatures [37] and for the study of color screening [3] by the TUMQCD collaboration. The RHMC algorithm was used for 2 þ 1 quark flavors with the strange quark mass at its physical value and the mass of the two  [2] degenerate light quarks at 5% or 20% of the strange quark mass, i.e., corresponding to the zero temperature ensembles. We summarize the nonzero temperature gauge ensembles used in the determination of the strong coupling in Tables VII and VIII. We have increased the statistics of some N τ ¼ 16 gauge ensembles specifically for this study and updated the Polyakov loop in Table VIII.
In Table IX we summarize the necessary cuts to the fit ranges for the extraction of the static energy from the lattice correlators.

APPENDIX B: DISCRETIZATION ARTIFACTS
In this appendix, we briefly discuss the discretization artifacts of the static energy calculated on the lattice. We follow the treatment outlined in Ref. [3] with a few refinements.
It has been known for a long time that the static energy is affected by discretization artifacts at distances that are only slightly larger than the lattice spacing, i.e., r=a ≳ 1. The ratio of the static energy on the lattice and in the continuum exhibits a distinct pattern of a few percent to permill level variation around 1, which dies out quickly for larger distances; see Fig. 12. This pattern is quite similar in the tree-level calculation and in the QCD result (with or without sea quarks). Hence, one may partly account for these discretization artifacts by dividing the lattice QCD static energy by the ratio of the lattice and continuum static energies from a tree-level calculation, i.e., assuming onegluon exchange and no running coupling. This procedure is  with D 00 ðkÞ being the temporal component of the lattice gluon propagator on the lattice. The results on the lattice static energy at tree level calculated for Wilson and Lüscher-Weisz action are shown in Fig. 12. Writing defines the so-called improved distance r I . In practice, the tree-level correction is included in the comparison between continuum and lattice results either as or-by introducing a fit parameter κ I ðr=a; βÞ for each distinctive path geometry r=a between the quark and antiquark at each lattice spacing-as Here we use the first approach. However, this is not sufficient to remove the discretization artifacts entirely. Ideally, one would replace the treelevel correction by a one-or two-loop correction, i.e., calculating the correction factors using lattice perturbation theory at one-or two-loop level, and including the tadpole factors. We do not pursue this approach here as it requires extensive calculation beyond the scope of this study. As one can see from Fig. 12 discretization errors are at few percent level for r=a ≤ 2. For r=a ¼ ffiffi ffi 5 p and ffiffi ffi 6 p these errors are below a percent and are even smaller for r=a ≥ ffiffi ffi 8 p . Based on previous studies we expect the same pattern of discretization effects in the interacting theory. We show in Fig. 13 that a similar pattern is observed for the HISQ action after the tree-level correction. For distances r=a ≥ ffiffi ffi 8 p the statistical errors of the static energy are larger than the discretization errors.
Hence, there are three options to cope with the situation: The first option is to completely avoid the data with significant residual discretization artifacts. This corresponds to the first analysis strategy discussed in Sec. III.
The second option is to estimate the size of the residual discretization artifacts in the data at small r=a and combine a systematic error estimate of similar magnitude with the statistical errors to reduce the statistical weight of these data.
The first quark-antiquark distance that we can access with distinctive path geometries (ignoring permutations) is r=a ¼ 3, specifically, either through r=a ¼ ð3; 0; 0Þ or (2,2,1). We average these two results as E lat ðr ¼ 3aÞ, and use a Cornell fit to interpolate between the data for r I ð ffiffi ffi 8 p aÞ; 3a, and r I ð ffiffiffiffiffi 10 p aÞ to the distances r I ðð3;0;0ÞaÞ ¼ 2.9788a and r I ðð2; 2; 1ÞaÞ ¼ 3.0123a. The difference Δ lat ðr=aÞ ¼ E lat ðr=aÞ − E int ðr=aÞ between the data and the interpolation at r=a ¼ ð3; 0; 0Þ, or (2,2,1) already includes the tree-level correction, i.e., Δ lat ðð3; 0; 0ÞÞ − Δ lat ðð2; 2; 1ÞÞ is an estimate for the residual discretization artifacts after the tree-level correction. On the other hand, E lat ðð3; 0; 0ÞÞ − E lat ðð2; 2; 1ÞÞ is an estimate for the discretization artifacts without the tree-level correction. The latter is usually about 0.1%, the former is usually 1 order of magnitude smaller. Since the latter certainly overestimates the residual discretization artifacts after tree-level correction and the former may be accidentally small due to statistical fluctuations, we proceed using the geometric average of the two. We smooth out fluctuations between these geometric averages for different lattice spacings by demanding that the systematic error estimate has to be at least as large as the mean of the systematic error for the two neighboring lattice spacings. We summarize the systematic error estimates for the different lattices in Table X. The absolute size of the estimate of the residual discretization error is typically less than 0.5% of 1=r 1 , namely, less than 3 MeV. This option is suitable for r=a ≥ ffiffi ffi 5 p at reasonable accuracy and should be applied for all distances with at most two steps along any of the lattice axes.
Finally, we may estimate the residual discretization artifacts in the data for each individual r=a, and correct for these errors before comparing lattice data with the weak-coupling results. This is the procedure that has been used in Refs. [3,5]. In this procedure, we define the correction factors which is compared to the continuum static energy E cont ðr I Þ.
In the following, we spell out how we estimate the residual discretization artifacts, and how we correct for these. We use two different estimates for the continuum E cont ðr I Þ.
(a) For the lattices under consideration and r=a ≥ ffiffi ffi 5 p the relative discretization artifacts are less than 0.1% after the tree-level correction. We interpolate the static energy on the finest lattices at such distances to obtain a continuum estimate 9 that can be used to take ratios of the QCD static energy on coarser lattices and in the continuum.
Our updated interpolation procedure differs in three ways from the procedure outlined in Ref. [3]. Before the interpolation we add a systematic error estimate of 1.5%, 0.5%, 0.5%, 0.5%, 0.04%, 0.02%, and 0.3% to the data on fine lattices at r=a ¼ 1; ffiffi ffi 2 p ; ffiffi ffi 3 p ; 2; ffiffi ffi 5 p ; ffiffi ffi 6 p , and ffiffiffiffiffi 12 p , respectively. Furthermore, we modify the data before the interpolation by −1.0%, þ0.5%, −0.5%, and −0.3% for r=a ¼ 1; ffiffi ffi 3 p ; 2, and ffiffiffiffiffi 12 p , respectively. These estimates of systematic errors are motivated by our previous analysis of the discretization artifacts at short distances [3]. We split the interpolation with a modified Cornell potential into two intervals, a shorter distance (usually seven points) and a larger distance interval, whose details are decided dynamically depending on the χ 2 =d:o:f. of the fit, and switch between the two interpolations in an overlap region. In each interval we require fewer higher-order terms than for a single fit of all data. We use a spline interpolation to estimate the model dependence. In the next step, we calculate a weighted average hKðr=a; βÞi of the ratios Kðr=a; β; β ref Þ obtained with continuum estimates based on different reference lattices (indicated by β ref ) for each r=a and β as previously.
Using the static energy at zero temperature restricted to r=a ≤ 8 we vary the reference lattice down to β ref ≥ 7.28 and obtain correction factors for all but the finest lattice (β ¼ 8.4) down to β ≥ 6.664. Using the singlet free energy with N τ ¼ 12 restricted to r=a ≤ ffiffiffiffiffi 12 p , namely, rT ≤ 1= ffiffiffiffiffi 12 p ≈ 0.289, we obtain correction factors for all but the finest lattice (β ¼ 9.67) down to β ≥ 7.373. TABLE X. Systematic uncertainty estimates for the discretization artifacts from the static energy at T ¼ 0 or the singlet free energy with N τ ¼ 16 or 12 at r=a ¼ 3. The estimates have been calculated individually for each data set. The fluctuations within a factor 2 or 3 between the uncertainty estimates for the different data sets are due to statistical fluctuations of the data.  9 We extrapolate to 10% shorter distances; see Ref. [3].
have been obtained from the plaquette at T ¼ 0, or with N τ ¼ 12 or 16. For the extrapolation we use K r a ; α lat We try extrapolations with up to N ¼ 3 terms, but never require the third-order term; i.e., we cannot fix its coefficient with small uncertainties. While data are quite insensitive to the second-order term for r=a ≥ 2, it cannot be avoided for r=a < 2. The coefficients of the first-and second-order term are anticorrelated. Only in the case of the Cornell fit to lattice data at zero temperature we cannot resolve the second-order term successfully. We also tested fits which include the zeroth-order term, but it turned out that this term was never necessary. For each estimate (Cornell vs weak-coupling fits and zero vs finite temperature) we take the median of the correction factors from fits including data starting at minðr=aÞ ¼ ffiffi ffi 1 p up to minðr=aÞ ¼ ffiffi ffi 5 p . We estimate the overall error from typical uncertainties, or the spread between results for different minðr=aÞ, whichever is larger. We show the median of the extrapolations for each fit type as function of α lat s as colored bands in Fig. 14. While the agreement between the median of Cornell fits and the median of weak-coupling fits is not satisfactory for fits to T ¼ 0 data, the different fit results to T > 0 data agree very well. Moreover, for the correction factor hKðr=a < 2; βÞi, where the data are sensitive to the second-order term in α lat s , the extrapolation of the results obtained with T > 0 data also extends to the T ¼ 0 results at larger values of the coupling, i.e., on coarse lattices. Therefore, in order to obtain our final results, we take the mean of the Cornell fit and the weak-coupling fit to the N τ ¼ 12 data, and estimate the uncertainty by the largest among the errors of the two results, or half of the spread between the results. Since this uncertainty still underestimates the spread between hKðr=a ¼ 1; βÞi for different fit types and among different β, we add a systematic error estimate of 0.003 in quadrature to the errors of this procedure. We summarize the coefficients corresponding to both fit types in Table XII and show the averaged result as a gray band in Fig. 14. While the corrections are significant and constrained quite well up to r=a ¼ ð2; 1; 1Þ, the corrections beyond tree level are consistent with 0 for larger distances.

APPENDIX C: COEFFICIENTS OF THE FORCE AT THREE LOOPS
In this appendix, we summarize the constants appearing in the weak-coupling result of the force at three-loop order in Eqs. (2) and (3). The color factors read where N c is the number of colors. The QCD beta function is where the first three coefficients read [38,39] N f is the number of active massless quarks. Higher-order coefficients β 3 , and β 4 are also known [38,39], but are not reproduced for brevity's sake. We reproduce the coefficientsã i as given in Ref. [13],