Connector tensor networks: a renormalization-type approach to quantum certification

As quantum technologies develop, we acquire control of an ever-growing number of quantum systems. Unfortunately, current tools to detect relevant quantum properties of quantum states, such as entanglement and Bell nonlocality, suffer from severe scalability issues and can only be computed for systems of a very modest size, of around $6$ sites. In order to address large many-body systems, we propose a renormalisation-type approach based on a class of local linear transformations, called connectors, which can be used to coarse-grain the system in a way that preserves the property under investigation. Repeated coarse-graining produces a system of manageable size, whose properties can then be explored by means of the usual techniques for small systems. In case of a successful detection of the desired property, the method outputs a linear witness which admits an exact tensor network representation, composed of connectors. We demonstrate the power of our method by certifying, on a normal desktop computer, entanglement, Bell nonlocality and supra-quantum Bell nonlocality in systems with hundreds of sites.

A central goal in quantum information theory is to detect interesting global properties of few- or many-body systems. For example, traditionally one may be interested in detecting whether a given quantum many-body state is entangled [1], or whether a given conditional probability distribution contains non-classical correlations, in the sense of violating a Bell inequality [2]. More recent ventures include detecting quantum causality properties [3] or the minimal local Hilbert space dimension of each party in a Bell test necessary for a given violation [4]. The basic underlying approach to all these tasks is the same: we consider a property (entanglement, nonlocality, dimensionality, etc.) of the system that we wish to falsify, derive an operational limitation on the set of all systems satisfying this property and then show that this limitation is violated in the experiment.
Effective numerical tools to derive the operational limitations of small systems [5][6][7][8][9][10][11] are available. These allow detection of global properties such as entanglement and nonlocality in three or even four-partite systems in a few minutes using a regular desktop computer. Unfortunately, the analysis of large systems presents two problems.
The first is that the number of parameters required to fully specify the operational behavior of a many-body system increases exponentially with its size. So do the resources (e.g., the number of experiments) needed to estimate all such parameters. That is, even prior to detection, one cannot efficiently specify the state of the system in general. However, many natural quantum states admit an efficient tensor network representation, see e.g. [12][13][14][15][16][17], which has been exploited to develop tomographic protocols to characterize such states with a number of experiments that scales only polynomially with the system size [18,19]. Assuming that the quantum states underlying our experiments are somehow typical, and can be represented by a tensor network, it is therefore possible to circumvent this problem.
The second problem is that, even when the system can be efficiently represented, the computational resources required to detect the relevant global properties of the system also scale exponentially with the system size. Consequently, as experiments with quantum simulators and condensed matter systems progress, we get access to larger and larger systems whose non-classical properties cannot be detected with current theoretical tools.
In this work, we propose a general approach to solve this second problem. Our approach will rely heavily on the framework of tensor networks, and also, somewhat surprisingly, on a central concept from quantum foundations: the framework of Generalized Probabilistic Theories (GPTs) [20][21][22][23][24]. It will also require techniques from convex optimization theory [25,26]. The result, connector theory, will allow us to detect global properties of many-body systems via algorithms whose time and memory complexity scales linearly with the system size. This lets us access systems made of hundreds of sites.
The key insight underlying connector theory is that one can construct local transformations to coarse-grain the many-body system to an effective small system, say with 2 or 3 sites, while preserving the global property of interest. Subsequently, if one can detect that the resulting small system has the desired property, then so does the original system.
Our inspiration comes from renormalisation approaches and, in particular, coarse-graining techniques that have proved very effective in, e.g., diagonalizing large quantum many-body Hamiltonians in condensed matter physics, where one is often interested only in the low-energy subspace. The many-body Hamiltonian can be coarse-grained to a few-site Hamiltonian (which can be exactly diagonalized) while preserving its low-energy subspace. This strategy has led to the invention of groundbreaking simulation algorithms for large condensed matter systems.
Before presenting all the details of the method, it is useful to illustrate the main idea with an example. Suppose you are given a quantum state $\rho$ of $m$ particles and your task is to certify that the state is entangled. An $m$-body quantum state is separable if it can be expressed as
$$\rho = \sum_i p_i\, \rho_i^1 \otimes \rho_i^2 \otimes \cdots \otimes \rho_i^m,$$
where $p_i \geq 0$, $\sum_i p_i = 1$, and the $\rho_i^j$ are normalized quantum states. The state $\rho$ is entangled if it does not admit such a decomposition. Now, assume you have a linear transformation $T$ mapping two systems into one, such that product states are transformed into valid quantum states, that is, $T(\rho \otimes \sigma) \geq 0$ for all $\rho$ and $\sigma$. Note that we do not require the map to be physical, that is, it may produce non-positive operators when applied to an (entangled) input state. Clearly, the application of this map to a separable $m$-body state results in a separable $(m-1)$-body state. By repeatedly applying maps of this form to the initial state, it is possible to reach a size at which standard separability tests, including state positivity, can be applied. If the coarse-grained state fails any of these tests, we can certify that the initial $m$-body state was entangled.
However, to apply this idea in practice, many aspects need to be sorted out. For instance, one needs to find a way of applying the maps $T$ without having to deal with the complete $m$-body quantum state, and to provide the tools to construct them. These and other issues are presented in what follows and constitute the main technical results of this work.
The article is organized as follows. In Section II, we begin by reviewing the three essential ingredients of connector theory: tensor networks (as an efficient representation of quantum many-body states), convex optimization and the formalism of GPTs. This section also introduces the graphical notation for tensor networks that is convenient to explain the basics of connector theory, and is used in the rest of the paper. In Section III, we explain the use of connector theory for detecting Bell nonlocality as a case study. In Sections IV and V, we describe how to apply the formalism of connector theory to detect supra-quantum nonlocality and entanglement in quantum many-body systems. Finally, in Section VI, we present our conclusions.

II. BACKGROUND
The main objective of this first section is to review the main ingredients used in our construction: tensor networks, techniques from convex optimization theory and generalized probabilistic theories (GPT).

A. Tensor networks
We start by providing a broad introduction to the formalism of tensor networks as it appears in quantum information theory and condensed matter physics for efficiently representing quantum many-body states. For a review, see [17].
Broadly speaking, a tensor network is a set of tensors that are interconnected or contracted according to a given network. By a tensor we simply mean a multidimensional array of complex coefficients, an object that generalizes the notion of vectors and matrices. More precisely, a tensor $T^{i_1 i_2 \ldots i_m}_{j_1 j_2 \ldots j_n}$ is a linear map from the tensor product of a set of input vector spaces to the tensor product of some output vector spaces,
$$T : V_1 \otimes V_2 \otimes \cdots \otimes V_m \to W_1 \otimes W_2 \otimes \cdots \otimes W_n.$$
The indices $i_1, i_2, \ldots, i_m$ label an orthonormal basis in the input spaces $V_1, V_2, \ldots, V_m$ respectively, whereas the indices $j_1, j_2, \ldots, j_n$ label an orthonormal basis in the output spaces $W_1, W_2, \ldots, W_n$ respectively. For convenience, we represent tensors graphically as illustrated in Fig. 1. A tensor is depicted by a shape and its indices are depicted by directed lines emanating from the shape. Input and output indices, which we later need to distinguish, are indicated by attaching incoming and outgoing arrows to the corresponding lines.
One can obtain a new tensor by contracting or multiplying together a set of tensors. Tensor contraction generalizes the notion of matrix multiplication. Two matrices $M$ and $N$ can be multiplied to obtain a new matrix $R \equiv MN$. In tensor notation we write
$$R_{ij} = \sum_k M_{ik} N_{kj}. \qquad (4)$$
We graphically depict this by connecting an output index of matrix $M$ with an input index of matrix $N$, as shown in Fig. 1(d). Of course, the dimension of the output index of $M$ must equal the dimension of the input index of $N$. The dimension of an index is the number of values the index runs over. For example, if $R$ in Eq. (4) is a $2 \times 3$ matrix then the dimensions of indices $i$ and $j$ are equal to 2 and 3, respectively. A more general tensor contraction is illustrated in Fig. 1(e), where three tensors $A$, $B$ and $C$ are contracted to obtain a 4-index tensor $T$. In a contraction, the indices that are left uncontracted are called open indices, e.g., $i$, $j$, $k$ and $l$. On the other hand, indices that are summed over are called bond indices, e.g., $a$, $b$ and $c$.
The generic pure quantum state $|\Psi\rangle$ of a large many-body system, e.g., a lattice of qubits, requires specifying $2^N$ probability amplitudes, where $N$ is the number of qubits:
$$|\Psi\rangle = \sum_{i_1, i_2, \ldots, i_N} \Psi_{i_1 i_2 \ldots i_N}\, |i_1\rangle |i_2\rangle \cdots |i_N\rangle. \qquad (6)$$
Even more parameters, of the order of $4^N$, are needed if the state is mixed. However, interesting states such as ground states or thermal states of local Hamiltonians often contain a limited amount of correlations. This can be exploited to efficiently represent them by decomposing the large quantum many-body wavefunction encoded in the exponentially large $N$-index tensor $\Psi_{i_1 i_2 \ldots i_N}$ into a product of small tensors, namely, as a tensor network. Fig. 2 illustrates two popular tensor network decompositions, matrix product states (MPSs) [12,13] and the multi-scale entanglement renormalization ansatz (MERA) [15], that have been used to efficiently represent ground states of local Hamiltonians acting on a one-dimensional quantum lattice.
Later, we will propose the use of connector tensor networks that are structurally similar to these decompositions.
Here, the sites of the lattice correspond to the open indices of tensor networks, while the bond indices carry the entanglement and correlations in the state. The dimensions of the bond indices generally indicate the amount of entanglement and correlations in the state: a larger bond dimension generally corresponds to larger entanglement and correlations. By contracting together all the tensors in a tensor network one recovers the probability amplitudes in Eq. (6). The choice of the network pattern of the tensor network decomposition of a given state is dictated by the specific structure of entanglement in the state.
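As a minimal numerical sketch of this statement (written in Python with NumPy, which is a choice made for this illustration and not the software used for the results reported here), the following code builds a random open-boundary MPS for 8 qubits with bond dimension 3 and contracts all bond indices to recover the full amplitude tensor. The helper names `random_mps` and `mps_to_amplitudes` are hypothetical.

```python
import numpy as np

def random_mps(n_sites, phys_dim=2, bond_dim=3, seed=0):
    """Random open-boundary MPS: tensors A[k] of shape (left bond, site, right bond)."""
    rng = np.random.default_rng(seed)
    dims = [1] + [bond_dim] * (n_sites - 1) + [1]
    return [rng.normal(size=(dims[k], phys_dim, dims[k + 1]))
            for k in range(n_sites)]

def mps_to_amplitudes(mps):
    """Contract all bond indices to recover the N-index tensor Psi[i1, ..., iN]."""
    psi = mps[0]
    for A in mps[1:]:
        # Sum over the shared bond: last index of psi with first index of A.
        psi = np.tensordot(psi, A, axes=([-1], [0]))
    # Drop the two dummy boundary bonds of dimension 1.
    return psi.reshape(psi.shape[1:-1])

mps = random_mps(n_sites=8)
psi = mps_to_amplitudes(mps)

# Number of parameters: MPS tensors versus the full amplitude tensor.
n_mps_params = sum(A.size for A in mps)   # 6 + 6 * 18 + 6 = 120
n_full_params = psi.size                  # 2**8 = 256
print(n_mps_params, n_full_params)
```

The number of MPS parameters grows linearly with the number of sites at fixed bond dimension, whereas the number of amplitudes grows exponentially.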
In a condensed-matter context, one typically uses these tensor networks as an ansatz for the unknown ground state (or a low energy subspace) of a given Hamiltonian and determines the tensors numerically by means of, say, a variational energy minimization. The maximum bond dimension in a tensor network ansatz determines both the cost of the numerical optimization and the accuracy of the approximation. Here, we propose a novel application of tensor networks in the context of certification of relevant quantum properties, for instance, as witnesses for entanglement and non-locality. Moreover, and as described below, these tensor networks can be understood as measurements in a general probabilistic theory, thus extending the formalism of tensor networks beyond quantum theory.

B. Convex optimization theory
Let $V$ be a vector space, and let $\mathcal{X} \subset V$ be a convex subset thereof. The goal of convex optimization is to solve problems of the form
$$\min f(\bar{x}) \quad \text{such that } \bar{x} \in \mathcal{X},$$
where $f$ is a convex function, i.e., $f(p\bar{x}_1 + (1-p)\bar{x}_2) \leq p f(\bar{x}_1) + (1-p) f(\bar{x}_2)$ for $\bar{x}_1, \bar{x}_2 \in \mathcal{X}$, $0 \leq p \leq 1$. Any vector of variables $\bar{x} \in V$ satisfying $\bar{x} \in \mathcal{X}$ is said to be a feasible point. Linear programming (LP) [25] is a branch of convex optimization where $\mathcal{X}$ is a polytope (a convex set defined by a finite number of linear inequalities) and $f$ is a linear function of the variables $\bar{x} \in \mathbb{R}^n$ of the optimization problem. Linear programming is thus concerned with optimization problems of the sort:
$$\min \bar{c} \cdot \bar{x} \quad \text{such that } A\bar{x} \geq \bar{b}. \qquad (8)$$
Here the $m \times n$ matrix $A$, $\bar{b} \in \mathbb{R}^m$ and $\bar{c} \in \mathbb{R}^n$ are the inputs of the problem. For any pair of vectors $\bar{y}, \bar{z}$ of identical size, the notation $\bar{y} \geq \bar{z}$ indicates that $y_i - z_i \geq 0$ for all $i$. As we will see, linear programming is an instrumental tool for nonlocality detection.
In order to deal with entanglement and quantum nonlocality, we use a more sophisticated tool, namely semidefinite programming (SDP) [26]. A semidefinite program is an optimization problem of the form:
$$\min \bar{c} \cdot \bar{x} \quad \text{such that } F_0 + \sum_i x_i F_i \geq 0. \qquad (9)$$
This time the $m \times m$ matrices $F_0$, $\{F_i\}$ and the vector $\bar{c}$ constitute the problem input. Beware the change in notation: if $A$ is a square matrix, then $A \geq 0$ is used to denote that $A$ is positive semidefinite, i.e., it is self-adjoint and all its eigenvalues are non-negative.
There exist free solvers available to solve both linear and semidefinite programs. These solvers exploit convex optimization theory to provide not only an approximate solution of the problem, but also rigorous bounds on how this figure differs from the exact value. For linear programs of any size, we recommend the MATLAB solver Gurobi [27]; the packages Sedumi [28] and Mosek [29] are appropriate, respectively, to solve small and large instances of semidefinite programs. We recommend not to work with these solvers directly, but through general optimization MATLAB packages, such as YALMIP [30] or CVX [31,32]. The advantage of using either of these packages is that the user does not need to write the programs in the standard form (8), (9): it is enough to indicate what linear or semidefinite constraints the variables $\bar{x}$ of the problem must be subjected to.
Unless otherwise specified, in all our numerical computations we make use of YALMIP [30] in combination with Gurobi [27] (for LPs) or Mosek [29] (for SDPs).

C. Generalized probabilistic theories
The formalism of GPTs was conceived to reason about physical theories beyond quantum physics. In a sense, it conveys an operational description of what one can do within a physical theory, but without a correspondence principle to relate the mathematical formalism of the theory to the instruments of an experimental workshop. Viewed as a GPT, quantum physics is a theory where each system is labeled by a natural number $D$ (the dimension). Normalized (subnormalized) states of a system of dimension $D$ are described by $D \times D$ complex positive semidefinite matrices with trace (smaller than or equal to) 1; measurements are defined by Positive Operator Valued Measures (POVMs); and transformations, by completely positive trace-preserving maps. Also, states of a bipartite system of dimensions $D$, $D'$ are in one-to-one correspondence with the states of a system of dimension $DD'$.
More generally, a GPT is specified by a list of possible system types, together with composition rules specifying which system type describes the combination of several other types. In a GPT, the state of a given system of type $S$ is identified with a vector [57] $\bar{v}$ living in a space $H_S$. The set of possible states of $S$ corresponds to a convex set $C_S \subset H_S$. For every system $S$ we assume the existence of a vector $\bar{e}_S \in H_S$, the unit effect, whose scalar product with any state returns the norm of the state, or the probability that the state was successfully prepared. It follows that, for all $\bar{v} \in C_S$, $\bar{e}_S \cdot \bar{v} \leq 1$. Moreover, $\bar{v} \in C_S$ is a deterministic preparation iff $\bar{e}_S \cdot \bar{v} = 1$. Sometimes, for simplicity, we use the notation $E(\bar{v}) = \bar{e}_S \cdot \bar{v}$.
In the following we will only consider GPTs which satisfy local tomography [20]. In our language, this implies that $H_{S \otimes S'} = H_S \otimes H_{S'}$, where $S \otimes S'$ denotes the composition of systems $S$, $S'$. To recover the marginal state $\bar{v}_S$ of system $S$ from the joint state $\bar{v}_{SS'}$ of systems $S$, $S'$, we apply the unit effect over the space $H_{S'}$, i.e., $\bar{v}_S = (I_S \otimes \bar{e}_{S'}) \cdot \bar{v}_{SS'}$.
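For quantum theory viewed as a GPT, the unit effect is the trace (the identity operator, paired with states through the Hilbert-Schmidt product), so the marginal formula reduces to the familiar partial trace. The following sketch, in Python with NumPy (an illustrative choice; the function name `apply_unit_effect_on_second` is hypothetical), checks this on a maximally entangled two-qubit state and on a product of a normalized and a subnormalized state.

```python
import numpy as np

def apply_unit_effect_on_second(rho_joint, d1, d2):
    """(I_S x e_S') . v_SS' for quantum theory: contract system S' with the identity."""
    r = rho_joint.reshape(d1, d2, d1, d2)   # r[i, j, k, l] = <i j| rho |k l>
    e2 = np.eye(d2)                         # unit effect of the second system
    return np.einsum("ijkl,jl->ik", r, e2)

# Maximally entangled two-qubit state (|00> + |11>)/sqrt(2).
phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho_bell = np.outer(phi, phi)
marginal_bell = apply_unit_effect_on_second(rho_bell, 2, 2)

# Product of a normalized state and a subnormalized state of norm 0.3.
rho_a = np.array([[0.75, 0.25], [0.25, 0.25]])
rho_b = 0.3 * np.array([[0.5, 0.0], [0.0, 0.5]])
marginal_prod = apply_unit_effect_on_second(np.kron(rho_a, rho_b), 2, 2)

print(marginal_bell)
print(marginal_prod)
```

In the second example the marginal is the first state rescaled by the norm of the second, in line with the interpretation of the norm as a preparation probability.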
Any (non-deterministic) transformation of system $S$ into another system of type $S'$ corresponds to a linear map $W : H_S \to H_{S'}$ with the property that, for any system type $T$,
$$(W \otimes I_T)\, \bar{v}_{ST} \in C_{S'T} \quad \text{for all } \bar{v}_{ST} \in C_{ST}. \qquad (10)$$
When the output system $S'$ has dimension 1, the transformation corresponds to a vector $\bar{w} \in H_S$, and it physically represents an effect. The probability that the event signified by $\bar{w}$ occurs is then given by $\bar{w} \cdot \bar{v}_S$.
Effects must not be confused with witnesses. A normalized witness is a vector $\bar{w} \in H_S$ with the property $0 \leq \bar{w} \cdot \bar{v} \leq 1$ for all $\bar{v} \in C_S$. An effect has, in addition, the property that $(\bar{w}^T \otimes I_T) \cdot \bar{v}_{ST}$ is a state in $C_T$ if $\bar{v}_{ST} \in C_{ST}$. While all effects are normalized witnesses, (in general GPTs) not all normalized witnesses are effects.
FIG. 3: The different systems are seen as black boxes, each producing the classical output $a_i$ after receiving the classical input $x_i$. The whole scenario is described by the conditional probability distribution $P(a_1, \ldots, a_m | x_1, \ldots, x_m)$.

III. A CASE STUDY: BELL NONLOCALITY
Having presented the main building blocks of the construction, we now illustrate how to use connector theory to detect relevant properties of large systems. We do so by means of several examples, starting with the detection of Bell nonlocality. We explain the connector formalism in this scenario, characterize connectors for detecting Bell nonlocality and present techniques for optimizing them. In the following sections, we also show how to use connectors in the context of supra-quantum nonlocality and entanglement detection.
Consider an $m$-partite Bell scenario, where each party interacts with a black box, to which it can input a symbol $x$ and then obtain an output $a$, see Fig. 3. The operational description of this box is given by the probabilities $P(a_1, \ldots, a_m | x_1, \ldots, x_m)$. We will represent these probabilities also as a tensor with $m$ incoming and $m$ outgoing indices, namely,
$$P^{x_1 \ldots x_m}_{a_1 \ldots a_m} \equiv P(a_1, \ldots, a_m | x_1, \ldots, x_m),$$
and graphically represent this tensor as shown in Fig. 4(a). We assume that this box is non-signalling, i.e., for any $k \in \{1, \ldots, m\}$, the marginal probability distribution
$$\sum_{a_k} P(a_1, \ldots, a_m | x_1, \ldots, x_m)$$
does not depend on $x_k$. We are asked whether box $P$ is Bell local, i.e., whether the correlations $P^{x_1 \ldots x_m}_{a_1 \ldots a_m}$ can be expressed as
$$P^{x_1 \ldots x_m}_{a_1 \ldots a_m} = \sum_\lambda p(\lambda) \prod_{j=1}^m (P_j)^{x_j, \lambda}_{a_j}, \qquad (13)$$
where $p(\lambda)$, $(P_j)^{x_j, \lambda}_{a_j} \equiv P_j(a_j | x_j, \lambda)$ are arbitrary probability distributions, see Fig. 4(c). We also refer to Bell local boxes as classical boxes.
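To make these definitions concrete, the following sketch (Python with NumPy, an illustrative choice) builds a bipartite box from a randomly drawn local hidden variable model, as in the decomposition over $\lambda$, and checks the non-signalling condition numerically. The function names, the number of hidden-variable values and the index ordering `P[a1, a2, x1, x2]` are assumptions made for this example only.

```python
import numpy as np

def local_box(p_lam, P1, P2):
    """Bipartite box from a local hidden variable model.

    p_lam[l]    = p(lambda)
    Pj[a, x, l] = P_j(a | x, lambda)
    returns P[a1, a2, x1, x2] = P(a1, a2 | x1, x2)
    """
    return np.einsum("l,axl,byl->abxy", p_lam, P1, P2)

def is_nonsignalling(P, tol=1e-12):
    """Each party's marginal must not depend on the other party's input."""
    m1 = P.sum(axis=1)   # indices (a1, x1, x2): must not depend on x2
    m2 = P.sum(axis=0)   # indices (a2, x1, x2): must not depend on x1
    return bool(np.allclose(m1, m1[:, :, :1], atol=tol)
                and np.allclose(m2, m2[:, :1, :], atol=tol))

rng = np.random.default_rng(1)
n_lam = 5                                   # number of hidden-variable values

def random_response():
    """Random response function P_j(a | x, lambda), normalized over a."""
    R = rng.random((2, 2, n_lam))
    return R / R.sum(axis=0, keepdims=True)

p_lam = rng.random(n_lam)
p_lam /= p_lam.sum()
P_local = local_box(p_lam, random_response(), random_response())

# Signalling counterexample: party 1 simply outputs party 2's input.
P_sig = np.zeros((2, 2, 2, 2))
for x1 in range(2):
    for x2 in range(2):
        P_sig[x2, 0, x1, x2] = 1.0

print(is_nonsignalling(P_local), is_nonsignalling(P_sig))
```

Every box built in this way is non-signalling, while the converse does not hold: non-signalling boxes need not admit a local model.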
To help us answer this question, we introduce a GPT that we call LOC-world. Intuitively, each system in LOC-world corresponds to a multipartite box, with a number of possible inputs and outputs. A general system is thus labelled by a vector of natural numbers of the form $[O_1, \ldots, O_m, I_1, \ldots, I_m]$, which denotes that the $k$th party's box has $I_k$ possible inputs and $O_k$ possible outputs. In LOC-world, the set of states of any system of type $[O_1, \ldots, O_m, I_1, \ldots, I_m]$ corresponds to the set of unnormalized probabilities $P^{x_1 \ldots x_m}_{a_1 \ldots a_m}$ of the form (13). We define the norm of a state $P$ in LOC-world as
$$E(P) \equiv \sum_{a_1, \ldots, a_m} P^{x_1 \ldots x_m}_{a_1 \ldots a_m},$$
which, for states of the form (13), does not depend on the inputs $x_1, \ldots, x_m$. Valid transformations $W$ in LOC-world correspond to linear maps which, acting on part of a classical box $P$, return a classical box $P' = (W \otimes I)P$. To be interpreted as non-deterministic transformations, such maps must satisfy the condition $E(P') \leq E(P)$. We call such transformations connectors (in LOC-world).
FIG. 4: In panel (c), $D$ is an $m$-index identity tensor (also called a copy tensor), namely, a tensor whose only non-zero components are $D_{ii \ldots i} = 1$, and $p_\lambda$ is a vector of probabilities $p(\lambda)$. (d) For convenience, we combine each pair $(a_k, x_k)$ of input and output indices of a box into a single outgoing index $y_k \equiv (a_k, x_k)$.
FIG. 5: At Level 3 we obtain a 3-party box, whose nonlocality can be probed exactly with a known 3-party witness $W^8$ (red).
The action of a connector $C$ on the first $m$ parties of $P$ can be expressed as the contraction of the first $m$ outgoing legs of $P$ with the incoming legs of $C$, as depicted in Fig. 5(b). Now, suppose that, given a non-signalling box $P$ for 9 parties, we applied to it connectors $W^1, W^2, \ldots, W^8$ as depicted in Fig. 5(c). Consider the case in which the output system of the topmost connector $W^8$ is of type $[1, 1]$, resulting in a joint probability distribution for a one-input and one-output situation, that is, a probability $0 \leq p \leq 1$. Then the action of all the connectors $W^1, W^2, \ldots, W^8$ can be interpreted as a measurement $W$ in LOC-world. Clearly, if $W(P) < 0$ or $W(P) > 1$, it follows that $P$ did not belong to the class of states of LOC-world. In other words: $P$ is nonlocal.
This observation is the basis of connector theory. Namely, any tensor network of connectors that has (1) no outgoing arrows and (2) no cycles defines a normalized Bell inequality [33], i.e., a linear functional $W$ with the property that $0 \leq W(P) \leq E(P)$ for all classical boxes $P$. It is easy to see that any attempt at enlarging the set of connectors with extra tensors will result in the loss of this property.
Note that, in principle, there is a more general strategy to detect that our original tensor does not belong to the set of states of the theory. Namely, use connectors to coarse-grain the system and then apply a witness to prove the non-physicality of the resulting coarse-grained network. For instance, in Fig. 5(c) we would replace the connector $W^8$ by an arbitrary normalized witness. Since the set of normalized witnesses contains the set of effects in a GPT, this method should allow us to detect more instances of Bell nonlocality. However, in LOC-world, as well as in the other two GPTs we define in the following for supra-quantum and entanglement detection, normalized witnesses happen to be connectors as well, so it suffices to consider connector tensor networks.

A. Characterizing connectors in LOC-world
Due to the structure (13) of the set of classical boxes, LOC-world has the convenient property that any linear map fulfilling condition (10) with $T = \emptyset$ constitutes a valid transformation. That is, if a given map in LOC-world is valid when acting on a system, it is also a valid map when acting on parts of a larger system. Let us see why.
Any box of the form $P^x_a$ (a single-party box with $n$ inputs and $d$ outputs) can be expressed as
$$P^x_a = \sum_{\bar{a}} p_{\bar{a}}\, P^{x, \bar{a}}_a,$$
where $p_{\bar{a}}$ is a probability distribution over $\bar{a} = (a_1, \ldots, a_n) \in \{1, \ldots, d\}^n$ and $P^{x, \bar{a}}_a = \delta_{a, a_x}$ are deterministic boxes. Absorbing $p_{\bar{a}}$ in the definition of the hidden variable $\lambda$ in eq. (13), we have that an $m$-partite box is classical iff it can be expressed as a convex combination of $m$-partite deterministic boxes of the form $P^{x_1, \ldots, x_m}_{a_1, \ldots, a_m} = \prod_{k=1}^m P^{x_k, \bar{a}^k}_{a_k}$. Each of these deterministic boxes is an extreme point, namely, a point that cannot be decomposed as a convex combination of other points within the set of classical boxes.
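The decomposition of a single-party box into deterministic boxes can be checked explicitly. The sketch below (Python with NumPy, an illustrative choice) enumerates the $d^n$ deterministic boxes and uses the product weights $p_{\bar{a}} = \prod_x P(a_x | x)$, which is one valid choice among many, since the decomposition is not unique. Outputs are labelled from 0 rather than from 1, and the helper names are hypothetical.

```python
import itertools
import numpy as np

def deterministic_boxes(d, n):
    """All d**n deterministic single-party boxes D[a, x] = delta(a, abar[x])."""
    boxes = []
    for abar in itertools.product(range(d), repeat=n):
        D = np.zeros((d, n))
        for x, a in enumerate(abar):
            D[a, x] = 1.0
        boxes.append((abar, D))
    return boxes

def product_weights(P):
    """One valid (non-unique) set of weights: p_abar = prod_x P(abar[x] | x)."""
    d, n = P.shape
    return {abar: float(np.prod([P[abar[x], x] for x in range(n)]))
            for abar, _ in deterministic_boxes(d, n)}

# Random single-party box with d = 2 outputs and n = 3 inputs: P[a, x] = P(a | x).
rng = np.random.default_rng(0)
P = rng.random((2, 3))
P /= P.sum(axis=0, keepdims=True)

weights = product_weights(P)
P_rebuilt = sum(weights[abar] * D for abar, D in deterministic_boxes(2, 3))

print(len(weights), np.allclose(P_rebuilt, P))
```

The number of extreme points grows as $d^n$ per party, which is one reason why reducing the size of the resulting linear programs matters in practice.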
Given an $m \to q$ connector $\Omega$, deciding whether it satisfies (10) amounts to verifying that $\Omega \otimes I_T$ maps deterministic boxes to classical boxes. The general result then follows by applying convexity. Now, suppose that $\Omega$ satisfies (10) with $T = \emptyset$, consider an arbitrary $(m + r)$-partite deterministic box $P \equiv P_1 P_2 \ldots P_{m+r}$ and let $\Omega$ act over the first $m$ systems (equivalently, take $T = \{m + 1, \ldots, m + r\}$). The result will be the (in general, unnormalized) box $P' = Q P_{m+1} \ldots P_{m+r}$, with $Q = \Omega(P_1 \ldots P_m)$. By hypothesis, $Q$ is classical, and therefore so is $P'$. We thus conclude that $\Omega \otimes I_T$ will map classical boxes to classical boxes for all $T$.
FIG. 6: Example of a wiring. A 2-party box $P$ is mapped into a single-party box $P'$ by using as input for the second system the output generated by the first.
Note that this property does not hold in quantum mechanics. Indeed, there exist linear maps $\Omega$, like the transposition map $\Omega(\rho) = \rho^T$, which satisfy $\Omega(\rho) \geq 0$ for all $\rho \geq 0$ (i.e., they are positive), despite the fact that $\Omega \otimes I$ is not positive [34,35]. In other words, in the case of local correlations, it is impossible to find maps that are analogous to the positive but not completely positive maps for quantum states.
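The contrast with quantum theory can be verified in a few lines. The following sketch (Python with NumPy, an illustrative choice; the function name `partial_transpose` is hypothetical) applies the transposition to a maximally entangled two-qubit state, first globally and then to the first qubit only.

```python
import numpy as np

def partial_transpose(rho, dA=2, dB=2):
    """(Omega x I)(rho): transpose the first subsystem only."""
    r = rho.reshape(dA, dB, dA, dB)              # r[i, j, k, l] = <i j| rho |k l>
    return r.transpose(2, 1, 0, 3).reshape(dA * dB, dA * dB)

phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)   # (|00> + |11>)/sqrt(2)
rho = np.outer(phi, phi)

# Transposing the whole state keeps it positive semidefinite ...
min_eig_full = np.linalg.eigvalsh(rho.T).min()
# ... whereas transposing one half produces the eigenvalue -1/2.
min_eig_partial = np.linalg.eigvalsh(partial_transpose(rho)).min()

print(min_eig_full, min_eig_partial)
```

The negative eigenvalue shows that a map that is valid on a system need not remain valid on part of a larger quantum system, in contrast with the LOC-world property derived above.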
We have thus reduced the problem of characterizing connectors in LOC-world to the problem of identifying those transformations which map extreme classical boxes to classical boxes. Since classical boxes form a convex set with a finite number of extreme points, it follows that characterizing or conducting linear optimizations over connectors in LOC-world can be cast as a linear program (LP). See Appendix A for a detailed description of the LPs, together with some tips to reduce their time and memory complexity.
What is the form of connectors in LOC-world? Some of them correspond to wirings [36], namely, transformations that correspond to feeding the outputs of some parties to the inputs of some other parties. To fix ideas, consider connectors from (2, 2, 2, 2) systems to (2, 2) systems, with input indices $(a, x)$, $(a', x')$ and output indices $(b, y)$. If we denote by $P$, $P'$ the input and output boxes, then a possible wiring would be given by
$$P'^{\,y}_{b} = \sum_{a, a', x, x'} W^{b, y}_{a, a', x, x'}\, P^{x, x'}_{a, a'},$$
with $W^{b, y}_{a, a', x, x'} = \delta_{y, x}\, \delta_{b, a'}\, \delta_{a, x'}$. This is just the result of inputting $y$ on the first part of $P$, reading the result $a$ and using it as an input in the second part of $P$. The final outcome of such an effective box is the output $b$ produced by the second part of $P$, see Fig. 6. Although wirings map non-signalling boxes to non-signalling boxes (and therefore they are examples of connectors), the contraction of any network of wirings with a non-signalling box always results in a non-negative number. This means that wirings, by themselves, cannot be used to detect Bell nonlocality. Fortunately, there exist more general connectors in LOC-world, as shown next.
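The wiring just described can be written as an explicit six-index tensor and contracted with a box. In the sketch below (Python with NumPy, an illustrative choice), the index ordering `W[b, y, a, a', x, x']` and `P[a, a', x, x']` and the use of the Popescu-Rohrlich (PR) box as a test input are assumptions of this example. The PR box is non-signalling and maximally nonlocal, yet the wiring maps it to an ordinary single-party box with non-negative entries.

```python
import numpy as np

def wiring_tensor():
    """W[b, y, a, a', x, x'] = delta(y, x) delta(b, a') delta(a, x')."""
    W = np.zeros((2, 2, 2, 2, 2, 2))
    for y in range(2):
        for a in range(2):
            for ap in range(2):
                # first input x = y, second input x' = a, final output b = a'
                W[ap, y, a, ap, y, a] = 1.0
    return W

def apply_wiring(P):
    """Map a bipartite box P[a, a', x, x'] to a single-party box P'[b, y]."""
    return np.einsum("byacxz,acxz->by", wiring_tensor(), P)

def pr_box():
    """PR box: P(a, a' | x, x') = 1/2 if a XOR a' = x AND x', else 0."""
    P = np.zeros((2, 2, 2, 2))
    for a in range(2):
        for ap in range(2):
            for x in range(2):
                for xp in range(2):
                    if (a ^ ap) == (x & xp):
                        P[a, ap, x, xp] = 0.5
    return P

P_wired = apply_wiring(pr_box())
print(P_wired)   # rows: output b, columns: input y
```

For the PR box the wired output is uniformly random on input $y = 0$ and deterministic on input $y = 1$, so no negativity appears, in agreement with the statement that wirings alone cannot reveal nonlocality.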

B. Connectors built from Bell inequalities
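Consider the following 2-to-1 connector $\mathcal{C}$, defined by the relations
$$P'^{\,0}_{0} = C(P), \quad P'^{\,0}_{1} = E(P) - C(P), \quad P'^{\,1}_{0} = C'(P), \quad P'^{\,1}_{1} = E(P) - C'(P), \qquad (18)$$
where $C$, $C'$ correspond to normalized forms of the Clauser-Horne-Shimony-Holt (CHSH) Bell inequality, i.e., linear functionals obtained by shifting and rescaling the CHSH expression. It can be verified that $0 \leq C(P), C'(P) \leq 1$ for any bipartite classical probability distribution $P$ with two inputs and two outputs. It follows that, for any classical box $P$, the new box $P' = \mathcal{C}(P)$ will be such that $P'^{\,y}_{b} \geq 0$ for $b, y = 0, 1$, and $\sum_b P'^{\,y}_{b} = E(P)$. This connector corresponds to a deterministic transformation in LOC-world.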
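The following sketch (Python with NumPy, an illustrative choice) implements a 2-to-1 connector of this type and evaluates it on all 16 deterministic bipartite boxes and on a box attaining the Tsirelson bound. The specific normalization used, $C = E/2 + S/4$ and $C' = E/2 - S/4$ with $S(P) = \sum (-1)^{a + a' + x x'} P^{x, x'}_{a, a'}$, is an assumption of this example: it is one valid pair of normalized CHSH forms, not necessarily the sign convention adopted elsewhere in this paper. All function names are hypothetical.

```python
import itertools
import numpy as np

BITS4 = list(itertools.product(range(2), repeat=4))

def norm(P):
    """E(P) for a box P[a, a', x, x']: total weight at a fixed input pair."""
    return P[:, :, 0, 0].sum()

def chsh(P):
    """S(P) = sum (-1)^(a + a' + x x') P[a, a', x, x'] (assumed sign convention)."""
    return sum((-1) ** (a + b + x * y) * P[a, b, x, y] for a, b, x, y in BITS4)

def C(P):
    return norm(P) / 2 + chsh(P) / 4

def C_prime(P):
    return norm(P) / 2 - chsh(P) / 4

def chsh_connector(P):
    """2 -> 1 connector; returns P'[b, y] with rows b and columns y."""
    E = norm(P)
    return np.array([[C(P), C_prime(P)],
                     [E - C(P), E - C_prime(P)]])

def deterministic_box(a0, a1, b0, b1):
    """Outputs a_x and b_y are fixed functions of the local inputs."""
    P = np.zeros((2, 2, 2, 2))
    for x, y in itertools.product(range(2), repeat=2):
        P[(a0, a1)[x], (b0, b1)[y], x, y] = 1.0
    return P

def tsirelson_box():
    """Quantum-attainable box with S = 2*sqrt(2) and uniform marginals."""
    P = np.zeros((2, 2, 2, 2))
    for a, b, x, y in BITS4:
        P[a, b, x, y] = (1 + (-1) ** (a + b + x * y) / np.sqrt(2)) / 4
    return P

local_outputs = [chsh_connector(deterministic_box(*s)) for s in BITS4]
quantum_output = chsh_connector(tsirelson_box())
print(quantum_output)
```

On every deterministic box the output is itself a deterministic single-party box, whereas on the Tsirelson box one entry equals $1/2 - 1/\sqrt{2} < 0$, so the output is not a valid probability distribution.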
How would we implement transformation $\mathcal{C}$ in practice? $\mathcal{C}$ is neither a wiring nor a convex combination thereof: this follows from the fact that, applied to any box violating the CHSH inequality, it returns a 'box' with negative probabilities. Now, suppose that the input box $P$ is indeed classical. This implies that, hidden within $P$, there exist variables $(a_0, a_1, a'_0, a'_1) \in \{0, 1\}^4$ which determine the outcomes of the box: if we input $x, y \in \{0, 1\}$, we obtain the outputs $a_x, a'_y$. The values of $(a_0, a_1, a'_0, a'_1)$ can change every time we initialize the box; we assume that they are distributed according to a measure $\mu(a_0, a_1, a'_0, a'_1)$. Now, consider the functions $f, g : \{0, 1\}^4 \to \{0, 1\}$ given by the values that $C$ and $C'$, respectively, take on the deterministic box specified by $(a_0, a_1, a'_0, a'_1)$. By linearity, they satisfy
$$\sum_{a_0, a_1, a'_0, a'_1} \mu(a_0, a_1, a'_0, a'_1)\, f(a_0, a_1, a'_0, a'_1) = C(P), \qquad \sum_{a_0, a_1, a'_0, a'_1} \mu(a_0, a_1, a'_0, a'_1)\, g(a_0, a_1, a'_0, a'_1) = C'(P). \qquad (22)$$
To implement $\mathcal{C}$ in the lab over a local distribution $P$, it suffices to set up a device inside the box that can read the values $a_0, a_1, a'_0, a'_1$. On input $y = 0$ ($y = 1$), the device would return $b = 0$ if $f(a_0, a_1, a'_0, a'_1) = 1$ ($g(a_0, a_1, a'_0, a'_1) = 1$), and $b = 1$ otherwise. Obviously, such an operation is only possible if $P$ is a classical box to begin with. Inside a quantum box, for instance, $a_0, a_1$ could correspond to the outcomes of non-commuting measurements. Therefore, we could not have simultaneous access to them and the above scheme would be unrealizable.
FIG. 7: Example of a non-trivial 2 → 2 connector. The 2 → 2 connector identified in Appendix E cannot be decomposed as a combination of non-deterministic transformations of the form: a) a 2 → 1 connector followed by a preparation; b) a (global) transformation of the bipartite box into a probability distribution, which we use as a local hidden variable model to build a new box; c) local mappings on both boxes; and d) local mappings followed by swapping the two parties.
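The fact that we used a Bell inequality to devise a 2 → 1 connector is not coincidental. Actually, all $m \to 1$ connectors can be related to Bell inequalities. Let $\mathcal{B}$ be the set of $m$-partite classical boxes, and let $\mathcal{B}^*$ be its dual, i.e., the set of linear functionals $U$ which map any box inside $\mathcal{B}$ to a non-negative number. Then all non-deterministic $m \to 1$ connectors $W$ in LOC-world are of the form
$$W(P)^{y}_{b} = U^{y}_{b}(P),$$
with $U^{y}_{b} \in \mathcal{B}^*$ for all $b, y$, where $\sum_b U^{y}_{b}$ does not depend on $y$ and $E - \sum_b U^{y}_{b} \in \mathcal{B}^*$. Now, it turns out that $\mathcal{B}^*$ is in one-to-one correspondence with the set of Bell inequalities. Indeed, let $B(a_1, \ldots, a_m, x_1, \ldots, x_m)$, $K \in \mathbb{R}$ be such that
$$\sum_{\bar{a}, \bar{x}} B^{\bar{x}}_{\bar{a}}\, P^{\bar{x}}_{\bar{a}} \geq K$$
for all normalized classical boxes $P^{x_1, \ldots, x_m}_{a_1, \ldots, a_m}$. Then the linear functional $U$ given by $U(P) \equiv \sum_{\bar{a}, \bar{x}} B^{\bar{x}}_{\bar{a}}\, P^{\bar{x}}_{\bar{a}} - K E(P)$ satisfies $U(P) \geq 0$ for all classical boxes.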
It is worth remarking that the notion of composing $m \to 1$ connectors to form new Bell inequalities is implicit in the work of Wu et al. [37]. There, the authors propose a scheme to generate a new $(m + 1)$-partite Bell inequality in the two-input/two-output Bell scenario, given two $m$-partite Bell inequalities. In our language, their scheme can be interpreted as a contraction between an $m \to 1$ connector and the Clauser-Horne-Shimony-Holt (CHSH) inequality [38].
To our knowledge, though, m → m connectors have never been considered in Bell nonlocality, so we cannot relate them to past literature on the subject. A preliminary exploration of this class of transformations revealed rather intriguing objects. In this regard, in Appendix E we present a 2 → 2 connector that does not admit a decomposition in terms of non-deterministic 1 → 1 and 2 → 1 connectors, see Fig. 7.

C. Applications
Now that we have non-trivial connectors, the next step is to contract them to generate new Bell inequalities. One possibility is to take the 2 → 1 connector of eq. (18) and contract multiple copies thereof according to a tree network, see Fig. 8.
How useful are these new inequalities? As it turns out, computing the minimum value of an arbitrary Bell inequality under non-signalling distributions can be cast as a linear program. This allowed us to calculate, numerically, the corresponding maximal violations of each 'CHSH tree', which seem to increase with the number of parties. Note that this value is a meaningful quantity, since all CHSH trees are normalized by construction.
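As a small-scale illustration of such a contraction, the sketch below (Python with NumPy, an illustrative choice) evaluates a depth-two tree on 4-party boxes: two copies of a CHSH-type 2 → 1 connector coarse-grain the pairs (1, 2) and (3, 4), and a normalized CHSH functional is applied to the resulting 2-party box. The normalization $C = E/2 + S/4$, $C' = E/2 - S/4$ with $S = \sum (-1)^{a + a' + x x'} P^{x, x'}_{a, a'}$ is an assumption of this example and need not coincide with the sign convention used for the results reported here. No linear program is solved; the code only evaluates the functional on given boxes.

```python
import itertools
import numpy as np

def functional_tensors():
    """Coefficient tensors, indexed [a, a', x, x'], of E, C and C'."""
    E = np.zeros((2, 2, 2, 2))
    E[:, :, 0, 0] = 1.0
    S = np.zeros((2, 2, 2, 2))
    for a, b, x, y in itertools.product(range(2), repeat=4):
        S[a, b, x, y] = (-1) ** (a + b + x * y)
    return E, E / 2 + S / 4, E / 2 - S / 4

def connector_tensor():
    """conn[b, y, a, a', x, x'] for the assumed 2 -> 1 CHSH connector."""
    E, C, Cp = functional_tensors()
    conn = np.zeros((2, 2, 2, 2, 2, 2))
    conn[0, 0], conn[1, 0] = C, E - C      # output input y = 0
    conn[0, 1], conn[1, 1] = Cp, E - Cp    # output input y = 1
    return conn

def chsh_tree(P4):
    """Depth-two tree evaluated on a 4-party box P4[a1..a4, x1..x4]."""
    conn = connector_tensor()
    _, C, _ = functional_tensors()
    # Coarse-grain parties (1, 2) and (3, 4) into a 2-party box Q[b1, b2, y1, y2].
    Q = np.einsum("piabwx,qjcdyz,abcdwxyz->pqij", conn, conn, P4)
    return float(np.einsum("pqij,pqij->", C, Q))

def det1(s):
    """Single-party deterministic box D[a, x] with output s[x] on input x."""
    D = np.zeros((2, 2))
    for x in range(2):
        D[s[x], x] = 1.0
    return D

strategies = list(itertools.product(range(2), repeat=2))
local_values = [
    chsh_tree(np.einsum("aw,bx,cy,dz->abcdwxyz", *(det1(s) for s in combo)))
    for combo in itertools.product(strategies, repeat=4)
]

# Two independent copies of a box attaining the Tsirelson bound.
T = np.zeros((2, 2, 2, 2))
for a, b, x, y in itertools.product(range(2), repeat=4):
    T[a, b, x, y] = (1 + (-1) ** (a + b + x * y) / np.sqrt(2)) / 4
quantum_value = chsh_tree(np.einsum("abwx,cdyz->abcdwxyz", T, T))
print(min(local_values), max(local_values), quantum_value)
```

On all 256 deterministic 4-party boxes the tree takes values in $\{0, 1\}$, as expected for a normalized Bell inequality, while on two copies of the Tsirelson box it evaluates to $-1/2$, which lies outside the classical range.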
However, the ultimate goal of Bell nonlocality detection is not to devise arbitrary Bell inequalities, but to detect the non-classicality of specific experimental systems. In this context, we now discuss two applications: first, the detection of nonlocality in an experimental setup where the actual preparation of the underlying quantum state is known; second, the detection of nonlocality for more general boxes.

Nonlocality detection in finitely correlated states
We find that connector theory provides a simple heuristic that relates the detection of nonlocality in a particular quantum experimental setup with the actual preparation of the underlying quantum state. Consider a scenario where several copies of the maximally entangled state $|\phi\rangle$ are acted upon by two-qubit unitaries $U^1, \ldots, U^m$, see Fig. 9 (left). The resulting state, sometimes called a finitely correlated state [12], is distributed among $2m$ parties, who probe it with Pauli measurements, thus obtaining a $2m$-partite non-signalling box $P$ with three inputs and two outputs at each site. Our goal is to certify that $P$ is Bell nonlocal.
Denote by $\tilde{P}$ the three-input/two-output box that results when we distribute $|\phi\rangle$ among two parties and allow each to measure its qubit with Pauli operators. Since Pauli measurements form a complete operator basis, we can identify any two-qubit operator $U$ with the way it transforms linear combinations of $\sigma_i \otimes \sigma_j$. It follows that there exist matrices $V^1, \ldots, V^m$ such that $P = (V^1_{2,3} \otimes V^2_{4,5} \otimes \cdots \otimes V^m_{2m,1})\, \tilde{P}^{\otimes m}$. On the other hand, $\tilde{P}$ violates one of the forms $C'$ of the normalized CHSH Bell inequality when each party measures with $\sigma_x$, $\sigma_z$; more specifically, $C'(\tilde{P}) = \frac{1}{2} - \frac{1}{\sqrt{2}} < 0$. It follows that
$$\left\{C' \otimes (E - C')^{\otimes (m-1)}\right\} \left((V^1_{2,3})^{-1} \otimes \cdots \otimes (V^m_{2m,1})^{-1}\right) P = \left(\frac{1}{2} - \frac{1}{\sqrt{2}}\right) \left(\frac{1}{2} + \frac{1}{\sqrt{2}}\right)^{m-1} < 0. \qquad (25)$$
Unfortunately, that does not prove that $P$ is Bell nonlocal, since $(V^1)^{-1}, (V^2)^{-1}, \ldots$ are not connectors (that is, each one of them does not necessarily map classical boxes to classical boxes). Suppose, though, that we identified 2 → 2 connectors $W^1, W^2, \ldots$ whose action on $P$ were analogous to that of $(V^1)^{-1}, (V^2)^{-1}, \ldots$, see Fig. 9 (right). Then, there would be a fair chance that the newly devised Bell inequality $B$ were such that $B(P) < 0$.

TABLE I: Number of instances (out of 300) in which the nonlocality of the box in Fig. 9 (left) was detected via the connector contraction in Fig. 9 (right), for two different methods to choose the connectors. The numbers in the first row are the same because both methods are equivalent for m = 2.

m    Method I    Method II
2    120         120
3    201         52
4    282         59
5    300         56
6    300         65
To find a guess for $W^1$, we consider the box $Q = (I_1 \otimes V^1_{23} \otimes I_4)\, \tilde{P} \otimes \tilde{P}$, where $\tilde{P}$ is the bipartite box obtained by measuring $|\phi\rangle$ with Pauli operators. It is easy to see that identifying the connector $W^1$ that minimizes $\{C \otimes (E - C)\}(I_1 \otimes W^1_{23} \otimes I_4)Q$ can be cast as a linear program (LP). Heuristically, $W^1$ approximately inverts the action of $U^1$. Next, we consider the problem of identifying the connector $W^2$ that minimizes the analogous contraction with $W^1$ held fixed; this is again an LP. We iterate this procedure until we obtain suitable guesses for $W^1, \dots, W^{m-1}$. The last step is to identify the connector $W^m$ that minimizes the contraction shown on the right side of Fig. 9. If the result is negative, we have detected the Bell nonlocality of $P$.
To assess how well this method works, we generated 300 $m$-tuples of random unitaries $(U^1, \dots, U^m)$ for different values of $m$ and applied the procedure to the resulting $2m$-partite box. The results are shown in Table I, Method I. Note that, for $m \ge 5$, the algorithm always detected nonlocality.
Notice that, if the intuition behind the heuristic is taken literally, the detection of Bell nonlocality should depend only on the values of $U^1$, $U^2$ and $U^m$. Indeed, intuitively, the red connector $C$ in Fig. 9 is associated with the first negative term on the right-hand side of Eq. (25); the remaining red connectors $E - C$ give the positive contribution on the right of the equation, and are just meant to enhance the magnitude of the Bell violation. This intuition leads one to anticipate that, for $m \ge 3$, the probability of a Bell violation should not depend on the system size. In fact, we observe just the opposite: the probability of detecting nonlocality with the contraction in Fig. 9 increases with $m$, see Table I.
A possible explanation is that, as the index $k$ runs from 1 to $m$, the action of $W^k$ becomes less and less the inversion of $V^k$. On the contrary, the heuristic seems to be exploiting the structure of the correlations between the remaining parties after the application of $C \otimes (E - C)^{\otimes k}$ in order to boost the overall Bell violation even further. Actually, if we modify the heuristic and, for $k \ge 2$, instead derive each $W^k$ by maximizing the corresponding contraction, then the dependence on $m$ disappears, see Table I, Method II.

Nonlocality detection in more general boxes
In some situations, we want to decide the Bell nonlocality of a multipartite box for which one just has a theoretical description. (That is, no preparation information is available, unlike in the previous discussion.) To address nonlocality detection in more general boxes, we introduce the Matrix Product Connector Tensor Network (MPCTN): a witness composed of 2 → 1 connectors that are contracted similarly to the tensors in a matrix product state, see Fig. 2(b). Fig. 10 illustrates an MPCTN for a 9-party box.
Denote by $n_I$ and $n_O$ the number of inputs and the number of outputs, respectively, that appear on the bond indices of the MPCTN. We call the pair $(n_I, n_O)$ the bond dimension of the MPCTN. The bond dimension determines the size of the output index of each connector. In principle, one can choose a different bond dimension for each connector. However, in the numerical simulations presented here, we fixed the same bond dimension for all the connectors in the MPCTN. We remark that, even though all numerical results in this paper were obtained by using an MPCTN witness, one can in principle also construct a MERA-like witness, since non-trivial 2 → 2 connectors exist (e.g. the one depicted in Fig. 7).
Having thus fixed the MPCTN as the ansatz for the nonlocality witness, the next task is to (numerically) determine the connectors $\{W_i\}$ in order to minimize the contraction shown in the figure for a given box $P$. In numerical optimizations, the bond dimension controls both the computational cost and the value of possible violations. We used two types of optimization techniques.
See-saw optimization.-Let us initialize all the connectors to random values (within the space of 2 → 1 connectors of the given bond dimension). Denote by $W(P)$ the value of the contraction illustrated in Fig. 10. We can fix all but one connector, say at location $j$, and determine the connector $W_j$ that minimizes $W(P)$. This problem reduces to a linear optimization over the set of 2 → 1 connectors, which can be cast as a linear program.
Iterating, we obtain a sequence of decreasing values for $W(P)$. We can stop the protocol as soon as $W(P)$ becomes negative. Optimization procedures of this sort are called see-saw methods [39,40], and they have proven very helpful in condensed matter physics and quantum nonlocality. In our numerical simulations, though, we find that, unless $W(P)$ is negative from the very beginning, very often one of the optimal connectors becomes 0. In such cases, a projected gradient method [41] seems to be a better choice to minimize $W(P)$.
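The alternating structure of the see-saw can be illustrated with a toy problem. The sketch below is not the connector optimization itself: it assumes, purely for illustration, that each "connector" is a probability vector, so that the linear program over one tensor with the others fixed is solved in closed form by picking the smallest entry of its environment. The names `environment` and `seesaw_minimize` are hypothetical.

```python
import numpy as np

def environment(T, vecs, site):
    """Contract the 3-index tensor T with every vector except the one at `site`."""
    u, v, w = vecs
    if site == 0:
        return np.einsum("ijk,j,k->i", T, v, w)
    if site == 1:
        return np.einsum("ijk,i,k->j", T, u, w)
    return np.einsum("ijk,i,j->k", T, u, v)

def seesaw_minimize(T, sweeps=10, seed=0):
    """Alternately minimize sum_ijk T[i,j,k] u_i v_j w_k over probability vectors.

    ASSUMPTION: the feasible set of each tensor is a simplex (a stand-in for
    the connector polytope), so each single-site LP is solved by a vertex.
    """
    rng = np.random.default_rng(seed)
    vecs = [rng.dirichlet(np.ones(d)) for d in T.shape]
    values = []
    for _ in range(sweeps):
        for site in range(3):
            env = environment(T, vecs, site)
            best = np.zeros_like(env)
            best[np.argmin(env)] = 1.0   # optimal vertex of the simplex
            vecs[site] = best
            values.append(float(env.min()))
    return vecs, values
```

Because the previous value of each vector is always feasible, the recorded values cannot increase from one update to the next, which mirrors the decreasing sequence of $W(P)$ described above. With genuine connectors, the closed-form vertex step would be replaced by a call to an LP solver.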
Gradient descent optimization.-Choose $\epsilon > 0$, $\epsilon \ll 1$, and let $E_j$ denote the tensor obtained by contracting all tensors except $W_j$. We will say that $E_j$ is the 'environment' of tensor $W_j$. Adapted to this problem, the subgradient method consists in updating the connectors via the iterative equation $W_j \to \pi_C(W_j - \epsilon E_j)$. Here, for any tensor $A$, $\pi_C(A)$ denotes the projection onto the set of valid connectors. That is, $\pi_C(A)$ is the connector that best approximates $A$ in 2-norm (when viewed as a multipartite vector). Computing projections can be formulated as an SDP, and hence it can be solved efficiently, as long as the cardinality of the indices of the connector is kept at a reasonable value.

FIG. 10. A generic witness composed of 2 → 1 connectors that are contracted similarly to the tensors in a matrix product state (see Fig. 2(b)). We found the resulting witness, the MPCTN, useful for detecting nonlocality and entanglement in several systems. The sequence of 2 → 1 connectors applied from left to right coarse-grains the box to an effective 2-site box, whose global properties can be explored by means of a 2-site witness $W_8$. When $P$ also has an efficient tensor network representation, an MPCTN provides a witness that can be scaled to hundreds of sites.

Guess for initial connectors.-While using random connectors as the initial guess for the optimization (both the see-saw and the gradient descent) sometimes worked well, we observed that in some cases making an educated guess for the initial connectors produced a violation when starting with random initial connectors failed to do so. (This also sometimes enhanced the violation in cases where there was a violation with random initial connectors.) A guess that often worked when exploring the same box for larger and larger numbers of parties was to use the optimized connectors for a smaller box as initial connectors for the larger number of parties. Sometimes the reverse worked: optimized connectors for a larger number of parties provided a good guess for a smaller number of parties. In practice, we tried several such schemes to determine a method that worked well for a given box.

FIG. 11. [...] 2). The blue points (circles) are the violations obtained using the see-saw optimization scheme with randomly chosen initial connectors, up to 75 parties. Using random initial connectors failed to produce violations for larger numbers of parties. The violations for larger sizes (between 75 and 100) were obtained by using the optimized connectors for $L$ parties as the initial guess for the optimization of $L + 5$ parties. The red points (stars) are substantially enhanced violations obtained by using the optimized connectors for 100 parties as the initial guess for smaller numbers of parties, suggesting that a simple see-saw optimization scheme may not be optimal. (The 100-party solution stopped working as the initial guess for fewer than 35 parties. We still managed to obtain enhanced violations in this domain by feeding the optimized connectors for $L$ parties as the initial guess for $L - 5$ parties.)

Scalability.-There exist relevant scenarios in which the no-signalling box $P$ can also be efficiently represented as a tensor network, e.g. a matrix product state of low bond dimension. This is possible when the box is the result of measuring a quantum state with limited correlations, for example, the thermal state of a 1D local Hamiltonian. In this case, the contraction illustrated in Fig. 10 to compute $W(P)$ can be carried out with a cost that scales only linearly with the number of parties [13]. This means that we can apply the method above to assess the Bell nonlocality of boxes shared by hundreds or even thousands of parties. However, note that increasing the bond dimension, either of the box $P$ or of the MPCTN, increases the pre-factor in the scaling of the computational cost.

Figure 11 shows the violations we obtained for $m = \{5, 10, 15, \dots, 100\}$ using two see-saw optimization schemes. Note that the magnitude of the violation seems to increase approximately exponentially with the number of parties. (In fact, we were unable to proceed for numbers of parties much larger than 100 because of instabilities in optimizing the connectors, owing to the presence of very large coefficients in the connector tensors.) This rules out the possibility that the optimization algorithm is simply determining wirings to project the last three parties into a GHZ state and then probing the latter with, e.g., the Mermin inequality [43]. The algorithm is finding a cleverer solution.
Furthermore, we found violations after only a couple of see-saw sweeps, even when starting from random connectors. Note that detecting the nonlocality of a system made of hundreds of parties, as done here, is completely out of reach with the existing techniques.
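The linear scaling mentioned under "Scalability" comes from contracting the network site by site. The following is a minimal sketch under simplifying assumptions: both the box and the witness are treated as generic matrix-product tensors with one physical index per site (standing for the pair input/output), rather than as an actual MPCTN. All function names are hypothetical.

```python
import numpy as np

def random_mps(m, d, D, rng):
    """Random m-site matrix-product tensor, physical dimension d, bond dimension D."""
    dims = [1] + [D] * (m - 1) + [1]
    return [rng.normal(size=(dims[k], d, dims[k + 1])) for k in range(m)]

def contract_pair(witness, box):
    """Evaluate sum_s W[s_1..s_m] P[s_1..s_m] by sweeping left to right.

    The cost is linear in the number of sites: each step only updates a
    small 'environment' matrix carrying the two bond indices.
    """
    env = np.ones((1, 1))
    for A, B in zip(witness, box):
        env = np.einsum("ab,asc,bsd->cd", env, A, B)
    return float(env[0, 0])

def to_full(mps):
    """Expand to the full tensor (exponential cost; for small checks only)."""
    full = mps[0]
    for t in mps[1:]:
        full = np.tensordot(full, t, axes=([full.ndim - 1], [0]))
    return full[0, ..., 0]
```

For a handful of sites the sweep can be compared against the brute-force sum `np.sum(to_full(W) * to_full(P))`; for hundreds of sites only the sweep remains feasible.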

IV. SUPRAQUANTUM NONLOCALITY DETECTION
In this section, we sketch how to use connector theory to determine whether a given non-signalling box $P(a_1, \dots, a_m|x_1, \dots, x_m)$ admits a quantum realization. The object of interest is the same as above, a probability distribution for the measurement outputs conditioned on the inputs, but now the goal is to understand whether these given correlations can be reproduced within the quantum formalism.
More precisely, we ask whether $P(a_1, \dots, a_m|x_1, \dots, x_m) = \mathrm{tr}\{\rho\, (M^1_{a_1|x_1} \otimes \dots \otimes M^m_{a_m|x_m})\}$, where $\rho$ is a non-normalized quantum state and $M^k_{a|x}$ is to be understood as the Positive Operator Valued Measure (POVM) element corresponding to party $k$ inputting $x$ in its box and obtaining the result $a$. That is, $M^k_{a|x} \ge 0$ and $\sum_a M^k_{a|x} = I$ (30).

The natural GPT to consider here would be QUANT-world, whose states are quantum boxes of arbitrarily many inputs and outputs. In this case, the dual set $Q^*$ of the set of quantum boxes corresponds to the set of coefficients $W^{a_1, \dots, a_n}_{x_1, \dots, x_n}$ such that, given any set of operators $\{M^k_{a|x}\}$ satisfying (30), the operator $\sum_{a, x} W^{a_1, \dots, a_n}_{x_1, \dots, x_n}\, M^1_{a_1|x_1} \otimes \dots \otimes M^n_{a_n|x_n}$ is positive semidefinite. As explained in Appendix B, non-trivial SDP ansätze on $Q^*$ can be derived from the Navascués-Pironio-Acín (NPA) hierarchy [5,44,45] and variants [46]. We claim that, replacing $B$ in Eq. [...]

To test the method, we next introduce a number of non-signalling supra-quantum boxes which admit an MPS decomposition of bond dimension linear in the system size, or even bounded. This will allow us to test their non-quantumness for large system sizes.

A. Generalized Svetlichny box
Consider a scenario where $m$ parties have two measurements, each with two outcomes, i.e., $x_1, \dots, x_m, a_1, \dots, a_m \in \{0, 1\}$, and let $f(x_1, \dots, x_m)$ be any Boolean function. It can be verified that the box with statistics

$P(a_1, \dots, a_m|x_1, \dots, x_m) = \frac{1}{2^{m-1}}\, \delta_{\oplus_k a_k,\, f(x_1, \dots, x_m)}$ (33)

is normalized and no-signalling. The proof is simple: if we trace out any party, the probability of any sequence of outputs equals $1/2^{m-1}$, independently of the sequence of inputs. Almost all such boxes can be used, via wirings, to produce a perfect PR box, and hence they cannot be realized within quantum theory. Moreover, there exist important supra-quantum boxes within this family, such as the Svetlichny box [47]. Next we generalize the Svetlichny box to arbitrarily many parties and then show that it admits an MPS representation with bond dimension 16.
The original Svetlichny box is tripartite, with statistics given by $P(a_1, a_2, a_3|x_1, x_2, x_3) = \frac{1}{4}\, \delta_{a_1 \oplus a_2 \oplus a_3,\; x_1 x_2 \oplus x_2 x_3 \oplus x_3 x_1}$. It is thus of the form (33). We will generalize it to a box of the form (33) with $f(x_1, \dots, x_m) = x_1 x_2 \oplus x_2 x_3 \oplus \dots \oplus x_{m-1} x_m \oplus x_m x_1$. It can be verified that all such 'generalized Svetlichny boxes' can be simulated by distributing a PR box to each party and its nearest neighbour and letting each party wire its two boxes together.
In turn, Svetlichny boxes can be seen to admit an MPS decomposition $P(a_1, \dots, a_m|x_1, \dots, x_m) = \Lambda^{[1]}_{a_1, x_1} \Lambda^{[2]}_{a_2, x_2} \cdots \Lambda^{[m]}_{a_m, x_m}$ (35), involving matrices $\Lambda_{a,x}$ of size at most 16 × 16; see Appendix D for their exact expression.
Using the above MPS representation of the Svetlichny boxes, together with an MPCTN witness composed of 2 → 1 connectors from QUANT-world, we were able to detect the supra-quantum nonlocality of these boxes for large numbers of parties, see Fig. 13. We find that the violation increases exponentially with the number of parties. Again, reaching similar results for systems of a hundred particles is impossible with existing methods.
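The simulation of generalized Svetlichny boxes by wired PR boxes can be verified by enumeration. The sketch below assumes that the generalization closes on a ring, i.e. $f(x) = x_1 x_2 \oplus \dots \oplus x_{m-1} x_m \oplus x_m x_1$, which for $m = 3$ is the Svetlichny function; this ring form is an assumption inferred from the wiring description and from Appendix D. The function names are hypothetical.

```python
import itertools
from collections import Counter
from fractions import Fraction

def ring_function(x):
    """ASSUMED generalized Svetlichny function: XOR of x_k * x_{k+1} around a ring."""
    m = len(x)
    return sum(x[k] * x[(k + 1) % m] for k in range(m)) % 2

def wired_pr_distribution(x):
    """Outputs when neighbours k, k+1 share a PR box and each party XORs its two boxes.

    r[k] is the uniformly random output of party k on the box shared with k+1;
    the PR condition fixes party k+1's output on that box to r[k] XOR x_k*x_{k+1}.
    """
    m = len(x)
    counts = Counter()
    for r in itertools.product((0, 1), repeat=m):
        a = tuple(r[k] ^ r[k - 1] ^ (x[k - 1] * x[k]) for k in range(m))
        counts[a] += 1
    return {a: Fraction(c, 2 ** m) for a, c in counts.items()}

def target_distribution(x):
    """Eq. (33): uniform over output strings whose parity equals f(x)."""
    m = len(x)
    return {
        a: Fraction(1, 2 ** (m - 1))
        for a in itertools.product((0, 1), repeat=m)
        if sum(a) % 2 == ring_function(x)
    }
```

Comparing `wired_pr_distribution(x)` with `target_distribution(x)` for every input string `x` confirms the equivalence for a given small `m`.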
This box also admits an MPS representation (35) of bond dimension $2(r + 1)$, see Appendix D. A fully symmetric 'majority voting' box is given by the function $\mathrm{maj}(x_1, \dots, x_m) = 1$ if half or more of the inputs are 1s, and $\mathrm{maj}(x_1, \dots, x_m) = 0$ otherwise.
This time, the bond dimension of the box scales linearly with the system size $m$. Figure 14 shows the violations we found for the box defined in Eq. (36) with $r = 2$, for up to 20 parties. We also tried the same box with $r = 3$, but only managed to find a small violation, ≈ −0.1, for 4 parties. Even for this case, we used a bond dimension of (4, 4). The optimization of the connectors with larger bond dimension is very slow.

V. ENTANGLEMENT DETECTION
As a final application of connector theory, we come back to the problem of entanglement detection, which we used in the introduction to illustrate the main intuition of our construction. To apply connector theory to detect entanglement, we define a GPT whose states coincide with the fully separable quantum states; call it SEP-world. In this theory, the norm of a state $\rho$ is defined as $E(\rho) \equiv \mathrm{tr}(\rho)$. As in LOC-world, due to the structure (1) of the set of separable states, SEP-world has the property that any linear map fulfilling condition (10) with $T = \emptyset$ constitutes a valid transformation. The set of connectors in SEP-world thus corresponds to the set of linear maps which transform separable states into separable states. In general, these operations cannot be implemented in quantum theory.

FIG. 14. [...] 4). With the optimization techniques described in this paper we managed to find violations only up to 20 parties. It is possible that violations will also be found for more parties by using more sophisticated see-saw optimization techniques, better guesses for the initial connectors, and/or larger bond dimensions. With our current implementation, we could only manage to run simulations with a maximum bond dimension of (4, 4) in a reasonable time.
When studying LOC-world, we noted that the structure of $m \to 1$ connectors is closely linked to that of Bell inequalities. Similarly, in SEP-world there is a one-to-one correspondence between scaled connectors and $(m + 1)$-partite entanglement witnesses [48]. We remind the reader that a $k$-partite entanglement witness $W$ is an operator acting on $\bigotimes_{i=1}^{k} H_i$ such that $\mathrm{tr}(W\rho) \ge 0$ for all fully separable states $\rho$. We claim that a linear map $\Omega : B(\bigotimes_{i=1}^{m} H_i) \to B(H_{m+1})$ corresponds to an $m \to 1$ connector iff:
1. $W_\Omega$, defined via the relation $\mathrm{tr}\{W_\Omega (\sigma \otimes \beta)\} = \mathrm{tr}\{\Omega(\sigma)\beta\}$, is an $(m + 1)$-partite entanglement witness,
2. $I_{1, \dots, m} - \mathrm{tr}_{m+1}(W_\Omega)$ is an $m$-partite entanglement witness.
The above observation allows us to link the connector theory of SEP-world with the existing literature in entanglement detection. In principle, we can promote any k-partite witness to a k − 1 → 1 connector and contract several copies thereof, as we did with the CHSH Bell inequality in Fig. 8. The result would be a novel entanglement witness for m-partite entangled states.
Take, for instance, the family of $m$-qubit entanglement witnesses derived by Toth et al. in [49]. These are not linear witnesses, but they can be turned into linear ones by just replacing the $\langle J_i \rangle$, $i = x, y, z$, by arbitrary real numbers $\lambda_i$. Taking $\lambda_i = 0$, one can contract the connectors associated with the 4- and 2-qubit entanglement witnesses as shown in Fig. 15 to produce the witness $W_6$. Numerically, we find that there exist 6-qubit states which, while satisfying all forms of (40), can be detected by $W_6$. This shows that new detection properties can arise from composition alone. We come back to this in Section V C.

Constructing witnesses which detect the entanglement of a given quantum state is a more complicated task, due to the difficulty of certifying that $W_\Omega$ and $I_{1, \dots, m} - \mathrm{tr}_{m+1}(W_\Omega)$ are indeed entanglement witnesses. A possible approach to this problem is to prove instead that the average values of those two operators are non-negative when evaluated over a relaxation (a superset) of the set of separable states. The family of relaxations which we considered in our numerical examples is called the Doherty-Parrilo-Spedalieri (DPS) hierarchy [9][10][11]. Combining this idea with the observation in [50] that a small perturbation of the DPS sets projects them to the interior of the set of separable states, in Appendix C we present a family of SDP ansätze on the set of $m \to m$ connectors. Throughout the rest of this section, we use those ansätze whenever a linear optimization over feasible connectors is required.

A. PPT states
To test how useful connectors are for entanglement detection, we first considered a famous class of multipartite entangled states which are positive under partial transposition (PPT) [51]. An unextendible product basis (UPB) is a collection of $m$-partite orthogonal product states $\{|\psi_i\rangle\}_{i=1}^{K}$ with the property that no other product vector is orthogonal to their span. Given any UPB, the state

$\rho = \frac{1}{D - K}\left(I - \sum_{i=1}^{K} |\psi_i\rangle\langle\psi_i|\right)$, (41)

with $D$ the total Hilbert space dimension, can be shown to be entangled and PPT [52]. In [53], a family of six-qubit UPBs, parametrized by three qubit unitaries, is presented. We sampled 10 unitary triples randomly according to the Haar measure, built the corresponding six-qubit quantum states (41), and used an MPCTN to detect their entanglement. The output system of each connector was a qubit. In all cases, a see-saw algorithm found a normalized entanglement witness whose expectation value on the state was −0.5.
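As a small illustration of the UPB construction, the sketch below builds the state proportional to the projector onto the orthogonal complement of a UPB and checks numerically that it is PPT. It uses a standard three-qubit UPB (often called 'Shifts') instead of the six-qubit family of [53], which is not reproduced here; the code checks orthogonality, normalization and the PPT property only, and does not verify unextendibility or entanglement. Function names are hypothetical.

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

def kron_all(*vectors):
    out = np.array([1.0])
    for v in vectors:
        out = np.kron(out, v)
    return out

# Three-qubit product vectors |0,1,+>, |1,+,0>, |+,0,1>, |-,-,->.
upb = [
    kron_all(ket0, ket1, plus),
    kron_all(ket1, plus, ket0),
    kron_all(plus, ket0, ket1),
    kron_all(minus, minus, minus),
]

def upb_state(vectors):
    """Normalized projector onto the orthogonal complement of span(vectors)."""
    dim = vectors[0].size
    proj = sum(np.outer(v, v.conj()) for v in vectors)
    return (np.eye(dim) - proj) / (dim - len(vectors))

def partial_transpose(rho, party, dims):
    """Transpose the subsystem `party` of a density matrix with local dims `dims`."""
    n = len(dims)
    t = rho.reshape(tuple(dims) + tuple(dims))
    t = np.swapaxes(t, party, n + party)
    return t.reshape(rho.shape)

rho = upb_state(upb)
min_eigs = [np.linalg.eigvalsh(partial_transpose(rho, p, (2, 2, 2))).min() for p in range(3)]
```

All entries of `min_eigs` are zero up to rounding, so the state has a positive partial transpose with respect to every single party, as expected for this construction.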

B. Finitely correlated mixed states
We generated mixed states following a preparation similar to the one described in Sec. III C 1. We considered distributing $m$ singlets $|\Psi^-\rangle \equiv \frac{1}{\sqrt{2}}(|0\rangle|1\rangle - |1\rangle|0\rangle)$ amongst $2m$ parties as illustrated in Fig. 9 (where state $|\phi\rangle$ is replaced with $|\Psi^-\rangle$). We also replaced the action of the unitaries $U^1, U^2, \dots, U^m$, shown in the figure, by conjugation with a convex combination of two unitaries (drawn randomly for each pair of sites). The resulting state is mixed, has an efficient MPS representation, and may be separable. We wanted to certify whether such states are entangled or not using connectors. We used a witness similar to the one illustrated in Fig. 9 (right-hand side). The intuition is the same: we wanted to find 2 → 2 connectors in SEP-world that approximately inverted the randomizing quantum channel, thus exposing the initial singlets. The singlets can be certified to be entangled by a 2-party witness which is simply the SWAP gate, which evaluates to −1 for the singlet $|\Psi^-\rangle$. (That is, we replaced $C$ by SWAP in Fig. 9; $E -$ SWAP connectors are once again used to amplify the violation.) We generated such finitely correlated mixed states from randomly chosen unitaries (using the Haar measure) for a system of 50 qubits, and found a violation (certifying the presence of entanglement) almost every time. Fig. 17 shows the violations obtained for five such randomly drawn states.
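The role of SWAP as a two-qubit entanglement witness can be checked directly: its expectation value on a product state $|u\rangle|v\rangle$ equals $|\langle u|v\rangle|^2 \ge 0$, whereas on the singlet it equals −1. The following short sketch verifies both facts numerically; the helper names are hypothetical.

```python
import numpy as np

# SWAP in the computational basis |00>, |01>, |10>, |11>.
SWAP = np.array([
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
], dtype=float)

singlet = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)

def expectation(op, psi):
    """<psi|op|psi> for a normalized pure state psi."""
    return float(np.real(np.vdot(psi, op @ psi)))

def random_qubit(rng):
    v = rng.normal(size=2) + 1j * rng.normal(size=2)
    return v / np.linalg.norm(v)

def swap_on_product(u, v):
    """Expectation value of SWAP on the product state |u>|v>."""
    return expectation(SWAP, np.kron(u, v))
```

Since every separable state is a mixture of product states, the non-negativity on products extends to all separable states by linearity.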
We also used another witness, one composed of only 2 → 1 connectors as illustrated in Fig. 16. Again, we easily found violations for randomly chosen states, see Fig. 18. We also found that the violations were larger than those obtained by using 2 → 2 connectors.

C. Entanglement detection through hybrid GPTs
In the previous section, we linked the entanglement problem to connector theory by defining a GPT, SEP-world, where the set of physical states coincides with the set of fully separable states. The purpose of this section is to demonstrate that it is even possible to follow a hybrid approach in which different GPTs are connected.
Consider, for example, a theory where there are two types of systems, namely quantum systems and boxes [...]. We dub this theory STEER-world.
Now, suppose that we wished to assess the entanglement of a three-qubit state. One possibility would be to regard it as a possible state of STEER-world and then apply the connectors depicted in Fig. 19. There the tripartite quantum state is transformed into a bipartite box, which we then evaluate with the normalized CHSH inequality $C$ (20). This scenario is reminiscent of device-independent certification of entanglement, and it would actually be equivalent to it if the transformations $U, V$ acted on single systems. Indeed, in that case connectors from quantum systems to boxes correspond to conducting quantum measurements on the former, and entanglement is detected iff the corresponding box violates a Bell inequality.
As we will see, the 2 → 1 connector $U$ mapping bipartite quantum systems to a single box changes things completely. Let $U, V$ be defined via: [...]

FIG. 19. Purple lines indicate quantum systems; black lines, boxes. Starting from a quantum state, we effect two transformations $U, V$ to map it to a bipartite box, which we then probe with the normalized CHSH inequality.
Contracting $U, V, C$, we obtain a three-qubit entanglement witness $X$, which we can express as an operator acting on $C^2 \otimes C^2 \otimes C^2$. Now, consider an optimization over PPT three-qubit states $\rho_{ABC}$, i.e., consider the problem $\min_{\rho_{ABC}} \mathrm{tr}(X \rho_{ABC})$. This problem can be cast as an SDP; hence we can solve it. The solution is −0.0721. This is surprising because neither the CHSH inequality nor the SWAP operator can, by themselves, detect PPT entanglement. Their composition, however, does. So, even if we were not aware of the existence of non-decomposable entanglement witnesses (those which can detect PPT states), we could have derived them from compositional arguments alone.

VI. CONCLUSION
We have presented a general method to analyze complex networks, be they classical, quantum or supraquantum. In essence, our method consists in acting on the many-body system in question with a number of linear transformations-the connectors-which iteratively coarse-grain the system to one that is small enough to analyze with the existing mathematical tools. While we could relate m → 1 connectors to past literature in Bell nonlocality and entanglement theory, m → m connectors seem to be a completely different beast. We showed that connector theory is powerful enough to detect Bell nonlocality (quantum and supraquantum) and entanglement in networks composed of hundreds of sites. Even though we focused on these three areas, we suspect that connector theory will soon find application in other scenarios, for example, to build new dimension witnesses.
Connectors are a natural tool to analyze large, complex many-body systems, and we feel that future research should focus on understanding their mathematical properties. In this regard, our work leaves open important theoretical questions.
One of them is to understand the limitations of the new formalism. Could there be, e.g., entangled tripartite quantum states that are undetectable by the composition of a 2 → 1 connector and a bipartite witness? If not, one wonders how difficult it is in general to find the 'right' connectors to detect a particular state or box. The performance of our current numerical methods oscillates between disappointing (it sometimes takes ages to identify the appropriate connectors, even for m = 3) and excellent (100 sites in less than 2 minutes!). Actually, in some scenarios, like QUANT-world, we altogether avoided discussing how to optimize over general $m \to m$ connectors! Another question pertains to the practical use in experiments of connector-generated witnesses. Estimating the average value of a witness on a many-body state/box generally requires a number of experiments that scales exponentially with the system size. Is there any way to exploit the tensor network structure of a witness in order to estimate its value with a polynomial number of experiments?
Finally, an intriguing question is whether more complicated connectors could be devised by working on a GPT where states are identified with the connectors themselves (CONNECTOR-world), or even with connectors of connectors.
where the last condition enforces that the norm of the box does not increase after we apply the connector. To see that any linear functional $W$ satisfying the feasibility conditions transforms local boxes into subnormalized local boxes, note that any initial classical box $P$ admits a decomposition $P = \sum_i p_i P_i$, with $p_i \ge 0$, $\sum_i p_i = 1$. The result of applying $W$ over such a box is thus the box $\tilde{P} = \sum_{i,j} p_i\, p^i_j\, \tilde{P}_j$. Identifying $\mu_j \equiv \sum_i p_i\, p^i_j$ with our local hidden variable model, we find that $\tilde{P}$ is also Bell-local.
Program (A2), although correct, can be greatly improved. In the following, we show how to do so by exploiting the lessons learned from the monogamy of nonlocal correlations. We will do so in three stages. First, we will introduce a convenient notation to deal with no-signalling boxes, which will also be useful to minimize the complexity of LPs like (A2). A characterization of the dual of the set of no-signalling boxes will follow. Finally, building on the above two results, we present our proposal for linear optimizations over 2 → 1 connectors.

The dual of no-signalling boxes
Consider a $k$-partite nonlocality scenario [...]. In abbreviated form, the corresponding set of (non-normalized) no-signalling distributions is $B = \{\bar{q} : S\bar{q} \ge 0\}$, where $\bar{q}$ is the vector of abbreviated probabilities and $S$ is the matrix that transforms a box from its abbreviated representation $P(A_1, \dots, A_k)$ to its standard representation $P(a_1, \dots, a_k|x_1, \dots, x_k)$, see Section A 1. The condition $S\bar{q} \ge 0$ enforces that all the probabilities of the box are non-negative.
The set of positive linear functionals in abbreviated representation is given by the set $B' = \{S^T \bar{c} : \bar{c} \ge 0\}$. Indeed, by definition, any $\bar{v} \in B'$ satisfies $\bar{v} \cdot \bar{q} = \bar{c} \cdot S\bar{q} \ge 0$ for all $\bar{q} \in B$, and so the dual set of $B$ contains $B'$. It remains to show that any vector outside $B'$ cannot belong to the dual of $B$. First note that, for any $\bar{q} \notin B$, there exists $\bar{v} \in B'$ such that $\bar{v} \cdot \bar{q} < 0$ (take, e.g., $\bar{v} = S^T \bar{c}$, with $c_j = \Theta(-(S\bar{q})_j)$). Now, let $\bar{w} \notin B'$. By the separation theorem, there exists $\bar{q}$ such that $\bar{v} \cdot \bar{q} \ge 0$ for all $\bar{v} \in B'$, and $\bar{w} \cdot \bar{q} < 0$. The first condition implies that $\bar{q} \in B$, and so the second condition implies that $\bar{w}$ is not in the dual of $B$.
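The two directions of this argument can be tried out on a toy example. The sketch below assumes, for illustration only, a single party with two inputs and two outputs, and an abbreviated vector $\bar{q} = (P(\emptyset), P(0|0), P(0|1))$ from which the remaining probabilities follow by subtraction; this parametrization, the matrix `S` and the function names are assumptions and need not coincide with the representation of Section A 1.

```python
import numpy as np

# Rows give P(0|0), P(1|0), P(0|1), P(1|1) in terms of q = (norm, P(0|0), P(0|1)).
S = np.array([
    [0.0, 1.0, 0.0],
    [1.0, -1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, -1.0],
])

def in_B(q, tol=1e-12):
    """Membership in B = {q : S q >= 0} (all probabilities non-negative)."""
    return bool(np.all(S @ q >= -tol))

def functional_from(c):
    """Element v = S^T c of the candidate dual set, for c >= 0."""
    return S.T @ c

def separating_functional(q):
    """For q outside B, the functional with c_j = Theta(-(S q)_j)."""
    c = (S @ q < 0).astype(float)
    return S.T @ c
```

For example, `q_bad = np.array([1.0, 1.5, 0.2])` has a negative entry in `S @ q_bad`, and `separating_functional(q_bad) @ q_bad` evaluates to −0.5, while any `functional_from(c)` with non-negative `c` is non-negative on every vector passing `in_B`.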
With the formulation above, it is clear that linear optimizations over the set of positive functionals of no-signalling boxes can be carried out via linear programming [25].

3. Faster codes for optimization over 2 → 1 connectors
First, we will define a (non-normalized) local box in a non-standard way. [...] That this definition implies bipartite locality can be seen by noting that the variables $b_1, \dots, b_{n_B}$ play the role of local hidden variables in the decomposition above.
Note that we can regard an extended box as a no-signalling box where all the parties but the first have just one input. Therefore, we can represent extended boxes in abbreviated form, as a vector of probabilities $\bar{Q} = P(A, B_1, \dots, B_{n_B})$ [...]. With this formulation, one can carry out linear optimizations over the set of positive functionals of local boxes via linear programming [25]. The computational cost will be bearable provided that $n_B$ is not very large. $n_A$ can take high values, though.

[...] satisfying the properties above is called a Bose-symmetric PPT $k$-extension of $\sigma$. As proven in [11], any separable state admits a Bose-symmetric PPT $k$-extension for all $k$. Indeed, let $\sigma = \sum_i p_i \bigotimes_{j=1}^{m+1} |u^j_i\rangle\langle u^j_i|$. Then it can be verified that the state $\beta = \sum_i p_i \bigotimes_{j=1}^{m} \left(|u^j_i\rangle\langle u^j_i|\right)^{\otimes k} \otimes |u^{m+1}_i\rangle\langle u^{m+1}_i|$ satisfies the above constraints. Most importantly, the limiting set $\lim_{k \to \infty} S^k$ is the set of fully separable states [11].
As explained in the main text, rather than over general entanglement witnesses, we will conduct optimizations over a subset thereof. More precisely, we will consider a set $W^k_m$ of multipartite operators $W_{1, \dots, m+1}$ such that $\mathrm{tr}\{W\sigma\} \ge 0$ for all states $\sigma \in S^k$. This set is composed of operators $W_{1, \dots, m+1}$ such that [...] (C1), with the sum on the left running over all bipartitions $A$ of the $km + 1$ parties and $V_A \ge 0$ for all bipartitions $A$.
Let then $\sigma$ admit a Bose-symmetric PPT $k$-extension $\beta_{1, \dots, 1, 2, \dots, 2, \dots, m, \dots, m, m+1}$. Then we have that $\mathrm{tr}(W\sigma) = \mathrm{tr}\{(W \otimes I_{d_1}^{\otimes (k-1)} \otimes \dots \otimes I_{d_m}^{\otimes (k-1)})\beta\} = \mathrm{tr}\{\Pi^k_{sym} (W \otimes I_{d_1}^{\otimes (k-1)} \otimes \dots \otimes I_{d_m}^{\otimes (k-1)}) \Pi^k_{sym} \beta\} = \mathrm{tr}\{[...]\} \ge 0$. Here the first equality follows from the fact that $\beta$ is an extension; the second, from it living in the symmetric subspace; and the third, from Eq. (C1). The last inequality follows from the fact that $\beta$ is PPT and that $V_A \ge 0$ for all bipartitions $A$.

It hence follows that any map $\Omega$ satisfying the SDP conditions
1. $W_\Omega \in W^k_m$,
2. $I_{1, \dots, m} - \mathrm{tr}_{m+1}(W_\Omega) \in W^k_{m-1}$,
is an $m \to 1$ connector in SEP-world. Clearly, linear optimizations over this set can be cast as an SDP.

The DPS hierarchy also provides us with tools to define SDP ansätze of $m \to m$ connectors. In [50], it is shown that, for any state $\rho_{1, \dots, m}$ admitting a Bose-symmetric extension (note that the PPT condition is not necessary), the state $\tilde{\Omega}(\rho) \equiv$ [...] can be transformed into an $m \to m$ connector just by tracing out the extra systems and applying $\tilde{\Omega}$ at the output.
Let us finish with a trick to optimize over $m \to 2$ connectors when the output is a $C^2 \otimes C^2$ or a $C^2 \otimes C^3$ system. We again start from an $m \to 1$ connector $W$, with output spaces $A_{m+1}, B_{m+1}$. The key idea is to enforce that both $W_\Omega$ and $W_\Omega^{T_{A_{m+1}}}$ are entanglement witnesses. If that is the case, then the output of the map will be a PPT state, and so, by [55], a separable state. Imposing that $W_\Omega, W_\Omega^{T_{A_{m+1}}} \in W^k_m$ is again an SDP.

[...] $\Lambda^{[k]}_{a_k, x_k}$. It can be verified that $\langle\psi_k| = \frac{1}{2^k} \langle x_1| \langle x_k| \langle x_1 x_2 \oplus \dots \oplus x_{k-1} x_k| \langle a_1 \oplus \dots \oplus a_k|$. That is: the first qubit register contains a copy of the value of $x_1$; the second, the value of $x_k$; the third, the part of $f(x_1, \dots, x_m)$ computed so far; and the last one, the part of $a_1 \oplus \dots \oplus a_m$ computed so far.
The tensors defining box (36) are given by: $\Lambda^{[1]}_{a,x} = \frac{1}{2} \langle x| \langle a|$, $\Lambda^{[k]}_{a,x} = \frac{1}{2} M_{x,r} \otimes X^a$ for $1 < k < m$, where $M_{x,r} = |r\rangle\langle r| + \sum_{j=0}^{r-1} |j\rangle\langle x(j+1)|$. In this case, the first register has $r + 1$ levels ($|0\rangle, \dots, |r\rangle$), and it represents a counter. The second register is a qubit carrying the sum modulo 2 of the outputs.
Finally, it can be verified that an MPS representation for box (37) is given by the matrices: