Photonic Realization of a Quantum Finite Automaton

We describe a physical implementation of a quantum finite automaton recognizing a well known family of periodic languages. The realization exploits the polarization degree of freedom of single photons and their manipulation through linear optical elements. We use techniques of confidence amplification to reduce the acceptance error probability of the automaton. It is worth remarking that the quantum finite automaton we physically realize is not only interesting per se, but it turns out to be a crucial building block in many quantum finite automaton design frameworks theoretically settled in the literature.


I. INTRODUCTION
Quantum computing is a prolific research area, halfway between physics and computer science [1][2][3][4][5]. Most likely, its origins may be dated back to the 1970s, when some work on quantum information began to appear (see, e.g., Refs. [6,7]). In the early 1980s, Feynman suggested that the computational power of quantum mechanical processes might be beyond that of traditional computation models [8]. A similar idea was put forth by Manin [9]. Almost at the same time, Benioff proved that such processes are at least as powerful as Turing machines [10]. In 1985, Deutsch proposed the notion of a quantum Turing machine as a physically realizable model for a quantum computer [11].
The first impressive result witnessing "quantum power" was Shor's algorithm for integer factorization, which runs in polynomial time on a quantum computer [12]. It should be stressed that no classical polynomial-time factoring algorithm is currently known. The security of widely deployed cryptographic protocols rests on the presumed classical hardness of such problems: integer factoring for Rivest-Shamir-Adleman (RSA) and the closely related discrete logarithm problem, which Shor's approach also solves efficiently, for Diffie-Hellman. Relevant progress was made by Grover, who proposed a quantum algorithm for searching an item in an unsorted database containing $n$ items, which runs in time $O(\sqrt{n})$ [13]. These and other theoretical advances naturally drew much attention to efforts on the physical realization of quantum computational devices (see, e.g., Refs. [14][15][16][17]). While we can hardly expect to see a full-featured quantum computer in the near future, it might be reasonable to envision classical computing devices incorporating quantum components. Since the physical realization of quantum computational systems has proved to be an extremely complex task, it is also reasonable to keep quantum components as small as possible. Small-size quantum devices are modeled by quantum finite automata, a theoretical model for quantum machines with finite memory.
* stefano.olivares@fisica.unimi.it
Indeed, in current implementations of quantum computing, the preparation and initialization of qubits in superposition and/or entangled states is often challenging, making worthwhile the study of quantum computation with restricted memory, which requires less demanding resources, as in the case of the quantum finite automata.
The simplest model of a quantum finite automaton, and the most promising from the viewpoint of physical realization, is the so-called measure-once quantum finite automaton [18][19][20][21]. Such a model also served as a basis for defining several variants of quantum finite automata introduced and studied in many contributions (see, e.g., Refs. [22][23][24][25][26][27][28]). Because it is the only model considered in the present paper, from now on, for the sake of brevity, we will simply write "quantum finite automaton" instead of "measure-once quantum finite automaton." The "hardware" of a (one-way) quantum finite automaton is that of a classical finite automaton. Thus, we have an input tape scanned by a one-way input head moving one position forward at each move, plus a finite basis state control. Some basis states are designated as accepting states. At any given time during the computation, the state of the quantum finite automaton is represented by a complex linear combination of classical basis states, called a superposition. At each step, a unitary transformation associated with the currently scanned input symbol makes the automaton evolve to the next superposition. Superposition dynamics can transfer the complexity of the problem from a large number of sequential steps to a large number of coherently superposed quantum states. At the end of input processing, the automaton is observed in its final superposition. This operation makes the superposition collapse to a particular (classical) basis state with a certain probability. The probability that the automaton accepts the input word is given by the probability of observing (collapsing into) an accepting basis state. Quantum finite automata exhibit both advantages and disadvantages with respect to their classical (e.g., deterministic or probabilistic) counterparts. Basically, quantum superposition offers some computational advantages over probabilistic superposition.
On the other hand, quantum dynamics must be reversible, and this requirement may impose severe computational limitations to finite memory devices. As a matter of fact, it is sometimes impossible to simulate classical finite automata by quantum finite automata. In fact, as we will discuss in Sec. II D, isolated cut point quantum finite automata recognize a proper subclass of regular languages [18,20,21].
Although weaker from a computational power point of view, quantum finite automata may greatly outperform classical ones when descriptional power is at stake. In the realm of descriptional complexity [29], models of computation are compared on the basis of their size. In the case of finite state machines, a commonly assumed size measure is the number of finite control states. Most likely, the first contribution explicitly studying the descriptional power of quantum versus classical finite automata is Ref. [30], where an extremely succinct quantum finite automaton is provided, accepting the unary language $L_m = \{a^k \mid k \in \mathbb{N} \text{ and } k \bmod m = 0\}$ for any given $m > 0$. The construction in Ref. [30] uses as a basic (and sole) module a quantum finite automaton $A$ for $L_m$ with 2 basis states, whose acceptance reliability is then enhanced within a suitable modular building framework where traditional compositions (i.e., direct products and sums) of quantum systems are performed. Actually, many (if not all) contributions in the literature aiming to design small-size quantum finite automata for several tasks (see, e.g., Refs. [25,31-38]) use the module $A$ as a crucial building block. In this sense, the language $L_m$ and the module $A$ turn out to be "paradigmatic" as tools to build and test size-efficient quantum finite automata. Hence, a physical realization of the module $A$ might be well worth investigating.
In this paper, we put forward a physical implementation of quantum finite automata based on the polarization degree of freedom of single photons and able to recognize a family of periodic languages. More precisely, because of its centrality in quantum finite automaton design frameworks, stressed above, we focus on the physical implementation of the quantum finite automaton $A$ for the language $L_m$. We investigate the performance of our photonic automaton, taking into account the main sources of error and imperfections, e.g., in the preparation of the initial automaton state. We also use techniques of confidence amplification to reduce the acceptance error probability of the automaton.
The paper is structured as follows. In Sec. II, we provide an almost self-contained overview of the basic concepts underlying formal language theory and classical finite automata. Moreover, we quickly address practical impacts of finite automata and the importance of investigating their size in the light of possible physical implementations of such devices. Next, we present the notion of a quantum finite automaton together with some basic facts on its computational and descriptional power. We particularly focus on unary automata, i.e., automata with a single-letter input alphabet, and emphasize the notion of a language accepted with isolated cut point. In Sec. III, we introduce a simple unary language as a benchmark upon which to test the descriptional power of classical and quantum finite automata, namely the language $L_m = \{a^k \mid k \in \mathbb{N} \text{ and } k \bmod m = 0\}$ for any given $m > 0$. We provide a theoretical definition of a quantum finite automaton $A$ accepting $L_m$ with isolated cut point and two basis states, whereas any classical automaton for $L_m$ requires a number of states which grows with $m$.
The photonic implementation of the quantum finite automaton A with two basis states is then discussed in Sec. IV. There, we start reviewing the standard quantum formalism used to describe the polarization state of the single photon, its dynamics, and the link with the formalism used in the previous sections. Then, we explain the working principle of the photonic implementation of the quantum finite automaton and propose a discrimination strategy to reduce the acceptance error probability. Section V describes the experimental apparatus and reports the results we obtained. Finally, we close the paper with Sec. VI, where we draw some concluding remarks and the outlook for our work.

A. Formal languages and classical finite automata
Formal language theory studies languages from a mathematical point of view, providing formal tools and methods to analyze language properties. Strictly connected with automata theory, the discipline dates back to the 1950s, and it was originally developed to provide a theoretical basis for natural language processing. It was soon realized that this theory was relevant to the artificial languages (e.g., programming languages) that had originated in computer science. Since its birth, formal language theory has become established as one of the most prominent areas in theoretical computer science. Its results have a huge impact in numerous fields, including practical computer science, cryptography and security, discrete mathematics and combinatorics, graph theory, mathematical logic, nature-inspired (e.g., quantum, biological, genetic) computational models, physics, and system theory.
The reader may find a lot of excellent textbooks where thoughtful presentations of formal language and automata theory and their applications are presented (see, e.g., Refs. [39,40]). In order to keep this paper as self-contained as possible, we present basic concepts and notations of formal language and automata theory and briefly emphasize those aspects which are relevant to the present work, i.e., regular languages and finite automata.
An alphabet $\Sigma$ is any finite set of elements called symbols. A word on $\Sigma$ is a sequence $\omega = \sigma_1 \sigma_2 \cdots \sigma_n$ with $\sigma_i \in \Sigma$ being its $i$th symbol. The length of $\omega$, i.e., the number of symbols $\omega$ consists of, is denoted by $|\omega|$. We let $\varepsilon$ be the empty word satisfying $|\varepsilon| = 0$. The set of all words (including the empty word) on $\Sigma$ is denoted by $\Sigma^*$, and we let $\Sigma^+ = \Sigma^* \setminus \{\varepsilon\}$. A language $L$ on $\Sigma$ is any subset of $\Sigma^*$, i.e., $L \subseteq \Sigma^*$. If $|\Sigma| = 1$, we say that $\Sigma$ is a unary alphabet, and languages on unary alphabets are called unary languages. In the case of unary alphabets, we customarily let $\Sigma = \{a\}$ so that a unary language is any set $L \subseteq a^*$. The concatenation of the word $x \in \Sigma^*$ with the word $y \in \Sigma^*$ is the word $xy$ consisting of the sequence of symbols of $x$ immediately followed by the sequence of symbols of $y$. For any $\sigma \in \Sigma$ and any positive integer $k$, we let $\sigma^k$ be the word obtained by concatenating $k$ times the symbol $\sigma$. We stipulate that $\sigma^0 = \varepsilon$.
Several formal tools have been introduced to rigorously express languages. Formal grammars are the main generative systems for languages. A formal grammar is a quadruple $G = (\Sigma, Q, P, S)$ where $\Sigma$ and $Q$ are two disjoint finite alphabets of, respectively, terminal and nonterminal symbols, $S \in Q$ is the start symbol, and $P$ is the finite set of production rules, or simply, productions. Productions can be regarded as rewriting rules, typically expressed in the form $\alpha \to \beta$ with $\alpha \in (\Sigma \cup Q)^+$ and $\beta \in (\Sigma \cup Q)^*$. Given $w, z \in (\Sigma \cup Q)^*$, we say that $z$ is derived in one step from $w$ in $G$ whenever $w = x\alpha y$, $z = x\beta y$, and $\alpha \to \beta$ is a production rule in $P$. Formally, we write $w \Rightarrow_G z$. More generally, $z$ is derived from $w$ in $G$ whenever there is a sequence $w_0, w_1, \ldots, w_n \in (\Sigma \cup Q)^*$ such that $w = w_0 \Rightarrow_G w_1 \Rightarrow_G \cdots \Rightarrow_G w_n = z$; in this case, we write $w \Rightarrow^*_G z$. The language generated by the grammar $G = (\Sigma, Q, P, S)$ is the set $L(G) \subseteq \Sigma^*$ defined as $L(G) = \{\omega \in \Sigma^* \mid S \Rightarrow^*_G \omega\}$. Two grammars $G, G'$ are equivalent whenever $L(G) = L(G')$.
The following example provides a grammar and establishes the corresponding generated language.
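Example 1. When listing grammar production rules, we can write $\alpha \to \beta_1 \mid \beta_2 \mid \cdots \mid \beta_{n-1} \mid \beta_n$ as a shortcut for expressing the set of productions $\alpha \to \beta_1$, $\alpha \to \beta_2$, ..., $\alpha \to \beta_n$. So, for a fixed integer $k \geq 1$, consider the grammar
$$G = (\{a, b\}, \{B_0, B_1, \ldots, B_k\}, P, B_0),$$
where the set $P$ of productions is defined as
$$B_0 \to aB_0 \mid bB_0 \mid bB_1, \qquad B_i \to aB_{i+1} \mid bB_{i+1} \ \text{ for every } 1 \leq i \leq k-1, \qquad B_k \to \varepsilon.$$
Let us derive the generated language $L(G)$. By repeatedly applying the productions $B_0 \to aB_0 \mid bB_0$, from the start symbol $B_0$ we can derive $\alpha B_0$, for any $\alpha \in \{a, b\}^*$. Formally, $B_0 \Rightarrow^*_G \alpha B_0$. At this point, in order to generate a word of terminal symbols only, we must apply the production $B_0 \to bB_1$, thus having $B_0 \Rightarrow^*_G \alpha B_0 \Rightarrow_G \alpha b B_1$. Then, we are left to sequentially apply the productions $B_i \to aB_{i+1} \mid bB_{i+1}$ for every $1 \leq i \leq k-1$. So, $B_0 \Rightarrow^*_G \alpha B_0 \Rightarrow_G \alpha b B_1 \Rightarrow^*_G \alpha b \beta B_k$, for any $\beta \in \{a, b\}^*$ with $|\beta| = k-1$. By applying the last production $B_k \to \varepsilon$, we finally get $B_0 \Rightarrow^*_G \alpha b \beta$. Thus, the language generated by $G$ is
$$L(G) = \{\alpha b \beta \mid \alpha, \beta \in \{a, b\}^* \text{ and } |\beta| = k-1\}.$$
In words, $L(G)$ consists of those words on $\{a, b\}$ featuring a symbol $b$ at the $k$th position from the right.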
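The derivation argument of Example 1 can be checked mechanically on short words. The following is a minimal sketch, assuming the productions $B_0 \to aB_0 \mid bB_0 \mid bB_1$, $B_i \to aB_{i+1} \mid bB_{i+1}$ for $1 \leq i \leq k-1$, and $B_k \to \varepsilon$; the function names `derive_words` and `kth_from_right_is_b` are ours and purely illustrative. It enumerates all terminal words of bounded length derivable from $B_0$ and compares them with the words having $b$ at the $k$th position from the right.

```python
from itertools import product


def derive_words(k, max_len):
    """Enumerate the terminal words of length <= max_len derivable from B_0.

    Every sentential form of this right-linear grammar is a terminal prefix
    followed by one nonterminal B_i, so it is stored as the pair (prefix, i).
    """
    words = set()
    frontier = [("", 0)]  # the start symbol B_0
    while frontier:
        prefix, i = frontier.pop()
        if i == k:
            words.add(prefix)  # production B_k -> epsilon
            continue
        if len(prefix) == max_len:
            continue  # the word cannot grow further within the length bound
        if i == 0:
            # productions B_0 -> aB_0 | bB_0 | bB_1
            frontier += [(prefix + "a", 0), (prefix + "b", 0), (prefix + "b", 1)]
        else:
            # productions B_i -> aB_{i+1} | bB_{i+1}, for 1 <= i <= k-1
            frontier += [(prefix + "a", i + 1), (prefix + "b", i + 1)]
    return words


def kth_from_right_is_b(word, k):
    """Membership predicate: the k-th symbol from the right is b."""
    return len(word) >= k and word[-k] == "b"


# Compare the grammar with the predicate on all words of length <= 6, for k = 3.
k, max_len = 3, 6
expected = {
    "".join(w)
    for n in range(max_len + 1)
    for w in product("ab", repeat=n)
    if kth_from_right_is_b("".join(w), k)
}
assert derive_words(k, max_len) == expected
```

Such a bounded check is of course no substitute for the derivation argument above; it only confirms the claim on finitely many words.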
Originally, four types of grammars were identified, depending on the form of their productions. The corresponding four classes of generated languages turn out to be relevant both from practical and theoretical points of view. Precisely, a grammar $G = (\Sigma, Q, P, S)$ is of
TYPE 0: whenever productions in $P$ do not have any particular restriction. The class of languages generated by this type of grammar is the class of recursively enumerable languages.
TYPE 1 or context-sensitive: whenever every production $\alpha \to \beta \in P$ satisfies $|\alpha| \leq |\beta|$; the production $S \to \varepsilon$ is allowed provided $S$ never occurs within the right part of any production in $P$. The class of languages generated by this type of grammar is the class of context-sensitive languages.
TYPE 2 or context-free: whenever every production in $P$ is of the form $A \to \beta$ with $A \in Q$. The class of languages generated by this type of grammar is the class of context-free languages.
TYPE 3 or regular: whenever every production is of the form $A \to \varepsilon$, $A \to \sigma$, or $A \to \sigma B$ with $\sigma \in \Sigma$ and $A, B \in Q$. The class of languages generated by this type of grammar is the class of regular languages.
The reader may easily verify that the grammar proposed in Example 1 is a type 3 grammar, and hence the generated language is an example of a regular language.
It can be shown that for any given type $i + 1$ grammar, an equivalent type $i$ grammar can be built. Hence, the class of regular languages is contained in the class of context-free languages, which is contained in the class of context-sensitive languages, which in turn is contained in the class of recursively enumerable languages. In addition, we have that such a language class hierarchy is proper. In fact, (i) there exist languages outside the class of recursively enumerable languages, (ii) there exist recursively enumerable languages that cannot be generated by any context-sensitive grammar, (iii) the ternary context-sensitive language $\{a^n b^n c^n \mid n \in \mathbb{N}\}$ cannot be generated by any context-free grammar, (iv) the binary context-free language $\{a^n b^n \mid n \in \mathbb{N}\}$ cannot be generated by any regular grammar. Besides the one in Example 1, further instances of regular languages will be provided below. This language class hierarchy is usually known as the Chomsky hierarchy, and the whole formal language and automata theory has been developing around it. Every level of the hierarchy has been deeply investigated, yielding profound results and widespread applications.
An alternative equivalent approach to define the Chomsky hierarchy uses language accepting systems, i.e., roughly speaking, formal computational devices which process input words and output a final accept/reject verdict. For one such device, the corresponding accepted (or recognized) language consists of those input words that are accepted. According to this point of view, (i) the class of recursively enumerable languages coincides with the class of languages accepted by Turing machines, (ii) the class of context-sensitive languages coincides with the class of languages accepted by linear bounded automata, (iii) the class of context-free languages coincides with the class of languages accepted by nondeterministic pushdown automata, and (iv) the class of regular languages coincides with the class of languages accepted by (several types of) finite automata.
In this paper, we will be concerned with the class of regular languages. In particular, we will focus on the computational model of finite automata defining them [see (iv) above]. For extensive and thoughtful surveys on classical finite automata theory, the reader is referred to, e.g., Refs. [39][40][41]. Several types of finite automata have been introduced and deeply investigated in the literature. Let us begin with the original and most basic version. In Fig. 1, the "hardware" of a one-way deterministic finite automaton (1dfa, for short [42]) $A$ is depicted. We remark that the other versions of finite automata we are going to review share the same hardware but exhibit different dynamics.
FIG. 1. Schematic diagram of the "hardware" of a one-way deterministic finite automaton (1dfa). The 1dfa is made of a read-only input tape consisting of a sequence of cells, each one capable of storing a symbol. The tape may be scanned by a "head", which is moving one position right at each step. At each stage of the computation of A, a finite state control is in a state from a finite set Q.
We have a read-only input tape consisting of a sequence of cells, each one being able to store an input symbol. The tape is scanned by an input head always moving one position right at each step. This type of input head motion motivates the designation "one way." At each time during the computation of $A$, a finite state control is in a state from a finite set $Q$. Some of the states in $Q$ are designated as accepting states, while $q_0 \in Q$ is a designated initial state. The computation of $A$ on a word $\omega$ from a given input alphabet $\Sigma$ begins by having (i) $\omega$ stored symbol by symbol, left to right, in the cells of the input tape, (ii) the input head scanning the leftmost tape cell, and (iii) the finite state control being in the state $q_0$. In a move, $A$ reads the symbol below the input head and, depending on such a symbol and the state of the finite state control, it switches to the next state according to a fixed transition function and moves the input head one position forward. We say that $A$ accepts $\omega$ if and only if it enters an accepting state after scanning the rightmost symbol of $\omega$; otherwise, $A$ rejects $\omega$. The language accepted by $A$ is the set $L(A) \subseteq \Sigma^*$ consisting of all the input words accepted by $A$.
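Formally, a 1dfa is a quintuple $A = (Q, \Sigma, \delta, q_0, F)$, where $Q$ is a finite set of states, with $q_0 \in Q$ being the initial state and $F \subseteq Q$ being the set of accepting states, $\Sigma$ is the input alphabet, and $\delta : Q \times \Sigma \to Q$ is the transition function defining moves as follows: If $A$ scans the input symbol $\sigma$ while being in the state $p$ and $\delta(p, \sigma) = q$ holds, then it enters the state $q$ and shifts the input head one position forward. The transition function $\delta$ can be inductively extended from symbols in $\Sigma$ to words in $\Sigma^*$ as $\delta : Q \times \Sigma^* \to Q$. Namely, for any $q \in Q$, $\omega \in \Sigma^*$, and $\sigma \in \Sigma$, we let
$$\delta(q, \varepsilon) = q, \qquad \delta(q, \omega\sigma) = \delta(\delta(q, \omega), \sigma).$$
Thus, the language accepted by $A$ is the set $L(A) = \{\omega \in \Sigma^* \mid \delta(q_0, \omega) \in F\}$.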
A nice pictorial representation of a 1dfa $A = (Q, \Sigma, \delta, q_0, F)$ is by its state (or transition) graph $D_A$. Basically, $D_A$ is a labeled digraph having $Q$ as the set of its vertices and labeled directed edges representing moves. Precisely, there exists an edge from vertex $p$ to vertex $q$ with label $\sigma$ if and only if $\delta(p, \sigma) = q$ holds true. Vertices are usually drawn as circles in the plane with labels indicating the corresponding states, while labeled arrows join adjacent states. The vertex corresponding to the state $q_0$ has an incoming arrow, while vertices associated with accepting states in $F$ are double circled. It is easy to see that the computation of $A$ on the input word $\omega$ can be tracked in $D_A$ by following the unique directed path labeled $\omega$ from the vertex $q_0$. So, $A$ accepts $\omega$ if and only if such a path ends up in a double circled vertex.
To clarify the above notions, the next example displays a 1dfa accepting a simple unary language. We provide such a 1dfa both in its formal definition as a quintuple and as state graph.
Example 2. The following simple unary language will play an important role throughout the rest of the paper. For any given integer $m > 0$, let
$$L_m = \{a^k \mid k \in \mathbb{N} \text{ and } k \bmod m = 0\}.$$
Such a language can be accepted by the 1dfa
$$A = (\{q_0, q_1, \ldots, q_{m-1}\}, \{a\}, \delta, q_0, \{q_0\}),$$
where, for any $0 \leq i \leq m-1$, we set $\delta(q_i, a) = q_{(i+1) \bmod m}$. It is easy to see that $\delta(q_0, a^k) = q_{k \bmod m}$, which is $q_0$ if and only if $k \bmod m = 0$, i.e., if and only if $a^k \in L_m$. Hence, $L(A) = L_m$. The state graph for the 1dfa $A$ is depicted in Fig. 2. Because the input alphabet is unary, all edges would have the same label $a$, which can then be safely omitted.
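The 1dfa of Example 2 is easily simulated. The following is a minimal sketch in which the integers $0, \ldots, m-1$ stand for the states $q_0, \ldots, q_{m-1}$; the names `make_dfa` and `dfa_accepts` are illustrative choices of ours.

```python
def make_dfa(m):
    """1dfa for L_m: the integers 0..m-1 stand for the states q_0..q_{m-1}."""
    # delta(q_i, a) = q_{(i+1) mod m}
    delta = {(i, "a"): (i + 1) % m for i in range(m)}
    return delta, 0, {0}  # transition function, initial state, accepting states


def dfa_accepts(dfa, word):
    """Run the 1dfa on word and report whether it halts in an accepting state."""
    delta, state, accepting = dfa
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting


# For m = 5, the accepted words a^k with k < 16 are those with k in {0, 5, 10, 15}.
dfa = make_dfa(5)
assert [k for k in range(16) if dfa_accepts(dfa, "a" * k)] == [0, 5, 10, 15]
```

Note that the dictionary `delta` has exactly $m$ entries, one per state, which is the size measure used throughout the paper.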
Let us now turn to the model of a one-way nondeterministic finite automaton (1nfa, for short [42]). Formally, a 1nfa is a quintuple $A = (Q, \Sigma, \delta, q_0, F)$ in which every component is defined as in 1dfa's except the transition function, which is now a mapping $\delta : Q \times \Sigma \to 2^Q$, where $2^Q$ denotes the powerset of $Q$, i.e., the set of all subsets of $Q$. Unlike the deterministic case, now at each move $A$ has several candidates as possible next states. Precisely, if $A$ scans the input symbol $\sigma$ while being in the state $p$ and $\delta(p, \sigma) = S$ holds, then it may enter one of the states in $S$ and shift the input head one position forward. Thus, on any input word $\omega$, several computation paths from $q_0$ may exist; if at least one of such paths leads to an accepting state, then $A$ accepts $\omega$. More formally, we can inductively extend the transition function $\delta$ to subsets of states and words as $\delta : 2^Q \times \Sigma^* \to 2^Q$. First of all, we define the extension $\delta : 2^Q \times \Sigma \to 2^Q$ as $\delta(S, \sigma) = \bigcup_{p \in S} \delta(p, \sigma)$. Then, for any $S \subseteq Q$, $\omega \in \Sigma^*$, and $\sigma \in \Sigma$, we let
$$\delta(S, \varepsilon) = S, \qquad \delta(S, \omega\sigma) = \delta(\delta(S, \omega), \sigma).$$
Thus, the language accepted by $A$ is the set $L(A) = \{\omega \in \Sigma^* \mid \delta(\{q_0\}, \omega) \cap F \neq \emptyset\}$. The reader may easily verify that a 1dfa can be seen as a 1nfa where, for any $q \in Q$ and $\sigma \in \Sigma$, we have that $\delta(q, \sigma)$ contains a single state.
The state graph $D_A$ for the 1nfa $A = (Q, \Sigma, \delta, q_0, F)$ can be defined as above for the deterministic case, but now an edge from vertex $p$ to vertex $q$ with label $\sigma$ exists if and only if $q \in \delta(p, \sigma)$ holds true. This means that, in general, a vertex may present several outgoing edges with the same label. Thus, $A$ accepts an input word $\omega$ if and only if there exists a path in $D_A$ labeled $\omega$ from $q_0$ to a double circled vertex.
The following example proposes a 1nfa expressed as state graph for a binary language.
Example 3. Consider the binary language in Example 1, for which a type 3 grammar was provided there. Here, we call that language $E_k$, which was defined as
$$E_k = \{\alpha b \beta \mid \alpha, \beta \in \{a, b\}^* \text{ and } |\beta| = k-1\}.$$
Thus, a word on $\{a, b\}$ is in $E_k$ if and only if its $k$th symbol from the right is $b$. In Fig. 3, the state graph of a 1nfa accepting $E_k$ is depicted. The reader may easily verify that the accepted language is exactly $E_k$. Moreover, she may straightforwardly work out an equivalent formal definition of the 1nfa as a quintuple.
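As an illustration of nondeterministic acceptance, the following minimal sketch simulates a 1nfa for $E_k$ by tracking the set of currently reachable states, i.e., the extended transition function on subsets. Since Fig. 3 is not reproduced here, the automaton below is an assumption of ours: it mirrors the grammar of Example 1, with states $0, \ldots, k$ playing the role of $B_0, \ldots, B_k$, and it need not coincide state by state with the one in the figure. The names `make_nfa` and `nfa_accepts` are illustrative.

```python
def make_nfa(k):
    """Assumed 1nfa for E_k on states 0..k (one state per nonterminal B_0..B_k)."""
    # State 0 loops on both symbols and guesses, on a b, that this b is the
    # k-th symbol from the right.
    delta = {(0, "a"): {0}, (0, "b"): {0, 1}}
    # States 1..k-1 count the remaining k-1 symbols.
    for i in range(1, k):
        delta[(i, "a")] = {i + 1}
        delta[(i, "b")] = {i + 1}
    return delta, 0, {k}  # state k is accepting and has no outgoing moves


def nfa_accepts(nfa, word):
    """Track the set of reachable states; accept if it meets the accepting set."""
    delta, start, accepting = nfa
    current = {start}
    for symbol in word:
        current = set().union(*(delta.get((p, symbol), set()) for p in current))
    return bool(current & accepting)


nfa = make_nfa(3)
assert nfa_accepts(nfa, "abab")     # third symbol from the right is b
assert not nfa_accepts(nfa, "aab")  # third symbol from the right is a
```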
We complete our overview of classical models of finite automata by introducing the notion of a one-way probabilistic finite automaton (1pfa, for short [43]). Formally, a 1pfa is a quintuple $A = (Q, \Sigma, \delta, q_0, F)$ in which every component is defined as usual, but now $\delta$ returns a probability distribution for the next state. More precisely, $\delta : Q \times \Sigma \times Q \to [0, 1]$, where $\delta(p, \sigma, q)$ is the probability that $A$, being in the state $p$, reaches the state $q$ upon reading the symbol $\sigma$. As usual, the input head is shifted one position right at each move. Clearly, for any $p \in Q$ and $\sigma \in \Sigma$, we require that $\sum_{q \in Q} \delta(p, \sigma, q) = 1$. Inductively extending the transition function $\delta$ to words enables us to get $\delta : Q \times \Sigma^* \times Q \to [0, 1]$, where $\delta(p, \omega, q)$ yields the probability that $A$, being in the state $p$, reaches the state $q$ upon reading the input word $\omega$, as
$$\delta(p, \varepsilon, q) = \begin{cases} 1 & \text{if } p = q, \\ 0 & \text{otherwise,} \end{cases} \qquad \delta(p, \omega\sigma, q) = \sum_{r \in Q} \delta(p, \omega, r)\, \delta(r, \sigma, q).$$
Thus, the probability that $A$ accepts the input word $\omega$ is written as $p_A(\omega) = \sum_{q \in F} \delta(q_0, \omega, q)$, i.e., the probability for $A$ to reach an accepting state from the initial state $q_0$ after processing $\omega$. Given a real number $\lambda$, we define the language accepted by $A$ with cut point $\lambda$ as the set $L_{A,\lambda} = \{\omega \in \Sigma^* \mid p_A(\omega) > \lambda\}$. A language $L \subseteq \Sigma^*$ is said to be accepted by $A$ with isolated cut point $\lambda$ whenever $L = L_{A,\lambda}$ and there exists $\rho > 0$ such that $|p_A(\omega) - \lambda| \geq \rho$ for every $\omega \in \Sigma^*$. The relevance of isolated cut point acceptance is due to the fact that, in this case, we can arbitrarily reduce the classification error probability of an input word by repeating its parsing a constant number of times (not depending on the length of the input word) and taking the majority of the answers [41,43]. In our experiment, we will use this fact to reduce the error probability. Notice that besides isolated cut point acceptance, other probabilistic acceptance modes are widely studied in the literature (see, e.g., Refs. [24,25,36,44]). Without going into detail, a state graph $D_A$ can be naturally associated even with a 1pfa $A$. Now, edges in $D_A$ are labeled by both a symbol and the corresponding transition probability.
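The effect of repetition under an isolated cut point can be quantified exactly with a binomial sum. The following is a minimal sketch under an assumption of ours: the decision rule answers "accept" if and only if the fraction of accepting runs exceeds the cut point $\lambda$, which reduces to a majority vote when $\lambda = 1/2$. The function name `amplified_error` and its parameters are illustrative.

```python
from math import comb


def amplified_error(p, lam, t):
    """Probability that t independent runs misclassify a word.

    p   -- single-run acceptance probability p_A(w) of the word
    lam -- cut point: the word is in the language if and only if p > lam
    t   -- number of repetitions; the answer is "accept" if and only if
           more than lam * t runs accept (assumed decision rule)
    """
    accept_prob = sum(
        comb(t, j) * p**j * (1 - p) ** (t - j)
        for j in range(t + 1)
        if j > lam * t
    )
    # A word in the language is misclassified when the vote rejects it,
    # a word outside the language when the vote accepts it.
    return 1 - accept_prob if p > lam else accept_prob


# Cut point 1/2 isolated by 1/4: a single run errs with probability 1/4,
# and the error shrinks as the number of repetitions grows.
for t in (1, 5, 21):
    print(t, amplified_error(0.75, 0.5, t))
```

The number of repetitions needed to reach a target error depends only on the isolation $\rho$, not on the length of the input word, which is the point made in the text.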
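Example 4. For two primes $m, n$, consider the unary language
$$L_{m \cdot n} = \{a^k \mid k \in \mathbb{N} \text{ and } k \bmod (m \cdot n) = 0\}.$$
Notice that this is a particular instance of the unary language introduced in Example 2. We define a 1pfa $A$ for $L_{m \cdot n}$ by specifying its set of states and its transition probabilities; any other transition occurs with probability 0.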
It is not hard to see that $p_A(a^k) = 1$ if $a^k \in L_{m \cdot n}$, while $p_A(a^k) \leq 1/2$ otherwise. Thus, the 1pfa $A$ accepts $L_{m \cdot n}$ with cut point $3/4$ isolated by $1/4$. The state graph of the 1pfa $A$ is sketched in Fig. 4. As usual, because the input alphabet is unary, we omit the label $a$ from every edge. Moreover, each edge without an associated probability defines a move occurring with certainty.
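To make the isolation claim concrete, the following minimal sketch computes acceptance probabilities for one possible 1pfa for $L_{m \cdot n}$. The construction is an assumption of ours and is not taken from Fig. 4: with probability $1/2$ each, the automaton commits to counting the input length modulo $m$ or modulo $n$, and it accepts when the chosen counter is back at zero. For distinct primes $m, n$, this yields the cut point $3/4$ isolated by $1/4$ stated above. The function name `acceptance_probability` is illustrative.

```python
from fractions import Fraction


def acceptance_probability(m, n, k):
    """p_A(a^k) for the assumed two-cycle 1pfa (hypothetical construction).

    With probability 1/2 the automaton counts modulo m, with probability 1/2
    modulo n; it accepts when the chosen counter is back at zero.
    """
    half = Fraction(1, 2)
    return half * int(k % m == 0) + half * int(k % n == 0)


# Check cut point 3/4 isolated by 1/4 for the distinct primes m = 3, n = 5.
m, n = 3, 5
cut, rho = Fraction(3, 4), Fraction(1, 4)
for k in range(4 * m * n):
    p = acceptance_probability(m, n, k)
    assert (p > cut) == (k % (m * n) == 0)  # accepted exactly on L_{m*n}
    assert abs(p - cut) >= rho              # the cut point is isolated by rho
```

Under this assumed construction, the two cycles use about $m + n$ states, to be compared with the $m \cdot n$ states of the 1dfa of Example 2 for the same language.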
For the sake of completeness, we point out that two-way finite automata are also considered in the literature. Very roughly speaking, a two-way finite automaton has the same hardware as a one-way finite automaton, but its input head can move one position forward or backward, or stand still at each move. Two-way motion of the input head can be adopted by the three paradigms above recalled, thus leading to the models of 2dfa's, 2nfa's, and 2pfa's. Formal definitions and properties of two-way finite automata may be found, e.g., in Refs. [39][40][41][42][43]45,46].
The computational power of all these (and actually of many other) variants of finite automata has been well established in the literature over many years of research. As suggested in point (iv) of the automata-based characterization of the Chomsky hierarchy recalled above, 1dfa's, 1nfa's, isolated cut point 1pfa's, 2dfa's, and 2nfa's all accept exactly the class of regular languages, whereas the class of regular languages is properly contained in the class of languages accepted by isolated cut point 2pfa's [45]. However, when restricted to unary alphabets, even isolated cut point 2pfa's accept exactly unary regular languages [46]. Regular languages are of fundamental importance in many applications in computer science. Viewing regular languages through finite automata greatly improved compiler and interpreter design, parsing and pattern-matching algorithms, cryptography and security protocol testing, computer network protocol testing, model checking, and software validation. It is not an exaggeration to say that almost any task in computer science sooner or later leads to coping with some regular language which can be fruitfully managed via a suitable finite automaton.
However, besides being a valuable tool in language processing, finite automata represent a formidable theoretical model to deal with those physical systems which exhibit a predetermined sequence of actions depending on a sequence of events they are presented with. Originally, finite automata were introduced to describe the electric activity of brain neurons, but soon they were extensively used in the design and analysis of several devices such as the control units for vending machines, elevators, traffic lights, combination locks, etc.
Particularly important is the use of finite automata in very large scale integration (VLSI) design, namely, in the design of sequential networks, which are the building blocks of modern computers and digital systems. Very roughly speaking, a sequential network is a boolean circuit equipped with memory. Engineering a sequential network typically requires modeling its behavior with a finite automaton whose number of states directly influences the amount of hardware (i.e., the number of logic gates) employed in the electronic realization of the sequential network. From this point of view, having fewer states in the modeling finite automaton directly results in employing smaller hardware which, in turn, means lower energy absorption and fewer cooling problems. These latter physical implementation aspects, as the reader may easily figure out, turn out to be of paramount importance given the current level of digital device miniaturization.
These "physical" (and other more theoretical) considerations have led to a trend in the literature in which, besides acceptance capabilities, the descriptional power of finite automata is deeply investigated. Within the realm of descriptional complexity [29], the size of finite automata is under consideration, and a common measure for finite automaton size is the number of states. In particular, one studies how the number of states decreases or increases when different computational paradigms (e.g., deterministic, nondeterministic, probabilistic, quantum, one-way, two-way) are used on a finite automaton to perform a given task. Let us quickly recall some very well-known results on the descriptional power of different types of finite automata. To this aim, we say that two finite automata $A, A'$ are equivalent whenever $L(A) = L(A')$.
It is well known that any $n$-state 1nfa can be converted into an equivalent $2^n$-state 1dfa [42], and that in general such an exponential size blowup is unavoidable. In fact, consider the language $E_k$ in Example 3. There, a $k$-state 1nfa accepting $E_k$ is sketched, but it can be shown that any 1dfa for $E_k$ cannot have fewer than $2^k$ states. A similar exponential gap exists for 1dfa's versus 1pfa's: Any $n$-state 1pfa accepting a language with cut point isolated by $\rho$ can be turned into an equivalent 1dfa with $(1 + 1/\rho)^n$ states [43]. Even in this case, the exponential blowup is in general "almost unavoidable" (stating the exact size gap between determinism and probabilism is an open problem). This can be proved by elaborating on the language $L_{m \cdot n}$ provided in Example 4. Equivalent 1dfa's for $n$-state 2dfa's and 2nfa's can be obtained at a cost of no fewer than $n^n$ and $2^{n^2}$ states, respectively [42,47].
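The exponential blowup for $E_k$ can be observed by running the subset construction. The following minimal sketch counts the subsets of states reachable from the initial one; it relies on an assumed 1nfa for $E_k$ with states $0, \ldots, k$ (state 0 loops on both symbols and moves to state 1 on $b$, while states $1, \ldots, k-1$ advance on any symbol), which need not coincide with the automaton of Fig. 3. The function name `reachable_dfa_states` is illustrative.

```python
def reachable_dfa_states(k):
    """Count the subsets reached by the subset construction on an assumed 1nfa for E_k."""

    def step(subset, symbol):
        nxt = set()
        for p in subset:
            if p == 0:
                nxt.add(0)          # state 0 loops on both symbols
                if symbol == "b":
                    nxt.add(1)      # guess: this b is the k-th symbol from the right
            elif p < k:
                nxt.add(p + 1)      # count the remaining symbols
            # state k has no outgoing moves
        return frozenset(nxt)

    start = frozenset({0})
    seen, todo = {start}, [start]
    while todo:
        subset = todo.pop()
        for symbol in "ab":
            nxt = step(subset, symbol)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return len(seen)


# The number of reachable subsets doubles with each increment of k.
assert [reachable_dfa_states(k) for k in range(1, 6)] == [2, 4, 8, 16, 32]
```

The count alone only shows that the subset construction produces $2^k$ states; that no smaller 1dfa exists requires the separate distinguishability argument alluded to in the text.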
Following this line of research on the succinctness of different computational paradigms, we are going to investigate whether and how adopting the quantum paradigm of computation may reduce the number of states on finite state automata, thus providing theoretical foundations for the realization of more succinct devices with all potential benefits in terms of miniaturization and energy consumption above addressed.
To this aim, we will be particularly interested in unary one-way finite automata, i.e., automata having a unary input alphabet consisting of the sole symbol $a$. Clearly, unary one-way finite automata accept unary languages $L \subseteq a^*$. Here, we choose to provide a compact matrix presentation of unary one-way finite automata that will naturally lead to formalizing the notion of a unary one-way quantum finite automaton. We recall that a matrix is said to be boolean whenever its entries are either 0 or 1 and stochastic whenever its entries are real numbers from the interval $[0, 1]$ and each row sums to 1.
Let A be a unary one-way finite automaton with {q 1 , q 2 , . . . , q n } being the set of its states; some of these states are accepting. Then, A can be formally written as a triple A = (ζ , U, η), where η ∈ {0, 1} n×1 is the characteristic column vector of the accepting states, i.e., η i = 1 if and only if q i is an accepting state, while ζ and U have different forms depending on the nature of A. Precisely, A is as follows: 1dfa: ζ ∈ {0, 1} n is the characteristic row vector of the initial state, U is an n × n boolean stochastic transition matrix, and hence U has exactly a single 1 per row, with U i j = 1 if and only if and only if A moves from the state q i to the state q j upon reading a; i.e., U i j = 1 if and only if δ(q i , a) = q j . 1nfa: as above, except that U is boolean with U i j = 1 if and only if q j ∈ δ(q i , a).
1pfa: $\zeta \in [0, 1]^n$ is a stochastic row vector representing the initial probability distribution of the states (see footnote 1), and U is an n × n stochastic transition matrix with $U_{ij}$ being the probability that A moves from the state $q_i$ to the state $q_j$ upon reading a, i.e., $U_{ij} = \delta(q_i, a, q_j)$.
The reader may easily work out the matrix presentation for the unary 1dfa and the unary 1pfa defined, respectively, in Examples 2 and 4.
Let us see how to express the notion of accepted language in this matrix presentation. The situation of the unary one-way finite automaton A at the end of its computation on the input word $a^k$ is described by the vector $\zeta U^k$, which has the following meaning (recall that η is the characteristic vector of the final states of A):
A is a 1dfa: $\zeta U^k$ is the characteristic vector of the state reached by A at the end of the computation on $a^k$. Thus, the product $\zeta U^k \eta$ returns 1 if the reached state is accepting, and 0 otherwise. We say that A accepts $a^k$ whenever $\zeta U^k \eta = 1$.
A is a 1nfa: $\zeta U^k$ is the characteristic vector of the set of states reached by A at the end of the computation on $a^k$. Thus, the product $\zeta U^k \eta$ returns the number of reached accepting states. We say that A accepts $a^k$ whenever $\zeta U^k \eta \ge 1$.
A is a 1pfa: ζU k is a stochastic vector whose ith component represents the probability that A reaches the state q i at the end of the computation on a k . Thus, the product p A (a k ) = ζU k η returns the probability for A to reach an accepting state at the end of the computation on a k , i.e., the probability that A accepts a k .
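If A is a unary 1dfa or 1nfa, then the accepted language is defined as
$$L_A = \{a^k \mid k \in \mathbb{N} \text{ and } \zeta U^k \eta \ge 1\}.$$
Let A be a unary 1pfa. The language accepted by A with cut point λ is defined as
$$L_{A,\lambda} = \{a^k \mid k \in \mathbb{N} \text{ and } p_A(a^k) > \lambda\}.$$
As recalled above, the unary 1pfa A accepts a unary language $L \subseteq a^*$ with isolated cut point λ whenever $L = L_{A,\lambda}$ and there exists ρ > 0 such that $|p_A(a^k) - \lambda| \ge \rho$ for every $k \in \mathbb{N}$.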
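As an illustration of the matrix presentation, the following minimal sketch builds a triple (ζ, U, η) for an m-state 1dfa accepting the words $a^k$ with k mod m = 0 and evaluates $\zeta U^k \eta$. It assumes the usual cyclic automaton with $q_1$ both initial and accepting; the function names are hypothetical and chosen only for this example.

```python
def cyclic_dfa(m):
    """Matrix presentation (zeta, U, eta) of an m-state cyclic 1dfa (assumed layout)."""
    zeta = [1] + [0] * (m - 1)   # characteristic row vector of the initial state q_1
    eta = [1] + [0] * (m - 1)    # q_1 is the only accepting state
    # U[i][j] = 1 iff the automaton moves from q_{i+1} to q_{j+1} upon reading a
    U = [[1 if j == (i + 1) % m else 0 for j in range(m)] for i in range(m)]
    return zeta, U, eta


def vec_mat(v, M):
    """Row vector times matrix."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]


def accepts(zeta, U, eta, k):
    """Evaluate zeta U^k eta and apply the 1dfa/1nfa acceptance rule (value >= 1)."""
    v = zeta
    for _ in range(k):           # one multiplication by U per input symbol
        v = vec_mat(v, U)
    return sum(x * y for x, y in zip(v, eta)) >= 1


zeta, U, eta = cyclic_dfa(5)
accepted = [k for k in range(12) if accepts(zeta, U, eta, k)]
print(accepted)                  # lengths k below 12 with k mod 5 == 0
```

The same `accepts` routine also covers a 1nfa, since only the content of U changes.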
For the sake of completeness, we point out that when investigating the descriptional power of unary finite automata, we get size estimations which are slightly different from those quoted above for finite automata working on general input alphabets. Thus, e.g., it is known that $e^{\Theta(\sqrt{n \log n})}$ states are necessary and sufficient for 1dfa's to simulate unary 1nfa's [48]. The same exponential blowup is proved in Refs. [49,50] for simulating unary 2dfa's and 2nfa's by 1dfa's. A "similar" exponential gap is also proved for simulating unary 1pfa's by 1dfa's; however, for this latter simulation the question should be stated more carefully, and we refer the reader to Refs. [51,52] for complete details. Finally, as previously recalled, we have that isolated cut point unary 2pfa's accept all and only regular languages, but their exact descriptional power is still an open question.

Footnote 1: The definition of a 1pfa previously given admits a single initial state $q_0$ instead of assigning to each control state the probability of being initial. It can be shown that the two definitions of a 1pfa are actually equivalent from both a computational and a descriptional point of view.

B. Basics of linear algebra
We briefly recall some basic notions of linear algebra (see, e.g., Ref. [53]) that are useful in the quantum picture and, in particular, to define the model of quantum finite automata. We denote by $\mathbb{C}$ the field of complex numbers. Given a complex number $z \in \mathbb{C}$, its conjugate is denoted by $\overline{z}$ and its modulus by $|z| = \sqrt{z\overline{z}}$. The set of n × m matrices having entries in $\mathbb{C}$ is denoted by $\mathbb{C}^{n\times m}$. For matrices $C \in \mathbb{C}^{n\times m}$ and $D \in \mathbb{C}^{m\times r}$, their product is the matrix $(CD)_{ij} = \sum_{k=1}^{m} C_{ik} D_{kj}$ in $\mathbb{C}^{n\times r}$. The adjoint of a matrix $M \in \mathbb{C}^{n\times m}$ is the matrix $M^\dagger \in \mathbb{C}^{m\times n}$ with $(M^\dagger)_{ij} = \overline{M_{ji}}$. A Hilbert space of dimension n is the linear space $\mathbb{C}^{1\times n}$ (in what follows denoted by $\mathbb{C}^n$ for short) equipped with sum and product by elements in $\mathbb{C}$, in which, for any vectors $\zeta, \xi \in \mathbb{C}^n$, the inner product $\langle \zeta, \xi \rangle = \zeta \xi^\dagger$ is defined. If $\langle \zeta, \xi \rangle = 0$, we say that ζ and ξ are orthogonal. The norm of a vector ζ is defined as $\|\zeta\| = \sqrt{\langle \zeta, \zeta \rangle}$. If ζ and ξ are orthogonal and $\|\zeta\| = 1 = \|\xi\|$, then ζ and ξ are said to be orthonormal. Two subspaces X, Y in $\mathbb{C}^n$ are orthogonal if every vector in X is orthogonal to every vector in Y; in this case, the linear space generated by X ∪ Y is denoted by $X \oplus Y$.
A matrix $M \in \mathbb{C}^{n\times n}$ is said to be unitary whenever $MM^\dagger = I = M^\dagger M$, where $I \in \mathbb{C}^{n\times n}$ is the identity matrix. Equivalently, M is unitary if and only if it preserves the norm, i.e., $\|\zeta M\| = \|\zeta\|$ for every $\zeta \in \mathbb{C}^n$. The eigenvalues of unitary matrices are complex numbers of modulus 1, i.e., they are of the form $e^{i\vartheta}$, for some real ϑ. A matrix $O \in \mathbb{C}^{n\times n}$ is said to be Hermitian whenever $O = O^\dagger$. Let $c_1, \ldots, c_s$ be the eigenvalues of the Hermitian matrix O and $E_1, \ldots, E_s$ be the corresponding eigenspaces. It is well known that (i) each eigenvalue $c_k$ is real, (ii) $E_i$ is orthogonal to $E_j$ for every $i \ne j$, and (iii) $E_1 \oplus \cdots \oplus E_s = \mathbb{C}^n$. Each vector $\zeta \in \mathbb{C}^n$ can be uniquely decomposed as $\zeta = \zeta_1 + \cdots + \zeta_s$, where $\zeta_j \in E_j$. The linear transformation $\zeta \mapsto \zeta_j$ is the projector $P_j \in \mathbb{C}^{n\times n}$ on the subspace $E_j$. It is easy to see that $\sum_{j=1}^{s} P_j = I$. A Hermitian matrix O is uniquely determined by its eigenvalues and its eigenspaces (or, equivalently, by its projectors). In fact, we have $O = c_1 P_1 + \cdots + c_s P_s$.
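As a small worked example of the decomposition of a Hermitian matrix into projectors, consider the 2 × 2 matrix O below. The choice of O is ours and serves only as an illustration, using the row-vector convention ζ ↦ ζP adopted in this section.

```latex
O = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = O^\dagger,
\qquad c_1 = 1,\quad c_2 = -1,
\qquad E_1 = \mathrm{span}\{(1, 1)\},\quad E_2 = \mathrm{span}\{(1, -1)\},
\\[4pt]
P_1 = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},
\qquad
P_2 = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix},
\qquad
P_1 + P_2 = I,
\qquad
c_1 P_1 + c_2 P_2 = P_1 - P_2 = O .
```

For instance, ζ = (1, 0) decomposes as $\zeta P_1 + \zeta P_2 = \tfrac{1}{2}(1, 1) + \tfrac{1}{2}(1, -1)$, and the two components are orthogonal.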

C. Axiomatic for quantum mechanics in short
Here, we use the elements of linear algebra discussed so far to describe quantum systems (see, e.g., Refs. [54,55] for detailed expositions). Given a set $Q = \{q_1, \ldots, q_m\}$ of basis states, every $q_i$ can be represented by its characteristic vector $e_i \in \{0, 1\}^m$ having 1 at the ith position and 0 elsewhere. A quantum state on Q is a superposition $\zeta \in \mathbb{C}^m$ of basis states of the form $\zeta = \sum_{k=1}^{m} \alpha_k e_k$, with coefficients $\alpha_k$ being complex amplitudes satisfying $\|\zeta\| = 1$. Given an alphabet $\Sigma = \{a_1, \ldots, a_l\}$ of events, with every event symbol $a_k$ we associate a unitary transformation $U(a_k) : \mathbb{C}^m \to \mathbb{C}^m$. An observable is described by a Hermitian matrix $O = c_1 P_1 + \cdots + c_s P_s$. Suppose that at a given instant a quantum system is described by the quantum state ζ. Then, we can perform the following operations:
(1) Evolution by the event $a_j$. The new state $\xi = \zeta U(a_j)$ is reached. This dynamics is reversible, meaning that $\zeta = \xi U^\dagger(a_j)$.
(2) Measurement of O. Every outcome in $\{c_1, \ldots, c_s\}$ can be obtained. The outcome $c_j$ is obtained with probability $\|\zeta P_j\|^2 = \langle \zeta P_j, \zeta P_j \rangle$, and the state of the quantum system after observing such a measurement collapses to the superposition $\zeta P_j / \|\zeta P_j\|$. The state transformation induced by a measurement is typically irreversible.
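The two operations above can be sketched numerically as follows. This is a minimal illustration on a two-dimensional system, assuming row-vector states, a real rotation as the unitary evolution, and the projectors onto the two basis states as the observable; the helper names are hypothetical.

```python
import math


def vec_mat(v, M):
    """Row vector times matrix (state evolution or projection)."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]


def norm(v):
    """Euclidean norm; abs() also handles complex amplitudes."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))


def measure(zeta, projectors):
    """For each projector P_j return (probability ||zeta P_j||^2, collapsed state)."""
    results = []
    for P in projectors:
        projected = vec_mat(zeta, P)
        p = norm(projected) ** 2
        # the collapsed state is defined only for outcomes with non-zero probability
        post = [x / norm(projected) for x in projected] if p > 1e-15 else None
        results.append((p, post))
    return results


theta = math.pi / 6                                  # illustrative rotation angle
U = [[math.cos(theta), math.sin(theta)],
     [-math.sin(theta), math.cos(theta)]]
P1 = [[1, 0], [0, 0]]                                # projector onto the first basis state
P2 = [[0, 0], [0, 1]]                                # projector onto the second basis state

xi = vec_mat([1.0, 0.0], U)                          # evolution: xi = zeta U
outcomes = measure(xi, [P1, P2])
for p, post in outcomes:
    print(round(p, 4), post)
```

The outcome probabilities sum to 1 because $P_1 + P_2 = I$ and the evolution preserves the norm.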

D. One-way unary quantum finite automata
Several models of one-way (fully) quantum finite automata are proposed in the literature. Basically, they differ in measurement policy [22,24,25,44]. In this paper, we consider the simplest model of one-way quantum automata, called measure once [18–21]. We focus on the unary case, i.e., automata having a single-letter input alphabet Σ = {a}. The definition of one-way quantum automata on a general alphabet follows straightforwardly. As done in Sec. II A for classical models of unary one-way finite automata, we are going to provide a matrix presentation of unary one-way quantum finite automata.
A unary measure-once one-way quantum finite automaton (1qfa, for short) with n basis states, some of which are designated as accepting states, is formally defined by the triple A = (ζ, U, P), where the following hold:
(i) $\zeta \in \mathbb{C}^n$, with $\|\zeta\| = 1$, is the initial superposition of basis states.
(ii) $U \in \mathbb{C}^{n\times n}$ is a unitary transition matrix with $U_{ij}$ being the amplitude that A moves from the basis state $q_i$ to the basis state $q_j$ upon reading a, so that $|U_{ij}|^2$ is the probability of such a transition.
(iii) $P \in \mathbb{C}^{n\times n}$ is the projector onto the accepting subspace, i.e., the subspace of $\mathbb{C}^n$ spanned by the accepting basis states. The projector P represents the observable O = 1 · P + 0 · (I − P).
At the end of the computation on the input word $a^k$, the state of A is described by the final superposition $\zeta U^k$. At this point, the observable O is measured, and A is observed in an accepting basis state with probability $p_A(a^k) = \|\zeta U^k P\|^2$.
This is the probability that A accepts a k . The definition of the unary language L A,λ accepted by A with cut point λ and the notion of a unary language accepted by A with isolated cut point are identical to those provided in Sec. II A for the model of unary 1pfa's.
The designation "measure once" given to the model of 1qfa above introduced is due to the fact the observation for acceptance is performed only once, at the end of input processing. Throughout the rest of the paper, for the sake of brevity, by 1qfa we will mean "measure-once 1qfa," unless otherwise stated.
Several contributions in the literature show that, surprisingly enough, isolated cut point 1qfa's are less powerful than classical models of one-way finite automata. In fact, Refs. [18,20,21] prove the following: Theorem 6. The class of languages on general alphabets accepted by isolated cut point 1qfa's coincides with the class of group languages [56], a proper subclass of regular languages.
By restricting to unary alphabets, the computational power of isolated cut point 1qfa's remains strictly lower than that of classical devices. On the other hand, it is proved in Ref. [61] that the class of unary languages accepted by "measure-many" isolated cut point 1qfa's coincides with the class of unary regular languages. Roughly speaking, a measure-many 1qfa [24,25,57] is defined as a measure-once 1qfa, but the observation for acceptance is performed at each step along the computation.

III. THEORETICAL DESIGN OF A SMALL QUANTUM FINITE AUTOMATON
Although computationally weaker, 1qfa's may greatly outperform classical devices when size (customarily measured by the number of basis states) is considered (see, e.g., Refs. [18,30,33,34,36,38,62–66]). To prove this fact, we test the descriptional power of several models of classical and quantum one-way finite automata on the very simple benchmark language introduced in Example 2: For any given integer m > 0, we consider the unary language
$$L_m = \{a^k \mid k \in \mathbb{N} \text{ and } k \bmod m = 0\}. \qquad (6)$$
Despite its simplicity, this language proves to be particularly size consuming on the classical models of one-way finite automata, as shown in the following:
Theorem 7. For any integer m > 0, let $m = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_s^{\alpha_s}$ be its integer factorization, for primes $p_i$ and positive integers $\alpha_i$. To accept the language $L_m$, the following numbers of states are necessary and sufficient: (i) m states on 1{d,n}fa's, and (ii) $p_1^{\alpha_1} + p_2^{\alpha_2} + \cdots + p_s^{\alpha_s}$ states on 2{d,n}fa's and isolated cut point 1pfa's.
Proof. (i) In Example 2, an m-state 1dfa (which is clearly a particular 1nfa) for L m is provided. The fact that m states are necessary for any 1{d,n}fa to accept L m can be easily obtained by using the pumping lemma for regular languages [39,40].
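To get a feeling for the two state counts in Theorem 7, the following sketch computes them for a few values of m. It is only an arithmetic illustration (trial-division factorization, function names of our choosing) and assumes m ≥ 2.

```python
def prime_power_factors(m):
    """Return the prime powers p_i^alpha_i in the factorization of m >= 2."""
    factors = []
    p = 2
    while p * p <= m:
        if m % p == 0:
            q = 1
            while m % p == 0:    # collect the full power of p dividing m
                m //= p
                q *= p
            factors.append(q)
        p += 1
    if m > 1:                    # what is left is a prime factor with exponent 1
        factors.append(m)
    return factors


def classical_state_counts(m):
    """State counts of Theorem 7: (case (i), case (ii))."""
    return m, sum(prime_power_factors(m))


for m in (12, 23, 30, 360):
    print(m, classical_state_counts(m))
```

For a prime m such as 23 the two counts coincide, while for highly composite m the second one is much smaller; in both cases the count grows with m.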
By adopting the quantum paradigm, we can obtain isolated cut point 1qfa's for L m of incredibly small size.
Theorem 8. For any integer m > 0, the language $L_m$ can be accepted by an isolated cut point 1qfa with two basis states.

FIG. 5. Scheme of the 1qfa A accepting the language $L_m$: Given the initial automaton state ζ and the input word $a^k$, the automaton outputs "1" (accepted) or "0" (not accepted). See the text for details.

Proof. We define the 1qfa $A = (\zeta, U_m, P)$ with two basis states as
$$\zeta = (1, 0), \qquad U_m = \begin{pmatrix} \cos(\pi/m) & \sin(\pi/m) \\ -\sin(\pi/m) & \cos(\pi/m) \end{pmatrix}, \qquad P = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. \qquad (7)$$
One may easily verify that $U_m$ is a unitary matrix and that
$$U_m^k = \begin{pmatrix} \cos(\pi k/m) & \sin(\pi k/m) \\ -\sin(\pi k/m) & \cos(\pi k/m) \end{pmatrix}. \qquad (8)$$
Straightforward calculations show that the probability that A accepts the word $a^k$ amounts to
$$p_A(a^k) = \|\zeta U_m^k P\|^2 = \cos^2(\pi k/m). \qquad (9)$$
In other words, our 1qfa A accepts with certainty the words in $L_m$, while the acceptance probability for the words not in $L_m$ is bounded above by $\cos^2(\pi/m) < 1$. So, we can set the cut point $\lambda = [1 + \cos^2(\pi/m)]/2$ and isolation $\rho = [1 - \cos^2(\pi/m)]/2$, and conclude that $L_m$ is accepted by the 1qfa A with two basis states and cut point λ isolated by ρ.
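The acceptance probability used in the proof can be checked numerically. The sketch below applies the rotation matrix once per input symbol, starting from ζ = (1, 0) and projecting onto the first basis state, and compares the result with $\cos^2(\pi k/m)$. It assumes m ≥ 2; the function names are hypothetical.

```python
import math


def qfa_accept_probability(m, k):
    """p_A(a^k) = ||zeta U^k P||^2 for the two-state 1qfa, computed step by step."""
    c, s = math.cos(math.pi / m), math.sin(math.pi / m)
    U = [[c, s], [-s, c]]
    zeta = [1.0, 0.0]
    for _ in range(k):                       # one application of U per input symbol
        zeta = [zeta[0] * U[0][0] + zeta[1] * U[1][0],
                zeta[0] * U[0][1] + zeta[1] * U[1][1]]
    return zeta[0] ** 2                      # P keeps only the first component


def cut_point_and_isolation(m):
    """Cut point and isolation chosen in the proof of Theorem 8."""
    c2 = math.cos(math.pi / m) ** 2
    return (1 + c2) / 2, (1 - c2) / 2


m = 5
lam, rho = cut_point_and_isolation(m)
for k in range(2 * m + 1):
    p = qfa_accept_probability(m, k)
    assert abs(p - math.cos(math.pi * k / m) ** 2) < 1e-9
    print(k, round(p, 4), "accepted" if p > lam else "rejected")
```

Only the lengths that are multiples of m exceed the cut point, and every probability stays at distance at least ρ from it.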
In Fig. 5, we depict the 1qfa A of Eq. (7) in order to highlight the input word a k , the initial automaton state ζ , the unitary operator U m , and the measurement described by the projector P.
It is worth noting that the isolation $\rho = [1 - \cos^2(\pi/m)]/2$ around the cut point of the 1qfa A of Eq. (7) tends to 0 for m → +∞. Hence, as m grows, so does the error probability, i.e., with high probability A may erroneously accept (reject) words not in $L_m$ (words in $L_m$). To overcome this lack of precision, several modular design frameworks have been proposed in the literature, aiming at enlarging cut point isolation at the price of increasing the number of basis states [25,31–38]. Within these frameworks, for any desired isolation ρ > 0, a 1qfa can be theoretically defined, which accepts $L_m$ with cut point isolated by ρ and features $O\!\left(\frac{\log m}{\rho}\right)$ basis states. Although the number of basis states now depends on m, it still remains exponentially lower than the number of states of equivalent classical one-way finite automata displayed in Theorem 7. In addition, the proposed $O\!\left(\frac{\log m}{\rho}\right)$-state 1qfa turns out to be the smallest possible. In fact, in Ref. [34] it is proved that any 1qfa accepting $L_m$ with cut point isolation ρ must have at least $\frac{\log m}{\log[1 + 2/\rho]}$ basis states. It should be stressed that all the design frameworks proposed in the literature, aiming to build extremely succinct 1qfa's not only for $L_m$ but also for more general families of languages, use the simple 1qfa A of Eq. (7) as a crucial building block. Within these frameworks, the 1qfa A is suitably composed in a modular pattern by using traditional compositions (i.e., direct product and sum of quantum systems), in order to enhance precision in language recognition. In particular, from this perspective, a physical realization of the 1qfa A is not only interesting per se but it may provide a concrete computational component upon which to physically design more sophisticated and precise 1qfa's by traditional compositions of quantum systems.
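How fast the isolation of the two-state 1qfa shrinks can be made explicit with a short expansion. This is a side calculation added for illustration, with the numerical value for m = 23 rounded to two significant digits.

```latex
\rho = \frac{1 - \cos^2(\pi/m)}{2} = \frac{\sin^2(\pi/m)}{2}
     = \frac{\pi^2}{2 m^2} + O\!\left(m^{-4}\right),
\qquad\text{e.g.,}\quad m = 23:\ \ \rho \approx 0.0093 .
```

So the isolation decays quadratically in 1/m, which is why larger values of m call for either more basis states or, in an experiment, more detected photons.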

IV. PHOTONIC IMPLEMENTATION OF THE QUANTUM FINITE AUTOMATON
In this section, we describe the physical implementation of the 1qfa A of Eq. (7). The experimental realization is based on the polarization degree of freedom of single photons and their manipulation through suitable rotators of polarization. For the sake of clarity, before discussing the physical implementation, we will summarize in the following the basic formalism used to describe this kind of quantum system.

A. The Dirac formalism
In order to describe the physical implementation of the 1qfa A of Eq. (7) accepting the language $L_m$, it is useful to review the standard notation for quantum mechanics introduced by Dirac [54]. This will help the reader to easily pass from the notation used in the previous sections to the one we will use in the following. In this notation, the state "ψ" of a quantum system is described by the symbol $|\psi\rangle$ which is, in general, a complex column vector in a Hilbert space. In the present work, we are interested in the (linear) polarization state of a single photon; therefore, only the two basis states $|H\rangle$ and $|V\rangle$, referring to the horizontal (H) and vertical (V) polarization, respectively, are needed. Indeed, because of the very law of quantum mechanics, any normalized linear combination of these two vectors represents a quantum state. For instance, a single photon polarized at an angle θ with respect to the horizontal is described by the state vector
$$|\theta\rangle = \cos\theta\,|H\rangle + \sin\theta\,|V\rangle. \qquad (10)$$
Since we are in the presence of only two basis states, we can give a geometrical representation of them and of the corresponding spanned space, as shown in Fig. 6(a). In this formalism, the correspondence is clear:
$$|H\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad\text{and}\quad \langle H| = (|H\rangle)^\dagger = \zeta = (1, 0), \qquad (11)$$
where ζ is the same state introduced in Eq. (7). Analogously, we have
$$|V\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad\text{and}\quad \langle V| = (|V\rangle)^\dagger = (0, 1).$$
In Sec. II B, we introduced the inner product $\langle \zeta, \xi \rangle = \zeta \xi^\dagger$ between the states ζ and ξ. Using the Dirac formalism, we have […] where we also used the orthonormality of the involved states.
If we now introduce the projectors
$$\Pi_H = |H\rangle\langle H| = P \quad\text{and}\quad \Pi_V = |V\rangle\langle V| = I - P,$$
where P is the same as in Eq. (7), and given the state $|\theta\rangle$, with $(|\theta\rangle)^\dagger = \vartheta$, we have
$$p_H = \langle \vartheta P, \vartheta P \rangle = \langle\theta|\Pi_H|\theta\rangle = |\langle H|\theta\rangle|^2 = \cos^2\theta, \qquad (16a)$$
$$p_V = \langle \vartheta (I - P), \vartheta (I - P) \rangle = \langle\theta|\Pi_V|\theta\rangle = |\langle V|\theta\rangle|^2 = \sin^2\theta, \qquad (16b)$$
where we used $\Pi_J^2 = \Pi_J$, with J ∈ {H, V}, and $\langle a|b\rangle = \overline{\langle b|a\rangle}$. The geometrical meanings of $p_H$ and $p_V$ are reported in Fig. 6(b), where, from the physical point of view, they correspond to the probability of finding the photon with horizontal or vertical polarization, respectively.
In the context of the polarization of single photons, the analog of the unitary operator $U_m$ defined in Eq. (7) is the operator R(π/m), a rotator of polarization that rotates the polarization of the photons by an amount π/m. We can write $R(\pi/m) = U_m^\dagger$. Thereafter, the one-step evolution of the state $|H\rangle = \zeta^\dagger$ reads
$$R(\pi/m)\,|H\rangle = \cos(\pi/m)\,|H\rangle + \sin(\pi/m)\,|V\rangle. \qquad (17)$$

B. Photonic quantum automaton
In Fig. 7, we depict the basic elements of the photonic quantum automaton implementing the 1qfa A of Eq. (7) accepting the language $L_m$. Given the input word $a^k$ (see also Fig. 5), a single photon, generated in the state $|H\rangle$, is sent through k rotators of polarization, where each rotator applies a rotation of a fixed amount π/m. It is worth noting that in order to actually reproduce the computation of a 1qfa, a single rotation should be applied step by step upon reading each input symbol, since the input word length is not known in advance. After the rotators, the single photon is sent to a polarizing beam splitter (PBS), a device which transmits (reflects) the horizontal (vertical) polarization component of the input state. Since after the rotators the state of the photon is $|\theta\rangle$, given in Eq. (10), with θ = πk/m, it is detected by the H or V detector (see Fig. 7) with the probabilities given in Eqs. (16). It is worth noting that, as expected, $p_H(k)$ is equal to the automaton acceptance probability $p_A(a^k)$; see Eq. (9). As mentioned in Theorem 8, this kind of automaton accepts with certainty the word $a^k$ if k mod m = 0, but it also has a high probability of erroneously accepting the word if k mod m = 1. In fact, in this case, $p_H(k)$ attains $\cos^2(\pi/m)$, its maximum over the words not in $L_m$.

FIG. 7. Sketch of the photonic implementation of the 1qfa A accepting the language $L_m$. Single photons are generated in the polarization state $|H\rangle$ and then they pass through k polarization rotators, k being the length of the input word $a^k$. Each rotator implements the operator R(π/m) rotating the polarization by the amount π/m: The overall polarization rotation is θ = πk/m. Finally, the photons are addressed to two photodetectors by means of a polarizing beam splitter (PBS) according to their horizontal (H) or vertical (V) polarization.

To reduce the error probability, one can send $M = N_c(m)$ copies of the same input word $a^k$, collect the number $N_c(k)$ of counts at the detector H, and evaluate the ratio
$$f_k = \frac{N_c(k)}{N_c(m)}. \qquad (18)$$
In this scenario, we let $f_1 = f_{(k \bmod m) = 1}$ be the highest frequency less than $f_0 = f_{(k \bmod m) = 0} = 1$. That is, $f_1$ is the highest frequency for words that are erroneously accepted (those words $a^k$ for which k mod m = 1), and $f_0$ is the frequency of those words that are correctly accepted (those words $a^k$ for which k mod m = 0). Thus, we can define the threshold frequency
$$f_{th} = \frac{f_0 + f_1}{2}, \qquad (19)$$
and we use the following strategy:
$$\text{if } f_k > f_{th}, \text{ then } a^k \text{ is accepted; otherwise, } a^k \text{ is rejected.} \qquad (20)$$
It is clear that such a strategy leads to a zero error probability; namely, all and only the words in $L_m$ can have $f_k > f_{th}$. However, in a realistic scenario the number of detected photons is subject to Poisson statistical fluctuations, due to the very nature of the detection process [67]. So, given the word $a^k$, the number of detected counts $N_c(k)$ fluctuates according to a Poisson distribution with mean $\mu_k = \langle N_c \rangle \cos^2(\pi k/m)$, where $\langle N_c \rangle$ is the average number of detected photons obtained for k mod m = 0. Because of these fluctuations, the counts can exceed (resp., fall below) the threshold also for a word $a^k$ not belonging (resp., belonging) to the language $L_m$, leading to a non-null experimental acceptance error probability $p_{err}$. If we assume $\mu_1 = \langle N_c \rangle \cos^2(\pi/m) \gg 1$, the distribution of the detected number of counts for k mod m = 1 can be approximated by a Gaussian distribution function with mean and variance given by the same value $\mu_1$. Analogously, for k mod m = 0 we have a Gaussian distribution with mean and variance equal to $\mu_0 = \langle N_c \rangle$. Now we can find a more suitable threshold $N_{th}$ of the detected counts by considering the intersection between the two Gaussians, namely
$$N_{th} = \langle N_c \rangle \cos(\pi/m) \sqrt{1 - \frac{\ln[\cos^2(\pi/m)]}{\langle N_c \rangle \sin^2(\pi/m)}}, \qquad (21)$$
and the corresponding discrimination strategy reads
$$\text{if } N_c(k) \ge N_{th}, \text{ then } a^k \text{ is accepted; otherwise, } a^k \text{ is rejected.} \qquad (22)$$
The experimental error probability is thus given by (we consider only the two relevant contributions) […] (23), where $1 \ll \langle N_c \rangle \cos^2(\pi/m) = \mu_1 < N_{th} < \mu_0 = \langle N_c \rangle$.
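The threshold of Eq. (21) and the two Gaussian tails that contribute to the error can be evaluated with a few lines of code. The sketch below is an illustration under stated assumptions: the two tails are combined with equal weights, which is our choice and not necessarily the exact form of Eq. (23), and the function names are hypothetical. With the average counts reported for m = 23 later in the paper, this choice gives values close to the quoted error probabilities.

```python
import math


def count_threshold(n_avg, m):
    """Intersection of the two Gaussians (mean = variance), as in Eq. (21); needs m > 2."""
    c2 = math.cos(math.pi / m) ** 2
    s2 = 1.0 - c2
    return n_avg * math.sqrt(c2) * math.sqrt(1.0 - math.log(c2) / (n_avg * s2))


def gaussian_tail(distance, mu):
    """Probability that a Gaussian with variance mu deviates from its mean by more than `distance` on one side."""
    return 0.5 * math.erfc(distance / math.sqrt(2.0 * mu))


def tail_probabilities(n_avg, m):
    """Return (N_th, false-accept tail for k mod m = 1, false-reject tail for k mod m = 0)."""
    n_th = count_threshold(n_avg, m)
    mu0 = n_avg
    mu1 = n_avg * math.cos(math.pi / m) ** 2
    false_accept = gaussian_tail(n_th - mu1, mu1)   # counts above threshold for a word not in L_m
    false_reject = gaussian_tail(mu0 - n_th, mu0)   # counts below threshold for a word in L_m
    return n_th, false_accept, false_reject


for n_avg in (18439, 56477):
    n_th, fa, fr = tail_probabilities(n_avg, 23)
    p_err = 0.5 * (fa + fr)                          # assumption: equal weights for the two cases
    print(n_avg, round(n_th, 1), round(100 * p_err, 1))
```

Raising the average number of counts moves both means away from the threshold in units of the standard deviation, which is what drives the error down.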
We note that p err corresponds to the probability of accepting (resp., rejecting) the word a k whenever it should be rejected (resp., accepted). In Fig. 8, we plot the error probability for different values of m: As one may expect, as m increases so should the average number of counts N c in order to have a small error probability.
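The threshold of Eq. (21) can be recovered in a few steps from the two Gaussian approximations with mean and variance $\mu_0$ and $\mu_1$. The derivation below is a sketch added for the reader's convenience; the symbol $G_j$ for the Gaussian densities is our notation.

```latex
G_j(x) = \frac{1}{\sqrt{2\pi\mu_j}}\exp\!\left[-\frac{(x-\mu_j)^2}{2\mu_j}\right],\quad j = 0, 1,
\qquad
G_0(N_{th}) = G_1(N_{th})
\;\Longrightarrow\;
\frac{(N_{th}-\mu_0)^2}{\mu_0} + \ln\mu_0 = \frac{(N_{th}-\mu_1)^2}{\mu_1} + \ln\mu_1 .
\\[6pt]
\text{Since } \frac{(x-\mu)^2}{\mu} = \frac{x^2}{\mu} - 2x + \mu :
\qquad
N_{th}^2\left(\frac{1}{\mu_1} - \frac{1}{\mu_0}\right) = \mu_0 - \mu_1 + \ln\frac{\mu_0}{\mu_1}
\;\Longrightarrow\;
N_{th}^2 = \mu_0\mu_1\left[1 + \frac{\ln(\mu_0/\mu_1)}{\mu_0 - \mu_1}\right].
\\[6pt]
\text{With } \mu_0 = \langle N_c\rangle,\ \mu_1 = \langle N_c\rangle\cos^2(\pi/m):
\qquad
N_{th} = \langle N_c\rangle\cos(\pi/m)\sqrt{1 - \frac{\ln[\cos^2(\pi/m)]}{\langle N_c\rangle\sin^2(\pi/m)}} .
```

For large $\langle N_c \rangle$ the square root is close to 1, so the threshold sits only slightly above the geometric mean $\sqrt{\mu_0 \mu_1} = \langle N_c \rangle \cos(\pi/m)$.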

V. EXPERIMENTAL RESULTS
The main elements of our physical implementation of the 1qfa A accepting the language $L_m$ are sketched in Fig. 7. However, in order to reduce the losses and other sources of noise, in the actual setup we replace the action of the k polarization rotators on the input word $a^k$ by using a single rotator applying an overall rotation of θ = πk/m, which "simulates" the whole computation of the 1qfa: For this reason, we will refer to our system as a photonic quantum simulator [68] of the quantum automaton. As mentioned in the previous section, an actual 1qfa does not have an a priori knowledge about the length k of the input word. In fact, it reads the input word symbol by symbol while applying a rotation π/m per each scanned input symbol a. Practically, this can be implemented, for instance, by a motorized rotator of polarization, but this is beyond the scope of the present work. Nevertheless, it is worth noting that a more advanced technology, e.g., based on integrated optics or optoelectronics, can be used to realize the very setup of Fig. 7.

FIG. 9. Schematic diagram of the experimental setup. A 405-nm continuous wave (cw) laser diode (L) generates a pump beam which passes through an amplitude modulator, composed of a half-wave plate (λ/2) and a polarizing beamsplitter cube (PBS), and through another half-wave plate to set the polarization. The beam interacts with a 1-mm-long barium borate (BBO) crystal generating photons at 810 nm via parametric down conversion (PDC). The two beams separated by the horizontal plane are called signal and idler: On the signal's branch there are two polarizers (P) separated by a half-wave plate. Photons are finally focused into two multimode fibers through two couplers (C), and sent to homemade single-photon counting modules.
The experimental setup is shown in Fig. 9.
(1) The pump derives from a 405-nm cw InGaN laser diode, which we chose in order to use silicon detectors, the ones with the lowest noise on the market: Indeed, these work with maximum quantum efficiency at 810 nm, which is the same wavelength as that of the photons generated via parametric down conversion (PDC) from a 405-nm pump.
(2) The laser beam passes through an amplitude modulator composed of a half-wave plate and a polarizing beamsplitter cube (PBS), and then through another half-wave plate to set the polarization vertical with respect to the optical bench.
(3) The interaction between the pump and a 1-mm-long BBO crystal generates photons at 810 nm with horizontal polarization, along the surface of a cone, via type-I-eoo PDC: For this purpose, the optical axis of the crystal is on the vertical plane at the phase-matching angle.
(4) The intersection of the cone with the horizontal plane distinguishes two beams (branches): the signal and the idler. It is possible to finely tune the angle of the outgoing photons by properly rotating the principal axis of the BBO.

FIG. 10. […] (21). In the plots, we also report the theoretical error probability from Eq. (23). The reduction of the relative statistical fluctuations is evident.

(5) Along the signal branch, a polarizer ensures the transmission of the horizontally polarized photons, then a half-wave plate is used to simulate the k polarization rotators, and finally another horizontal polarizer transmits the photons to the detector. This last half-wave plate can be manually rotated and is equipped with graduations where a unit corresponds to 4° in polarization: By considering the working principle of the half-wave plate, this can be obtained by actually rotating the plate by 2°. Therefore, in general, in order to obtain a rotation in polarization of amount θ, one should rotate the plate by θ/2.
(6) On each branch, photons are finally focused into a multimode fiber and sent to a homemade single-photon counting module, based on an avalanche photodiode operated in Geiger mode with passive quenching [69]. We chose to measure the coincidence counts in order to obtain a better signal-to-noise ratio: Indeed, the photodiodes produce a thermal background such that approximately 1% of the direct counts are dark counts, while the coincidence dark counts are only 0.001% of the coincidence counts.
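The relation between the input word and the setting of the half-wave plate described in point (5) can be sketched as follows. The reduction of the angle modulo 180° is an assumption made here (a linear polarization direction is unchanged by a rotation of 180°), and the constant and function names are hypothetical.

```python
UNIT_POLARIZATION_DEG = 4.0   # one graduation of the plate = 4 degrees of polarization rotation


def plate_settings(k, m):
    """Return (polarization rotation, plate rotation, graduated units) for the word a^k."""
    theta_deg = (180.0 * k / m) % 180.0        # theta = pi*k/m in degrees, reduced modulo 180 (assumption)
    plate_deg = theta_deg / 2.0                # a half-wave plate rotated by theta/2 rotates polarization by theta
    units = theta_deg / UNIT_POLARIZATION_DEG  # position on the graduated scale
    return theta_deg, plate_deg, units


for k, m in ((1, 5), (5, 5), (1, 23)):
    theta_deg, plate_deg, units = plate_settings(k, m)
    print(k, m, round(theta_deg, 3), round(plate_deg, 3), round(units, 2))
```

For m = 23 the required setting is not an integer number of graduations, so in practice it has to be rounded to the nearest one.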
In Fig. 10, we show typical experimental results from our photonic simulator of the 1qfa A for the language $L_m$, with m = 5 (this choice allows us to better highlight the role of the statistical fluctuations of the detected number of photons). In this case, a single rotation of polarization (taking place, e.g., on the input word of length k = 1) has θ = 36°, which corresponds to rotating the half-wave plate on the signal's branch by 9 units [see point (5) in the above description of our experimental setup].
Here we only show the interesting results for input words $a^k$ of length k = 5 and k = 1. These two inputs, respectively representing a word in $L_5$ and one of the words not in $L_5$ most prone to misclassification, turn out to be critical for testing the accuracy of the discrimination strategy we use. Furthermore, in order to highlight the reduction of the statistical fluctuations, we plot the ratio $N_c(k)/\langle N_c \rangle$.

FIG. 11. Examples of the experimental number of counts $N_c(k)$ (dots) as a function of the length k of the input word $a^k$ (we have chosen 10 values of k randomly in the interval [1,500]). The horizontal lines refer to the threshold $N_{th}$ on the number of counts for the discrimination strategy (see the text for details): If $N_c(k) \ge N_{th}$ the word $a^k$ is accepted. The color of the vertical bars refers to the theoretical acceptance (green) or rejection (red) of the corresponding input $a^k$ by the 1qfa A accepting the language $L_m$, with m = 23. The average number of counts is $\langle N_c \rangle = 18439 \pm 114$ (top panel, corresponding to $N_{th} = 18267.5$) and $\langle N_c \rangle = 56477 \pm 244$ (bottom panel, leading to $N_{th} = 55951.5$). In both panels, the lower plots are a magnification of the region around the threshold $N_{th}$. (Top panel) In this case, the error probability evaluated from Eq. (23) is $p_{err} = 10.3\%$. The number 229 is accepted according to our strategy, since the number of counts (see the orange arrows) is larger than the threshold (horizontal solid line), but it should be rejected; on the other hand, the number 115 should be accepted but it is rejected since the number of counts (see the magenta arrows) is below the threshold. (Bottom panel) Here the error probability evaluated from Eq. (23) is $p_{err} = 1.3\%$. Now the number 229 is rejected and the number 115 is accepted, according to the definition of $L_m$. See the text for details.

Each point corresponds to the number of counts at the detector H when the average total number of counts is $\langle N_c \rangle$ = 36, 108, 479, and 1845, respectively, which can be obtained by varying the pump power through a rotation of the half-wave plate of the amplitude modulator. We repeated the experiments 50 times with an acquisition time of 1 s for each of the two values of k. It is clear that increasing $\langle N_c \rangle$ reduces the relative fluctuations and thus the error probability decreases accordingly.
To better appreciate the performance of our photonic simulator, we consider our 1qfa A accepting the language $L_m$, with m = 23. In this case, a single rotation of polarization (taking place, e.g., on the input word of length k = 1) has (approximately) θ = 8°, which corresponds to 2 units in the half-wave plate's scale [see point (5) in the above description of our experimental setup]. In the top panel of Fig. 11, we report examples of the number of counts $N_c(k)$ for just one given experimental run as a function of different values k of the length of the input word $a^k$ and for $\langle N_c \rangle = 18439$. The acquisition time of each point is 10 s. We can see that, due to the statistical fluctuations mentioned above, sometimes the automaton fails to classify the word correctly: This is the case for the input lengths 115 and 229, as discussed in the figure caption. We remark that in the latter cases we have chosen two particular experimental runs in which the automaton fails: If we had considered the average over many runs, we would have found that the automaton always succeeds on average, since the standard deviation, due to the statistical scaling, can be reduced at will. Of course, given a particular run, the error is independent of k, but depends only on the random, statistical fluctuations, which can be controlled by increasing $\langle N_c \rangle$, as we can see in the bottom panel of Fig. 11, where we report the results of the photonic simulator taking the same words of the top panel as inputs but with $\langle N_c \rangle = 56477$, obtained with an acquisition time of 30 s. This can be also understood by considering Eq. (23): The error probability is reduced from $p_{err} = 10.3\%$ for $\langle N_c \rangle = 18439$ of the previous case to the current $p_{err} = 1.3\%$ for $\langle N_c \rangle = 56477$.

VI. CONCLUSIONS
We have suggested and demonstrated a photonic realization of quantum finite automata able to recognize a well-known family of unary periodic languages. Our device exploits the polarization degree of freedom of single photons and their manipulation through linear optical elements. In particular, we have designed and implemented a one-way quantum finite automaton A accepting the unary language $L_m = \{a^k \mid k \in \mathbb{N} \text{ and } k \bmod m = 0\}$ with only two basis states and isolated cut point. Notice that any classical finite automaton for $L_m$ requires a number of states which grows with m. We have implemented the quantum finite automaton A using the polarization degree of freedom of a single photon and have exploited a discrimination strategy to reduce the acceptance error probability.
It is worth noting that, for the particular one-way quantum finite automaton we considered, we exploited only the polarization degree of freedom of (quantized) optical fields and photodetection. Therefore, one can implement a similar automaton also exploiting polarization of a classical coherent field (a laser beam) and intensity measurements. Nevertheless, our experiment uses single photons that are intrinsically quantum objects, and thus it paves the way for more complex quantum finite automata we are planning to address and which exploit genuine quantum resources, such as entanglement. In fact, the quantum technology employed in our implementation is the same used in the current quantum information processing setups based on optical states.
Besides being interesting in itself for fundamental reasons, our physical realization of the one-way quantum finite automaton A provides a concrete implementation of a small quantum computational component that can be used to physically build more sophisticated and precise quantum finite automata. Indeed, several modular design frameworks have been modeled and widely investigated from a theoretical point of view [24,25,30–38,65] to build succinct and precise quantum finite automata performing different tasks, where the module A plays a crucial role. Within these frameworks, by suitably assembling a sufficient number of A-like modules via traditional compositions of quantum systems (i.e., direct products and sums), the existence of succinct and precise quantum finite automata has been theoretically shown. From this perspective, our results are instrumental to a deeper understanding of possible physical implementations of these design frameworks by means of photonic technology and pave the way for the construction of other more powerful models of quantum finite automata.