1. Introduction
The Internet hosts many kinds of data and information that intersect and are stored on social networks, prompting many research fields to pay attention to them. Users on social networks obtain the resources they need by visiting web pages [1,2,3,4]. The key to studying social networks is to analyze how they are used. The analyzed data can be used to improve the social network itself, making it more convenient for users to browse the data they require; to analyze users’ preferences so that advertisements can be delivered at designated points; and to analyze user behavior and predict the transactions that users participate in [5,6]. One of the main ways to obtain these results is to perform extensive data analysis on the weblogs of these sites [7,8]. Every time a user requests a page or one of its resources, such as video or sound, a new record is added to the site’s weblog [9]. This record contains information about the user’s favourite pages (i.e., the most frequently visited pages), the sequence of visits to ordinary pages, and even hints at the user’s characteristics. This analysis method is called web usage mining (WUM) based on weblog information [10,11,12]. To use WUM, a sequence of interactions between a single user and a web page needs to be extracted. The resulting file should contain at least the following fields: the user’s IP address, the timestamp, the requested resource, the status code of the operation, the referrer (the previous web address before entering the page), and the browser used [13,14]. By analyzing this sequence, the pattern of the user’s access to the web page can be obtained.
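The fields listed above can be extracted from a raw server log line. The sketch below is a minimal illustration, assuming Apache's combined log format; the regular expression and field names are our own, not part of any WUM toolkit.

```python
import re

# Minimal parser for one line of an Apache-style combined log.
# Captures the fields needed for web usage mining: IP, timestamp,
# request, status code, referrer, and user agent.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

def parse_log_line(line):
    """Return a dict of WUM fields, or None if the line does not match."""
    m = LOG_PATTERN.match(line)
    return m.groupdict() if m else None

entry = parse_log_line(
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 2326 '
    '"http://example.com/start.html" "Mozilla/5.0"'
)
```

Grouping such entries by IP address is then the first step toward session division and access-pattern analysis.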
The value of big data is essentially that it provides new thinking and new means for humans to understand complex systems. In theory, a virtual digital image of the natural world, carrying its running rules, can be constructed by digitizing the real world on a sufficiently small scale of time and space. Given sufficient computing power and efficient data analysis methods, an in-depth analysis of this virtual digital image makes it possible to understand and discover the operating behavior, state, and laws of existing complex systems. Big data thus provides a new way of thinking and new means for exploring objective laws and for transforming nature and society. The high and increasing demand for big data mining and analysis pushes the field toward more efficient quantum research technology [15] to close the gap in technological capability, improve the efficiency of information extraction, and promote further scientific progress. In view of these requirements, this paper exploits the special advantages of quantum algorithms over traditional iterative algorithms [16] in the emerging field of quantum neural networks, and studies an autonomous neural network learning scheme [17] for weblog information extraction, highlighting the high-value prospects of the quantum field in big data mining and analysis.
This search problem can be addressed with Grover's quantum search algorithm [18], which uses the amplitude amplification approach of quantum computing to attain quadratic acceleration in unsorted search problems. Grover's algorithm has been efficiently simulated on classical computer systems using a quantum computer language (QCL). For an unordered database search, the Grover algorithm achieves quadratic acceleration compared with the classical algorithm, and analysis, induction, and variant research have been carried out on it [19,20]. Beyond the lower bound, Grover's methodology can also be used when the fraction of target terms is unknown. The fixed-point strategy and the trial-and-error method are the two quantum search strategies that can be implemented.
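The quadratic acceleration discussed above can be seen in a toy state-vector simulation of Grover's iteration (phase oracle plus inversion about the mean). This is a didactic sketch, not the QCL implementation referenced in the literature.

```python
import numpy as np

def grover_search(n_qubits, target):
    """Toy state-vector simulation of Grover's search for one marked index."""
    K = 2 ** n_qubits
    state = np.full(K, 1 / np.sqrt(K))       # uniform superposition (H on all qubits)
    iterations = int(np.floor(np.pi / 4 * np.sqrt(K)))
    for _ in range(iterations):
        state[target] *= -1                  # oracle: flip the phase of the marked item
        state = 2 * state.mean() - state     # diffusion: inversion about the mean
    return np.abs(state) ** 2                # sampling probabilities

probs = grover_search(4, target=7)           # K = 16 unsorted entries, 3 iterations
```

After roughly (π/4)·√K iterations the marked index dominates the output distribution, whereas a classical scan needs on the order of K queries.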
The binary QNN model in this paper is based on the Grover algorithm and a QNN supervised learning algorithm. First, the traditional Grover algorithm is analyzed and improved: Younes' algorithm is used to raise the search efficiency and is inserted into the iterative learning process of the quantum neural network [21,22]. Quantum processes are then used to promote the efficiency of the algorithm and the neural network, realizing synchronized searching and learning across multiple networks [23,24] in each iteration to improve the efficiency of solutions [25]. The quantum neural network learning scheme in this paper can be applied to quickly find a user's IP address in massive weblogs, and then to accurately and efficiently identify and classify relevant, valuable information such as the IP addresses in network logs. The results can be sorted by the number of search categories to identify and analyze user behavior accurately. The scheme can not only quickly discover a user's record of specific interests, but can also divide the sessions of a single user, which ensures the accuracy of user record identification and improves the effectiveness of user activity queries.
This article is structured as follows.
Section 2 introduces the trial-and-error method (Younes' algorithm) used in this paper and the basic principles of the QNN supervised learning classification task.
Section 3 describes our binary QNN model.
Section 4 describes the evolution process of the original dataset, and provides entropy analysis to show the advantages of this model.
Section 5 summarizes and discusses the role of the proposed model in the development of user behavior pattern prediction.
3. The Binary QNN Model
We simulate the creation of a binary analysis algorithm that uses quantum states to process information, as shown in
Figure 2. The algorithm proposed in this paper is uniformly referred to as BQM in the following. As shown in
Figure 2, BQM uses a specified variable component subcircuit and an H gate to replace the Oracle. The variable component subcircuit, based on the training data, can conditionally flip a flag qubit. The flag qubit is then used as part of the H gate to guide a Grover search toward the index of the specified example; i.e., the state of the flag qubit, such as “0” or “1”, determines the probability of success in identifying the target index. BQM optimizes the trainable parameters of the variable component subcircuit: when the corresponding training instance is positive, the success probability of sampling a target index is maximized; otherwise, BQM minimizes this success probability. By inheriting attributes from the Grover search algorithm and performing binary classification tasks on these attributes while allowing search constraints to be set [35], the design of our algorithm has advantages in terms of query complexity. Under the above observation, a quantum classifier must have certain advantages [36,37,38].
3.1. Preprocessing Stage of the Binary Classification Task
In the preprocessing stage, the binary classification task extends the dataset
T defined in Equation (1) into an extended dataset.
To apply the Grover search algorithm to obtain the target index, the training rules for the data pairs are as follows.
The last pair of the extended dataset is taken directly from the original dataset. The first K pairs are uniformly sampled from the subset of the original dataset whose labels are the opposite of the label of the last pair. Two sub-datasets are constructed, which contain only those examples with the labels ‘0’ and ‘1’, respectively. When the last pair carries one label, the first K pairs are sampled from the sub-dataset with the opposite label, and likewise in the reverse situation.
Various quantum classifiers encode the data into quantum states in different ways [39]. For notational convenience, we denote the quantum state corresponding to each example through a coding operation.
3.2. The Training Process of the Learning Scheme
Compared with the traditional Grover algorithm, combining the variational learning method with the Grover search algorithm produces quantum advantages [40,41,42,43,44]. The adopted variable component subcircuit is designed to find a hyperplane that keeps the last pair of the extended dataset away from the first
K pairs of samples.
For the variational quantum circuits in BQM, a NISQ device scheme consists of trainable single-qubit gates and fixed two-qubit gates such as CNOT or CZ, which implement generation and discrimination tasks using variational hybrid quantum-classical algorithms. The circuit is arranged in layers, where each layer contains parameterized single-qubit gates and, at most, a fixed number of two-qubit gates with the same layout.
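As a concrete illustration of such a layer, the following numpy sketch applies trainable RY rotations followed by a fixed CNOT to a 2-qubit state vector. The gate choice and layout are illustrative assumptions, not the exact circuit of Figure 3.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation gate (trainable parameter theta)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

# Fixed two-qubit entangling gate: CNOT with qubit 0 as control.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

def layer(state, theta0, theta1):
    """One hardware-efficient layer: RY(theta0) x RY(theta1), then CNOT."""
    return CNOT @ np.kron(ry(theta0), ry(theta1)) @ state

state = np.array([1.0, 0.0, 0.0, 0.0])   # |00>
out = layer(state, np.pi / 2, 0.0)        # produces (|00> + |11>)/sqrt(2)
```

Stacking C such layers, with one trainable angle per single-qubit gate, gives the multiple-layer parameterized circuit used later in Section 3.5.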
In the optimal situation, given the initial state defined in Equation (9), the variable component subcircuit is applied to achieve the following goals:
- 1.
If the first K pairs of samples correspond to one label, the expected state is
- 2.
If the first K pairs of samples correspond to the other label, the expected state is
The label (‘0’ or ‘1’) is encoded as the first qubit of the feature register. As shown in
Figure 2A, the initial state is prepared. Our binary QNN model then iteratively applies the H gate to the first qubit of the feature register, controlled by the index register, and, to finish the first cycle, applies the diffusion operator to the index register. All of these quantum operations, including
U, constitute one cycle.
Define the combined operation accordingly. Except in the first loop, the binary QNN model repeatedly applies the unitary operation
U to the initial state, with the application of the unitary operation replaced as shown by the brown shading in
Figure 2B. Following the traditional Grover search algorithm [45], before making quantum measurements, the binary QNN model queries
U and the diffusion operator for a prescribed total number of times.
3.3. The Evolution of the Quantum State
We now analyze how the quantum states evolve under these operations:
3.4. The Loss Function
The observations above lead to Theorem 1, whose proof is given below.
Theorem 1. For BQM, under the optimal setting, if the label of the last item of the extended dataset takes the target value, the probability of the sampling result being the target index is asymptotically 1.
Proof of Theorem 1. We discuss the two cases for the label of the last entry of the extended dataset.
In the first case, given the label of the final item in the extended dataset, it is possible to determine from Equation (14) that, after the first cycle, the generated state of BQM is
According to the formula above, the chance of picking any index when
U is applied is the same. After
U is applied repeatedly, by induction, the state after a further
n cycles changes as
where
n is any positive integer, and the probability of sampling the target index follows accordingly. In the last loop, the quantum operation defined in Equation (13) is applied to the state, and the resulting state is
where the first equality uses Equation (18); the second equality uses Equation (15) and exploits the application of the diffusion operator.
In the second case, given the label of the last item in the extended dataset, it is possible to determine from Equation (16) that, after the first cycle, the generated state of BQM is
According to the formula above, the chance of picking any index when
U is applied is the same. After
U is applied repeatedly, by induction, the state after a further
n cycles changes as follows:
where
n is any positive integer. In the last loop, the quantum operation defined in Equation (13) is applied to the state, and the resulting state is
where the first equality uses Equation (21), and the second equality uses Equation (16) to design the feature register: the
H gate flips the phase of the components whose first feature-register qubit is set, and the last equality comes from the application of the diffusion operator.
According to Equation (22), in the ideal situation, the probability of sampling the target index is close to 1. The result of Equation (19) shows that, in the other case, the probability of sampling the target index never increases. Thus, the sampling probability of the target result asymptotically approaches one if and only if the label of the last term of the extended dataset takes the corresponding value. □
According to this special property of BQM established in Theorem 1, the output distribution differs for different labels of the input while performing the binary classification task. By the analysis of Theorem 1 above, one computational basis state will be present in the output state of BQM with probability close to 1 for one label of the last item, whereas the matching output state for the other label will contain the same computational basis state with a much lower probability.
According to the mechanism of the Grover search algorithm, the loss function of BQM is deduced as
where one factor is the sign function and the remaining term is defined in Equation (12).
When minimizing this loss function, the success probability of sampling the first feature qubit as ’1’ (’0’) is maximized (minimized) according to the label of the last item.
3.5. Gradient-Based Parameter Optimization
The optimization method in this paper uses a multiple-layer parameterized quantum circuit (MPQC), following the principle that the arrangement of quantum gates in each layer is the same [46]. The operation formed by each layer is expressed accordingly, and the quantum state produced by the MPQC is
where
C is the total number of layers. BQM uses the MPQC to construct the variable component subcircuit.
The circuit layout at layer
l is shown in
Figure 3. When the number of layers is
C, the total number of trainable parameters of BQM grows linearly with
C.
The update rules of BQM at each iteration are as follows
where the step-size coefficient is the learning rate. Given the explicit form of the loss defined in Equation (23), the gradient can be rewritten as
where the first quantity is the label of the last item in the extended dataset, the second is the symbol function, and ∏ is the measurement operator.
BQM employs a gradient-based method, following the parameter-shift rule, to obtain the gradient and optimize the circuit parameters. The parameter-shift rule [47] iteratively calculates each gradient entry.
For each gradient entry, only the corresponding parameter is shifted, i.e.,
Combining Equations (26)–(28), the update rule of BQM at the e-th iteration is
where the remaining quantities are as defined above.
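The parameter-shift idea can be checked on a one-parameter toy circuit. Under the standard rule for rotation gates (shift of π/2, as in the cited literature), the gradient of the expectation value is exact, not a finite-difference approximation; the function names below are ours.

```python
import numpy as np

def expectation_z(theta):
    """<0| RY(theta)^† Z RY(theta) |0> = cos(theta) for one qubit."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    psi = np.array([c, s])                 # RY(theta)|0>
    z = np.array([[1, 0], [0, -1]])        # Pauli-Z observable
    return psi @ z @ psi

def parameter_shift_grad(f, theta):
    """Gradient entry via the parameter-shift rule with shift pi/2."""
    return (f(theta + np.pi / 2) - f(theta - np.pi / 2)) / 2

theta = 0.7
grad = parameter_shift_grad(expectation_z, theta)   # equals -sin(theta) exactly
```

On hardware, each of the two shifted expectation values is itself estimated from repeated measurements, which is why the measurement counts in Sections 4.2 and 4.4 scale with the number of trainable parameters.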
3.6. Circuit Implementation of Label Prediction
After the training of BQM is completed, the trained circuit (as shown in
Figure 4) can be used to predict the label of a new instance with low query complexity.
Denoting the new input accordingly, we encode it into quantum states using the same encoding method used during training, and then apply the trained variable component subcircuit to the encoded state.
When the size of the dataset loaded by the binary QNN model is K, a well-trained binary QNN model can obtain the target index with reduced query complexity.
3.7. Synthetic Construction of Datasets
Given a training example, the embedding function used to encode it into a quantum state is represented as
where
g is a specified mapping function. The above formula means that g(xi) can be converted into a series of quantum operations, the implementation of which is shown in
Figure 5a. To encode multiple training examples into quantum states simultaneously, we treat the embedding as a controlled version, the implementation of which is shown in
Figure 5b.
3.8. The Details of BQM
The implementation of GBLS is shown in
Figure 5c, in which the data-encoding unitary consists of a controlled set of quantum operations. The implementation of the encoding unitary depends on the batch size B. For the quantum kernel classifier with BCE loss and with MSE loss, it can be seen from Formula (30) that the encoding unitary is
For a quantum kernel classifier with MSE loss, the implementation of the encoding unitary is the same as that of BQM, as shown in
Figure 5.
4. Results
We now analyze the security of the proposed BQM, present the evolution and generation process of the dataset intuitively using dot plots, and evaluate the algorithm's performance through entropy analysis and testing [48,49,50].
4.1. Dataset Evolution
This algorithm's dataset generation process is shown in
Figure 6. It uses the first K pairs and the last pair of the original dataset for label classification and uniform sampling, dividing all data pairs into sub-datasets according to whether the
y value is 1 or 0.
In the figure, the yellow cubes represent data pairs whose value of y is 1, the blue cubes represent data pairs whose value of y is 0, and the points circled in red represent the sampled data points. The specific rules are as follows:
- 1.
Figure 6a represents the original dataset, in which each data pair is drawn as an `A' cube or a `B' cube according to its label;
- 2.
Figure 6b represents the uniform sampling from the sub-dataset labeled 1 in
Figure 6a to generate a new dataset;
- 3.
Figure 6c represents the uniform sampling from the sub-dataset labeled 0 in
Figure 6a to generate a new dataset.
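The split-then-sample rules above can be sketched in a few lines; the function names and the toy (x, y) pairs are illustrative assumptions, not the experimental data.

```python
import random

def split_by_label(pairs):
    """Divide (x, y) pairs into sub-datasets keyed by label, as in Figure 6a."""
    return {label: [p for p in pairs if p[1] == label] for label in (0, 1)}

def uniform_sample(sub_dataset, k, seed=0):
    """Uniformly sample k pairs (with replacement) from one sub-dataset."""
    rng = random.Random(seed)
    return [rng.choice(sub_dataset) for _ in range(k)]

original = [("x1", 1), ("x2", 0), ("x3", 1), ("x4", 0), ("x5", 1)]
subsets = split_by_label(original)
new_dataset = uniform_sample(subsets[1], k=4)   # Figure 6b analogue
```

The Figure 6c analogue is obtained the same way from `subsets[0]`.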
4.2. Stability and Convergence Analysis
Define a utility bound R as a measure to evaluate the distance between the optimization result and the stationary point in the optimized environment.
For the BQM quantum classifier under a depolarization noise setting, the utility bound of the output after
j iterations is
where
P is the depolarization rate,
l is the total number of trainable parameters,
K is the number of measurements used to estimate the quantum expectation value,
d is the circuit depth of the variable component subcircuit, and
B is the batch size.
We use the decay rate of the utility bound to define the asymptotic convergence rate of this optimization algorithm [51,52]. According to Equation (33), the attenuation rate of the utility bound is slower than a linear rate, which shows that this algorithm has a sublinear convergence rate.
In this setting, we input each sample in turn to the variable component subcircuits to obtain the corresponding measurement outcomes. Once this set is collected, the gradient can be estimated from it. Assuming that the number of measurements required to estimate the derivative of the j-th parameter is K, estimating the full gradient over all l parameters requires on the order of Kl measurements in total.
Given the above definition of the utility bound R, the results show that a large batch size B can guarantee a better realization of the utility bound R by increasing the total number of measurements.
4.3. Performance Analysis under Depolarization Noise
We employ depolarization channels to mimic the system noise, since both the number of measurements and the quantum system's noise are constrained. We next examine how well BQM performs in the presence of depolarization noise [53].
Given a quantum state, we define the depolarizing channel acting on it as
where
P is the depolarization rate, and
l is the total number of trainable parameters.
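The depolarizing channel has the standard closed form rho -> (1 - P) * rho + P * I / d on a d-dimensional density matrix, which can be applied directly; the sketch below is a minimal numpy illustration with our own function name.

```python
import numpy as np

def depolarize(rho, p):
    """Depolarizing channel: rho -> (1 - p) * rho + p * I / d."""
    d = rho.shape[0]
    return (1 - p) * rho + p * np.eye(d) / d

# Pure state |0><0| on one qubit, depolarization rate p = 0.2:
rho = np.array([[1.0, 0.0], [0.0, 0.0]])
noisy = depolarize(rho, 0.2)   # [[0.9, 0.0], [0.0, 0.1]]
```

The channel is trace-preserving and contracts every state toward the maximally mixed state I/d, which is what degrades the estimated expectation values in the simulations below.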
We compare the performance of BQM with two other quantum kernel classifiers when quantum system noise and measurement delays are considered. Here, BQM stands for the binary classification quantum neural network model based on the optimized Grover algorithm proposed in this paper. The two classifiers compared with BQM are denoted “BCE” and “MSE”: “BCE” stands for the quantum kernel classifier with binary cross-entropy loss, and “MSE” for the quantum kernel classifier with mean square error loss (B = N). We simulated each of the three classifiers 10 times and collected the statistics.
Figure 7 illustrates the simulation results. After 20 epochs, the BQM, BCE, and MSE quantum classifiers achieve the same performance. It can be observed that the quantum classifier with MSE loss has a lower convergence speed and a larger variance than the BQM and BCE classifiers. This phenomenon shows that using BQM for classification tasks with different batch sizes is meaningful. In
Table 1, we compare the average training and testing accuracies of the BQM, BCE, and MSE quantum classifiers in the last stage. Considering measurement error and quantum gate noise, BQM still achieves a stable performance because of its minimal variance.
In Figure 7, the labels `BQM', `BCE', and `MSE' denote the Grover-based binary QNN model, the quantum kernel classifier with BCE loss, and the quantum kernel classifier with MSE loss, respectively. The training and testing accuracies of the BQM quantum classifier are shown in the left and right panels. The vertical bars represent the variation in training and testing accuracy at each iteration, where all hyperparameter settings are the same as those used in the numerical simulation above.
According to the numerical simulation results of the three quantum classifiers in
Figure 7, BQM obtains a good utility bound R in our tests. When the classifiers achieve basically comparable performance, BQM reduces the number of measurements required by a factor of K = 4 compared with the quantum classifiers with BCE loss and MSE loss (B = N). This result shows that when N is larger, there is a large separation in computational efficiency between BQM and the previous B = N quantum classifiers. In other words, compared with the other two quantum classifiers, the number of measurements required by BQM is reduced by a factor of four, demonstrating BQM's efficacy.
4.4. Complexity Comparison Analysis
After the variable component subcircuit of the quantum classifier receives the encoded data, the measurement operator defines one measurement as one query. The iterative process of this algorithm is shown in
Figure 8. According to the quantum classifier's training mechanism, calculating the total number of measurements for the variable component subcircuits is equivalent to the query complexity of obtaining the gradient within one time frame.
The next step is to derive the number of measurements required by a quantum kernel classifier with BCE loss in one period. For the dataset
T, the BCE loss is
where one term is the label of each example and the other is the prediction probability of that label; the output of its quantum circuit is
where the circuit is defined in Equation (25) and ∏ is the measurement operator. According to the parameter-shift rule, the derivative of the BCE loss satisfies
where the shifted term is defined in Equation (28). To obtain the gradient of the BCE loss from the above equation, we need to feed each training example to the quantum kernel classifier to estimate the expectation values, and then calculate the coefficients.
BQM uses the superposition property of the loss function
L defined in Equation (17) to obtain the gradient. According to Equation (23), the gradient of BQM satisfies
where the label refers to the last pair in the training examples. According to the equation above, the gradient may be calculated using 2K measurements, where the first
K measurements approximate one shifted expectation and the last
K measurements approximate the other.
The complexity comparison analysis determines the effect of the dataset size on the binary QNN model in this paper. The standard Grover search algorithm's search complexity is O(√K) for data entries of size
K and a total number of trainable parameters
l. The optimal classification complexity of the algorithm can be seen in
Table 2. The reduced query complexity of BQM implies a potential quantum advantage in completing the classification task.
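To make the comparison concrete, the following back-of-the-envelope sketch contrasts a classical linear scan with the roughly (π/4)·√K Grover iteration count for an unsorted search over K = 1024 entries; the function names are ours, and constant factors and lower-order terms are dropped.

```python
import math

def classical_queries(K):
    # A linear scan over an unsorted list of K entries needs O(K) queries.
    return K

def grover_queries(K):
    # Grover iterations before measuring: floor((pi / 4) * sqrt(K)).
    return math.floor(math.pi / 4 * math.sqrt(K))

speedup = classical_queries(1024) / grover_queries(1024)   # ~40x for K = 1024
```

The gap widens as K grows, which is the source of the potential quantum advantage claimed above.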