1. Introduction
Machine learning (ML) is a set of algorithms and tools used to acquire information or optimize actions based on data [
1]. It has witnessed substantial recent growth, primarily due to its versatile applications across diverse industries [
2]. ML techniques, encompassing classification [
3,
4], prediction [
5,
6], and data generation [
7,
8], have been instrumental in tasks such as image classification [
9,
10], forecasting financial systems [
11], and generating text through natural language processing [
12].
These technological advancements demand substantial computational capacity, and achieving optimal performance is expensive. Working with large datasets in deep learning and natural language processing necessitates significant computational resources, thus motivating exploration into different paradigms, such as quantum computing. Quantum algorithms have shown the potential to be exponentially more efficient than classical algorithms for some problems. Various domains of computer science may benefit from quantum algorithms and their properties. ML stands out due to the potential for performance improvements through reducing the temporal or spatial complexity of important subroutines [
13,
14]. Classical data may be encoded in quantum states, for example, through qubit rotations. These encodings enable the study of quantum machine learning (QML) algorithms using classical benchmark datasets. Several algorithms have been proposed in this new paradigm. These include quantum principal component analysis [
15,
16], quantum Support Vector Machines [
17], quantum nearest-neighbor classification [
18,
19], and quantum neural networks [
20,
21]. The variety of algorithms in QML showcases the richness and potential of this emerging field.
ML consists of three main approaches: supervised learning, unsupervised learning, and reinforcement learning. This study focuses on an example of supervised learning [
22,
23]. The goal is to train an ML algorithm that employs a labeled dataset to label a data point from outside the training set but sampled from the same underlying distribution. This approach is known as classification, and it can be performed by the K-Nearest Neighbor (K-NN) [
24] algorithm based on a similarity or distance metric. The method is one of the best-known classification algorithms in supervised learning [
25]. It is also possible to use a K-NN for unsupervised learning tasks. Based on the theory provided in [
14], Quantum Nearest Neighbors is almost quadratically more efficient than its classical analog, even with sparse data, which is useful when large amounts of data must be processed in a short time.
This work describes a novel Quantum K-Nearest Neighbor (QK-NN) algorithm based on data stored in a Quantum Random Access Memory (QRAM) [
26,
27,
28]. The QRAM incorporates address and data qubits, leveraging quantum superposition. Each data cell represents a leaf of a binary tree, indexed by the address qubits. Our method computes the required distance metric based on the SWAP-Test, which offers a similarity metric for comparing two quantum states. We use Grover’s algorithm [
26,
29,
30,
31,
32] to efficiently identify states with high similarity scores by utilizing an oracle. Quantum states from the labeled dataset are analyzed using the SWAP-Test [
33,
34] via a mid-circuit measurement. This allows our quantum oracle to search for the state with the highest similarity to an unlabeled test example. It is important to note that our results, as is typical with quantum algorithms, are probabilistic. The proposed approach involves storing a sample, or the entire dataset, in QRAM. Properly speaking, our K-NN is not trained; rather, it references the labeled dataset at inference time. In keeping with standard ML parlance, we refer to the data used as “training” data. In the work presented here, we use the Iris dataset (distributed with Scikit-learn [
35]) and MNIST [
36] as the demonstrators. The experiment format involves creating different sizes of QRAM to store subsets of the dataset, up to a maximum of 128 cells, which is the feasible capacity for simulation. At the size of 128, different numbers of nearest neighbors are also considered, up to a maximum of 13. Preprocessing is conducted on the datasets to assess the classification capability of our QK-NN proposal, taking into account information loss and working with two datasets with different types of features and several attributes.
Note that a noiseless quantum computer is considered in this work, as the circuit depths would be prohibitive on a noisy device. The implementation and development are intended to be extended to a noise model or to hardware when the capacity to handle the required depth becomes available. However, dealing with the limitations of NISQ devices is beyond the scope of this work. Given these constraints, the experiments are conducted via classical simulation of a quantum computer. By studying performance trends with an increasing qubit count, it is possible to speculate about the potential usefulness of this approach as quantum computers push beyond classical regimes. It is important to note that these speculations are based on the current understanding and may change as the field evolves.
The structure of the rest of the paper is as follows:
Section 2 describes prior and related work.
Section 3 describes the conceptual design and implementation details of our quantum classifier algorithm.
Section 4 describes the experimental results from a Qiskit simulation of a quantum computer. Then,
Section 5 discusses our results. Finally,
Section 6 presents the conclusions and offers some directions for future research.
2. Related Work
Multiple studies of quantum K-NNs exist in the literature. Here, we review the most recent results, and those contributions closest to this work.
In [
26], a recommendation system employing a Quantum K-NN leveraging Grover’s algorithm for data processing is proposed, utilizing the Hamming distance instead of the SWAP-Test for a similarity metric. This system includes a classical database constructed using basis encoding and two sampled quantum states for comparison using the quantum algorithm. Additionally, an auxiliary qubit is incorporated to probabilistically amplify recommended elements via Grover’s algorithm, thereby enhancing recommendation accuracy.
Another quantum version of the K-NN classifier is proposed in [
29]. In this paper, the entire dataset was incorporated into a quantum circuit. The best cases were compared using the Hamming distance and sorted accordingly with a quantum operator.
Reference [
37] introduces a QK-NN algorithm based on Mahalanobis distance, which is a multivariate generalization of the square of a standard
Z score, i.e., how many standard deviations a point is away from the mean. The authors propose to store the Mahalanobis distance between states in QRAM. They then sample from the collection to obtain the K nearest neighbors using Grover’s algorithm.
In [
38] the authors introduce a new similarity measure named “polar distance”, inspired by the polar coordinate system and quantum properties. This measure considers both angular (phase) and modulus (length) information, with an adjustable weight parameter tailored to specific datasets; they combine a classical binary search with Grover’s algorithm.
In [
39] the authors present a quantum circuit for K-NN classification that quantizes both neighbor and K value selection processes simultaneously. Their algorithm utilizes least squares loss and sparse regularization to identify the optimal K values and K nearest neighbors. It employs a novel quantum circuit that combines quantum phase estimation, controlled rotation, and inverse phase estimation techniques.
Reference [
40] utilizes the SWAP-Test to compute scalar products between feature vectors encoded according to the Mottonen method, representing amplitude encoding in quantum states. This work is the most direct comparison to our proposed QK-NN model because it likewise relies on the SWAP-Test and Grover’s algorithm: it applies the Quantum Minimization Algorithm (QMA) to determine the K closest neighbors for each test vector. This method also involves employing Grover’s algorithm to find the closest neighbor and conducting a specific number of SWAP-Test iterations. Likewise, a similar QMA process is used to enhance accuracy. The time complexity of the Quantum K-NN algorithm is
where
n is the training set size,
m is the test set size,
k is the number of clusters and
d is the space dimension, enabling a potential speed-up for large vector dimensions. Therefore, the number of SWAP-test iterations and QMA iterations influences the efficiency of the QK-NN algorithm.
Finally, Ref. [
41] has various K-NN approaches, relying on the inner product and Euclidean distance calculations to identify the nearest neighbor or class centroid. Specifically, the focus is on the utilization of Euclidean distance. While using QRAM and the SWAP-test is mentioned, an intermediate measurement is lacking. Instead, Amplitude Estimation (AE) is employed, similar to the algorithm proposed in reference [
38], albeit without QRAM. Their proposal is notable for its scaling of the number of queries, which is proportional to
rather than
M.
Our Contribution
Our results are comparable to recent state-of-the-art quantum algorithms, see, for example, [
38] with an accuracy of 95.82% and [
29] which found an accuracy of 94.66%. The notable points of this work are as follows:
Our primary novelty is in the combination of high-performing quantum subroutines along with an empirical study of scaling performance with available qubit count for the QRAM.
We illustrate how mid-circuit measurements in the SWAP-Test as an oracle support Grover’s algorithm, which achieves high accuracy in identifying similar states.
The algorithm allows for the creation of a quantum circuit for a subset of data with a spatial complexity of and a temporal complexity of .
We conducted a series of experiments with ten seeds to demonstrate the benefits of our proposal on the Iris and MNIST datasets, aiming to highlight the limitations of our QK-NN approach.
We have enhanced the state-of-the-art experimental results using the MNIST dataset, achieving a notable performance of 94.00% ± 2.11.
We provide repeatable numerical experiments through a structured workflow on the Iris and MNIST datasets. Our code is available online (
https://github.com/MaldoAlberto/QKnn, accessed on 21 April 2024).
3. Materials and Methods
This section presents the methods employed in this research. Our QK-NN consists of three primary steps. First, we utilize QRAM for intermediate storage and implement the SWAP-Test for a similarity metric. Second, we employ Grover’s algorithm to find the solution with the highest similarity with high probability. Finally, classical post-processing reads the address of the QRAM to determine the identities of the K-NN classes. Each point is described in more detail below.
3.1. Review of the Classical K-NN Algorithm
The classical K-NN classification algorithm is widely used in supervised learning [
27,
28]. It utilizes two datasets: a training set, where each sample has an associated label, and a test set that contains unlabeled samples we wish to classify. For each example from the test set, the algorithm deploys a similarity metric, e.g., a distance metric in feature space (see
Section 3.5) to compare the example to each member of the training set. Labels are selected according to a voting rule over
K of the nearest neighbors in metric distance space, where the parameter
K is an odd hyperparameter so that, in binary classification, one class always holds a strict majority (by the pigeonhole principle) and ties are avoided.
For illustration,
Figure 1 describes two classes in our training set, represented as blue and green circles. The method determines which class an unknown example, represented by a red point in
Figure 1, belongs to. The distance metric is applied between the unlabeled example and the samples in the training set. With
K = 5, the nearest neighbor set is observed to contain two green circles and three blue circles. Therefore, by majority vote, the unknown example is classified as blue.
The steps of a classical K-NN are listed in Algorithm 1.
Algorithm 1 Classical K-NN |
- 1: Designate a labeled training set and an unlabeled test set.
- 2: for each event in the test set do
- 3: for each event in the training set do
- 4: Compute the distance between the test event and the training event.
- 5: end for
- 6: Sort the events in the training set by metric distance.
- 7: Choose the K nearest neighbors to the test event.
- 8: Assign to the test event the class with the highest number of nearest neighbors among the K selected examples.
- 9: end for
|
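For concreteness, Algorithm 1 may be sketched in Python. This is an illustrative implementation (not part of the original study); NumPy and the Euclidean distance are assumed, and the function name is our own:

```python
import numpy as np

def knn_classify(X_train, y_train, x_test, k=5):
    """Classical K-NN (Algorithm 1): label a test point by majority
    vote over its k nearest training examples."""
    # Steps 3-5: compute the Euclidean distance to every training event.
    dists = np.linalg.norm(X_train - x_test, axis=1)
    # Steps 6-7: sort by distance and keep the K nearest neighbors.
    nearest = np.argsort(dists)[:k]
    # Step 8: majority vote over the selected neighbors.
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]
```

With k = 5 and the two-class layout of Figure 1 (three neighbors of one class, two of the other), the vote returns the majority class.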
3.2. Quantum Data Encoding and Preprocessing
In order to utilize a quantum computer to analyze classical data, we must first encode the data as quantum information. To do this, we apply the transformation
, where
is the Hilbert space with terms
[
42,
43]. To represent classical data in a state vector, a set of
qubits is transformed using a unitary matrix
to obtain
[
42]. Algorithms may select from basis, amplitude, angle, or arbitrary encoding. The algorithm in this paper demonstrates an angle encoding scheme, but in principle, other parameterized circuit methods may also be successful.
3.2.1. Angle Encoding
Angle encoding uses
N classical features as rotation angles for
n qubits, where
N ≤ 2n [
42,
43] (each qubit may, in principle, support two rotations). A general encoding using single-qubit parameterized gates with one qubit per feature,
, may be written as [
42,
43]:
where
j indexes examples,
is the corresponding classical data, and
i indexes qubits. Our method chooses parameterized rotation gates about
Y or
Z on the Bloch sphere—RY
or RZ
, respectively, for the data encoding, defined in Equations (
2) and (
3):
The quantum circuits encoding
n features in
n qubits are shown in
Figure 2. In the case of RZ gates, we must prefix the circuit with a Hadamard gate (
) to transform the basis from
to
.
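The action of these encodings can be checked with a small statevector sketch. This is illustrative NumPy code, not the study’s Qiskit implementation; it uses RY(x)|0⟩ = cos(x/2)|0⟩ + sin(x/2)|1⟩ and, for the RZ case, the Hadamard-prefixed circuit that leaves the feature in a relative phase:

```python
import numpy as np

def ry_encode(features):
    """Angle encoding with RY: qubit i becomes
    cos(x_i/2)|0> + sin(x_i/2)|1>, one qubit per feature."""
    state = np.array([1.0])
    for x in features:
        state = np.kron(state, np.array([np.cos(x / 2), np.sin(x / 2)]))
    return state

def rz_encode(features):
    """Angle encoding with H then RZ: qubit i becomes
    (e^{-ix/2}|0> + e^{+ix/2}|1>)/sqrt(2); the feature enters as a phase."""
    state = np.array([1.0 + 0j])
    for x in features:
        qubit = np.array([np.exp(-1j * x / 2), np.exp(1j * x / 2)]) / np.sqrt(2)
        state = np.kron(state, qubit)
    return state
```

Both encodings produce normalized n-qubit statevectors; for RZ the features are only distinguishable through phases, which is why a similarity metric sensitive to the full state (such as the SWAP-Test) is needed.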
3.2.2. Preprocessing
Datasets may contain numerical or categorical values. We convert binary categorical values into π or 0, generating complete or zero rotations in the encoding gates. In the event of multiple classes, it is appropriate to use a “one-hot” encoding method, which requires more qubits. Standard classical machine learning practices such as k-fold cross-validation, leave-one-out, or hold-out data selection may be employed in constructing training and testing datasets. To verify the proposed model’s classification accuracy when applied to unlabeled data, k-fold cross-validation and hold-out samples should be made using random shuffles.
Our initial preprocessing step is to determine the necessary number of features, denoted as
N. We choose here to assign one qubit per feature. In other cases, due to limitations on qubit availability in contemporary hardware, dimensionality reduction techniques such as principal component analysis (PCA) [
44] may be needed. Angle encoding requires angles between 0 and π when using RY gates, and between 0 and 2π when using RZ gates. Therefore, normalization might be necessary depending on the dataset and quantum rotation gates utilized.
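A minimal normalization sketch, assuming min-max scaling into the admissible angle range (the function name and the guard for constant columns are our own illustration):

```python
import numpy as np

def normalize_angles(X, theta_max=np.pi):
    """Min-max scale each feature column into [0, theta_max], so the
    values can serve directly as rotation angles (use theta_max=np.pi
    for RY encoding, 2*np.pi for RZ encoding)."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard constant columns
    return (X - lo) / span * theta_max
```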
3.3. QRAM
A storage array, an address register, and an output register define classical RAM. Each cell within the array corresponds to a distinct numeric address. Upon initialization, the register holds the address of a memory cell, and subsequently, the content is transferred to the output register. QRAM shares the fundamental components of classical RAM, with the distinction that qubits comprise the address and output registers [
27]. Through address superposition, QRAM can access multiple memory cells simultaneously. The address register may effectively store
2^m values, where
m is the number of qubits in the register, as shown in Equation (
4):
where
j indexes the data in the QRAM and
is the
m-qubit content of the
j-th memory cell [
27]. The address structure may be represented as a binary tree as shown, for example, in
Figure 3, which shows an example encoding for the QRAM of three qubits. The state of
is
, and the states
and
are
, giving a probability of
for each of the states
and
. This encodes binary values as values 0, 1, 2, and 3, respectively.
The QRAM address size is constrained to powers of two, such as 2, 4, 8, 16, 32, 64, 128, and so forth. If the classical data indices do not fill the address space, the remaining cells are padded with zero values.
3.3.1. Encoding the Dataset in QRAM
To encode the training dataset, we begin with m address qubits initialized to |0⟩, an ancilla qubit in |0⟩, and N data qubits initialized to |0⟩. (Here, we assume we have an equal number of data qubits and features; if this is not practical, we may use dimensionality reduction techniques to reduce the required qubit count, or different encoding schemes.) Note that, given a fixed number of features N, increasing the number of qubits for the QRAM increases the address space and the number of training events we may encode. The full QRAM, with address qubits, our ancilla (for operations), and the training data qubits, comprises m + 1 + N qubits. The encoding procedure involves preparing all of the qubits and then, in a loop over all of the training data, executing an operation on the address block to prime the ancilla, followed by an operation on the data qubits conditioned on the state of the ancilla. We will explain these operations individually in the following paragraphs and then explain how the combined circuit is constructed.
To build the addresses for the linked data blocks, we employ the Multicontrolled
X (MCX) gate, as described in [
31]. We will write
MCX_m to indicate the number of control qubits
m. In this work,
m also refers to the number of address qubits available. An example
is the MCX with m = 2, which is the Toffoli gate (or CCX gate). It involves two control qubits and one target qubit that activates upon meeting the conditions of the control qubits. By default, the control qubits are set to the state |1⟩, and we apply an X gate if we need the state |0⟩. This MCX then prepares the ancillary qubit. The address values are derived from a binary value representing a basis encoding. For example, cell 2 is 0b10, or |10⟩, meaning that the MCX gate will be applied with qubit 0 in the state |0⟩ and qubit 1 in the state |1⟩. We can write this operation generally in Expression (
5):
to indicate how the address block and ancilla are prepared.
The prepared ancilla qubit then activates the controlled rotation gates RY or RZ (using one or the other depending on the chosen encoding), to encode the correctly addressed training example, as described in Expression (
6):
where
p is the size of the training dataset,
N is the number of features (and the number of qubits in the data register), x_ij provides the value of feature
j in example
i, CRY is a controlled-RY, and CRZ is a controlled-RZ. If
, the complex conjugate of the element of the test set to be compared is added to the empty cell address. Otherwise, the MCX and X gates used to generate the binary state are reapplied to avoid affecting other states.
See
Figure 4 for an illustration of the address and data encoding combined into a circuit for two data examples.
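The target state of the QRAM encoding (Equation (4), with each record angle-encoded by RY as in Expression (6)) can be built directly as an idealized statevector. This NumPy sketch bypasses the MCX/ancilla circuitry and simply constructs the superposition the circuit is meant to prepare; the function name is our own:

```python
import numpy as np

def qram_state(records):
    """Idealized QRAM state: (1/sqrt(p)) * sum_j |j>|psi_j>, where
    |psi_j> is the RY angle encoding of the j-th record.  Unused
    address cells are implicitly padded with zero amplitude."""
    p = len(records)
    m = max(1, int(np.ceil(np.log2(p))))       # address qubits
    n = len(records[0])                        # data qubits (features)
    state = np.zeros(2 ** m * 2 ** n)
    for j, angles in enumerate(records):
        cell = np.array([1.0])
        for x in angles:
            cell = np.kron(cell, np.array([np.cos(x / 2), np.sin(x / 2)]))
        address = np.zeros(2 ** m)
        address[j] = 1.0                       # basis-encoded address |j>
        state += np.kron(address, cell)
    return state / np.linalg.norm(state)
```

For two one-feature records with angles 0 and π, the resulting state is (|0⟩|0⟩ + |1⟩|1⟩)/√2: the address register indexes the records in equal superposition.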
3.4. Grover’s Algorithm
Grover’s algorithm aims to search for a value within an unstructured database of size
p. Classical algorithms, whether deterministic or probabilistic, require inspecting at least
p/2 values with a time complexity of O(p) to achieve a probability of finding the value of interest exceeding 1/2. In contrast, Grover’s algorithm can retrieve the desired value in O(√p) steps [
45].
Our goal is to find which of the
p examples in our training set have the highest similarity to a test event using an unstructured search. To run Grover’s quantum algorithm, in general, we begin with initial state
|s⟩, the uniform superposition, and an oracle, U_ω, that operates as shown in Equation (
7).
where
ω is the value being searched for.
We next consider a function
f that takes a proposed solution x and returns f(x) = 1 if x = ω, and f(x) = 0 for x ≠ ω. Then the oracle is expressed as shown in Equation (
8):
This transformation means that the amplitude of the
|ω⟩ state becomes negative within the superposition, as is described in [
26,
29,
30,
31]. Then, using a reflection operation referred to as a “diffuser” [
26,
29,
30,
31,
32] suppresses the amplitudes of the unmarked states and amplifies the negated (marked) amplitude toward 1, as shown in Equation (
9):
where |s⟩ is the uniform superposition of states and I is the identity matrix of dimension N. Since the oracle U_ω is a reflection about the hyperplane orthogonal to |ω⟩ and the diffuser U_s = 2|s⟩⟨s| − I is a reflection about |s⟩, their composition rotates the state toward |ω⟩. The Grover diffuser is implemented in a quantum circuit using a phase-shifting operator that negates the ground state [
32].
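The action of the oracle and diffuser on the amplitudes is easy to verify numerically. The sketch below is illustrative (a single marked item, amplitudes tracked as a real vector): the oracle flips the sign of the marked amplitude, and the diffuser is applied as an inversion about the mean:

```python
import numpy as np

def grover_search(num_items, marked, iterations=None):
    """Grover iteration over an unstructured list of num_items entries.
    Returns the measurement probabilities after the iterations."""
    if iterations is None:
        # Near-optimal iteration count ~ (pi/4) * sqrt(N).
        iterations = int(np.floor(np.pi / 4 * np.sqrt(num_items)))
    amps = np.full(num_items, 1 / np.sqrt(num_items))  # uniform |s>
    for _ in range(iterations):
        amps[marked] *= -1.0               # oracle: negate marked amplitude
        amps = 2 * amps.mean() - amps      # diffuser: inversion about the mean
    return amps ** 2
```

For 16 items, three iterations suffice to measure the marked item with probability above 90%, whereas a classical scan would need to inspect eight items on average.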
3.5. SWAP-Test
An important subroutine of the K-NN algorithm is to calculate the distance metric between two examples. The classical algorithm frequently employs the Euclidean distance for this metric, as defined in Equation (
10) [
34]:
where
i indexes a sum over features, and
and
are two examples.
In quantum computing, a similar metric may be derived using the inner product to find the state overlap, denoted as ⟨φ|ψ⟩. A quantum subroutine known as the SWAP-Test [
34] can compute this between two pure states |φ⟩ and |ψ⟩. For our purposes, |φ⟩ and |ψ⟩ are the states generated by our encoding algorithm for two examples. The SWAP-Test, illustrated in
Figure 5 for a three-qubit example, in general, requires a minimum of 2n + 1 qubits: one ancilla qubit and n each for the two states being compared, |φ⟩ and |ψ⟩.
The quantum circuit for the SWAP-Test consists of an ancilla initialized to the state
|0⟩, and qubits to hold the states |φ⟩ and |ψ⟩. A Hadamard gate is first applied to the ancilla qubit. Subsequently, a controlled SWAP operation is performed, with the ancilla qubit as the control. Finally, another Hadamard gate is applied to the ancilla, and then it is measured. The measurement of the ancilla qubit is repeated
s times to obtain the fidelity between the two states. The quantum state after running the gates in the circuit, but before measurement, is expressed in Equation (
11):
Following the measurement, the probability of the first qubit being found in the
|0⟩ state is P(0) = 1/2 + (1/2)|⟨φ|ψ⟩|². When the probability of finding |0⟩ is 100%, it indicates that the two states are identical; however, if the probability is 50% |0⟩ and 50% |1⟩, it signifies that the states are orthogonal. Note that, to estimate the error on the overlap to within a factor of ε, we require O(1/ε²) copies of the state.
When our states require more than one qubit for encoding, the controlled-SWAP follows a specific pair-wise pattern. We maintain the same ancilla for control and create a sequential cascade with different target qubits: one from |φ⟩ and one from |ψ⟩, as depicted in
Figure 6. Each pair of qubits, matched by index, undergoes a controlled SWAP operation, with the ancilla qubit consistently serving as the control. The circuit begins and ends with Hadamard gates applied to the ancilla qubit [
33].
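The fidelity read off from the ancilla can be reproduced numerically. This is a sketch; the shot-sampling helper estimated_overlap is our own illustration of repeating the ancilla measurement s times:

```python
import numpy as np

def swap_test_p0(phi, psi):
    """Probability of measuring the SWAP-Test ancilla in |0>:
    P(0) = 1/2 + (1/2)|<phi|psi>|^2."""
    return 0.5 + 0.5 * abs(np.vdot(phi, psi)) ** 2

def estimated_overlap(phi, psi, shots=4096, seed=7):
    """Estimate |<phi|psi>|^2 from a finite number of ancilla
    measurements; the statistical error shrinks as 1/sqrt(shots)."""
    rng = np.random.default_rng(seed)
    zeros = rng.binomial(shots, swap_test_p0(phi, psi))
    return max(0.0, 2.0 * zeros / shots - 1.0)
```

Identical states give P(0) = 1, orthogonal states give P(0) = 1/2, matching the limiting cases described above.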
3.6. Quantum K-NN
The QK-NN is based on three critical components: data storage in QRAM, the similarity metric from the SWAP-Test quantum circuit, and utilizing Grover’s algorithm to select the most similar states with high probability, as shown in
Figure 7, which combines all the elements of our approach into one circuit. Theoretically, we should run Grover’s algorithm O(√(2^m)) times to search the QRAM for the matching similarity event. However, we empirically found good results with many fewer calls. We found searching m times to produce a set of nearest neighbors, where
m is the number of qubit addresses in the QRAM, to be effective. The result of each run is a probabilistic estimate of the most similar event from the training dataset. It is not guaranteed to be unique, and K calls are not guaranteed to find the
K most similar events. We empirically observe in the experiments discussed below that the algorithm samples from the events in a way consistent with the performance of other KNN implementations.
The performance of the proposed classifier is assessed through a mean accuracy metric computed from R runs. We propagate statistical uncertainties from shot noise through the average to the end results.
The QK-NN algorithm is presented step-by-step in Algorithm 2.
Algorithm 2 Quantum K-NN |
- 1: Designate a labeled training set and an unlabeled test set.
- 2: Apply angle encoding to the features of both datasets.
- 3: for each event in the test set do
- 4: for 1 ⋯ (Grover iterations) do
- 5: for S shots do
- 6: Load the training set into QRAM.
- 7: Initialize the oracle qubit to the state |−⟩.
- 8: Compute the SWAP-Test between the test event and the training data encoded as a superposed state in the data-qubit portion of the QRAM.
- 9: Apply an intermediate measurement to the SWAP-Test qubit.
- 10: Apply the X gate to the oracle qubit when the intermediate measurement of the SWAP-Test is 0.
- 11: Apply the transposed complex conjugate of the QRAM and the diffuser to the address qubits.
- 12: Measure the QRAM address qubits.
- 13: end for
- 14: Convert the states measured in the address qubits of the QRAM from binary to decimal to compute the index of the example in the dataset.
- 15: Record the index of the most similar state after S shots of the circuit. Note that if we find an empty address, we throw the result out and repeat the iteration.
- 16: end for
- 17: Sort and select the K nearest-neighbor candidates with the highest similarity scores as compared to the test event.
- 18: Vote over the K candidates to find the class of the current test event.
- 19: end for
|
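Steps 14-18 of Algorithm 2 are purely classical post-processing. The sketch below illustrates them; the dictionary-of-counts input mirrors the kind of histogram a simulator returns for the address register, and the function name is our own:

```python
from collections import Counter

def postprocess_addresses(address_counts, labels, k):
    """Convert measured address bitstrings to dataset indices (step 14),
    keep the k most frequently observed indices as neighbors (steps 15
    and 17), and vote over their labels (step 18)."""
    index_counts = {}
    for bits, count in address_counts.items():
        index = int(bits, 2)                 # binary address -> decimal index
        if index < len(labels):              # discard empty/padded cells
            index_counts[index] = index_counts.get(index, 0) + count
    neighbors = sorted(index_counts, key=index_counts.get, reverse=True)[:k]
    votes = Counter(labels[i] for i in neighbors)
    return votes.most_common(1)[0][0]
```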
3.7. Utilization Procedure Summary
To summarize this section, for our first step, we preprocess the dataset classically and divide it into training and test sets. This preprocessing step involves any necessary dimensionality reduction and/or normalization of variables for encoding purposes. Subsequently, the data are transferred to a QRAM and subjected to the SWAP-Test with an intermediate measurement in the SWAP-Test qubit. Following this, Grover’s algorithm is utilized on the SWAP-Test value to ascertain the indices of the samples with the highest probability of overlap with the test sample, thereby identifying the
K nearest neighbors. This method is depicted in
Figure 8, where the five basic steps are as follows:
Define the dataset: A dataset is selected and divided into (labeled) train and (unlabeled) test sets.
Execute required preprocessing: Any required classical preprocessing to utilize the features of the events is applied at this stage.
Run the Quantum K-NN: The proposed quantum nearest-neighbor algorithm is employed next: the SWAP-Test is run against the QRAM, and high-overlap events are selected using Grover’s algorithm.
Perform postprocessing: This step consists of measuring the set of address qubits of the QRAM memory and selecting the nearest neighbors.
Compute the accuracy: After each example is labeled we can compute the accuracy of the algorithm.
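The five steps can be emulated classically end to end by replacing the quantum subroutines with the quantities they estimate: the RY angle encoding of Section 3.2 and the SWAP-Test fidelity |⟨φ|ψ⟩|² as the similarity metric. This is an illustrative sketch only; it performs an exhaustive comparison rather than a Grover search:

```python
import numpy as np
from collections import Counter

def fidelity_knn(X_train, y_train, x_test, k=5):
    """K-NN using the SWAP-Test fidelity |<phi|psi>|^2 as similarity,
    computed on RY angle-encoded statevectors."""
    def encode(angles):
        state = np.array([1.0])
        for x in angles:
            state = np.kron(state, np.array([np.cos(x / 2), np.sin(x / 2)]))
        return state
    test_state = encode(x_test)
    # Fidelity between the test state and every encoded training example.
    fidelities = np.array(
        [abs(np.vdot(encode(x), test_state)) ** 2 for x in X_train])
    # The highest-fidelity indices stand in for the Grover-selected addresses.
    neighbors = np.argsort(fidelities)[::-1][:k]
    return Counter(y_train[i] for i in neighbors).most_common(1)[0][0]
```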
It is important to highlight that, when obtaining states with higher probability, only the top-k states flagged with a 1 are considered of interest. These states are then sorted in descending order of probability until reaching k neighbors. This process entails a time complexity that depends on m, the number of address qubits used in the Grover iterations, and on k, the number of iterations required for the selection process. In this selection process, only half of the results are utilized, from which half of the states are chosen as the k neighbors.
5. Discussion
This paper introduces a QKNN algorithm that utilizes Grover’s algorithm to accelerate event comparison. The dataset is encoded using a QRAM, and the Grover oracle is constructed via a SWAP-Test intermediate measurement. The algorithm evaluation is presented in
Table 1,
Table 2,
Table 3,
Table 4 and
Table 5 where we find competitive performance across various QRAM (encoded dataset) sizes and encoding strategies.
The experiments show that the behavior of the proposed circuit with different preprocessing for encoding offers performance similar to the classical algorithm using the Euclidean distance implemented by the Scikit-learn library [
35]. The quantum results are within statistical uncertainty of the classical results for any given experiment, but consistently slightly lower across experiments. Both quantum encoding approaches performed roughly equally well across experiments.
When considering different QRAM sizes, if the total encodable training set was small, all classifiers had poor accuracy. However, when training set sizes approached a quarter of the available training data, we found good performance, as displayed in
Table 1,
Table 2 and
Table 3. When considering different numbers of neighbors, we find optimal performance by
K = 3 or 5, with a precise determination obscured by statistical uncertainties, as presented in
Table 4 and
Table 5. In general, we find performance increasing with the qubit counts, both as available to QRAM and as available for encoding features; the PCA-reduced and the Pooling experiments underperform the exact feature encoding.
It is not surprising to find approximate parity in accuracy given the simplicity of the dataset. It is difficult to measure any performance advantage at this stage in terms of wall-clock timing, but the dataset encoding has logarithmic scaling in the quantum case versus linear in the classical case, and there is a polynomial advantage in best-match search time. The quantum algorithm likely has large constant factors to contend with, and we are not considering the cost of loading the dataset into QRAM, which is daunting. However, it is reasonable to hypothesize that the scaling advantages inherent in the technique could manifest for some problems on future quantum computers.
Our results may be compared to two recent implementations of a QKNN using the Iris dataset. First, in [
38], the algorithm uses a polar distance metric, and instead of a QRAM with Grover’s algorithm, it employs amplitude estimation. It achieves an accuracy of around 95%, roughly independent of
K for
K between one and 13. The authors performed 30 ten-fold cross-validation experiments for a training set size of 135 (re-drawn 10 times). The second approach, detailed in [
29], is based on the Hamming distance and a sorting algorithm. It achieves an accuracy of 94.66% for
. The authors used a “leave one out” training strategy, creating an effective training set size of 149, and repeated their experiments 50 times. The accuracy comparison between our algorithm and these recent state-of-the-art results is summarized in
Table 8.
Our work presents a time complexity determined by the number of iterations in the Grover algorithm and the reading of probability states subject to the Grover algorithm, giving us a complexity of
. In the state of the art, it is observed that method [
38] has a complexity similar to ours with
, implying a similar process of operations in terms of efficiency. However, our work highlights the ability to achieve satisfactory results with just one iteration, as evidenced in
Figure 13 and
Figure 14. Additionally, our approach is based on angle encoding using the QRAM, unlike amplitude encoding using heaps, which is used in other methods. Consequently, the complexity of our work is determined by the number of gates to be encoded, expressed as
, where
n represents the number of qubits to be encoded,
M denotes the number of parameters, and
t is the value to be compared, and ours is
. Also, [
29]
is more complex when working with a four-variable system, where
k is the number of the classes,
n and
m are the requirements to design the operator and sorting takes
p time.
For the MNIST dataset, [41] presents a study that leverages QRAM, assuming that oracles are provided, to minimize the number of oracle queries required. The authors propose reaching the accuracy level of neural networks with a number of oracle queries that, for constant success probability, scales more favorably than that of the classical nearest-neighbor algorithm. In Table 9, we experimentally demonstrate the classification of this dataset and compare it with the classical counterpart, obtaining only a 2.55% difference for RZ encoding.
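For reference, the classical baseline in this comparison is the standard K-NN decision rule: label a test point by majority vote among its k nearest training points. Our experiments used the scikit-learn implementation; the stdlib sketch below, including its toy data, is purely illustrative.

```python
from collections import Counter

def knn_classify(train, labels, x, k=3):
    """Plain K-NN: label x by majority vote among the k training
    points closest in squared Euclidean distance."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(p, x)), y)
        for p, y in zip(train, labels)
    )
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]

train = [(0.0, 0.0), (0.1, 0.2), (0.9, 1.0), (1.0, 0.8)]
labels = [0, 0, 1, 1]
print(knn_classify(train, labels, (0.95, 0.9)))  # 1
```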
Figure 15 and Figure 16 demonstrate that our circuit proposal maintains a consistent ratio between the depth and the number of gates for two qubits. As we scale the QK-NN up to a QRAM size of 128, it becomes inefficient, with the gate count increasing exponentially; this is supported by Table 7, where even with 256 cells, handling more than six features becomes impossible. Dealing with a large dataset could therefore be more efficient in Qiskit 1.0.x with GPU support. Nevertheless, we conducted experiments with the MNIST dataset that support and enrich the results of [40], which consider oracles and QRAM as given.
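The exponential growth referred to above can be illustrated with a simple counting model: if each QRAM cell is written with one multi-controlled rotation per feature, the rotation count is proportional to the number of cells. This one-gate-per-cell-per-feature model is a simplifying assumption for illustration, not the exact gate count of our circuit.

```python
def qram_rotation_count(address_qubits: int, n_features: int) -> int:
    """Illustrative model: one multi-controlled rotation per
    (cell, feature) pair, with 2**address_qubits cells."""
    return n_features * 2 ** address_qubits

# Each added address qubit doubles the cells, and hence the rotations:
for n in (7, 8):  # QRAM sizes 128 and 256
    print(2 ** n, qram_rotation_count(n, n_features=6))
```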
This work delineates the extent to which it is viable to perform the QK-NN algorithm through simulation. We consider it future work to address this process on real hardware, such as IBM Quantum processors. These processors would require the same number of gates only if the interacting qubits are directly connected; otherwise, the experiments would need extra ancillary qubits and additional CX gates. We also consider it essential to apply these results in robust projects that make use of quantum hardware, as in [48,49,50].
Table 9.
Here, we present an accuracy comparison across the different methods discussed in this paper with the MNIST dataset. All values are in percent.
Metric | Classical K-NN | QK-NN QRAM Size 128 with RY (This Proposal) | QK-NN QRAM Size 128 with RZ (This Proposal)
---|---|---|---
Accuracy | | |
Considering the work’s open-source nature, the code provided by this study can be incorporated into the noisy-simulation module in the Qiskit 1.0.x format to explore the method’s application on real quantum hardware. It supports both CPU and GPU execution, utilizing CUDA 12.4, although this is not the study’s primary purpose.
6. Conclusions
The notable findings of this study on the Iris dataset include an accuracy of approximately 94 ± 1.6%, placing the performance of the proposed algorithm on par with the classical K-NN version, which achieves an accuracy of approximately 96% using the scikit-learn implementation. Additionally, these accuracy values are roughly equivalent to those of recent quantum implementations, such as one approach using the polar distance metric with 95.82% and another implementing a sorting algorithm with 94.66%.
Regarding the MNIST dataset, classes 0 and 1 demonstrate an accuracy consistent with the classical result obtained using scikit-learn, differing by only 2.55% for RZ encoding. This outcome underscores critical advancements in the state of the art, presenting an experimental milestone for this dataset: it leverages a QRAM memory and an oracle integrated within a framework grounded in the quantum computing paradigm, such as Qiskit.
Our work builds on the theoretical foundations presented in [14,16,17] and provides a versatile framework capable of accommodating different training dataset sizes and different encoding schemes. Furthermore, in frameworks such as Qiskit, using intermediate measurements similar to dynamic circuits [51], while leaving other qubits unmeasured, enables further circuit processing and the application of Grover’s algorithm to identify the states with the highest probability among those with identical fidelity.
Our method offers potential extensions by integrating the Hamming distance with the fidelity metric, which would allow for the analysis of datasets with different data types. Leveraging the ability of the QRAM to accommodate different cell types, such as those represented by RY and RZ rotations, opens avenues for developing two different oracles. For example, one oracle could be based on the Hamming distance, while the other could be based solely on the SWAP test, assigning specific qubits from the training set and test set features to each distance metric.
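The two metrics this extension would combine can be sketched classically: the SWAP test accepts (the ancilla reads 0) with probability 1/2 + |⟨ψ|φ⟩|²/2, while the Hamming distance counts differing bits. The helper names below are illustrative, and the sketch assumes normalized real-valued state vectors.

```python
import math

def swap_test_prob(psi, phi):
    """Probability of measuring 0 on the SWAP-test ancilla:
    P(0) = 1/2 + |<psi|phi>|^2 / 2, for normalized real vectors."""
    overlap = sum(a * b for a, b in zip(psi, phi))
    return 0.5 + 0.5 * overlap ** 2

def hamming(a: str, b: str) -> int:
    """Hamming distance between equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))

psi = [1.0, 0.0]
phi = [math.sqrt(0.5), math.sqrt(0.5)]
print(round(swap_test_prob(psi, phi), 3))  # 0.75
print(hamming("1010", "1001"))             # 2
```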
The main limitations in applying our approach on modern hardware are the available qubit count and noise levels. Here, we have considered noiseless circuit simulation as a proxy for an anticipated blend of quantum error mitigation and error correction on a small number of qubits. Scaling the exact classical simulation of this algorithm to high qubit counts and circuit depths faces severe constraints, and it remains an open question whether approximate circuit methods, such as tensor networks, could produce equivalent results.
The natural next step is to consider more complicated problems with larger datasets and more complex feature sets; we are releasing code that may be easily adapted to such studies. It will be important to eventually understand whether quantum machine learning offers a provable performance advantage. This could be established in the future by demonstrating strong scaling performance in qubit count for complex problems, or by measuring significant complexity or wall-clock savings. The latter demonstration must likely wait for better hardware performance and at least partial quantum error correction.