Companion Repository for "Bugs in Quantum Computing Platforms: An Empirical Study"
This publication can be reused in at least two ways:
- Bug Collection: we provide a dataset of bugs that have been minimized to include only the files and lines of code that were responsible for the bug fix. It can be manually inspected to gain deeper insight into the types of bugs that occur in quantum computing platforms. Each bug fix contains two subfolders, named before and after, with the files before and after the fix. We recommend using a visual tool like Meld to compare the files before and after the bug fix.
Target audience: researchers interested in inspecting bug patterns of quantum computing platforms in detail.
- Bug Study: we provide the code to conduct our empirical study on the annotated data. The code to produce the plots can be reused to compute these metrics for any other annotated bug dataset (e.g., in another context or for an extension of the current work).
Target audience: researchers conducting an empirical study of bugs.
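When a visual tool such as Meld is not available, the before and after subfolders of a bug fix can also be compared programmatically. The following is a minimal sketch using only the Python standard library; the function names diff_bugfix and _read_lines are hypothetical and not part of this repository, and the sketch assumes the files are text files.

```python
import difflib
from pathlib import Path


def _read_lines(path):
    # A file may exist on only one side (added or deleted by the fix).
    if not path.is_file():
        return []
    text = path.read_text(encoding="utf-8", errors="replace")
    return text.splitlines(keepends=True)


def diff_bugfix(bugfix_dir):
    """Return a unified diff between the `before` and `after` subfolders."""
    root = Path(bugfix_dir)
    before, after = root / "before", root / "after"
    # Collect the relative paths of all files present on either side.
    rel_paths = sorted(
        {p.relative_to(before) for p in before.rglob("*") if p.is_file()}
        | {p.relative_to(after) for p in after.rglob("*") if p.is_file()}
    )
    chunks = []
    for rel in rel_paths:
        chunks.extend(
            difflib.unified_diff(
                _read_lines(before / rel),
                _read_lines(after / rel),
                fromfile=f"before/{rel}",
                tofile=f"after/{rel}",
            )
        )
    return "".join(chunks)
```

Calling diff_bugfix on one bug fix folder inside artifacts/minimal_bugfixes/ returns the diff as a string, which can be printed or written to a file.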
Potential Ideas for Future Work:
- Extension for APR: The bug collection dataset could be augmented with bug-triggering test cases to create a quantum-specific benchmark for automatic program repair (APR) techniques.
- Extension for Code Evolution Study: in the future, the current bug collection dataset could be extended with new bug fixes to investigate how the studied metrics have changed over time in the quantum computing platforms.
This repository contains the following resources.
- commits_considered_for_sampling.csv: containing the full list of commits considered for sampling, namely all the pairs of repository name and commit hash. Resource path: artifacts/commits_considered_for_sampling.csv.
- annotation_bugs.csv: including all the annotated commits, which turned out to be either bugs or false positives. For the bugs, we further annotate bug type, components, symptom, and bug patterns. To describe the bug, we also have a comment column, which contains a brief description of the bug or a quote from a developer involved in the discussion of the bug. Moreover, we have a localization column, which contains a link that roughly identifies the location of the bug in the commit or points to a relevant resource for understanding the bug. Resource path: artifacts/annotation_bugs.csv.
- annotation_components.csv: containing information on which part of the repository led us to the decision that a specific project includes a specific component, e.g., quantum abstraction or machine code generation. We report this for each platform together with the reference to the commit we used during manual source code inspection. Resource path: artifacts/annotation_components.csv.
- minimal_bugfixes folder: containing the minimal bug fixes that are required to fix the annotated bugs. Resource path: artifacts/minimal_bugfixes/. The bug fixes are grouped per repository, and each bug fix folder contains two subfolders, named before and after, which contain the relevant files before and after the bug fix.
- Reproducibility_of_Paper_Analysis.ipynb: containing the steps to reproduce the plots in the paper. Resource path: notebooks/Reproducibility_of_Paper_Analysis.ipynb.
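The annotation files are plain CSV, so they can also be explored outside the notebook. As a rough sketch, the helper below counts how often each value occurs in one column of an already opened CSV file. The function name count_values is hypothetical, and the exact column headers of artifacts/annotation_bugs.csv are an assumption here: check the header row of the file before choosing a column.

```python
import csv
from collections import Counter


def count_values(csv_file, column):
    """Count how often each value appears in `column` of an open CSV file."""
    reader = csv.DictReader(csv_file)
    if column not in (reader.fieldnames or []):
        raise KeyError(
            f"column {column!r} not found; available: {reader.fieldnames}"
        )
    # Skip rows where the cell is empty or missing.
    return Counter(row[column].strip() for row in reader if row[column])


# Example usage (the column name "symptom" is assumed, not verified):
# with open("artifacts/annotation_bugs.csv", newline="", encoding="utf-8") as f:
#     print(count_values(f, "symptom").most_common(5))
```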
The bugs have two unique ids for historical reasons, but a single id is enough to unequivocally identify a bug. The two ids are named:
- id: an incremental number used to uniquely identify the bug. It was used during the annotation process; if the same commit contains multiple bug fixes, we use a comma-separated nomenclature to refer to the additional bugs (e.g., given the commit acb123, the first bug has an id of 75, whereas the second has 75,5).
- human_id: introduced for readability purposes. It is a combination of the repository name followed by the first issue mentioned in the commit message, such as pennylane#481. In case of multiple bugs in the same commit, we use the naming convention pennylane#481 and pennylane#481_B.
Note that we never have more than two bugs per commit in this dataset.
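The human_id convention can be split back into its parts when grouping bugs by repository or by commit. The following is a minimal sketch, assuming that the issue reference is always numeric and that _B is the only suffix in use (consistent with at most two bugs per commit); the names HumanId and parse_human_id are hypothetical.

```python
import re
from typing import NamedTuple


class HumanId(NamedTuple):
    repository: str
    issue: int
    is_second_bug: bool


# Repository name, then "#", then the issue number, then an optional "_B".
_HUMAN_ID = re.compile(r"^(?P<repo>[^#]+)#(?P<issue>\d+)(?P<second>_B)?$")


def parse_human_id(human_id):
    """Split a human_id such as 'pennylane#481_B' into its components."""
    match = _HUMAN_ID.match(human_id.strip())
    if match is None:
        raise ValueError(f"unrecognized human_id: {human_id!r}")
    return HumanId(
        repository=match["repo"],
        issue=int(match["issue"]),
        is_second_bug=match["second"] is not None,
    )
```

For example, parse_human_id("pennylane#481_B") yields the repository pennylane, the issue number 481, and a flag marking it as the second bug of its commit.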
The notebook Reproducibility_of_Paper_Analysis.ipynb contains all the required steps to reproduce all the main paper's results in terms of plots and tables.
Hardware and Software Setup
We tested the analysis on the following setup:
- Operating System: Ubuntu 20.04.3 LTS
- Kernel: Linux 5.4.0-91-generic
- Architecture: x86-64
- CPU: Intel(R) Core(TM) i7-10610U CPU @ 1.80GHz
- conda 4.10.1
- Python 3.8.0
- RAM: 32 GB
Step-by-Step Reproducibility
Follow these steps to reproduce the paper's results. You can either use the docker container (OPTION A) or set up the conda environment (OPTION B). If you are only interested in the exact Python packages and versions used, you can find them in the conda_environment.yml file. We recommend OPTION A when possible.
OPTION A - Docker Container Approach
Requirement: you need Docker installed on your system.
- Go to the main directory of this repo
- Run the following command to download and run the docker container. This container has dependencies which have been tested to be compatible with the notebook notebooks/Reproducibility_of_Paper_Analysis.ipynb:
docker run -i -p 8888:8888 -v "$(pwd)":/home/jovyan -t jupyter/datascience-notebook:ubuntu-20.04 /bin/bash
OPTION B - Conda Environment Approach
Requirement: you need Conda installed on your system.
- To use the notebook with the exact dependencies we used, you have to create the same conda environment starting from the environment file named conda_environment.yml in the root of the repository. Run the following command to set up your environment:
conda env create --file conda_environment.yml
Note that your system might have assigned a different name to the environment, so use the one mentioned in your printout at the line
conda activate **environment_name**
Make sure to use the right one. The default name should be QuantumPlatformBugs, but the exact name does not matter.
- Then activate the conda environment by running:
conda activate QuantumPlatformBugs
Make sure that the name of your environment is visible in the command line:
(QuantumPlatformBugs) matteo@ubuntu:~/.../Bugs-Quantum-Computing-Platforms/
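Besides looking at the shell prompt, the active environment can be checked from Python. This is a small sketch, not part of the repository: it relies on the CONDA_DEFAULT_ENV variable that conda exports on activation, and the function name describe_environment is hypothetical.

```python
import os
import sys


def describe_environment(environ=None):
    """Return the active conda environment name and the Python version."""
    environ = os.environ if environ is None else environ
    # `conda activate` exports CONDA_DEFAULT_ENV; it is absent outside conda.
    env_name = environ.get("CONDA_DEFAULT_ENV")
    python_version = ".".join(str(part) for part in sys.version_info[:3])
    return env_name, python_version


if __name__ == "__main__":
    name, version = describe_environment()
    print(f"conda environment: {name}, Python: {version}")
```

If the first value is None, no conda environment is active in the current shell.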
Common Step for Analysis
- Now start the Jupyter notebook server:
jupyter notebook
- From the web UI, navigate to the notebooks directory, then open the Reproducibility_of_Paper_Analysis.ipynb notebook and run it from top to bottom. Note that the notebook will output images, which will be saved in the folder reproducibility_results.