Missing genes and did not produce circular mitogenome #217

Open
meeranhussain opened this issue Aug 5, 2024 · 9 comments

@meeranhussain

Hi,

I tried assembling the mitogenome of my insect species using MitoZ. The final assembly is not circular and is 15,442 bp long. Additionally, the following genes are missing:

l-rRNA
tRNA-Cys
tRNA-Glu
tRNA-Ser
tRNA-Val
I've tried different k-mer sizes ranging from 57 to 129 with both SPAdes and MEGAHIT. However, the assembly remains non-circular, with the same genes missing.

Any suggestions to better use MitoZ to get a circular genome would be greatly appreciated.

Thank you!

@linzhi2013
Owner

How about smaller k-mers, e.g. 31, 35, 39, 45, 55?

@meeranhussain
Author

I did try smaller k-mers with SPAdes, but the results were the same. Also, FYI, I assembled the mitogenome with Flye in meta mode from Oxford Nanopore Technologies (ONT) long-read data. This produced circularized contigs of 29-32 kb that contained all the genes, which is unusually large for an insect mitogenome. To check the long-read method, I ran it on Calliphora sp. ONT data (a mitogenome that is typically 15-16 kb) and again got a 32 kb circular contig, which raises concerns about misassembly with this method. Any suggestions would be helpful.
summary.txt

@linzhi2013
Owner

You can annotate the ONT-derived mitogenome with MitoZ and use the annotation to find the breakpoints that delimit a single circular mitogenome.

Or you can map the NGS reads to the ONT mitogenome, call a consensus, and then cut it based on the gene annotation results.
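
A 29-32 kb circular contig for a genome expected at 15-16 kb may be a head-to-tail duplicate of the mitogenome. As a rough check before annotation, the sketch below estimates the repeat unit length from the distances between k-mers that occur exactly twice in the contig. The function name `estimate_repeat_unit` and the default `k=21` are illustrative assumptions, not part of MitoZ or Flye.

```python
from collections import Counter, defaultdict
from typing import Optional


def estimate_repeat_unit(seq: str, k: int = 21) -> Optional[int]:
    """Return the most common distance between k-mers that occur exactly twice.

    For a contig that is a tandem duplicate of a shorter genome, this distance
    approximates the length of one genome copy. Returns None when no k-mer
    occurs exactly twice.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    # Only k-mers seen exactly twice give an unambiguous copy-to-copy distance.
    distances = Counter(
        pos[1] - pos[0] for pos in positions.values() if len(pos) == 2
    )
    if not distances:
        return None
    return distances.most_common(1)[0][0]
```

If the estimate is close to half the contig length, the contig probably holds two copies of the mitogenome, and a single unit can then be cut out using the annotation.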

@meeranhussain
Author

Thanks for your reply. I annotated the ONT-generated mitogenome with MitoZ, but it still did not annotate the following genes: "Potential missing genes:
#Gene total_missing_number
l-rRNA 1
tRNA-Cys 1
tRNA-Glu 1
tRNA-Ser 1
tRNA-Val 1"

However, MITOS2 did annotate these genes on the same ONT mitogenome. Is there a reason for this difference?

@linzhi2013
Owner

It could be that your sample is too divergent from the database MitoZ uses to annotate these genes.

@meeranhussain
Author

That makes sense. How do I find the circular breakpoints in this mitogenome? Is there a file generated during annotation that I can use?

@linzhi2013
Owner

Based on the coordinates of a repeated gene, such as COX1.
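
As a minimal sketch of that idea: if the annotation reports two copies of COX1 on the same strand, the stretch from the start of the first copy up to the base before the start of the second copy should be one full mitogenome unit. The function name `cut_single_unit` and the toy sequence are hypothetical, and coordinates are assumed to be 1-based as in a GenBank feature table.

```python
def cut_single_unit(seq: str, first_start: int, second_start: int) -> str:
    """Extract one genome unit from a contig that carries two copies of a gene.

    first_start and second_start are the 1-based start coordinates of the two
    copies of the same gene (for example COX1), both on the same strand.
    The returned unit begins at the first copy and ends just before the second.
    """
    if not 1 <= first_start < second_start <= len(seq):
        raise ValueError("expected 1 <= first_start < second_start <= len(seq)")
    return seq[first_start - 1:second_start - 1]


# Toy example: two copies of the 8 bp circle "ATGCCGTA", linearized off-gene.
contig = "GTAATGCCGTAATGCC"
unit = cut_single_unit(contig, 4, 12)  # "ATG" starts at positions 4 and 12
```

The extracted unit starts at the chosen gene, which is one rotation of the circular sequence. It does not by itself prove circularity, so a read-mapping check across the new junction is still needed.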

@meeranhussain
Author

Apologies for the delayed response, and thank you for your earlier guidance regarding the use of repeated gene coordinates, such as COX1, to identify the break point in circular mitogenomes.
I was wondering if you could kindly elaborate on the process of identifying the exact circular break point?

Additionally, as an alternative approach, I tried assembling the mitogenome using SPAdes in meta mode and obtained a contig of 15,324 bp with higher coverage (~700x). Upon annotating it with MITOS2, all genes were successfully annotated. However, I am unsure how to confirm whether this contig is circular. Any suggestions or recommendations for verifying circularity in this case would be greatly appreciated.

@linzhi2013
Owner

linzhi2013 commented Nov 26, 2024

Hi @meeranhussain ,

MitoZ has incorporated SPAdes meta mode; you can use the option --assembler spades directly.

Assuming there are no long repeats (longer than your k-mer sizes or read length), you can redefine the breakpoints and join the head and tail of the original sequence. Then map the clean data to this new sequence with the mitoz visualize command. If the new joining point and its surrounding region have coverage as even as the rest of the sequence, you can confidently say your mitogenome is truly circular. But if the coverage there is much higher than elsewhere, it suggests there are repeats around this region: the read length and k-mer size cannot span the repeats, so perhaps only one repeat unit was assembled and the other units were missed.
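
The check described above can be sketched roughly as follows. The sequence is rotated so that the original head/tail junction lands in the interior, and after mapping reads to the rotated sequence, the depth around the junction is compared with the overall median. The helper names, the `flank` default, and the idea of taking the depth values from a per-base table (for example the third column of `samtools depth -a`) are assumptions for illustration, not MitoZ functionality.

```python
from statistics import median
from typing import Sequence


def rotate(seq: str, offset: int) -> str:
    """Move the first `offset` bases to the end of the sequence."""
    offset %= len(seq)
    return seq[offset:] + seq[:offset]


def junction_position(length: int, offset: int) -> int:
    """0-based index, in the rotated sequence, of the original first base."""
    return (length - offset % length) % length


def junction_depth_ratio(depth: Sequence[int], junction: int, flank: int = 50) -> float:
    """Median depth within `flank` bases of the junction over the global median."""
    overall = median(depth)
    if overall == 0:
        raise ValueError("median depth is zero; nothing mapped?")
    window = depth[max(0, junction - flank):junction + flank]
    return median(window) / overall
```

A ratio near 1 is what an evenly covered, genuinely circular junction would give. A ratio well above 1 points to collapsed repeats near the junction, and a ratio near 0 suggests the head and tail do not actually join.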

Best
Guanliang
