AAV Genome Integrity Characterization¶
Overview¶
Customer Name | Biotech Bob |
---|---|
Institution | Biotech ABC |
Contact Email | bob@example.com |
Quote Number | 00-00002 |
Order Date | 2025-06-01 |
Sample Name | ss.subsample005 |
Sample Unique ID | ss.subsample005 |
Sequencing Platform | PacBio Sequel IIe |
Software Version | 1.0.8 |
Support Email | ngs@azenta.com |
Executive Summary¶
This report contains the results from a comprehensive analysis of PacBio long-read sequencing data. Herein, we explore the capsid contents to identify the full vector genomes that map to the ITR-to-ITR cassette and the partial genomes and contaminants that constitute non-therapeutic capsids.
The quality rating below reflects the proportion of sequences that represent full genomes packaged into capsids. The rating is color-coded for risk. A strong target for therapeutic vectors is 75% or more full genomes, shown in green. A value below 50% represents a significant risk for manufacturability, safety, and/or efficacy and is shown in red. For a medium- to high-risk score (50-75%), best practices recommend investigation and possible redesign.
Best practices recommend investigation and potential redesign for a medium to high-risk score.
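The rating thresholds described above can be written as a small lookup. This is a minimal sketch, not the report's actual scoring code: the function name `rate_full_genome_percent` and the band labels are hypothetical, and because the report does not name a color for the 50-75% band, that band is labeled `intermediate` here.

```python
def rate_full_genome_percent(percent_full: float) -> str:
    """Map the percent of full genomes to a risk band (hypothetical labels)."""
    if not 0.0 <= percent_full <= 100.0:
        raise ValueError("percent_full must be between 0 and 100")
    if percent_full >= 75.0:
        return "green"         # strong target for therapeutic vectors
    if percent_full >= 50.0:
        return "intermediate"  # medium- to high-risk: investigate, consider redesign
    return "red"               # significant risk


# Illustrative inputs only, not values from this report.
for value in (80.0, 60.0, 40.0):
    print(value, rate_full_genome_percent(value))
```

The boundaries follow the wording of the text: 75% itself counts as green, and 50% itself falls in the intermediate band because only values below 50% are described as red.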
Connect with Form Bio to enhance your genome integrity analysis and optimize your vector design.
Capsid Content Overview¶
The proportions of full, partial, and contaminant genomes illustrate the genome integrity and indicate the risks for the displayed vector.
Category | Read Count | Total Read Count (%) |
---|---|---|
full | 2235 | 45.57 |
partial | 1512 | 30.83 |
contaminant | 1158 | 23.61 |
Table 1. Read counts and percent of total read counts for data aggregated across subtypes. Full: reads that map to the ITR-to-ITR reference genome. Partial: all reads that map to the reference genome but are not fully ITR-to-ITR. Contaminant: all reads that map to any other reference genome (human host, pRepCap or pHelper, plasmid backbone, etc.), even if they include parts of the reference AAV, plus unmapped DNA.
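The percentages in Table 1 can be reproduced from the read counts alone. The sketch below assumes each percentage is the category count divided by the sum of the three counts; the variable names are illustrative.

```python
# Read counts copied from Table 1.
read_counts = {"full": 2235, "partial": 1512, "contaminant": 1158}

total_reads = sum(read_counts.values())

# Percent of total reads per category, rounded to two decimals as in the table.
percent_of_total = {
    category: round(100.0 * count / total_reads, 2)
    for category, count in read_counts.items()
}

print(total_reads)
print(percent_of_total)
```

Because each value is rounded independently, the rounded percentages do not have to sum to exactly 100.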
Figure 1. Percent of total read counts for aggregated data across AAV subtypes.
AAV Reads Mapped to Reference Sequence¶
The percent of total read counts mapped to each start and end position indicates both where along the vector design subgenomic molecules are present and how frequently those molecules were produced during packaging.
Figure 2. Distribution of alignment start positions by AAV subtypes. Percent of total read counts (y-axis) vs. subtype start position (x-axis). Vertical dotted lines indicate the ITR boundaries.
Figure 3. Distribution of alignment end positions by AAV subtypes. Percent of total read counts (y-axis) vs. subtype end position (x-axis). Vertical dotted lines indicate the ITR boundaries.
Figure 4. Distribution of alignments by AAV subtypes. Percent of total read counts (y-axis) vs. mapped reference length (x-axis).
Partial Genome Contents¶
This section presents the distribution of partial vector genomes to illustrate the proportions of subgenomic species, including left partials, right partials, left snapbacks, right snapbacks, and other vector partials, which have read characteristics other than scAAV or ssAAV. These subgenomic molecules indicate risks for manufacturability, efficacy, and safety of the drug substance.
Assigned Type | Partial | Read Count | Total Read Counts (%) |
---|---|---|---|
ssAAV | right-snapback | 378 | 7.71 |
ssAAV | right-partial | 360 | 7.34 |
ssAAV | left-snapback | 168 | 3.43 |
ssAAV | left-partial | 161 | 3.28 |
ssAAV | partial | 29 | 0.59 |
other-vector | complex | 219 | 4.46 |
other-vector | tandem | 137 | 2.79 |
other-vector | snapback | 51 | 1.04 |
other-vector | unclassified | 9 | 0.18 |
Table 2. Partial genome read types and subtypes with counts and percent of total read counts.
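As a consistency check, the rows of Table 2 can be grouped by assigned type and compared with the partial read count in Table 1. This is a minimal sketch with illustrative variable names, not part of the reporting pipeline.

```python
from collections import defaultdict

# (assigned type, partial subtype, read count) rows copied from Table 2.
partial_rows = [
    ("ssAAV", "right-snapback", 378),
    ("ssAAV", "right-partial", 360),
    ("ssAAV", "left-snapback", 168),
    ("ssAAV", "left-partial", 161),
    ("ssAAV", "partial", 29),
    ("other-vector", "complex", 219),
    ("other-vector", "tandem", 137),
    ("other-vector", "snapback", 51),
    ("other-vector", "unclassified", 9),
]

# Sum read counts within each assigned type.
counts_by_type = defaultdict(int)
for assigned_type, _subtype, count in partial_rows:
    counts_by_type[assigned_type] += count

total_partial = sum(counts_by_type.values())

print(dict(counts_by_type))
print(total_partial)  # should equal the partial read count in Table 1 (1512)
```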
Figure 5. Proportions of partial vector genomes. Partial vector genome categories as percent of total reads with counts above bars.
Figure 6. Relative proportions of partial vector genomes. Partial vector genome categories as percent of total reads.
Figure 7. Frequency and read length of partial vector genomes. Percent of aggregate partial read counts (y-axis) vs. read length of partial genomes (x-axis). Partial genome subtypes indicated by color.
Contaminant Contents¶
This section presents the distribution of contaminant subtypes to illustrate the proportions of contaminant species originating outside the ITR-to-ITR vector genome, including unmapped, chimeric non-vector, chimeric vector, pRepCap, pHelper, host genome, and vector backbone reads. These contaminant molecules indicate risks for the efficacy and safety of the drug substance.
Contaminant | Read Count | Total Read Counts (%) |
---|---|---|
unresolved-dimer | 975 | 19.88 |
chimeric-vector | 49 | 1.00 |
repcap | 46 | 0.94 |
(unmapped) | 28 | 0.57 |
host | 21 | 0.43 |
vector-backbone | 17 | 0.35 |
helper | 12 | 0.24 |
chimeric-nonvector | 8 | 0.16 |
read-through | 2 | 0.04 |
Table 3. Contaminant read subtypes with counts and percent of total read counts.
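Table 3 reports each contaminant as a percent of all reads. To see how the contaminant fraction itself is composed, the same counts can be renormalized to the contaminant total. This renormalization is an illustration only and is not necessarily how any figure in this report is drawn.

```python
# Read counts copied from Table 3.
contaminant_counts = {
    "unresolved-dimer": 975,
    "chimeric-vector": 49,
    "repcap": 46,
    "(unmapped)": 28,
    "host": 21,
    "vector-backbone": 17,
    "helper": 12,
    "chimeric-nonvector": 8,
    "read-through": 2,
}

total_contaminant = sum(contaminant_counts.values())

# Share of each subtype within contaminant reads only.
share_within_contaminants = {
    name: round(100.0 * count / total_contaminant, 1)
    for name, count in contaminant_counts.items()
}

print(total_contaminant)
for name, share in share_within_contaminants.items():
    print(f"{name}: {share}%")
```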
Figure 8. Proportions of contaminants. Contaminant categories as percent of total read counts with counts above bars.
Figure 9. Relative proportions of contaminants. Contaminant categories as percent of total reads.
Figure 10. Frequency and read length of contaminants. Percent of aggregate contaminant read counts (y-axis) vs. read length of contaminants (x-axis). Contaminant subtypes indicated by color.
Truncations¶
This section provides insight into the magnitude and location of the truncation species found in the vector prep. Truncations are reads consisting of a single-stranded partial vector genome that includes one ITR and terminates at a site before the opposite ITR. They can occur as partials caused by vector genome breaks or as snapback genomes, and they represent risks for manufacturability, efficacy, and safety.
Figure 11. Truncation location and frequency in partial vector genomes. Vector break frequency in percent of vector genome read counts (y-axis) at locations along the vector genome reference (x-axis), and color-coded for magnitude.
Figure 12. Frequency and read length of truncation species. Percent of vector genome partial read counts (truncations, y-axis) vs. read length of partial vector genomes (x-axis). Vector genome partial subtypes indicated by color.
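The definition of a truncation given in this section can be illustrated with a simple coordinate check. The sketch below is an assumption-laden example rather than the classifier used to produce this report: the function name, the 0-based half-open coordinates, the `tolerance` parameter, and the ITR boundary values are all hypothetical, and snapback detection (which needs strand information) is not attempted.

```python
def classify_alignment(start: int, end: int, ref_length: int,
                       left_itr_end: int, right_itr_start: int,
                       tolerance: int = 10) -> str:
    """Classify one alignment on the ITR-to-ITR reference (hypothetical rules).

    Coordinates are 0-based, half-open: the read covers [start, end).
    """
    reaches_left_end = start <= tolerance
    reaches_right_end = end >= ref_length - tolerance

    if reaches_left_end and reaches_right_end:
        return "full"           # spans ITR to ITR
    if reaches_left_end and end < right_itr_start:
        return "left-partial"   # has the left ITR, stops before the right ITR
    if reaches_right_end and start > left_itr_end:
        return "right-partial"  # has the right ITR, starts after the left ITR
    return "other"


# Hypothetical 4700-nt reference with 145-nt ITRs at each end.
REF = dict(ref_length=4700, left_itr_end=145, right_itr_start=4555)
print(classify_alignment(0, 4700, **REF))
print(classify_alignment(0, 2000, **REF))
print(classify_alignment(3000, 4700, **REF))
print(classify_alignment(1000, 3000, **REF))
```

For a left-partial read, the alignment end position is the truncation site, which is the kind of value plotted along the reference in a break-frequency figure.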
Mutational Distribution¶
This section presents mutations (deletions, insertions, and substitutions) found in the sequencing data of full-length vector genomes. Mutations can cause silent, missense, or nonsense changes in the transcript. Mutation location and size are plotted.
Figure 13. Location of mutations in the vector genome. Percent of vector genome read counts (y-axis) with color indicating mutation type and size corresponding to the number of nucleotides for the specified type.
Figure 14. Nucleotide substitutions along vector genome. Counts binned by 10 nucleotides (NTs) (y-axis) and position (x-axis) for substitutions in all full vector genomes (including the ITRs).
Figure 15. Nucleotide deletions along vector genome. Counts binned by 10 nucleotides (NTs) (y-axis) and position (x-axis) for deletions in all full vector genomes (including the ITRs).
Figure 16. Nucleotide insertions along vector genome. Counts binned by 10 nucleotides (NTs) (y-axis) and position (x-axis) for insertions in all full vector genomes (including the ITRs).
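Figures 14 to 16 report counts binned by 10 nucleotides. The sketch below shows one way such binning could be done; the function name and the example positions are hypothetical, and positions are assumed to be 0-based coordinates on the vector reference.

```python
from collections import Counter

BIN_SIZE = 10  # nucleotides per bin, as stated in the figure captions


def bin_positions(positions, bin_size=BIN_SIZE):
    """Count positions per bin; each key is the first coordinate of its bin."""
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    return Counter((pos // bin_size) * bin_size for pos in positions)


# Hypothetical substitution positions, not data from this report.
substitution_positions = [3, 7, 12, 15, 18, 101]
binned = bin_positions(substitution_positions)

for bin_start in sorted(binned):
    print(f"{bin_start}-{bin_start + BIN_SIZE - 1}: {binned[bin_start]}")
```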
ITR Assessment¶
Inverted terminal repeats (ITRs) of an AAV genome are essential for replication, packaging, and gene expression, so ITR fidelity is critical to vector genome integrity. Here we present the length and mutational burden of the ITRs of the vector genomes, including the locations of mutations, and the ratio of flip and flop orientations.
Figure 17. Locations of mutations in full length ITRs. Percentage of mutant (insertions, deletions, substitutions) read counts found in 5’ and 3’ ITRs (y-axis) with color indicating mutation type and size corresponding to the number of nucleotides for the specified type.
Flip and flop indicate the orientation of the inverted terminal repeat (ITR) of an AAV genome. The orientation and structural integrity of ITRs in AAV genomes significantly impact manufacturability and transgene expression, which ultimately determines efficacy.
Assigned Type | Assigned Subtype | Flip/Flop Configuration | Count | Vector Genome Read Count (%) |
---|---|---|---|---|
ssAAV | full | flip-flip | 185 | 3.90 |
ssAAV | full | flip-flop | 207 | 4.37 |
ssAAV | full | flop-flip | 218 | 4.60 |
ssAAV | full | flop-flop | 199 | 4.20 |
ssAAV | full | unclassified | 358 | 7.55 |
ssAAV | left-partial | unclassified | 428 | 9.03 |
ssAAV | right-partial | unclassified | 967 | 20.40 |
Table 4. Flip/flop ratio. The ratio of ITR orientations seen in full vector genome sequences.
Figure 18. Flip/flop proportions. The percent of vector read counts (y-axis) vs the Flip-Flop ITR orientations seen in full vector genome sequences.
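To compare the four ITR orientations directly, the counts for full genomes in Table 4 can be normalized to the full genomes whose orientation was classified. Excluding the `unclassified` rows is a choice made for this sketch and is not necessarily the normalization used in Table 4 or Figure 18.

```python
# Counts for ssAAV full genomes with a classified orientation, from Table 4.
flip_flop_counts = {
    "flip-flip": 185,
    "flip-flop": 207,
    "flop-flip": 218,
    "flop-flop": 199,
}

classified_total = sum(flip_flop_counts.values())

# Percent of classified full genomes in each orientation.
orientation_share = {
    config: round(100.0 * count / classified_total, 2)
    for config, count in flip_flop_counts.items()
}

print(classified_total)
print(orientation_share)
```

If the four orientations were equally likely, each share would be close to 25%.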
Host Contaminant Genome Analysis¶
During manufacturing, host genome sequence can recombine into the vector genome. This section reports the origin of host genome contaminants by location (the default reference is hg38), including chromosome and gene, and the number of reads for each host contaminant.
Chromosome | Read Count Total | Read Count (%) |
---|---|---|
chr19 | 21 | 0.428 |
Table 5. Host cell genome mapping to chromosome with counts and percent total read counts.
Gene | Read Count | Total Read Count (%) |
---|---|---|
WASH5P | 5 | 0.10 |
LOC128966629 | 2 | 0.04 |
FAM138F | 1 | 0.02 |
Table 6. Host cell genome mapping to top genes with counts and percent total read counts.
Figure 19. Host cell genome mapping to chromosome. Chromosome location (y-axis) and percent of total read counts (x-axis) for most abundant host genome contamination reads.
Figure 20. Host cell genome mapping to gene. Gene (y-axis) and percent of total read counts (x-axis) for most abundant host genome contamination reads.
RepCap Contaminant Genome Analysis¶
During manufacturing, various plasmids are required, including one containing the AAV Rep (replication) and Cap (capsid) genes, which are essential for vector genome replication and packaging. Sometimes these components are erroneously packaged into capsids and become one of the major contaminants found in rAAV preps. This section provides the origin of RepCap contaminants in terms of location and the number of reads.
Figure 21. Distribution of RepCap contamination. Percent of repcap plasmid read counts (y-axis) vs. mapped reference position (x-axis). Start and end positions indicated by color.
Figure 22. Density of RepCap contaminants. Density of repcap plasmid counts (y-axis) vs. mapped reference length (x-axis).
Helper Contaminant Genome Analysis¶
During manufacturing, various plasmids are required, including one containing helper genes, which are important for vector genome replication and packaging. Because rAAV cannot replicate on its own, helper genes such as adenovirus E4, E2a, and VA RNA are coexpressed to promote rAAV replication. Sometimes these components are erroneously packaged into capsids and become one of the major contaminants found in rAAV preps, indicating risks for safety and efficacy. This section provides the origin of helper contaminants in terms of location and the number of reads.
Figure 23. Distribution of Helper contamination. Percent of helper plasmid read counts (y-axis) vs. mapped reference position (x-axis). Start and end positions indicated by color.
Figure 24. Density of Helper contaminants. Density of helper gene counts (y-axis) vs. mapped reference length (x-axis).
To gain deeper insights into your genome integrity and optimize your vector design, please contact Form Bio.