GT-Scan2: cloud-based CRISPR target site ranking using chromatin information

LO Wilson1, A O'Brien1, R Dunne2 and DC Bauer1

  1. CSIRO Health and Biosecurity, Sydney
  2. CSIRO Data61, Sydney

The CRISPR-Cas9 system is one of the most widely adopted genome editing mechanisms. As such, being able to accurately score targets for their sensitivity and off-target specificity is critical. This, however, is hampered by uncertainty about how the chromatin environment influences in vivo binding activity. We therefore investigate whether chromatin marks can predict CRISPR-Cas9 activity by performing a meta-study of the in vivo binding activity of sgRNAs. Using DNase hypersensitivity marks as a proxy for accessibility, we see little correlation with CRISPR-Cas9 functionality. However, we find that the histone marks H3K14ac, H3K4me3 and H4K8ac are significantly associated with sgRNA activity (p-value < 0.05). We hence designed GT-Scan2, which combines sgRNA structure and the chromatin environment of the target site to predict the on-target efficacy of CRISPR-Cas9. Our method shows up to a 37% improvement over previously published methods when tested on two independent datasets. By leveraging the Roadmap Epigenomics Project, GT-Scan2 is readily applicable to all human tissues and cell types and provides a full end-to-end service, from identifying target sgRNAs to evaluating their effectiveness. GT-Scan2 is available as an Amazon Lambda function, which allows serverless, continuously scalable applications that can be shared and built upon without barriers to access.