
Genomic Data 101, 2018 Ed.
i) Introduction
ii) Abbreviations
iii) Glossary
1. Data Generation
- Next Generation Sequencing - A brief introduction to the different levels of NGS (whole genome, whole exome, etc.) and the nature of the output from a sequencer.
- Genomic Data Analysis - An overview of the process by which binary code is converted into meaningful health information.
- Data Storage and Sharing - An introduction to the issues presented by the size of genomic data. This section will also touch on the idea of centralised genomic databases and the legal and ethical implications raised by them.
2. Analysing Genomic Data
- Sequencing Alignment - An explanation of the different methods available and the pros and cons of each.
+ Reference Genome Mapping
+ De Novo Sequence Assembly
+ Graph-Based Reference Genomes
- Variant Calling - An explanation of the different techniques/tools available for detecting different types of variant and the situations in which they can be used.
+ Copy Number Variation
+ Single Nucleotide Variation
+ Comparison of Variant Calling Pipelines - An overview of why there has been a lack of comparison between the tools in the past, and how that is being overcome.
+ Variant Effect Predictors
3. Data Storage and Infrastructure
- Data Storage
+ Traditional In-House Infrastructure - The advantages and drawbacks of this approach, and why better solutions were needed for genomic data.
+ The Cloud - An overview of what cloud computing is, and why it is of particular use to researchers in the genomic space.
+ Flash Technology and Improved In-House Infrastructure
- Transferring Data - The problems associated with the transfer of large datasets and an overview of the solutions currently available.
4. Data Security
- Data Protection Laws - Some of the legislation surrounding the storage of genomic data and how it can affect researchers in the genomics space. The focus is mainly on the USA and UK/Europe.
- Genomic Databases - The legal considerations of sharing genomic information and the problems faced at a practical level.
- Cloud Security Resources - An overview of some of the ways that data can be kept safe in the cloud.
+ Virtual Private Networks
+ Encryption Tools
- Artificial Intelligence - An introduction to what AI is and why it can be helpful when handling large amounts of data.
+ Narrow/Weak AI
+ Strong/Full AI
+ Super Intelligent AI
- Teaching an AI - An explanation of the differences between the methods and why you might use one and not another.
+ Supervised Models
+ Unsupervised Models
+ Semi-Supervised Models
- Algorithm Applications - An overview of the two main ways by which algorithms can differentiate between data points and why each one might be used.
+ Generative Modelling
+ Discriminative Modelling
- Dealing With Problems - An outline of the biggest problems faced by machine learning at the moment and how they might be resolved.
+ The Limits of Prior Knowledge
+ Missing Data
+ Imbalanced Data Categories