GWAS Scientist, Cancer Genomics Research Laboratory - Remote (req2039)

Employer
  • Frederick National Laboratory

Job Description

PROGRAM DESCRIPTION

Join our talented team of bioinformaticians dedicated to understanding the genetics of cancer! We are seeking an enthusiastic, creative, and collaborative bioinformatics scientist to support our broad portfolio of genome-wide association studies (GWAS).

The Cancer Genomics Research Laboratory (CGR) investigates the contribution of germline and somatic genetic variation to cancer susceptibility and outcomes in support of the NCI's Division of Cancer Epidemiology and Genetics (DCEG), the world's most comprehensive cancer epidemiology research group. CGR is located at the NCI Shady Grove campus in Gaithersburg, MD and operated by Leidos Biomedical Research, Inc. We care deeply about discovering the genetic and environmental determinants of cancer, and new approaches to cancer prevention, through our contributions to the molecular, genetic, and epidemiologic research of the 70+ investigators in DCEG. Our bioinformaticians have both the passion to learn and the opportunity to apply their skills to our rich and varied genotyping and sequencing datasets, generated in support of DCEG's multidisciplinary family- and population-based studies. Working in concert with the epidemiologists, biostatisticians, and basic research scientists in DCEG's intramural research program, CGR conducts genome-wide discovery studies and targeted regional approaches to identify the heritable determinants of various forms of cancer.

KEY ROLES/RESPONSIBILITIES

  • Function as a scientific thought leader within CGR and DCEG for all aspects of GWAS and population genetics. Collaborate closely with DCEG PIs on scientific manuscript development, submission, and revision activities with significant co-authorship and potentially first authorship opportunities
  • Guide large-scale genotyping data QC, phasing and imputation activities
  • Execute population structure testing, association studies, meta-analysis, and fine mapping
  • Contribute to building, benchmarking, and maintaining bioinformatics pipelines to facilitate high-throughput genomic data analysis in HPC and cloud environments
  • Harmonize and maintain diverse datasets and associated metadata, including performing meta-analyses of data run on multiple platforms and/or externally generated data
  • Thoughtfully synthesize results into clear presentations (including QQ plots and Manhattan plots) and concise summaries of work to support recommendations for next steps
  • Perform advanced research including multiplicative interaction studies, pathway-based studies, and integrative analyses from multiple platforms and various data types
  • Coordinate with additional resources throughout DCEG, including fellows, post-docs, and other contract resources, to support an integrative, collaborative, collegial environment

BASIC QUALIFICATIONS

To be considered for this position, you must minimally meet the knowledge, skills, and abilities listed below:
  • Possession of a PhD degree in bioinformatics, statistics, genetics, computational biology, or a related field from an accredited college or university according to the Council for Higher Education Accreditation (CHEA), or eight (8) years of related experience. Foreign degrees must be evaluated for U.S. equivalency
  • No experience required beyond PhD
  • Scientific and/or complex system management/bioinformatics experience
  • In-depth knowledge of genome-wide association studies and interpretation, and applied computational research on large multivariate datasets
  • Expertise in algorithmic implementation, statistical programming, and data manipulation using, e.g., R/Bioconductor, Python, MATLAB, and a wide range of contemporary, open-source bioinformatics tools (e.g. PLINK, SNPTEST, IMPUTE2, BEAGLE, UCSC Genome Browser, Michigan Imputation Server)
  • Proficiency with Bash, Python, Perl, R, C/C++, and/or Java
  • Team-oriented with excellent written and verbal communication skills, organizational skills, and attention to detail; ability to organize and execute multiple projects in parallel
  • Demonstrated ability to proactively remain up-to-date in current bioinformatics techniques and resources, and identify and benchmark novel software solutions against established reference datasets
  • Experience in constructing practical computational tools/pipelines for data parsing, quality control, modelling, and analysis for large-scale genetic or genomics datasets

PREFERRED QUALIFICATIONS

Candidates with these desired skills will be given preferential consideration:
  • Three (3) years of progressively responsible scientific and/or complex system management/bioinformatics experience
  • Familiarity with publicly available data sources (such as dbGaP, GDC/TCGA, ENCODE, 1000 Genomes, gnomAD/ExAC, TARGET, GTEx) and diverse genomic annotations
  • Experience managing large datasets and computational tasks in a Linux-based high-performance computing environment
  • Pipeline development experience, including collaborative coding and use of source control (e.g. git)
  • Experience with Snakemake or other workflow management systems
  • Experience with containerization (e.g. Singularity, Docker)
  • Experience with Google Cloud, AWS, or managed cloud environments
  • Experience in the field of molecular and population genetics with a strong publication record

