Associate Bioinformatics Data Scientist
Signature Science, LLC
Charlottesville, VA, USA
6/14/2022
Technology
Full Time
Job Description
Position Purpose:
A bioinformatics data scientist is responsible for providing experimental design consulting and data analysis for large, high-throughput genomic experiments, with a focus on forensics and metagenomics. The bioinformatics data scientist will be responsible for designing and implementing annotated code for managing, manipulating, and analyzing large-scale genomic data, and for preparing thorough documentation and reporting.
This position is a full-time, on-site role at the Signature Science office in Charlottesville, VA.
Essential Duties and Responsibilities:
- Develop tools for management, analysis, and interpretation of high-density microarray and whole-genome sequencing data.
- Manage, manipulate, and analyze data using a combination of R, Python, and UNIX tools.
- Use established domain-specific open-source software and tools to manipulate and analyze genomic data.
- Implement and execute data processing workflows and automated analytic pipelines.
- Apply literate programming methods to develop reproducible workflows that produce consistent, standardized tables and figures.
- Conduct workflow benchmarking and documentation, identifying inconsistencies and resolving data problems.
- Prepare SOPs, document source code/workflows, and write reports to summarize computational requirements, processing status, and customized analysis results.
Required Knowledge, Skills & Abilities:
- Advanced proficiency working in a Unix/Linux environment.
- Advanced proficiency with open-source software, tools, and databases for analyzing next-generation sequencing data (whole-genome sequencing, RNA-seq, epigenetics, microbiome, and metagenomics).
- Proficiency working with and developing using Docker and/or Singularity container technology.
- Proficiency using version control software (e.g., Git or similar) to manage programming code.
- Proficiency with Python, Perl, or another scripting language.
- Proficiency with R, RMarkdown, and the "tidyverse" tools for data analysis.
- Preferred: Experience with Nextflow, Snakemake, or similar workflow/pipeline management systems.
- Preferred: Familiarity with developing and querying relational databases.
- Preferred: Familiarity with AWS and/or Azure cloud computing.
Education/Experience:
- BA or BS in Computer Science, Bioinformatics, or a related field.
- Experience managing and analyzing large-scale datasets produced by sequencing platforms, and delivering solutions for managing, visualizing, analyzing, and interpreting genomic data.
- Experience using Linux/Unix text processing tools, R, and other open-source tooling to manipulate and format data, assess data quality, and analyze data.
Clearance:
- This position requires that the candidate be willing and able to complete a successful background screening for a security clearance. Candidates with a current security clearance will receive preference.
Supervisory Responsibilities:
- May serve as a bioinformatics task lead.
Working Conditions/Equipment:
- Ability to work in varying conditions, including traditional office environments with extended sedentary periods required for code development and testing.
