Data Scientist
Cotiviti.com
Hybrid
Remote, India
Full Time
Overview
Cotiviti is seeking a Data Scientist to lead the development of advanced classification and predictive systems for healthcare risk adjustment and ICD-10 code classification. This role will focus on building intelligent NLP systems that analyze clinical charts and encounters to accurately identify and classify ICD-10 codes through sophisticated pattern recognition, machine learning, and natural language processing techniques.
Responsibilities
- Lead development of NLP-based classification systems for ICD-10 code identification from clinical charts and encounters
- Design and implement deep learning models using PyTorch and transformer architectures for medical text analysis
- Build and optimize machine learning models for accurate risk adjustment coding and HCC classification
- Develop decision support systems for automated ICD-10 code suggestion and validation
- Create and maintain feature engineering pipelines for clinical text processing and model training
- Implement model evaluation metrics and performance optimization strategies for healthcare coding accuracy
- Produce comprehensive technical documentation and training materials for NLP models
- Conduct system health checks and performance monitoring for deployed coding models
- Collaborate with engineering teams to integrate ML/NLP solutions into production systems
- Provide technical guidance on statistical modeling, transformer architectures, and algorithm selection
- Support data pipeline design and implementation for clinical text analytical workflows
- Participate in code reviews and maintain high standards for code quality
- Apply distributed computing frameworks and big data technologies to process large volumes of clinical data
- Mentor junior scientists and analysts in machine learning, NLP, and artificial intelligence best practices
- Use assistive AI technology to improve the quality and efficiency of modeling and analytics
- Complete all responsibilities as outlined in the annual performance review and/or goal setting (Required)
- Complete all special projects and other duties as assigned (Required)
- Must be able to perform duties with or without reasonable accommodation (Required)
Qualifications
- Bachelor's degree in Computer Science, Statistics, Mathematics, Data Science, or related field; Master's in Data Science or related field preferred
- 3+ years of experience in machine learning and AI with a focus on NLP, classification, and predictive systems
- Strong expertise in PyTorch and transformer architectures (BERT, RoBERTa, etc.) for text classification
- Advanced Python programming skills with experience in NLP frameworks and libraries
- Experience with healthcare claims data, clinical text processing, and ICD-10 coding systems preferred
- Knowledge of risk adjustment methodologies and HCC (Hierarchical Condition Categories) coding
- Experience with feature engineering techniques for clinical text and model evaluation methodologies
- Experience with model deployment and monitoring in production environments
- Understanding of data quality frameworks and error detection methodologies for healthcare coding
- Strong analytical and problem-solving skills with a focus on clinical data challenges
- Excellent communication skills with the ability to explain technical NLP concepts clearly
- Experience with version control systems and collaborative development practices
- Expertise with SQL and data manipulation tools for healthcare datasets
