Mistral AI Multi Agent Workflow For Recruitment

Multi Agent Workflow For Recruitment

Introduction

The Multi Agent Workflow For Recruitment is an automated system that streamlines hiring with specialized AI agents that work together to evaluate candidates more consistently, save time and resources, and improve hiring outcomes.

The Problem

Today's recruitment landscape faces three critical challenges:

Overwhelming Volume: Recruiters struggle to efficiently process large numbers of applications, often missing qualified candidates.
Manual Inefficiency: Traditional resume screening is time-consuming, inconsistent, and vulnerable to bias.
Poor Candidate Experience: Slow response times and fragmented communication damage employer brand and lose top talent.

Why This Matters

Ineffective recruitment directly impacts business outcomes through:

Reduced Performance: Missing qualified candidates leads to suboptimal hires and team performance
Business Delays: Extended hiring cycles postpone critical projects and initiatives
Higher Costs: Inefficient processes and prolonged vacancies increase recruitment costs

Our Solution

The Multi Agent Workflow For Recruitment addresses these challenges through a coordinated system of specialized AI agents:

DocumentAgent: Extracts and processes text from resumes and job descriptions using Mistral OCR
JobAnalysisAgent: Analyzes job descriptions to identify required skills, experience, and qualifications
ResumeAnalysisAgent: Parses resumes to create structured candidate profiles with key capabilities
MatchingAgent: Evaluates candidates against job requirements with nuanced understanding beyond keyword matching
EmailCommunicationAgent: Generates personalized email communications and schedules interviews with qualified candidates
CoordinatorAgent: Orchestrates the entire workflow between agents for seamless operation.

The solution uses Mistral LLM for language understanding, structured output mechanisms for consistent data extraction, and Mistral OCR for document parsing.

Example: Data Scientist Hiring

To illustrate how the Multi Agent Workflow For Recruitment operates in practice, consider a realistic example:

HireFive needs to hire a Senior Data Scientist with machine learning expertise. The job description specifies requirements including 3+ years of experience, proficiency in Python and deep learning frameworks, and a Master's degree in a quantitative field. From a pool of candidate resumes, the workflow automatically:

Extracts structured requirements from the job description, identifying critical skills
Parses all the resumes, creating standardized profiles with skills, experience, and education
Evaluates each candidate, assigning scores like "Technical Skills: 32/40" and "Experience: 25/30"
Identifies candidates who score at or above a configurable threshold (70 points in this example; the run later in this notebook uses 65)
Automatically sends personalized interview invitations with scheduling links to these candidates

The entire process completes in minutes, providing HireFive's hiring manager with a ranked list of qualified candidates while eliminating hours of manual resume screening.

Solution Architecture

Installation

[1]

Collecting mistralai
  Downloading mistralai-1.6.0-py3-none-any.whl.metadata (30 kB)
Collecting eval-type-backport>=0.2.0 (from mistralai)
  Downloading eval_type_backport-0.2.2-py3-none-any.whl.metadata (2.2 kB)
Requirement already satisfied: httpx>=0.28.1 in /usr/local/lib/python3.11/dist-packages (from mistralai) (0.28.1)
Requirement already satisfied: pydantic>=2.10.3 in /usr/local/lib/python3.11/dist-packages (from mistralai) (2.11.2)
Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.11/dist-packages (from mistralai) (2.8.2)
Requirement already satisfied: typing-inspection>=0.4.0 in /usr/local/lib/python3.11/dist-packages (from mistralai) (0.4.0)
Requirement already satisfied: anyio in /usr/local/lib/python3.11/dist-packages (from httpx>=0.28.1->mistralai) (4.9.0)
Requirement already satisfied: certifi in /usr/local/lib/python3.11/dist-packages (from httpx>=0.28.1->mistralai) (2025.1.31)
Requirement already satisfied: httpcore==1.* in /usr/local/lib/python3.11/dist-packages (from httpx>=0.28.1->mistralai) (1.0.7)
Requirement already satisfied: idna in /usr/local/lib/python3.11/dist-packages (from httpx>=0.28.1->mistralai) (3.10)
Requirement already satisfied: h11<0.15,>=0.13 in /usr/local/lib/python3.11/dist-packages (from httpcore==1.*->httpx>=0.28.1->mistralai) (0.14.0)
Requirement already satisfied: annotated-types>=0.6.0 in /usr/local/lib/python3.11/dist-packages (from pydantic>=2.10.3->mistralai) (0.7.0)
Requirement already satisfied: pydantic-core==2.33.1 in /usr/local/lib/python3.11/dist-packages (from pydantic>=2.10.3->mistralai) (2.33.1)
Requirement already satisfied: typing-extensions>=4.12.2 in /usr/local/lib/python3.11/dist-packages (from pydantic>=2.10.3->mistralai) (4.13.1)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.11/dist-packages (from python-dateutil>=2.8.2->mistralai) (1.17.0)
Requirement already satisfied: sniffio>=1.1 in /usr/local/lib/python3.11/dist-packages (from anyio->httpx>=0.28.1->mistralai) (1.3.1)
Downloading mistralai-1.6.0-py3-none-any.whl (288 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 288.7/288.7 kB 7.0 MB/s eta 0:00:00
Downloading eval_type_backport-0.2.2-py3-none-any.whl (5.8 kB)
Installing collected packages: eval-type-backport, mistralai
Successfully installed eval-type-backport-0.2.2 mistralai-1.6.0

Imports

[2]

Setup API Keys

[3]

Initialize Mistral API Client

[4]
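
The key and client cells are not reproduced above. A minimal sketch of one way to handle the key, assuming it is stored in an environment variable named MISTRAL_API_KEY (the variable name and the load_api_key helper are illustrative, not taken from the notebook):

```python
import os


def load_api_key(env=os.environ, name="MISTRAL_API_KEY"):
    """Return the API key stored under `name`, or raise a clear error."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {name} environment variable before running the workflow."
        )
    return key


# With the mistralai 1.x package installed, the key would then be passed
# to the client (left commented so the sketch runs without the package):
#   from mistralai import Mistral
#   client = Mistral(api_key=load_api_key())
```

Reading the key from the environment keeps it out of the notebook itself, which matters if the notebook is shared or committed.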

Download Data

Here, we download the data needed for the demonstration:

Job description
Candidate resumes

Helper functions to download the job description and candidate resumes

[17]
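
The helper cell is not reproduced above. As a rough sketch of what such a helper could look like using only the standard library (download_file and its fetch parameter are hypothetical names; the notebook's actual source URLs are not shown here, so none are assumed):

```python
import urllib.request
from pathlib import Path


def download_file(url, dest_dir, filename, fetch=urllib.request.urlopen):
    """Download `url` into dest_dir/filename and return the saved path.

    `fetch` is injectable so the helper can be exercised without network access.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / filename
    with fetch(url) as response:
        target.write_bytes(response.read())
    print(f"Downloaded {filename}")
    return target
```

Calling it once for the job description and once per resume would produce "Downloaded ..." lines like the ones in the cell outputs below.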

Download Job Description

[18]

Downloaded job_description.pdf

Download Candidate Resumes

[19]

13 files available for download:
Downloaded Resume 10_ Carlos Mendez.pdf
Downloaded Resume 11_ Alex Patel.pdf
Downloaded Resume 12_ Taylor Williams.pdf
Downloaded Resume 13_ Jordan Smith.pdf
Downloaded Resume 1_ Sarah Chen.pdf
Downloaded Resume 2_ Michael Rodriguez.pdf
Downloaded Resume 3_ Jennifer Park.pdf
Downloaded Resume 4_ David Wilson.pdf
Downloaded Resume 5_ Priya Sharma.pdf
Downloaded Resume 6_ James Lee.pdf
Downloaded Resume 7_ Emily Johnson.pdf
Downloaded Resume 8_ Robert Thompson.pdf
Downloaded Resume 9_ Lisa Wang.pdf

Define Pydantic Models

Pydantic models provide structured data validation between agents, ensuring consistent formats for candidate profiles, job requirements, and evaluation scores while integrating with the Mistral LLM's structured-output parsing. The workflow uses the following Pydantic models:

Skill: Represents a candidate's technical or soft skill with its proficiency level and years of experience.
Education: Captures educational qualifications including degree, field of study, institution, and performance metrics.
Experience: Tracks professional experience with role details, duration, utilized skills, and key accomplishments.
ContactDetails: Stores candidate contact information including name, email, and optional communication channels.
JobRequirements: Defines position requirements including mandatory and preferred skills, experience level, and educational qualifications.
CandidateProfile: Consolidates a candidate's complete professional profile including contact details, skills, education, and work history.
SkillMatch: Evaluates individual skill alignment between job requirements and candidate capabilities with confidence scores.
CandidateScore: Provides comprehensive scoring across key evaluation areas with total score calculation and identified strengths/gaps.
CandidateResult: Connects file information with extracted candidate data and evaluation scores for final ranking and selection.

Pydantic Models for structured extraction.

[5]
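
The model definitions are not reproduced above. The sketch below illustrates the shape of two of them, with field names taken from the result dump at the end of this notebook. It uses standard-library dataclasses as a dependency-free stand-in for Pydantic's BaseModel, and the consistency check in __post_init__ is an assumption added for illustration, not something the notebook is known to enforce:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Skill:
    name: str
    level: Optional[str] = None    # e.g. "Basic", "Intermediate", "Advanced"
    years: Optional[float] = None  # None when the resume does not state it


@dataclass
class CandidateScore:
    technical_skills_score: int
    experience_score: int
    education_score: int
    additional_score: int
    total_score: int
    key_strengths: List[str] = field(default_factory=list)
    key_gaps: List[str] = field(default_factory=list)
    confidence: Optional[float] = None
    notes: str = ""

    def __post_init__(self):
        # Assumed rule: the total must equal the sum of the four parts.
        expected = (self.technical_skills_score + self.experience_score
                    + self.education_score + self.additional_score)
        if self.total_score != expected:
            raise ValueError(
                f"total_score {self.total_score} != sum of parts {expected}"
            )
        if self.confidence is not None and not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
```

For example, CandidateScore(32, 28, 14, 12, 86, confidence=0.9) is accepted, while a total of 90 with the same parts raises ValueError.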

Base Agent Class

The Agent class serves as the foundation for all specialized agents, providing a standardized interface for processing and communicating between agents in the recruitment workflow.

Each agent implements the common process() method while inheriting identity management and communication capabilities.

[6]
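
The base class cell is not reproduced above. One plausible shape for it, as a sketch: an abstract class with a name, a simple inbox for messages, and an abstract process() method. The inbox, send(), and EchoAgent are illustrative assumptions:

```python
from abc import ABC, abstractmethod


class Agent(ABC):
    """Common interface shared by every specialized agent."""

    def __init__(self, name):
        self.name = name
        self.inbox = []  # messages received from other agents

    def send(self, recipient, message):
        """Deliver `message` to another agent's inbox, tagged with the sender."""
        recipient.inbox.append({"from": self.name, "content": message})

    @abstractmethod
    def process(self, message):
        """Handle one unit of work; each subclass supplies its own logic."""


class EchoAgent(Agent):
    """Trivial subclass used only to show the interface in action."""

    def process(self, message):
        return f"{self.name} received: {message}"
```

Because process() is abstract, instantiating Agent directly raises TypeError, which forces every specialized agent to implement it.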

DocumentAgent: Handles document extraction and OCR

The DocumentAgent handles document processing by extracting structured text from various files using Mistral's OCR capabilities. It transforms complex resume PDFs and job descriptions into text, serving as the initial data gateway for the entire recruitment workflow.

[7]
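
The agent's code is not reproduced above. The sketch below shows only the surrounding logic: read a file, hand the bytes to an OCR function, and join the per-page text. The ocr_fn callable is a hypothetical stand-in for the Mistral OCR call, and its signature (bytes in, list of page strings out) is an assumption:

```python
from pathlib import Path


class DocumentAgent:
    """Turns a document file into one plain-text string."""

    def __init__(self, ocr_fn):
        # Hypothetical callable: takes file bytes, returns a list of page texts.
        self.ocr_fn = ocr_fn

    def process(self, file_path):
        data = Path(file_path).read_bytes()
        pages = self.ocr_fn(data)
        # Drop empty pages and separate the rest with blank lines.
        return "\n\n".join(page.strip() for page in pages if page.strip())
```

Injecting the OCR function keeps the agent testable with a fake that returns canned pages.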

JobAnalysisAgent: Handles job requirement extraction and analysis

The JobAnalysisAgent extracts structured job requirements from plain text job descriptions using Mistral LLM. It transforms unstructured job postings into organized data models capturing required skills, experience levels, and educational qualifications needed for candidate matching.

[8]
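
The agent's code is not reproduced above. As a sketch of the last step it describes, the function below turns a JSON string (such as a model's structured output) into a typed requirements object. The JobRequirements field names here are assumptions chosen to mirror the description; the notebook's actual model may differ:

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class JobRequirements:
    # Field names are illustrative, not taken from the notebook.
    required_skills: List[str]
    preferred_skills: List[str] = field(default_factory=list)
    min_years_experience: float = 0.0
    education: str = ""


def parse_job_requirements(raw_json):
    """Validate and convert a JSON string into a JobRequirements object."""
    data = json.loads(raw_json)
    if "required_skills" not in data:
        raise ValueError("response is missing 'required_skills'")
    return JobRequirements(
        required_skills=list(data["required_skills"]),
        preferred_skills=list(data.get("preferred_skills", [])),
        min_years_experience=float(data.get("min_years_experience", 0)),
        education=str(data.get("education", "")),
    )
```

Applied to the Senior Data Scientist example from the introduction, the parsed object might hold Python among the required skills, 3 years of minimum experience, and a Master's degree in a quantitative field.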

ResumeAnalysisAgent: Handles resume parsing and profile extraction

The ResumeAnalysisAgent transforms raw resume text into structured candidate profiles using Mistral LLM's parsing capabilities. It extracts and organizes key information including contact details, skills, education history, and professional experience into standardized data structures for consistent evaluation.

[9]

MatchingAgent: Evaluates candidate fit against job requirements

The MatchingAgent evaluates candidate profiles against job requirements to generate comprehensive scoring across technical skills, experience, education and additional qualifications. It employs Mistral LLM to assess the quality and relevance of candidate attributes beyond simple keyword matching, producing a detailed evaluation with confidence metrics and identified strengths and gaps.

[10]
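
The agent's code is not reproduced above. The sketch below covers only the arithmetic around the model's judgment: clamp each category score to its maximum and add them up. The 40 and 30 point caps come from the example scores quoted in the introduction ("32/40", "25/30"); the 15/15 split of the remaining points is an assumption, as are the names MAX_POINTS and combine_scores:

```python
# Category caps: 40 and 30 are from the introduction's examples;
# the 15/15 split for the last two categories is assumed.
MAX_POINTS = {
    "technical_skills": 40,
    "experience": 30,
    "education": 15,
    "additional": 15,
}


def combine_scores(raw):
    """Clamp each category to [0, cap] and return the parts plus their total."""
    clamped = {
        category: max(0, min(int(raw.get(category, 0)), cap))
        for category, cap in MAX_POINTS.items()
    }
    clamped["total"] = sum(clamped.values())
    return clamped
```

With category scores of 32, 28, 14, and 12, the total is 86, which matches the top-ranked candidate in the run below. Clamping guards against a model response that exceeds a category's maximum.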

EmailCommunicationAgent: Handles email generation and sending

The EmailCommunicationAgent generates personalized email communications to candidates and sends them through SMTP integration. It crafts contextually relevant messages based on candidate qualifications and scheduling information, managing the critical final step of candidate engagement in the recruitment workflow.

[17]
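
The agent's code is not reproduced above. A minimal sketch of the two steps it describes, using only the standard library: build the message, then send it over Gmail's SMTP endpoint with an app password. The function names, subject line, and body wording are illustrative; in the notebook the body is generated by the LLM:

```python
import smtplib
from email.message import EmailMessage


def build_invitation(sender, recipient, candidate_name, job_title, scheduling_link):
    """Create a plain-text interview invitation."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Interview invitation: {job_title}"
    msg.set_content(
        f"Hi {candidate_name},\n\n"
        f"Thank you for applying for the {job_title} role. "
        "We would like to invite you to an interview.\n"
        f"Please pick a time here: {scheduling_link}\n"
    )
    return msg


def send_via_gmail(msg, sender, app_password):
    """Send `msg` through Gmail over implicit TLS (port 465)."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(sender, app_password)
        server.send_message(msg)
```

Keeping message construction separate from sending makes it possible to preview or log invitations before any email leaves the system.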

CoordinatorAgent: Manages the workflow and coordinates between agents

The CoordinatorAgent orchestrates the recruitment workflow by managing communication and data flow between the specialized agents. It initializes the process with the job description, distributes resumes to the other agents, collects the evaluation results, applies threshold-based filtering, and triggers candidate communications.

[31]
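
The coordinator's code is not reproduced above. The sketch below shows the control flow the paragraph describes, with each agent reduced to a plain callable so the example stays self-contained. All parameter names and the shape of the result dictionaries are assumptions:

```python
class CoordinatorAgent:
    """Runs the pipeline: job analysis, per-resume evaluation, filtering, outreach."""

    def __init__(self, extract_text, analyze_job, analyze_resume, match, notify):
        # Each argument is a callable standing in for one specialized agent.
        self.extract_text = extract_text
        self.analyze_job = analyze_job
        self.analyze_resume = analyze_resume
        self.match = match
        self.notify = notify

    def run(self, job_path, resume_paths, threshold=65):
        requirements = self.analyze_job(self.extract_text(job_path))
        results = []
        for path in resume_paths:
            profile = self.analyze_resume(self.extract_text(path))
            total = self.match(requirements, profile)
            results.append(
                {"file_name": path, "profile": profile, "total_score": total}
            )
        # Highest score first, then contact everyone at or above the threshold.
        results.sort(key=lambda r: r["total_score"], reverse=True)
        invited = [r for r in results if r["total_score"] >= threshold]
        for result in invited:
            self.notify(result)
        return results, invited
```

The job description is analyzed once and reused for every resume, so the cost of the run grows with the number of resumes rather than with resumes times job analyses.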

Run the workflow

To run the Multi Agent Workflow For Recruitment, you simply need to:

Configure file paths for the job description, resume directory, and output results
Set up email credentials and Calendly scheduling link
Initialize the CoordinatorAgent with your Mistral client
Configure the EmailCommunicationAgent with sender credentials
Execute the workflow with your desired threshold score

Define paths

[32]

Gmail App Password Setup

To use the email functionality in the Multi Agent Workflow For Recruitment with Gmail, you'll need to create an app password:

Enable 2-Step Verification on your Google Account:
- Go to your Google Account → Security
- Under "Signing in to Google," select 2-Step Verification → Get started
Generate an App Password:
- Go to your Google Account → Security
- Under "Signing in to Google," select App passwords
- Select "Mail" as the app and "Other" as the device (name it "Recruitment Workflow")
- Click "Generate"
- Google will display a 16-character password (four groups of four characters)

Use this app password in your workflow configuration:

sender_email = "your.email@gmail.com"
app_password = "abcd efgh ijkl mnop"  # Your generated app password

This app password bypasses 2FA and allows the workflow to send emails through your Gmail account securely without storing your actual Google password in the code.

[33]

Initialize coordinator agent

[34]

Set up communication agent with email credentials

[35]

Execute hiring workflow

Note: For simplicity, this run considers only 5 of the 13 downloaded candidate resumes.

[43]

🤖 DocumentAgent extracting text from job description...
🤖 JobAnalysisAgent analyzing job description...

🤖 DocumentAgent processing resume: Resume 5_ Priya Sharma.pdf
🤖 ResumeAnalysisAgent extracting candidate profile...
🤖 MatchingAgent evaluating candidate Priya Sharma...

🤖 DocumentAgent processing resume: Resume 3_ Jennifer Park.pdf
🤖 ResumeAnalysisAgent extracting candidate profile...
🤖 MatchingAgent evaluating candidate Jennifer Park...

🤖 DocumentAgent processing resume: Resume 6_ James Lee.pdf
🤖 ResumeAnalysisAgent extracting candidate profile...
🤖 MatchingAgent evaluating candidate James Lee...

🤖 DocumentAgent processing resume: Resume 7_ Emily Johnson.pdf
🤖 ResumeAnalysisAgent extracting candidate profile...
🤖 MatchingAgent evaluating candidate Emily Johnson...

🤖 DocumentAgent processing resume: Resume 2_ Michael Rodriguez.pdf
🤖 ResumeAnalysisAgent extracting candidate profile...
🤖 MatchingAgent evaluating candidate Michael Rodriguez...

🤖 CoordinatorAgent saved results to candidate_results.json

===== CANDIDATE RANKING =====
1. Michael Rodriguez: 86/100
2. Priya Sharma: 85/100
3. Jennifer Park: 84/100
4. Emily Johnson: 67/100
5. James Lee: 54/100

🤖 EmailCommunicationAgent preparing to send interview invitations to 4 candidates who scored 65+ out of 100...
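
The ranking and the count of invited candidates in this output follow from a sort and a threshold filter. A small sketch that reproduces them from the five total scores shown above (rank_candidates is an illustrative name, and treating "65+" as "greater than or equal to 65" is an assumption):

```python
def rank_candidates(scores, threshold):
    """Return (name, total) pairs sorted high to low, plus names at/above threshold."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    invited = [name for name, total in ranked if total >= threshold]
    return ranked, invited


scores = {
    "Priya Sharma": 85,
    "Jennifer Park": 84,
    "James Lee": 54,
    "Emily Johnson": 67,
    "Michael Rodriguez": 86,
}
ranked, invited = rank_candidates(scores, threshold=65)
for position, (name, total) in enumerate(ranked, start=1):
    print(f"{position}. {name}: {total}/100")
print(f"{len(invited)} candidates scored 65+ out of 100")
```

With a threshold of 65, four of the five candidates qualify; only James Lee (54) falls below it.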

You can inspect the extracted results for each candidate.
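
As a sketch of how the saved results could be summarized, assuming candidate_results.json holds a list of records with the same keys as the output printed below (contact_details, score, and so on); summarize_results is an illustrative helper, not part of the notebook:

```python
import json


def summarize_results(path):
    """Return (candidate name, total score) pairs from a saved results file."""
    with open(path, encoding="utf-8") as fh:
        results = json.load(fh)
    return [
        (record["contact_details"]["name"], record["score"]["total_score"])
        for record in results
    ]
```

Calling summarize_results("candidate_results.json") after a run gives a compact view before drilling into the full per-candidate records.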

[42]

[{'file_name': 'Resume 2_ Michael Rodriguez.pdf',
  'contact_details': {'name': 'Michael Rodriguez',
   'email': 'michael.rodriguez@email.com',
   'phone': '(510) 555-7321',
   'location': 'Oakland, CA',
   'linkedin': 'linkedin.com/in/michaelrodriguez',
   'website': None},
  'candidate_profile': {'contact_details': {'name': 'Michael Rodriguez',
    'email': 'michael.rodriguez@email.com',
    'phone': '(510) 555-7321',
    'location': 'Oakland, CA',
    'linkedin': 'linkedin.com/in/michaelrodriguez',
    'website': None},
   'skills': [{'name': 'Python', 'level': 'Advanced', 'years': 4},
    {'name': 'SQL', 'level': 'Advanced', 'years': 4},
    {'name': 'R', 'level': 'Intermediate', 'years': 4},
    {'name': 'NumPy', 'level': 'Intermediate', 'years': 4},
    {'name': 'Pandas', 'level': 'Intermediate', 'years': 4},
    {'name': 'scikit-learn', 'level': 'Intermediate', 'years': 4},
    {'name': 'XGBoost', 'level': 'Intermediate', 'years': 4},
    {'name': 'TensorFlow', 'level': 'Intermediate', 'years': 4},
    {'name': 'AWS', 'level': 'Intermediate', 'years': 4},
    {'name': 'Spark', 'level': 'Intermediate', 'years': 4},
    {'name': 'Tableau', 'level': 'Intermediate', 'years': 4},
    {'name': 'Matplotlib', 'level': 'Intermediate', 'years': 4},
    {'name': 'Seaborn', 'level': 'Intermediate', 'years': 4},
    {'name': 'PostgreSQL', 'level': 'Intermediate', 'years': 4},
    {'name': 'MySQL', 'level': 'Intermediate', 'years': 4},
    {'name': 'Redshift', 'level': 'Intermediate', 'years': 4},
    {'name': 'Git', 'level': 'Intermediate', 'years': 4},
    {'name': 'Jupyter', 'level': 'Intermediate', 'years': 4},
    {'name': 'Docker', 'level': 'Intermediate', 'years': 4}],
   'education': [{'degree': 'MS',
     'field': 'Statistics',
     'institution': 'University of California, Berkeley',
     'year_completed': 2018,
     'gpa': 3.8},
    {'degree': 'BS',
     'field': 'Mathematics',
     'institution': 'University of California, Los Angeles',
     'year_completed': 2016,
     'gpa': 3.7}],
   'experience': [{'title': 'Data Scientist',
     'company': 'Fintech Solutions Inc.',
     'duration_years': 4,
     'skills_used': ['Python',
      'SQL',
      'Tableau',
      'Machine Learning',
      'Data Visualization'],
     'achievements': ['Built and deployed machine learning models for fraud detection, reducing false positives by 30% while maintaining 99% fraud capture rate',
      'Designed and implemented a customer segmentation model using clustering algorithms, leading to a 15% increase in marketing campaign conversion rates',
      'Developed a churn prediction model with 85% accuracy, enabling proactive retention strategies',
      'Created interactive dashboards for executive reporting using Tableau',
      'Mentored junior data scientists and analytics interns',
      'Collaborated with engineering team to optimize model deployment processes'],
     'relevance_score': 10},
    {'title': 'Data Analyst',
     'company': 'Retail Analytics Group',
     'duration_years': 2,
     'skills_used': ['Python',
      'SQL',
      'A/B Testing',
      'Predictive Modeling',
      'ETL Pipelines'],
     'achievements': ['Conducted A/B testing for website optimizations, resulting in a 12% increase in conversion rate',
      'Built predictive models for inventory management using time series forecasting',
      'Created ETL pipelines for data cleaning and preprocessing using Python and SQL',
      'Implemented automated reporting solutions, saving 15+ hours of manual work weekly',
      'Collaborated with marketing teams to develop customer lifetime value models'],
     'relevance_score': 8}]},
  'score': {'technical_skills_score': 32,
   'experience_score': 28,
   'education_score': 14,
   'additional_score': 12,
   'total_score': 86,
   'key_strengths': ['Proven experience in building and deploying machine learning models',
    'Strong proficiency in Python and SQL',
    'Experience with data visualization tools like Tableau',
    'Relevant experience in the finance domain',
    'Strong educational background in Statistics and Mathematics'],
   'key_gaps': ['Intermediate level in required skills like NumPy, Pandas, and scikit-learn instead of advanced',
    'Lack of experience with preferred skills like PyTorch, Azure, GCP, and Hadoop',
    'No mention of experience with statistical modeling techniques or specific machine learning algorithms'],
   'confidence': 0.9,
   'notes': 'Michael Rodriguez demonstrates strong technical skills and relevant experience, particularly in the finance domain. His educational background is robust, and he has shown significant achievements in his roles. However, there are some gaps in the required advanced-level skills and preferred skills that could be beneficial for the role. Overall, he appears to be a strong candidate with high potential.'}},
 {'file_name': 'Resume 5_ Priya Sharma.pdf',
  'contact_details': {'name': 'Priya Sharma',
   'email': 'psharma@email.com',
   'phone': '+16505552910',
   'location': 'Palo Alto, CA',
   'linkedin': 'linkedin.com/in/priyasharma',
   'website': None},
  'candidate_profile': {'contact_details': {'name': 'Priya Sharma',
    'email': 'psharma@email.com',
    'phone': '+16505552910',
    'location': 'Palo Alto, CA',
    'linkedin': 'linkedin.com/in/priyasharma',
    'website': None},
   'skills': [{'name': 'R', 'level': 'Advanced', 'years': None},
    {'name': 'Python', 'level': 'Advanced', 'years': None},
    {'name': 'SQL', 'level': 'Intermediate', 'years': None},
    {'name': 'SAS', 'level': 'Advanced', 'years': None},
    {'name': 'Pandas', 'level': None, 'years': None},
    {'name': 'NumPy', 'level': None, 'years': None},
    {'name': 'scikit-learn', 'level': None, 'years': None},
    {'name': 'TensorFlow', 'level': None, 'years': None},
    {'name': 'tidyverse', 'level': None, 'years': None},
    {'name': 'caret', 'level': None, 'years': None},
    {'name': 'Regression', 'level': None, 'years': None},
    {'name': 'Time Series Analysis', 'level': None, 'years': None},
    {'name': 'Bayesian Methods', 'level': None, 'years': None},
    {'name': 'Survival Analysis', 'level': None, 'years': None},
    {'name': 'Causal Inference', 'level': None, 'years': None},
    {'name': 'Random Forests', 'level': None, 'years': None},
    {'name': 'Gradient Boosting', 'level': None, 'years': None},
    {'name': 'Neural Networks', 'level': None, 'years': None},
    {'name': 'Clustering', 'level': None, 'years': None},
    {'name': 'ggplot2', 'level': None, 'years': None},
    {'name': 'Matplotlib', 'level': None, 'years': None},
    {'name': 'Seaborn', 'level': None, 'years': None},
    {'name': 'Shiny', 'level': None, 'years': None},
    {'name': 'Git', 'level': None, 'years': None},
    {'name': 'Docker', 'level': None, 'years': None},
    {'name': 'RStudio', 'level': None, 'years': None},
    {'name': 'Jupyter', 'level': None, 'years': None}],
   'education': [{'degree': 'PhD',
     'field': 'Biostatistics',
     'institution': 'Harvard University',
     'year_completed': 2017,
     'gpa': None},
    {'degree': 'MS',
     'field': 'Statistics',
     'institution': 'Stanford University',
     'year_completed': 2013,
     'gpa': 3.95},
    {'degree': 'BS',
     'field': 'Mathematics',
     'institution': 'University of California, Los Angeles',
     'year_completed': 2011,
     'gpa': 3.9}],
   'experience': [{'title': 'Senior Biostatistician',
     'company': 'GenomeTech Research',
     'duration_years': 3,
     'skills_used': ['R',
      'Python',
      'SQL',
      'SAS',
      'Pandas',
      'NumPy',
      'scikit-learn',
      'TensorFlow',
      'tidyverse',
      'caret',
      'Regression',
      'Time Series Analysis',
      'Bayesian Methods',
      'Survival Analysis',
      'Causal Inference',
      'Random Forests',
      'Gradient Boosting',
      'Neural Networks',
      'Clustering',
      'ggplot2',
      'Matplotlib',
      'Seaborn',
      'Shiny',
      'Git',
      'Docker',
      'RStudio',
      'Jupyter'],
     'achievements': ['Lead statistical analysis for clinical trials, genomic research, and drug discovery projects',
      'Develop machine learning models to predict patient responses to experimental treatments, improving trial success rates by 25%',
      'Create and maintain R packages for internal analysis workflows',
      'Design statistical frameworks for complex clinical study designs',
      'Collaborate with cross-functional teams of biologists, clinicians, and data engineers',
      'Mentor junior statisticians and data analysts'],
     'relevance_score': None},
    {'title': 'Research Scientist',
     'company': 'Stanford Medical Center',
     'duration_years': 3,
     'skills_used': ['R',
      'Python',
      'SQL',
      'SAS',
      'Pandas',
      'NumPy',
      'scikit-learn',
      'TensorFlow',
      'tidyverse',
      'caret',
      'Regression',
      'Time Series Analysis',
      'Bayesian Methods',
      'Survival Analysis',
      'Causal Inference',
      'Random Forests',
      'Gradient Boosting',
      'Neural Networks',
      'Clustering',
      'ggplot2',
      'Matplotlib',
      'Seaborn',
      'Shiny',
      'Git',
      'Docker',
      'RStudio',
      'Jupyter'],
     'achievements': ['Developed predictive models for patient outcomes using electronic health record data',
      'Applied natural language processing to extract insights from clinical notes',
      'Created interactive dashboards for visualizing clinical trial results',
      'Collaborated on research leading to 8 peer-reviewed publications',
      'Designed and taught workshops on statistical methods for medical researchers'],
     'relevance_score': None}]},
  'score': {'technical_skills_score': 32,
   'experience_score': 28,
   'education_score': 15,
   'additional_score': 10,
   'total_score': 85,
   'key_strengths': ['Advanced proficiency in Python and R, which are crucial for data analysis and machine learning',
    'Extensive experience in statistical modeling and machine learning algorithms',
    'Strong background in healthcare and biostatistics, aligning well with the preferred domain',
    'Proven ability to lead complex projects and mentor junior team members',
    'Publications and workshops indicate a strong commitment to research and knowledge sharing'],
   'key_gaps': ['Intermediate level in SQL, which is required at an advanced level',
    'Lack of explicit experience with scikit-learn, though related skills are present',
    'No mention of experience with preferred tools like TensorFlow, PyTorch, AWS, Azure, GCP, Spark, Hadoop, Tableau, or PowerBI',
    'Limited experience in finance domain, though transferable skills are present'],
   'confidence': 0.9,
   'notes': 'Priya Sharma demonstrates a strong technical background and relevant experience in healthcare, making her a strong candidate despite some gaps in required skills and domain experience. Her advanced degrees and publications further strengthen her profile.'}},
 {'file_name': 'Resume 3_ Jennifer Park.pdf',
  'contact_details': {'name': 'Jennifer Park',
   'email': 'jpark@email.com',
   'phone': '+14155553842',
   'location': 'San Francisco, CA',
   'linkedin': 'linkedin.com/in/jenniferpark',
   'website': None},
  'candidate_profile': {'contact_details': {'name': 'Jennifer Park',
    'email': 'jpark@email.com',
    'phone': '+14155553842',
    'location': 'San Francisco, CA',
    'linkedin': 'linkedin.com/in/jenniferpark',
    'website': None},
   'skills': [{'name': 'Python', 'level': 'Advanced', 'years': 3},
    {'name': 'SQL', 'level': 'Advanced', 'years': 3},
    {'name': 'R', 'level': 'Basic', 'years': 3},
    {'name': 'Pandas', 'level': 'Advanced', 'years': 3},
    {'name': 'NumPy', 'level': 'Advanced', 'years': 3},
    {'name': 'scikit-learn', 'level': 'Advanced', 'years': 3},
    {'name': 'Matplotlib', 'level': 'Advanced', 'years': 3},
    {'name': 'Spark', 'level': 'Basic', 'years': 1},
    {'name': 'Tableau', 'level': 'Advanced', 'years': 3},
    {'name': 'Power BI', 'level': 'Advanced', 'years': 3},
    {'name': 'Seaborn', 'level': 'Advanced', 'years': 3},
    {'name': 'PostgreSQL', 'level': 'Advanced', 'years': 3},
    {'name': 'MySQL', 'level': 'Advanced', 'years': 3},
    {'name': 'Git', 'level': 'Advanced', 'years': 3},
    {'name': 'Jupyter Notebooks', 'level': 'Advanced', 'years': 3},
    {'name': 'VS Code', 'level': 'Advanced', 'years': 3}],
   'education': [{'degree': 'MS in Analytics',
     'field': 'Analytics',
     'institution': 'University of San Francisco',
     'year_completed': 2019,
     'gpa': 3.75},
    {'degree': 'BS in Economics',
     'field': 'Economics',
     'institution': 'University of California, Davis',
     'year_completed': 2017,
     'gpa': 3.6}],
   'experience': [{'title': 'Senior Data Analyst',
     'company': 'ShopSmart Retail',
     'duration_years': 2,
     'skills_used': ['Python',
      'SQL',
      'Pandas',
      'NumPy',
      'scikit-learn',
      'Tableau',
      'Power BI',
      'PostgreSQL',
      'MySQL',
      'Git',
      'Jupyter Notebooks',
      'VS Code'],
     'achievements': ['Developed and implemented clustering algorithms for customer segmentation, resulting in a 20% increase in email campaign engagement',
      'Built a product recommendation engine using collaborative filtering techniques',
      'Created sales forecasting models with 85% accuracy using time series analysis',
      'Designed interactive dashboards for executives to monitor KPIs',
      'Collaborated with marketing team to develop and analyze A/B tests',
      'Automated routine reporting processes using Python scripts, saving 10+ hours weekly'],
     'relevance_score': 9},
    {'title': 'Data Analyst',
     'company': 'MarketEdge Consulting',
     'duration_years': 2,
     'skills_used': ['Python',
      'SQL',
      'Pandas',
      'NumPy',
      'scikit-learn',
      'Tableau',
      'Power BI',
      'PostgreSQL',
      'MySQL',
      'Git',
      'Jupyter Notebooks',
      'VS Code'],
     'achievements': ['Conducted exploratory data analysis for clients across retail and e-commerce industries',
      'Created predictive models for customer behavior using logistic regression and decision trees',
      'Built ETL pipelines for data preprocessing and cleaning',
      'Developed business intelligence dashboards using Tableau',
      'Presented insights and recommendations to client stakeholders'],
     'relevance_score': 8}]},
  'score': {'technical_skills_score': 34,
   'experience_score': 26,
   'education_score': 12,
   'additional_score': 12,
   'total_score': 84,
   'key_strengths': ['Advanced proficiency in Python, SQL, Pandas, NumPy, and scikit-learn',
    'Strong experience in data analysis and machine learning',
    'Proven track record of delivering impactful projects',
    'Excellent communication and presentation skills',
    'Advanced knowledge of data visualization tools like Tableau and Power BI'],
   'key_gaps': ['Lacks advanced knowledge in TensorFlow and PyTorch',
    'Limited experience with cloud platforms like AWS, Azure, and GCP',
    'No experience with Hadoop',
    "Master's degree is in Analytics, not specifically in Computer Science, Statistics, or Mathematics",
    'Less than 3 years of experience in the required domains (Healthcare, Finance)'],
   'confidence': 0.9,
   'notes': 'Jennifer Park demonstrates strong technical skills and relevant experience in data analysis and machine learning. Her educational background is solid, and she has a proven track record of delivering impactful projects. However, she lacks some preferred skills and domain experience. Overall, she is a strong candidate with a few areas for potential improvement.'}},
 {'file_name': 'Resume 7_ Emily Johnson.pdf',
  'contact_details': {'name': 'Emily Johnson',
   'email': 'e.johnson@email.com',
   'phone': '+16285554231',
   'location': 'San Francisco, CA',
   'linkedin': 'linkedin.com/in/emilyjohnson',
   'website': None},
  'candidate_profile': {'contact_details': {'name': 'Emily Johnson',
    'email': 'e.johnson@email.com',
    'phone': '+16285554231',
    'location': 'San Francisco, CA',
    'linkedin': 'linkedin.com/in/emilyjohnson',
    'website': None},
   'skills': [{'name': 'Python', 'level': 'Advanced', 'years': None},
    {'name': 'R', 'level': 'Intermediate', 'years': None},
    {'name': 'SQL', 'level': 'Intermediate', 'years': None},
    {'name': 'NumPy', 'level': None, 'years': None},
    {'name': 'Pandas', 'level': None, 'years': None},
    {'name': 'scikit-learn', 'level': None, 'years': None},
    {'name': 'TensorFlow', 'level': 'Basic', 'years': None},
    {'name': 'Keras', 'level': 'Basic', 'years': None},
    {'name': 'Regression', 'level': None, 'years': None},
    {'name': 'Classification', 'level': None, 'years': None},
    {'name': 'Clustering', 'level': None, 'years': None},
    {'name': 'Hypothesis Testing', 'level': None, 'years': None},
    {'name': 'Matplotlib', 'level': None, 'years': None},
    {'name': 'Seaborn', 'level': None, 'years': None},
    {'name': 'Plotly', 'level': None, 'years': None},
    {'name': 'Tableau', 'level': None, 'years': None},
    {'name': 'PostgreSQL', 'level': None, 'years': None},
    {'name': 'MySQL', 'level': None, 'years': None},
    {'name': 'Git', 'level': None, 'years': None},
    {'name': 'Jupyter Notebooks', 'level': None, 'years': None},
    {'name': 'Google Colab', 'level': None, 'years': None}],
   'education': [{'degree': 'MS',
     'field': 'Data Science',
     'institution': 'University of California, Berkeley',
     'year_completed': 2023,
     'gpa': 3.85},
    {'degree': 'BS',
     'field': 'Statistics',
     'institution': 'University of California, Davis',
     'year_completed': 2022,
     'gpa': 3.7}],
   'experience': [{'title': 'Data Science Intern',
     'company': 'HealthTech Solutions',
     'duration_years': 0.25,
     'skills_used': ['Machine Learning',
      'Exploratory Data Analysis',
      'Data Visualization',
      'Agile Development'],
     'achievements': ['Developed a machine learning model to predict patient no-shows, achieving 78% accuracy',
      'Performed exploratory data analysis on patient demographic and appointment data',
      'Created data visualizations to communicate findings to non-technical stakeholders',
      'Collaborated with product team to implement model insights into the scheduling system',
      'Participated in agile development processes and weekly sprint reviews'],
     'relevance_score': 9},
    {'title': 'Research Assistant',
     'company': 'University of California, Berkeley - Data Science Department',
     'duration_years': 0.75,
     'skills_used': ['Natural Language Processing',
      'Text Classification',
      'Data Preprocessing',
      'Research Assistance',
      'Teaching Assistance'],
     'achievements': ['Assisted professor with research on natural language processing applications in healthcare',
      'Implemented and evaluated various text classification algorithms',
      'Preprocessed and cleaned large textual datasets from electronic health records',
      'Co-authored a research paper submitted to a data science conference',
      'Provided support for undergraduate data science courses as a teaching assistant'],
     'relevance_score': 8},
    {'title': 'Marketing Analyst Intern',
     'company': 'Digital Marketing Agency',
     'duration_years': 0.25,
     'skills_used': ['Data Analysis',
      'Excel',
      'Python',
      'Reporting',
      'A/B Testing',
      'Customer Segmentation'],
     'achievements': ['Analyzed digital marketing campaign performance data using Excel and basic Python',
      'Created reports and dashboards to visualize key performance metrics',
      'Assisted in developing A/B testing strategies for email marketing campaigns',
      'Performed customer segmentation analysis for targeted marketing efforts'],
     'relevance_score': 6}]},
  'score': {'technical_skills_score': 25,
   'experience_score': 18,
   'education_score': 14,
   'additional_score': 10,
   'total_score': 67,
   'key_strengths': ['Strong educational background in Data Science and Statistics',
    'Relevant experience in healthcare data science',
    'Proven ability to develop and implement machine learning models',
    'Experience with data visualization and communication to non-technical stakeholders',
    'Familiarity with agile development processes'],
   'key_gaps': ['Lacks advanced proficiency in required skills like NumPy, Pandas, scikit-learn, and SQL',
    'Limited experience with preferred skills such as TensorFlow, PyTorch, and cloud platforms',
    'Less than 3 years of total relevant work experience',
    'No mention of statistical modeling techniques or problem-solving skills in the provided experience'],
   'confidence': 0.8,
   'notes': 'Emily Johnson shows strong potential with her educational background and relevant internships, particularly in the healthcare domain. However, she lacks the required years of experience and advanced proficiency in several key technical skills. Her additional qualifications, such as familiarity with agile development and data visualization, are valuable but not sufficient to compensate for the gaps in required skills and experience.'}},
 {'file_name': 'Resume 6_ James Lee.pdf',
  'contact_details': {'name': 'James Lee',
   'email': 'james.lee@email.com',
   'phone': '+14085556723',
   'location': 'San Jose, CA',
   'linkedin': 'https://linkedin.com/in/jameslee',
   'website': None},
  'candidate_profile': {'contact_details': {'name': 'James Lee',
    'email': 'james.lee@email.com',
    'phone': '+14085556723',
    'location': 'San Jose, CA',
    'linkedin': 'https://linkedin.com/in/jameslee',
    'website': None},
   'skills': [{'name': 'Java', 'level': 'Advanced', 'years': 5},
    {'name': 'JavaScript', 'level': 'Advanced', 'years': 5},
    {'name': 'Python', 'level': 'Basic', 'years': 1},
    {'name': 'React', 'level': 'Advanced', 'years': 4},
    {'name': 'Node.js', 'level': 'Advanced', 'years': 3},
    {'name': 'HTML/CSS', 'level': 'Advanced', 'years': 5},
    {'name': 'REST APIs', 'level': 'Advanced', 'years': 3},
    {'name': 'MongoDB', 'level': 'Advanced', 'years': 3},
    {'name': 'MySQL', 'level': 'Intermediate', 'years': 2},
    {'name': 'PostgreSQL', 'level': 'Intermediate', 'years': 2},
    {'name': 'Docker', 'level': 'Intermediate', 'years': 2},
    {'name': 'Jenkins', 'level': 'Intermediate', 'years': 2},
    {'name': 'AWS', 'level': 'Intermediate', 'years': 2},
    {'name': 'SQL', 'level': 'Basic', 'years': 1},
    {'name': 'Pandas', 'level': 'Beginner', 'years': 1},
    {'name': 'Git', 'level': 'Advanced', 'years': 5},
    {'name': 'JIRA', 'level': 'Advanced', 'years': 3},
    {'name': 'Visual Studio Code', 'level': 'Advanced', 'years': 5}],
   'education': [{'degree': 'Bachelor of Science',
     'field': 'Computer Science',
     'institution': 'San Jose State University',
     'year_completed': 2018,
     'gpa': 3.6}],
   'experience': [{'title': 'Senior Software Engineer',
     'company': 'TechSolutions Inc.',
     'duration_years': 3,
     'skills_used': ['React',
      'Node.js',
      'MongoDB',
      'REST APIs',
      'Database Optimization',
      'Team Leadership',
      'Machine Learning'],
     'achievements': ['Developed and maintained full-stack web applications',
      'Implemented RESTful APIs',
      'Optimized database queries and application performance',
      'Collaborated with product managers and UX designers',
      'Led a team of junior developers',
      'Worked with data team on machine learning model APIs'],
     'relevance_score': 9},
    {'title': 'Software Engineer',
     'company': 'WebApps Co.',
     'duration_years': 2,
     'skills_used': ['React',
      'Angular',
      'Java Spring Boot',
      'Jest',
      'JUnit',
      'Agile Development',
      'D3.js'],
     'achievements': ['Developed front-end components',
      'Created backend services',
      'Implemented automated testing',
      'Participated in agile development processes',
      'Built data visualization dashboards'],
     'relevance_score': 7}]},
  'score': {'technical_skills_score': 12,
   'experience_score': 22,
   'education_score': 10,
   'additional_score': 10,
   'total_score': 54,
   'key_strengths': ['Strong software engineering background with experience in full-stack development',
    'Proven ability to lead teams and collaborate with cross-functional teams',
    'Experience with machine learning model APIs',
    'Proficient in using Git and JIRA for version control and project management'],
   'key_gaps': ['Limited experience with Python, NumPy, Pandas, scikit-learn, and SQL at the required advanced level',
    'Lacks experience with TensorFlow, PyTorch, and other preferred machine learning frameworks',
    "Does not have a Master's degree in a relevant field",
    'Limited experience in healthcare or finance domains'],
   'confidence': 0.8,
   'notes': "James Lee has a strong background in software engineering with relevant experience in full-stack development and team leadership. However, he lacks the required advanced skills in Python, NumPy, Pandas, scikit-learn, and SQL, which are crucial for the role. Additionally, he does not have a Master's degree in a relevant field and limited experience in the preferred domains of healthcare or finance. His experience with machine learning model APIs is a notable strength, but overall, he may need significant upskilling to meet the job requirements."}}]
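
The scores in the output above can be sanity-checked and ranked with a few lines of Python. This is a minimal sketch, not part of the original workflow: the variable name `results` and the helpers `component_sum` and `rank_candidates` are hypothetical, and the records are trimmed by hand to the candidate name and score fields shown above. It assumes `total_score` is meant to equal the sum of the four component scores, which holds for both candidates shown here.

```python
# Hypothetical, hand-trimmed copy of the two records shown in the output above.
# Only the candidate name and the numeric score fields are kept.
results = [
    {"name": "Emily Johnson",
     "score": {"technical_skills_score": 25, "experience_score": 18,
               "education_score": 14, "additional_score": 10,
               "total_score": 67}},
    {"name": "James Lee",
     "score": {"technical_skills_score": 12, "experience_score": 22,
               "education_score": 10, "additional_score": 10,
               "total_score": 54}},
]

# Component keys as they appear in the 'score' dict of the output.
COMPONENT_KEYS = (
    "technical_skills_score",
    "experience_score",
    "education_score",
    "additional_score",
)


def component_sum(score):
    """Add up the four component scores of one candidate."""
    return sum(score[key] for key in COMPONENT_KEYS)


def rank_candidates(records):
    """Return the records sorted by total_score, highest first."""
    return sorted(records, key=lambda r: r["score"]["total_score"], reverse=True)


for record in rank_candidates(results):
    score = record["score"]
    # Assumption: total_score should equal the sum of the components.
    consistent = component_sum(score) == score["total_score"]
    print(f'{record["name"]}: {score["total_score"]} '
          f'(components consistent: {consistent})')
```

A check like this is useful because the scores are produced by a language model, so the reported total is not guaranteed to match its components on every run.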
