
Resume Parsing


Resume parsing, also known as CV parsing, resume extraction, or CV extraction, is the automated extraction, storage, and analysis of resume data. A resume is imported into parsing software, which extracts its information so that it can be sorted and searched. In other words, the parser converts unstructured resume data into a structured format.

 

How resume parsing software works 

A resume parser analyzes a resume, extracts the desired information, and inserts it into a database with a unique entry for each candidate. Once the resume has been analyzed, a recruiter can search the database for keywords and phrases and get a list of relevant candidates. Many parsers support semantic search, which adds context to the search terms and tries to understand intent in order to make the results more reliable and comprehensive.
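The analyze, store, and search flow described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function names (`build_db`, `parse_resume`, `store`, `search`) and the table layout are hypothetical, the first non-empty line is assumed to be the candidate's name, and the search is a plain keyword match rather than semantic search.

```python
import re
import sqlite3

# Simplified email pattern, for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def build_db() -> sqlite3.Connection:
    """Create an in-memory database with one row per candidate."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE candidates ("
        "id INTEGER PRIMARY KEY, name TEXT, email TEXT, body TEXT)"
    )
    return conn

def parse_resume(text: str) -> dict:
    """Turn unstructured resume text into a small structured record."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    match = EMAIL_RE.search(text)
    return {
        "name": lines[0] if lines else "",  # assumption: name is on the first line
        "email": match.group(0) if match else None,
        "body": text,
    }

def store(conn: sqlite3.Connection, record: dict) -> int:
    """Insert a parsed record and return its unique candidate id."""
    cur = conn.execute(
        "INSERT INTO candidates (name, email, body) VALUES (?, ?, ?)",
        (record["name"], record["email"], record["body"]),
    )
    conn.commit()
    return cur.lastrowid

def search(conn: sqlite3.Connection, keyword: str) -> list:
    """Return the names of candidates whose resume text contains the keyword."""
    rows = conn.execute(
        "SELECT name FROM candidates WHERE body LIKE ? ORDER BY name",
        (f"%{keyword}%",),
    )
    return [row[0] for row in rows]
```

A recruiter-style query would then be `search(conn, "python")` after storing a few parsed resumes. A production parser would extract far more fields and index them separately instead of matching against the raw text.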

 

Role of machine learning

Machine learning is extremely important for resume parsing. Each block of information needs to be given a label and sorted into the correct category, whether that’s education, work history, or contact information. Rule-based parsers use a predefined set of rules to parse the text. This method does not work well for resumes because the parser needs to “understand” the context in which words occur and the relationships between them. For example, if the word “Harvey” appears on a resume, it could be the name of an applicant, refer to the college Harvey Mudd, or reference the company Harvey & Company LLC. The abbreviation MD could mean “Medical Doctor” or “Maryland”. A rule-based parser would require incredibly complex rules to account for all of this ambiguity and would still provide limited coverage.
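To make the “MD” problem concrete, here is a minimal sketch of what a hand-written rule set might look like. The function `label_md` and its two rules are hypothetical and deliberately small; the point is that every new resume layout needs another rule, and common cases still fall through.

```python
import re

def label_md(line: str) -> str:
    """Guess what 'MD' means on one resume line using hand-written context rules."""
    if not re.search(r"\bMD\b", line):
        return "absent"
    # Rule 1: "City, MD 21201" looks like a US postal address.
    if re.search(r",\s*MD\s+\d{5}\b", line):
        return "state"
    # Rule 2: "MD" on a line that mentions education looks like a degree.
    if re.search(r"\b(PhD|degree|University|School of Medicine)\b", line, re.IGNORECASE):
        return "degree"
    # No rule fired: the parser cannot decide without more context.
    return "unknown"
```

With these rules, `label_md("Baltimore, MD 21201")` returns `"state"`, but a perfectly ordinary line such as `"Jane Doe, MD"` returns `"unknown"`, which is the limited coverage described above.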

 

This leads us to machine learning and, specifically, natural language processing (NLP). NLP is a branch of artificial intelligence that uses machine learning to understand content and context and to make predictions.

 

Many NLP techniques are extremely important in resume parsing. Text normalization and part-of-speech tagging account for the different possible formats of acronyms and normalize them. Lemmatization reduces words to their root form using a language dictionary, while stemming strips suffixes such as “s” and “ing”. Named-entity recognition uses regular expressions, dictionaries, statistical analysis, and complex pattern-based extraction to identify people, places, companies, phone numbers, email addresses, important phrases, and more.
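A rough feel for three of these steps can be had with the Python standard library alone. The sketch below is an illustration, not how any particular parser works: the acronym table, the three-suffix stemmer, and the US-style phone pattern are all simplifying assumptions, and real systems rely on trained models and much larger dictionaries.

```python
import re

# Hypothetical lookup table mapping acronym variants to one canonical form.
ACRONYMS = {"b.s.": "BS", "b.sc.": "BS", "bsc": "BS", "m.d.": "MD"}

def normalize_acronym(token: str) -> str:
    """Map known acronym variants to a canonical spelling; leave other tokens alone."""
    return ACRONYMS.get(token.lower(), token)

def naive_stem(word: str) -> str:
    """Strip a few common suffixes. A real stemmer applies many more rules."""
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words are not mangled.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Regular-expression entity extraction for US-style phone numbers (an assumption).
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")

def find_phones(text: str) -> list:
    """Return every substring that looks like a US-style phone number."""
    return PHONE_RE.findall(text)
```

For example, `naive_stem("testing")` and `naive_stem("tests")` both reduce to `"test"`, so a search for either form can match the other.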

 

Effectiveness of resume parsers

Resume parsers have achieved up to 87% accuracy, where accuracy refers to entering the data correctly and assigning it to the correct category. Human accuracy is typically not greater than 96%, so resume parsers have achieved “near human” accuracy.
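One simple way to put a number on this kind of accuracy is to compare parsed fields against hand-checked reference values. The sketch below is an assumption about how such a score could be computed; the article does not say how the quoted figures were measured, and the function name `field_accuracy` is hypothetical.

```python
def field_accuracy(parsed: list, truth: list) -> float:
    """Fraction of reference fields that the parser reproduced exactly.

    `parsed` and `truth` are parallel lists of dicts, one dict per resume.
    A field counts as correct only if it holds the right value under the
    right key, so both data entry and categorization are covered.
    """
    total = 0
    correct = 0
    for parsed_record, true_record in zip(parsed, truth):
        for field, expected in true_record.items():
            total += 1
            if parsed_record.get(field) == expected:
                correct += 1
    return correct / total if total else 0.0
```

If a parser gets three of four reference fields right across two resumes, the function returns 0.75.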

 

One executive recruiting company tested three resume parsers against humans to compare data-entry accuracy. They ran 1,000 resumes through the parsing software and also had humans manually parse and enter the same data. The company brought in a third party to evaluate how the humans did compared to the software. They found that the results from the resume parsers were more comprehensive and had fewer mistakes. The humans did not enter all the information on the resumes and occasionally misspelled words or wrote incorrect numbers.

 

In an earlier experiment, a resume for an ideal candidate was created based on the job description for a clinical scientist position. After the resume went through the parser, one of the candidate’s work experiences was completely lost because the date was listed before the employer. The parser also didn’t catch several educational degrees. As a result, the candidate received a relevance ranking of only 43%. If this had been a real candidate’s resume, they wouldn’t have moved on to the next step even though they were qualified for the position. It would be helpful if a similar study were conducted on current resume parsers to see whether there have been any improvements over the past few years.

 

Benefits

Resume parsing allows candidates to be ranked based on objective information and can help prevent the bias that so easily shows up in the hiring process. The software can be programmed to ignore and hide factors that contribute to bias, such as name, gender, race, age, address, and more.
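Hiding bias-prone fields before ranking might look something like the following sketch. The field names, the `redact` and `skill_score` helpers, and the skill-overlap score are all hypothetical; a real system would use its own schema and a richer relevance model.

```python
# Fields hidden from reviewers and from the ranking step (taken from the list above).
SENSITIVE_FIELDS = {"name", "gender", "race", "age", "address"}

def redact(candidate: dict) -> dict:
    """Return a copy of the candidate record without bias-prone fields."""
    return {k: v for k, v in candidate.items() if k not in SENSITIVE_FIELDS}

def skill_score(candidate: dict, required_skills: list) -> float:
    """Share of required skills found in the candidate record (case-insensitive)."""
    if not required_skills:
        return 0.0
    have = {s.lower() for s in candidate.get("skills", [])}
    need = {s.lower() for s in required_skills}
    return len(have & need) / len(need)
```

Ranking then operates only on what `redact` leaves behind, for example `skill_score(redact(candidate), ["Python", "SQL"])`.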

 

  • The technology is extremely cost-effective and saves resources. Rather than asking candidates to enter their information manually, which could discourage them from applying, or spending recruiters’ time on data entry, the information is now captured automatically.
  • The contact information, relevant skills, work history, educational background, and more specific information about the candidate are easily accessible.
  • The applicant screening process is now significantly faster and more efficient. Instead of having to look at every resume, recruiters can filter, sort, and search them by specific characteristics. This allows recruiters to move through the interview process and fill positions at a faster rate.
  • One of the biggest complaints job seekers have is the length of the application process. With a resume parser, the process is faster and candidates have an improved experience.
  • The technology helps prevent qualified candidates from slipping through the cracks. On average, a recruiter spends six seconds looking at a resume. When a recruiter is looking through hundreds or thousands of them, it can be easy to miss or lose track of potential candidates.
  • Once a candidate’s resume has been analyzed, their information remains in the database. If a position comes up that they are qualified for but haven’t applied to, the company still has their information and can reach out to them.

 

Challenges

The parsing software has to rely on complex rules and statistical algorithms to correctly capture the desired information in resumes. Writing style, word choice, and syntax vary widely, and the same word can have multiple meanings. A date alone can be written hundreds of different ways. It is still a challenge for resume parsers to account for all of this ambiguity. Natural language processing and artificial intelligence still have a way to go in understanding context-based information and what humans mean to convey in written language.
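The date problem is easy to demonstrate. The sketch below tries a short, hypothetical list of formats with Python's `datetime.strptime`; it assumes English month names (the default locale) and resume-style dates that carry only a month and a year.

```python
from datetime import datetime

# A handful of formats seen on resumes; real resumes use many more.
DATE_FORMATS = ["%B %Y", "%b %Y", "%m/%Y", "%Y-%m", "%Y"]

def parse_date(text: str):
    """Try each known format in turn; return a datetime, or None if nothing fits."""
    cleaned = text.strip().replace(".", "")  # "Mar. 2019" -> "Mar 2019"
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue  # this format did not match, try the next one
    return None
```

“March 2019”, “Mar. 2019”, “03/2019”, and “2019-03” all resolve to the same month, but a common variant such as “Sept. 2019” returns `None` until someone adds another rule, which is the coverage problem in miniature.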
