What Is Resume Parsing?
Last Revised: July 12, 2021
What is a Resume Parser?
A Resume Parser is a piece of software that can read, understand, and classify all of the data on a resume, just like a human can – but 10,000 times faster.
A Resume Parser is designed to help get candidates' resumes into systems in near real time at extremely low cost, so that the resume data can then be searched, matched and displayed by recruiters.
Are there other names for a Resume Parser?
You may have heard the term "Resume Parser", sometimes called a "Résumé Parser" or "CV Parser" or "Resume/CV Parser" or "CV/Resume Parser". Some companies refer to their Resume Parser as a Resume Extractor or Resume Extraction Engine, and they refer to Resume Parsing as Resume Extraction. These terms all mean the same thing!
What does a Resume Parser do?
A Resume Parser performs Resume Parsing, which is a process of converting an unstructured resume into structured data that can then be easily stored into a database such as an Applicant Tracking System.
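To make "unstructured resume in, structured data out" concrete, here is a deliberately tiny Python sketch. It is not how Sovren or any commercial Resume Parser works. The function name `parse_resume`, the regular expressions, and the rule that the first non-empty line is the candidate's name are all assumptions made for illustration only.

```python
import json
import re

# Hypothetical, simplified patterns; real resumes need far more robust handling.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d ().-]{7,}\d")


def parse_resume(text: str) -> dict:
    """Toy parser: pull a few fields out of free-form resume text."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    email = EMAIL_RE.search(text)
    phone = PHONE_RE.search(text)
    return {
        # Naive assumption: the first non-empty line is the candidate's name.
        "name": lines[0] if lines else None,
        "email": email.group(0) if email else None,
        "phone": phone.group(0) if phone else None,
    }


sample = """Jane Doe
jane.doe@example.com | +1 555-010-9999
Software Engineer, Acme Corp, 2015-2021
"""

record = parse_resume(sample)
# The structured record can now be stored field by field in a database.
print(json.dumps(record, indent=2))
```

The point is only the shape of the transformation: free text goes in, and named fields that a database can store come out.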
What is the purpose of a Resume Parser?
The purpose of a Resume Parser is to replace slow and expensive human processing of resumes with extremely fast and cost-effective software. A Resume Parser allows businesses to eliminate the slow and error-prone process of having humans hand-enter resume data into recruitment systems. A Resume Parser classifies the resume data and outputs it into a format that can then be stored easily and automatically into a database or ATS or CRM.
With a Resume Parser, a resume can be stored in the recruitment database in real time, within seconds of the candidate submitting it.
What else does a Resume Parser do?
A Resume Parser should do more than just classify the data on a resume: it should also summarize the data and describe the candidate. Think of the Resume Parser as the world's fastest data-entry clerk AND the world's fastest reader and summarizer of resumes. For instance, a Resume Parser should tell you how many years of work experience the candidate has, how much management experience they have, what their core skillsets are, and many other types of "metadata" about the candidate.
What types of "metadata" can a Resume Parser provide?
A Resume Parser should also provide metadata, which is "data about the data". For example, a very basic Resume Parser would report that it found a skill called "Java". But a Resume Parser should also calculate and provide more information than just the name of the skill. It should be able to tell you:
- The name of the skill.
- Each place where the skill was found in the resume.
- When the skill was last used by the candidate.
- How long the skill was used by the candidate.
- How the skill is categorized in the skills taxonomy.
Not all Resume Parsers use a skill taxonomy. Some Resume Parsers just identify words and phrases that look like skills. Unfortunately, uncategorized skills are not very useful because their meaning is not reported or apparent.
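The skill metadata listed above could be modeled roughly as follows. This is a sketch under stated assumptions, not the output schema of any real Resume Parser: the class names `SkillMention` and `SkillMetadata`, their fields, and the taxonomy path are hypothetical, and the duration is a simple sum of months that does not de-duplicate overlapping jobs.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class SkillMention:
    """One place in the resume where the skill was found (hypothetical shape)."""
    location: str  # e.g. the job entry that mentions the skill
    start: date    # start date of that job entry
    end: date      # end date of that job entry


@dataclass
class SkillMetadata:
    name: str
    taxonomy_path: List[str]  # how the skill is categorized (assumed taxonomy)
    mentions: List[SkillMention] = field(default_factory=list)

    @property
    def last_used(self) -> Optional[date]:
        """Most recent end date across all mentions, or None if never found."""
        return max((m.end for m in self.mentions), default=None)

    @property
    def months_used(self) -> int:
        """Total months across mentions (overlaps are not de-duplicated)."""
        return sum(
            (m.end.year - m.start.year) * 12 + (m.end.month - m.start.month)
            for m in self.mentions
        )


java = SkillMetadata(
    name="Java",
    taxonomy_path=["Information Technology", "Programming Languages"],
    mentions=[
        SkillMention("Acme Corp", date(2015, 1, 1), date(2018, 1, 1)),
        SkillMention("Globex", date(2019, 6, 1), date(2021, 6, 1)),
    ],
)

print(java.name, [m.location for m in java.mentions])
print("last used:", java.last_used, "| months used:", java.months_used)
```

Here the two job entries contribute 36 and 24 months, so the skill is reported as used for 60 months and last used in June 2021.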
What are the benefits of using a Resume Parser?
A Resume Parser benefits all the main players in the recruiting process.
Benefits for Candidates: When a recruiting site uses a Resume Parser, candidates do not need to fill out applications. They can simply upload their resume and let the Resume Parser enter all the data into the site's CRM and search engines. In other words, a great Resume Parser can reduce the effort and time to apply by 95% or more.
Benefits for Recruiters: Because using a Resume Parser eliminates almost all of the candidate's time and hassle of applying for jobs, sites that use Resume Parsing receive more resumes, and more resumes from great-quality candidates and passive job seekers, than sites that do not use Resume Parsing. Also, the time that it takes to get all of a candidate's data entered into the CRM or search engine is reduced from days to seconds. So, a huge benefit of Resume Parsing is that recruiters can find and access new candidates within seconds of the candidates' resume upload. In recruiting, the early bird gets the worm.
Benefits for Executives: Because a Resume Parser will get more and better candidates, and allow recruiters to "find" them within seconds, using Resume Parsing will result in more placements and higher revenue.
Benefits for Investors: Using a great Resume Parser in your jobsite or recruiting software shows that you are smart and capable and that you care about eliminating time and friction in the recruiting process. Since 2006, over 83% of all the money paid to acquire recruitment technology companies has gone to customers of the Sovren Resume Parser. That's 5x more total dollars for Sovren customers than for all the other resume parsing vendors combined.
Who should use a Resume Parser?
Any company that wants to compete effectively for candidates, or bring their recruiting software and process into the modern age, needs a Resume Parser.
- Corporate jobsites
- Job boards
- Recruitment software vendors
- Recruiting or staffing firms
- Executive Placement firms
- Outplacement firms
- Recruitment Process Outsourcing (RPO) firms
Does a Resume Parser store data?
Some do, and that is a huge security risk. A Resume Parser should not store the data that it processes. Some vendors store the data because their processing is so slow that they need to send it to you in an "asynchronous" process, like by email or "polling".
The actual storage of the data should always be done by the users of the software, not the Resume Parsing vendor. Unless, of course, you don't care about the security and privacy of your data….
Sovren's public SaaS service does not store any data that is sent to it to parse, nor any of the parsed results.
How does a Resume Parser get the resumes to parse?
A Resume Parser does not retrieve the documents to parse. Resumes can be supplied from candidates (such as in a company's job portal where candidates can upload their resumes), or by a "sourcing application" that is designed to retrieve resumes from specific places such as job boards, or by a recruiter supplying a resume retrieved from an email.
What is the typical workflow for a Resume Parser?
Let's take a live-human-candidate scenario. A candidate (1) comes to a corporation's job portal and (2) clicks the button to "Submit a resume". That resume is (3) uploaded to the company's website, (4) where it is handed off to the Resume Parser to read, analyze, and classify the data. The Resume Parser then (5) hands the structured data to the data storage system (6) where it is stored field by field into the company's ATS or CRM or similar system. (7) Now recruiters can immediately see and access the candidate data, and find the candidates that match their open job requisitions.
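The numbered steps above can be sketched as a small pipeline. Everything here is a hypothetical stand-in: `InMemoryATS` plays the role of the company's ATS or CRM, `toy_parse` plays the role of the Resume Parser, and `handle_upload` is the website code that connects them. Note that the parser function returns data and keeps nothing; storage happens on the company's side.

```python
from typing import Callable, Dict, List


class InMemoryATS:
    """Stand-in for the company's ATS/CRM (steps 5-7). Hypothetical."""

    def __init__(self) -> None:
        self.records: List[Dict[str, str]] = []

    def store(self, record: Dict[str, str]) -> None:
        # Steps 5-6: the structured data is stored field by field.
        self.records.append(dict(record))

    def search(self, field_name: str, value: str) -> List[Dict[str, str]]:
        # Step 7: recruiters can find the candidate immediately.
        return [
            r for r in self.records
            if value.lower() in r.get(field_name, "").lower()
        ]


def toy_parse(text: str) -> Dict[str, str]:
    """Stand-in for the Resume Parser (step 4); it stores nothing itself."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return {"name": lines[0] if lines else "", "raw_text": text}


def handle_upload(
    resume_text: str,
    parse: Callable[[str], Dict[str, str]],
    ats: InMemoryATS,
) -> Dict[str, str]:
    """Steps 3-6: receive the upload, parse it, and store the result."""
    record = parse(resume_text)
    ats.store(record)
    return record


ats = InMemoryATS()
record = handle_upload("Jane Doe\nJava developer, Acme Corp", toy_parse, ats)
matches = ats.search("raw_text", "java")
print(record["name"], "| matches for 'java':", len(matches))
```

Passing the parser in as a function keeps the website code independent of any particular vendor, which also makes it easy to test several parsers against the same resumes.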
What document formats can a Resume Parser process?
That depends on the Resume Parser. The Sovren Resume Parser handles all commercially used text formats including PDF, HTML, MS Word (all flavors), Open Office – many dozens of formats. If the document can have text extracted from it, we can parse it!
What languages can a Resume Parser process?
That depends on the Resume Parser. The Sovren Resume Parser features more fully supported languages than any other Parser. Read the fine print, and always TEST. Some vendors list "languages" in their website, but the fine print says that they do not support many of them!
How fast is Resume Parsing?
The Sovren Resume Parser's public SaaS Service has a median processing time of less than one half second per document, and can process huge numbers of resumes simultaneously. Other vendors' systems can be 3x to 100x slower. One vendor states that they can usually return results for "larger uploads" within 10 minutes, by email (https://affinda.com/resume-parser/ as of July 8, 2021).
Can a Resume Parser help with Privacy?
Some can. For instance, the Sovren Resume Parser returns a second version of the resume, a version that has been fully anonymized to remove all information that would have allowed you to identify or discriminate against the candidate – and that anonymization even extends to removing all of the Personal Data of all of the people (references, referees, supervisors, etc.) mentioned in the resume.
Has Resume Parsing been around for long?
Yes! The first Resume Parser was invented about 40 years ago and ran on the Unix operating system. It was called Resumix ("resumes on Unix") and was quickly adopted by much of the US federal government as a mandatory part of the hiring process. The system was very slow (1-2 minutes per resume, one at a time) and not very capable. It is no longer used.
A new generation of Resume Parsers sprang up in the 1990s, including Resume Mirror (no longer active), Burning Glass, Resvolutions (defunct), Magnaware (defunct), and Sovren. Later, Daxtra, Textkernel, and Lingway (defunct) came along, then rChilli and others such as Affinda.
How much is Resume Parsing used?
It depends on the product and company. Sovren's public SaaS service processes millions of transactions per day, and in a typical year, Sovren Resume Parser software will process several billion resumes, online and offline. Yes, that is more resumes than actually exist. Sovren's software is so widely used that a typical candidate's resume may be parsed many dozens of times for many different customers. Other vendors process only a fraction of 1% of that amount. For example, Affinda states that it processes about 2,000,000 documents per year (https://affinda.com/resume-redactor/free-api-key/ as of July 8, 2021), which is less than one day's typical processing for Sovren.
Can a Resume Parser parse scanned images?
Not accurately, not quickly, and not very well. Optical character recognition (OCR) software is rarely able to extract commercially usable text from scanned images, usually resulting in terrible parsed results. In addition, there is no commercially viable OCR software that does not need to be told IN ADVANCE what language a resume was written in, and most OCR software can only support a handful of languages. Parsing images is a trail of trouble.
How do I select a Resume Parser?
TEST TEST TEST, using real resumes selected at random. Do NOT believe vendor claims! Here is a great overview on how to test Resume Parsing.
Ask how many people the vendor has in "support". The more people who are in support, the worse the product is. Poorly made cars are always in the shop for repairs. Sovren receives fewer than 500 Resume Parsing support requests a year, from billions of transactions. That is a support request rate of less than 1 in 4,000,000 transactions.
Ask for accuracy statistics. If a vendor readily quotes accuracy statistics, you can be sure that they are making them up. Accuracy statistics are the original fake news. There are no objective measurements. That's why you should disregard vendor claims and test, test, test!
Ask about configurability. Can the Parsing be customized per transaction? Does it have a customizable skills taxonomy?
Ask about customers. Sovren's customers include:
- The three most important job boards in the world
- The largest technology company in the world
- The largest ATS in the world, and the largest North American ATS
- The most important social network in the world
- The largest privately held recruiting company in the world
Look at what else they do. Resume Parsing is an extremely hard thing to do correctly. Do they stick to the recruiting space, or do they also have a lot of side businesses like invoice processing or selling data to governments? Those side businesses are red flags, and they tell you that they are not laser focused on what matters to you.