What’s one of the biggest problems the US intelligence community has encountered over the past 10 years? Data. Leaders in the intelligence community have argued that the challenge is not a lack of information (tens of thousands of terabytes are collected daily), but the difficulty of analyzing the vast volume of unstructured information available.
It’s why savvy intelligence analysts remain in demand, and why careers such as Data Scientist are on the rise. The same challenge has also spurred innovative analytics solutions, including Content Analyst Analytical Technology (CAAT®), a patented, award-winning product of Content Analyst Company. CAAT is a conceptual search tool based on Latent Semantic Indexing (LSI) technology, which means it can use a phrase, a sentence or an entire document as the query to find the most relevant material in large collections of unstructured information. It’s the same search technology used by the Intelligence Community, now available to ClearanceJobs employers via the IntelliSearch™ candidate profile search tool.
“When you think about the vast amount of unstructured information that the IC has to deal with, and the fact that it’s multilingual, and the fact that information is obfuscated in many ways – code words, code names – if you wanted to do a keyword search, you’re not going to have good luck,” said Steven Toole, vice president of marketing at Content Analyst.
Rather than relying on specific or exact keywords and traditional Boolean logic, conceptual search ‘mimics the way people think – by topics or concepts versus keywords’.
Content Analyst Company was created in 2004 as a spinoff of defense contractor SAIC – and the intelligence community was its first customer.
A search engine that’s as expressive as you are
In 2014, Director of National Intelligence James Clapper called on the government and private sector to help the intelligence community ‘find the needle without the haystacks.’ Throughout the past decade, researchers in organizations from the NSA to the Intelligence Advanced Research Projects Agency (IARPA) have looked to both government partners and industry for innovative ways to sift through the volume of data available. As in many other industries, the internet has made more data accessible than ever. The question that remains is which data is most relevant.
THE JURY IS IN
When SAIC went public, Content Analyst Company went out on its own – and headed straight into jury rooms across the country. Its technology is now used in every court of law in the country as a permissible technique for culling documents and producing only those most relevant to the case, noted Toole. CAAT analytic capabilities are used by tens of thousands of attorneys in dozens of e-Discovery software tools. For the legal community, CAAT search intelligently identifies the most relevant documents in legal matters, creating a review process that is both more efficient and trusted. The legal community has weighed in, and has “no reason to think this tool has omitted any information,” Toole noted.
YOUR BOOLEAN HEADACHES
There’s another area where search functionality can spell the difference between success and failure – and that’s the world of recruiting and talent sourcing. When Content Analyst Company was a part of SAIC, internal recruiters caught wind of this innovative technology and asked to put it to work to help identify the most relevant (qualified) candidates among hundreds of thousands of past applicants. A beta test was supposed to run for 60 days, with recruiters doing a side-by-side comparison of CAAT and their existing Boolean search strings. The pilot lasted two days before the recruiters asked to put away the Booleans and start using CAAT exclusively, said Toole.
“There are lots of ways to say things related to employment. Some things are only expressed in a certain way – like your clearance level,” he noted. “Beyond that, when you get into someone’s professional skills and experiences, there are zillions of ways to express that. One of those is how it’s expressed in a job description, one of those is how a candidate types it into his or her resume.”
That means every resume uses a different set of keywords to describe the same skills. Matching them becomes even more difficult in the defense industry – a world of hyphens, acronyms and military skills that may or may not translate exactly into civilian job titles.
Fortunately, CAAT doesn’t rely on a user’s ability to predict how the perfect candidate will list the skills you need on his or her resume. It does the heavy lifting of translating and deciphering. All you need to do is copy and paste your position description into the query box and CAAT matches the requirements to the skills and experience in the resumes and automatically stack-ranks the candidates from most qualified to least – no keywords or Boolean search strings necessary.
“CAAT can identify, for example, all of the ways to express a skill such as Java programming. CAAT identifies these terms as they are used in the resumes that are being searched, and identifies them as similar terms for the same concept – Java programming. As such, CAAT can identify the resumes that express Java programming in one way or another, regardless of specific terms used. Once identified, CAAT can stack rank the resumes matching all of the skills and requirements identified in the job description,” said Toole. “That’s the power of LSI – it learns from the content (resume database) itself. It’s constantly updating itself and learning all of the latest spellings, acronyms, and terms, and mapping those terms to broader concepts.”
LSI is essentially a mathematical approach to text analytics. It extracts the contextual relationships among the terms in every text object within a collection. Because those relationships are derived from the collection itself, the index keeps learning and updating as new content is added, so it can continue to pull in the most relevant data.
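To make the idea concrete, here is a minimal sketch of LSI in Python using NumPy. It is not CAAT's implementation, which this article does not describe. The toy resumes, the function names (`build_term_doc_matrix`, `lsi_scores`) and the choice of two latent concepts (`k=2`) are all assumptions made for illustration. The sketch builds a term-by-document count matrix, reduces it with a truncated singular value decomposition, and compares the query to each resume in the reduced "concept" space.

```python
import numpy as np

# Hypothetical toy resumes (invented for this sketch, not real data).
RESUMES = [
    "java developer j2ee spring",
    "j2ee spring jvm engineer",
    "java jvm programming",
    "network administrator cisco routing",
    "cisco routing firewall administrator",
]

def build_term_doc_matrix(docs):
    """Return a (terms x documents) count matrix and a term -> row index map."""
    vocab = sorted({word for doc in docs for word in doc.lower().split()})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for col, doc in enumerate(docs):
        for word in doc.lower().split():
            matrix[index[word], col] += 1.0
    return matrix, index

def lsi_scores(docs, query, k=2):
    """Cosine similarity between the query and each document in a k-dimensional concept space."""
    matrix, index = build_term_doc_matrix(docs)
    # Truncated SVD: keep only the k strongest term patterns ("concepts").
    u, _, _ = np.linalg.svd(matrix, full_matrices=False)
    u_k = u[:, :k]
    # Project every document and the query onto the same concept axes.
    doc_vecs = matrix.T @ u_k
    q = np.zeros(len(index))
    for word in query.lower().split():
        if word in index:  # words unseen in the collection are ignored
            q[index[word]] += 1.0
    q_vec = q @ u_k
    denom = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec)
    denom = np.where(denom == 0, 1.0, denom)  # avoid division by zero
    return doc_vecs @ q_vec / denom

if __name__ == "__main__":
    scores = lsi_scores(RESUMES, "java programming", k=2)
    # Stack-rank resumes from most to least similar to the query.
    for i in np.argsort(-scores):
        print(f"{scores[i]:.2f}  {RESUMES[i]}")
```

In this toy collection the second resume shares no words with the query "java programming", yet it should score about as high as the resumes that do, because "j2ee", "spring" and "jvm" co-occur with "java" elsewhere in the collection. The two networking resumes should score near zero. A production system would typically add term weighting and a far larger number of concepts.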
BOOLEAN VS. CAAT – MAKING THE CASE
Toole describes CAAT like this: let’s say you have a job description that requires 10 skills. Using Boolean search, you would need to create an argument that captures every way each of those 10 skills might be expressed across the more than 760,000 registered candidates in the ClearanceJobs database. In contrast, CAAT understands that those skills are actually concepts. It’s not just going to show you the candidates who have that exact phrase; it’s going to show you the best resumes that contain those concepts (in other words, skills and experience). And by matching on concepts rather than exact keywords or phrases, it lets recruiters and talent sourcers identify more qualified candidates with less effort and in less time than Boolean searching allows.
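The Boolean side of that comparison can be sketched in a few lines of Python. Everything here is hypothetical: the skill names, the variant lists and the helper names (`build_boolean_query`, `boolean_match`) are invented for illustration and do not reflect any real search syntax on ClearanceJobs. The point is only that the recruiter has to enumerate the variants by hand, and that the result is a plain yes or no with no ranking.

```python
# Hypothetical variant lists a recruiter might maintain by hand (assumed, not from the article).
SKILL_VARIANTS = {
    "java programming": ["java", "j2ee", "jvm"],
    "network administration": ["network administrator", "network admin", "sysadmin"],
}

def build_boolean_query(skill_variants):
    """Join the variants of each skill with OR, and the skills with AND."""
    groups = []
    for variants in skill_variants.values():
        groups.append("(" + " OR ".join(f'"{v}"' for v in variants) + ")")
    return " AND ".join(groups)

def boolean_match(resume_text, skill_variants):
    """True only if every skill has at least one listed variant in the text."""
    text = resume_text.lower()
    return all(
        any(variant.lower() in text for variant in variants)
        for variants in skill_variants.values()
    )

if __name__ == "__main__":
    print(build_boolean_query(SKILL_VARIANTS))
    # Matches: uses the listed variants "j2ee" and "sysadmin".
    print(boolean_match("Senior J2EE engineer and part-time sysadmin", SKILL_VARIANTS))
    # Misses: describes similar experience with wording nobody listed.
    print(boolean_match("Built JEE services; managed LAN/WAN infrastructure", SKILL_VARIANTS))
```

With two skills and three variants each the query is already six terms long. A job description with 10 skills needs a list of variants for every one of them, and any phrasing the recruiter did not anticipate is silently excluded.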
“ClearanceJobs is excited to be the first career site in our industry offering deep contextual search technology to over 1,200 employers that rely on us for top talent, fast,” said Evan Lesser, Founder and President of ClearanceJobs.com. “Like it or not, search is evolving. Boolean logic carries no information beyond a basic ‘true’ or ‘false.’ We see contextual search as an important step in helping employers identify the best talent for the job, rather than just identifying which resumes contain the right keywords. In this case, I can say that if it’s good enough for the Intelligence Community, it’s good enough for us.”
CAAT search technology is now available to all ClearanceJobs.com customers through the IntelliSearch™ tool. Log into your ClearanceJobs account to put this intelligent search tool to use.