New search engine to help thwart terrorists
With news that the London bombers were British citizens, radicalised on the streets of England and with squeaky-clean police records, comes the realisation that new mechanisms for hunting terrorists before they strike must be developed.
Researchers at the University at Buffalo, US, believe they have discovered a technique that will reveal information on public web sites that was never intended to be published.
The United States Federal Aviation Administration (FAA) and the National Science Foundation (NSF) are supporting the development of a new search engine based on Unintended Information Revelation (UIR), and designed for anti-terrorism applications.
UIR supposes that snippets of information, each of which appears innocent by itself, may be linked together to reveal highly sensitive data.
The need for such a tool arose after 9/11 when the FAA started focusing on information being disseminated on its own web site.
"It couldn't tell if it was possible to infer things that the FAA doesn't want others to infer by putting together data from this page and that page and that page," said Rohini Srihari, Ph.D., professor of computer science and engineering.
Srihari is developing the search engine with colleagues at the university's Center of Excellence for Document Analysis and Recognition in the School of Engineering and Applied Sciences.
Existing search engines rank individual documents by how often a key word appears in each one, whereas UIR constructs a concept chain graph and searches it for the best path connecting two ideas across a multitude of documents.
To develop the method, researchers used the chapters of the 9/11 Commission Report to establish concept ontologies – lists of terms of interest in the specific domains relevant to the researchers: aviation, security and anti-terrorism issues.
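As a rough illustration of how a list of ontology terms might be applied to a page of text, the sketch below tags a document with the terms it mentions. It is not the researchers' method: the function name `tag_concepts`, the three ontology terms and the sample sentence are all hypothetical, and simple case-insensitive phrase matching is assumed.

```python
import re

def tag_concepts(text, ontology):
    """Return the ontology terms found in text as whole words or phrases."""
    found = set()
    for term in ontology:
        # \b anchors keep "visa" from matching inside a longer word.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(term)
    return found

# Hypothetical ontology and page text, invented for this example.
ontology = ["flight school", "visa", "airport security"]
page = "Applicants to the flight school must hold a valid visa."
concepts = tag_concepts(page, ontology)
print(sorted(concepts))
```

Each page reduced to a set of terms in this way could then serve as input to a graph of linked concepts.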
"A concept chain graph will show you what's common between two seemingly unconnected things," said Srihari. "With regular searches, the input is a set of key words, the search produces a ranked list of documents, any one of which could satisfy the query.
"UIR, on the other hand, is a composite query, not a keyword query. It is designed to find the best path, the best chain of associations between two or more ideas. It returns to you an evidence trail that says, 'This is how these pieces are connected.'"
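The idea of a chain of associations with an evidence trail can be sketched in a few lines. The example below is only an assumption about how such a search might work, not the UIR algorithm itself: it links two concepts whenever they appear in the same document, uses a plain breadth-first search to find a shortest chain between two concepts, and reports which documents support each link. The page names and concepts are invented, and the function names are hypothetical.

```python
from collections import defaultdict, deque

def build_concept_graph(docs):
    """Link concepts that co-occur in a document; record the supporting documents."""
    graph = defaultdict(lambda: defaultdict(set))
    for doc_id, concepts in docs.items():
        for a in concepts:
            for b in concepts:
                if a != b:
                    graph[a][b].add(doc_id)
    return graph

def find_chain(graph, start, goal):
    """Breadth-first search for a shortest chain of links from start to goal."""
    if start == goal:
        return [start]
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in sorted(graph.get(node, {})):
            if neighbour in parent:
                continue
            parent[neighbour] = node
            if neighbour == goal:
                # Walk back through the parents to rebuild the chain.
                chain = []
                step = neighbour
                while step is not None:
                    chain.append(step)
                    step = parent[step]
                return chain[::-1]
            queue.append(neighbour)
    return None  # the two concepts are not connected

def evidence_trail(graph, chain):
    """For each link in the chain, list the documents in which both ends appear."""
    return [(a, b, sorted(graph[a][b])) for a, b in zip(chain, chain[1:])]

# Invented pages, each reduced to the set of concepts it mentions.
docs = {
    "page1": {"airport", "runway schedule"},
    "page2": {"runway schedule", "maintenance contractor"},
    "page3": {"maintenance contractor", "access badge"},
    "page4": {"airport", "car park"},
}
graph = build_concept_graph(docs)
chain = find_chain(graph, "airport", "access badge")
trail = evidence_trail(graph, chain)
for a, b, pages in trail:
    print(a, "->", b, "via", pages)
```

No single page here mentions both "airport" and "access badge", yet the trail shows how three pages connect them. A real system would presumably weight links by strength rather than treat every co-occurrence as equal.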
The hope is to develop core algorithms that expose veiled paths through documents generated by different individuals or organisations.
Such a step matters in counter-terrorism because, thanks to their clean records, the London bombers were able to plot and coordinate their attack without being detected by informants, MI5 agents or other officials using traditional interception methods.
Lord Stevens, the former Metropolitan police chief, reportedly told a weekend newspaper: "They [the bombers] will be apparently ordinary British citizens; young men conservatively and cleanly dressed and probably with some higher education. Highly computer literate, they will have used the internet to research explosives."
But while the bombers do not fit the stereotype of bearded extremists emerging from dusty training camps in Afghanistan, there may be a pattern of seemingly innocent links that connects online documents, text messages and emails to the perpetrators.
It is unknown whether British authorities are conducting similar research. Yet Srihari explained that, using only the 9/11 body of evidence, the system found that terrorists Ramzi Binalshibh and Mohamed Atta shared apartments in Hamburg, that Atta and Nawaf al Hazmi were hijackers involved in the 9/11 attacks, and that Hazmi found an apartment in San Diego with the help of Anwar Aulaqi, an imam at a mosque in San Diego.
"The concept chains show you what may be of interest, but the real intelligence here is gleaned from looking for patterns of interest," said Srihari. "Once a pattern of interest is identified, then you can ask, 'Are there more patterns like this?'"
A robust prototype is expected to be delivered to the FAA by the end of the year.