TerroGate engine can alert security analysts to potential targets, pinpoint future hot spots
Sarah Staples
Sun
Canadian defence scientists will unveil today the Google of terror-fighting tools: the world’s first search engine able to track down sophisticated references to terrorism hidden in vast quantities of written documents and Web pages.
TerroGate, developed by computational linguists at Defence R&D Canada — Valcartier, in Val-Belair, Que., is software that uses algorithms to search for the vocabulary of terrorism.
From a theoretically limitless and ever-changing store of written information, TerroGate calls up word matches centred around five main themes: terrorist tactics, groups and individuals, weapons, locations and targets.
The words could be contained in common document formats, including Microsoft Word, Adobe Acrobat and HTML (the markup language of Web pages). Eventually, the software will be able to comb through newswire feeds in real time as well.
The software could, for example, pull all references to water or nuclear power plants thought of as targets in Western Europe; instantly alert national security analysts to places where suicide bombings have occurred; or point out potential future hot spots anywhere in the world where wanted individuals are likely to strike.
TerroGate melds two emerging search trends. An “entity extraction” component sifts through documents, tagging relevant words for easy retrieval. And the system is one of a handful in the world capable of performing “conceptual” searches, which hunt not merely for keywords, the way Google or Yahoo do, but also for notions more loosely associated with the keyword.
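The article does not describe how TerroGate works internally, so the following is only a minimal sketch of the two ideas in general terms, not the Valcartier team's method. It assumes a tiny hand-made lexicon that files terms under themes; the `LEXICON` contents and the function names `tag_entities` and `concept_search` are hypothetical, invented for illustration.

```python
import re

# Hypothetical mini-lexicon (theme -> terms), invented for illustration.
# It is not TerroGate's vocabulary; the terms are taken from the article's examples.
LEXICON = {
    "tactics": ["suicide bombing"],
    "targets": ["nuclear power plant", "water plant"],
    "locations": ["western europe"],
}


def tag_entities(text):
    """Return sorted (position, term, theme) tuples for each lexicon term found."""
    hits = []
    lowered = text.lower()
    for theme, terms in LEXICON.items():
        for term in terms:
            # Whole-word, case-insensitive match of the term.
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, lowered):
                hits.append((match.start(), term, theme))
    return sorted(hits)


def concept_search(theme, documents):
    """Return indices of documents containing any term filed under the theme."""
    return [
        index
        for index, doc in enumerate(documents)
        if any(found == theme for _, _, found in tag_entities(doc))
    ]


docs = [
    "Officials reviewed security at a nuclear power plant in Western Europe.",
    "The city council debated a new transit budget.",
    "Police investigated a suicide bombing near the market.",
]

for position, term, theme in tag_entities(docs[0]):
    print(position, term, theme)
print(concept_search("targets", docs))
```

In this sketch, a query for the theme "targets" retrieves the first document even though the user never typed "nuclear power plant", which is the gist of searching by concept rather than by keyword. A real system would need a far richer vocabulary and linguistic analysis than simple string matching.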
“With traditional search engines you always need to know what [word] you’re looking for,” said Alain Auger, group leader of the knowledge management systems group at Valcartier, and the computational linguist in charge of the research. “Here, as long as you have an idea of the concept you want to search, you don’t have to know all the keywords. You can finally find information you didn’t know you wanted. To my knowledge, there’s nothing similar to it right now in the world.”
Security forces around the world already rely on rudimentary “entity extraction” technology. At least two commercial systems exist (AeroText, by a subsidiary of Lockheed Martin, and ThingFinder, by Inxight Software, Inc., which is used by the U.S. Defense Department and the U.S. Army), but they only annotate generic proper names or place names in a document. It’s still up to defence analysts to decide if the tagged references point to terrorist activity, Auger said.
In contrast, TerroGate automatically homes in on terror-related concepts and terminology without need for further analyst intervention, and with an accuracy rate of 93 per cent — a feat the software can accomplish in under three seconds, the scientists said.
The software is scheduled to be introduced in Quebec City, at Defence & Security Innovation 2005, a biennial conference attended by various government agencies.
Researchers started with a shortlist of one million words culled from non-classified international reports on terrorism, which they winnowed into a list of 3,000 exclusively terror-related terms. Those form the backbone of TerroGate’s sensitive, multi-layered searches, Auger said.
Auger said he’ll also consider building separate versions of TerroGate to retrieve words in languages other than English.
“Once it has been demonstrated that the technology is valid, from there now the potential is vast.”
© The Vancouver Sun 2005