Massively scalable hierarchical semantic complexity crystallization of unstructured data
(Multiple keyword views of text)
15 September 2012
(Application for non-cloud, offline, local machine simulated-Hadoop, scalable processing for military-, homeland security- and other classified, private and confidential data streams)
Table of Contents:
Part I: Field
Part II: Background
Part III: Objective
Part IV: Innovation
Part V: Technical summary
(i) Programmatic data
(ii) Programmatic structure
Part VI: Technical example (descriptive)
Part VII: Technical examples (illustrative)
Part VIII : Technical example (diagrams)
Part IX : Applications
Part X : Application differentiation and strengths
Part XI : Application performance
Part XII : Application status
Part XIII : Application updates
Part XIV : Contact information
Part XV : Addendum (USA focus)
Part XVI : General disclaimers
Part I: Field
This software application relates to information processing, and more specifically to scalable stepwise analysis of unstructured natural language data to generate multiple alternative semantic views of text and to simplify and speed up real-time analysis of large-volume inline data streams. Such streams could include real-time security image or video tagging, inline analysis of Twitter or Facebook traffic, or security scanning of mobile phone SMS and voice data.
Part II: Background
Enabling computers to understand language remains one of the hardest problems in text analysis. Language is highly contextual. Often the same words have different meanings in different contexts and small differences in sentence structure can lead to totally different meanings. At the same time, a great number of different sentence structures can have the same meaning.
Most information analytics use text-based probing tools. To return accurate results, search- or summarization algorithms must be able to apply some form of language interpretation to query strings. Most of the time interpretation will be limited to simple keyword determination for extraction against data repositories.
One simple increase in extraction complexity is to perform keyword synonym replacements, usually on a single-to-single word basis. Thus the word “picture” may have the synonym “photo”, so that an analytical search against “picture of Grand Canyon” also extracts “photo of Grand Canyon”.
Merely applying synonyms can easily lead to wrong results. If a search is for “history of motion pictures” then the word “pictures” must not be substituted with “photos” because the string “history of motion photos” is meaningless. As another example, if an analysis includes a search for “HP wide screen monitor” and we operationally substitute the synonym “detector” for “monitor”, and “shutter” for “screen”, completely irrelevant returns would be delivered.
General text extraction analysis therefore also needs to be able to perform contextual (meaning or semantic) processing so that it “knows”, for example, that the string “HP wide screen monitor” has nothing to do with shutters or detectors and that the term “motion photo” is not the same as “motion picture”.
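The failure mode described above can be reproduced with a minimal sketch of context-free synonym replacement. The synonym table and the function name are hypothetical and only mirror the examples given in the text; they are not part of Gatfol.

```python
# Hypothetical one-to-one synonym table; the entries mirror the examples in the text.
SYNONYMS = {
    "picture": "photo",
    "pictures": "photos",
    "monitor": "detector",
    "screen": "shutter",
}


def naive_substitute(query: str) -> str:
    """Replace every word that has a synonym, ignoring context entirely."""
    return " ".join(SYNONYMS.get(word.lower(), word) for word in query.split())


print(naive_substitute("picture of Grand Canyon"))     # photo of Grand Canyon
print(naive_substitute("history of motion pictures"))  # history of motion photos
print(naive_substitute("HP wide screen monitor"))      # HP wide shutter detector
```

The first substitution is useful, while the second and third are the meaningless strings the text warns about, because nothing in the table knows which neighbouring words are present.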
Even words which are normally interchangeable can lead to totally different meanings when used in different contexts. A search for “arm reduction” probably has to do with cosmetic surgery whereas “arms reduction” relates to reducing stockpiles of weaponry. When longer sentences are involved, erroneous permutations become exponentially more complex.
It is very difficult for machines to semantically interpret longer search queries so as to deliver meaningful extraction results. As an illustration, a search on Google™ for “Software companies founded before 1990 with a current turnover of more than $100 million” yields a list of largely irrelevant references, even though the search query is perfectly clear to a human and the information is doubtless available on the Internet.
Because existing analytics rely primarily on keywords and basic synonym replacement rather than the semantic “context” of words, most extraction operations, through necessity, spiral down to what has been dubbed “caveman speak”. An extraction of popular seafood restaurants in Seattle, for example, might end up as a search for “seafood Seattle” rather than “provide a list of good affordable seafood restaurants in Seattle”.
Existing semantic analysis engines are weak at converting complex contextual meanings in search inputs to meaningful results.
Much of the web and many proprietary datasets in which machine-readable data is available also contain “meta-data” (information about data) that guides language analytic tools on what a subset of text or a topic is about. This meta-data can take the form of structuring (e.g. columns with textual column headers), instruction sets that act as processing “directors”, or a purely textual synopsis of the information that follows. In each case the aim is to enable non-human analytic tools to understand the meaning of information directly, without the interpretation problems that plague unstructured text.
Currently, certain defined domains, for example airline booking systems, operate in this way. Thus the term “JFK” in an airline booking system refers only to John F Kennedy International Airport in New York, not to the former US president or to other terms that share this three-letter acronym. Some hierarchical analytic engines identify higher-level groupings or categories and filter out irrelevant results by “vertically” applying selected categories only. Thus a search for “chicken” might identify the categories “animals” and “recipes” and allow the analytics to search within only one of the two.
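The “vertical” category filtering described above can be sketched in a few lines. The records, the category labels and the function name are invented for illustration and do not describe any particular engine.

```python
from typing import List, Optional

# Hypothetical category-tagged records standing in for a structured dataset.
RECORDS = [
    {"text": "Roast chicken with lemon", "category": "recipes"},
    {"text": "Chicken breeds for small farms", "category": "animals"},
    {"text": "Chicken soup for cold days", "category": "recipes"},
]


def search(keyword: str, category: Optional[str] = None) -> List[str]:
    """Return texts containing the keyword, optionally restricted to one category."""
    keyword = keyword.lower()
    return [
        record["text"]
        for record in RECORDS
        if keyword in record["text"].lower()
        and (category is None or record["category"] == category)
    ]


print(search("chicken"))             # all three records
print(search("chicken", "recipes"))  # only the two recipe records
```

The filter works only because every record already carries a category label, which is exactly the meta-data that unstructured text lacks.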
The goal of all-encompassing semantic search has not yet fully been realized, despite ongoing efforts to index, categorize and associate concepts in multitudes of datasets worldwide. The main problem is the enormity of the task involved in performing such identification and association, which requires huge structured lexicons and ontologies as guides.
It would be advantageous to have a completely autonomous self-replicating system that is able to build a contextual language model so that search strings can be interpreted more accurately by language analytic engines, without the need to categorize or index existing content.
Part III: Objective
It is the object of the Gatfol software application to provide a massively scalable but easy-to-install system, in the form of a simulated Hadoop method (distributed multiple redundant master and slave nodes), for the stepwise crystallization of natural language (English) text input from semantic complexity to semantic simplicity. The system runs on single ordinary desktop computers and enables extremely fast searches of multi-keyword groups into very large data streams.
Part IV: Innovation
This software application is subsumed under provisional patent number 61/476,917 lodged 19 April 2011 in New York USA under the international searching authority of the United States Patent and Trademark Office (USPTO) (ISA/US) with the title of “A SYSTEM AND A METHOD FOR GENERATING MULTIPLE ALTERNATIVE SEARCH STRINGS TO FACILITATE IMPROVED COMPUTERIZED SEARCH”, and also under PCT international application PCT/IB2012/051870 on 16 April 2012, submission number 44897 with the International Bureau of the World Intellectual Property Organization in Geneva Switzerland, with the title “A COMPUTERIZED SYSTEM AND A METHOD FOR PROCESSING AND BUILDING SEARCH STRINGS”.
Part V: Technical summary
(i) Programmatic data
The crux of the application is the comparison of left-right ambidextrous grammar signatures for all keywords in the search input and the application of Markov chain analysis to create multiword groups of similar semantics and intact grammar corresponding to the original input.
(ii) Programmatic structure
The application comprises multiple redundant local-machine-based master and slave software nodes that process input in parallel to ensure extremely high throughput speeds at very large input volumes, regardless of machine and CPU hardware configurations. Any number of nodes can be used, with processing speed increasing in proportion to the number of nodes applied.
The application engine as well as all data inflow into the application and all resultant outflow is fully contained on the local machine. No programmatic calls are made outside of the local machine for any reason at any time whatsoever. This characteristic is critical for application in security classified military- or other confidential data stream environments.
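As a rough illustration of the master and slave arrangement, the sketch below splits input between local workers and merges their output in order. It is an assumption-laden stand-in: threads in one process replace the separate node executables, the per-node work is a placeholder, and no redundancy or failover is modelled. The names master_node and slave_node are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def slave_node(chunk: list[str]) -> list[str]:
    """Stand-in for one slave node: here it merely lower-cases its share of the input."""
    return [line.lower() for line in chunk]


def master_node(lines: list[str], node_count: int) -> list[str]:
    """Split the input into contiguous chunks, process them in parallel, merge in order."""
    size = max(1, -(-len(lines) // node_count))  # ceiling division, at least 1
    chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        results = list(pool.map(slave_node, chunks))  # map preserves chunk order
    return [item for part in results for item in part]


print(master_node(["Alpha", "Bravo", "Charlie", "Delta", "Echo"], node_count=2))
```

Everything in the sketch runs inside the local process and makes no network calls, in keeping with the local-only constraint stated above.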
Part VI: Technical example (descriptive)
For ease of understanding a general natural language example is used below. Inferences and application to the military- and security contexts can be made quite easily:
At a first stage, popular words are removed from the input search string. Popular words are identified as those words with a total frequency in each software processing node word relationship database that is higher than a predetermined threshold – in other words, those words that appear very commonly in the total body of text as indexed by the node.
Consider the search string “Where can I get cool spring water?”. The words “where”, “can”, “I” and “get” will be identified as popular words, with the remaining words “cool spring water” being non-popular words. This keyword compression maximizes search speed.
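The popular-word stage can be sketched as a simple frequency filter. The frequency table and the threshold value are invented for illustration; the text states only that the threshold is predetermined and that frequencies come from the node word relationship database.

```python
import re

# Hypothetical word frequencies standing in for a node word relationship database.
WORD_FREQUENCY = {
    "where": 9800, "can": 9100, "i": 12000, "get": 7400,
    "cool": 310, "spring": 270, "water": 640,
}
POPULARITY_THRESHOLD = 1000  # assumed cut-off; the text only says "predetermined"


def remove_popular_words(query: str) -> list[str]:
    """Return the words whose frequency does not exceed the threshold, in original order."""
    words = re.findall(r"[a-z']+", query.lower())
    return [w for w in words if WORD_FREQUENCY.get(w, 0) <= POPULARITY_THRESHOLD]


print(remove_popular_words("Where can I get cool spring water?"))
# ['cool', 'spring', 'water']
```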
At the next stage, the non-popular words are linked in two-word groups from left to right with the last word of any preceding two-word group forming the first word of the next two-word group. In this case, there are two two-word groups, namely “cool spring” and “spring water”.
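The overlapping left-to-right linkage can be written as a one-line pairing of each word with its successor. The function name is illustrative only.

```python
def two_word_groups(words: list[str]) -> list[tuple[str, str]]:
    """Pair each word with the word that follows it, so adjacent groups share one word."""
    return list(zip(words, words[1:]))


print(two_word_groups(["cool", "spring", "water"]))
# [('cool', 'spring'), ('spring', 'water')]
```

A list of n non-popular words therefore yields n - 1 two-word groups, and a single word yields none.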
Each two word group is then analyzed according to its ambidextrous grammar signature as follows: the reverse signature of the first word and the forward signature of the second word are obtained. The forward and reverse group signatures are combined into a single left-right “word-group” signature. For example, if the forward signature of “spring” in the node word relationship database is the following:
42551 (“spring”): 2211 (“day”), 21 | 53342 (“was”), 15 | 3321 (“morning”)
and the reverse signature of “cool” is the following:
1221 (“cool”): 49923 (“very”), 19 | 3221 (“stay”), 13 | 9219 (“really”)
then the ambidextrous signature of “cool spring” could be the following:
(“cool spring”): 2211 (“day”) | 49923 (“very”) | 53342 (“was”) | 3221 (“stay”) | 9219 (“really”) | 3321 (“morning”)
Importantly, the final ambidextrous word relationship signature gives the forward and reverse relationship of the two words “cool spring” in combination, as if the word combination is a single (but natural language wise currently “non-existing”) word.
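One way to arrive at the ordering shown in the example signature is to merge the two lists by descending frequency. This is an inference from the example, not a documented rule, and the frequencies for “morning” and “really” are not given in the text, so the values below for those two entries are invented.

```python
# (word id, word, frequency) entries. Values marked "assumed" do not appear in the text.
FORWARD_SPRING = [(2211, "day", 21), (53342, "was", 15), (3321, "morning", 9)]  # 9 assumed
REVERSE_COOL = [(49923, "very", 19), (3221, "stay", 13), (9219, "really", 11)]  # 11 assumed


def ambidextrous_signature(reverse_first, forward_second):
    """Merge both signatures into one list ordered by descending frequency."""
    merged = reverse_first + forward_second
    return sorted(merged, key=lambda entry: entry[2], reverse=True)


signature = ambidextrous_signature(REVERSE_COOL, FORWARD_SPRING)
print(" | ".join(f'{word_id} ("{word}")' for word_id, word, _ in signature))
```

With these assumed values the merged order is day, very, was, stay, really, morning, matching the illustrative “cool spring” signature above.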
Next the node signature database is searched to look for close signature matches for the ambidextrous “word group” signature. By comparing the ambidextrous signature to the word signature database and looking for close matches, single words can be found that are semantically similar to the two word group, “cool spring”. In this manner a crystallization from high grammar- and semantic complexity to simplicity is achieved.
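The text does not specify how closeness between signatures is measured. As a stand-in, the sketch below uses Jaccard overlap between sets of neighbouring words; that measure, the minimum score, and the small signature database are all assumptions made for illustration.

```python
# Hypothetical single-word signatures (sets of neighbouring words) from a node database.
SIGNATURE_DB = {
    "refreshing": {"day", "very", "stay", "really", "drink"},
    "fresh": {"very", "stay", "really", "morning", "was"},
    "engine": {"diesel", "start", "oil"},
}


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared members divided by total distinct members; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def closest_words(group_signature: set[str], minimum: float = 0.4) -> list[str]:
    """Return database words whose signature overlaps the group signature, best first."""
    scored = [(jaccard(group_signature, sig), word) for word, sig in SIGNATURE_DB.items()]
    return [word for score, word in sorted(scored, reverse=True) if score >= minimum]


COOL_SPRING = {"day", "very", "was", "stay", "really", "morning"}
print(closest_words(COOL_SPRING))  # ['fresh', 'refreshing']
```

A production signature would also carry frequencies; ignoring them here keeps the comparison to a plain set operation.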
The previous stage is repeated for each of the other two-word groups in the search string, which in this example is the second two word group, “spring water”. In this way, one or more other words are identified that are semantically similar to “spring water”. Combining the results of both iterations yields a number of two word strings that are each semantically similar to “cool spring water”. For example, if one of the words identified as semantically similar to “cool spring” was “refreshing” and one of the words identified as semantically similar to “spring water” was “liquid”, then “refreshing liquid” would be identified as semantically similar to “cool spring water”.
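Combining the per-group results amounts to taking one candidate from each two-word group in order. A minimal sketch follows, with hypothetical candidate lists; the function name is illustrative.

```python
from itertools import product


def combine_candidates(per_group: list[list[str]]) -> list[str]:
    """Build every string that takes one candidate word from each two-word group, in order."""
    return [" ".join(choice) for choice in product(*per_group)]


# Hypothetical candidates for "cool spring" and "spring water" respectively.
print(combine_candidates([["refreshing", "fresh"], ["liquid"]]))
# ['refreshing liquid', 'fresh liquid']
```

Because three non-popular words give two groups, each combined string is one word shorter than the input, which is the reduction the repeated stages rely on.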
Using the substitute word or words for “cool spring” and “spring water”, and repeating the procedure with the substitute two words (e.g. “refreshing liquid”) using a simplified Markov chain analysis algorithm, it is possible to repeat the preceding stages to find individual words that are semantically similar to the three words, “cool spring water”. In this example, the single word “juice” could, for example, be identified as semantically similar to “refreshing liquid”.
By repeating the substitution procedure in all the previous stages a specified number of times, it is possible to obtain multiple alternate words for the extracted non-popular words. The alternate words can form a string that has any number of words fewer than the extracted non-popular words. For example, if 5 non-popular words were extracted, then alternate word strings of 4, 3, 2 or 1 word(s) can be generated. In the case of the three-word string “cool spring water”, the following alternatives could perhaps have been generated:
“refreshing water”
“cool spring liquid”
“refreshing liquid”
“aqua”
While the method described above enables the extracted non-popular portion of the search string to be substituted with semantically similar words, it does not necessarily follow that the semantically similar words will be grammatically correct when substituted back into the original search string. For example, in the search string, “Where can I get cool spring water?”, if the word “season” is identified as semantically similar to the two words “cool spring”, substituting “season” into the original string yields the phrase, “Where can I get season water?” which clearly is not grammatically correct.
In this case, the meaning is also not as originally intended because of the multiple meanings of the word “spring”. In most cases, where the substituted words yield a sentence that is grammatically incorrect, the meaning of the alternative string is different from the intended meaning of the original string, but where the substituted words yield a sentence that is grammatically correct, the meaning is generally consistent with the original meaning.
To overcome the problem of grammatically incorrect alternative search strings, each node application applies additional steps by means of which grammatically incorrect alternative strings can be excluded. To do this, the semantically substituted words are first substituted back into the original search string. Then each substituted word is analyzed within the original string to see whether the words preceding it and following it are words that are associated with the substituted word by a predefined degree. This is done by looking up the word in the node word relationship database and checking whether the word following it appears within the list of row fields with more than a predetermined frequency. Using the reverse signature of that word, a check is also made to see whether the word preceding it appears within the list of row fields with more than a predetermined frequency. Only if both the preceding and following words appear within the row of fields with more than a predetermined frequency is the word regarded as fitting grammatically within the string; otherwise the alternative string is rejected at the final stage.
For example, in the case of the alternative string, “Where can I get season water”, it is very unlikely that “get” will appear within the list of words that commonly precede “season” or that “water” will appear within the list of words that commonly follow “season”. This alternative string will therefore be rejected as grammatically incorrect.
If the word “fresh” is identified as semantically similar to “cool spring”, the string, “Where can I get fresh water?” would be checked for grammatical correctness by seeing whether the word “get” commonly precedes “fresh” and whether “water” commonly follows “fresh”. In both cases, the answer will be in the affirmative and, at the final stage, the string “Where can I get fresh water?” will be identified as an alternative string for “Where can I get cool spring water?”.
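The neighbour check in the two examples above can be sketched as follows. For brevity a single follower table is read in both directions instead of separate forward and reverse signatures; the counts, the minimum frequency and the function name are all hypothetical.

```python
# Hypothetical follower counts: FOLLOWERS[w][x] is how often x directly follows w.
FOLLOWERS = {
    "get": {"fresh": 40, "a": 300},
    "fresh": {"water": 55, "air": 80},
    "season": {"ticket": 30, "finale": 12},
}
MIN_FREQUENCY = 10  # assumed; the text only says "predetermined frequency"


def fits_grammatically(words: list[str], index: int) -> bool:
    """Check that the substituted word at `index` is commonly preceded and followed by its neighbours."""
    word = words[index]
    if index > 0:
        if FOLLOWERS.get(words[index - 1], {}).get(word, 0) <= MIN_FREQUENCY:
            return False  # the preceding word rarely comes before the substitute
    if index < len(words) - 1:
        if FOLLOWERS.get(word, {}).get(words[index + 1], 0) <= MIN_FREQUENCY:
            return False  # the following word rarely comes after the substitute
    return True


print(fits_grammatically("where can i get fresh water".split(), 4))   # True
print(fits_grammatically("where can i get season water".split(), 4))  # False
```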
Once multiple alternative strings of diminishing grammar and semantic complexity (crystallization) have been generated, they are input simultaneously or in very rapid succession into search streams (e.g. using “danger” word blacklists) and the results compared. The “search hits” that are found to be relevant in the aggregated results of multiple alternative crystallized search inputs can then be identified as more relevant than those hits which are only found to be relevant in the results of one search string, as is currently the case with most search engine input. The most relevant search hits are presented to the original user application first.
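The aggregation step can be illustrated by counting how many alternative strings return each hit. The document identifiers and result lists below are invented, and ranking purely by that count is a simplifying assumption rather than the documented scoring.

```python
from collections import Counter


def rank_hits(results_per_string: dict[str, list[str]]) -> list[str]:
    """Order hits by how many alternative strings returned them, most widely shared first."""
    counts = Counter()
    for hits in results_per_string.values():
        counts.update(set(hits))  # count each hit once per search string
    return [hit for hit, _ in sorted(counts.items(), key=lambda item: (-item[1], item[0]))]


# Hypothetical results for the original string and two crystallized alternatives.
results = {
    "cool spring water": ["doc1", "doc2"],
    "refreshing liquid": ["doc2", "doc3"],
    "fresh water": ["doc2", "doc3"],
}
print(rank_hits(results))  # ['doc2', 'doc3', 'doc1']
```

A hit returned by only one string, such as doc1 here, is ranked last, which reflects the point that single-string relevance is weaker evidence.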
From the perspective of the user of search analytics, all functioning described above is completely hidden and operates in stealth in the background. The user interacts with the search analytics application(s) in exactly the same way as before, but receives output results based on the multiple stealth alternatives.
Part VII: Technical examples (illustrative)
Let’s look at the following input phrase:
“…I am looking for a small coffee shop with red canopies in central Copenhagen that serves many types of strawberry cheesecake…”
Gatfol crystallization “sees” keywords as negative semantic spaces :
Gatfol combines the negative spaces of two-word linkages i.e. “coffee shop” :
Note how the small “coffee cup” negative space adds “coffee” to the full “coffee shop” semantic image. We do not have “coffee” as a concept, neither do we have “shop”, and NEITHER DO WE HAVE “coffee” and “shop” as a combinational concept. We have a unique negative space of “coffeeshop” that is neither individually nor in combination part of the original input concepts. Gatfol fluidly expands the “negative spaces” to contain almost any combination of concepts in an input dataset.
Negative space templates are multi-dimensionally compared for the closest fit :
giving us…
“…Maurice’s deli with the largest variety of strawberry confectionary, in walking distance from Copernicus square…”
Part VIII : Technical example (diagrams)
Part IX : Applications
Image/Video Analysis : Semantic Image Component Crystallization Through Gatfol SIFT (Semantic Intelligence Filter Technology)
(Rich multi-level automatic image- and video tagging on a massive scale)
When building the software node word relationship databases, Gatfol uses a proprietary technology called SIFT. Build words are forced through semantic matrix “filters” each with a different “focus”.
In the landscape images below a wide semantic “focus” sees only a lake with mountains, but a narrow focus additionally sees detail of a country house and village street:
Using SIFT, Gatfol is extremely effective as base technology in automatic machine tagging of images covering detailed image components. Current auto-tagging technology “recognizes” image detail to an accuracy level spanning substantial uncertainty around most- or all individual visual components or discrete pixel-groupings:
Above we could be looking at a work bench with a painter’s cap and several containers with dark and yellow paints and some metallic tools, perhaps a green picnic blanket with tea or coffee and vegetation against a light sky, perhaps a green corn field with white grain silos, etc. As long as ambiguous image content descriptions are available, Gatfol semantic intelligence is powerful enough to stepwise crystallize the ambiguities and provide large-volume detailed tags that all relate semantically.
Given the above vague and semantically wide tag results for the image above (taken from a leading current image tagging application – ALIPR), Gatfol does the following :
Each of the supplied tags above is iterated with SIFT through wider and wider semantic word groupings, continuously checking back to available tags for matches (Gatfol semantic crystallization is not only many-to-single down but also single-to-many upwards as well as multiword to multiword). If the shiny object at the bottom of the image is not a small mirror, work tool, ball bearing, light bulb or lens but a teaspoon, the white object with dark contents is likely a tea or coffee cup with contents and not a bowl with soup. If coffee, then the white object with partial covering of a “hat” at right is not the “best fit” semantically. If a teacup, then the green striped object is likely a tea cozy, with the adjacent white object a tea pot. Given these crystallizations, the yellow blob at back is likely not paint, but either jam, butter or coloured ice cream. Given a “tea pot and cups” element grouping, the dark bands at back right are likely a chair structure, and given this, the light green object is semantically unlikely to be a picnic blanket or green meadow, but rather a table cloth. In this iterative manner a large volume of detailed tagging is obtained from a limited initial set of ambiguous descriptions.
Semantic scrubbing of media- and cell phone data streams covering – inter alia – aspects related to Terrorism, Weather/Natural Disasters/Emergency Management, Fire, Trafficking/Border Control Issues, Immigration, HAZMAT, Nuclear, Transportation Security, Infrastructure, National/International Security, Health Concerns, National/International, Public Safety and Cyber Security.
Gatfol’s massively scalable algorithmic semantic analysis engine can be used to flag posts in social media data streams containing word groups that semantically crystallize to defined “danger” words:
Both Anders Breivik and Jared Lee Loughner posted substantial digital repositories before their extremist acts. In an ideal world we should be able to track ambiguous social network traffic to pinpoint individuals and groups narrowing in on behaviour that can be harmful to society.
Currently this is very difficult.
As is mostly the case, online threats and circumstantial postings contain few or no clear-cut hits against “danger word” listings. Anders Breivik’s online manifesto contained the following sentence: “I simulate various future scenarios relating to resistance efforts, confrontations with police, future interrogation scenarios, future court appearances, future media interviews etc.” Brief skimming brings up possible danger-words in “scenarios”, “resistance”, “police” etc., but nothing that does not appear in many daily online Tweets, Facebook postings or blog utterances.
The Gatfol semantic engine picked up these danger-words but flagged with extreme sensitivity the word combination “media interviews”. Together with “scenarios”, “resistance” and “police”, the phrase “media interviews” uniquely crystallized in the Gatfol SIFT matrices as “public violence of newsworthy effect”. The latter semantically equivalent phrase hit totally different keywords than the original set of input words. Gatfol “sees” semantic perspective in large datasets that is not immediately evident to human analysts or investigators:
Part X : Application differentiation and strengths
Unlike almost all competing technology available today, Gatfol provides robust parallel processing power from even simple desktops or laptops. With all data streams staying local to the processing machine, field agents or operators do not require online access for operation in any way.
With a simulated Hadoop multiple master-and-slave node architecture built around simple but robust Windows™ executable files and with multiple fallback redundancies around both master and slave functions, as well as all nodes individually carrying full word relationship databases, reliability of throughput is ensured, which is especially critical in large volume streaming functionality.
With a base in ordinary executable files, Gatfol also secures legacy hardware and OS (Windows XP and older) functionality and easy portability in instances of local machine OS upgrades.
Gatfol standalone architecture can be easily incorporated into wider distributed processing architecture including full Hadoop – with corresponding increases in throughput performance.
Part XI : Application performance
Current best performance of a 50-100 level deep crystallization stack on a standalone desktop (Intel Dual 2.93 GHz, 3.21 GB RAM, Windows XP) for text throughput is 3.6 MB/hour for a single Gatfol cluster instance, 11.78 MB/hour for a 10-cluster instance and 98 MB/hour for a 100-cluster instance.
On a standalone desktop (Intel Quad 3.30 GHz, 2.91 GB RAM, Windows XP) best text throughput for a 50-100 level stack on a Gatfol 1000-cluster instance is 611 MB/hour.
Total text throughput for a 50-100 level stack on a Microsoft Networks-linked grouping of 20 ordinary desktops (Intel single core 2.8 GHz, 768 MB RAM, Windows XP), each running a Gatfol 100-cluster instance, is 1.9 GB/hour, giving a maximum text output volume of 140 GB/hour.
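The reported dual-core desktop figures can be compared by simple arithmetic. The sketch below adds no new measurements; it only derives the speed-up relative to the single-cluster instance and the throughput per cluster from the three figures quoted above. The function name is illustrative.

```python
# (cluster count, throughput per hour) as reported for the dual-core desktop.
REPORTED = [(1, 3.6), (10, 11.78), (100, 98.0)]


def scaling_table(measurements):
    """Return (clusters, speed-up over the first row, throughput per cluster) for each row."""
    _, base_rate = measurements[0]
    return [(clusters, rate / base_rate, rate / clusters) for clusters, rate in measurements]


for clusters, speedup, per_cluster in scaling_table(REPORTED):
    print(f"{clusters:>4} clusters: {speedup:6.2f}x speed-up, {per_cluster:.3f} per cluster")
```

The 1000-cluster and 20-desktop figures are left out because they were measured on different hardware and are not directly comparable with these three rows.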
Part XII : Application status
In private Beta since March 2012. Full field ready product to be available from October 2012.
Part XIII : Application updates
Gatfol operates a 24/7 software node word relationship database update service trawling through approximately 9TB of unique web text data per month.
To ensure absolute stealth and security on a local machine basis no online update calls are made by the application. Updates can be made freely and easily at any time on a manual basis from the Gatfol website by any user.
Part XIV : Contact information
Carl Greyling : Founder at Gatfol
Mobile : +27 82 5902993
Skype : carl_greyling
Email : carl@gatfol.com
Part XV : Addendum (USA focus)
Relevant Gatfol applicable data streams with specific Homeland Security focus :
1) Terrorism: Includes media reports on the activities of terrorist organizations both in the United States as well as abroad. This category also covers media articles that report on the threats, media releases by al Qaeda and other organizations, killing, capture, and identification of terror leaders and/or cells.
2) Weather/Natural Disasters/Emergency Management: Includes media reports on emergency and disaster management related issues. Reports include hurricanes, tornadoes, flooding, earthquakes, winter weather, etc. (all hazards). Reports outline the tracking of weather systems, reports on response and recovery operations, as well as the damage, costs, and effects associated with emergencies and disasters by area. Will also include articles regarding requests for resources, disaster proclamations, and requests for assistance at the local, state, and federal levels.
3) Fire: Includes reports on the ignition, spread, response, and containment of wildfires/industrial fires/explosions regardless of source.
4) Trafficking/Border Control Issues: Includes reports on the trafficking of narcotics, people, weapons, and goods into and out of the United States of an exceptional level.
5) Immigration: Includes reports on the apprehension of illegal immigrants and border control issues.
6) HAZMAT: Includes reports on the discharge of chemical, biological, and radiological hazardous materials as well as security and procedural incidents at nuclear facilities around the world, and potential threats toward nuclear facilities in the United States. Also included under this category are reports and responses to suspicious powder and chemical or biological agents.
7) Nuclear: Reports on international nuclear developments, attempts to obtain nuclear materials by terrorist organizations, and stateside occurrences such as melt downs, the mismanagement of nuclear weapons, releases of radioactive materials, illegal transport of nuclear materials, obtaining of weapons by terrorist organizations, and breaches in nuclear security protocol.
8) Transportation Security: Reports on security breaches, airport procedures, and other transportation related issues, and any of the above issues that affect transportation. Reports including threats toward and incidents involving rail, air, road, and water transit in the United States.
9) Infrastructure: Reports on national infrastructure including key assets and technical structures. Articles related to failures or attacks on transportation networks, telecommunications/ internet networks, energy grids, utilities, finance, domestic food and agriculture, government facilities, and public health.
10) National/International Security: Reports on threats or actions taken against United States national interests both at home and abroad. Reports including articles related to threats against American citizens, political figures, military installations, embassies, consulates, as well as efforts taken by local, state, and federal agencies to secure the homeland. Articles involving intelligence will also be included in this category.
11) Health Concerns, National/International: Includes articles on national and international outbreaks of infectious diseases and recalls of food or other items deemed dangerous to the public health.
12) Public Safety: Includes reports on public safety incidents, building lockdowns, bomb threats, mass shootings, and building evacuations.
13) Reports on DHS, Components, and other Federal Agencies: Includes both positive and negative reports on FEMA, CIS, CBP, ICE, etc. as well as organizations outside of DHS.
14) Cyber Security: Reports on cyber security matters that could have a national impact on other CIR Categories; internet trends affecting DHS missions such as cyber attacks, computer viruses; computer tools and techniques that could thwart local, state and federal law enforcement; use of IT and the internet for terrorism, crime or drug-trafficking; and Emergency Management use of social media strategies and tools that aid or affect communications and management of crises.
(Department of Homeland Security National Operations Center Media Monitoring Capability Desktop Reference Binder 2011)
U.S. Department of Homeland Security
Privacy Impact Assessment for the Office of Operations Coordination and Planning
Publicly Available Social Media Monitoring and Situational Awareness Initiative Update
January 6, 2011 :
Terms Used by the NOC When Monitoring Social Media Sites
This is a current list of terms that will be used by the NOC when monitoring social media sites to provide situational awareness and establish a common operating picture. As natural or manmade disasters occur, new search terms may be added.
DHS & Other Agencies
Department of Homeland Security (DHS)
Federal Emergency Management Agency (FEMA)
Coast Guard (USCG)
Customs and Border Protection (CBP)
Border Patrol
Secret Service (USSS)
National Operations Center (NOC)
Homeland Defense
Immigration Customs Enforcement (ICE)
Agent
Task Force
Central Intelligence Agency (CIA)
Fusion Center
Drug Enforcement Agency (DEA)
Secure Border Initiative (SBI)
Federal Bureau of Investigation (FBI)
Alcohol Tobacco and Firearms (ATF)
U.S. Citizenship and Immigration Services (CIS)
Federal Air Marshal Service (FAMS)
Transportation Security Administration (TSA)
Air Marshal
Federal Aviation Administration (FAA)
National Guard
Red Cross
United Nations (UN)
Domestic Security
Assassination
Attack
Domestic security
Drill
Exercise
Cops
Law enforcement
Authorities
Disaster assistance
Disaster management
DNDO (Domestic Nuclear Detection Office)
National preparedness
Mitigation
Prevention
Response
Recovery
Dirty bomb
Domestic nuclear detection
Emergency management
Emergency response
First responder
Homeland security
Maritime domain awareness (MDA)
National preparedness initiative
Militia
Shooting
Shots fired
Evacuation
Deaths
Hostage
Explosion (explosive)
Police
Disaster medical assistance team (DMAT)
Organized crime
Gangs
National security
State of emergency
Security
Breach
Threat
Standoff
SWAT
Screening
Lockdown
Bomb (squad or threat)
Crash
Looting
Riot
Emergency Landing
Pipe bomb
Incident
Facility
HAZMAT & Nuclear
Hazmat
Nuclear
Chemical spill
Suspicious package/device
Toxic
National laboratory
Nuclear facility
Nuclear threat
Cloud
Plume
Radiation
Radioactive
Leak
Biological infection (or event)
Chemical
Chemical burn
Biological
Epidemic
Hazardous
Hazardous material incident
Industrial spill
Infection
Powder (white)
Gas
Spillover
Anthrax
Blister agent
Chemical agent
Exposure
Burn
Nerve agent
Ricin
Sarin
North Korea
Health Concern + H1N1
Outbreak
Contamination
Exposure
Virus
Evacuation
Bacteria
Recall
Ebola
Food Poisoning
Foot and Mouth (FMD)
H5N1
Avian
Flu
Salmonella
Small Pox
Plague
Human to human
Human to Animal
Influenza
Center for Disease Control (CDC)
Drug Administration (FDA)
Public Health
Toxic
Agro Terror
Tuberculosis (TB)
Agriculture
Listeria
Symptoms
Mutation
Resistant
Antiviral
Wave
Pandemic
Infection
Water/air borne
Sick
Swine
Pork
Strain
Quarantine
H1N1
Vaccine
Tamiflu
Norvo Virus
Epidemic
World Health Organization (WHO) (and components)
Viral Hemorrhagic Fever
E. Coli
Infrastructure Security
Infrastructure security
Airport
Airplane (and derivatives)
Chemical fire
CIKR (Critical Infrastructure & Key Resources)
AMTRAK
Collapse
Computer infrastructure
Communications infrastructure
Telecommunications
Critical infrastructure
National infrastructure
Metro
WMATA
Subway
BART
MARTA
Port Authority
NBIC (National Biosurveillance Integration Center)
Transportation security
Grid
Power
Smart
Body scanner
Electric
Failure or outage
Black out
Brown out
Port
Dock
Bridge
Cancelled
Delays
Service disruption
Power lines
Southwest Border Violence
Drug cartel
Violence
Gang
Drug
Narcotics
Cocaine
Marijuana
Heroin
Border
Mexico
Cartel
Southwest
Juarez
Sinaloa
Tijuana
Torreon
Yuma
Tucson
Decapitated
U.S. Consulate
Consular
El Paso
Fort Hancock
San Diego
Ciudad Juarez
Nogales
Sonora
Colombia
Mara salvatrucha
MS13 or MS-13
Drug war
Mexican army
Methamphetamine
Cartel de Golfo
Gulf Cartel
La Familia
Reynosa
Nuevo Leon
Narcos
Narco banners (Spanish equivalents)
Los Zetas
Shootout
Execution
Gunfight
Trafficking
Kidnap
Calderon
Reyosa
Bust
Tamaulipas
Meth Lab
Drug trade
Illegal immigrants
Smuggling (smugglers)
Matamoros
Michoacana
Guzman
Arellano-Felix
Beltran-Leyva
Barrio Azteca
Artistic Assassins
Mexicles
New Federation
Terrorism
Terrorism
Al Qaeda (all spellings)
Terror
Attack
Iraq
Afghanistan
Iran
Pakistan
Agro
Environmental terrorist
Eco terrorism
Conventional weapon
Target
Weapons grade
Dirty bomb
Enriched
Nuclear
Chemical weapon
Biological weapon
Ammonium nitrate
Improvised explosive device
IED (Improvised Explosive Device)
Abu Sayyaf
Hamas
FARC (Armed Revolutionary Forces Colombia)
IRA (Irish Republican Army)
ETA (Euskadi ta Askatasuna) Basque Separatists
Hezbollah
Tamil Tigers
PLF (Palestine Liberation Front)
PLO (Palestine Liberation Organization)
Car bomb
Jihad
Taliban
Weapons cache
Suicide bomber
Suicide attack
Suspicious substance
AQAP (AL Qaeda Arabian Peninsula)
AQIM (Al Qaeda in the Islamic Maghreb)
TTP (Tehrik-i-Taliban Pakistan)
Yemen
Pirates
Extremism
Somalia
Nigeria
Radicals
Al-Shabaab
Home grown
Plot
Nationalist
Recruitment
Fundamentalism
Islamist
Weather/Disaster/Emergency
Emergency
Hurricane
Tornado
Twister
Tsunami
Earthquake
Tremor
Flood
Storm
Crest
Temblor
Extreme weather
Forest fire
Brush fire
Ice
Stranded/Stuck
Help
Hail
Wildfire
Tsunami Warning Center
Magnitude
Avalanche
Typhoon
Shelter-in-place
Disaster
Snow
Blizzard
Sleet
Mud slide or Mudslide
Erosion
Power outage
Brown out
Warning
Watch
Lightening
Aid
Relief
Closure
Interstate
Burst
Emergency Broadcast System
Cyber Security
Cyber security
Botnet
DDOS (dedicated denial of service)
Denial of service
Malware
Virus
Trojan
Keylogger
Cyber Command
2600
Spammer
Phishing
Rootkit
Phreaking
Cain and abel
Brute forcing
Mysql injection
Cyber attack
Cyber terror
Hacker
China
Conficker
Worm
Scammers
Social media
Other
Breaking News
http://www.dhs.gov/xlibrary/assets/privacy/privacy_pia_ops_publiclyavailablesocialmedia_update.pdf
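As a minimal sketch of how a categorized term list like the one above could be applied to an incoming message, the Python example below tags a text with every category whose terms it contains. The names WATCH_TERMS, compile_patterns and tag_message are hypothetical and introduced only for illustration; the three categories shown are a small excerpt of the list, and simple whole-word matching is an assumption that does not represent the Gatfol processing described in this paper.

```python
import re

# Hypothetical excerpt of the categorized term list above; a fuller version
# would load every category and term from a file.
WATCH_TERMS = {
    "HAZMAT & Nuclear": ["Chemical spill", "Radiation", "Anthrax"],
    "Weather/Disaster/Emergency": ["Hurricane", "Flood", "Power outage"],
    "Cyber Security": ["Botnet", "Phishing", "Denial of service"],
}


def compile_patterns(terms_by_category):
    """Build one case-insensitive, whole-word regex per category."""
    patterns = {}
    for category, terms in terms_by_category.items():
        # Longest terms first so multi-word phrases are tried before shorter ones.
        ordered = sorted(terms, key=len, reverse=True)
        alternation = "|".join(re.escape(term) for term in ordered)
        patterns[category] = re.compile(
            r"\b(?:" + alternation + r")\b", re.IGNORECASE
        )
    return patterns


def tag_message(message, patterns):
    """Return {category: [matched terms]} for categories with at least one hit."""
    hits = {}
    for category, pattern in patterns.items():
        found = pattern.findall(message)
        if found:
            hits[category] = [match.lower() for match in found]
    return hits


PATTERNS = compile_patterns(WATCH_TERMS)

if __name__ == "__main__":
    print(tag_message("Flood warning issued after chemical spill near the port", PATTERNS))
```

Plain keyword matching of this kind ignores context, which is the limitation noted in Part II: the same word can carry different meanings in different sentences.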
Part XVI : General disclaimers
This white paper and its updates are made available for general information purposes only and are in no way binding upon Gatfol. By reading this white paper you understand that no supplier-client or advisory relationship is created between you and Gatfol. Although the information in this white paper and its updates is intended to be current and accurate, it may not reflect the most current technical or procedural developments, regulatory actions or software developments. These materials may be changed, improved, or updated without notice. Gatfol is not responsible for any errors or omissions in the content of this white paper or for damages arising from the use or performance of this white paper under any circumstances. We encourage you to contact us for specific feedback or advice as to your particular matter.
The contents of this paper are protected by the patent laws of the United States and other jurisdictions. You may print a copy of any part of this white paper for your own personal, noncommercial use, or for reasonable distribution to directly interested third parties, but you may not copy any part of the white paper for any other purposes, and you may not modify any part of the white paper. Inclusion of any part of the content of this paper in another work, whether in printed, electronic, or other form, or inclusion of any part hereof in another web site by linking, framing, or otherwise, without the express written permission of Gatfol is prohibited.
Updates to this document will be published on the gatfol.com blog.