Gatfol Semantic Firewall

 

Gatfol has developed a router and network switch hardware-based natural language semantic firewall for deployment in enterprise data streams to control data leakage.

Developing hardware-based semantic firewalls is difficult:

Language permutation combinations in n-gram format are too many for router-based RAM storage         

Gatfol multiword-to-multiword firewall instances do not require static databases for signature retrieval. The trillions upon trillions of natural language permutations needed to process multiword groupings of up to 20 words, in input phrases of up to 200 words, would overwhelm even the largest commercial databases available today. Instead, Gatfol performs semantic equalisation between multiword groups entirely in RAM, using several layers of heuristic filters, developed over 9 years, to bound the permutation volume.
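To give a feel for the scale involved, the rough sketch below counts the contiguous word groups in a 200-word phrase and the candidate rewrites of a single 20-word group. The function names and the figure of 5 alternatives per word are assumptions made purely for illustration; they are not Gatfol numbers, and the sketch only covers word-for-word substitution, which is a lower bound on the multiword-to-multiword case.

```python
def count_word_groups(phrase_len: int, max_group: int) -> int:
    """Number of contiguous word groups of 1..max_group words in a phrase."""
    longest = min(max_group, phrase_len)
    return sum(phrase_len - k + 1 for k in range(1, longest + 1))


def count_rewrites(group_len: int, alternatives_per_word: int) -> int:
    """Candidate rewrites of one group if every word has the same number
    of interchangeable alternatives (a simplifying assumption)."""
    return alternatives_per_word ** group_len


groups = count_word_groups(200, 20)  # groups of 1 to 20 words in 200 words
rewrites = count_rewrites(20, 5)     # hypothetical: 5 alternatives per word

print(f"{groups} word groups, {rewrites:,} rewrites of one 20-word group")
```

Under these assumptions a single 20-word group already has roughly 95 trillion word-for-word rewrites, which illustrates why storing every permutation as a static signature is impractical.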

Language permutation iterations take too long with non-parallel processing

Even without static database retrieval, the number of permutations to process at throughput volumes of gigabytes per second is too large to deliver microsecond input-output times without parallelism. Parallel processing of multiword groupings is therefore required, but programming for parallel processing on single- or dual-chip hardware is difficult. Gatfol uses a simple multiple-EXE architecture and massively scalable, proprietary local Hadoop master-and-slave technology so that the OS takes care of parallel processing.

The same set of algorithms must work seamlessly between all natural languages

A semantic firewall must be able to filter any base language dynamically. Gatfol technology uses no language-specific grammar rules or other language-specific processing rules. At the embedded level, Gatfol runs on binary patterns and can efficiently process any system of repetitive symbols. Gatfol is functional in English, Chinese, Arabic and any other natural language.
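The idea of processing "any system of repetitive symbols" can be illustrated with a minimal sketch in which the same routine counts adjacent symbol pairs whether the symbols are English words, Chinese characters or raw UTF-8 bytes. The function name `adjacent_pair_counts` and the sample inputs are hypothetical; this is not Gatfol's embedded implementation, only an illustration of language-independent pattern counting.

```python
from collections import Counter


def adjacent_pair_counts(symbols):
    """Count adjacent symbol pairs in any sequence (words, characters, bytes)."""
    seq = list(symbols)
    return Counter(zip(seq, seq[1:]))


# The same function is applied to three different symbol systems.
english = adjacent_pair_counts("the cat saw the cat".split())
chinese = adjacent_pair_counts("我爱你我爱他")
raw = adjacent_pair_counts("the cat".encode("utf-8"))

print(english.most_common(1))
print(chinese.most_common(1))
```

No grammar or tokenisation rules for a particular language appear anywhere in the routine; only the choice of symbol unit changes between calls.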

Semantic processing ontologies and definition lists normally require huge disk storage resources

The limited memory and storage space on router and network hardware prevents the use of the very large ontologies, word linkage repositories and definition lists normally required for semantic language processing. Gatfol uses compact 2-gram, two-dimensional word linkage matrices, read fully into RAM and combined with simplified Markov chain analysis, to provide large permutation power. The total disk space required for even the largest Gatfol firewall deployment is only around 100 MB.
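As a minimal sketch of how a 2-gram word linkage matrix and a first-order Markov chain can be combined, the example below builds an in-memory table of word-to-next-word counts and scores a phrase as the product of its transition probabilities. The function names and the three-sentence corpus are assumptions for illustration; Gatfol's actual matrix layout and its "simplified" Markov analysis are not described in detail here.

```python
from collections import defaultdict


def build_linkage_matrix(sentences):
    """Count how often each word is directly followed by each other word."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        words = sentence.lower().split()
        for left, right in zip(words, words[1:]):
            counts[left][right] += 1
    return counts


def transition_prob(counts, left, right):
    """Estimate P(right | left) from the 2-gram counts (0.0 if unseen)."""
    row = counts.get(left)
    if not row:
        return 0.0
    return row.get(right, 0) / sum(row.values())


def phrase_score(counts, phrase):
    """First-order Markov score: product of consecutive transition probabilities."""
    words = phrase.lower().split()
    score = 1.0
    for left, right in zip(words, words[1:]):
        score *= transition_prob(counts, left, right)
    return score


# Tiny made-up corpus, for illustration only.
corpus = ["i hate cold pizza", "i love hot pizza", "i hate rain"]
matrix = build_linkage_matrix(corpus)

print(phrase_score(matrix, "i hate cold pizza"))
print(phrase_score(matrix, "i abhor pizza"))
```

Because only pairwise counts are stored, the table grows with the number of distinct word pairs rather than with the number of possible phrases, which is what keeps this kind of structure compact.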

Guarding against false positives in multiword synonym equalisation is difficult

Multiword synonym replacement cannot work efficiently without grammar linkage verification. Most dictionaries list “detest”, “hate”, “loathe” and “abhor” as synonyms, but only grammar link filters reveal the differences in usage frequency when each of these words is used with, for example, the term “pizza”. Gatfol applies grammar linkage verification both when building word linkage matrices and during input-output processing to ensure the quality of synonym equivalence.
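The “pizza” example can be sketched as a simple filter that accepts a dictionary synonym only if it is attested often enough next to the context word. The function name, the `min_count` parameter and all of the counts below are invented for illustration; they are not measured usage frequencies and not Gatfol's actual verification logic.

```python
def verified_synonyms(word, context, synonyms, pair_counts, min_count=1):
    """Keep only synonyms of `word` that are attested next to `context`
    at least `min_count` times."""
    candidates = synonyms.get(word, [])
    return [s for s in candidates if pair_counts.get((s, context), 0) >= min_count]


# Dictionary-style synonym list: no usage information.
SYNONYMS = {"hate": ["detest", "loathe", "abhor"]}

# Hypothetical 2-gram counts of (verb, object) pairs; values are made up.
PAIR_COUNTS = {
    ("hate", "pizza"): 120,
    ("detest", "pizza"): 9,
    ("loathe", "pizza"): 4,
    ("abhor", "pizza"): 0,
}

print(verified_synonyms("hate", "pizza", SYNONYMS, PAIR_COUNTS))
print(verified_synonyms("hate", "pizza", SYNONYMS, PAIR_COUNTS, min_count=5))
```

A dictionary alone would treat all three candidates as interchangeable; the co-occurrence check is what removes the ones that are rarely or never used with the context term.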

Reflecting web concept relationship changes in real time is difficult

Concept linkages on the web can change unpredictably and abruptly. A representative semantic firewall must mirror linkage changes in real time. All Gatfol concept matrices update dynamically from locally connected proprietary RSS crawlers to reflect rapidly changing patterns in web language within seconds of actual changes anywhere in the world.

Semantic firewalls can never be “offline” during housekeeping processes

Running semantic firewalls with continuous packet inspection on local hardware is sub-optimal when signature updating requires human intervention at any stage of the update process. Gatfol's multi-modular operation at hardware level is fully hands-free and requires no human intervention of any kind.

Semantic processing systems require large combined CPU/RAM resources

The total Gatfol semantic firewall footprint is extremely small in terms of both processing power and storage. A full-strength Gatfol firewall can run on as little as a single CPU with only 3 GB of RAM.

Language-based software applications with a statistical argument basis are never 100% accurate

Humans have an intuitive “accuracy” limit below which the functionality of a language product is deemed inadequate. Tightening accuracy controls reduces the volume of results. Balancing accuracy controls against volume depends on finely tuned static variables linked to naturally occurring patterns in language, together with specific algorithmic functioning. Gatfol spent many years perfecting a proprietary multi-layered semantic intelligence filtering technology (SIFT) to maximise quality against processing volume and speed.
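The inverse relationship between accuracy controls and results volume can be shown with a minimal sketch: raising a confidence threshold keeps fewer candidate rewrites. The function name, the candidate phrases and their scores are hypothetical, and a single threshold is a deliberate simplification; it is not a description of how SIFT works.

```python
def filter_by_confidence(candidates, threshold):
    """Keep candidate rewrites whose confidence score meets the threshold."""
    return [text for text, score in candidates if score >= threshold]


# Hypothetical (rewrite, confidence) pairs; scores are made up.
CANDIDATES = [
    ("detest pizza", 0.91),
    ("loathe pizza", 0.74),
    ("despise pizza", 0.58),
    ("abhor pizza", 0.35),
]

# A stricter threshold yields fewer, higher-confidence results.
for threshold in (0.3, 0.6, 0.9):
    kept = filter_by_confidence(CANDIDATES, threshold)
    print(threshold, len(kept), kept)
```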