Cylance PROTECT is an endpoint protection system. It contains antivirus functionality that uses a machine learning algorithm (specifically, a neural network) to classify executables as malicious or benign. Security researchers isolated properties of the machine learning algorithm that allow them to change most known-malicious files in simple ways that cause the Cylance product to misclassify the file as benign. Several common malware families, such as Dridex, Gh0stRAT, and Zeus, were reported as successfully modified to bypass the Cylance product in this way. The success rate of the bypass is reported as approximately 85% of malicious files tested. Cylance reports a 50% bypass creation success rate based on internal testing. Either way, attacker effort to find a successful bypass would be low. Unsophisticated attackers can leverage this flaw to change any executable to which they have access; the defense evasion does not require rewriting the malware, just appending strings to it.

The specific attack reported by Skylight Cyber relies on a particular set of strings used by the Cylance product. Although Cylance used an ensemble model that made some uncommon model design choices to achieve a white-listing functionality, this over-reliance on specific details when classifying a file is an instance of a common weakness in machine learning algorithms. For a comprehensive discussion of attacks on machine learning systems, see Papernot N, McDaniel P, Sinha A, Wellman MP. SoK: Security and privacy in machine learning. IEEE EuroS&P 2018.

Because this flaw is an instance of a broader category of weaknesses in machine learning algorithms, we do not expect an easy solution. Cylance describes their response as "three-fold: First, we have added anti-tampering controls to the parser in order to detect feature manipulation and prevent them from impacting the model score. Second, we have strengthened the model itself to detect when certain features become proportionally overweight.
Lastly, we have removed the features in the model that were most susceptible to tampering." This patch should stop the specific keywords used by the Skylight Cyber researchers from allowing an attacker to bypass detection, and it should increase the attacker effort required to find similar bypass techniques. However, the method described by the Skylight Cyber researchers to find and recover the features of the Cylance product is likely to enable the recovery of manipulable features from other security products that rely on machine learning.

Although Cylance has removed features "most susceptible to tampering," our understanding of adversarial manipulation of machine learning classifiers in other domains suggests that the remaining features almost certainly provide adequate freedom for tampering. This inference is based on the structural similarity of the Cylance machine learning model (a neural network) to models that have been successfully deceived in other domains, for example facial recognition or visual recognition in self-driving cars. There is some evidence that deception remains relatively easy despite the structure of computer network traffic; we are unaware of public evidence as to whether file structure carries the same limitations. This environment is the context behind, and the likely driver of, Cylance's statement that "AI and machine learning models are, by nature, living models. They are designed to evolve and do require periodic retraining and field servicing when appropriate."
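
The general weakness described above, a classifier leaning too heavily on a few string features, can be illustrated with a deliberately simplified sketch. This is not the Cylance model and does not reproduce the Skylight Cyber technique: the feature strings, weights, bias, and function names below are all hypothetical, and a small linear model stands in for the neural network purely to show why appending strings can move a score without changing the original bytes.

```python
import math

# Hypothetical string features and weights for a toy linear classifier.
# Positive weights push the score toward "malicious", negative toward "benign".
# None of these strings or values come from any real product.
TOY_WEIGHTS = {
    b"CreateRemoteThread": 2.0,
    b"VirtualAllocEx": 1.5,
    b"keylog": 2.5,
    b"Copyright": -2.0,
    b"LicenseAgreement": -2.5,
    b"HelpAbout": -3.0,
}
TOY_BIAS = -1.0


def malicious_probability(data: bytes) -> float:
    """Score a byte string by which feature strings are present in it."""
    score = TOY_BIAS + sum(
        weight for token, weight in TOY_WEIGHTS.items() if token in data
    )
    return 1.0 / (1.0 + math.exp(-score))  # logistic squashing to (0, 1)


def is_malicious(data: bytes, threshold: float = 0.5) -> bool:
    """Apply a fixed decision threshold to the toy probability."""
    return malicious_probability(data) >= threshold


def append_strings(data: bytes, strings: list[bytes]) -> bytes:
    """Append NUL-separated strings after the original bytes, leaving them intact."""
    return data + b"\x00" + b"\x00".join(strings)


# A stand-in "file" that contains only the malicious-leaning strings.
sample = b"MZ\x00" + b"\x00".join(
    [b"CreateRemoteThread", b"VirtualAllocEx", b"keylog"]
)

# The same bytes with benign-leaning strings appended at the end.
modified = append_strings(sample, [b"Copyright", b"LicenseAgreement", b"HelpAbout"])

print(f"original: p={malicious_probability(sample):.3f} flagged={is_malicious(sample)}")
print(f"modified: p={malicious_probability(modified):.3f} flagged={is_malicious(modified)}")
```

In this toy setting the original bytes are still present as a prefix of the modified data, yet the appended strings outweigh the malicious-leaning features. Removing the most heavily weighted benign strings from the feature set would block this particular input, but any remaining negatively weighted features would leave the same kind of opening, which is the concern raised above about the remaining features.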