Yehia Mamdouh – Deep Inside Malicious PDF (Malware Security)

Where do attacks come from?
Most attacks these days are client-side attacks. When attackers target a company or organization's network, they face obstacles such as IDS, IPS and firewalls that prevent them from reaching the internal network. They therefore resort to targeting the employees of the organization through methods such as phishing or sending malicious PDF files.
When we check the PDF files on our network we may use antivirus scanners, but these days that is not a reliable way to detect a malicious PDF: attackers often encrypt the file to bypass traditional antivirus scanners, and more often than not they target a zero-day vulnerability that may exist in Adobe Acrobat Reader, or they target outdated versions. The image below shows how PDF vulnerabilities have risen every year.
Before we start analyzing a malicious PDF, we will take a quick look at the PDF structure to understand how the shellcode works and where it is located.
What is a PDF? A PDF document has four main parts: a one-line header, a body, a cross-reference table and a trailer.
PDF Header: the first line of the file. It shows the PDF format version and gives the basic information about the file. For example, “%PDF-1.4” means the file uses version 1.4 of the PDF format.
PDF Body: it consists of the objects that make up the contents of the document, including fonts, images, annotations and text streams. Authors can also embed invisible objects or elements, and these objects can interact with PDF features such as animation and security features. The body of the PDF supports two types of numbers (integers and real numbers).
The Cross-Reference Table (xref table): the cross-reference table records the byte offset of every object in the file, so a reader can jump directly to any object (for example, the contents of another page) without parsing the whole file. When the PDF is updated, the cross-reference table is updated as well.
The Trailer: the trailer points to the cross-reference table and to the root object of the document, and the file always ends with “%%EOF” to mark the end of the PDF. A reader starts from the trailer, follows it to the cross-reference table, and from there locates every page and object.
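The four parts described above can be illustrated with a minimal sketch in Python. It assembles a tiny, harmless one-page PDF in memory and then walks it the way a reader would: header first, then the trailer, then the cross-reference table. The function names `build_minimal_pdf` and `describe_pdf` are hypothetical helpers made up for this illustration, and the parser only handles a classic, uncompressed xref table with a single section.

```python
import re


def build_minimal_pdf() -> bytes:
    """Assemble a tiny one-page PDF in memory with correct xref offsets."""
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 200 200] >>",
    ]
    out = bytearray(b"%PDF-1.4\n")                    # one-line header
    offsets = []
    for number, body in enumerate(objects, start=1):  # body: numbered objects
        offsets.append(len(out))                      # remember the byte offset
        out += b"%d 0 obj\n%s\nendobj\n" % (number, body)
    xref_offset = len(out)                            # cross-reference table
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"                   # entry 0 is always free
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_offset  # trailer ends with %%EOF
    return bytes(out)


def describe_pdf(data: bytes) -> dict:
    """Locate the header version, the xref table and the in-use object offsets."""
    header = re.match(rb"%PDF-(\d+\.\d+)", data)
    if header is None:
        raise ValueError("missing %PDF header")
    tail = re.search(rb"startxref\s+(\d+)\s+%%EOF\s*$", data)
    if tail is None:
        raise ValueError("missing startxref / %%EOF")
    xref_offset = int(tail.group(1))
    table_end = data.index(b"trailer", xref_offset)
    # Each xref entry is "<10-digit offset> <5-digit generation> <n|f>".
    entries = re.findall(rb"(\d{10}) (\d{5}) ([nf])", data[xref_offset:table_end])
    in_use = [int(offset) for offset, _gen, kind in entries if kind == b"n"]
    return {
        "version": header.group(1).decode("ascii"),
        "xref_offset": xref_offset,
        "object_offsets": in_use,
        "ends_with_eof": data.rstrip().endswith(b"%%EOF"),
    }


if __name__ == "__main__":
    pdf = build_minimal_pdf()
    print(describe_pdf(pdf))
```

Real-world files often use incremental updates or compressed cross-reference streams, which this sketch deliberately ignores; tools such as peepdf handle those cases.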

Start Attack!!!
First we install an old version of Adobe Acrobat Reader, either 9.4.6 or 10 through 10.1.1, which are vulnerable to the Adobe U3D Memory Corruption Vulnerability.
We can create a malicious PDF with the Metasploit Framework so that we can analyze it. Open a terminal and type msfconsole.
As shown in the image on the right, we configure some Metasploit variables to make sure everything works correctly.
The file has been saved in /root/.msf4/local
To make the file easier to find, we move it to the Desktop by typing the following in the terminal:
root@kali:~# cd /root/.msf4/local
root@kali:~/.msf4/local# mv msf.pdf /root/Desktop
Wait for Analysis!
PDFid
Now we use pdfid to see which elements, objects and JavaScript the PDF is built from, and whether there is anything interesting to analyze.
First notice: the PDF has only one page, which may be normal 🙂
Second notice: there are several JavaScript objects inside, which is very strange.
Third notice: there is also an OpenAction object, which will execute the malicious JavaScript, so we will use peepdf for deeper analysis.
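The kind of triage behind these three notices can be approximated in a few lines of Python. This is only a rough sketch of the idea, not pdfid itself: it counts raw occurrences of a few name keywords in the file bytes. The helper name `count_keywords`, the keyword list and the sample bytes are assumptions made for illustration, and unlike pdfid the sketch does not decode hex-escaped names or look inside compressed streams.

```python
import re

# Keywords chosen for illustration; pdfid reports a longer list.
KEYWORDS = ("/Page", "/JS", "/JavaScript", "/OpenAction", "/AcroForm")


def count_keywords(data: bytes, names=KEYWORDS) -> dict:
    """Count how often each PDF name appears in the raw file bytes."""
    counts = {}
    for name in names:
        # The lookahead stops /Page from also matching /Pages.
        pattern = re.escape(name.encode("ascii")) + rb"(?![A-Za-z0-9])"
        counts[name] = len(re.findall(pattern, data))
    return counts


if __name__ == "__main__":
    # A harmless, hand-written fragment standing in for a real file.
    sample = (
        b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R "
        b"/OpenAction 4 0 R /AcroForm 5 0 R >>\nendobj\n"
        b"3 0 obj\n<< /Type /Page /Parent 2 0 R >>\nendobj\n"
        b"4 0 obj\n<< /S /JavaScript /JS 6 0 R >>\nendobj\n"
    )
    for name, count in count_keywords(sample).items():
        print(f"{name:12} {count}")
```

A single page combined with JavaScript and an /OpenAction entry is the pattern that makes a file worth a closer look; the counts alone do not prove that it is malicious.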
Deeper Analysis
Peepdf is a powerful Python tool for PDF analysis. It provides the components a security researcher needs: support for encryption and object streams, shellcode emulation, JavaScript analysis, a powerful interactive console, PDF obfuscation (for bypassing AVs), hexadecimal and ASCII decoding, hex search, and more. For malicious PDFs it shows potential vulnerabilities and suspicious elements.
Rock and Roll !
To start the analysis, go to the directory containing the PDF file and run /usr/bin/peepdf -f msf.pdf
We use the -f option to force the tool to ignore parsing errors and carry on.
This is the default output, and it already shows some interesting things. First, the highlighted object 15 contains JavaScript code. Second, object 4 contains two elements that trigger execution (/AcroForm and /OpenAction). Last, /U3D points to a known vulnerability. We will now explore these objects in an interactive console by running /usr/bin/peepdf -i msf.pdf
The tree command shows the logical structure of the file. We start by exploring object 4 (/AcroForm).
As the image shows, typing object 4 lists further objects to explore. So far we have not seen any important information or anything suspicious, except object 2 (the XFA array), which gave us an element that does not seem to contain anything special…
Let’s move on to the other object (/OpenAction).
Now we can see the JavaScript code that will be executed when the PDF file is opened.
The other part of the JavaScript code is lightly obfuscated, for example by writing some variables in hex. In this code we can see heap spraying with shellcode plus some padding bytes. Attackers typically encode their shellcode as Unicode escapes and then use the unescape function to translate that Unicode representation into binary content. Now we are sure that this is definitely a malicious PDF.
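To see what the unescape step produces, an analyst can convert the %uXXXX escapes back into raw bytes. The sketch below is a simplified illustration and not part of peepdf: the helper name `unescape_u_to_bytes` is made up, the input strings are harmless placeholders rather than content from the analyzed file, and only the %uXXXX form is handled. Each escape is one 16-bit value that ends up in memory with its low byte first, which is why the byte order looks swapped.

```python
import re
import struct


def unescape_u_to_bytes(text: str) -> bytes:
    """Turn every %uXXXX escape in text into two little-endian bytes."""
    units = re.findall(r"%u([0-9a-fA-F]{4})", text)
    return b"".join(struct.pack("<H", int(unit, 16)) for unit in units)


if __name__ == "__main__":
    # Placeholder strings: two NOP-like padding units and four ASCII letters.
    print(unescape_u_to_bytes("%u9090%u9090").hex())
    print(unescape_u_to_bytes("%u4241%u4443"))
```

The recovered bytes can then be saved to a file and passed to a disassembler or shellcode emulator for further study.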
Technology Mitigations
Lastline (http://www.lastline.com/) is a technology pioneer dedicated to stopping advanced and evasive malware and zero-day attacks. Lastline’s flexible Previct platform provides high-resolution malware analysis and protection, the foundational network security layer needed for advanced malware detection. Evasive malware simply bypasses traditional security controls such as IPS, AV and next-generation firewalls, since they are signature based.
Malicious PDF file attacks: Lastline analyzes these attacks by running the malicious files through a fully emulated sandbox and evaluating each stage of the attack life cycle using binary-level code analysis.
At DTS Solution, we are the leading provider of the Lastline Breach Detection System in the United Arab Emirates (UAE) region.
General Mitigations
Malicious PDF attacks can be mitigated by ensuring:
