In order to delay detection, phishing and malware websites often use some obfuscation technique.
Obfuscation techniques are double-edged swords. They hide the malicious content from dumb crawlers, bots and sandboxes, but smarter algorithms that know what to look for can detect the malware just by looking at it’s attempts to hide. This is one of the ways we can detect zero-day malware.
In this example we have a fake PayPal website. This page interleaves invisible spans between visible text in order to avoid detection by automated systems that perform heuristic analysis of the web page content.
You’ll get a clearer idea by looking at the following pictures.
This is the fake PayPal website as it is displayed in the browser:
Notice the text just above the login box on the left of the page. The text says “Bitte geben Sie Ihre PayPal-Dated ein”. You will not find this phrase in the source code of the page because the phrase (and especially the word PayPal) has been interleaved with a lot of text enclosed in invisible spans. This text is present in the page but it is not displayed to the user.
Here is a part of the source code of the page (click on the image to enlarge it):
The parts in brown are the invisible spans, they contain a lot of random text that the browser is instructed not to display to the user.
The parts surrounded by yellow boxes are visible and displayed to the user. These parts compose the phrase you see on the webpage but a bot that scans the page and that doesn’t skip the invisible parts cannot find this phrase or even the word PayPal in the whole page.
Invisible content is perfectly normal in legit web pages, often some parts of the page are made visible only on specific events, often most of the page is initially invisible and made visible only when everything has been loaded. Having invisible content is not bad by itself and this is why crawlers and sandboxes don’t ignore it. Using it in this way is certainly suspicious.
Our UrlSand sandbox searches for this and other obfuscation/evasion techniques in order to detect malware.
Libra Esva R&D Manager