Saturday, February 24, 2018

Data backup and simple duplication

Disclaimer: RAID arrays with duplication protect against most drive failures, and depending on the setup can also protect against data corruption. However, if all your data is physically located in one building, you can still lose everything. If your PSU decides it's tired of life and wants to take your system with it, you'll lose everything. If someone finds a bedbug in a different apartment and decides to burn the whole building to the ground, you'll lose everything. A safe backup of your local data lives somewhere else: an off-site NAS you can manage remotely, or a cloud storage service.

On the opposite end of the spectrum, you can maximize your backup safety and longevity at the expense of near-total inaccessibility. Buy an LTO tape cartridge, marvel at the $14/TB price, then cry when you see that a tape drive will set you back over $3k. Transfer your data and put the cartridges in an airtight stainless steel time capsule. Add some "DO NOT EAT" packets and seal it in a low-humidity room. Bury it in the Los Angeles area for optimal storage temperatures. You can return any time within the next two to three decades, retrieve your data, and cry again as you realize your original tape drive has failed and you need a functioning piece of vintage hardware to retrieve your dog pictures and logs of high school IRC/AIM chats.

I've dreamed of having a full-on standalone NAS for a long time. I wanted a very expandable RAID-Z2 setup (tolerant of two simultaneous drive failures) running on FreeNAS or something comparable. The hardware requirements for FreeNAS seem reasonable at first glance, but after a little digging you realize that a smoothly-performing system needs a LOT of ECC RAM; the commonly cited rule of thumb is roughly 1GB of RAM per TB of storage. You'll also need enough SATA ports for all your drives. I recommend an HBA expansion card with internal SAS ports, each of which can drive four SATA devices via a breakout cable. You'll need an Ethernet controller that can handle the speeds you want. Throw in the case, PSU, CPU, a motherboard that supports all of the above, and the drives themselves, and you have a very expensive system. Before buying anything, check your NAS software's hardware compatibility documentation to make sure all components are supported.
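
For reference, the double-parity pool I wanted is a single command on a ZFS system. A minimal sketch, assuming FreeBSD-style device names; the pool name and disk list are placeholders:

# create a RAID-Z2 pool named "tank" from six disks (placeholder device names)
zpool create tank raidz2 da0 da1 da2 da3 da4 da5

# confirm the layout and pool health
zpool status tank

Any two of the six disks can die without data loss, which is exactly the failure tolerance I was after.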

I couldn't justify all that to back up my meager 5TB of data, so I spent $30 on a license for DrivePool and $190 on a 6TB drive.

I had two 3TB drives I wanted to duplicate; the goal was protecting against a single drive failure. I assumed I could just turn on duplication and DrivePool would mirror them onto the 6TB drive, but that's not how it works. DrivePool combines disks into a pool that shows up as a new lettered logical drive, and data has to be copied into the pool before it can be duplicated. I originally decided to split the 6TB drive into two partitions and create two pools, each containing one 3TB drive and one new 3TB partition. After spending a day or two on a completely unnecessary full format of those partitions, I realized they were pointless: it was simpler to leave each drive with a single partition and add all three to one pool. I made sure duplication was turned off, copied the files into the pool, verified that all the data was there, deleted the original non-pool data, then turned on duplication and let DrivePool do its thing.
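
For the copy-into-the-pool step, a command-line copy is easier to verify than dragging folders in Explorer. A minimal sketch, assuming D: is one of the old 3TB drives and P: is the pool drive (both letters are examples):

rem mirror one old drive into a folder on the pool, keeping data, attributes, and timestamps
robocopy D:\ P:\from-drive-d /E /COPY:DAT /DCOPY:T /R:1 /W:1 /LOG:copy-d.log

The log file makes it easy to confirm nothing was skipped before you delete the originals.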

Afterward, since all my files were accessible from the new lettered pool drive, I removed the drive letters from the 3TB drives, since I no longer needed to access them directly. It unclutters my drive list, and DrivePool doesn't care whether a pooled drive has a letter or not.
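
If you don't want to dig through Disk Management, diskpart does this quickly. A sketch; the volume letter is an example:

C:\> diskpart
DISKPART> list volume
DISKPART> select volume E
DISKPART> remove letter=E

The drive stays online and pooled; it just no longer shows up in Explorer.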

The UI is simple and user-friendly, but a few things initially confused me. The biggest one: DrivePool sorts pooled data into categories that are not self-explanatory at first glance. This simple and helpful post was the clearest and most useful explanation I could find.

Helpful tip: when you are copying files, DrivePool automatically limits the transfer rate to 40-50% of a disk's maximum I/O speed so the drive stays usable during the process. If you don't need that usability, there is a ⏩ symbol to the right of the progress bar at the bottom of the window. Click it and DrivePool removes the limiter, copying the files roughly twice as fast.

Here is what my current pool looks like:

[screenshot: DrivePool pool overview]

Wednesday, June 12, 2013

Malware Analysis In Situ

Preface: I am by no means a "professional" malware analyst, and I'm sure there are easier ways to analyze samples (including just throwing the following sample at a sandbox). This was more of a live-fire exercise.


Besides drive-by downloads, typically in the form of exploit kits, one of the most common delivery vectors for malware is email. Last week I received a typical message promising deposit slips in a .rar file. Instead of trashing it or uploading it to a public sandbox, I decided to take a peek into the sample's functionality using a variety of techniques. Some of these have been covered in previous articles (Malware Analysis 101 Part 1, Malware Analysis 101 Part 2, Malware Analysis 101 Part 3) while others will be somewhat more advanced. First we'll start with static analysis in Dependency Walker and IDA Free. Then we'll move on to OllyDbg. Next up will be behavioral analysis within the analysis VM using Brian Baskin's Noriben.py (and fakenet to emulate a "real" internet connection). Finally, I'll share the final report from malwr.com to see what indicators we may have missed throughout the above process.
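
For the behavioral stage, the order of operations inside the VM matters. A rough sketch of how I drive those two tools; the install paths are examples:

:: start the fake network services first, so the sample has something to talk to
C:\tools\fakenet\fakenet.exe

:: start Noriben, which launches Procmon and begins logging
python C:\tools\Noriben.py

:: run the sample, let it do its thing, then press Ctrl-C in the Noriben window;
:: Noriben filters the raw Procmon capture down to a readable report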

Friday, March 15, 2013

Memory Games - Volatility


I had a brief introduction to memory forensics, or rather I was pushed into exploring the topic after an interesting conversation with Microsoft's own Allison Nixon.

Allison had come across a malware sample that was thwarting both VM runtime execution and static code analysis. It was evident, once IDA got stuck looping on a large block of code, that this particular sample was packed, and a quick strings search confirmed it. Beyond the Visual Basic ASCII strings, there was no way to analyze this sample without patching out the encryption function. Unfortunately, neither of us is at a point where we can handle that just yet, so we threw some ideas around. Prior to roping me in, while closely observing the sample (Redacted), she discovered how the VM detection was functioning. After removing the registry entry she had observed (more info to come in a blog from Allison), the sample successfully executed and the VM was infected!

With the sample active in a controlled environment, I suggested that a memory snapshot of the VM could be used to carve out a copy of the unencrypted sample for local static analysis. This would allow us to bypass the encryption function of the original sample! The awesome thing about VM environments, and VMware Workstation (http://www.vmware.com/products/workstation/) in particular, is the ability to create snapshots, which include a ".vmem" file. This format is very similar to what you get using dd (http://www.forensicswiki.org/wiki/Tools:Memory_Imaging) to dump RAM from a live system, and is thus readable by the open source Volatility Framework (https://www.volatilesystems.com/default/volatility). After taking a snapshot of the infected VM to generate the .vmem file I needed, I fired up SIFT (freely provided by SANS here) to begin my search for the nastiness.
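
The carving itself boils down to a few Volatility commands. A minimal sketch using Volatility 2.x syntax; the profile, PID, and file names are placeholders:

# identify the correct OS profile for the memory image
python vol.py -f infected.vmem imageinfo

# list running processes and find the sample's PID
python vol.py -f infected.vmem --profile=WinXPSP3x86 pslist

# carve the (now unpacked) executable out of process memory
python vol.py -f infected.vmem --profile=WinXPSP3x86 procexedump -p 1234 -D dump/

The dumped binary reflects the process as it ran, so the unpacking has already happened in memory and the copy on disk can go straight into IDA.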

Sunday, February 17, 2013

An unexpected weekend project that went very well.

We all have these unexpected projects. This week I was given a surveillance DVR that had seen better days: the primary hard disk had spun a bearing (hehe). It was a system I built for my family a number of years ago. Previously it ran the software that came with the capture card, a generic 4-channel bt878 card. That software ran on Windows XP, so from the start I never expected anything long-lasting, although it did last about 5 years.

Anyways, enough about what it was. I dug up a 4GB USB flash drive, did a very basic CentOS 6 install, added x264, FFmpeg, ZoneMinder, MySQL, PHP, and Apache, and made it a reliable surveillance server. It has a 1TB drive for the video.
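
The stack came together roughly like this. A sketch, not a full build log; at the time x264, FFmpeg, and ZoneMinder had to be compiled from source on CentOS 6, and the service name depends on the init script you install:

# web and database stack from the base repos
yum install httpd php php-mysql mysql-server

# compilers and headers for building x264, FFmpeg, and ZoneMinder
yum groupinstall "Development Tools"

# after building and installing, make everything start at boot
chkconfig httpd on
chkconfig mysqld on
chkconfig zoneminder on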

I have to say the ZoneMinder software doesn't look fancy at first glance, but it supports all the features that most expensive commercial solutions offer. I like that it did not require me to install any GUI environment on the device itself: no resources wasted on this build.


[root@cam ~]# uptime
 11:09:56 up 1 day, 23:04,  2 users,  load average: 0.00, 0.18, 0.22
[root@cam ~]# ps ax | grep zm[ac]
12795 ?        S      0:04 /usr/bin/perl -wT /usr/local/bin/zmaudit.pl -c
12931 ?        S     46:47 /usr/local/bin/zmc -d /dev/video0
16011 ?        S    131:08 /usr/local/bin/zma -m 5
16192 ?        S     35:39 /usr/local/bin/zmc -d /dev/video1
16204 ?        S    126:37 /usr/local/bin/zma -m 6
16231 ?        S     35:32 /usr/local/bin/zmc -d /dev/video2
16243 ?        S    120:07 /usr/local/bin/zma -m 7
16309 ?        S     34:22 /usr/local/bin/zmc -d /dev/video3
16321 ?        S    128:12 /usr/local/bin/zma -m 8


Not bad for an old Pentium E5200 with 2GB of RAM.

-Mathew

Sunday, February 10, 2013

Malware Analysis 101 [Part 2]

Now that we've covered the different classifications of malicious software by functionality, made the distinction between targeting vectors, and summarized how IDS/IPS devices fit into the scheme, it's time to move on to analysis of malware. In case you missed it, Part 1 can be located here.

Approaches:

Generally, analysis of any sample can be broken down into two categories, static and dynamic, each of which further subdivides into basic and advanced:

  • Static
    • Basic - Examining an executable without viewing the actual instructions.
    • Advanced - Code analysis using disassembly tools to view the underlying instructions (opcodes).
  • Dynamic
    • Basic - Behavioral observation [Sandboxing].
    • Advanced - Code analysis using a debugger to manually control the flow of the program. This technique is used to understand the more complex aspects of a sample which static code analysis may be unable to provide.
Each of the above techniques can be used in conjunction with the others to reveal different pieces of information about a piece of software. The goal of analysis is to take these many small pieces of information from multiple sources and form a picture of the nature of the sample.

Basic Static Analysis:

From here on, it is assumed that a sample is already in hand. Finding malware is a whole different can of worms, and a worthy topic for a separate blog post series (maybe in the future).

With a sample provided, one of the first things you can do is antivirus scanning. The key is to use multiple products and resources, as this increases signature coverage; signature variations make a substantial difference in detection rates. Fortunately, instead of buying 8-10 host-based antivirus products (and pitting them against each other on the same local machine) along with their accompanying subscriptions, several free web services exist which will scan a submitted sample against multiple antivirus engines. These engines include both signature-based and heuristic-based detection. The most popular and well-known example is, of course, VirusTotal.com (which offers free public and paid private API access).
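
The public API also makes lookups scriptable. A minimal sketch against the v2 API; the key and hash are placeholders:

# retrieve an existing report for a file by hash
curl -d apikey=YOUR_API_KEY -d resource=MD5_OR_SHA256_HASH \
     https://www.virustotal.com/vtapi/v2/file/report

# or submit a new file for scanning
curl -F apikey=YOUR_API_KEY -F file=@sample.exe \
     https://www.virustotal.com/vtapi/v2/file/scan

Both calls return JSON, so the results are easy to fold into your own triage scripts.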

Before moving on, a word about the inherent weaknesses of antivirus software. The two detection mechanisms, signature-based and heuristic-based, each have shortcomings. Simple code modifications easily bypass signature-based detection, and in this era of malware that updates itself over its command-and-control channel, the cat-and-mouse game has become even more fast-paced. Heuristic engines have a similar weakness: they can be bypassed by new or unusual code. If an engine has no pattern of behavior to compare a sample against, it cannot categorize it.

[screenshot: VirusTotal scan results]
The image to the right was generated by using VirusTotal's main page to submit a binary retrieved from a domain listed on malwaredomainlist.com. The file is categorized as a generic password stealer, which slots nicely into the "Infostealer" category defined in Part 1 of this series. Interesting note: only 9 of 42 AV products detected this sample as malicious, although the ones that did identify it categorized it similarly [this suggests a commonality amongst their signature designs]. The grey window shows PE32 structural information, which is something we will examine a little later. Sites like VirusTotal, and antivirus products as a whole, are not without their faults: benign files can and oftentimes do trigger alerts [UNetbootin, for example, is packed with UPX, which appears suspicious and does trigger some alerts]. Thus, solely relying on a multi-scanner service like this is not going to provide enough information to make an informed decision about the overall nature of a sample. Use the information gained from services like VirusTotal or NoVirusThanks to inform your decisions, not make them for you!

Sandboxing:

Sandboxing is a basic dynamic analysis technique in which a sample is run in an isolated environment, the goal being to "box off" potentially malicious activity so it cannot harm the local machine. This is generally achieved through virtualization (e.g., Cuckoo Sandbox). Sandboxes attempt to mimic common network services in order to monitor the behavior of malware when it is given network connectivity. As with antivirus scanning, a number of free web-based services provide this functionality if you lack the means to set up a local environment. By far the most popular is ThreatExpert, which performs automated analysis and is generally very useful for initial reporting.
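
If you do have the hardware for a local setup, Cuckoo keeps the workflow simple. A sketch, assuming a default Cuckoo checkout with a configured analysis VM:

# start the Cuckoo host process (from the cuckoo directory)
python cuckoo.py

# in a second terminal, queue a sample for analysis
python utils/submit.py /path/to/sample.exe

Cuckoo reverts the VM to a clean snapshot, runs the sample, and writes a behavioral report without exposing your own machine.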

Automated sandbox analysis is not without its drawbacks and frequently fails to give meaningful output. Some of the issues are as follows:
1) Any sample that requires command line options will not be analyzed appropriately, as an automated environment has no way of knowing which arguments to pass when executing a piece of software.
2) Though network services are simulated, command and control traffic is not. Malware that is reliant on instructions from a remote host may never fully execute, leaving the automated analysis useless.

Provided the sandbox is able to execute the provided sample, a report containing system changes, network connectivity, and basic functionality is generated. ThreatExpert has the added ability to identify some malware through AV scans, but this functionality overlaps with services previously mentioned. As with the multi-engine antivirus services, the results of automated analysis should not be taken as definitive: the goal of sandboxing is to provide insight into the ways in which a sample may attempt to manipulate a host.

[screenshot: example ThreatExpert report]
Here is an example report for a sample submitted to ThreatExpert in 2010. Though 2010 may be a relatively ancient date at this point, it is a good demonstration of the information ThreatExpert provides post-analysis. You can see basic summary information about the submitted sample, including hash, file size, aliases as identified by AV products, and time/date submitted. Of more interest is the "What's Been Found" field, which summarizes the activity that took place during the run in an easily readable and understandable form. This information is very useful for understanding the capabilities of a sample, and can provide some insight into the function calls you may see further along in the analysis process. The next section contains file system changes: files added, removed, and modified, along with their names and directories, which is potentially useful for signature development or manual removal. The final section of note is the network activity summary. Here you can see that a file was downloaded, potentially providing yet another sample for future analysis. You can also infer that the submitted sample has some downloader Trojan functionality and is likely making Windows API calls in order to download this file.

Stay Tuned for Part 3!
[Hashing, PE Headers, Linked Libraries and Functions!]

Friday, February 8, 2013

Newest contributor to this blog

Hello followers. I am a new addition to this blog. My topics will range far and wide, as do my interests.

My current project is porting ArchLinux to my newest embedded device, the oDroid. I have a Raspberry Pi, but I wanted something with a bit more processing power and discovered the oDroid. Great device.

The status of the project: as far as I can tell the system is running, but there is no video output from the HDMI port (framebuffer only). I have a 1.7V UART adapter on the way so that I can get a console up and figure out the video issue.

As of last night I've finished setting up a VM on my server dedicated to distcc (the distributed C compiler) to assist in compiling software on the oDroid. Not that the oDroid is slow, but time saved is progress made.
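
distcc needs very little configuration once the daemon is running on the VM. A sketch; the host name and job counts are examples, and the VM needs a toolchain that targets ARM for the output to be usable:

# on the oDroid: point distcc at the build VM and fan the make jobs out
export DISTCC_HOSTS="buildvm/8 localhost/2"
make -j10 CC="distcc gcc" CXX="distcc g++"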

In case you've not seen the oDroid, it packs a quad-core ARM Cortex-A9 (an ARMv7 part) at 1.7GHz, 2GB of memory, and a Mali-400 GPU. A couple of other things that make it stand out compared to the RPi: it has 6 USB ports, audio in, an 8-bit eMMC memory option, and an SD reader fast enough for Class 10 cards. I did spring for the 16GB eMMC option.

I now have the eMMC running Android, one SD card with my ArchLinux progress, and a second SD card with Ubuntu. I did manage to get GNU Radio compiled under Ubuntu and have been using it with an RTL-SDR receiver.
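
Before pointing GNU Radio at the dongle, the stock rtl-sdr tools make a quick sanity check easy. A sketch; the sample rate, sample count, and output path are examples:

# confirm the dongle is detected and watch for dropped samples
rtl_test

# grab about 10 seconds of raw samples at 2.048 MS/s
rtl_sdr -s 2048000 -n 20480000 capture.bin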

-Mathew