How We Worklet: Data Extraction with Third-Party Storage

Welcome to the fourth installment of our blog series: How We Worklet.

Let’s briefly revisit what a Worklet is and how you can use one within Automox.

Recap: What’s a Worklet and how is it used?

Automox Worklets™️ are super helpful automation tools. Worklets hand over the reins so you can automate any scriptable action on macOS, Linux, and Windows devices. What soul-crushing, manual task vexes you most? Script it and eliminate it.

When you write or use existing Worklets, you give your organization a shot at reaching its full automation potential. The best part is that Worklets help you do away with time-consuming manual tasks and help with compliance efforts.

Leverage Worklets to remediate zero-day or unpatched vulnerabilities. Or use them to configure your devices, disconnect unauthorized applications, roll back patches, etc.

To learn more about what a Worklet is and how it works, jump over to the Automox Community.

Today’s Winning Worklet is… the Data Extraction with Third-Party Storage Worklet

Colby is the Automox Systems Administrator. He’s focused on endpoint management, automated workflows, and script creation to get the job done!

In his off-time, Colby is an avid gamer and custom PC builder, and he uses his 3D printers to print tiny, radical objects such as figurine busts, robot chassis, and miniatures.


Now, let’s get down to business.

Why did you build this Worklet? What problem does it solve?

One of the first and most important steps in troubleshooting is gathering information. For IT, this typically means collecting system, service, or application logs.

Asking someone to find and upload these files can be confusing, take a lot of time, and in some cases seem outright impossible due to file or folder permissions based on your organization’s security policies.

What tasks does the Data Extraction with Third-Party Storage Worklet accomplish?

The Data Extraction with Third-Party Storage Worklet was built to:

  • Reduce time spent on information gathering

  • Give power to the IT helpdesk to automatically gather information on devices

  • Eliminate frustrating back-and-forth communications

How long did it take to build the Data Extraction with Third-Party Storage Worklet?

This solution required knowledge of the operating system and of where the log files for its different components are kept. It also required knowledge of other tools, such as Rapid7’s InsightConnect workflow builder and the Google Drive API.

Depending on the endpoint/third party on which you’d like to store the information, this can be either more or less complex.

Following a lengthy discovery process and setup, it took some troubleshooting to get the solution to work. However, with this solution now built out, it helps speed up the information gathering part of the troubleshooting process. Also, the Worklet serves as a jumping-off point for many more automated processes that can use the information we stored.

How does the Data Extraction with Third-Party Storage Worklet work?

Data Extraction with Third-Party Storage works by compressing the files we need, encoding them with base64, wrapping them in JSON, and uploading them to Rapid7 with a curl command. Rapid7 then sends the files to the storage solution of our choice using its included Google Drive plugin. This connects your Google Drive storage securely and without much hassle over authentication.

Prerequisites

Let’s set up our script

To make sure we can pull files from any device, we need our script to be as general as possible so it can work with any user. For this reason, we need to utilize variables.

Let’s get the currently logged-in user first and store it in the loggedInUser variable:

loggedInUser=$( echo "show State:/Users/ConsoleUser" | scutil | awk '/Name :/ && ! /loginwindow/ { print $3 }' )
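The pipeline above prints nothing when only the login window is active, which would leave later paths looking like /Users//Desktop. Here is a minimal sketch of a guard against that case. The function names console_user_from_scutil and require_console_user are hypothetical, and the parsing is split into its own function only so it can be tried with sample text on a machine without scutil.

```shell
#!/bin/bash
# Hypothetical helper: reads scutil output on stdin and prints the console
# user's short name, or nothing if only the login window is active.
console_user_from_scutil() {
    awk '/Name :/ && ! /loginwindow/ { print $3 }'
}

# Hypothetical guard: fail when the name is empty instead of building
# paths such as /Users//Desktop.
require_console_user() {
    if [ -z "$1" ]; then
        echo "No console user is logged in; skipping log collection." >&2
        return 1
    fi
}

# Usage on a Mac (commented out because scutil exists only on macOS):
# loggedInUser=$(echo "show State:/Users/ConsoleUser" | scutil | console_user_from_scutil)
# require_console_user "$loggedInUser" || exit 1
```

Exiting early here is an assumption about the desired behavior; another option would be to write the files to a fixed directory that does not depend on a logged-in user.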

Also, let’s go ahead and get the current time as well, to infer when the files were collected:

DATE=$(date +%s)

Last but not least, let’s store our file paths for easy use (and cleaner code). We’ll do this for each file we’re retrieving:

filePathS=/Users/$loggedInUser/Desktop/$HOSTNAME-systemlog.zip

filePathA=/Users/$loggedInUser/Desktop/$HOSTNAME-amagentlog.zip

**Important Note: $HOSTNAME is a variable that the bash shell sets automatically, so it doesn’t need to be defined in the script on macOS. It expands to the hostname of the device the script runs on.

Next, we’ll compress the files for ease of uploading by using the zip command:

zip "$filePathS" /private/var/log/system.log

zip "$filePathA" /var/log/amagent/amagent.log

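To send the zipped files inside a JSON payload, we encode them with base64. Keep in mind that base64 is an encoding, not encryption; the data is protected in transit by the HTTPS connection used for the upload. These commands print the encoded contents of each file:

base64 -i "$filePathS"

base64 -i "$filePathA"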
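One way to confirm that the encoded text really carries a file’s contents, rather than just its path, is a round trip: encode the file, decode the result, and compare. Here is a minimal sketch; encode_file and decode_stream are hypothetical helper names. Because base64 needs no key, anyone holding the text can decode it the same way.

```shell
#!/bin/bash
# Hypothetical helper: encode a file's contents as one unbroken base64 line.
# Reading from stdin avoids differences in how the macOS and GNU versions of
# base64 take file arguments; tr removes the line wraps GNU base64 adds.
encode_file() {
    base64 < "$1" | tr -d '\n'
}

# Hypothetical helper: decode base64 text from stdin back to the original bytes.
decode_stream() {
    base64 --decode
}

# Usage:
# encode_file "$filePathS" | decode_stream > /tmp/roundtrip.zip
# cmp "$filePathS" /tmp/roundtrip.zip && echo "round trip OK"
```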

Now, so that Rapid7 can parse and identify the files sent into the InsightConnect workflow, we wrap each file in JSON key-value pairs. The filename key tells InsightConnect what to call the file, and the base64_data key carries the encoded contents it needs to decode:

cat << EOF > /Users/$loggedInUser/Desktop/systemlog.json

{"filename": "$DATE-$HOSTNAME-systemlog.zip", "base64_data": "$(base64 /Users/$loggedInUser/Desktop/$HOSTNAME-systemlog.zip)"}

EOF

cat << EOF > /Users/$loggedInUser/Desktop/amagentlog.json

{"filename": "$DATE-$HOSTNAME-amagentlog.zip", "base64_data": "$(base64 /Users/$loggedInUser/Desktop/$HOSTNAME-amagentlog.zip)"}

EOF
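The heredocs above depend on base64 printing its output as a single line, which is the default on macOS; GNU base64 wraps at 76 characters by default, and a line break inside a JSON string is invalid. Here is a minimal sketch of a payload builder that strips line breaks either way. The function name build_payload is hypothetical, and it assumes the file name contains no quotes or backslashes.

```shell
#!/bin/bash
# Hypothetical helper: print a JSON payload for one file.
#   $1 = file to send, $2 = name to report in the "filename" key
# Assumes the name contains no quotes or backslashes, which holds for the
# $DATE-$HOSTNAME-*.zip names used in this guide.
build_payload() {
    local encoded
    # Strip newlines so the base64 text is a valid single-line JSON string.
    encoded=$(base64 < "$1" | tr -d '\n')
    printf '{"filename": "%s", "base64_data": "%s"}\n' "$2" "$encoded"
}

# Usage:
# build_payload "$filePathS" "$DATE-$HOSTNAME-systemlog.zip" \
#     > "/Users/$loggedInUser/Desktop/systemlog.json"
```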

Finally, we’ll use curl to send our files to Rapid7 for the next step in the process. Replace the placeholder URL below with the trigger URL that Rapid7 generates:

curl -X POST -H "Content-Type: application/json; charset=utf-8" -d "@/Users/$loggedInUser/Desktop/systemlog.json" https://URL-to-Upload-to/abcdefg-abc

curl -X POST -H "Content-Type: application/json; charset=utf-8" -d "@/Users/$loggedInUser/Desktop/amagentlog.json" https://URL-to-Upload-to/abcdefg-abc
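By default, curl exits with status 0 even when the server answers with an HTTP error, so a failed upload can go unnoticed. Here is a minimal sketch that adds curl’s --fail option and removes the payload file only after a successful upload. The function name upload_payload is hypothetical, and deleting the file afterward is an assumption about what you want left on the user’s Desktop.

```shell
#!/bin/bash
# Hypothetical helper: POST one JSON payload and delete it on success.
#   $1 = payload file, $2 = InsightConnect trigger URL
upload_payload() {
    # --fail makes curl return non-zero on HTTP 4xx/5xx responses;
    # --silent --show-error hides the progress meter but keeps error text.
    if curl --fail --silent --show-error -X POST \
            -H "Content-Type: application/json; charset=utf-8" \
            -d "@$1" "$2"; then
        rm -f "$1"
    else
        echo "Upload of $1 failed; leaving the file in place." >&2
        return 1
    fi
}

# Usage (the URL is the placeholder from this guide):
# upload_payload "/Users/$loggedInUser/Desktop/systemlog.json" https://URL-to-Upload-to/abcdefg-abc
```

The same cleanup could be applied to the .zip files once both uploads succeed.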

Setting up our Rapid7 InsightConnect Workflow

First, log in to Rapid7. Click on the InsightConnect module. Then, navigate to the Workflow section on the following page.

Click on the blue Add Workflow button on the top right of the page and build a workflow from scratch to start with a blank workflow.

Next, name your workflow, give a summary of what it does, and establish tags for identifying the workflow on the initial page.

To start our workflow, we need to create a trigger. For our use-case, we’ll make this trigger an API endpoint URL we can target:

Next, let’s configure the variables we want our workflow to “ingest” from the JSON format we set up earlier. These tell the workflow what to name the files as they come in and which field holds the base64 data so we can decode the files:

Once this step is saved, we’re presented with our API trigger URL. We can input it into our script from earlier and write instructions on how to use it!

Next, we’ll create the step that extracts the zipped data sent from our devices. Select the Compression by Rapid7 action, choosing the Extract Archive option:

You’ll need to choose an orchestrator, or set up one if needed.

For our purposes, it’s already been created, so just press continue. Then enter in the variables from our trigger setup. You can add a variable by clicking in the text boxes and pressing the blue plus sign on the bottom right to search for all variables that exist in our workflow:

Finally, connect a Google account via the Rapid7 plugin for the last step: uploading our files. Go ahead and create another step, select the Action option, and search for the Google Drive plugin:

Then, select the Create File in Folder action to upload our file to Google Drive. Next, create a connection so Rapid7 can interact with our Google Drive environment. You’ll need to enter the required information exactly as it appears:

**Important Note: You’ll need to complete the prereqs listed at the top of this guide in order to have the information for these fields. You can find the information in the Google service account’s JWT JSON file, which is downloaded during service account setup.

The only thing left to do is set the folder ID of the Google Drive folder you want to upload to and... voila! You now have an automated workflow for collecting files from your organization’s devices!

Keep in mind that, due to limitations with Google Drive’s API, shared drives are not supported as destinations for uploads. However, Google Apps Script can provide a handy workaround for this last step.

**The folder ID can be obtained by navigating to your Google Drive folder, then copying the last segment of the URL: "https://drive.google.com/drive/folders/{folder-id}"
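If you’d rather not copy the ID by hand, shell parameter expansion can pull it out of a pasted URL. Here is a minimal sketch; the function name folder_id_from_url is hypothetical, and it assumes the URL follows the pattern shown above, optionally with a query string such as ?usp=sharing on the end.

```shell
#!/bin/bash
# Hypothetical helper: print the folder ID from a Google Drive folder URL.
folder_id_from_url() {
    local id
    id=${1%%\?*}                # drop any query string such as ?usp=sharing
    id=${id%/}                  # drop a trailing slash, if present
    printf '%s\n' "${id##*/}"   # keep the text after the last slash
}

# Usage:
# folder_id_from_url "https://drive.google.com/drive/folders/{folder-id}"
```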

Before you built the Data Extraction with Third-Party Storage Worklet, how much time did it take to do the same task(s)?

Depending on the files required, the security permissions of the organization’s user base, and the end user’s confidence in locating and uploading the files, this could be nearly impossible to complete. Or it could take up to a week of playing an IT game of telephone.

Now that you have the Data Extraction with Third-Party Storage Worklet, how much time does the same task take?

The whole Worklet/upload and storage process takes about two minutes, depending on the network speed for the client device and the size of the files being uploaded.

Is the Data Extraction with Third-Party Storage Worklet device-specific?

Currently, the Worklet is macOS-only.

However, the upload is kicked off by a curl command, which can also be used from PowerShell. To get this working on Windows devices, all that’s needed is a PowerShell version of the formatting and encoding script.

Which type of IT or security role might especially benefit from using the Data Extraction with Third-Party Storage Worklet?

Any IT Helpdesk employee can take advantage of this Worklet to drastically reduce and automate part of their troubleshooting process. However, this can also be utilized by security analysts to scan for vulnerabilities in various apps in their organization’s endpoints.

And finally… just for fun, if this solution were an animal, what would it be and why? What would its theme song be?

If this solution were an animal, it would be a Golden Retriever.

It’s able to find and retrieve multiple files, as needed, and deliver them to third-party storage solutions reliably and efficiently.

This Worklet is on the hunt for your log files and hungry for whatever you want to feed into it! For that reason, the Data Extraction with Third-Party Storage Worklet theme song would be Hungry Like the Wolf by Duran Duran.

Stay tuned for more Winning Worklets

Remember, anyone can create and offer up a Worklet in our online community. Though some Worklets are written by the Automox team, our customers also have great ideas that come to life in Worklet form. Automox users who create and share new Worklets are affectionately dubbed “SuperUsers.”

To dive deeper into Worklets and discover what they can do for you, check out the Community Worklets catalog. Here you’ll see what new Worklets are available. You can also ask questions about how Worklets function or submit your own!

Until next month, be well and Worklet on.

Watch Colby demo the Worklet:

Automox for Easy IT Operations

Automox is the cloud-native IT operations platform for modern organizations. It makes it easy to keep every endpoint automatically configured, patched, and secured – anywhere in the world. With the push of a button, IT admins can fix critical vulnerabilities faster, slash cost and complexity, and win back hours in their day.

Demo Automox and join thousands of companies transforming IT operations into a strategic business driver.
