How Can You Extract Several Values from a String?

How do you extract several values from a string with VBA?

Pulling values out of a string is something you probably already do all the time with Excel’s handy “Text to Columns” built-in.

text_to_columns

Excel’s built-in Text to Columns functionality

As amazing as “Text to Columns” is, though, sometimes the info you’re trying to get is just nestled too deeply, in too complicated a manner, for “Text to Columns” to work. What then?

Let’s say you’re given some data that was spit out by someone in a different department’s arcane system, and it looks like this:

(L-8735:3[701.4][22/09/2015])(L-8844:5[458.32][30/10/2015])
(L-8744:5[935.2][22/09/2015])(L-8792:3[1402.8][29/10/2015])(L-8802:3[2104.2][29/10/2015])
(L-8753:4[14.9][28/09/2015])
 
You want to take ALL values and dates contained by square brackets and move them to their own cells, so you have something like this:

split_out_square_brackets

Our End Goal

To pull this off we’re going to use Regular Expressions (regex or regexp for short) — but don’t worry, we’re going to break it down step-by-step and TOTALLY demystify the process.

Before we get to business, though, we need to first enable Regular Expressions in VBA.

enable_regular_expressions

Turn on Regular Expression support in VBA

From the VBA editor screen, select the “Tools” option, then click “References…”. In the next pop-up window, check the box next to “Microsoft VBScript Regular Expressions 5.5”, then click “OK” — that’s it! You now have access to the RegExp object.

Now, let’s get on with the code:


Here’s a link to the code above so you can review it side-by-side with the walk through below. Right-click, “Open in new window”.

extract_values_from_string

Here’s our script in-action, plucking values like a champ

Per usual, let’s walk down this script using our 4-step VBA process to guide the way:

Step 1 – Setup
Step 2 – Exploration
Step 3 – Execution
Step 4 – Cleanup

Lines 13-20 cover our Step 1 – Setup here — short and sweet. First, we set a reference to the worksheet (which is named “raw” in this case). Then, we set strPattern, which is the string-matching pattern our RegExp object will use, on the next line.

Let’s spend a little time on strPattern — there’s a lot packed into a little space.

FIRST THINGS FIRST THOUGH — don’t think that I dug into my memory and just conjured this sucker up. I use Rubular to iteratively build matchers, and you should too. By seeing the match results in real time, you can quickly build up what you need and move on.

With that out of the way, let’s talk about what’s contained in that strPattern variable:

Phew! Let’s move on.

Finally, we create the actually RegExp object (called rgx in this example) on lines 15-20. The most important step in the RegExp object setup is on line 19, where we assign rgx.Pattern to be the strPattern we created above.

Lines 24-30 are our Step 2 – Exploration. We identify the last-occupied row, then create a Range object that contains all the cells we want to investigate… Easy cheesy!

Lines 33-59 handle our Step 3 – Execution, and that phase kicks off with a For Each...Next loop around the entire target Range.

The first thing we do in our loop is establish the raw string we’d like to match against. From there, we leverage the RegExp.Test method on our strRaw, which returns True if there are any matches and False otherwise.

If RegExp.Test returns True, then we create an object (called objMatches to hold all of our matches on line 42. Then, beginning on line 46, we loop through the objMatches object, removing square brackets from each value, and eventually writing the output to the neighboring columns (on line 50). Whaddup!

The Step 4 – Cleanup is a cinch in this script, as we only need to let the user know our macro has finished — and that’s exactly what happens on line 58.

And with that, we can now extract an arbitrary number of square bracket-enclosed values from a string. Nice!

Are you plucking values out of complicated strings with ease? If not, let me know and I’ll help you get what you need! And if you’d like more step-by-step, no-bullshit VBA guides delivered direct to your inbox, join my email newsletter below.

How To Speed Up Delete Row Loops

How do you speed up a loop that deletes rows?

Deleting rows based on some criteria is something you will do all the time with VBA. Seriously — the scenarios are endless:

Delete rows where the cell value is blank
Delete rows where the cell value is less than some number
Delete rows where the cell value is less than a certain date
Delete rows where the cell value is …

Being the sharp VBA writer you are, you already figured out that you have to loop backward too! (Author confession: the number of times I have incorrectly looped forward and watched my For...Next fail is embarrassingly high.)

Let’s say you are deleting rows based on blank cell values in a certain column:

DO NOT get hung up on the check against vbNullString (blank) in that example — you might be checking for values in a column below a number threshold (like 50):

You might even be trying to delete rows where the date in a column is more recent than a certain day (like Feb 1st, 2013):

Regardless of the specific cell context you are trying to delete, the For...Next loops above perform like shit! As your data files get bigger, your macro takes longer… and longer… and before long you’re waiting 15-20 minutes for all the cleanup. Let’s fix that by skipping the For...Next loop altogether!

If you had to tackle this task by hand, how would you go about it? With Excel’s built-in filter, right? With VBA we can do exactly that… and compared to looping, the Range.Autofilter method is ludicrous speed!

ludicrious_speed

Ludicrous speed!

Let’s see how it’s done:


Here’s a link to the code above so you can review it side-by-side with the walk through below. Right-click, “Open in new window”.

More lines of code? Absolutely — no debate there. But a huge performance improvement? You bet.

Let’s walk down the code above in the context of the 4-step VBA process:

Step 1 – Setup
Step 2 – Exploration
Step 3 – Execution
Step 4 – Cleanup

Lines 4-9 handle the Step 1 – Setup. The majority of the section is dimensioning variables, but on Line 9 we do a little up-front work and assign wksData to the Worksheet we will be modifying.

Lines 12-22 are the Step 2 – Exploration. First, we identify the last-occupied row and last-occupied column, using lngLastRow and lngLastCol to hold the values. Then, using that info, we create a Range variable called rngDataBlock to store the full block of data.

Lines 25-30 cover our Step 3 – Execution, and even though this section is short there is actually a TON of stuff going on, so let’s walk through it very slowly.

We use With rngDataBlock...End With to manage context here and shave a ton of extra typing off these statements. Every time you see a With Thing...End With, you know that the enclosed methods (like .AutoFilter in this case) are applied to Thing (which is rngDataBlock in our example).

On line 27, we’re using the .AutoFilter method with two inputs: Field and Criteria1. So what the heck are Field and Criteria1?

Line 28 does a lot of work, so the following .gif breaks it down step by step:

delete_rows_with_autofilter_step_by_step

Deleting rows with the AutoFilter (step by step)

Here’s that breakdown again:

Phew!

Finally, lines 33-38 handle our Step 4 – Cleanup, by turning off the AutoFilter safely. And that’s it!

Suppose you wanted to remove rows where the cell value was less than a numeric threshold, like 50? Easy peasy! just change the Criteria1 value on Line 27 like so:

Criteria1:="<50"

Or maybe you’re considering dates, and you only want to keep dates that are older than a certain day (like Feb 1st, 2013)? Another easy change to our friend Criteria1:

Criteria1:=">2/1/2013"

Now that you understand how to use Range.AutoFilter to delete rows rather than a loop, your macros will be WAY faster!

Is your delete-rows-based-on-column-criteria subroutine speedy as heck? If not, let me know and I’ll help you get what you need! And if you’d like more step-by-step, no-bullshit VBA guides delivered direct to your inbox, join my email newsletter below.

Combining Many Excel CSV Files Into One Part 3

What should you do when you need to combine many Excel files (CSVs, XLSXs, TXTs, whatever) into one “master” file?

Merge Many CSVs into Master

Given this file structure with CSVs, we want to merge them all into the master file

Part 3 of a 3 Part Series (Part 1) (Part 2)

We’re almost there y’all! After wrapping Part 2, you now all the data in a single sheet. All that’s left to do is move all the processed CSVs to the “processed” folder (duh). Let’s get cracking!


Here’s a link to the code above so you can review it side-by-side with the walk through below. Right-click, “Open in new window”.

This one is short and sweet, but still sticks to the 4-step VBA process:

Step 1 – Setup
Step 2 – Exploration
Step 3 – Execution
Step 4 – Cleanup

Lines 4-9 are our straightforward Step 1 – Setup. Most important is line 9, where we use the Dir function — Dir does the heavy-lifting here.

The Dir function returns the first filename matching the pathname specified. We know that our macro-containing workbook is in the same folder as the CSVs, so we can use ThisWorkbook.Path combined with “*.csv” to catch the CSV files. (We also take advantage of the fact that Dir returns a zero-length String if there are no more matches in our While loop.)

Lines 15-18 handle our Step 2 – Exploration. We already know everything we need to in this particular case, so we simply assign filenames — nice! strDestinationFile is exactly the same as strSourceFile, except strDestinationFile has had the “processed” folder injected into the middle.

Step 3 – Execution is accomplished by lines 21-24. Moving files with VBA is a cinch by using the Name function, and on the next line we increment strFile by calling Dir again.

Finally, Step 4 – Cleanup is simply a notification to the user on line 28 that the script is finished.

Now that we’ve built everything out, our Public Sub ImportDataThenMoveFiles() script does it all!

Is this import-the-CSVs-then-move-the-CSVs suite of macros working for you? If not, let me know and I’ll help you get what you need! And if you’d like more step-by-step, no-bullshit VBA guides delivered direct to your inbox, join my email newsletter below.

Combining Many Excel CSV Files Into One Part 2

What should you do when you need to combine many Excel files (CSVs, XLSXs, TXTs, whatever) into one “master” file?

Merge Many CSVs into Master

Given this file structure with CSVs, we want to merge them all into the master file

Part 2 of a 3 Part Series (Part 1) (Part 3)

Welcome back! Since Part 1, you have set up a handy folder structure and written your high-level, “ideal world” code. Let’s get to work on the ImportAllDataFiles subroutine!


Here’s a link to the code above so you can review it side-by-side with the walk through below. Right-click, “Open in new window”.

Per usual, we’re going to adhere to the 4-step VBA process:

Step 1 – Setup
Step 2 – Exploration
Step 3 – Execution
Step 4 – Cleanup

Lines 4-23 handle the Step 1 – Setup. We’re mostly just dimensioning variables here, but lines 18-20 are a bit more interesting than that.

We first assign the master worksheet to a variable for easy access, then we identify the columns that will contain our “TimeZone” and “Date” data. Usually, this kind of up-front column assignment is not a good idea, but since we are in charge of the master sheet, we can assign these columns with confidence.

Master Sheet Column Assignment

Boom — since we’re in charge of Master, we can be 100% sure about those tricky columns, “Time Zone” and “Date”

Finally, we use the Dir function on line 23 to feed our loop through the directory.

Since this subroutine relies on the interaction between the master sheet and each CSV, the Exploration and Execution steps bounce back and forth from line 26 through line 108, so let’s walk through each section and identify exactly what’s going on.

Lines 30-38 are doing some Exploration. We identify the last-occupied row on the master sheet (which will change every time a new block of data is appended from a CSV), and with the last row we assign a Range variable. rngDestination will make our copy / paste action much easier down the line.

Lines 41-56 should also be considered Exploration, this time focused on the CSV. We identify the data range (NOT including the headers, since we set that up already in our master file) then set it to another Range variable. rngSource now contains all the data we want to move from the CSV to the master sheet.

Line 60 is our super-simple Execution. The Range.Copy method makes moving data from one Range to another Range really easy — all we have to do is specify the Range to be copied and the destination!

Lines 63-82 begin our next round of Exploration, this time focused on getting that pesky “TimeZone” and “Date” data from the CSV:

Master Sheet Get TimeZone and Date

The “Date” and “Time Zone” values only show up once, but we want them to run down the entire column in the master file

We want to write the “TimeZone” and “Date” data into columns at the far-right end of the original data block in the CSV, which will make pivoting the master sheet a breeze down the line.

Since the data blocks coming from each CSV have the potential to be large, this implementation uses Variant arrays, similar to what we did when we needed to create a column based on a different column.

First, on line 73 and line 79, we do some more Step 2 – Exploration and establish Range variables representing the space on the sheet that we want to fill with the “Time Zone” and “Date” info, then on lines 86-87 we create Variant arrays to match (varTimeZoneCol and varDateCol).

Next, we continue our Exploration get the actual “Time Zone” and “Date” values from the CSV on lines 90-91.

Don’t let the Step 3 – Execution lines (94-97) confuse you — we are simply making EVERY element of the Variant arrays equal to the “Time Zone” or “Date” value. I wish there was a better way to do this, but as far as I know there isn’t — argh!

The final Execution step happens on lines 100-101, where we write the Variant arrays back to the Range columns for “Time Zone” and “Date”.

With that, the CSV data has been completely extracted!

Finally, lines 106-107 close the CSV and adjust the target of Dir, which is Step 4 – Cleanup in preparation for the next CSV file.

Lines 111-112 handle the final Cleanup, where the user is alerted that the data has been merged in and the master has been saved.

Pat yourself on the back, because you just wrapped up Part 2!

Ready for the finale? Check out Part 3, where we move the CSVs that have had their data extracted into a different folder.
Is your merge-many-files-into-one subroutine humming along nicely? If not, let me know and I’ll help you get what you need! And if you’d like more step-by-step, no-bullshit VBA guides delivered direct to your inbox, join my email newsletter below.

Combining Many Excel CSV Files Into One Part 1

What should you do when you need to combine many Excel files (CSVs, XLSXs, TXTs, whatever) into one “master” file?

Merge Many CSVs into Master

Given this file structure with CSVs, we want to merge them all into the master file

Part 1 of a 3-Part Series (Part 2) (Part 3)

This situation or some slight variation has undoubtedly landed on your desk before. You’re given a bunch of CSVs (or XLSXs, or text files maybe), and you need to get all that data into a single “master” file. Bonus points if you get new data files daily!

Fortunately, with VBA you have everything you need to make short work of this challenge.

In Part 1, we’ll handle the folder setup and decide on our overall design. This post will be short, but DAMN important — spending a bit of time up-front allows us to see the potential pitfalls from the beginning, when we can actually DO something about them.

First things first. We know that we want to import each CSV one time, so it would make sense to move each imported file to some kind of “processed” (or “completed”) folder once it’s done.

When taking on bigger projects, it can be really helpful to write out the “ideal world” code first to serve as a guideline. In this case, our “ideal world” code would look like this:

And our file structure should look like this:

C:
|—…
|—MyFolder
|——processed
|———(starts off empty)
|——CallCenterData_1.csv
|——CallCenterData_2.csv
|——(more call center CSV files)
|——master.xlsb
|—…
|—Program Files
|—Program Files (x86)
|—…
|—Users

Folder Structure for Combining CSVs

This is the setup we’ll use to combine all these suckers

And with that, we’re done with Part 1!

Ready to dive-in and do some heavy-lifting? Check out Part 2, where we extract data from each CSV and add it to our master sheet.
Does this high-level strategy and setup make sense to you? If not, let me know and I’ll help you get what you need! And if you’d like more step-by-step, no-bullshit VBA guides delivered direct to your inbox, join my email newsletter below.