Featured Post

TNTET 2017 BREAKING NEWS

TNTET 2017 BREAKING NEWS | ஆசிரியர் தகுதித்தேர்வு நடத்த அனைத்து ஏற்பாடுகளும் தயார்...ஓரிரு நாட்களில் முறையான அறிவிப்பு வெளியாகிறது...| விண்ண...

Saturday, September 21, 2013

Extract words from a cell, where they occur in a list of words.

His request was to list all names that appeared in the sentence to be shown in one cell separated by commas.
While it is possible with a dash of VBA (Google for ACONCAT by Harlan Grove), Excel formulas straight away do not provide this flexibility using the CONCATENATE function.
So Shrivallabha suggested an alternative which was to list the words found across columns.
This post written by Shrivallabha will explain how his solution works.
As always at formula Forensics You can follow along in a sample file here: Download Sample File

Objective

Cells B3:B12 [Namelist Column] holds a list of names which are to be searched in sentences in the Cells C3:C7 [Sentence Column].
The results are then to be listed across starting from cell D3 [Formula Columns].
So we will write a formula in Cell D3 and then copy it down and across.
Note: We could have easily kept the Sentence column to the left of NameList but we may need free space to keep adding columns in Formula section and we don’t know how many columns we’ll need. This arrangement ensures that we don’t have to change layout to adjust.

Formula

Here’s the formula used which needs to be array entered in Cell D3:
=IFERROR(INDEX(NameList,SMALL(IF(ISNUMBER(FIND(NameList,$C3,1)),ROW(NameList)-2),COLUMNS($D$2:D2))),”-”)then press Ctrl+Shift+Enter
Where, NameList is a named formula which refers to the Range B3:B12.
Note: The formula above can be used for Excel Versions 2007 and above. For Excel Versions 2003 and previous we’ll need to resort to =IF(ISERROR(Formula),Formula,”-”)
or
=If(Iserror( INDEX(NameList,SMALL(IF(ISNUMBER(FIND(NameList,$C3,1)),ROW(NameList)-2),COLUMNS($D$2:D2)))), INDEX(NameList,SMALL(IF(ISNUMBER(FIND(NameList,$C3,1)),ROW(NameList)-2),COLUMNS($D$2:D2))),”-”)

Now let’s understand how above first formula works
Always remember, any complex formula is made up of more than one formula similar to a big program built up using small sub-routines.
To begin with, it may seem daunting but as you delve deeper and break the formula down to basic functions then it is just becomes a matter of understanding the inter-dependency of these basic functions with one another.
We’ll do the same to understand this formula.
By doing away with IFERROR what remains is the core formula which gets us the results we are interested in:
=IFERROR(INDEX(NameList,SMALL(IF(ISNUMBER(FIND(NameList,$C3,1)),ROW(NameList)-2),COLUMNS($D$2:D2))),”-”)
We will examine the internal formula starting with Index
INDEX(NameList,SMALL(IF(ISNUMBER(FIND(NameList,$C3,1)),ROW(NameList)-2),COLUMNS($D$2:D2)))
We are using INDEX which is one of the frequently used functions in Excel which has following construct:
=INDEX(array, row_num, column_num)
This function returns an element based on the row_numcolumn_num arguments passed e.g. if we write formula as:
=INDEX(A1:E5, 2, 3)
Then it will return value in Cell C2.
We use this function to return the match found in Named Range “NameList”.
To find out the name that exists in given sentence we use following construct:
IF(ISNUMBER(FIND(NameList,$C3,1)),ROW(NameList)-2)
Here we need to use Excel’s “Evaluate Formula” functionality which shows the steps that Excel takes internally to arrive at the result.
“Evaluate Formula” is accessed by selecting part of the formula and pressing F9 and will show results as below for Cell D3:
  • The NameList array will be searched up like IF(ISNUMBER(FIND({“Alka”;”Aniket”;”Anil”;….;”Vrunda”},$C3,1)),ROW(NameList)-2).
Note: SEARCH function shall be used in place of FIND if we do not need search to be case sensitive.
  • And then it will be searched in sentence cell C3 which will result in array which has numerical results along with #value! errors like IF(ISNUMBER({1;#value!; #value!;….;10; #value!}),ROW(NameList)-2).
  • ISNUMBER is another useful function which handles errors and numerical results (#value!) and creates BOOLEAN results as required by FIND function as IF({TRUE;FALSE;FALSE;….;TRUE; FALSE}),ROW(NameList)-2).
  • The ROW(NameList) function returns results as {3;4;5;6;7;…;11;12} so when we are processing these results which come from if function. So the numerical results will be correlated with following TRUE results: {3, 11}.
  • If we look at our NameList array then they will refer to 3rd and 11th row respectively. But our data has only 10 elements and 3rd element is “Anil” which is incorrect and then 11 will cause #REF error. What you’ll notice is the result is always offset by 2 rows. That is because our data range starts at 3rd row. So we adjust it by subtracting 2 from it. So it refers to correct results: {1, 9}.
I hope by now we have understood how we use above IF construct to get the results. At this point, you might be still wondering why we used SMALL function around IF.
SMALL(If_function, n)
  • We have multiple results and we need to show one result at a time which is done by SMALL(If_formula, COLUMNS($D$3:D3)).
COLUMNS ($D$3:D3) results in 1 in cell D3 i.e. count of columns.
As we copy across $D$2 remains constant and D2 changes.
So in E2 the value becomes 2, in F2 it becomes 3 and so forth.
  • So in cell D3 we get first result of SMALL formula i.e. 1 and therefore it returns “Alka” and in E3 “Vinay” and then there is no third match in F3 so INDEX formula errors out which is then handled by IFERROR function.

No comments: