Concatenating Values from a Variable Number of Cells

Pam has two columns of data. Column A contains simple identifiers, such as A, B, C, etc. Column B contains a series of integer values. She can sort the data by the identifier and, secondarily, by the integer values. Now she wants a formula in column C that concatenates all the integer values for a particular identifier. Thus, if A1:A4 all contain the identifier A, then in cell C1 she would like to have all the values in B1:B4 concatenated and separated by commas, such as “11, 17, 19, 25”. Since the number of rows for each identifier can be different, Pam isn’t sure how to go about the concatenation.

The easiest way to accomplish this is to use a macro, which can be created as a user-defined function. Here’s an example:

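Function CatSame(c As Range) As String
    Dim sTemp As String
    Dim sLast As String
    Dim iCurCol As Long
    Dim J As Long

    Application.Volatile
    sTemp = ""
    iCurCol = c.Column
    ' Only do the work when the referenced cell is in column C
    If iCurCol = 3 Then
        ' Get the identifier from the row above (there is none for row 1)
        If c.Row = 1 Then
            sLast = ""
        Else
            sLast = c.Offset(-1, -2)
        End If
        ' Build the list only on the first row of each identifier
        If c.Offset(0, -2) <> sLast Then
            J = 0
            Do
                sTemp = sTemp & ", " & c.Offset(J, -1)
                J = J + 1
            Loop While c.Offset(J, -2) = c.Offset(J - 1, -2)
            ' Strip the leading ", "
            sTemp = Right(sTemp, Len(sTemp) - 2)
        End If
    End If
    CatSame = sTemp
End Function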

This function takes the cell reference that is passed to it and verifies that the reference is for column C. If it is, then it concatenates values from column B based on the values in column A. It returns the string of concatenated values only if the value in column A is different from the value in the row above it; otherwise it returns an empty string. Assuming your identifiers are in column A and your values to be concatenated are in column B, you could place the following in column C:

=CatSame(C1)

Copy this down as far as necessary in column C and you end up with exactly what Pam wanted.
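
To check what the function returns for each row, you can also call it from VBA instead of from a worksheet cell. The following is a minimal sketch that assumes CatSame is stored in the same workbook; the macro name DemoCatSame and the sample values are made up for illustration. It builds a small data set on a new worksheet and prints the result for rows 1 through 6 to the Immediate window:

Sub DemoCatSame()
    Dim ws As Worksheet
    Dim r As Long

    ' Use a new worksheet so no existing data is overwritten
    Set ws = ThisWorkbook.Worksheets.Add

    ' Hypothetical sample data, already sorted by identifier
    ws.Range("A1:A4").Value = "A"
    ws.Range("A5:A6").Value = "B"
    ws.Range("B1:B6").Value = _
      Application.Transpose(Array(11, 17, 19, 25, 3, 8))

    ' Pass each cell in column C to the function, just as the
    ' worksheet formula would, and print what comes back
    For r = 1 To 6
        Debug.Print "Row " & r & ": [" & CatSame(ws.Cells(r, 3)) & "]"
    Next r
End Sub

With this sample data, rows 1 and 5 (the first row of each identifier) should show “11, 17, 19, 25” and “3, 8” inside the brackets, while the remaining rows show empty brackets.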

A more versatile function would work somewhat like VLOOKUP, but return a concatenated list of all the values that match whatever you are looking up. Consider the following function:

Function VLookupAll(vValue, rngAll As Range, _
  iCol As Integer, Optional sSep As String = ", ")
    Dim rCell As Range
    Dim rng As Range
    On Error GoTo ErrHandler

    Application.Volatile
    Set rng = Intersect(rngAll, rngAll.Columns(1))
    For Each rCell In rng
        If rCell.Value = vValue Then _
          VLookupAll = VLookupAll & sSep & _
          rCell.Offset(0, iCol).Value
    Next rCell

    If VLookupAll = "" Then
        VLookupAll = CVErr(xlErrNA)
    Else
        VLookupAll = Right(VLookupAll, Len(VLookupAll) - Len(sSep))
    End If
ErrHandler:
    If Err.Number <> 0 Then VLookupAll = CVErr(xlErrValue)
End Function

This function takes up to four arguments. The first is the value you want to match in your lookup; in Pam’s case, this is the identifier you want, such as A, B, or C. The second argument is the range of cells in which to look for matches (column A in this case); only the first column of the range is searched. The third argument is an offset, in columns, from that first column to the column containing the values you want concatenated. Unlike the column index in VLOOKUP, a value of 1 here means one column to the right of the lookup column. You can use the function in this manner:

=VLookupAll("B",A1:A99,1)

If you want to specify a different delimiter between values, you can do it using the optional fourth argument. For instance, the following returns a string where a dash separates each value:

=VLookupAll("B",A1:A99,1,"-")
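
The function can also be tried from VBA. The following is a rough sketch, assuming VLookupAll is stored in the same workbook; the macro name DemoVLookupAll, the sample values, and the identifier Z are made up for illustration. It covers the default separator, a custom separator, and a lookup value that has no match:

Sub DemoVLookupAll()
    Dim ws As Worksheet
    Dim vResult As Variant

    ' Use a new worksheet so no existing data is overwritten
    Set ws = ThisWorkbook.Worksheets.Add

    ' Hypothetical sample data: identifiers in A, values in B
    ws.Range("A1:A4").Value = "A"
    ws.Range("A5:A6").Value = "B"
    ws.Range("B1:B6").Value = _
      Application.Transpose(Array(11, 17, 19, 25, 3, 8))

    ' Default separator (comma and space)
    Debug.Print VLookupAll("B", ws.Range("A1:A6"), 1)

    ' Custom separator passed as the fourth argument
    Debug.Print VLookupAll("A", ws.Range("A1:A6"), 1, "-")

    ' No match: the function returns an error value, not a string
    vResult = VLookupAll("Z", ws.Range("A1:A6"), 1)
    If IsError(vResult) Then Debug.Print "No match for Z"
End Sub

With this sample data, the first two calls should print “3, 8” and “11-17-19-25”. Because a failed lookup comes back as an error value (#N/A in a worksheet cell), any code that uses the result as text should test it with IsError first.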

The solutions so far have focused on using macros. The reason for this is relatively simple: There isn’t a formula-based solution that can do what Pam needs. Using nested IF statements to evaluate what is in column A won’t work well because you are limited in how deeply IF statements can be nested.

You could use a formula and an intermediate result if you don’t mind having the concatenated values be at the last instance of an identifier in column A. Start by putting this formula in cell C1:

=B1

This formula should go into cell C2:

=IF(A2=A1,C1 & ", " & B2, B2)

Copy this formula down as many rows as necessary. What you end up with is an increasingly long series of concatenated values in column C, with the longest in each run on the same row as the last occurrence of that identifier in column A. You can then put the following in cell D1 and copy it down the same number of rows:

=IF(LEN(C2)>LEN(C1),"",C1)

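This formula displays only the longest string from each run in column C, which is what Pam needed to begin with. Be aware that the length comparison can miss a run: if an identifier’s complete string is shorter than the first value of the next identifier (a lone 5 followed by 123, for example), the formula returns a blank for that run. Comparing the identifiers instead, as in =IF(A2<>A1,C1,""), avoids that case.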
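
The helper-column approach can be tried out on sample data with a short macro. This is only a sketch: the macro name DemoHelperColumns and the sample values are made up for illustration, and the formulas are the ones given above for columns C and D.

Sub DemoHelperColumns()
    Dim ws As Worksheet
    Dim r As Long

    ' Use a new worksheet so no existing data is overwritten
    Set ws = ThisWorkbook.Worksheets.Add

    ' Hypothetical sample data, already sorted by identifier
    ws.Range("A1:A4").Value = "A"
    ws.Range("A5:A6").Value = "B"
    ws.Range("B1:B6").Value = _
      Application.Transpose(Array(11, 17, 19, 25, 3, 8))

    ' Column C: running concatenation within each identifier
    ws.Range("C1").Formula = "=B1"
    ws.Range("C2:C6").Formula = "=IF(A2=A1,C1 & "", "" & B2, B2)"

    ' Column D: keep only the string on the last row of each run
    ws.Range("D1:D6").Formula = "=IF(LEN(C2)>LEN(C1),"""",C1)"

    ' Make sure the results are current if calculation is manual
    ws.Calculate

    For r = 1 To 6
        Debug.Print "D" & r & ": [" & ws.Cells(r, 4).Value & "]"
    Next r
End Sub

With this sample data, D4 and D6 (the last row of each identifier) should hold “11, 17, 19, 25” and “3, 8”, and the other cells in column D should be blank.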