# Thread: Stratified random sampling

1. ## Stratified random sampling

Hi,

I have to create an excel file where I have a list of IDs in column A and a category column is on its right. Through a macro, I have to create 3 new sheets in which I have to get a 10% random sample for each category from this sheet. I have a vb code for generating a sample of 10% randomly but it doesn't allow the 'stratification'. Quick help would be appreciated.Expected Result.xlsxSourceFile.xlsx  Reply With Quote

2. Welcome to the forum!

Most forums have a policy against solving homework problems. Is this homework or work?

While files are good ways to help others help you, you should include the vba code that you have so far.

To do that sort of thing, an autofilter method would be one approach to start with. The Column B on the created sheets is redundant.

Here are some of my standard routines that I might consider using for such. The concepts are: unique, sorting, and random numbers.
Code:
``` 'https://msdn.microsoft.com/en-us/library/system.collections.arraylist(v=vs.110).aspx
Function advArrayListSort(sn As Variant, Optional tfAscending1 As Boolean = True, _
Optional tfAscending2 As Boolean = True, _
Optional tfNumbersFirst As Boolean = True) As Variant

Dim i As Long, c1 As Object, c2 As Object
Dim a1() As Variant, a2() As Variant, a() As Variant

Set c1 = CreateObject("System.Collections.ArrayList")
Set c2 = CreateObject("System.Collections.ArrayList")

For i = LBound(sn) To UBound(sn)
If IsNumeric(sn(i)) = True Then
Else
End If
Next i

c1.Sort 'Sort ascendending
c2.Sort 'Sort ascending

If tfAscending1 = False Then c1.Reverse 'Sort and then Reverse to sort descending
If tfAscending2 = False Then c2.Reverse 'Sort and then Reverse to sort descending

a1() = c1.Toarray()
a2() = c2.Toarray()

If tfNumbersFirst = True Then
a() = a1()
For i = 1 To c2.Count
ReDim Preserve a(UBound(a) + 1)
a(UBound(a)) = a2(i - 1)
Next i
Else
a() = a2()
For i = 1 To c1.Count
ReDim Preserve a(UBound(a) + 1)
a(UBound(a)) = a1(i - 1)
Next i
End If

End Function

' http://www.excelforum.com/excel-programming-vba-macros/819998-filter-and-sort-scripting-dictionary.html
'Early Binding method requires Reference: MicroSoft Scripting Runtime, scrrun.dll
Function UniqueArrayByDict(Array1d() As Variant, Optional compareMethod As Integer = 0) As Variant
'Dim dic As Object 'Late Binding method - Requires no Reference
'Set dic = CreateObject("Scripting.Dictionary")  'Late or Early Binding method
Dim dic As Dictionary     'Early Binding method
Set dic = New Dictionary  'Early Binding Method
Dim e As Variant
dic.CompareMode = compareMethod
'BinaryCompare=0
'TextCompare=1
'DatabaseCompare=2
For Each e In Array1d
If Not dic.Exists(e) Then dic.Add e, Nothing
Next e
UniqueArrayByDict = dic.Keys
End Function

Function VPickRndX(nArray() As Variant, iPick As Long) As Variant
Dim i As Long, randIndex As Variant, Temp As Variant
Randomize
For i = 1 To iPick
randIndex = Int(Rnd * UBound(nArray)) + 1
Temp = nArray(i)
nArray(i) = nArray(randIndex)
nArray(randIndex) = Temp
Next i
ReDim Preserve nArray(1 To iPick)
VPickRndX = nArray()
End Function```  Reply With Quote

3. mrinal222 might be a data consultant at his firm tasked to 'conduct studies and training around data analysis across the Global Organization' according to some googling.
So he might even be developing a course! If he's on a training course I'd hazard a guess that any trainer would immediately see that my offering was not his student's own work. In the attached (your Expected Result file with a few additions), a click of the button on the Population sheet will update (not create) the 3 other sheets.
There's a macro in there which updates and copies data from a pivot table on a hidden sheet:
Code:
```Sub blah()
Dim RngToClear As Range
'Sheets("Population").Calculate 'might be needed if you're not on Automatic recalculation
With Sheets("Sheet1").PivotTables("PivotTable4")
.PivotCache.Refresh
Categories = Array("C21", "ASR", "DR")
For Each Category In Categories
.PivotFields("Category").CurrentPage = Category
topn = .Parent.Range("D1").Value
.PivotFields("Reference Ids").AutoShow xlAutomatic, xlTop, topn, "Sum of rand"
Set SourceRng = .DataBodyRange.Offset(, -1)
With Sheets(Category)
Set RngToClear = Intersect(.UsedRange.Offset(1), .UsedRange)
If Not RngToClear Is Nothing Then RngToClear.Clear
SourceRng.Copy .Range("A2")
.Range("B2").Resize(SourceRng.Rows.Count).Value = Category 'lose this line if you want as it populates column B which is just the same as the sheet's name.
End With
Next Category
End With
End Sub```  Reply With Quote

4. Code:
```Sub M_snb()
Application.ScreenUpdating = False
sn = [transpose(A2:A1001&"_"&B2:B1001)]

For j = 1 To 3
sp = Filter(sn, Choose(j, "DR", "C21", "ASR"))

Columns(10).ClearContents
Cells(1, 10).Resize(UBound(sp) + 1).Name = "snb"
[snb] = "=rand()"
st = [index(rank(snb,snb),)]

For jj = 1 To UBound(sp) \ 10
c00 = c00 & "|" & sp(st(jj, 1) - 1)
Next
Next

Columns(10).Resize(, 2).ClearContents
st = Split(c00, "|")
Cells(1, 10).Resize(UBound(st) + 1) = Application.Transpose(st)
Cells(1, 10).Resize(UBound(st) + 1).TextToColumns , , , , 0, 0, 0, 0, -1, "_"
End Sub```  Reply With Quote

#### Tags for this Thread

sampling through excel, vb code #### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•