Types of Sampling Techniques in Data Analytics | What is Sampling?
Data sampling is a statistical method that involves selecting a subset of a population of data to create a representative sample.
Let’s say we want to know the percentage of people who use iPhones in a city, for example. One way to do this is to call up everyone in the city and ask them what type of phone they use. The other way would be to get a smaller subgroup of individuals and ask them the same question, and then use this information as an approximation of the total population.
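The iPhone example can be sketched as a small simulation. This is only an illustration under stated assumptions: the city size, the 30% share, and the helper name `estimate_share` are made up for the example and do not come from real data.

```python
import random

def estimate_share(population, sample_size, seed=0):
    """Estimate the share of True values in `population` from a random sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sample = rng.sample(population, sample_size)  # draw without replacement
    return sum(sample) / sample_size

# Hypothetical city: 100,000 people, 30% of whom use an iPhone (made-up figure).
city = [True] * 30_000 + [False] * 70_000

true_share = sum(city) / len(city)                  # "calling everyone"
estimate = estimate_share(city, sample_size=1_000)  # "asking a subgroup"

print(f"true share: {true_share:.3f}, estimate from 1,000 people: {estimate:.3f}")
```

The estimate will usually land within a few percentage points of the true share, which is the sense in which the subgroup approximates the total population.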
Steps involved in Sampling | Data sampling process:
So here’s a step-by-step process of how sampling is typically done.
Step 1
The first stage in the sampling process is to clearly define the target population.
Step 2
Sampling Frame – It is a list of items or people forming a population from which the sample is taken.
Step 3
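Sampling Technique – Choose how the sample will be drawn from the sampling frame. Generally, probability sampling methods give every member of the population a known, typically equal, chance of being selected, so anyone can end up in the sample regardless of their background.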
Step 4
Sample Size – The number of individuals or items to include in a sample must be sufficient to make inferences about the population with the desired level of accuracy and precision.
Step 5
Once the target population, sampling frame, sampling technique, and sample size have been established, the next step is to collect data from the sample.
Types of Data Sampling Methods:
Sampling techniques are categorized into two main types: probability sampling and non-probability sampling. Each type is tailored to specific research needs and offers unique advantages and challenges.
- Probability Sampling
  - Simple Random Sampling
  - Stratified Sampling
  - Cluster Sampling
  - Systematic Sampling
- Non-Probability Sampling
  - Convenience Sampling
  - Purposive Sampling
  - Snowball Sampling
  - Quota Sampling
Probability Sampling Techniques:
1. Simple Random Sampling:
In simple random sampling, every member of the population has an equal chance of being selected, and the researcher picks participants purely by chance. Tools such as random number generators and random number tables are used to make the selection.
Example: The researcher assigns every member in a company database a number from 1 to 1000 (depending on the size of the company) and then uses a random number generator to select 100 members.
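A minimal sketch of this example, using Python's standard `random` module as the random number generator. The function name `simple_random_sample` and the fixed seed are assumptions made for illustration.

```python
import random

def simple_random_sample(member_ids, sample_size, seed=None):
    """Draw `sample_size` distinct members, each equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(member_ids, sample_size)  # sampling without replacement

# Members of a hypothetical company database, numbered 1 to 1000.
member_ids = list(range(1, 1001))

sample = simple_random_sample(member_ids, sample_size=100, seed=42)
print(sorted(sample)[:10])  # first few selected member numbers
```

Passing a seed only makes the draw repeatable; leaving it as `None` gives a different sample on each run.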
2. Systematic Sampling:
In systematic sampling, every member of the population is given a number, as in simple random sampling. However, instead of randomly generating numbers, the sample is chosen at regular intervals.
Example: The researcher assigns every member in the company database a number. Instead of randomly generating numbers, a random starting point (say 5) is selected. From that number onwards, the researcher selects every, say, 10th person on the list (5, 15, 25, and so on) until the sample is obtained.
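The interval-based selection can be sketched as follows. The function name `systematic_sample` and its parameters are hypothetical, and the sketch assumes the starting point falls within the first interval.

```python
import random

def systematic_sample(items, interval, start=None, seed=None):
    """Pick every `interval`-th item, beginning at 1-based position `start`.

    If `start` is omitted, a random position within the first interval is used.
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    if start is None:
        start = random.Random(seed).randint(1, interval)
    if not 1 <= start <= interval:
        raise ValueError("start must lie within the first interval")
    return items[start - 1::interval]  # slice step does the "every n-th" selection

# Members numbered 1 to 1000; start at 5 and take every 10th person.
member_ids = list(range(1, 1001))
sample = systematic_sample(member_ids, interval=10, start=5)
print(sample[:3], len(sample))  # [5, 15, 25] 100
```

If the list has a hidden periodic pattern that lines up with the interval, the sample can be skewed, so it is worth checking how the list is ordered before using this method.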
3. Stratified Sampling:
In stratified sampling, the population is divided into subgroups, called strata, based on a shared characteristic (age, gender, income, etc.). After forming the strata, you can use random or systematic sampling to select a sample from each one.
Example: A company has 600 male employees and 300 female employees, and the researcher wants the sample to reflect the same gender balance. So the population is divided into two subgroups based on gender, and employees are sampled from each subgroup in proportion to its size. A sample of 90, for instance, would contain 60 men and 30 women.
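A minimal sketch of proportional stratified sampling for the 600/300 example, assuming a hypothetical total sample of 90 and simple random sampling within each stratum. The function name `stratified_sample` and the employee labels are made up for illustration.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Sample from each stratum in proportion to its share of the population.

    `strata` maps a stratum name to the list of its members. Because each
    allocation is rounded, the total can differ slightly from `sample_size`
    when the proportions do not divide evenly.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(sample_size * len(members) / total)  # proportional allocation
        sample[name] = rng.sample(members, k)          # random draw within stratum
    return sample

strata = {
    "male": [f"M{i}" for i in range(600)],
    "female": [f"F{i}" for i in range(300)],
}
sample = stratified_sample(strata, sample_size=90, seed=1)
print({name: len(members) for name, members in sample.items()})
# {'male': 60, 'female': 30}
```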
4. Cluster Sampling:
In cluster sampling, the population is also divided into subgroups (clusters), but each cluster should have characteristics similar to the population as a whole. Instead of sampling individuals from every subgroup, you randomly select entire subgroups.
Example: A company has over a hundred offices in twenty cities across the world, each with roughly the same number of employees in similar job roles. The researcher randomly selects 3 to 4 offices and uses all of their employees as the sample.
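The office example might look like this in code. It is a sketch only: the office names, the 50 employees per office, and the function name `cluster_sample` are assumptions, not details from the example.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # pick cluster names at random
    sample = [member for name in chosen for member in clusters[name]]
    return chosen, sample

# Hypothetical company: 100 offices with 50 employees each.
offices = {
    f"office_{i}": [f"office_{i}_emp_{j}" for j in range(50)]
    for i in range(100)
}

chosen, sample = cluster_sample(offices, n_clusters=3, seed=7)
print(chosen, len(sample))
```

Unlike stratified sampling, nothing is drawn from the offices that were not chosen, which is why the clusters need to resemble the whole population.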
Non-Probability Sampling Techniques:
Non-probability sampling is the other main type of sampling technique. Here, individuals are selected on non-random criteria, so not every individual has a chance of being included in the sample. These methods carry a higher risk of sampling bias.
1. Convenience Sampling:
In this sampling method, the researcher simply selects the individuals who are most easily accessible to them. The only criterion is that people are available and willing to participate.
Example: The researcher stands outside a company and asks the employees coming in to answer questions or complete a survey.
2. Voluntary Response Sampling:
Voluntary response sampling is similar to convenience sampling in that the only criterion is that people are willing to participate. However, instead of the researcher choosing the participants, the participants volunteer themselves.
Example: The researcher sends out a survey to every employee in a company and gives them the option to take part in it.
3. Purposive Sampling:
In purposive sampling, the researcher uses their expertise and judgment to select a sample that they think is the best fit. It is often used when the population is very small and the researcher only wants to gain knowledge about a specific phenomenon rather than make statistical inferences.
Example: The researcher wants to know about the experiences of disabled employees at a company. So the sample is purposefully selected from this population.
4. Snowball Sampling:
In snowball sampling, the research participants recruit other participants for the study. It is used when participants required for the research are hard to find. It is called snowball sampling because like a snowball, it picks up more participants along the way and gets larger and larger.
Example: The researcher wants to know about the experiences of homeless people in a city. Since there is no detailed list of homeless people, a probability sample is not possible. The only way to get the sample is to get in touch with one homeless person who will then put you in touch with other homeless people in a particular area.
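The referral process can be sketched as a breadth-first walk over a referral network. The network below, the participant labels, and the function name `snowball_sample` are all hypothetical; in practice the referrals are only discovered as each participant is interviewed.

```python
from collections import deque

def snowball_sample(referrals, seeds, target_size):
    """Grow a sample by following referrals from the initial participants.

    `referrals` maps each person to the people they can put you in touch with.
    """
    seeds = list(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    seen = set(seeds)
    queue = deque(seeds)
    sample = []
    while queue and len(sample) < target_size:
        person = queue.popleft()
        sample.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:  # never recruit the same person twice
                seen.add(contact)
                queue.append(contact)
    return sample

# Hypothetical referral network, starting from a single initial contact "A".
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["F"]}
sample = snowball_sample(referrals, seeds=["A"], target_size=5)
print(sample)  # ['A', 'B', 'C', 'D', 'E']
```

Because everyone in the sample is reachable from the initial contact, people outside that social network can never be selected, which is one source of the sampling bias mentioned earlier.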
Summary of Sampling Methods
The table below summarizes the sampling methods, along with their definitions and examples.
| Method of Sampling | Definition | Example |
|---|---|---|
| Random Sampling | A sampling method where every item of the population has an equal chance of being selected. It is impartial and does not involve investigator control. | Drawing names from a lottery to select participants. |
| Stratified Sampling | The population is divided into subgroups (strata) based on distinct characteristics, and samples are selected proportionally from each subgroup. | Dividing students into Arts, Commerce, and Science groups to study academic performance. |
| Systematic Sampling | The population is arranged in order, and every nth item is selected to form the sample. | Selecting every 10th person from a list of 200 for a survey. |
| Cluster Sampling | The population is split into smaller groups (clusters), a few clusters are selected at random, and the members of the chosen clusters are studied. | Randomly selecting a few schools and studying the students in those schools. |
| Quota Sampling | The population is divided into groups based on certain characteristics, and fixed numbers are selected from each group to ensure diversity in the sample. | Surveying a set percentage of people from different age groups. |
| Convenience Sampling | The investigator selects items based on convenience, often using readily accessible individuals or items. | Interviewing people at a local mall for a survey about cooking habits. |
| Purposive Sampling | The investigator deliberately selects a sample based on their judgment, focusing on items that are deemed most relevant to the study. | Selecting top FMCG companies like Nestle and Hindustan Unilever for a market study. |
| Snowball Sampling | The sample grows as existing subjects recruit further participants through referrals. | A study about homelessness, where one person refers two other homeless people, who in turn refer more homeless people. |
Why is sampling important?
Although the idea of sampling is easiest to understand when you think about a very large population, it makes sense to use sampling methods in research studies of all types and sizes. After all, if you can reduce the effort and cost of doing a study, why wouldn’t you? And because sampling allows you to research larger target populations using the same resources as you would smaller ones, it dramatically opens up the possibilities for research.
Sampling is a little like having gears on a car or bicycle. Instead of always turning a set of wheels of a specific size and being constrained by their physical properties, it allows you to translate your effort to the wheels via the different gears, so you’re effectively choosing bigger or smaller wheels depending on the terrain you’re on and how much work you’re able to do.
Sampling allows you to “gear” your research so you’re less limited by the constraints of cost, time, and complexity that come with different population sizes.
It allows us to do things like carry out exit polls during elections, map the spread and effects of epidemics across geographical areas, and carry out nationwide survey research that provides a snapshot of society and culture.
How to choose the correct sample size
Finding the best sample size for your target population is something you’ll need to do again and again, as it’s different for every study.
To make life easier, we’ve provided a sample size calculator. To use it, you need to know your:
- Population size
- Confidence level
- Margin of error (confidence interval)
If any of those terms are unfamiliar, have a look at our blog post on determining sample size for details of what they mean and how to find them.
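To show how these three inputs combine, here is a sketch based on Cochran's formula with a finite population correction, a commonly used approach for estimating a proportion. It is an assumption that a given calculator works this way. The function name `sample_size` and the default expected proportion of 0.5 (the most conservative choice) are also assumptions.

```python
import math
from statistics import NormalDist

def sample_size(population_size, confidence_level=0.95,
                margin_of_error=0.05, proportion=0.5):
    """Estimate the sample size needed to measure a proportion.

    n0 = z^2 * p * (1 - p) / e^2      (Cochran's formula, large population)
    n  = n0 / (1 + (n0 - 1) / N)      (finite population correction)
    """
    # z-score for a two-sided interval at the requested confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    n0 = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)  # round up so the margin of error is not exceeded

print(sample_size(10_000))  # 370 at 95% confidence and a 5% margin of error
```

Tightening the margin of error raises the required sample size much faster than growing the population does: for very large populations the result levels off near 385 under the same settings.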
