Kilem Li Gwet, Ph.D.

STATAXIS Consulting

20203 Goshen Road, #191

Gaithersburg, MD 20879

Phone: 240-505-0994

Fax:     301-560-3473

Email: Stataxis Consulting

Sr. Statistical Consultant

Statistical Consulting Services for Researchers, Business Managers
Statistical Thinking, From the Drawing Board of Theory to the Messy World of Hard Facts

 

Statistical sampling

 


The Problem:

  • Most social and business studies involve a (random) selection of subjects on whom the data will be collected.  Such a sample of subjects must be carefully designed in order to be representative of the target survey population.  Methods of statistical inference, including data weighting, are then needed to compute the margin of error associated with each estimate.

  • People who are concerned only with immediate gains may be uncomfortable with the risks inherent in sampling.  Like any other scientific methodology, statistical sampling may at times be misleading, yet it is often useful, and it provides the only approach that guarantees the long-term reliability of estimates when a complete enumeration of the population is impractical.  This is achieved by a careful and systematic evaluation of the sampling error.

What I can do:

  • I have years of experience in survey sampling, and have taught it on several occasions. I can help you develop an effective sampling plan.

  • I can perform data weighting on your sample data using your design information.

  • I can perform variance estimation even with complex survey data, based on multi-stage or multi-phase stratified samples. 

What you can do:

  • Send me an e-mail with a short description of your project for a free initial consultation

  • Place an Online Request (preferred method) with the study background and a statement of your problem

  • Call me at 240-505-0994

 

Most researchers do not conduct their own surveys.  They often use survey data that government agencies have collected and made available to the public. To use such data properly, one must carefully read the documentation on the survey design.  If the design is not taken into account, the resulting statistics and standard errors will be wrong.
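To illustrate how ignoring the design can distort a statistic, here is a minimal Python sketch. It assumes a hypothetical two-stratum survey in which the small stratum was deliberately oversampled; the stratum names, population sizes, and values are all made up for the illustration and do not come from any real survey.

```python
# Hypothetical stratified sample: every number below is invented for illustration.
# Each stratum records its population size N and the values observed in the sample.
strata = [
    {"name": "urban", "N": 9000, "values": [50.0, 54.0, 52.0, 48.0]},
    {"name": "rural", "N": 1000, "values": [20.0, 24.0, 22.0, 18.0]},
]


def unweighted_mean(strata):
    """Plain average of all sampled values, ignoring the design."""
    values = [v for s in strata for v in s["values"]]
    return sum(values) / len(values)


def weighted_mean(strata):
    """Mean using design weights N_h / n_h within each stratum."""
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        n_h = len(s["values"])
        weight = s["N"] / n_h          # population units represented by one sample unit
        numerator += weight * sum(s["values"])
        denominator += weight * n_h
    return numerator / denominator


naive = unweighted_mean(strata)        # treats both strata as equally large
design_based = weighted_mean(strata)   # restores the 9000 : 1000 population split

print("Unweighted mean:", naive)
print("Weighted mean:  ", design_based)
```

In this made-up example the rural stratum holds 10% of the population but half of the sample, so the unweighted mean is pulled toward the rural values, while the weighted mean reflects the population shares.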

 

The Genesis of Modern Sampling Theory 

  • Most target populations in social and business studies have specific demographic or commercial structures that are of interest to researchers. Any sample selected for the production of official statistics is naturally expected to be as representative as possible of the population it was selected from.

    What is a representative sample?

    A sample is said to be representative of a population if the sample and population distributions are similar with respect to some key characteristics. The activity that aims to ensure the representativeness of the sample is called Sampling Design.

  • Samples selected according to a specific design are referred to as Complex Samples. Depending on the nature of the population, sampling designs may vary considerably in their level of complexity.  Once a complex sample is selected and the data collected, each sampled individual is assigned a weight.

    What is a weight?

    Each individual in the sample represents a group of individuals in the target population. The number of individuals in that group is the weight associated with the sampled individual. Suppose 100,000 individuals aged 25 or more are selected randomly from the Northeast region of the United States, where about 36,000,000 people in the same age range live. Each of the 100,000 sampled individuals then represents approximately 36,000,000 / 100,000 = 360 individuals in the population.

  • All statistics, such as sums or means, produced from a complex sample must be calculated as weighted sums or weighted means using the weights defined above. The question now is how to evaluate the precision of estimates obtained from complex samples. Standard errors, variances, and confidence intervals must be calculated, and hypothesis tests must be performed.

  • The problem: Abstract statistical models do not refer to any specific population of interest and are therefore inappropriate as a framework for statistical inference from complex samples. It became apparent in the first half of the twentieth century that a new framework was needed to address inferential problems in the context of finite populations. Such a framework was created in an incredible tour de force by the Polish mathematician Jerzy Neyman, giving birth to what is known today as modern sample survey theory, also referred to as Finite Population Sampling. This theory is well documented in key reference books such as Cochran (1977), Särndal et al. (1992), and Kish (1965).

  • Since its creation, the modern theory of sampling has grown considerably in complexity over time. Inferential methods such as the Jackknife or the Bootstrap (and other replication methods) have been developed.  A plethora of software products for handling complex samples (WesVar, SUDAAN, Stata, and more) are offered by many vendors.
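The weighting arithmetic and the replication idea described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a single-stage simple random sample with equal weights, no finite population correction, and a delete-one Jackknife. Real complex designs usually delete whole primary sampling units within strata instead. The function names and the four sample values are hypothetical.

```python
import math


def base_weight(population_size, sample_size):
    """Number of population individuals represented by one sampled individual."""
    return population_size / sample_size


def weighted_mean(values, weights):
    """Weighted mean: sum of w*y divided by the sum of the weights."""
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


def jackknife_se(values, weights):
    """Delete-one Jackknife standard error of the weighted mean.

    Assumes independent sample units (no strata or clusters); this is a
    simplification, not a general complex-survey variance estimator.
    """
    n = len(values)
    full_estimate = weighted_mean(values, weights)
    replicates = []
    for i in range(n):
        # Recompute the estimate with unit i left out.
        kept_values = values[:i] + values[i + 1:]
        kept_weights = weights[:i] + weights[i + 1:]
        replicates.append(weighted_mean(kept_values, kept_weights))
    variance = (n - 1) / n * sum((r - full_estimate) ** 2 for r in replicates)
    return math.sqrt(variance)


# Weight from the example in the text: 36,000,000 people, 100,000 sampled.
w = base_weight(36_000_000, 100_000)

# Hypothetical measurements on four sampled individuals, all carrying weight w.
sample_values = [2.0, 4.0, 6.0, 8.0]
sample_weights = [w] * len(sample_values)

estimate = weighted_mean(sample_values, sample_weights)
standard_error = jackknife_se(sample_values, sample_weights)

print("Weight per sampled individual:", w)
print("Weighted mean:", estimate)
print("Jackknife standard error:", standard_error)
```

With equal weights, the delete-one Jackknife standard error of the mean coincides with the textbook value s / sqrt(n), which offers a quick sanity check before applying the same idea to unequal weights.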

 

 


 
Copyright © 2004, STATAXIS Consulting