Blog

STAT 344 – Sample Surveys Course Reflection

STAT 344 – Sample Surveys was actually a very enjoyable course for me. Unlike many theory-based statistics courses, STAT 344 gives concrete and practical examples of how surveys are conducted. This gave me a feeling of accomplishment because I could see how what I was learning could be applied to real-life situations. A common question on practice exams: we are given a table of data and we are asked to treat the data as a

a) stratified sample

b) panel study

c) aggregation of polls

d) cluster sample

and find the corresponding estimates and standard errors.

 

Some topics or concepts covered in this course (Off the top of my head):

  • Recommending a sample size in order to satisfy an employer’s preferred accuracy level
  • Bias (an example that helped me understand bias: say you sample random people on the street and ask them the number of people in their household. This is a biased way of sampling because larger households have a better chance of being approached by you)
  • Ratio vs. Regression vs. “Vanilla” estimation
  • Panel study (has a covariance term)
  • Stratified sampling
  • One-stage cluster sampling (simple random sample of clusters, then sample everyone in each selected cluster)
  • Two-stage cluster sampling (simple random sample of clusters, then another random sample within each selected cluster)
  • Aggregate polls (poll of polls)
  • House-effects (τ)
  • Weighted sampling
  • Proportional/optimal allocation
  • Cluster sampling with probability-proportional-to-size (this was tricky!)
  • Non-responders
  • 3 types of missing data – missing at random (MAR): the chance of participation varies with the helper variables but not with the variable of interest; missing completely at random (MCAR): the chance of participation is constant and does not depend on the variable of interest; non-ignorable missing (NMAR): the chance of participation varies with the variable of interest and the helper variables.
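As an illustration of the "treat the data as a stratified sample" type of question, here is a minimal sketch of the usual stratified estimate of a population mean and its standard error. The strata names, population sizes, and sample values are made up for the example, and it assumes a simple random sample without replacement inside each stratum.

```python
import math
import statistics

# Hypothetical data: stratum name -> (population size N_h, sampled values)
strata = {
    "urban": (6000, [3, 2, 4, 3, 5, 2]),
    "rural": (4000, [5, 6, 4, 7]),
}

def stratified_mean_and_se(strata):
    """Stratified estimate of the population mean and its standard error."""
    N = sum(N_h for N_h, _ in strata.values())
    estimate = 0.0
    variance = 0.0
    for N_h, sample in strata.values():
        n_h = len(sample)
        weight = N_h / N                    # stratum weight N_h / N
        ybar_h = statistics.mean(sample)    # stratum sample mean
        s2_h = statistics.variance(sample)  # sample variance (n_h - 1 divisor)
        fpc = 1 - n_h / N_h                 # finite population correction
        estimate += weight * ybar_h
        variance += weight ** 2 * fpc * s2_h / n_h
    return estimate, math.sqrt(variance)

estimate, se = stratified_mean_and_se(strata)
print(f"estimate = {estimate:.3f}, standard error = {se:.3f}")
```

The estimate is just a weighted average of the stratum means, and the variance adds up stratum by stratum because the strata are sampled independently.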

STAT 305 – Statistical Inference Course Reflection

STAT 305 – Introduction to Statistical Inference was a pretty difficult course in my opinion. It was very theory-based with not many concrete examples. My favorite unit was probably maximum likelihood estimation. I felt as if I could just follow the same game plan for most questions:

1) Find the likelihood function by taking the product of the n probability density functions (one per independent observation)

2) Take the log to get the log-likelihood, which is easier to work with

3) Take the first derivative of the log-likelihood, set it equal to zero, and solve for the parameter of interest to find the MLE

4) Take the second derivative of the log-likelihood; if it is < 0 at the MLE, then you are indeed maximizing

5) Fisher information is -E(second derivative)

6) The (approximate, large-sample) variance of the MLE is just 1/(Fisher information)
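To make the game plan concrete, here is a small sketch for one specific case: an exponential distribution with rate λ, whose log-likelihood is n·log(λ) - λ·Σx. The choice of distribution and the data values are assumptions for illustration only; they are not taken from the course.

```python
import math

def log_likelihood(rate, data):
    """Log-likelihood of an Exponential(rate) sample: n*log(rate) - rate*sum(x)."""
    return len(data) * math.log(rate) - rate * sum(data)

def exponential_mle(data):
    """Follow the steps above for the exponential rate parameter."""
    n = len(data)
    total = sum(data)
    # Step 3: d/d(rate) of the log-likelihood is n/rate - total; set to 0
    rate_hat = n / total
    # Step 4: second derivative is -n/rate^2, negative for any rate > 0
    second_derivative = -n / rate_hat ** 2
    # Step 5: Fisher information is -E(second derivative) = n/rate^2,
    # evaluated here at the MLE
    fisher_info = n / rate_hat ** 2
    # Step 6: approximate variance of the MLE
    variance = 1 / fisher_info
    return rate_hat, second_derivative, fisher_info, variance

data = [0.5, 1.5, 2.0, 1.0, 3.0]  # hypothetical observations
rate_hat, second_derivative, fisher_info, variance = exponential_mle(data)
print(rate_hat, second_derivative, fisher_info, variance)
```

In this case the second derivative does not depend on the data, so taking the expectation in step 5 changes nothing; for other distributions that step takes more work.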

Some topics or concepts covered in this course (Off the top of my head):

  • Moment generating functions. The first derivative evaluated at t = 0 gives E(Y), the mean, while the second derivative evaluated at t = 0 gives E(Y²). Var(Y) = E(Y²) - [E(Y)]²
  • Likelihood functions
  • Maximum likelihood estimators (MLEs)
  • Bayesian prior/posterior
  • Hessian matrix
  • Fisher information
  • Wilks’ and Pearson’s statistics
  • Paired comparisons/comparing 2 multinomial distributions
  • Hypothesis testing using Neyman Pearson Lemma. Significance level, power, and p-value.
  • Pooled samples
  • Categorical data with free parameters

STAT 302 – Introduction to Probability Course Reflection


First off, I would just like to say that this was the hardest course I have ever taken. Statistics 302 – Introduction to Probability covered so much material and drew on concepts from Calculus 1, 2, and 3 and STAT 200. I found myself studying not only for the course itself, but also reviewing integration, multivariable calculus, and introductory statistical analysis techniques. I truly do think that the material I learned will be useful in the future. Oftentimes I would relate what I was learning to various real-life situations.

Just a short list (in no particular order) of what was covered in this course:

  • Advanced combinatorics / permutations and combinations (probably one of the hardest chapters)
  • Probability laws, which were almost the same as set theory (union, intersection, partition, commutative, associative, distributive, and De Morgan’s laws, complement, subset, disjoint)
  • Conditional probability (Bayes’ formula, odds, independence of events, conditional independence)
  • Discrete random variables (probability mass function, cumulative distribution function, expectation, variance/standard deviation)
  • Common discrete random variables: Bernoulli, binomial, geometric, negative binomial, Poisson, hypergeometric
  • Continuous random variables (probability density function, cumulative distribution function, gamma/uniform/normal/exponential distributions)
  • Joint probability (this chapter was also really difficult and covered so much)
  • Markov’s and Chebyshev’s inequalities
  • Moment generating functions

Sum of all Natural Numbers = -1/12 !?

Quick post about something I came across the other day. I’m currently learning the Principle of Mathematical Induction (PMI) in MATH 220. It is essentially a way of proving that a statement is true for all natural numbers, and it is often used to prove formulas for sums. However, I will talk more about the PMI in a different post.

A math meme page I follow on Facebook posted:

[Image: the meme from the Facebook page]

It’s a joke about the pineapple pen guy but in math terms. The meme is making fun of the claimed proof that the sum of all natural numbers converges to -1/12, that is, 1+2+3+4+… = -1/12. I was really confused and I even believed it at the time! (It is obviously not true in the usual sense of a sum, since the partial sums just keep growing.)

I googled it and one of the most popular results is this video which “proves” the claim: https://www.youtube.com/watch?v=w-I6XTVZXww

I won’t go too in depth, but the proof is incorrect because it assumes many things that are not true. For starters, the video assumes that 1-1+1-1+1… = 1/2. It does not explain why and treats it as common knowledge. However, suppose 1-1+1-1+1… really did have a finite value Z. Adding Z to itself you would get:
Z+Z = (1-1+1-1+1…) + (1-1+1-1+1…), but this is just the original sum again.
This implies Z+Z = 2Z = Z, and since Z = 1/2, it follows that 1 = 1/2, which is not possible. Therefore the video’s proof is flawed. This is just one of its incorrect assumptions. For a more in-depth and thorough analysis, I definitely recommend reading this article: https://plus.maths.org/content/infinity-or-just-112
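A quick way to see why neither series has a sum in the ordinary sense is to look at the partial sums. This is only a small numerical illustration, not a proof, and the helper name `partial_sums` is made up for the example.

```python
from itertools import accumulate

def partial_sums(terms):
    """Return the running totals of a finite list of terms."""
    return list(accumulate(terms))

grandi_terms = [(-1) ** k for k in range(8)]  # 1, -1, 1, -1, ...
natural_terms = list(range(1, 9))             # 1, 2, 3, ..., 8

grandi_partial = partial_sums(grandi_terms)
natural_partial = partial_sums(natural_terms)

print(grandi_partial)   # keeps alternating, never settles on one value
print(natural_partial)  # keeps growing
```

The partial sums of 1-1+1-1+… bounce between two values forever, and the partial sums of 1+2+3+… only get larger, so in the usual definition of convergence neither one converges.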

First Impressions of STAT 302 – Probability

I’m going to start this entry off by saying this course is incredibly interesting, but is by far one of the hardest classes I have ever had to take.

The class started off simple. It felt like a review of MATH 220 to me: the union, intersection, and complement of events behave like those of sets. For example:
Sets: Let A={1,2,3} and B={3,4,5}

A∪B = {1,2,3,4,5}, and counting elements, |A∪B| = |A|+|B|-|A∩B| = 3+3-1 = 5

Probability: Let P(A)=1/3, P(B)=1/2, and P(A∩B)=1/3

P(A∪B) = P(A)+P(B)-P(A∩B) = 1/3 + 1/2 - 1/3 = 1/2
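The same comparison can be sketched in Python, using built-in sets for the first example and exact fractions for the second. The probability of the intersection is taken to be 1/3, which is the value subtracted in the calculation above.

```python
from fractions import Fraction

# Sets: the union, and the counting version of inclusion-exclusion
A = {1, 2, 3}
B = {3, 4, 5}
union = A | B
intersection = A & B
union_size = len(A) + len(B) - len(intersection)

# Probability: same formula, with probabilities in place of element counts
p_a = Fraction(1, 3)
p_b = Fraction(1, 2)
p_a_and_b = Fraction(1, 3)  # assumed value of P(A and B)
p_a_or_b = p_a + p_b - p_a_and_b

print(union, union_size, p_a_or_b)
```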

The next section was Combinatorics: Counting, Permutations and Combinations. I remember learning about this in grade 12, but we did not go in depth. The questions we were expected to be able to do in this course were extremely complicated and I still believe that this section is one of the most difficult ones in the whole textbook. It forces us to think critically and even creatively, as these questions usually have more than one way of being solved.

One of the questions on the FIRST assignment: A quiz consists of 10 true/false questions. A student decides that he will not answer FALSE for any two consecutive questions. In how many ways can he answer all 10 questions?

The question seems quite simple to begin with. As soon as I tried to solve it, it was as if the question’s difficulty was increasing at an exponential rate. A random classmate of mine and I discussed our strategy in solving it and his solution looked like:

So for this question, writing out all the possible answer sheets is possible but not very efficient. There was also talk among other classmates that the count followed a Fibonacci sequence. My thought process was that there must be fewer than 6 false answers in order for none to be consecutive, and then to use the nCr (n choose r) formula. But the solution is much more complicated than that, which I will not go into in this blog post.
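For anyone who wants to check these ideas by computer, here is a small sketch that counts the valid answer sheets three ways: brute force, a Fibonacci-style recurrence, and the "choose where the FALSE answers go" idea. It is my own illustration rather than the assignment's official solution, and all function names are made up.

```python
from itertools import product
from math import comb

def count_brute_force(n):
    """List every T/F answer sheet and keep those with no two F's in a row."""
    return sum(
        1 for answers in product("TF", repeat=n)
        if "FF" not in "".join(answers)
    )

def count_recurrence(n):
    """a(n) = a(n-1) + a(n-2): a valid sheet ends in T, or ends in T then F."""
    prev, curr = 1, 2  # a(0) = 1 (empty sheet), a(1) = 2
    for _ in range(n - 1):
        prev, curr = curr, prev + curr
    return curr

def count_by_choosing(n):
    """With k F's, no two adjacent, there are C(n - k + 1, k) placements."""
    max_false = (n + 1) // 2
    return sum(comb(n - k + 1, k) for k in range(max_false + 1))

n = 10
print(count_brute_force(n), count_recurrence(n), count_by_choosing(n))
```

All three functions agree for the 10-question quiz, which supports both the Fibonacci observation and the nCr approach with at most 5 false answers.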

 

An Introduction to MATH 220 – Logic and Statements

Just finished set theory a few weeks ago in Math 220 and I honestly found it quite interesting. The union, intersection, partition, and complement of sets all relate to probability in Stat 302. One may say that the intersection of the content of the 2 courses is non-empty… Jokes aside, props to Professor Brett Kolesnik for having engaging lectures taught at the right pace. You know a course is well taught when you rarely need to look at textbook explanations.

Anyways, after set theory came logic and statements. This came quite naturally to me due to my prior programming experience in first and second year. Simple operators such as:

Not/Negation (~)

Or/Disjunction (∨)

And/Conjunction (∧)

If/Implication (⇒)

If and only if/Biconditional (⟺)

had truth tables that I already knew, simply by thinking in terms of code. However, something new came up that troubled me. Logical equivalence theorems such as commutative, associative, distributive, and De Morgan’s laws were not anything I was familiar with. After some practice it was much easier to handle.

Simple example: Show that ~(P⟺Q) is logically equivalent to (P∧~Q)∨(Q∧~P)

Start with the left-hand side, ~(P⟺Q):

~(P⟺Q)≡~((P⇒Q)∧(Q⇒P))      … by definition of biconditional

≡~(P⇒Q)∨~(Q⇒P)     … by De Morgan’s Law

≡(P∧~Q)∨(Q∧~P)       … since ~(A⇒B)≡A∧~B ∎
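Since thinking in terms of code helped with the truth tables, the equivalence can also be checked by brute force over all four truth assignments. This is a minimal sketch; the helper names `implies` and `iff` are made up for the example, and the left-hand side is the negated biconditional that the derivation expands.

```python
from itertools import product

def implies(p, q):
    """P => Q is false only when P is true and Q is false."""
    return (not p) or q

def iff(p, q):
    """P <=> Q, defined as (P => Q) and (Q => P)."""
    return implies(p, q) and implies(q, p)

def lhs(p, q):
    return not iff(p, q)

def rhs(p, q):
    return (p and not q) or (q and not p)

rows = list(product([True, False], repeat=2))
equivalent = all(lhs(p, q) == rhs(p, q) for p, q in rows)

# For comparison: negating only the implication P => Q does not match rhs
implication_only_matches = all((not implies(p, q)) == rhs(p, q) for p, q in rows)

print(equivalent, implication_only_matches)
```

The second check fails on the row where P is false and Q is true, which is why the biconditional is needed on the left-hand side.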

Journey to Python – Introduction

Learning a new language or program can sometimes be a little intimidating. Foreign syntax, operators, and functions can be overwhelming and seem like a lot to take in. I’ve heard many great things about Python from my fellow coding friends. They explained to me that it would be a relatively easy and straightforward language to pick up, especially with my prior CPSC 210 Java knowledge. I find programming very interesting as it gives developers free rein. I like to compare a programmer and a language to an artist and a paintbrush. The possibilities are limited only by one’s willingness to learn and ability to express ideas creatively.

Because I have minimal Python knowledge, I will be learning the basics from multiple sources such as: LearnPython.org, YouTube, etc.

The first thing that came to my attention is that you do not need to declare variables and their type before using them. This is a new concept to me because in my past programming experience, objects had a variable name and declared type prior to their use.

Example in Java: String s1 = new String("hello");

Example in Python: s1 = "hello"

Also in Python, you are able to assign more than one variable at the same time using commas. However, I imagine this could get quite messy with more variables.

Example in Python: a, b = 3, 4 in which a is 3 and b is 4.

Some things to keep in mind: An exercise on LearnPython.org used the function isinstance(object, classinfo)  https://docs.python.org/2/library/functions.html#isinstance

They also used the symbol %, which acts as a string formatting operator when applied to strings and as modulus/remainder when applied to numbers. Same as Math 220!       https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting

%s – String (or any object with a string representation, like numbers)

%d – Integers

%f – Floating point numbers

You can also multiply a string by a number. Example: "hi" * 5 gives "hihihihihi"
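Putting the pieces from this post together, here is a short sketch that can be pasted into a Python 3 interpreter. The variable names and the formatted sentence are just examples of mine, not taken from the LearnPython.org exercise.

```python
# No type declarations are needed before assignment
s1 = "hello"

# Assign more than one variable at the same time
a, b = 3, 4

# isinstance(object, classinfo) checks the type of a value
s1_is_string = isinstance(s1, str)
a_is_int = isinstance(a, int)

# % on numbers is the remainder; % on strings is formatting
remainder = 17 % 5
message = "%s has %d items costing %.2f" % ("cart", 3, 9.5)

# Multiplying a string by a number repeats it
repeated = "hi" * 5

print(s1_is_string, a_is_int, remainder)
print(message)
print(repeated)
```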

My first impressions: After working through simple introduction modules, it seems that Python and Java have many similarities. However, this is just the very beginning of Python for me and I am excited to see how the two differ as I get more in depth.