Probability & Statistics

A Coursera course on probability for data science. It does quite a good job of explaining many of the basic tools: probability, conditional probability, distributions, sampling, confidence intervals, hypothesis testing, etc.

  • Probability deals with predicting the likelihood of future events, while statistics involves the analysis of the frequency of past events.
  • The problems considered by probability and statistics are inverse to each other.
  • In probability theory we consider some underlying process which has some randomness or uncertainty modeled by random variables, and we figure out what happens.

=> Underlying process + randomness and random variables -> what happens next?

  • In statistics we observe something that has happened, and try to figure out what underlying process would explain those observations.

=> observe what happened -> what is the underlying process?

  • Finally, probability theory is mainly concerned with the deductive part, and statistics with the inductive part, of modeling processes with uncertainty.

Introduction to statistics

  1. Table of contents
  2. Median
  3. Mode - the most frequent value
  4. Weighted mean
  5. Geometric mean
  6. Harmonic mean
  7. Percentiles
  8. Mean deviation
  9. Correlation
  10. Standard deviation, formula
  11. Standard normal distribution
  12. Skewness of distribution
  13. Confidence intervals (using std)
  14. Accuracy vs precision (accuracy: how close measurements are to the true value; precision: how closely the measurements cluster together)
  15. Probability
  16. Probability complement
  17. Chi-square test: p-value, independence vs. dependence, significance
  18. Variation vs variance - variance is a special case of variation
  19. Std vs variance - the std is the square root of the variance, so it is in the same units as the mean. Squaring the deviations in the variance formula lets outliers have more influence and keeps positive and negative deviations from cancelling each other out.
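
A minimal sketch of several measures from the list above, using Python's standard `statistics` module. The data values and weights are made up for illustration; the population versions of variance and std are used here as an assumption, not because the linked material prescribes them.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample

mean = statistics.mean(data)      # arithmetic mean
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequent value

# Weighted mean: each value counts in proportion to its (hypothetical) weight.
values, weights = [80, 90], [1, 3]
weighted_mean = sum(v * w for v, w in zip(values, weights)) / sum(weights)

geometric = statistics.geometric_mean([1, 4, 16])  # n-th root of the product
harmonic = statistics.harmonic_mean([40, 60])      # reciprocal of the mean reciprocal

# Raw deviations from the mean always cancel out to zero...
raw_deviation_sum = sum(x - mean for x in data)

# ...so the mean deviation uses absolute values, and the variance uses squares.
mean_abs_deviation = sum(abs(x - mean) for x in data) / len(data)
variance = statistics.pvariance(data)  # average squared deviation (squared units)
std = statistics.pstdev(data)          # square root of the variance (same units as the mean)

print(mean, median, mode, weighted_mean, mean_abs_deviation, variance, std)
```

Because the variance is expressed in squared units, the std is usually the easier of the two to read next to the mean.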

Introduction to Probability

  1. Types of events
  2. Independent events
  3. Conditional probability
  4. Probability tree diagrams
  5. Mutually exclusive events
  6. Combination and permutations
  7. Bayes
  8. Least squares regression - it works by making the total of the squares of the errors as small as possible (that is why it is called "least squares")
  9. Random variables
  10. Continuous random variables
  11. Random variables: mean, std, variance
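
The least squares idea from item 8 can be sketched for a straight line as follows. This is an illustrative closed-form fit for one predictor; the helper names `least_squares_fit` and `sum_squared_errors` and the data points are hypothetical, not taken from the linked material.

```python
def least_squares_fit(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_errors(xs, ys, slope, intercept):
    """Total of the squared vertical distances between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


xs = [1, 2, 3, 4, 5]                # made-up data
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = least_squares_fit(xs, ys)
best = sum_squared_errors(xs, ys, slope, intercept)

# Nudging the fitted line in any direction increases the total squared error.
worse = sum_squared_errors(xs, ys, slope + 0.1, intercept)
print(slope, intercept, best < worse)
```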

More on Statistics

  1. 25 concepts (part 2), 29 more concepts (part 1) & part 3 in statistics.

Wiki

  1. Marginal probability
  2. Joint probability
  3. Conditional probability
  4. Chain rule - derivatives using the chain rule, on Khan Academy
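
As a rough illustration of how marginal, joint, and conditional probability relate, here is a small sketch over a made-up joint distribution. The weather/traffic table and the function names are hypothetical. The last lines check the product (chain) rule for probabilities, P(A, B) = P(B | A) · P(A).

```python
from fractions import Fraction

# Hypothetical joint distribution P(weather, traffic); the values sum to 1.
joint = {
    ("rain", "heavy"): Fraction(3, 10),
    ("rain", "light"): Fraction(1, 10),
    ("sun", "heavy"): Fraction(2, 10),
    ("sun", "light"): Fraction(4, 10),
}


def marginal_weather(weather):
    """P(weather): sum the joint probabilities over every traffic outcome."""
    return sum(p for (w, _), p in joint.items() if w == weather)


def conditional_traffic_given_weather(traffic, weather):
    """P(traffic | weather) = P(weather, traffic) / P(weather)."""
    return joint[(weather, traffic)] / marginal_weather(weather)


p_rain = marginal_weather("rain")
p_heavy_given_rain = conditional_traffic_given_weather("heavy", "rain")

# Product (chain) rule: the joint probability is recovered from the two pieces.
p_rain_and_heavy = p_heavy_given_rain * p_rain
print(p_rain, p_heavy_given_rain, p_rain_and_heavy)
```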

Recommended Courses

  1. Another great course on probability, distribution types, conditional, joint, chain, etc.
  2. Khan Academy
  3. A really good intro to probability, conditional, joint, etc.

(Another angle) The main difference between probability and statistics has to do with knowledge: what are the known facts?

  • Inherent in both probability and statistics are a population, consisting of every individual we are interested in studying, and a sample, consisting of the individuals that are selected from the population.
  • In probability: we start out knowing everything about the composition of a population, and then ask, “What is the likelihood that a selection, or sample, from the population has certain characteristics?”
  • In statistics: we have no knowledge about the composition of the population (for example, the types of socks in a drawer), so we infer properties of the population on the basis of a random sample.
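
The two directions can be sketched with the sock drawer. The drawer contents, sample size, and seed below are made-up assumptions chosen only to make the contrast concrete.

```python
import random
from fractions import Fraction

# Probability: the composition of the drawer is fully known.
drawer = ["blue"] * 6 + ["black"] * 4
n_blue, n_total = drawer.count("blue"), len(drawer)

# Chance that two socks drawn without replacement are both blue.
p_two_blue = Fraction(n_blue, n_total) * Fraction(n_blue - 1, n_total - 1)

# Statistics: the composition is hidden; we only get to see a random sample.
rng = random.Random(0)                            # fixed seed for repeatability
hidden_drawer = ["blue"] * 600 + ["black"] * 400  # unknown to the analyst
sample = rng.sample(hidden_drawer, 50)            # draw 50 socks without replacement

# Infer the share of blue socks in the whole drawer from the sample alone.
estimated_share_blue = sample.count("blue") / len(sample)
print(p_two_blue, estimated_share_blue)
```

The first number is an exact deduction from known facts; the second is only an estimate and would change with a different sample.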

Some calculations to get you into probability:

  • Finding the probability of a single event
  • Of two consecutive events (multiplication, when the events are independent)
  • Of several events (sum, when the events are mutually exclusive)
  • Etc.
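
A short worked sketch of these calculations, assuming a fair six-sided die (the die is an illustrative choice, not something the notes specify):

```python
from fractions import Fraction

# Single event: rolling a six with a fair die.
p_six = Fraction(1, 6)

# Two consecutive independent events: a six followed by another six (multiply).
p_two_sixes = p_six * p_six

# Several mutually exclusive events: rolling a one OR a two (add).
p_one_or_two = Fraction(1, 6) + Fraction(1, 6)

# Complement: at least one six in two rolls = 1 - P(no six in either roll).
p_at_least_one_six = 1 - (1 - p_six) ** 2

print(p_two_sixes, p_one_or_two, p_at_least_one_six)
```

Multiplying is only valid when the events are independent, and adding is only valid when they cannot happen together; otherwise conditional probabilities or an overlap correction are needed.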

STATISTICAL SAMPLING AND RESAMPLING

  1. What is it? Methods for sampling and resampling, with sampling errors explained (cross-validation, etc.)
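
One common resampling method is the bootstrap. The sketch below is an illustrative percentile bootstrap for the mean, not necessarily the method the linked article describes; the function name `bootstrap_mean_interval`, its defaults, and the data are all assumptions.

```python
import random
import statistics


def bootstrap_mean_interval(data, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of `data`."""
    rng = random.Random(seed)
    # Resample with replacement, same size as the data, and record each mean.
    means = sorted(
        statistics.fmean(rng.choices(data, k=len(data)))
        for _ in range(n_resamples)
    )
    # Take the alpha/2 and 1 - alpha/2 percentiles of the resampled means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


sample = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.9, 6.2, 4.4]  # made-up measurements
low, high = bootstrap_mean_interval(sample)
print(low, statistics.fmean(sample), high)
```

The width of the interval gives a rough sense of the sampling error: a smaller or noisier sample would produce a wider spread of resampled means.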