Sunday, November 11, 2018

Basic statistics - Some notes and proofs


Sample variance estimation formula: Why is (n-1) to be used instead of n as the denominator?



Let Y1, Y2, .., Yn be the sample data points for the random variable Y. Let μY and σY denote the mean and standard deviation of Y respectively. Assume that the data points are independent of each other and that each has the same distribution as Y. This assumption usually holds true in case of sampling with replacement. Hence, E(YiYj) = E(Yi)E(Yj) for any i, j ∈ {1, .., n} such that i ≠ j.

The estimator for the mean of Y is simply the average value of the data points, i.e.
$$\mathrm{Mean}_{est}(Y) = \frac{1}{n}\sum_{i=1}^{n} Y_i$$


which seems intuitive. By the same reasoning, it would seem that the best estimate for the variance of Y would be
$$\mathrm{Var}_{est}(Y) = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \mathrm{Mean}_{est}(Y)\right)^2$$




Let us compute the expected value of the estimated variance, since the estimate itself is a random variable dependent on the values of the data points Yi in the sample set. Let A denote the variance estimator that uses n as the denominator. Let us denote by Sn the numerator in the formula for Meanest(Y), so that Meanest(Y) = Sn/n, i.e.
$$S_n = \sum_{i=1}^{n} Y_i$$




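Now, by definition, σY² = E(Y²) − μY², so E(Yi²) = σY² + μY² for every i. Expanding the square in A gives
$$A = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \frac{S_n}{n}\right)^2 = \frac{1}{n}\sum_{i=1}^{n} Y_i^2 - \frac{S_n^2}{n^2}$$
Since E(YiYj) = E(Yi)E(Yj) = μY² for i ≠ j,
$$E(S_n^2) = \sum_{i=1}^{n} E(Y_i^2) + \sum_{i \neq j} E(Y_i Y_j) = n\left(\sigma_Y^2 + \mu_Y^2\right) + n(n-1)\mu_Y^2$$
Taking the expectation of A and substituting,
$$E(A) = \left(\sigma_Y^2 + \mu_Y^2\right) - \frac{n\left(\sigma_Y^2 + \mu_Y^2\right) + n(n-1)\mu_Y^2}{n^2} = \frac{n-1}{n}\,\sigma_Y^2 \tag{⑤}$$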

Thus, the estimated variance has a mean value that is less than the actual variance. From equation ⑤, it is clear that if the estimator A is multiplied by n/(n - 1) (which is a linear transformation of A), then by linearity of expectation the resulting estimator Amod will have a mean value that equals the actual variance. Thus, the unbiased estimator of variance is:
$$A_{mod} = \frac{n}{n-1}\,A = \frac{1}{n-1}\sum_{i=1}^{n}\left(Y_i - \mathrm{Mean}_{est}(Y)\right)^2$$
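
The bias can also be checked numerically. The following is a minimal simulation sketch, not part of the original notes: it assumes standard normal data (true variance 1), and the function name `average_variance_estimates`, the sample size n = 5, and the trial count are illustrative choices. It uses Python's standard `statistics.pvariance` (denominator n) and `statistics.variance` (denominator n - 1).

```python
import random
import statistics


def average_variance_estimates(n, trials, seed=0):
    """Average the n- and (n - 1)-denominator variance estimates over many samples."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    biased_total = 0.0
    unbiased_total = 0.0
    for _ in range(trials):
        # Draw n independent points from a distribution with true variance 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        biased_total += statistics.pvariance(sample)   # denominator n
        unbiased_total += statistics.variance(sample)  # denominator n - 1
    return biased_total / trials, unbiased_total / trials


biased_mean, unbiased_mean = average_variance_estimates(n=5, trials=20000)
print("average with denominator n:    ", biased_mean)
print("average with denominator n - 1:", unbiased_mean)
```

With n = 5, the first average should come out close to (n - 1)/n = 0.8 and the second close to 1, which is the gap the n/(n - 1) correction removes.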
 

Wednesday, January 10, 2018

Linear algebra - intuitive explanations of matrices and determinants

  1. Understanding matrices intuitively, part 1
  2. Understanding matrices intuitively, part 2, eigenvalues and eigenvectors
  3. What's an intuitive way to think about the determinant? (Stackexchange post)
  4. Math Insight

Precision-Recall - evaluation and adjustment

  1. Plotting precision-recall with Scikit
  2. Classifier with adjustable precision vs recall

Reading list for decision trees, ensemble methods, support vector machines

Decision trees and ensemble methods

  1. Ensemble Machine Learning Algorithms in Python with scikit-learn
  2. Difference between "fully developed decision trees" and "shallow decision trees"? (Stackexchange post)
  3. Decision tree or logistic regression
  4. Two-class AdaBoost (Scikit-learn page)
  5. Intuitive explanations of differences between Gradient Boosting Trees (GBM) & Adaboost
  6. Decision trees in python with scikit-learn and pandas
  7. Advanced modeling - Predict Customer Churn - Logistic Regression, Decision Tree and Random Forest
  8. Decision Trees, Confusion Matrices and Precision Recall
  9. Xgboost in Sagemaker - Walkthrough with examples

Support Vector Machines

# maximal margin SVM  #Lagrange constrained optimization and svm
  1. An idiot's guide to SVM - MIT lecture slide
  2. Support Vector Machines: Maximum Margin Classifiers – Piotr Mirowski's NYU lecture slides
  3. SVM - Understanding the math - Duality and Lagrange multipliers
  4. Constrained Optimization Using Lagrange Multipliers