Mungeol Heo: AI - Evaluation

Plot
- Compare the distributions of y and yhat
  - The closer the two curves, the better the fit
  - import seaborn as sns
  - ax1 = sns.kdeplot(Y, color="r", label="Actual Value")  # distplot is deprecated; kdeplot draws the same density curve
  - sns.kdeplot(Yhat, color="b", label="Fitted Values", ax=ax1)
Establish a single-number evaluation metric for your team to optimize
- If there are multiple goals that you care about, either combine them into a single formula (such as averaging multiple error metrics) or define satisficing and optimizing metrics: a satisficing metric only has to meet a threshold, while the optimizing metric is the one to push as far as possible
- E.g., use F1, F2, or AUC instead of tracking precision and recall separately
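As a rough illustration of collapsing precision and recall into one number, the sketch below computes F1 and F2 with scikit-learn. The labels are made up for the example; F2 weights recall more heavily than precision.

```python
from sklearn.metrics import f1_score, fbeta_score, precision_score, recall_score

# Hypothetical labels for a binary classifier (1 = positive class).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

precision = precision_score(y_true, y_pred)  # TP / (TP + FP) = 2 / 3
recall = recall_score(y_true, y_pred)        # TP / (TP + FN) = 2 / 4

# Single-number summaries of the two values above.
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall
f2 = fbeta_score(y_true, y_pred, beta=2)     # beta=2 favors recall over precision

print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f} F2={f2:.3f}")
```

Because recall is lower than precision here, F2 comes out lower than F1.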
MAE

Mean absolute error. Each error counts in proportion to its absolute size, so large errors are not penalized disproportionately, unlike squared-error metrics such as MSE
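A small sketch, with made-up numbers, of how MAE and MSE react to one large error. The comparison with MSE is added for contrast and is not part of the original notes.

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hypothetical targets and predictions; the last prediction is far off.
y_true = [10, 20, 30, 40]
y_pred = [12, 18, 33, 20]

# Absolute errors are 2, 2, 3, 20.
mae = mean_absolute_error(y_true, y_pred)  # (2 + 2 + 3 + 20) / 4
mse = mean_squared_error(y_true, y_pred)   # (4 + 4 + 9 + 400) / 4

print(f"MAE={mae} MSE={mse}")
```

The single error of 20 accounts for about 74% of the MAE total but about 96% of the MSE total.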

R^2
- Close to 1 is better
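- Negative value = the model fits worse than always predicting the mean of y; on held-out data this often signals overfitting
- from sklearn.metrics import r2_score
- Or call lm.fit(X, y), then lm.score(X, y)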
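A minimal sketch of both ways of getting R^2, using a made-up, nearly linear data set. The last lines show how a negative value can arise when predictions are worse than the mean.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Hypothetical data, roughly y = 2x.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

lm = LinearRegression()
lm.fit(X, y)
yhat = lm.predict(X)

r2_from_metric = r2_score(y, yhat)  # note the argument order: (y_true, y_pred)
r2_from_model = lm.score(X, y)      # same value, computed by the estimator

# Deliberately bad predictions (targets in reverse order) give a negative R^2.
r2_bad = r2_score(y, y[::-1])

print(r2_from_metric, r2_from_model, r2_bad)
```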
Confusion matrix
- https://en.wikipedia.org/wiki/Confusion_matrix
- Python function
  - http://scikit-learn.org/stable/modules/generated/sklearn.metrics.classification_report.html
  - http://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html
- Precision-recall curve
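The two scikit-learn functions linked above can be tried on a toy example such as the following. The labels are made up; rows of the matrix are true classes and columns are predicted classes.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Hypothetical labels for a binary classifier (1 = positive class).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

# For labels [0, 1] the layout is [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()

# Text table with per-class precision, recall, F1, and support.
report = classification_report(y_true, y_pred)

print(cm)
print(report)
```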
F1 measure
Log Loss
- Measures the performance of a classifier whose predicted output is a probability between 0 and 1; confident but wrong predictions are penalized heavily
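A short sketch of log loss on made-up probabilities, checked against the binary cross-entropy formula written out by hand. The second set of predictions contains one confidently wrong probability, to show how strongly it is punished.

```python
import math
from sklearn.metrics import log_loss

# Hypothetical labels and predicted probabilities of the positive class.
y_true = [1, 0, 1, 0]
p_good = [0.9, 0.1, 0.8, 0.2]
p_bad = [0.9, 0.1, 0.8, 0.95]  # last prediction is confidently wrong

loss_good = log_loss(y_true, p_good)
loss_bad = log_loss(y_true, p_bad)

# Same quantity by hand: mean of -[y*log(p) + (1-y)*log(1-p)].
manual = -sum(
    y * math.log(p) + (1 - y) * math.log(1 - p)
    for y, p in zip(y_true, p_good)
) / len(y_true)

print(loss_good, manual, loss_bad)
```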
AUC: area under the curve
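A minimal sketch, assuming the curve in question is the ROC curve and using made-up scores. The AUC equals the fraction of (positive, negative) pairs that the classifier ranks in the right order.

```python
from sklearn.metrics import roc_auc_score

# Hypothetical labels and classifier scores (higher = more likely positive).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

# Of the four (positive, negative) pairs, three are ranked correctly;
# only (0.35, 0.4) is in the wrong order.
auc = roc_auc_score(y_true, scores)

print(auc)
```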
Time-dependent ROC curve
- For game update
IBS: Integrated Brier score
- For game update
Dunn index

Cross-validation
- https://en.wikipedia.org/wiki/Cross-validation_(statistics)
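- Python function (the sklearn.cross_validation module was removed in scikit-learn 0.20; the same functions and classes now live in sklearn.model_selection)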
  - http://scikit-learn.org/stable/modules/generated/sklearn.cross_validation.cross_val_score.html
- Spark
  - CrossValidator
- K-fold
  - For a small data set
  - E.g., m = about 1000
  - Normally k=10
  - https://en.wikipedia.org/wiki/Cross-validation_(statistics)#k-fold_cross-validation
  - Python function
    - http://scikit-learn.org/stable/modules/generated/sklearn.cross_validation.KFold.html
- Leave-one-out
  - For a very small data set
  - E.g., m < 100; with 20 examples, k=20 (one fold per example)
  - https://en.wikipedia.org/wiki/Cross-validation_(statistics)#Leave-one-out_cross-validation
  - Python function
    - http://scikit-learn.org/stable/modules/generated/sklearn.cross_validation.LeaveOneOut.html
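The K-fold and leave-one-out variants above can be sketched as follows. This uses the current sklearn.model_selection module; the iris data set, the logistic regression model, and the 20-example subset are illustrative choices, not part of the original notes.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# K-fold with k=10: each example is used for validation exactly once.
# Shuffling matters here because the iris rows are sorted by class.
kfold = KFold(n_splits=10, shuffle=True, random_state=0)
kfold_scores = cross_val_score(model, X, y, cv=kfold)

# Leave-one-out on a hypothetical very small data set of 20 examples:
# this is K-fold with k equal to the number of examples.
X_small, y_small = X[::7][:20], y[::7][:20]
loo_scores = cross_val_score(model, X_small, y_small, cv=LeaveOneOut())

print("10-fold mean accuracy:", kfold_scores.mean())
print("LOO mean accuracy:", loo_scores.mean())
```

Each leave-one-out score is either 0 or 1, since every validation fold holds a single example; only the mean is informative.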
Learning curves
- Tell whether adding more training data would help
- Help diagnose bias and variance
- Machine Learning Yearning (2018)
  - p. 55
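A rough sketch of computing a learning curve with scikit-learn's learning_curve. The digits data set and the unpruned decision tree are assumptions chosen to show a visible gap between training and validation scores; they are not from the original notes.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)

# Train on 10%, 32.5%, ..., 100% of each training split, with 5-fold CV.
train_sizes, train_scores, val_scores = learning_curve(
    DecisionTreeClassifier(random_state=0),
    X,
    y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=5,
)

# Average over the CV folds (axis 1) to get one point per training size.
train_mean = train_scores.mean(axis=1)
val_mean = val_scores.mean(axis=1)

for n, tr, va in zip(train_sizes, train_mean, val_mean):
    print(f"n={n:5d}  train={tr:.3f}  validation={va:.3f}")
```

A large gap between the two curves points to high variance, where more data may help. Two curves that converge at a low score point to high bias, where more data is unlikely to help.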
