- Plot
- Compare the distributions of y and yhat
- The closer the two curves, the better the fit
- ax1 = sns.kdeplot(Y, color="r", label="Actual Value")  # sns.distplot is deprecated since seaborn 0.11
- sns.kdeplot(Yhat, color="b", label="Fitted Values", ax=ax1)
- Establish a single-number evaluation metric for your team to optimize
- If you care about multiple goals, consider combining them into a single formula (such as averaging multiple error metrics) or defining satisficing and optimizing metrics
- E.g., use F1, F2, or AUC instead of tracking precision and recall separately
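To illustrate why the choice of single number matters, here is a minimal sketch of the F-beta score computed from confusion counts. The counts for the two classifiers are made up for illustration; with them, model A wins on F1 while model B wins on F2, because F2 weights recall more heavily.

```python
def fbeta_score(tp, fp, fn, beta=1.0):
    """F-beta from confusion counts; beta > 1 weights recall more than precision."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical confusion counts for two classifiers
model_a = dict(tp=80, fp=20, fn=40)   # precision 0.80, recall ~0.67
model_b = dict(tp=90, fp=60, fn=30)   # precision 0.60, recall 0.75

f1_a, f1_b = fbeta_score(**model_a), fbeta_score(**model_b)
f2_a, f2_b = fbeta_score(**model_a, beta=2), fbeta_score(**model_b, beta=2)

print(f"F1: A={f1_a:.3f}  B={f1_b:.3f}")
print(f"F2: A={f2_a:.3f}  B={f2_b:.3f}")
```

In scikit-learn the same quantities are available as `f1_score` and `fbeta_score` in `sklearn.metrics`.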
- MAE
- A good metric for measuring the accuracy of predictions for time series
- It does not punish large errors as heavily as squared-error metrics (MSE, RMSE) do
- MAPE
- MSE
- RMSE
- More sensitive to outliers than MAE
- RMSLE
- Measures relative rather than absolute error, so an error on a large target value is not penalized more heavily than a proportionally similar error on a small one
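The error metrics above can be sketched in plain Python as follows. The sample values are made up; the last prediction is a deliberate outlier to show how RMSE reacts more strongly than MAE, and the RMSLE lines show that the same absolute error counts for less on a larger target.

```python
import math

def mae(y, yhat):
    return sum(abs(a - p) for a, p in zip(y, yhat)) / len(y)

def mape(y, yhat):
    # Percentage error; undefined when an actual value is 0
    return 100 * sum(abs((a - p) / a) for a, p in zip(y, yhat)) / len(y)

def mse(y, yhat):
    return sum((a - p) ** 2 for a, p in zip(y, yhat)) / len(y)

def rmse(y, yhat):
    return math.sqrt(mse(y, yhat))

def rmsle(y, yhat):
    # Uses log(1 + x), so all values must be greater than -1
    return math.sqrt(
        sum((math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(y, yhat)) / len(y)
    )

# Made-up data: the last prediction is far off
y = [10, 20, 30, 40]
yhat = [12, 18, 33, 20]

print("MAE  ", mae(y, yhat))
print("MAPE ", mape(y, yhat))
print("MSE  ", mse(y, yhat))
print("RMSE ", rmse(y, yhat))

# Same absolute error (100), very different relative error
print("RMSLE large target", rmsle([1000], [1100]))
print("RMSLE small target", rmsle([100], [200]))
```

scikit-learn ships equivalents such as `mean_absolute_error`, `mean_squared_error`, and `mean_squared_log_error` in `sklearn.metrics`.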
- R^2
- Close to 1 is better
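- A negative value means the model fits worse than always predicting the mean of y; on held-out data this can be a sign of overfitting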
- from sklearn.metrics import r2_score
- Or call lm.fit(X, y) and then lm.score(X, y), which returns R^2 for a scikit-learn regressor
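As a minimal sketch of what R^2 measures, the function below computes 1 - SS_res / SS_tot directly. The three sets of predictions are made up: a close fit, a constant prediction equal to the mean, and predictions in the wrong direction. The last case gives a negative score, which happens whenever predictions are worse than the mean baseline.

```python
def r2(y, yhat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, yhat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot

y = [1, 2, 3, 4, 5]
good = [1.1, 1.9, 3.2, 3.8, 5.1]   # close fit
baseline = [3, 3, 3, 3, 3]         # always predict the mean
bad = [5, 4, 3, 2, 1]              # trend in the wrong direction

print("good    ", r2(y, good))
print("baseline", r2(y, baseline))
print("bad     ", r2(y, bad))
```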
- Confusion matrix
- F1 measure
- Log Loss
- Measures the performance of a classifier whose predicted output is a probability between 0 and 1; lower is better
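A minimal sketch of binary log loss, assuming labels in {0, 1} and predicted probabilities for the positive class. The `eps` clipping is an implementation choice made here to avoid log(0); the example inputs are made up.

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood for binary labels."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

labels = [1, 0, 1, 0]
confident_right = [0.9, 0.1, 0.8, 0.2]
uninformative = [0.5, 0.5, 0.5, 0.5]
confident_wrong = [0.1, 0.9, 0.2, 0.8]

print("confident and right:", log_loss(labels, confident_right))
print("uninformative:      ", log_loss(labels, uninformative))
print("confident and wrong:", log_loss(labels, confident_wrong))
```

Confident wrong predictions are penalized far more than uninformative ones. scikit-learn provides `sklearn.metrics.log_loss` for the general multi-class case.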
- AUC: area under the curve
- Time-dependent ROC curve
- For game update
- IBS: Integrated Brier score
- For game update
- Dunn index
- A metric for evaluating clustering algorithms
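The Dunn index has several variants. The sketch below assumes one common form: the smallest distance between points in different clusters divided by the largest cluster diameter, so higher values indicate compact, well-separated clusters. The toy clusters are made up, and the function assumes at least two clusters and at least one cluster with two or more points.

```python
import math
from itertools import combinations

def dunn_index(clusters):
    """clusters: list of clusters, each a list of coordinate tuples."""
    # Smallest distance between any two points in different clusters
    min_inter = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    # Largest distance between any two points in the same cluster
    max_intra = max(
        math.dist(p, q)
        for c in clusters
        for p, q in combinations(c, 2)
    )
    return min_inter / max_intra

well_separated = [[(0, 0), (0, 1)], [(10, 0), (10, 1)]]
overlapping = [[(0, 0), (0, 4)], [(3, 0), (3, 4)]]

print("well separated:", dunn_index(well_separated))
print("overlapping:   ", dunn_index(overlapping))
```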
- Cross-validation
- https://en.wikipedia.org/wiki/Cross-validation_(statistics)
- Python function
- Spark
- CrossValidator
- K-fold
- For a small data set
- E.g., m ≈ 1000
- Normally k=10
- https://en.wikipedia.org/wiki/Cross-validation_(statistics)#k-fold_cross-validation
- Python function
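A minimal sketch of how k-fold splitting works, using index lists only. The function name `k_fold_indices` and its `seed` parameter are illustrative, not part of any library; in practice `sklearn.model_selection.KFold` or `cross_val_score` would usually be used instead.

```python
import random

def k_fold_indices(m, k=10, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    idx = list(range(m))
    random.Random(seed).shuffle(idx)
    # Deal shuffled indices into k folds of near-equal size
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, val

splits = list(k_fold_indices(m=1000, k=10))
for fold_no, (train, val) in enumerate(splits):
    print(f"fold {fold_no}: train={len(train)} validation={len(val)}")
```

Each example is used for validation exactly once and for training in the other k - 1 folds; the final score is typically the mean of the k validation scores.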
- Leave-one-out
- For a very small data set
- E.g., m < 100; with 20 examples, k = 20
- https://en.wikipedia.org/wiki/Cross-validation_(statistics)#Leave-one-out_cross-validation
- Python function
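Leave-one-out is k-fold with k equal to the number of examples: each example is held out once while the model is trained on all the others. The sketch below assumes a simple one-variable least-squares line as the model and made-up noise-free data; `fit_line` and `loo_mse` are illustrative names. scikit-learn offers `sklearn.model_selection.LeaveOneOut` for real use.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def loo_mse(xs, ys):
    """Mean squared error over m folds, each holding out one example."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        pred = slope * xs[i] + intercept
        errors.append((ys[i] - pred) ** 2)
    return sum(errors) / len(errors)

# 20 made-up examples on an exact line, so the held-out error is ~0
xs = list(range(20))
ys = [2 * x + 1 for x in xs]
print("LOO MSE:", loo_mse(xs, ys))
```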
- Learning curves
- Tell whether adding more training data is likely to help
- Diagnosing bias and variance
- Machine Learning Yearning (2018)
- p. 55
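A learning curve can be sketched by training on progressively larger subsets and recording training and validation error at each size. The example below assumes a one-variable least-squares line and synthetic noisy data generated on the spot; the sizes, noise level, and helper names are illustrative choices.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def mse(model, xs, ys):
    slope, intercept = model
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic data: a line plus Gaussian noise
rng = random.Random(0)
xs = [i / 10 for i in range(60)]
ys = [2 * x + 1 + rng.gauss(0, 1) for x in xs]

order = list(range(60))
rng.shuffle(order)
train_idx, val_idx = order[:40], order[40:]
val_x = [xs[i] for i in val_idx]
val_y = [ys[i] for i in val_idx]

curve = []  # (training size, training error, validation error)
for n in [2, 5, 10, 20, 40]:
    tx = [xs[i] for i in train_idx[:n]]
    ty = [ys[i] for i in train_idx[:n]]
    model = fit_line(tx, ty)
    curve.append((n, mse(model, tx, ty), mse(model, val_x, val_y)))

for n, train_err, val_err in curve:
    print(f"n={n:2d}  train MSE={train_err:.3f}  validation MSE={val_err:.3f}")
```

Reading the curve: if both errors level off close together at a high value, the model is likely bias-limited and more data is unlikely to help; a large remaining gap between validation and training error suggests variance, where more data often does help.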
Wednesday, August 22, 2018
AI - Evaluation