25 multiple-choice questions on decision trees, random forests, feature importance and sklearn, plus 5 short-answer questions.
Select the single best answer for each question. Each question is worth 1 mark.
A decision tree makes predictions by:
A random forest is best described as:
Bagging (Bootstrap Aggregating) works by:
Feature importance in a random forest is typically measured by:
The out-of-bag (OOB) error is:
n_estimators in sklearn's RandomForestClassifier controls:
Setting a very small max_depth on a decision tree results in:
Random forests use feature subsampling at each split to:
Compared to a single decision tree, a random forest generally has:
max_features in sklearn's RandomForestClassifier controls:
Pruning a decision tree achieves:
Gini impurity at a node measures:
A decision tree will overfit when:
The key advantage of random forests over single decision trees is:
oob_score=True in sklearn:
In sklearn, model.fit(X_train, y_train):
The difference between predict() and predict_proba() in sklearn is:
Cross-validation is used in model evaluation to:
A model with high bias and low variance is likely:
A feature importance of 0.0 for a column means:
An ensemble method improves prediction by:
A decision tree grown without any depth limit will typically:
Random forests reduce variance compared to a single tree because:
What is grid search used for in model development?
Which of the following would you use to visualise the relative importance of features in a trained sklearn random forest?
Answer each question in 2–4 sentences. Precise technical language is expected. Code snippets are welcome where relevant.
Explain how a random forest makes a prediction for a new data point. What role does each individual tree play and how are their outputs combined?
What is the out-of-bag (OOB) error in a random forest? Why does it provide a useful estimate of generalisation performance without needing a separate validation set?
Explain how feature importance is computed in a random forest. How could you use feature importances practically to improve a model or understand your data?
Explain the bias-variance tradeoff. Where does a fully-grown single decision tree sit on this spectrum, and where does a random forest sit? Why?
Compare and contrast decision trees and random forests across three dimensions: interpretability, variance, and computational cost. When would you choose a single decision tree over a random forest?
Complete all 30 questions, then click Submit. Your MCQ score (out of 25) will be shown. Short answers are marked separately.