25 multiple-choice questions on tabular models, entity embeddings, preprocessing and the Titanic dataset, plus 5 short-answer questions.
Select the single best answer for each question. Each question is worth 1 mark.
Tabular data differs from image or text data primarily because:
In FastAI tabular learning, categorical variables are handled using:
An entity embedding for a categorical variable:
TabularDataLoaders.from_df(df, ...) requires you to specify:
Which of the following is a continuous variable?
FastAI automatically applies normalisation to continuous variables because:
The FillMissing processor in FastAI:
Categorify in FastAI:
Dropout during training works by:
layers=[200, 100] in tabular_learner creates a network with:
The most appropriate evaluation metric for binary classification on the Titanic dataset is:
In FastAI tabular, cat_names and cont_names specify:
Batch normalisation in a neural network:
A high feature importance score for a column in a tree-based model indicates:
The key advantage of entity embeddings over one-hot encoding for a column with 500 unique categories is:
valid_idx in TabularDataLoaders specifies:
Class imbalance in the Titanic dataset (more non-survivors than survivors) can cause:
The procs argument in TabularDataLoaders specifies:
learn.predict(row) for a tabular model takes a:
Which of the following is the correct FastAI call to train a tabular learner for 5 epochs?
In the Titanic dataset, Pclass (passenger class: 1, 2, 3) is best treated as:
K-fold cross-validation is used to:
What does tabular_learner create?
Adding more hidden layers to a tabular neural network:
Using Pclass and Sex as features to predict Titanic survival raises which ethical concern?
Answer each question in 2–4 sentences. Precise technical language is expected. Code snippets are welcome where relevant.
Explain the difference between categorical and continuous variables in tabular data. How does FastAI handle each type differently in its preprocessing pipeline?
What is entity embedding for categorical variables? Why is it more powerful than one-hot encoding for a column with many unique values (e.g. zip code with 10,000 unique values)?
Describe the three main preprocessing steps in FastAI's tabular module: FillMissing, Categorify, and Normalize. What does each do and why is each needed?
What is dropout and how does it act as a regulariser? What would you expect to happen to training and validation accuracy if you set dropout to 0.9 (very high)?
Describe how you would build a FastAI tabular model to predict Titanic survival. Include data loading, feature selection, DataLoader creation, model training, and evaluation steps.
Complete all 30 questions, then click Submit. Your MCQ score (out of 25) will be shown. Short answers are marked separately.