Hello friends, in this post we are going to discuss the Structured Data Classification question and answer dumps for TCS Fresco Play (TFactor, TRA, PRA).
31.Pruning is a technique associated with __
Answer : Decision tree
32.What does the command sentiment_analysis_data['label'].value_counts() return?
Answer : Counts of unique values in the 'label' column
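As a quick illustration of the answer above, here is a minimal sketch. The small DataFrame is a hypothetical stand-in for the quiz's sentiment_analysis_data, which is not shown in this post.

```python
import pandas as pd

# Hypothetical stand-in for the quiz's sentiment_analysis_data frame.
sentiment_analysis_data = pd.DataFrame(
    {"label": ["pos", "neg", "pos", "pos", "neg", "neutral"]}
)

# value_counts() returns a Series indexed by the unique labels,
# sorted by count in descending order.
label_counts = sentiment_analysis_data["label"].value_counts()
print(label_counts)
```

The same output is also a quick first check for class imbalance, since it shows how many rows each class has.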
33.Select the pre-processing technique(s) from the following.
Answer : All
34.Which of the given hyperparameters, when increased, may cause a random forest to overfit the data?
Answer : Depth of tree
35.Select the correct statement about Nonlinear classification.
Answer : Kernel tricks are used by Nonlinear classifiers to achieve maximum-margin hyperplanes.
36.Choose the correct sequence for classifier building from the following.
Answer : Initialize -> Train -> Predict -> Evaluate
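The four steps in the answer above can be sketched with sklearn as follows. The choice of a decision tree, the bundled iris data, and the split parameters are assumptions made for illustration; the quiz does not prescribe them.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Bundled iris data, so nothing needs to be downloaded.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

clf = DecisionTreeClassifier(max_depth=3, random_state=0)  # Initialize
clf.fit(X_train, y_train)                                   # Train
y_pred = clf.predict(X_test)                                # Predict
accuracy = accuracy_score(y_test, y_pred)                   # Evaluate
print(f"test accuracy: {accuracy:.2f}")
```

Most sklearn classifiers follow the same fit/predict pattern, so swapping in another model usually changes only the Initialize line.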
37.What command should be given to tokenize a sentence into words?
Answer : from nltk.tokenize import word_tokenize; Word_tokens = word_tokenize(sentence)
38.Choose the correct sequence from the following.
Answer : Data Analysis -> PreProcessing -> Model Building -> Predict
39.The following are all classification techniques, except ______
Answer : StratifiedShuffleSplit
40.The commonly used package for machine learning in Python is
Answer : sklearn
41.How many new columns does the following command return? Download the dataset from:
https://gist.githubusercontent.com/curran/a08a1080b88344b0c8a7/raw/d546eaee765268bf2f487608c537c05e22e4b221/iris.csv to answer the question.
iris_series = pd.get_dummies(iris['Species'])
Answer : 3
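To see where the 3 comes from, here is a small sketch of get_dummies. The five-row frame is a hypothetical sample rather than the downloaded CSV, but it contains the same three species names.

```python
import pandas as pd

# Small hypothetical sample; the real iris data has the same three species.
iris = pd.DataFrame(
    {"Species": ["setosa", "versicolor", "virginica", "setosa", "virginica"]}
)

# One indicator column is created per unique value in the column.
iris_series = pd.get_dummies(iris["Species"])
print(iris_series.columns.tolist())
print(iris_series.shape)
```

Each row has exactly one indicator set, so the number of new columns equals the number of unique species.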
43.Which type of cross validation is used for imbalanced dataset?
Answer : Stratified K fold
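To see how the way folds are built interacts with class imbalance, the sketch below compares sklearn's KFold and StratifiedKFold on a hypothetical label array with 90 negatives and 10 positives. The data and the choice of 5 splits are assumptions for illustration only.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# Hypothetical imbalanced labels: 90 negatives followed by 10 positives.
y = np.array([0] * 90 + [1] * 10)
X = np.zeros((len(y), 1))  # features are irrelevant for the split itself


def positives_per_fold(splitter):
    """Count minority-class samples in each test fold."""
    return [int(y[test_idx].sum()) for _, test_idx in splitter.split(X, y)]


plain = positives_per_fold(KFold(n_splits=5))
stratified = positives_per_fold(StratifiedKFold(n_splits=5))
print("KFold:          ", plain)
print("StratifiedKFold:", stratified)
```

Without shuffling, plain K fold puts all the positives in the last test fold here, while the stratified splitter keeps the 9:1 class ratio in every fold.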
44.To view the first 3 rows of the dataset, which of the following commands is used? Download the dataset from:
https://gist.githubusercontent.com/curran/a08a1080b88344b0c8a7/raw/d546eaee765268bf2f487608c537c05e22e4b221/iris.csv to answer the question.
Answer : iris.head(3)
45.Naive Bayes Algorithm is useful for :
Answer : In-depth analysis
46.A process used to identify data points that are simply unusual
Answer : Anomaly Detection
47.Is there a class imbalance problem in the given data set?
Answer : no
48.Which of the following is not a technique to process missing values?
Answer : One hot encoding
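A short sketch of why one hot encoding does not belong with the missing-value techniques: dropping and imputing remove the gaps, while encoding only changes how categories are represented. The frame and its column names are hypothetical.

```python
import pandas as pd

# Hypothetical frame with one missing value in each column.
df = pd.DataFrame(
    {
        "age": [25.0, None, 31.0, 40.0],
        "city": ["Pune", "Delhi", None, "Pune"],
    }
)

# Technique 1: drop every row that contains a missing value.
dropped = df.dropna()

# Technique 2: impute, here with the column mean and the most frequent value.
imputed = df.fillna({"age": df["age"].mean(), "city": df["city"].mode()[0]})

# One hot encoding turns categories into indicator columns; by default
# the row with the missing city simply gets all zeros.
encoded = pd.get_dummies(df["city"])
print(encoded)
```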
49.Images, documents are examples of
Answer : Unstructured Data
50.Email spam detection is an example of
Answer : Classification
51.Imagine you have just finished training a decision tree for spam classification and it is showing abnormally bad performance on both your training and test sets. Assume that your implementation has no bugs. What could be the reason for this problem?
Answer : All
52.True Negative is when the predicted instance and the actual instance are both positive.
Answer : False
53.A process used to identify unusual data points is ________
Answer : Anomaly Detection
54.Cross-validation causes over-fitting.
Answer : False
55.True Positive is when the predicted instance and the actual instance are both not negative.
Answer : True
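The true positive and true negative definitions from the questions above can be checked with a few lines of plain Python. The label lists are made up for illustration, with 1 meaning positive and 0 meaning negative.

```python
# Hypothetical binary labels: 1 = positive, 0 = negative.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

pairs = list(zip(actual, predicted))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # both positive
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # both negative
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # predicted positive, actually negative
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # predicted negative, actually positive

print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
```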
56.What kind of classification is our case study ‘Churn Analysis’?
Answer : Binary
57.Which command is used to identify the unique values of a column?
Answer : unique()
58.Which preprocessing technique is used to make the data Gaussian with zero mean and unit variance?
Answer : Standardisation
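Standardisation subtracts the mean and divides by the standard deviation. Below is a minimal sketch in plain Python on a made-up feature column; in practice a library scaler such as sklearn's StandardScaler would normally be used instead.

```python
from statistics import fmean, pstdev

# Hypothetical feature column.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = fmean(values)
std = pstdev(values)  # population standard deviation (divides by n)

# z = (x - mean) / std gives zero mean and unit variance.
standardised = [(v - mean) / std for v in values]
print(standardised)
```

Note that this rescaling changes only the location and scale of the values; it does not change the shape of their distribution.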
59.Cross-validation technique is used to evaluate a classifier by dividing the data set into training set to train the classifier and testing set to test the same.
Answer : True
60.Let’s assume you are solving a classification problem with a highly imbalanced class. The majority class is observed 99% of the time in the training data. Which of the following is true when your model has 99% accuracy after taking the predictions on test data?
Answer : For imbalanced class problems, the accuracy metric is not a good idea.
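The point of the answer above can be reproduced with a toy example: a "model" that always predicts the majority class reaches 99% accuracy while never finding the minority class. The labels below are hypothetical.

```python
# Hypothetical test set: 99 majority-class (0) samples and 1 minority-class (1) sample.
actual = [0] * 99 + [1]

# A "model" that ignores its input and always predicts the majority class.
predicted = [0] * len(actual)

accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)

# Recall on the minority class: found positives / all actual positives.
true_positives = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
recall = true_positives / sum(a == 1 for a in actual)

print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
```

Metrics such as recall, precision or F1 on the minority class are usually more informative here than accuracy alone.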
61.The cross-validation technique will provide accurate results when the training set and the testing set are from two different populations.
Answer : False