I've been following an online tutorial about speech analysis with deep learning, and it keeps giving me a NameError. I'm quite new to Python, so I'm not sure how to define the name. As far as I know, train_test_split is the standard function for splitting data, and it is imported.
Here is the code:
'''
import numpy as np
import pandas as pd 
import os
import seaborn as sns
import matplotlib.pyplot as plt
import seaborn as sns
plt.style.use('fivethirtyeight')
from tqdm import tqdm
print(os.listdir("../input"))
from keras import Sequential
from keras import optimizers
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential,Model
from keras.layers import LSTM, Dense, Bidirectional, Input,Dropout,BatchNormalization,CuDNNLSTM, GRU, CuDNNGRU, Embedding, GlobalMaxPooling1D, GlobalAveragePooling1D, Flatten
from keras import backend as K
from keras.engine.topology import Layer
from keras import initializers, regularizers, constraints
from sklearn.model_selection import KFold, cross_val_score, train_test_split
train = pd.read_json('C:/Users/User/Downloads/dont-call-me-turkey/train.json')
display(train.shape)
train.head()
train_train, train_val = train_test_split(train, random_state = 42)
xtrain = [k for k in train_train['audio_embedding']]
ytrain = train_train['is_turkey'].values
xval = [k for k in train_val['audio_embedding']]
yval = train_val['is_turkey'].values
'''
It gave this error:
NameError                                 Traceback (most recent call last)
<ipython-input-19-1e07851e6519> in <module>
----> 1 train_train, train_val = train_test_split(train, random_state = 42)
      2 xtrain = [k for k in train_train['audio_embedding']]
      3 ytrain = train_train['is_turkey'].values
      4 xval = [k for k in train_val['audio_embedding']]
      5 yval = train_val['is_turkey'].values
NameError: name 'train_test_split' is not defined
The train_test_split() function is used to split data into train and test sets. First, divide the data into features (X) and labels (y). The function then returns X_train, X_test, y_train, and y_test. The X_train and y_train sets are used for training and fitting the model, and the X_test and y_test sets are held back for evaluating it.
The scikit-learn Python machine learning library provides an implementation of the train-test split evaluation procedure via the train_test_split() function. The function takes a loaded dataset as input and returns the dataset split into two subsets.
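As a rough illustration of the features/labels split described above, here is a minimal sketch. The arrays `X` and `y` are made-up toy data, not the tutorial's dataset, and the `test_size` and `random_state` values are arbitrary choices.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (an assumption for illustration): 10 samples with 2 features each.
X = np.arange(20).reshape(10, 2)
# Labels derived from the first feature so row alignment can be checked later.
y = (X[:, 0] // 2) % 2

# Passing X and y together keeps each feature row paired with its label.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)
print(y_train.shape, y_test.shape)
```

Because both arrays are shuffled with the same permutation, each row of `X_train` still lines up with the matching entry in `y_train`.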
test_size: float or int, default=None. If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples. If None, the value is set to the complement of the train size.
Use random_state to make a repeatable split. train_test_split allocates rows of data to the outputs randomly, so every time you run it with the default settings, the output data will contain a different random selection of observations from the input data. Passing a fixed integer as random_state makes the split reproducible.
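The effect of test_size and random_state can be sketched on a small DataFrame. The column names below mirror the ones in the question, but the rows are invented placeholders, so treat this as an illustration under those assumptions and not as the tutorial's data.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Placeholder frame (assumed data): 20 rows with the question's column names.
df = pd.DataFrame({
    "audio_embedding": [[i, i + 1] for i in range(20)],
    "is_turkey": [i % 2 for i in range(20)],
})

# A float test_size is a proportion of the rows.
train_a, val_a = train_test_split(df, test_size=0.25, random_state=42)
# The same random_state reproduces the same split.
train_b, val_b = train_test_split(df, test_size=0.25, random_state=42)
# An int test_size is an absolute number of rows.
train_c, val_c = train_test_split(df, test_size=4, random_state=42)

same_split = list(val_a.index) == list(val_b.index)
print(len(train_a), len(val_a), same_split)
print(len(train_c), len(val_c))
```

Leaving random_state unset would still give valid splits, only with different rows on each run.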
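You may not have scikit-learn installed. Install it with:
pip install scikit-learn
If it is already installed, check that the cell containing the imports ran without errors before this one, and import the function from model_selection (the old sklearn.cross_validation module was removed in scikit-learn 0.20):
from sklearn.model_selection import train_test_split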
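One possible cause, which the question does not confirm, is that an earlier import in the same cell raised an error, so the sklearn import on a later line never ran. The sketch below simulates that situation with only the standard library. The module name `a_module_that_does_not_exist_xyz` is a deliberately fake placeholder, and `exec` on a dict stands in for a notebook cell and its namespace.

```python
# Simulated "import cell": the first import fails, so the second never runs.
cell_source = (
    "import a_module_that_does_not_exist_xyz\n"  # hypothetical failing import
    "from math import sqrt\n"                    # never reached
)

namespace = {}       # stands in for the notebook's global namespace
import_error = None
later_error = None

try:
    exec(cell_source, namespace)
except ImportError as exc:
    import_error = exc
    print("import cell failed:", type(exc).__name__)

# The name from the later import was never bound.
print("sqrt defined:", "sqrt" in namespace)

# Simulated "later cell": using the name now raises NameError.
try:
    eval("sqrt(4)", namespace)
except NameError as exc:
    later_error = str(exc)
    print("later cell failed:", exc)
```

If this matches your situation, scroll up to the import cell, fix or remove the import that fails, and re-run it before running the split.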