3
$\begingroup$

I need to read data from a CSV file, first split it into features and labels, and then split those into training and test sets. However, I keep running into the same problems. Below is the code I tried, which fails with this error:

ValueError: could not convert string to float: 'mon' on line Y: train_y}) 

The code for linear regression:

    import pandas as pd
    from sklearn.model_selection import train_test_split
    import tensorflow as tf
    import numpy as np
    import matplotlib.pyplot as plt  # needed for the plt.* calls at the end

    learning_rate = 0.01
    training_epochs = 1000
    display_step = 50

    data = pd.read_csv('forestfires.csv')
    y = data.temp
    x = data.drop('temp', axis=1)
    train_x, test_x, train_y, test_y = train_test_split(x, y, test_size=0.2)

    n_samples = train_x.shape[0]
    n_features = train_x.shape[1]

    X = tf.placeholder('float', [None, n_features])
    Y = tf.placeholder('float', [None, 1])

    # Model weights.
    W = tf.Variable(np.random.randn(n_features, 1), dtype='float32')
    b = tf.Variable(np.random.randn(1), dtype='float32')

    # Construct linear model.
    prediction = tf.matmul(X, W) + b
    loss = tf.reduce_sum(tf.pow(prediction - Y, 2))/(2 * n_samples)
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

    # Start training.
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for epoch in range(training_epochs):
            for (x, y) in zip(train_x, train_y):
                sess.run(optimizer, feed_dict={X: train_x, Y: train_y})
            # Display logs per epoch step.
            if (epoch + 1) % display_step == 0:
                c = sess.run(loss, feed_dict={X: train_x, Y: train_y})
                print('Epoch:', '%04d' % (epoch+1), 'cost=', '{:.9f}'.format(c),
                      'W=', sess.run(W), 'b=', sess.run(b))
        print('Training Done!')
        training_cost = sess.run(loss, feed_dict={X: train_x, Y: train_y})
        print('Training cost=', training_cost, 'W=', sess.run(W), 'b=', sess.run(b), '\n')
        # Graphic display.
        plt.plot(train_x, train_y, 'ro', label='Original data')
        plt.plot(train_x, sess.run(W) * train_x + sess.run(b), label='Fitted line')
        plt.legend()
        plt.show()

Could anyone help me read the data properly, in a reasonably general way? Snapshot of the data:

[Image: snapshot of the data]

$\endgroup$
1
  • $\begingroup$Add a snapshot of the data!$\endgroup$
    – Aditya
    Commented Aug 18, 2018 at 9:51

2 Answers

0
$\begingroup$

I don't know exactly what your data looks like, but y = data.temp may be a Series of strings, which would need to be cast to float. Try the following instead:

y = data.temp.astype(float)
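
Before casting, it may help to check which columns pandas did not read as numbers, and which of those can be cast at all. The following is only a sketch: the small DataFrame stands in for a few rows of the CSV, and its column names and values are assumptions, not taken from the actual file.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of the CSV; real columns may differ.
data = pd.DataFrame({
    "month": ["mar", "oct", "aug"],
    "day": ["fri", "tue", "mon"],
    "temp": ["8.2", "18.0", "14.6"],  # numbers that were read as strings
    "wind": [6.7, 0.9, 1.3],
})

# Columns that pandas did not parse as a numeric dtype.
non_numeric = data.select_dtypes(exclude="number").columns.tolist()

# Attempt a conversion; values that are not numbers become NaN.
converted = data[non_numeric].apply(pd.to_numeric, errors="coerce")

# All-NaN columns hold real text; fully converted columns are safe to cast.
text_columns = [c for c in non_numeric if converted[c].isna().all()]
castable = [c for c in non_numeric if converted[c].notna().all()]

print("text columns:", text_columns)
print("safe to cast:", castable)
```

Columns reported as safe to cast can go through astype(float) as above; columns reported as text would make astype(float) raise the same ValueError and need encoding instead.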

$\endgroup$
3
  • 1
    $\begingroup$Or maybe they are cats which need to be transformed..$\endgroup$
    – Aditya
    Commented Aug 18, 2018 at 9:51
  • 2
    $\begingroup$Where did you see cats?!$\endgroup$
    Commented Aug 18, 2018 at 10:10
  • 1
    $\begingroup$Cats -> Categories/Strings maybe.. Sorry for the shorthand..$\endgroup$
    – Aditya
    Commented Aug 18, 2018 at 17:21
0
$\begingroup$

So, the question is to understand this ValueError that you are getting.

I believe this error refers to your month column, which I presume you are using as a feature for this network. Since it is a categorical variable, you will need to convert it to a one-hot encoding (https://machinelearningmastery.com/why-one-hot-encode-data-in-machine-learning/), because the model cannot interpret strings; hence the ValueError. The same applies to any other string-valued column you feed in as a feature.
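
As a rough sketch of that encoding step, pandas' get_dummies can replace every string column with 0/1 indicator columns before the data is split and fed to the model. The small DataFrame is a made-up stand-in for the CSV (its column names and values are assumptions), and the reshape of the labels assumes the Y placeholder of shape [None, 1] from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a few rows of the CSV; real columns may differ.
data = pd.DataFrame({
    "month": ["mar", "oct", "aug", "mar"],
    "day": ["fri", "tue", "mon", "mon"],
    "wind": [6.7, 0.9, 1.3, 4.0],
    "temp": [8.2, 18.0, 14.6, 8.3],
})

y = data["temp"]
x = data.drop("temp", axis=1)

# Replace each string column with one 0/1 indicator column per category.
categorical = x.select_dtypes(exclude="number").columns.tolist()
x_encoded = pd.get_dummies(x, columns=categorical)

# Plain float32 arrays; labels reshaped to (n, 1) to match a [None, 1] placeholder.
features = x_encoded.to_numpy(dtype=np.float32)
labels = y.to_numpy(dtype=np.float32).reshape(-1, 1)

print(x_encoded.columns.tolist())
print(features.shape, labels.shape)
```

The resulting features and labels arrays can then be passed to train_test_split in place of x and y, so that everything reaching feed_dict is already numeric.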

$\endgroup$
