Not reading my dataset

df = pd.read_csv(r"C:\Users\HP\ml-100k\u.data", sep='\t')

It shows a “file not found” error, but I have already downloaded the file and it is present under C:\Users\HP.

I downloaded the zip file ml-100k.zip as directed in the first video of the movie recommendation project.

Try removing the r before the path

df = pd.read_csv("C:\Users\HP\Desktop\ml-100k\u.data", sep='\t')
^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

Now it is showing this error.

Hey @Roopa1i_ma1hotra, do one thing. Go to the u.data file on your laptop, right-click it, and check its location (the path where it is actually stored). Copy that location into the pd.read_csv() call. There must be an error in the path itself.
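
If it helps, here is a minimal sketch for checking from Python what a path points to before passing it to pd.read_csv(). The helper name check_path is made up for illustration, and the path at the bottom is just the one from the question, so replace it with your own.

```python
from pathlib import Path


def check_path(path_str):
    """Describe what path_str points to: 'file', 'directory', or 'missing'."""
    p = Path(path_str)
    if p.is_file():
        return "file"
    if p.is_dir():
        return "directory"
    return "missing"


# Path taken from the question above; it is an assumption that it exists on your machine.
print(check_path(r"C:\Users\HP\ml-100k\u.data"))
```

pd.read_csv() needs the result to be "file". If you get "directory", the path stops at a folder and the file name still has to be added; if you get "missing", some part of the path is wrong.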

There is no option to look up the location of the u.data file.

In the Properties option, maybe?

Thank you, sir. It’s done.

I hope you were able to understand the error!

Please mark the doubt as resolved in your doubts section.
Happy learning!

df = pd.read_csv('C:\Users\Ayush Bhardwaj\Downloads\ml-100k\ml-100k', sep="\t")
I am getting the same error. Please help.

The error message means Python found a truncated Unicode escape sequence in a string literal. In a normal (non-raw) string, \U starts an escape that must be followed by exactly eight hexadecimal digits. In "C:\Users\...", the \U is followed by "sers", which is not a valid hexadecimal number, so the literal cannot be decoded. “Position 2-3” refers to the zero-based positions of the backslash and the U in the string, right after "C:".
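
To see this without touching any files, here is a small sketch that compiles two one-line snippets and reports what happens. The helper name try_compile and the variable names are made up for illustration; the path is the one from the original question.

```python
# Source text of two assignments, held in raw strings so the backslashes
# reach compile() unchanged.
bad_source = r'path = "C:\Users\HP\ml-100k\u.data"'   # normal string literal
raw_source = r'path = r"C:\Users\HP\ml-100k\u.data"'  # raw string literal


def try_compile(source):
    """Return 'ok' if the source compiles, otherwise the SyntaxError message."""
    try:
        compile(source, "<example>", "exec")
        return "ok"
    except SyntaxError as exc:
        return exc.msg


bad_result = try_compile(bad_source)
raw_result = try_compile(raw_source)

print("normal string:", bad_result)
print("raw string:   ", raw_result)
```

The normal string fails while Python is still parsing the code, which is why the error appears even before pandas tries to open the file.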

One way to fix the “codec can’t decode bytes” error is to use a raw string: prefix the string literal with the letter “r”. Raw strings don’t interpret backslashes as escape characters.

path = r"C:\Users\User\Documents\file.txt"

Alternatively, double the backslashes in your string:

path = "C:\\Users\\User\\Documents\\file.txt"

Or use forward slashes instead of backslashes, which Windows also accepts:

path = "C:/Users/User/Documents/file.txt"
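
As a quick check that these spellings describe the same location, here is a small sketch. It uses PureWindowsPath from the standard library so that it runs on any operating system; the variable names and the example path are only for illustration.

```python
from pathlib import PureWindowsPath

raw = r"C:\Users\User\Documents\file.txt"            # raw string
doubled = "C:\\Users\\User\\Documents\\file.txt"     # doubled backslashes
forward = "C:/Users/User/Documents/file.txt"         # forward slashes

# The raw and doubled forms produce exactly the same string.
same_string = raw == doubled

# The forward-slash form is a different string but the same Windows path.
same_path = PureWindowsPath(forward) == PureWindowsPath(raw)

print(same_string, same_path)
```

Any of the three can be passed to pd.read_csv(); which one to use is mostly a matter of taste.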

Unicode escape sequences are interpreted in both single-quoted (’…’) and double-quoted ("…") strings, so switching the quote style does not avoid the error. Also note that in Python 3 strings are Unicode by default, so you can write non-ASCII characters directly in a string without escape sequences, as long as the source file is saved in the encoding Python expects (UTF-8 by default).