Difference between readlines and (read and split)

Files in Python
Caution about readlines vs. read and split
You would think that
lines = infile.readlines()
and
line = infile.read()
lines = line.split('\n')
would give the same result in the variable lines, that is, a list of strings
from the file, delimited by the newline characters.
You would be surprised!
• readlines() gives you a list of strings, each with a \n at the end
• Except: if the last line of the data file does not end with a newline
(you did not press Enter after it), the last string in the list will not
have a \n in it
• read() followed by split('\n') gives a list of strings, yes, but none of
them will have \n in them (remember that split removes the delimiter
from its results)
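A minimal sketch of the first surprise, assuming made-up sample data whose last line has no newline after it. io.StringIO stands in for a file opened in text mode so the example runs without a real data file; the variable names are illustrative only.

```python
import io

# Hypothetical sample data: three lines, no newline after the last one.
data = "alpha\nbeta\ngamma"

# io.StringIO stands in for an open text file (an assumption made so the
# example is self-contained).
with_readlines = io.StringIO(data).readlines()
with_split = io.StringIO(data).read().split('\n')

print(with_readlines)  # ['alpha\n', 'beta\n', 'gamma']
print(with_split)      # ['alpha', 'beta', 'gamma']
```

The two lists have the same number of entries here, but only the readlines() version keeps the \n characters, and its last string has none.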
And another surprise!
• If you did press Enter on the last line of the data file, readlines still
works properly. The last string in the list will have a \n character just
like all the others
• BUT the same file read with the read/split combination will have one
extra entry, an empty string at the end of the list
• This is something you need to be aware of while processing your data:
many programs crash because they assume, for example, that every
string will be the same length.
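A minimal sketch of the second surprise, again with made-up data and io.StringIO standing in for a real file. This time the last line does end with a newline. The str.splitlines() call is shown only as one possible way to avoid the extra empty string; it is not something the slides themselves prescribe.

```python
import io

# Hypothetical sample data: the final line ends with a newline.
data = "alpha\nbeta\ngamma\n"

with_readlines = io.StringIO(data).readlines()
with_split = io.StringIO(data).read().split('\n')
# One possible alternative: splitlines() drops the line endings and does
# not produce a trailing empty string.
with_splitlines = io.StringIO(data).read().splitlines()

print(with_readlines)   # ['alpha\n', 'beta\n', 'gamma\n']
print(with_split)       # ['alpha', 'beta', 'gamma', '']
print(with_splitlines)  # ['alpha', 'beta', 'gamma']
```

Note that the read/split list has four entries for a three-line file, so code that indexes into every string (for example, line[0]) would fail on the last one.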