chapter 5: Files and I/O
5.1. Reading and Writing Text Data
open a file
the 't' in mode means text.
f = open('1.txt', 'rt') #read
# f = open('1.txt', 'wt') #write
# f = open('1.txt', 'at') #append
# specify codec
f = open('1.txt', 'rt', encoding='latin-1') #read
f = open('1.txt', 'wt', encoding='latin-1') #write
#disable newline translation, by use the open(newline='') option
f = open('1.txt', 'rt', newline='') #read
# specify what to do when encountering decoding/encoding errors, by use open(errors='...') option
f = open('1.txt', 'rt', errors='replace') #replace the char that can't be decoded to a unicode char U+fffd(which is the unicode replacemenet char)
f = open('1.txt', 'rt', errors='ignore') #just ignore the char that can't be decoded
read whole content of a file as a string
with open('1.txt', 'rt') as f:
s = f.read()
print(s)
read/iterate each line of a file, by just treat the file object as a generator
with open('1.txt', 'rt') as f:
for line in f:
print(line, end='')
write str to a file, by file.write(text) method
with open('2.txt', 'wt') as f:
f.write('abced')
get system's default encoding
import sys
print(sys.getdefaultencoding())
5.2. Printing to a File, redirect stdout to a file, by use print(file=…) option
with open('2.txt', 'wt') as f:
print("aaaaa", file=f)
Question: how to redirect stdout to a file system widely.
5.3. Printing with a Different Separator or Line Ending, by use print(sep=…, end=…) options
print(1, 'abc')
print(1, 'abc', sep=', ', end='##')
print()
row = (45, 'Hello', 'List', 4)
print(row)
print(*row)
print(row, sep=', ')
print(*row, sep=', ')
pass a sequence/list object to a function as N parameters instead of one, by using *listname
row = (45, 'Hello', 'List', 4)
print(row)
print(*row)
print(row, sep=', ')
print(*row, sep=', ')
5.4. Reading and Writing Binary Data(such as image, sound files)
By saying binary data, it means that there will no encoding/decoding works during writing/reading process. Use mode such as 'rb', 'wb', 'ab'.
当作为binary data读取时, 与作为text data相比,没有自动的decode, encode过程。
with open('2.txt', 'wb') as f:
# f.write('aaabbb'.encode('latin-1'))
f.write(b'aaabbb')
what is text string and byte string in python
Each element in a text string is also a text string, Each element in a byte string is a int
s = 'Hello'
print(type(s), s, sep=', ')
for c in s:
print(type(c), c, sep=', ')
s = b'Hello'
print(type(s), s, sep=', ')
for c in s:
print(type(c), c, sep=', ')
5.5. Writing to a File That Doesn't Already Exist, by set mode of open(…) function to 'x'
If the file already exists, then don't write, and will raise a FileExistsError exception
with open('2.txt', 'xt') as f:
f.write('aaa bbb')
感觉这个根python的哲学有点类似,不事先做判断,而是用exception的方式。 具体的用法可能需要将它放在一个try catch里。
5.6. Performing I/O Operations on a String, by io.StringIO() or io.BytesIO()
a typecal application can be simulate a file when do unit testing.
5.7. Reading and Writing Compressed Datafiles, by use gzip.open(…), or bz2.open(…)
After open the file, other operations are just the same as normal file.
5.8. Iterating Over Fixed-Sized Records, by iter(callable, sentinel)
import functools
RECORD_SIZE = 2
with open('1.txt', 'rt') as f:
for r in iter(functools.partial(f.read, RECORD_SIZE), ''):
print(r, end='; ')
the functools.partial(func, *args, **kwargs) function: create a new callable from a given callable with some(partial) arguments fixed. Currying
from functools import partial
def max(a, b):
if a>b: return a
else: return b
mm = partial(max, 3)
print(mm(4))
print(mm(2))
print(mm())
写一个能够接收很多参数的函数,然后利用partial 来生成简易的使用接口。需要注意参数的顺序。
5.9. Reading Binary Data into a Mutable Buffer
5.10. Memory Mapping Binary Files, map a binary file to memory(byte array), my mmap.mmap(…) method
This is a general method to map file to memory, then you can random access the content of the file, such as by using slicing
After mapped, by change the value of the array will change the file's content. This is also a way for multiple intepreter comunication. Below is a general function that map a file to a byte array.
import os
import mmap
def memory_map(filename, access=mmap.ACCESS_WRITE):
size = os.path.getsize(filename)
fd = os.open(filename, os.O_RDWR)
return mmap.mmap(fd, size, access=access)
# below is application of the function
f = memory_map('1.txt')
print(f[2:8])
f[0:3] = b'EEF'