What’s New in Python 2.1

Author: A.M. Kuchling

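Introduction

This article explains the new features in Python 2.1. While there aren’t as many changes in 2.1 as there were in Python 2.0, there are still some pleasant surprises in store. 2.1 is the first release to be steered through the use of Python Enhancement Proposals, or PEPs, so most of the sizable changes have accompanying PEPs that provide more complete documentation and a design rationale for the change. This article doesn’t attempt to document the new features completely, but simply provides an overview of the new features for Python programmers. Refer to the Python 2.1 documentation, or to the specific PEP, for more details about any new feature that particularly interests you.

One recent goal of the Python development team has been to accelerate the pace of new releases, with a new release coming every 6 to 9 months. 2.1 is the first release to come out at this faster pace, with the first alpha appearing in January, 3 months after the final version of 2.0 was released.

The final release of Python 2.1 was made on April 17, 2001.

PEP 227: Nested Scopes

The largest change in Python 2.1 is to Python’s scoping rules. In Python 2.0, at any given time there are at most three namespaces used to look up variable names: local, module-level, and the built-in namespace. This often surprised people because it didn’t match their intuitive expectations. For example, a nested recursive function definition doesn’t work: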

def f():
    ...
    def g(value):
        ...
        return g(value-1) + 1
    ...

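The function g() will always raise a NameError exception, because the binding of the name g isn’t in either its local namespace or in the module-level namespace. This isn’t much of a problem in practice (how often do you recursively define interior functions like this?), but it also made using the lambda expression clumsier, and that was a problem in practice. In code which uses lambda you can often find local variables being copied by passing them as the default values of arguments: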

def find(self, name):
    "Return list of any entries equal to 'name'"
    L = filter(lambda x, name=name: x == name,
               self.list_attribute)
    return L

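The readability of Python code written in a strongly functional style suffers greatly as a result.

The most significant change to Python 2.1 is that static scoping has been added to the language to fix this problem. As a first effect, the name=name default argument is now unnecessary in the above example. Put simply, when a given variable name is not assigned a value within a function (by an assignment, or the def, class, or import statements), references to the variable will be looked up in the local namespace of the enclosing scope. A more detailed explanation of the rules, and a dissection of the implementation, can be found in the PEP.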
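
As a minimal sketch of what static scoping allows, the find() method can drop the name=name default, and a nested recursive function can refer to itself. The sketch is written for a current Python interpreter, where nested scopes are always on. The Directory class and make_counter() function are hypothetical names introduced only for this illustration:

```python
class Directory:
    """Hypothetical container used only to host find()."""

    def __init__(self, entries):
        self.list_attribute = list(entries)

    def find(self, name):
        "Return list of any entries equal to 'name'"
        # 'name' is looked up in find()'s local namespace, the scope
        # enclosing the lambda, so no name=name default is needed.
        return list(filter(lambda x: x == name, self.list_attribute))


def make_counter():
    # g is local to make_counter(); the recursive reference inside g
    # is resolved in that enclosing scope.
    def g(value):
        if value <= 0:
            return 0
        return g(value - 1) + 1
    return g
```

Under the old three-namespace rule, both the lambda body and the recursive call would have failed with a NameError.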

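This change may cause some compatibility problems for code where the same variable name is used both at the module level and as a local variable within a function that contains further function definitions. This seems rather unlikely though, since such code would have been pretty confusing to read in the first place.

One side effect of the change is that the from module import * and exec statements have been made illegal inside a function scope under certain conditions. The Python reference manual has said all along that from module import * is only legal at the top level of a module, but the CPython interpreter has never enforced this before. As part of the implementation of nested scopes, the compiler which turns Python source into bytecodes has to generate different code to access variables in a containing scope. from module import * and exec make it impossible for the compiler to figure this out, because they add names to the local namespace that are unknowable at compile time. Therefore, if a function contains function definitions or lambda expressions with free variables, the compiler will flag this by raising a SyntaxError exception.

To make the preceding explanation a bit clearer, here’s an example: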

x = 1
def f():
    # The next line is a syntax error
    exec 'x=2'
    def g():
        return x

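Line 4 containing the exec statement is a syntax error, since exec would define a new local variable named x whose value should be accessed by g().

This shouldn’t be much of a limitation, since exec is rarely used in most Python code (and when it is used, it’s often a sign of a poor design anyway).

Compatibility concerns have led to nested scopes being introduced gradually; in Python 2.1, they aren’t enabled by default, but can be turned on within a module by using a future statement as described in PEP 236. (See the following section for further discussion of PEP 236.) In Python 2.2, nested scopes will become the default and there will be no way to turn them off, but users will have had all of 2.1’s lifetime to fix any breakage resulting from their introduction.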

See also

PEP 227 - Statically Nested Scopes

Written and implemented by Jeremy Hylton.

PEP 236: __future__ Directives

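The reaction to nested scopes was widespread concern about the dangers of breaking code with the 2.1 release, and it was strong enough to make the Pythoneers take a more conservative approach. This approach consists of introducing a convention for enabling optional functionality in release N that will become compulsory in release N+1.

The syntax uses a from...import statement using the reserved module name __future__. Nested scopes can be enabled by the following statement: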

from __future__ import nested_scopes

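While it looks like a normal import statement, it’s not; there are strict rules on where such a future statement can be put. They can only be at the top of a module, and must precede any Python code or regular import statements. This is because such statements can affect how the Python bytecode compiler parses code and generates bytecode, so they must precede any statement that will result in bytecodes being produced.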
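
The placement rule can be checked with the built-in compile() function. This is a small sketch run on a current Python interpreter, where nested_scopes is still accepted as a future feature name even though it no longer changes anything. The accepts() helper and the two source strings are hypothetical names used only here:

```python
def accepts(source):
    """Return True if the source compiles, False on a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Future statement first, then a regular import.
at_top = "from __future__ import nested_scopes\nimport sys\n"

# Same two statements in the opposite order.
too_late = "import sys\nfrom __future__ import nested_scopes\n"

print(accepts(at_top))
print(accepts(too_late))
```

The first source compiles. The second is rejected at compile time because a regular import precedes the future statement.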

See also

PEP 236 - Back to the __future__

Written by Tim Peters, and primarily implemented by Jeremy Hylton.

PEP 207: Rich Comparisons

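In earlier versions, Python’s support for implementing comparisons on user-defined classes and extension types was quite simple. Classes could implement a __cmp__() method that was given two instances of a class, and could only return 0 if they were equal or +1 or -1 if they weren’t; the method couldn’t raise an exception or return anything other than a Boolean value. Users of Numeric Python often found this model too weak and restrictive, because in the number-crunching programs that Numeric Python is used for, it would be more useful to be able to perform elementwise comparisons of two matrices, returning a matrix containing the results of a given comparison for each element. If the two matrices are of different sizes, then the compare has to be able to raise an exception to signal the error.

In Python 2.1, rich comparisons were added in order to support this need. Python classes can now individually overload each of the <, <=, >, >=, ==, and != operations. The new magic method names are: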

Operation   Method name
<           __lt__()
<=          __le__()
>           __gt__()
>=          __ge__()
==          __eq__()
!=          __ne__()

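(The magic methods are named after the corresponding Fortran operators .LT., .LE., and so on. Numeric programmers are almost certainly quite familiar with these names and will find them easy to remember.)

Each of these magic methods is of the form method(self, other), where self will be the object on the left-hand side of the operator, while other will be the object on the right-hand side. For example, the expression A < B will cause A.__lt__(B) to be called.

Each of these magic methods can return anything at all: a Boolean, a matrix, a list, or any other Python object. Alternatively they can raise an exception if the comparison is impossible, inconsistent, or otherwise meaningless.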
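
As an illustration of the elementwise case described above, here is a minimal sketch in current Python syntax. The Vector class is hypothetical and stands in for a Numeric-style array; it returns a list of per-element results and raises ValueError (an arbitrary choice of exception) when the sizes differ:

```python
class Vector:
    """Hypothetical fixed-length sequence with elementwise comparisons."""

    def __init__(self, *values):
        self.values = list(values)

    def _pairs(self, other):
        # Signal a size mismatch with an exception, which __cmp__()
        # had no way of doing.
        if len(self.values) != len(other.values):
            raise ValueError("cannot compare vectors of different sizes")
        return zip(self.values, other.values)

    def __lt__(self, other):
        # self is the left-hand operand, other the right-hand one.
        return [a < b for a, b in self._pairs(other)]

    def __eq__(self, other):
        return [a == b for a, b in self._pairs(other)]


A = Vector(1, 5, 3)
B = Vector(2, 4, 3)
less = A < B       # calls A.__lt__(B)
equal = A == B     # calls A.__eq__(B)
```

Here less is a list with one Boolean per element rather than a single truth value, which is exactly what __cmp__() could not express.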

The built-in cmp(A,B) function can use the rich comparison machinery, and now accepts an optional argument specifying which comparison operation to use; this is given as one of the strings "<", "<=", ">", ">=", "==", or "!=". If called without the optional third argument, cmp() will only return -1, 0, or +1 as in previous versions of Python; otherwise it will call the appropriate method and can return any Python object.

There are also corresponding changes of interest to C programmers; there’s a new slot tp_richcmp in type objects and an API for performing a given rich comparison. I won’t cover the C API here, but will refer you to PEP 207, or to 2.1’s C API documentation, for the full list of related functions.

See also

PEP 207 - Rich Comparisons
Written by Guido van Rossum, heavily based on earlier work by David Ascher, and implemented by Guido van Rossum.

PEP 230: Warning Framework

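Over its 10 years of existence, Python has accumulated a certain number of obsolete modules and features along the way. It’s difficult to know when a feature is safe to remove, since there’s no way of knowing how much code uses it: perhaps no programs depend on the feature, or perhaps many do. To enable removing old features in a more structured way, a warning framework was added. When the Python developers want to get rid of a feature, it will first trigger a warning in the next version of Python. The following Python version can then drop the feature, and users will have had a full release cycle to remove uses of the old feature.

Python 2.1 adds the warning framework to be used in this scheme. It adds a warnings module that provides functions to issue warnings, and to filter out warnings that you don’t want to be displayed. Third-party modules can also use this framework to deprecate old features that they no longer wish to support.

For example, in Python 2.1 the regex module is deprecated, so importing it causes a warning to be printed: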

>>> import regex
__main__:1: DeprecationWarning: the regex module
         is deprecated; please use the re module
>>>

Warnings can be issued by calling the warnings.warn() function:

warnings.warn("feature X no longer supported")

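The first parameter is the warning message; additional optional parameters can be used to specify a particular warning category.

Filters can be added to disable certain warnings; a regular expression pattern can be applied to the message or to the module name in order to suppress a warning. For example, you may have a program that uses the regex module and not want to spare the time to convert it to use the re module right now. The warning can be suppressed by calling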

import warnings
warnings.filterwarnings(action = 'ignore',
                        message='.*regex module is deprecated',
                        category=DeprecationWarning,
                        module = '__main__')

This adds a filter that will apply only to warnings of the class DeprecationWarning triggered in the __main__ module, and applies a regular expression to only match the message about the regex module being deprecated, and will cause such warnings to be ignored. Warnings can also be printed only once, printed every time the offending code is executed, or turned into exceptions that will cause the program to stop (unless the exceptions are caught in the usual way, of course).
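
The different filter actions can be compared side by side. The sketch below targets a current Python interpreter: warnings.catch_warnings(), warnings.simplefilter(), and the stacklevel argument all belong to today’s warnings module, and are not claimed to match the 2.1 API exactly. The functions old_feature(), count_warnings(), and raises_as_error() are hypothetical helpers:

```python
import warnings


def old_feature():
    # stacklevel=2 attributes the warning to the caller of old_feature().
    warnings.warn("feature X no longer supported",
                  DeprecationWarning, stacklevel=2)
    return 42


def count_warnings(action):
    """Call old_feature() twice under one filter action and
    return how many warnings were actually emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action, DeprecationWarning)
        for _ in range(2):
            old_feature()
        return len(caught)


def raises_as_error():
    """Return True if the 'error' action turns the warning
    into an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        try:
            old_feature()
        except DeprecationWarning:
            return True
    return False
```

With the "ignore" action nothing is emitted, with "always" every call warns, and with "error" the warning becomes an exception that can be caught in the usual way.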

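Functions were also added to Python’s C API for issuing warnings; refer to PEP 230 or to Python’s API documentation for the details.

See also

PEP 5 - Guidelines for Language Evolution

Written by Paul Prescod, to specify procedures to be followed when removing old features from Python. The policy described in this PEP hasn’t been officially adopted, but the eventual policy probably won’t be too different from Prescod’s proposal.

PEP 230 - Warning Framework

Written and implemented by Guido van Rossum.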

PEP 229: New Build System

When compiling Python, the user had to go in and edit the Modules/Setup file in order to enable various additional modules; the default set is relatively small and limited to modules that compile on most Unix platforms. This means that on Unix platforms with many more features, most notably Linux, Python installations often don’t contain all useful modules they could.

Python 2.0 added the Distutils, a set of modules for distributing and installing extensions. In Python 2.1, the Distutils are used to compile much of the standard library of extension modules, autodetecting which ones are supported on the current machine. It’s hoped that this will make Python installations easier and more featureful.

Instead of having to edit the Modules/Setup file in order to enable modules, a setup.py script in the top directory of the Python source distribution is run at build time, and attempts to discover which modules can be enabled by examining the modules and header files on the system. If a module is configured in Modules/Setup, the setup.py script won’t attempt to compile that module and will defer to the Modules/Setup file’s contents. This provides a way to specify any strange command-line flags or libraries that are required for a specific platform.

In another far-reaching change to the build mechanism, Neil Schemenauer restructured things so Python now uses a single makefile that isn’t recursive, instead of makefiles in the top directory and in each of the Python/, Parser/, Objects/, and Modules/ subdirectories. This makes building Python faster and also makes hacking the Makefiles clearer and simpler.

See also

PEP 229 - Using Distutils to Build Python

Written and implemented by A.M. Kuchling.

PEP 205: Weak References

Weak references, available through the weakref module, are a minor but useful new data type in the Python programmer’s toolbox.

Storing a reference to an object (say, in a dictionary or a list) has the side effect of keeping that object alive forever. There are a few specific cases where this behaviour is undesirable, object caches being the most common one, and another being circular references in data structures such as trees.

For example, consider a memoizing function that caches the results of another function f(x) by storing the function’s argument and its result in a dictionary:

_cache = {}
def memoize(x):
    if _cache.has_key(x):
        return _cache[x]

    retval = f(x)

    # Cache the returned object
    _cache[x] = retval

    return retval

This version works for simple things such as integers, but it has a side effect; the _cache dictionary holds a reference to the return values, so they’ll never be deallocated until the Python process exits and cleans up. This isn’t very noticeable for integers, but if f() returns an object, or a data structure that takes up a lot of memory, this can be a problem.

Weak references provide a way to implement a cache that won’t keep objects alive beyond their time. If an object is only accessible through weak references, the object will be deallocated and the weak references will now indicate that the object it referred to no longer exists. A weak reference to an object obj is created by calling wr = weakref.ref(obj). The object being referred to is returned by calling the weak reference as if it were a function: wr(). It will return the referenced object, or None if the object no longer exists.
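
The life cycle of a weak reference can be seen in a few lines. This is a small sketch for a current Python interpreter; the Result class is a hypothetical stand-in for whatever large object f() might return (a user-defined class is used because not every built-in type can be weakly referenced):

```python
import gc
import weakref


class Result:
    """Hypothetical stand-in for a large object returned by f()."""


obj = Result()
wr = weakref.ref(obj)

# Calling the weak reference returns the object while it is alive.
still_alive = wr() is obj

del obj        # drop the only strong reference
gc.collect()   # not needed on CPython, harmless elsewhere

# Once the object is gone, the weak reference returns None.
now_dead = wr() is None
```

On CPython the object is reclaimed as soon as its last strong reference disappears; other implementations may reclaim it later, which is why the sketch triggers a collection explicitly.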

This makes it possible to write a memoize() function whose cache doesn’t keep objects alive, by storing weak references in the cache.

_cache = {}
def memoize(x):
    if _cache.has_key(x):
        obj = _cache[x]()
        # If weak reference object still exists,
        # return it
        if obj is not None: return obj

    retval = f(x)

    # Cache a weak reference
    _cache[x] = weakref.ref(retval)

    return retval

The weakref module also allows creating proxy objects which behave like weak references (an object referenced only by proxy objects is deallocated), but instead of requiring an explicit call to retrieve the object, the proxy transparently forwards all operations to the object as long as the object still exists. If the object is deallocated, attempting to use a proxy will cause a weakref.ReferenceError exception to be raised.

proxy = weakref.proxy(obj)
proxy.attr   # Equivalent to obj.attr
proxy.meth() # Equivalent to obj.meth()
del obj
proxy.attr   # raises weakref.ReferenceError

See also

PEP 205 - Weak References

Written and implemented by Fred L. Drake, Jr.

PEP 232: Function Attributes

In Python 2.1, functions can now have arbitrary information attached to them. People were often using docstrings to hold information about functions and methods, because the __doc__ attribute was the only way of attaching any information to a function. For example, in the Zope Web application server, functions are marked as safe for public access by having a docstring, and in John Aycock’s SPARK parsing framework, docstrings hold parts of the BNF grammar to be parsed. This overloading is unfortunate, since docstrings are really intended to hold a function’s documentation; for example, it means you can’t properly document functions intended for private use in Zope.

Arbitrary attributes can now be set and retrieved on functions using the regular Python syntax:

def f(): pass

f.publish = 1
f.secure = 1
f.grammar = "A ::= B (C D)*"

The dictionary containing attributes can be accessed as the function’s __dict__. Unlike the __dict__ attribute of class instances, in functions you can actually assign a new dictionary to __dict__, though the new value is restricted to a regular Python dictionary; you can’t be tricky and set it to a UserDict instance, or any other random object that behaves like a mapping.
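
The __dict__ behaviour described above can be tried directly. The following sketch is written for a current Python interpreter, where UserDict lives in the collections module; the exception raised for a non-dictionary value is TypeError there, which is not claimed to be what 2.1 raised:

```python
from collections import UserDict


def f():
    pass


f.publish = 1
f.grammar = "A ::= B (C D)*"

# The attributes live in the function's __dict__.
snapshot = dict(f.__dict__)

# Assigning a new dictionary replaces every attribute at once.
f.__dict__ = {"secure": 1}

# A mapping that is not a real dictionary is rejected.
try:
    f.__dict__ = UserDict()
except TypeError:
    rejected = True
else:
    rejected = False
```

After the reassignment, f.secure exists and f.publish is gone, since the old dictionary is no longer attached to the function.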

See also

PEP 232 - Function Attributes

Written and implemented by Barry Warsaw.

PEP 235: Importing Modules on Case-Insensitive Platforms

Some operating systems have filesystems that are case-insensitive, MacOS and Windows being the primary examples; on these systems, it’s impossible to distinguish the filenames FILE.PY and file.py, even though they do store the file’s name in its original case (they’re case-preserving, too).

In Python 2.1, the import statement will work to simulate case-sensitivity on case-insensitive platforms. Python will now search for the first case-sensitive match by default, raising an ImportError if no such file is found, so import file will not import a module named FILE.PY. Case-insensitive matching can be requested by setting the PYTHONCASEOK environment variable before starting the Python interpreter.

PEP 217: Interactive Display Hook

When using the Python interpreter interactively, the output of commands is displayed using the built-in repr() function. In Python 2.1, the variable sys.displayhook can be set to a callable object which will be called instead of repr(). For example, you can set it to a special pretty-printing function:

>>> # Create a recursive data structure
... L = [1,2,3]
>>> L.append(L)
>>> L # Show Python's default output
[1, 2, 3, [...]]
>>> # Use pprint.pprint() as the display function
... import sys, pprint
>>> sys.displayhook = pprint.pprint
>>> L
[1, 2, 3,  <Recursion on list with id=135143996>]
>>>
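
A display hook is just a callable that takes the value to show, so it can be written and exercised outside an interactive session. The sketch below is for a current Python interpreter; pretty_displayhook() and render() are hypothetical names, and unlike the default hook this one does not store the value in the _ variable:

```python
import contextlib
import io
import pprint


def pretty_displayhook(value):
    """Display hook that pretty-prints instead of printing repr(value)."""
    if value is None:   # the default hook also shows nothing for None
        return
    pprint.pprint(value)


def render(value):
    """Capture what the hook would print, for demonstration only."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        pretty_displayhook(value)
    return buffer.getvalue()


# Create a recursive data structure, as in the session above.
L = [1, 2, 3]
L.append(L)
text = render(L)

# In an interactive session the hook would be installed with:
#     import sys
#     sys.displayhook = pretty_displayhook
```

The captured text contains pprint’s recursion marker for the nested list, with an id that varies from run to run.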

See also

PEP 217 - Display Hook for Interactive Use

Written and implemented by Moshe Zadka.

PEP 208: New Coercion Model

How numeric coercion is done at the C level was significantly modified. This will only affect the authors of C extensions to Python, allowing them more flexibility in writing extension types that support numeric operations.

Extension types can now set the type flag Py_TPFLAGS_CHECKTYPES in their PyTypeObject structure to indicate that they support the new coercion model. In such extension types, the numeric slot functions can no longer assume that they’ll be passed two arguments of the same type; instead they may be passed two arguments of differing types, and can then perform their own internal coercion. If the slot function is passed a type it can’t handle, it can indicate the failure by returning a reference to the Py_NotImplemented singleton value. The numeric functions of the other type will then be tried, and perhaps they can handle the operation; if the other type also returns Py_NotImplemented, then a TypeError will be raised. Numeric methods written in Python can also return the same singleton, which is spelled NotImplemented in Python code, causing the interpreter to act as if the method did not exist (perhaps raising a TypeError, perhaps trying another object’s numeric methods).
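
At the Python level the singleton is spelled NotImplemented, and the fallback sequence can be observed without writing any C. This is a minimal sketch in current Python syntax; the Meters class is hypothetical:

```python
class Meters:
    """Hypothetical numeric type that does its own coercion."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        if isinstance(other, (int, float)):
            return Meters(self.value + other)
        # Unknown type: let the interpreter try the other operand.
        return NotImplemented

    # Addition is symmetric here, so the reflected method can reuse it.
    __radd__ = __add__


left = Meters(1) + 2     # handled by Meters.__add__
right = 2 + Meters(1)    # int gives up, then Meters.__radd__ is tried
```

When neither operand can handle the operation, as in Meters(1) + "x", the interpreter raises TypeError.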

See also

PEP 208 - Reworking the Coercion Model
Written and implemented by Neil Schemenauer, heavily based upon earlier work by Marc-André Lemburg. Read this to understand the fine points of how numeric operations will now be processed at the C level.

PEP 241: Metadata in Python Packages

A common complaint from Python users is that there’s no single catalog of all the Python modules in existence. T. Middleton’s Vaults of Parnassus at http://www.vex.net/parnassus/ are the largest catalog of Python modules, but registering software at the Vaults is optional, and many people don’t bother.

As a first small step toward fixing the problem, Python software packaged using the Distutils sdist command will include a file named PKG-INFO containing information about the package such as its name, version, and author (metadata, in cataloguing terminology). PEP 241 contains the full list of fields that can be present in the PKG-INFO file. As people begin to package their software using Python 2.1, more and more packages will include metadata, making it possible to build automated cataloguing systems and experiment with them. With the resulting experience, perhaps it’ll be possible to design a really good catalog and then build support for it into Python 2.2. For example, the Distutils sdist and bdist_* commands could support an upload option that would automatically upload your package to a catalog server.
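
To give a feel for the format, PKG-INFO is a set of RFC 822 style headers, so a mail-header parser can read it. The sketch below uses the email package from a current Python interpreter; the file contents are entirely made up for illustration, and only a few of the fields listed in PEP 241 are shown:

```python
from email.parser import Parser

# Hypothetical PKG-INFO contents; every value is invented.
PKG_INFO = """\
Metadata-Version: 1.0
Name: example-package
Version: 0.1
Summary: A short description of the package
Author: A. N. Author
Author-email: author@example.com
License: MIT
"""

# Parse the headers into a mapping-like message object.
metadata = Parser().parsestr(PKG_INFO)
name = metadata["Name"]
version = metadata["Version"]
```

A cataloguing tool along the lines suggested above could collect such fields from many packages without importing or running any of their code.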

You can start creating packages containing PKG-INFO even if you’re not using Python 2.1, since a new release of the Distutils will be made for users of earlier Python versions. Version 1.0.2 of the Distutils includes the changes described in PEP 241, as well as various bugfixes and enhancements. It will be available from the Distutils SIG at http://www.python.org/sigs/distutils-sig/.

See also

PEP 241 - Metadata for Python Software Packages

Written and implemented by A.M. Kuchling.

PEP 243 - Module Repository Upload Mechanism
Written by Sean Reifschneider, this draft PEP describes a proposed mechanism for uploading Python packages to a central server.

New and Improved Modules

  • Ka-Ping Yee contributed two new modules: inspect.py, a module for getting information about live Python code, and pydoc.py, a module for interactively converting docstrings to HTML or text. As a bonus, Tools/scripts/pydoc, which is now automatically installed, uses pydoc.py to display documentation given a Python module, package, or class name. For example, pydoc xml.dom displays the following:

    Python Library Documentation: package xml.dom in xml
    
    NAME
        xml.dom - W3C Document Object Model implementation for Python.
    
    FILE
        /usr/local/lib/python2.1/xml/dom/__init__.pyc
    
    DESCRIPTION
        The Python mapping of the Document Object Model is documented in the
        Python Library Reference in the section on the xml.dom package.
    
        This package contains the following modules:
          ...
    

    pydoc also includes a Tk-based interactive help browser. pydoc quickly becomes addictive; try it out!

  • Two different modules for unit testing were added to the standard library. The doctest module, contributed by Tim Peters, provides a testing framework based on running embedded examples in docstrings and comparing the results against the expected output. PyUnit, contributed by Steve Purcell, is a unit testing framework inspired by JUnit, which was in turn an adaptation of Kent Beck’s Smalltalk testing framework. See http://pyunit.sourceforge.net/ for more information about PyUnit.

  • The difflib module contains a class, SequenceMatcher, which compares two sequences and computes the changes required to transform one sequence into the other. For example, this module can be used to write a tool similar to the Unix diff program, and in fact the sample program Tools/scripts/ndiff.py demonstrates how to write such a script.

  • curses.panel, a wrapper for the panel library, part of ncurses and of SYSV curses, was contributed by Thomas Gellekum. The panel library provides windows with the additional feature of depth. Windows can be moved higher or lower in the depth ordering, and the panel library figures out where panels overlap and which sections are visible.

  • The PyXML package has gone through a few releases since Python 2.0, and Python 2.1 includes an updated version of the xml package. Some of the noteworthy changes include support for Expat 1.2 and later versions, the ability for Expat parsers to handle files in any encoding supported by Python, and various bugfixes for SAX, DOM, and the minidom module.

  • Ping also contributed another hook for handling uncaught exceptions. sys.excepthook() can be set to a callable object. When an exception isn’t caught by any try...except blocks, the exception will be passed to sys.excepthook(), which can then do whatever it likes. At the Ninth Python Conference, Ping demonstrated an application for this hook: printing an extended traceback that not only lists the stack frames, but also lists the function arguments and the local variables for each frame.

  • Various functions in the time module, such as asctime() and localtime(), require a floating point argument containing the time in seconds since the epoch. The most common use of these functions is to work with the current time, so the floating point argument has been made optional; when a value isn’t provided, the current time will be used. For example, log file entries usually need a string containing the current time; in Python 2.1, time.asctime() can be used, instead of the lengthier time.asctime(time.localtime(time.time())) that was previously required.

    This change was proposed and implemented by Thomas Wouters.

  • The ftplib module now defaults to retrieving files in passive mode, because passive mode is more likely to work from behind a firewall. This request came from the Debian bug tracking system, since other Debian packages use ftplib to retrieve files and then don’t work from behind a firewall. It’s deemed unlikely that this will cause problems for anyone, because Netscape defaults to passive mode and few people complain, but if passive mode is unsuitable for your application or network setup, call set_pasv(0) on FTP objects to disable passive mode.

  • Support for raw socket access has been added to the socket module, contributed by Grant Edwards.

  • The pstats module now contains a simple interactive statistics browser for displaying timing profiles for Python programs, invoked when the module is run as a script. Contributed by Eric S. Raymond.

  • A new implementation-dependent function, sys._getframe([depth]), has been added to return a given frame object from the current call stack. sys._getframe() returns the frame at the top of the call stack; if the optional integer argument depth is supplied, the function returns the frame that is depth calls below the top of the stack. For example, sys._getframe(1) returns the caller’s frame object.

    This function is only present in CPython, not in Jython or the .NET implementation. Use it for debugging, and resist the temptation to put it into production code.
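
A few of the module changes listed above can be tried in a handful of lines. This sketch targets a current CPython interpreter; get_opcodes() is a method of today’s SequenceMatcher, and caller_name() and outer() are hypothetical helpers:

```python
import difflib
import sys
import time

# difflib: compute the changes that turn one sequence into another.
matcher = difflib.SequenceMatcher(None, "abcd", "bcde")
operations = [tag for tag, i1, i2, j1, j2 in matcher.get_opcodes()]

# time: the argument is optional and defaults to the current time.
stamp = time.asctime()


# sys._getframe(1) is the frame of whoever called this function.
def caller_name():
    return sys._getframe(1).f_code.co_name


def outer():
    return caller_name()
```

The operations list reads as a tiny diff (delete the leading "a", keep "bcd", insert a trailing "e"), and outer() reports its own name because it is the caller of caller_name().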

Other Changes and Fixes

There were relatively few smaller changes made in Python 2.1 due to the shorter release cycle. A search through the CVS change logs turns up 117 patches applied, and 136 bugs fixed; both figures are likely to be underestimates. Some of the more notable changes are:

  • A specialized object allocator is now optionally available, that should be faster than the system malloc() and have less memory overhead. The allocator uses C’s malloc() function to get large pools of memory, and then fulfills smaller memory requests from these pools. It can be enabled by providing the --with-pymalloc option to the configure script; see Objects/obmalloc.c for the implementation details.

    Authors of C extension modules should test their code with the object allocator enabled, because some incorrect code may break, causing core dumps at runtime. There are a bunch of memory allocation functions in Python’s C API that have previously been just aliases for the C library’s malloc() and free(), meaning that if you accidentally called mismatched functions, the error wouldn’t be noticeable. When the object allocator is enabled, these functions aren’t aliases of malloc() and free() any more, and calling the wrong function to free memory will get you a core dump. For example, if memory was allocated using PyMem_New(), it has to be freed using PyMem_Del(), not free(). A few modules included with Python fell afoul of this and had to be fixed; doubtless there are more third-party modules that will have the same problem.

    The object allocator was contributed by Vladimir Marangozov.

  • The speed of line-oriented file I/O has been improved because people often complain about its lack of speed, and because it’s often been used as a naïve benchmark. The readline() method of file objects has therefore been rewritten to be much faster. The exact amount of the speedup will vary from platform to platform depending on how slow the C library’s getc() was, but is around 66%, and potentially much faster on some particular operating systems. Tim Peters did much of the benchmarking and coding for this change, motivated by a discussion in comp.lang.python.

    A new module and method for file objects was also added, contributed by Jeff Epler. The new method, xreadlines(), is similar to the existing xrange() built-in. xreadlines() returns an opaque sequence object that only supports being iterated over, reading a line on every iteration but not reading the entire file into memory as the existing readlines() method does. You’d use it like this:

    for line in sys.stdin.xreadlines():
        # ... do something for each line ...
        ...
    

    For a fuller discussion of the line I/O changes, see the python-dev summary for January 1-15, 2001 at http://www.python.org/dev/summary/2001-01-1/.

  • A new method, popitem(), was added to dictionaries to enable destructively iterating through the contents of a dictionary; this can be faster for large dictionaries because there’s no need to construct a list containing all the keys or values. D.popitem() removes a random (key, value) pair from the dictionary D and returns it as a 2-tuple. This was implemented mostly by Tim Peters and Guido van Rossum, after a suggestion and preliminary patch by Moshe Zadka.

  • Modules can now control which names are imported when from module import * is used, by defining an __all__ attribute containing a list of names that will be imported. One common complaint is that if the module imports other modules such as sys or string, from module import * will add them to the importing module’s namespace. To fix this, simply list the public names in __all__:

    # List public names
    __all__ = ['Database', 'open']
    

    A stricter version of this patch was first suggested and implemented by Ben Wolfson, but after some python-dev discussion, a weaker final version was checked in.

  • Applying repr() to strings previously used octal escapes for non-printable characters; for example, a newline was '\012'. This was a vestigial trace of Python’s C ancestry, but today octal is of very little practical use. Ka-Ping Yee suggested using hex escapes instead of octal ones, and using the \n, \t, \r escapes for the appropriate characters, and implemented this new formatting.

  • Syntax errors detected at compile-time can now raise exceptions containing the filename and line number of the error, a pleasant side effect of the compiler reorganization done by Jeremy Hylton.

  • C extensions which import other modules have been changed to use PyImport_ImportModule(), which means that they will use any import hooks that have been installed. This is also encouraged for third-party extensions that need to import some other module from C code.

  • The size of the Unicode character database was shrunk by another 340K thanks to Fredrik Lundh.

  • Some new ports were contributed: MacOS X (by Steven Majewski), Cygwin (by Jason Tishler); RISCOS (by Dietmar Schwertberger); Unixware 7 (by Billy G. Allie).
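
Three of the language-level changes in the list above are easy to demonstrate. The sketch is written for a current Python interpreter, where popitem() removes the most recently inserted pair rather than an arbitrary one; the dbdemo module is hypothetical and is built in memory so the example stays self-contained:

```python
import sys
import types

# popitem(): consume a dictionary without building a list of its keys.
D = {"a": 1, "b": 2, "c": 3}
seen = []
while D:
    seen.append(D.popitem())   # each item is a (key, value) 2-tuple

# __all__: only the listed names are exported by "from module import *".
source = """
import sys
__all__ = ['Database', 'open']

class Database:
    pass

def open(path):
    return Database()

def helper():
    pass
"""
module = types.ModuleType("dbdemo")
exec(source, module.__dict__)
sys.modules["dbdemo"] = module

namespace = {}
exec("from dbdemo import *", namespace)
exported = sorted(n for n in namespace if not n.startswith("__"))

# repr(): \n style and hex escapes instead of octal ones.
escaped = repr("\n\x01")
```

Neither sys nor helper leaks into the importing namespace, because neither is listed in __all__.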

And there’s the usual list of minor bugfixes, minor memory leaks, docstring edits, and other tweaks, too lengthy to be worth itemizing; see the CVS logs for the full details if you want them.

Acknowledgements

The author would like to thank the following people for offering suggestions on various drafts of this article: Graeme Cross, David Goodger, Jay Graves, Michael Hudson, Marc-André Lemburg, Fredrik Lundh, Neil Schemenauer, Thomas Wouters.