Changeset 87
- Timestamp: Fri Sep 16 10:53:56 2005
- Files: trunk/doc/source/Tutorial.rst (modified)
trunk/doc/source/Tutorial.rst
--- r86
+++ r87
…
 *************
 
-ImportTracker can be called in two ways: analyze_one(name, importername=None)
-or analyze_r(name, importername=None). The second method does what modulefinder
+``ImportTracker`` can be called in two ways: ``analyze_one(name, importername=None)``
+or ``analyze_r(name, importername=None)``. The second method does what modulefinder
 does - it recursively finds all the module names that importing name would
-cause to appear in sys.modules. The first method is non-recursive. This is
+cause to appear in ``sys.modules``. The first method is non-recursive. This is
 useful, because it is the only way of answering the question "Who imports
 name?" But since it is somewhat unrealistic (very few real imports do not
…
 |GOBACK|
 
-analyze_one()
-*************
+``analyze_one()``
+*****************
 
 When a name is imported, there are structural and dynamic effects. The dynamic
…
 
 The analyze_one method determines the structural effects, and defers the
-dynamic effects. For example, analyze_one("B.C", "A") could return ["B", "B.C"]
-or ["A.B", "A.B.C"] depending on whether the import turns out to be relative or
+dynamic effects. For example, ``analyze_one("B.C", "A")`` could return ``["B", "B.C"]``
+or ``["A.B", "A.B.C"]`` depending on whether the import turns out to be relative or
 absolute. In addition, ImportTracker's modules dict will have Module instances
 for them.
…
 or absolute names - we don't know until they have been analyzed).
 
-The highly astute will notice that there is a hole in analyze_one() here. The
-first thing that happens when B.C is being imported is that B is imported and
+The highly astute will notice that there is a hole in ``analyze_one()`` here. The
+first thing that happens when ``B.C`` is being imported is that ``B`` is imported and
 it's top-level code executed. That top-level code can do various things so that
-when the import of B.C finally occurs, something completely different happens
+when the import of ``B.C`` finally occurs, something completely different happens
 (from what a structural analysis would predict). But mf can handle this through
 it's hooks mechanism.
…
 *************
 
-Like modulefinder, mf scans the byte code of a module, looking for imports. In
-addition, mf will pick out a module's __all__ attribute, if it is built as a
-list of constant names. This means that if a package declares an __all__ list
+Like modulefinder, ``mf`` scans the byte code of a module, looking for imports. In
+addition, ``mf`` will pick out a module's ``__all__`` attribute, if it is built as a
+list of constant names. This means that if a package declares an ``__all__`` list
 as a list of names, ImportTracker will track those names if asked to analyze
-package.*. The code scan also notes the occurance of __import__, exec and eval,
+``package.*``. The code scan also notes the occurance of ``__import__``, ``exec`` and ``eval``,
 and can issue warnings when they're found.
 
…
 
 In modulefinder, scanning the code takes the place of executing the code
-object. mf goes further and allows a module to be hooked (after it has been
+object. ``mf`` goes further and allows a module to be hooked (after it has been
 scanned, but before analyze_one is done with it). A hook is a module named
-hook-fullyqualifiedname in the hooks package. These modules should have one or
+``hook-fullyqualifiedname`` in the ``hooks`` package. These modules should have one or
 more of the following three global names defined:
 
-hiddenimports
+``hiddenimports``
     a list of modules names (relative or absolute) that the module imports in some untrackable way.
 
-attrs
-    a list of (name, value) pairs, (where value is normally meaningless).
+``attrs``
+    a list of ``(name, value)`` pairs (where value is normally meaningless).
 
-hook(mod)
-    a function taking a Module instance and returning a Module instance (so it can modify or replace).
+``hook(mod)``
+    a function taking a ``Module`` instance and returning a ``Module`` instance (so it can modify or replace).
 
 
-The first hook (hiddenimports) extends the list created by scanning the code.
-ExtensionModules, of course, don't get scanned, so this is the only way of
+The first hook (``hiddenimports``) extends the list created by scanning the code.
+``ExtensionModules``, of course, don't get scanned, so this is the only way of
 recording any imports they do.
 
-The second hook (attrs) exists mainly so that ImportTracker won't issue
+The second hook (``attrs``) exists mainly so that ImportTracker won't issue
 spurious warnings when the rightmost node in a dotted name turns out to be an
 attribute in a package module, instead of a missing submodule.
 
 The callable hook exists for things like dynamic modification of a package's
-__path__ or perverse situations, like xml.__init__ replacing itself in
-sys.modules with _xmlplus.__init__. (It takes nine hook modules to properly
+``__path__`` or perverse situations, like ``xml.__init__`` replacing itself in
+``sys.modules`` with ``_xmlplus.__init__``. (It takes nine hook modules to properly
 trace through PyXML-using code, and I can't believe that it's any easier for
-the poor programmer using that package). The hook(mod) (if it exists) is
+the poor programmer using that package). The ``hook(mod)`` (if it exists) is
 called before looking at the others - that way it can, for example, test
-sys.version and adjust what's in hiddenimports.
+``sys.version`` and adjust what's in ``hiddenimports``.
 
 |GOBACK|
…
 ********
 
-ImportTracker has a getwarnings() method that returns all the warnings
-accumulated by the instance, and by the Module instances in its modules dict.
-Generally, it is ImportTracker who will accumulate the warnings generated
-during the structural phase, and Modules that will get the warnings generated
+``ImportTracker`` has a ``getwarnings()`` method that returns all the warnings
+accumulated by the instance, and by the ``Module`` instances in its modules dict.
+Generally, it is ``ImportTracker`` who will accumulate the warnings generated
+during the structural phase, and ``Modules`` that will get the warnings generated
 during the code scan.
 
…
 ***************
 
-Once a full analysis (that is, an analyze_r) has been done, you can get a
-cross reference by using getxref(). This returns a list of tuples. Each tuple
-is (modulename, importers), where importers is a list of the (fully qualified)
-names of the modules importing modulename. Both the returned list and the
+Once a full analysis (that is, an ``analyze_r`` call) has been done, you can get a
+cross reference by using ``getxref()``. This returns a list of tuples. Each tuple
+is ``(modulename, importers)``, where importers is a list of the (fully qualified)
+names of the modules importing ``modulename``. Both the returned list and the
 importers list are sorted.
 
…
 .. _iu.py:
 
-iu.py: An *imputil* Replacement
+``iu.py``: An *imputil* Replacement
+-----------------------------------
 
-Module iu grows out of the pioneering work that Greg Stein did with imputil
-(actually, it includes some verbatim imputil code, but since Greg didn't
+Module ``iu`` grows out of the pioneering work that Greg Stein did with ``imputil``
+(actually, it includes some verbatim ``imputil`` code, but since Greg didn't
 copyright it, we won't mention it). Both modules can take over Python's
 builtin import and ease writing of at least certain kinds of import hooks.
…
 * more managable
 
-There is an ImportManager which provides the replacement for builtin import
+There is an ``ImportManager`` which provides the replacement for builtin import
 and hides all the semantic complexities of a Python import request from it's
-delegates. .
+delegates.
 
 |GOBACK|
 
-ImportManager
-*************
+``ImportManager``
+*****************
 
-ImportManager formalizes the concept of a metapath. This concept implicitly
+``ImportManager`` formalizes the concept of a metapath. This concept implicitly
 exists in native Python in that builtins and frozen modules are searched
-before sys.path, (on Windows there's also a search of the registry while on
+before ``sys.path``, (on Windows there's also a search of the registry while on
 Mac, resources may be searched). This metapath is a list populated with
-ImportDirector instances. There are ImportDirector subclasses for builtins,
+``ImportDirector`` instances. There are ``ImportDirector`` subclasses for builtins,
 frozen modules, (on Windows) modules found through the registry and a
-PathImportDirector for handling sys.path. For a top-level import (that is, not
-an import of a module in a package), ImportManager tries each director on it's
+``PathImportDirector`` for handling ``sys.path``. For a top-level import (that is, not
+an import of a module in a package), ``ImportManager`` tries each director on it's
 metapath until one succeeds.
 
-ImportManager hides the semantic complexity of an import from the directors.
-It's up to the ImportManager to decide if an import is relative or absolute;
-to see if the module has already been imported; to keep sys.modules up to
+``ImportManager`` hides the semantic complexity of an import from the directors.
+It's up to the ``ImportManager`` to decide if an import is relative or absolute;
+to see if the module has already been imported; to keep ``sys.modules`` up to
 date; to handle the fromlist and return the correct module object.
 
 |GOBACK|
 
-ImportDirectors
-***************
+``ImportDirector``
+******************
 
-An ImportDirector just needs to respond to getmod(name) by returning a module
-object or None. As you will see, an ImportDirector can consider name to be
+An ``ImportDirector`` just needs to respond to ``getmod(name)`` by returning a module
+object or ``None``. As you will see, an ``ImportDirector`` can consider name to be
 atomic - it has no need to examine name to see if it is dotted.
 
-To see how this works, we need to examine the PathImportDirector.
+To see how this works, we need to examine the ``PathImportDirector``.
 
 |GOBACK|
 
-PathImportDirector
-******************
+``PathImportDirector``
+**********************
 
-The PathImportDirector subclass manages a list of names - most notably,
-sys.path. To do so, it maintains a shadowpath - a dictionary mapping the names
-on it's pathlist (eg, sys.path) to their associated Owners. (It could do this
+The ``PathImportDirector`` subclass manages a list of names - most notably,
+``sys.path``. To do so, it maintains a shadowpath - a dictionary mapping the names
+on its pathlist (eg, ``sys.path``) to their associated ``Owners``. (It could do this
 directly, but the assumption that sys.path is occupied solely by strings seems
-ineradicable.) Owners of the appropriate kind are created as needed (if all
-your imports are satisfied by the first two elements of sys.path, the
-PathImportDirector's shadowpath will only have two entries).
+ineradicable.) ``Owners`` of the appropriate kind are created as needed (if all
+your imports are satisfied by the first two elements of ``sys.path``, the
+``PathImportDirector``'s shadowpath will only have two entries).
 
 |GOBACK|
 
-Owners
-******
+``Owner``
+*********
 
-An Owner is much like an ImportDirector but manages a much more concrete piece
-of turf. For example, a DirOwner manages one directory. Since there are no
+An ``Owner`` is much like an ``ImportDirector`` but manages a much more concrete piece
+of turf. For example, a ``DirOwner`` manages one directory. Since there are no
 other officially recognized filesystem-like namespaces for importing, that's
-all that's included in iu, but it's easy to imagine Owners for zip files
-(and I have one for my own .pyz archive format) or even URLs.
+all that's included in iu, but it's easy to imagine ``Owners`` for zip files
+(and I have one for my own ``.pyz`` archive format) or even URLs.
 
-As with ImportDirectors, an Owner just needs to respond to getmod(name) by
-returning a module object or None, and it can consider name to be atomic.
+As with ``ImportDirectors``, an ``Owner`` just needs to respond to ``getmod(name)`` by
+returning a module object or ``None``, and it can consider name to be atomic.
 
-So structurally, we have a tree, rooted at the ImportManager. At the next
-level, we have a set of ImportDirectors. At least one of those directors, the
-PathImportDirector in charge of sys.path, has another level beneath it,
-consisting of Owners. This much of the tree covers the entire top-level import
+So structurally, we have a tree, rooted at the ``ImportManager``. At the next
+level, we have a set of ``ImportDirectors``. At least one of those directors, the
+``PathImportDirector`` in charge of ``sys.path``, has another level beneath it,
+consisting of ``Owners``. This much of the tree covers the entire top-level import
 namespace.
 
 The rest of the import namespace is covered by treelets, each rooted in a
-package module (an __init__.py).
+package module (an ``__init__.py``).
 
 |GOBACK|
…
 ********
 
-To make this work, Owners need to recognize when a module is a package. For a
-DirOwner, this means that name is a subdirectory which contains an __init__.py.
-The __init__ module is loaded and it's __path__ is initialized with the
-subdirectory. Then, a PathImportDirector is created to manage this __path__.
-Finally the new PathImportDirector's getmod is assigned to the package's
-__importsub__ function.
+To make this work, ``Owners`` need to recognize when a module is a package. For a
+``DirOwner``, this means that name is a subdirectory which contains an ``__init__.py``.
+The ``__init__`` module is loaded and its ``__path__`` is initialized with the
+subdirectory. Then, a ``PathImportDirector`` is created to manage this ``__path__``.
+Finally the new ``PathImportDirector``'s ``getmod`` is assigned to the package's
+``__importsub__`` function.
 
 When a module within the package is imported, the request is routed (by the
-ImportManager) diretly to the package's __importsub__. In a hierarchical
-namespace (like a filesystem), this means that __importsub__ (which is really
-the bound getmod method of a PathImportDirector instance) needs only the
+``ImportManager``) diretly to the package's ``__importsub__``. In a hierarchical
+namespace (like a filesystem), this means that ``__importsub__`` (which is really
+the bound getmod method of a ``PathImportDirector`` instance) needs only the
 module name, not the package name or the fully qualified name. And that's
 exactly what it gets. (In a flat namespace - like most archives - it is
 perfectly easy to route the request back up the package tree to the archive
-Owner, qualifying the name at each step.)
+``Owner``, qualifying the name at each step.)
 
 |GOBACK|
…
 *************
 
-Let's say we want to import from .zip files. So, we subclass Owner. The
-__init__ method should take a filename, and raise a ValueError if the file is
-not an acceptable .zip file, (when a new name is encountered on sys.path or a
-package's __path__, registered Owners are tried until one accepts the name).
-The getmod method would check the .zip file's contents and return None if the
+Let's say we want to import from zip files. So, we subclass ``Owner``. The
+``__init__`` method should take a filename, and raise a ``ValueError`` if the file is
+not an acceptable ``.zip`` file, (when a new name is encountered on ``sys.path`` or a
+package's ``__path__``, registered Owners are tried until one accepts the name).
+The ``getmod`` method would check the zip file's contents and return ``None`` if the
 name is not found. Otherwise, it would extract the marshalled code object from
-the .zip, create a new module object and perform a bit of initialization (12
+the zip, create a new module object and perform a bit of initialization (12
 lines of code all told for my own archive format, including initializing a pack
-age with it's __subimporter__).
+age with it's ``__subimporter__``).
 
-Once the new Owner class is registered with iu4, you can put a .zip file on
-sys.path. A package could even put a .zip file on it's __path__.
+Once the new ``Owner`` class is registered with ``iu``, you can put a zip file on
+``sys.path``. A package could even put a zip file on its ``__path__``.
 
 |GOBACK|
…
 
 This code has been tested with the PyXML, mxBase and Win32 packages, covering
-over a dozen import hacks from manipulations of __path__ to replacing a module
-in sys.modules with a different one. Emulation of Python's native import is
-nearly exact, including the names recorded in sys.modules and module attributes
-(packages imported through iu have an extra attribute - __importsub__).
+over a dozen import hacks from manipulations of ``__path__`` to replacing a module
+in ``sys.modules`` with a different one. Emulation of Python's native import is
+nearly exact, including the names recorded in ``sys.modules`` and module attributes
+(packages imported through ``iu`` have an extra attribute - ``__importsub__``).
 
 |GOBACK|
…
 ***********
 
-In most cases, iu is slower than builtin import (by 15 to 20%) but faster than
-imputil (by 15 to 20%). By inserting archives at the front of sys.path
+In most cases, ``iu`` is slower than builtin import (by 15 to 20%) but faster than
+``imputil`` (by 15 to 20%). By inserting archives at the front of ``sys.path``
 containing the standard lib and the package being tested, this can be reduced
 to 5 to 10% slower (or, on my 1.52 box, 10% faster!) than builtin import. A bit
-more can be shaved off by manipulating the ImportManager's metapath.
+more can be shaved off by manipulating the ``ImportManager``'s metapath.
 
 |GOBACK|
…
 
 Quite simply, I think cross-domain import hacks are a very bad idea. As author
-of the original package in which |PyInstaller| is based, McMillan worked with
+of the original package on which |PyInstaller| is based, McMillan worked with
 import hacks for many years. Many of them are highly fragile; they often rely
 on undocumented (maybe even accidental) features of implementation.
 A cross-domain import hack is not likely to work with PyXML, for example.
 
-That rant aside, you can modify ImportManger to implement different policies.
+That rant aside, you can modify ``ImportManger`` to implement different policies.
 For example, a version that implements three import primitives: absolute
 import, relative import and recursive-relative import. No idea what the Python
-sy tax for those should be, but __aimport__, __rimport__ and __rrimport__ were
+syntax for those should be, but ``__aimport__``, ``__rimport__`` and ``__rrimport__`` were
 easy to implement.
 
…
 *****
 
-Here's a simple example of using iu as a builtin import replacement.
+Here's a simple example of using ``iu`` as a builtin import replacement.
 
 >>> import iu
