First off, make sure hawkey is installed on your system. The following import should succeed in a Python interpreter started from your terminal:
>>> import hawkey
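If a script should fail with a readable message when the bindings are missing, the import can be guarded. This is only a minimal sketch: the `HAVE_HAWKEY` flag and the `require_hawkey()` helper are hypothetical names chosen for illustration, not part of hawkey.

```python
# Guarded import: remember whether the bindings are available.
try:
    import hawkey
    HAVE_HAWKEY = True
except ImportError:
    hawkey = None
    HAVE_HAWKEY = False


def require_hawkey():
    """Return the hawkey module, or raise a readable error (hypothetical helper)."""
    if not HAVE_HAWKEY:
        raise RuntimeError("the hawkey Python bindings are not installed")
    return hawkey
```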
A Sack is an abstraction for a collection of packages. Sacks in hawkey are top-level objects carrying much of hawkey’s functionality. You’ll want to create one:
>>> sack = hawkey.Sack()
>>> len(sack)
0
Initially, the sack contains no packages.
hawkey is a library for listing, querying and resolving dependencies of packages from repositories. On most Linux distributions you always have at least the system repo (in Fedora it is the RPM database). To load it:
>>> sack.load_system_repo()
>>> len(sack)
1683
Hawkey always knows the name of every repository. Names of repositories loaded from Yum metadata are chosen by the client, and the system repository is always called @System.
Let’s be honest here: all the fun in packaging comes from packages you haven’t installed yet. Information about them, their metadata, can be obtained from different sources; typically it is downloaded from an HTTP mirror (other possibilities are an FTP server, an NFS mount, DVD distribution media, etc.). Hawkey does not provide any means to discover the metadata and obtain a local copy of it: it is up to the client to provide valid readable paths to the Yum metadata XML files. The structures used for passing this information to hawkey are hawkey Repos. Suppose we somehow obtained the metadata and placed it in /home/akozumpl/tmp/repodata. We can then load the metadata into hawkey:
>>> path = "/home/akozumpl/tmp/repodata/%s"
>>> repo = hawkey.Repo("experimental")
>>> repo.repomd_fn = path % "repomd.xml"
>>> repo.primary_fn = path % "f7753a2636cc89d70e8aaa1f3c08413ab78462ca9f48fd55daf6dedf9ab0d5db-primary.xml.gz"
>>> repo.filelists_fn = path % "0261e25e8411f4f5e930a70fa249b8afd5e86bb9087d7739b55be64b76d8a7f6-filelists.xml.gz"
>>> sack.load_yum_repo(repo, load_filelists=True)
>>> len(sack)
1685
The number of packages in the Sack will increase by the number of packages found in the repository (two in this case, it is an experimental repo after all).
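Since the client has to supply the hashed file names itself, it may help to read them out of repomd.xml rather than typing them by hand. The sketch below parses a small, made-up repomd.xml with the standard library and collects the location of each metadata type. The sample document, its `0000-`/`1111-` file names and the `metadata_locations()` helper are assumptions for illustration; only the repomd XML namespace and the `data`/`location` layout come from the Yum metadata format.

```python
import xml.etree.ElementTree as ET

# XML namespace used by Yum's repomd.xml.
REPO_NS = "{http://linux.duke.edu/metadata/repo}"

# Made-up sample; a real repomd.xml carries checksums, timestamps, etc.
SAMPLE_REPOMD = """<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary">
    <location href="repodata/0000-primary.xml.gz"/>
  </data>
  <data type="filelists">
    <location href="repodata/1111-filelists.xml.gz"/>
  </data>
</repomd>
"""


def metadata_locations(repomd_xml):
    """Map each metadata type (e.g. 'primary') to its location href."""
    root = ET.fromstring(repomd_xml)
    locations = {}
    for data in root.findall(REPO_NS + "data"):
        location = data.find(REPO_NS + "location")
        if location is not None:
            locations[data.get("type")] = location.get("href")
    return locations


locations = metadata_locations(SAMPLE_REPOMD)
print(locations["primary"])
print(locations["filelists"])
```

The hrefs are relative to the repository root, so a client would still need to join them with the directory holding its local copy before assigning them to repo.primary_fn and repo.filelists_fn.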
The load_filelists=True argument to load_yum_repo() above instructs hawkey to process the <hash>filelists.xml.gz file we passed in, which contains a structured list of absolute paths to all files of all packages within the repo. This information can be used for two purposes:
Some files provided by a package (e.g. those in /usr/bin) are always visible even without loading the filelists. Well-behaved packages requiring only those can thus be resolved directly. Unfortunately, there are packages that don’t behave and it is hard to tell in advance when you’ll deal with one.
The strategy for using load_filelists=True is thus:
Internally, hawkey uses Libsolv to hold the package information and perform canonical resolving. One great benefit of this library is that it can write and read metadata cache files in libsolv’s own binary format (typically files with the .solv extension). At a cost of a few hundred milliseconds, using the solv files reduces repo load times from seconds to tens of milliseconds. It is thus a good idea to write and use the solv files every time you plan to use the same repo for more than one Sack (which is at least every time your hawkey program is run). To do that, use build_cache=True with load_yum_repo() and load_system_repo():
>>> sack = hawkey.Sack(make_cache_dir=True)
>>> sack.load_system_repo(build_cache=True)
By default, Hawkey creates @System.cache under the /var/tmp/hawkey-<your_login>-<random_hash> directory. This is the hawkey cache directory, which you can always delete later (deleting the cache files in the process). The .solv files are picked up automatically the next time you try to create a hawkey sack. Except for a much higher speed of the operation this will be completely transparent to you:
>>> s2 = hawkey.Sack()
>>> s2.load_system_repo()
By the way, the cache directory also contains a logfile with some boring debugging information.
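To see which cache directories exist, or to find them before deleting them, one can glob for the default naming pattern mentioned above. This is a minimal sketch: `find_hawkey_cache_dirs()` is a hypothetical helper, and it assumes the default /var/tmp/hawkey-<your_login>-<random_hash> layout has not been changed.

```python
import getpass
import glob
import os


def find_hawkey_cache_dirs(base="/var/tmp", login=None):
    """Return directories named hawkey-<login>-* under base, sorted."""
    if login is None:
        login = getpass.getuser()  # current user's login name
    pattern = os.path.join(base, "hawkey-{0}-*".format(login))
    # Keep directories only; stray files matching the pattern are ignored.
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))
```

Calling `find_hawkey_cache_dirs()` with no arguments looks under /var/tmp for the current user; the `base` and `login` parameters exist mainly so the helper can be tried out against a scratch directory.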
Query is the means in hawkey of finding a package based on one or more criteria (name, version, repository of origin). Its interface is loosely based on Django’s QuerySets, the main concepts being:
For instance, let’s say I want to find all installed packages whose name ends with gtk:
>>> q = hawkey.Query(sack).filter(reponame=hawkey.SYSTEM_REPO_NAME, name__glob='*gtk')
>>> for pkg in q:
...     print(pkg)
...
NetworkManager-gtk-1:0.9.4.0-9.git20120521.fc17.x86_64
authconfig-gtk-6.2.1-1.fc17.x86_64
clutter-gtk-1.2.0-1.fc17.x86_64
libchamplain-gtk-0.12.2-1.fc17.x86_64
libreport-gtk-2.0.10-3.fc17.x86_64
pinentry-gtk-0.8.1-6.fc17.x86_64
python-slip-gtk-0.2.20-2.fc17.noarch
transmission-gtk-2.50-2.fc17.x86_64
usermode-gtk-1.109-1.fc17.x86_64
webkitgtk-1.8.1-2.fc17.x86_64
xdg-user-dirs-gtk-0.9-1.fc17.x86_64
Or I want to find the latest version of all python packages the Sack knows of:
>>> q.clear()
>>> q = q.filter(name='python', latest=True)
>>> for pkg in q:
...     print(pkg)
...
python-2.7.3-6.fc17.x86_64
You can also test a Query for its truth value. It will be true whenever the query matched at least one package:
>>> q = hawkey.Query(sack).filter(file='/boot/vmlinuz-3.3.4-5.fc17.x86_64')
>>> if q:
...     print('match')
...
match
>>> q = hawkey.Query(sack).filter(file='/booty/vmlinuz-3.3.4-5.fc17.x86_64')
>>> if q:
...     print('match')
...
>>> if not q:
...     print('no match')
...
no match
If the Query hasn’t been evaluated already, it is evaluated whenever its length is taken (either via len(q) or q.count()), when it is tested for truth, and when it is explicitly evaluated with q.run().
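The lazy-evaluation behaviour can be pictured with a small toy model. The class below is not hawkey's implementation and does not use hawkey at all: `ToyQuery`, its `evaluations` counter and the sample package dictionaries are all hypothetical. It only mimics the idea that filter() builds up criteria without doing any work, and that iterating, taking the length or testing for truth triggers a single evaluation.

```python
from fnmatch import fnmatchcase


class ToyQuery(object):
    """Toy stand-in for a lazily evaluated query (not hawkey's code)."""

    def __init__(self, packages, filters=()):
        self._packages = packages   # list of dicts such as {"name": "python"}
        self._filters = tuple(filters)
        self._result = None         # filled in on first evaluation
        self.evaluations = 0        # how many times filtering actually ran

    def filter(self, **kwargs):
        # Return a new, still unevaluated query with the extra criteria.
        return ToyQuery(self._packages, self._filters + tuple(kwargs.items()))

    def _matches(self, pkg):
        for key, value in self._filters:
            field, _, op = key.partition("__")
            if op == "glob":
                if not fnmatchcase(pkg[field], value):
                    return False
            elif pkg[field] != value:
                return False
        return True

    def run(self):
        # Evaluate once and cache the result for later calls.
        if self._result is None:
            self.evaluations += 1
            self._result = [p for p in self._packages if self._matches(p)]
        return self._result

    def __iter__(self):
        return iter(self.run())

    def __len__(self):
        return len(self.run())

    def __bool__(self):
        return bool(self.run())

    __nonzero__ = __bool__  # Python 2 spelling of __bool__


packages = [{"name": "clutter-gtk"}, {"name": "python"}, {"name": "webkitgtk"}]
q = ToyQuery(packages).filter(name__glob="*gtk")
print(q.evaluations)   # nothing has been filtered yet
print(len(q))          # taking the length evaluates the query
print(q.evaluations)
```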
Many Sack sessions culminate in a bout of dependency resolving, that is answering a question along the lines of “I have a package X in a repository here, what other packages do I need to install/update to have X installed and all its dependencies recursively satisfied?” Suppose we want to install the RTS game Spring. First let’s locate the latest version of the package in repositories:
>>> q = hawkey.Query(sack).filter(name='spring', latest=True)
>>> pkg = q[0]
>>> str(pkg)
'spring-88.0-2.fc17.x86_64'
>>> pkg.reponame
'fedora'
Then build the Goal object and tell it our goal is installing the pkg. Then we fire off libsolv’s dependency resolver by running the goal:
>>> g = hawkey.Goal(sack)
>>> g.install(pkg)
>>> g.run()
True
True as a return value here indicates that libsolv could find a solution to our goal. This is not always the case: there are plenty of situations with no solution, the most common one being that a package should be installed but one of its dependencies is missing from the sack.
The three methods Goal.list_installs(), Goal.list_upgrades() and Goal.list_erasures() can show which packages should be installed/upgraded/erased to satisfy the packaging goal we set out to achieve (the mapping of str() over the results below ensures human readable package names instead of numbers are presented):
>>> [str(p) for p in g.list_installs()]
['spring-88.0-2.fc17.x86_64', 'spring-installer-20090316-10.fc17.x86_64', 'springlobby-0.139-3.fc17.x86_64', 'spring-maps-default-0.1-8.fc17.noarch', 'wxBase-2.8.12-4.fc17.x86_64', 'wxGTK-2.8.12-4.fc17.x86_64', 'rb_libtorrent-0.15.9-1.fc17.x86_64', 'GeoIP-1.4.8-2.1.fc17.x86_64']
>>> [str(p) for p in g.list_upgrades()]
[]
>>> [str(p) for p in g.list_erasures()]
[]
So what does it tell us? That given the state of the given system and the given repository we used, 8 packages need to be installed, spring-88.0-2.fc17.x86_64 itself included. No packages need to be upgraded or erased.
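The run-then-inspect sequence above can be wrapped in a small helper that also covers the case where no solution exists. This is a sketch under stated assumptions: `summarize_goal()` is a hypothetical helper, and `StubGoal` is a stand-in that exists only so the example runs without hawkey. The helper relies on nothing beyond the four Goal methods shown in this tutorial (run(), list_installs(), list_upgrades(), list_erasures()).

```python
def summarize_goal(goal):
    """Run the goal; return the planned changes as strings, or None if unsolvable."""
    if not goal.run():
        return None
    return {
        "install": [str(p) for p in goal.list_installs()],
        "upgrade": [str(p) for p in goal.list_upgrades()],
        "erase": [str(p) for p in goal.list_erasures()],
    }


class StubGoal(object):
    """Stand-in with the same four methods, used only for demonstration."""

    def __init__(self, solvable, installs=()):
        self._solvable = solvable
        self._installs = list(installs)

    def run(self):
        return self._solvable

    def list_installs(self):
        return self._installs

    def list_upgrades(self):
        return []

    def list_erasures(self):
        return []


plan = summarize_goal(StubGoal(True, installs=["a-1.0", "b-2.0"]))
print(plan["install"])
print(summarize_goal(StubGoal(False)))
```

With real bindings one would presumably pass the `g` object built above instead of the stub; returning None for the unsolvable case keeps it distinguishable from a solvable goal that requires no changes.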
For certain simple and commonly used queries we can do installs directly. Instead of executing a query, however, we instantiate a Selector and pass it to the Goal.install() method:
>>> g = hawkey.Goal(sack)
>>> sltr = hawkey.Selector(sack).set(name='spring')
>>> g.install(select=sltr)
>>> g.run()
True
>>> [str(p) for p in g.list_installs()]
['spring-88.0-2.fc17.x86_64', 'spring-installer-20090316-10.fc17.x86_64', 'springlobby-0.139-3.fc17.x86_64', 'spring-maps-default-0.1-8.fc17.noarch', 'wxBase-2.8.12-4.fc17.x86_64', 'wxGTK-2.8.12-4.fc17.x86_64', 'rb_libtorrent-0.15.9-1.fc17.x86_64', 'GeoIP-1.4.8-2.1.fc17.x86_64']
>>> len(g.list_upgrades())
0
>>> len(g.list_erasures())
0
Notice we arrived at the same result as before, when a query was constructed and iterated first. When a Selector is passed to Goal.install(), hawkey examines its settings and, without evaluating it as a Query, instructs libsolv to find the best matching package for it and add that for installation. This saves the user some decisions, like which version should be installed or what architecture (this gets very relevant with multiarch libraries).
So Selectors usually install only a single package. If you mean to install all packages matching an arbitrarily complex query, just use the method described above and pass each matching package to the goal:
>>> for pkg in q:
...     g.install(pkg)
...