Python glob.glob – How do I find a specific file (or list of files) without knowing how deep a subdirectory is?

Right now I use subprocess to call find, which does the job pretty well, but I'm after the Python way of doing things.

Here is the current code:

cmd = "find /sys/devices/pci* | grep '/net/' |grep address"
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)

In the output I receive the following list:

[root@host1 ~]# find /sys/devices/pci* |grep '/net/'|grep 'address'
/sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/0000:05:00.0/0000:06:00.0/0000:07:00.0/0000:08:00.0/net/eth0/address
/sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/0000:05:00.0/0000:06:00.0/0000:07:01.0/0000:09:00.0/net/eth1/address
/sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/0000:05:00.0/0000:06:00.0/0000:07:02.0/0000:0a:00.0/net/rename4/address
/sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/0000:05:00.0/0000:06:00.0/0000:07:03.0/0000:0b:00.0/net/eth3/address
/sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/0000:05:00.0/0000:06:00.0/0000:07:04.0/0000:0c:00.0/net/eth4/address
/sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/0000:05:00.0/0000:06:00.0/0000:07:05.0/0000:0d:00.0/net/eth5/address
/sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/0000:05:00.0/0000:06:00.0/0000:07:06.0/0000:0e:00.0/net/eth6/address
/sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/0000:05:00.0/0000:06:00.0/0000:07:07.0/0000:0f:00.0/net/eth7/address
/sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/0000:05:00.0/0000:06:00.0/0000:07:08.0/0000:10:00.0/net/eth8/address
/sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/0000:05:00.0/0000:06:00.0/0000:07:09.0/0000:11:00.0/net/eth9/address
/sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/0000:05:00.0/0000:06:00.0/0000:07:0a.0/0000:12:00.0/net/eth10/address
/sys/devices/pci0000:00/0000:00:07.0/0000:04:00.0/0000:05:00.0/0000:06:00.0/0000:07:0b.0/0000:13:00.0/net/eth11/address

Now, if I execute glob.glob('/sys/devices/pci*/*/*/*//net/'), I get a list of directories in which I can then look for files, but it definitely takes longer than find, even when find is run through subprocess. The result set is also huge, and I can't know in advance whether every host will have the same directory structure, so I don't know how many asterisks to put in glob.glob().

My question is: how do I reproduce the behaviour of the simple find | grep command? Or is there a better way to find the MACs of all NICs owned by the host, whether active or not (I'm looking for a specific MAC pattern here)?

EDIT: glob isn't the right tool here; os.walk seems to do the job:

>>> import os
>>> for root, dirs, names in os.walk('/sys/devices/'):
...     if 'address' in names and 'pci' in root:
...         # the 'address' file holds the MAC on a single line
...         with open(os.path.join(root, 'address')) as f:
...             mac = f.readline().strip()
...         print(mac)
...         eth = root.split('/')[-1]  # last path component, e.g. eth0
...         print(eth)

Solution

Have you checked os.walk()?

import os
for root, dirs, names in os.walk(path):
    ...

http://docs.python.org/library/os.html#os.walk
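
Applied to the question, a minimal Python 3 sketch could look like the one below. The helper names `find_files` and `read_first_line` and the `must_contain` parameter are illustrative assumptions, not part of the os module; `must_contain` only stands in for the extra `grep '/net/'` in the shell pipeline. Since os.walk does not follow symbolic links by default, the walk should stay inside the tree it is given.

```python
import os


def find_files(top, filename, must_contain=""):
    """Yield paths of files named `filename` at any depth below `top`.

    `must_contain` is an optional substring that the containing directory
    path has to include (hypothetical parameter, for illustration only).
    """
    for root, dirs, names in os.walk(top):
        if filename in names and must_contain in root:
            yield os.path.join(root, filename)


def read_first_line(path):
    """Return the first line of a small text file, without the newline."""
    with open(path) as f:
        return f.readline().strip()


# Possible usage on a Linux host (not executed here):
#
#     for path in find_files("/sys/devices", "address", must_contain="/net/"):
#         iface = os.path.basename(os.path.dirname(path))
#         print(iface, read_first_line(path))
```

Starting the walk at /sys/devices rather than at the pci* directories may also pick up non-PCI interfaces, so a `'pci' in root` check like the one in the question's edit can be added if only PCI NICs are wanted.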

From the link above, here’s a way to skip certain directories:

import os
from os.path import join, getsize
for root, dirs, files in os.walk('python/Lib/email'):
    print(root, "consumes", end=" ")
    print(sum(getsize(join(root, name)) for name in files), end=" ")
    print("bytes in", len(files), "non-directory files")
    if 'CVS' in dirs:
        dirs.remove('CVS')  # don't visit CVS directories
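
The pruning works because os.walk, in its default top-down mode, looks at the same `dirs` list again after each iteration to decide where to descend next. The list therefore has to be changed in place; rebinding the name has no effect. A small sketch of that idea, with `walk_pruned` and `skip_names` as made-up names for illustration:

```python
import os


def walk_pruned(top, skip_names):
    """Yield (root, files) for each directory under `top`, without
    descending into directories whose name is in `skip_names`."""
    for root, dirs, files in os.walk(top):  # topdown=True is the default
        # Slice assignment mutates the list that os.walk holds on to;
        # a plain `dirs = [...]` would only rebind the local name.
        dirs[:] = [d for d in dirs if d not in skip_names]
        yield root, files
```

On a large tree, skipping subtrees that are known to be irrelevant this way may shorten the walk noticeably, although how much depends on the layout.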
