How to stop matching file lines after the first match
I need to create a loop that:
1. Reads the contents of the files in a list, named in the format Hostname-YYMMDD.txt;
2. Matches specific content in a line of each text file;
3. Stops on the first match (ignoring duplicates);
4. Writes a specific portion of that line to an Excel worksheet.
So far I have failed at point 3.
import os
import re
import xlsxwriter

MyPath = "FileDirectory"  # e.g. "MyDocuments/Python"
MyHost = "Hostname"  # e.g. "Router1_Loc1"
Host_Probes = []

# Loop: populate Host_Probes
for root, dirs, files in os.walk(MyPath, topdown=False):
    for names in files:
        if MyHost in names:
            Host_Probes.append(names)

# List with the locations of all log files for the target host
Probe_Paths = [os.path.join(MyPath, s) for s in Host_Probes]

# Excel file and sheet:
workbook = xlsxwriter.Workbook('MyFile.xlsx')
worksheet = workbook.add_worksheet('Sheet1')
row = 2  # Row 3
col = 2  # Column C

# Here I "tell" Python to write the line that says "CPU utilization"
# for a given day, and then write the CPU utilization for the next day
# in the next column:
for s in Probe_Paths:
    with open(s) as Probe:
        for fileLine in Probe:
            if "Core0: CPU utilization" in fileLine:
                worksheet.write(row, col, int(re.sub('[^0-9]', '', fileLine)))
            elif "Core1: CPU utilization" in fileLine:
                worksheet.write(row + 1, col, int(re.sub('[^0-9]', '', fileLine)))
                col += 1

workbook.close()
The problem is that some files contain duplicate matching lines, so instead of being written once, the values are written twice to the worksheet.
After the first time I encounter lines containing "Core0: CPU utilization" and "Core1: CPU utilization", I can't get the loop to stop matching.
Is there a way for Python to write only the first match and then move on to the next entry in the list Probe_Paths?
I hope someone can point me in the right direction.
Solution
You can create a flag variable for each line you want to write, recording whether you have already seen it in the current file:
for s in Probe_Paths:
    with open(s) as Probe:
        seen = [0, 0]
        for fileLine in Probe:
            if "Core0: CPU utilization" in fileLine and not seen[0]:
                worksheet.write(row, col, int(re.sub('[^0-9]', '', fileLine)))
                seen[0] = 1
            elif "Core1: CPU utilization" in fileLine and not seen[1]:
                worksheet.write(row + 1, col, int(re.sub('[^0-9]', '', fileLine)))
                seen[1] = 1
                col += 1
            # have both, can stop looking in the file
            # will not increment col for skipped lines
            if all(seen):
                break
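
The same first-match idea can also be pulled into a small helper that is easy to test without any Excel output. The following is only a sketch under assumptions: the function name first_matches, the LABELS tuple and the sample line format ("Core0: CPU utilization: 37%") are hypothetical, since the question does not show the actual log lines. Note that re.sub('[^0-9]', '', fileLine) keeps every digit in the line, including the 0 or 1 in "Core0"/"Core1", so this sketch reads the first number after the label instead.

```python
import re

# Hypothetical labels; the real log format is not shown in the question.
LABELS = ("Core0: CPU utilization", "Core1: CPU utilization")


def first_matches(lines, labels=LABELS):
    """Return {label: value} using only the first line that contains each label."""
    found = {}
    for line in lines:
        for label in labels:
            # Skip labels already recorded and labels not present in this line
            if label in found or label not in line:
                continue
            # Assumption: the value is the first number after the label text
            match = re.search(r"\d+", line.split(label, 1)[1])
            if match:
                found[label] = int(match.group())
        if len(found) == len(labels):
            break  # every label has been seen once; stop reading
    return found


sample = [
    "uptime: 12 days",
    "Core0: CPU utilization: 37%",
    "Core1: CPU utilization: 5%",
    "Core0: CPU utilization: 99%",  # duplicate, never reached
]
result = first_matches(sample)
print(result)
```

Because an open file is itself an iterable of lines, the helper could be called as first_matches(Probe) inside the with block. The two values could then be written with worksheet.write and col incremented once per file, which keeps the column counter independent of the order in which the two lines appear.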