A beautifier for Bash shell scripts written in Python
P. Lutus
Copyright © 2011, P. Lutus
This is the second Bash script beautifier I have written — the first was written in Ruby and it's become pretty well-known. But since that time, for those tasks where it's appropriate, I have decided to program in Python instead of Ruby, and eventually I decided to rewrite the Bash beautifier and clean up some annoying inconsistencies in the process.
Beautifying Bash scripts is not trivial. Bash scripts aren't like C or Java programs — they have a lot of ambiguous syntax, and (shudder) keywords can be used as variables. Years ago, while testing the first version of this program, I encountered this example:
    done=3;echo done;done

Same name, but three distinct meanings (sigh). The Bash interpreter can sort out this perversity, but I decided not to try to recreate the Bash interpreter just to beautify a script. This means there will be some border cases this Python program won't be able to process. But in tests with many large Linux system Bash scripts, its error-free score was roughly 99%.
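The ambiguity is easy to reproduce with regular expressions alone. The sketch below is only an illustration: the variable names are made up, and the second expression is copied from the outdent test in the program listing. It shows that a plain word search sees three identical tokens, while the listing's expression accepts just one of them, and the one it accepts is the argument to `echo` rather than the loop terminator.

```python
import re

line = 'done=3;echo done;done'

# A plain word-boundary search cannot tell the three uses apart.
naive_hits = re.findall(r'\bdone\b', line)

# Outdent expression taken from the program listing: the keyword must be
# preceded by whitespace, a semicolon, or the start of the line, and followed
# by a separator or the end of the line.
OUTDENT_RE = r'(\s|\A|;)(esac|fi|done|elif)(;|\)|\||\Z|\s)'
listing_hits = re.findall(OUTDENT_RE, line)

print(len(naive_hits))    # 3
print(listing_hits)       # [(' ', 'done', ';')]
```

The count of one happens to be right for this line, but for the wrong reason, which is the kind of border case a pattern-based beautifier has to live with.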
BeautifyBash has three modes of operation:
- If presented with a list of file names:

      beautify_bash.py file1.sh file2.sh file3.sh

  for each file that beautifying would change, it will save the original as a backup (e.g. file1.sh~) and overwrite the original file with a beautified replacement.
- If given '-' as a command-line argument, it will use stdin as its source and stdout as its sink:

      beautify_bash.py - < infile.sh > outfile.sh

- If called as a module, it will behave itself and not execute its main() function:

      #!/usr/bin/env python
      # -*- coding: utf-8 -*-
      from beautify_bash import BeautifyBash
      [ ... ]
      result,error = BeautifyBash().beautify_string(source)

BeautifyBash handles Bash here-docs very carefully (and there are probably some border cases it doesn't handle). The basic idea is that the originator knew what format he wanted in the here-doc, and a beautifier shouldn't try to outguess him. So BeautifyBash does all it can to pass along the here-doc content unchanged:
    if true
    then
      echo "Before here-doc"
      # Insert 2 lines in file, then save.
      #--------Begin here document-----------#
      vi $TARGETFILE <<x23LimitStringx23
    i
    This is line 1 of the example file.
    This is line 2 of the example file.
    ^[
    ZZ
    x23LimitStringx23
      #----------End here document-----------#
      echo "After here-doc"
    fi

As written, BeautifyBash can beautify large numbers of Bash scripts when called from ... well, among other things, a Bash script:
    #!/bin/sh
    for path in `find /path -name '*.sh'`
    do
      beautify_bash.py "$path"
    done

As well as the more obvious example:
    $ beautify_bash.py *.sh

CAUTION: Because BeautifyBash overwrites all the files submitted to it, this could have disastrous consequences if the files include some of the increasingly common Bash scripts that have appended binary content (a regime where BeautifyBash's behavior is undefined). So please back up your files, and don't treat BeautifyBash as though it is a harmless utility. That's only true most of the time.
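One way to reduce that risk is to screen files before handing them to the beautifier. The following is a minimal sketch and not part of BeautifyBash: `looks_binary` and `safe_targets` are hypothetical helper names, and the NUL-byte test is an assumed heuristic that will miss any binary payload containing no NUL bytes.

```python
def looks_binary(path, chunk_size=65536):
    """Heuristic (an assumption, not BeautifyBash behavior): a file that
    contains a NUL byte anywhere is treated as carrying binary content."""
    with open(path, 'rb') as f:
        # Read in chunks so that large self-extracting scripts are not
        # loaded into memory all at once.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            if b'\x00' in chunk:
                return True
    return False


def safe_targets(paths):
    """Return only the paths that pass the heuristic above."""
    return [p for p in paths if not looks_binary(p)]
```

The filtered list could then be passed to `beautify_bash.py` on the command line, or each file read and fed to `beautify_string` when the program is used as a module. Backups are still advisable either way.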
Licensing, Source
BeautifyBash is released under the GNU General Public License.
Here is the plain-text source file without line numbers.
Revision History
- Version 1.0 04/14/2011. Initial Public Release.
Program Listing
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-

    #**************************************************************************
    #   Copyright (C) 2011, Paul Lutus                                        *
    #                                                                         *
    #   This program is free software; you can redistribute it and/or modify  *
    #   it under the terms of the GNU General Public License as published by  *
    #   the Free Software Foundation; either version 2 of the License, or     *
    #   (at your option) any later version.                                   *
    #                                                                         *
    #   This program is distributed in the hope that it will be useful,       *
    #   but WITHOUT ANY WARRANTY; without even the implied warranty of        *
    #   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the          *
    #   GNU General Public License for more details.                          *
    #                                                                         *
    #   You should have received a copy of the GNU General Public License     *
    #   along with this program; if not, write to the                         *
    #   Free Software Foundation, Inc.,                                       *
    #   59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.              *
    #**************************************************************************

    import re, sys

    PVERSION = '1.0'

    class BeautifyBash:

      def __init__(self):
        self.tab_str = ' '
        self.tab_size = 2

      def read_file(self,fp):
        with open(fp) as f:
          return f.read()

      def write_file(self,fp,data):
        with open(fp,'w') as f:
          f.write(data)

      def beautify_string(self,data,path = ''):
        tab = 0
        case_stack = []
        in_here_doc = False
        defer_ext_quote = False
        in_ext_quote = False
        ext_quote_string = ''
        here_string = ''
        output = []
        line = 1
        for record in re.split('\n',data):
          record = record.rstrip()
          stripped_record = record.strip()

          # collapse multiple quotes between ' ... '
          test_record = re.sub(r'\'.*?\'','',stripped_record)
          # collapse multiple quotes between " ... "
          test_record = re.sub(r'".*?"','',test_record)
          # collapse multiple quotes between ` ... `
          test_record = re.sub(r'`.*?`','',test_record)
          # collapse multiple quotes between \` ... ' (weird case)
          test_record = re.sub(r'\\`.*?\'','',test_record)
          # strip out any escaped single characters
          test_record = re.sub(r'\\.','',test_record)
          # remove '#' comments
          test_record = re.sub(r'(\A|\s)(#.*)','',test_record,1)
          if(not in_here_doc):
            if(re.search('<<-?',test_record)):
              here_string = re.sub('.*<<-?\s*[\'|"]?([_|\w]+)[\'|"]?.*','\\1',stripped_record,1)
              in_here_doc = (len(here_string) > 0)
          if(in_here_doc): # pass on with no changes
            output.append(record)
            # now test for here-doc termination string
            if(re.search(here_string,test_record) and not re.search('<<',test_record)):
              in_here_doc = False
          else: # not in here doc
            if(in_ext_quote):
              if(re.search(ext_quote_string,test_record)):
                # provide line after quotes
                test_record = re.sub('.*%s(.*)' % ext_quote_string,'\\1',test_record,1)
                in_ext_quote = False
            else: # not in ext quote
              if(re.search(r'(\A|\s)(\'|")',test_record)):
                # apply only after this line has been processed
                defer_ext_quote = True
                ext_quote_string = re.sub('.*([\'"]).*','\\1',test_record,1)
                # provide line before quote
                test_record = re.sub('(.*)%s.*' % ext_quote_string,'\\1',test_record,1)
            if(in_ext_quote):
              # pass on unchanged
              output.append(record)
            else: # not in ext quote
              inc = len(re.findall('(\s|\A|;)(case|then|do)(;|\Z|\s)',test_record))
              inc += len(re.findall('(\{|\(|\[)',test_record))
              outc = len(re.findall('(\s|\A|;)(esac|fi|done|elif)(;|\)|\||\Z|\s)',test_record))
              outc += len(re.findall('(\}|\)|\])',test_record))
              if(re.search(r'\besac\b',test_record)):
                if(len(case_stack) == 0):
                  sys.stderr.write(
                    'File %s: error: "esac" before "case" in line %d.\n' % (path,line)
                  )
                else:
                  outc += case_stack.pop()
              # special handling for bad syntax within case ... esac
              if(len(case_stack) > 0):
                if(re.search('\A[^(]*\)',test_record)):
                  # avoid overcount
                  outc -= 2
                  case_stack[-1] += 1
                if(re.search(';;',test_record)):
                  outc += 1
                  case_stack[-1] -= 1
              # an ad-hoc solution for the "else" keyword
              else_case = (0,-1)[re.search('^(else)',test_record) != None]
              net = inc - outc
              tab += min(net,0)
              extab = tab + else_case
              extab = max(0,extab)
              output.append((self.tab_str * self.tab_size * extab) + stripped_record)
              tab += max(net,0)
            if(defer_ext_quote):
              in_ext_quote = True
              defer_ext_quote = False
            if(re.search(r'\bcase\b',test_record)):
              case_stack.append(0)
          line += 1
        error = (tab != 0)
        if(error):
          sys.stderr.write('File %s: error: indent/outdent mismatch: %d.\n' % (path,tab))
        return '\n'.join(output), error

      def beautify_file(self,path):
        error = False
        if(path == '-'):
          data = sys.stdin.read()
          result,error = self.beautify_string(data,'(stdin)')
          sys.stdout.write(result)
        else: # named file
          data = self.read_file(path)
          result,error = self.beautify_string(data,path)
          if(data != result):
            # make a backup copy
            self.write_file(path + '~',data)
            self.write_file(path,result)
        return error

      def main(self):
        error = False
        sys.argv.pop(0)
        if(len(sys.argv) < 1):
          sys.stderr.write('usage: shell script filenames or \"-\" for stdin.\n')
        else:
          for path in sys.argv:
            error |= self.beautify_file(path)
        sys.exit((0,1)[error])

    # if not called as a module
    if(__name__ == '__main__'):
      BeautifyBash().main()
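The here-doc handling described earlier depends on the expression in `beautify_string` that extracts the limit string. The sketch below exercises that expression on its own. The function name `here_doc_delimiter` is hypothetical, and the pattern is the listing's, rewritten as a raw string; it is meant as a way to probe the behavior, not as a replacement for the program's logic.

```python
import re

# Pattern copied from beautify_string in the listing, as a raw string.
HERE_DOC_RE = r'.*<<-?\s*[\'|"]?([_|\w]+)[\'|"]?.*'


def here_doc_delimiter(line):
    """Return the here-doc limit string found on `line`, or '' if the line
    contains no '<<' operator."""
    stripped = line.strip()
    if not re.search(r'<<-?', stripped):
        return ''
    return re.sub(HERE_DOC_RE, r'\1', stripped, count=1)


print(here_doc_delimiter('vi $TARGETFILE <<x23LimitStringx23'))  # x23LimitStringx23
print(here_doc_delimiter("cat <<-'EOF'"))                         # EOF
print(repr(here_doc_delimiter('echo "no here-doc"')))             # ''

# A border case: a Bash here-string (<<<) also satisfies this pattern, so its
# argument is reported as if it were a here-doc limit string.
print(here_doc_delimiter('cat <<< word'))                         # word
```

Unlike the listing, this helper applies the `<<` test to the raw line rather than to the quote-stripped `test_record`, so a `<<` inside a quoted string would be treated differently here.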