
Python - Removing unknown 10 character string

I'm using a modified version of Eric Bidelman's (HTML5Rocks) cachebust.py script for CSS/JS files.

Instead of appending a timestamp like

.css?2012-07-30

I changed the cachebust variable to:

cachebust = ''.join(random.choice(string.ascii_uppercase + string.digits) for x in range(10))

so it becomes (for example)

.css?6SKD39SFJ3

His original version didn't seem to remove the date either, so I'm not really sure how that works as 'cache control', but I figured that if I could automatically strip those 10 characters, it would work: first target any .js references (for new files), then, if a reference already ends in .js? plus a cache-control string, strip that existing string.

asset = re.search('\.(js")><\/script>', line)
if asset is not None:
  existing = re.search('\.(js?"', line)
  if existing is not None:
    line[i] = line.replace('.js?'STRING????'"', '.js"')
  lines[i] = line.replace('.js"></script>', '.js?%s"></script>' % cachebust)

Any thoughts on what that STRING???? should be, or whether this method wouldn't work? I'm new to Python, so I'm just experimenting here.

ndreckshage asked Apr 06 '26 22:04


1 Answer

You could replace the 3 lines:

existing = re.search('\.(js?"', line)
if existing is not None:
    line[i] = line.replace('.js?'STRING????'"', '.js"')

with:

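line = re.sub(r'\.js\?[-0-9]{10}">', r'.js">', line)

Note that re.sub returns a new string, so the result has to be assigned back. The replacement also drops the ?, so that the following .replace('.js"></script>', ...) call still matches.

Output:

>>> re.sub(r'\.js\?[-0-9]{10}">', r'.js">', '<script type="blah" src="url/to/path.js?2012-07-02">')
'<script type="blah" src="url/to/path.js">'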

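I have used the regexp [-0-9]{10}, which matches exactly 10 characters that are each a digit or a dash. For the random token in your modified version (uppercase letters and digits), use [A-Z0-9]{10} instead. If it can be any 10 characters, use .{10}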
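
Putting the strip and re-append steps together, here is a minimal sketch rather than the original script's code. The names TOKEN_RE, make_token and bust_line are made up for illustration, and the pattern assumes an existing token is exactly 10 characters of uppercase letters, digits or dashes (so it covers both the old date form and the random form):

```python
import random
import re
import string

# Assumption: an existing token is exactly 10 uppercase letters, digits or dashes.
TOKEN_RE = re.compile(r'\.js\?[-A-Z0-9]{10}"')


def make_token(length=10):
    """Return a random token of uppercase letters and digits, as in the question."""
    alphabet = string.ascii_uppercase + string.digits
    return ''.join(random.choice(alphabet) for _ in range(length))


def bust_line(line, token):
    """Strip any existing token from .js references, then append the new one."""
    stripped = TOKEN_RE.sub('.js"', line)
    return stripped.replace('.js"></script>', '.js?%s"></script>' % token)
```

Calling bust_line(line, make_token()) on each line of the file should then work whether or not the line already carries a token, since the old one is removed before the new one is added.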

UltraInstinct answered Apr 08 '26 19:04