I'm trying to set conditional formatting in openpyxl to emulate highlighting duplicate values. With this simple code, I should be able to highlight consecutive duplicates (but not the first value in a duplicate sequence).
from pandas import *
data = DataFrame({'a':'a a a b b b c b c a f'.split()})
wb = ExcelWriter('test.xlsx')
data.to_excel(wb)
ws = wb.sheets['Sheet1']
from openpyxl.style import Color, Fill
# Create fill
redFill = Fill()
redFill.start_color.index = 'FFEE1111'
redFill.end_color.index = 'FFEE1111'
redFill.fill_type = Fill.FILL_SOLID
ws.conditional_formatting.addCellIs("B1:B1048576", 'equal', "=R[1]C", True, wb.book, None, None, redFill)
wb.save()
However, when I open it in Excel I get an error related to conditional formatting, and the data is not highlighted as expected. Is openpyxl able to handle R1C1 style referencing?
To highlight sequential duplicates (each value that repeats the one directly above it), the formula you want is
=AND(B1<>"",B2=B1)
applied to a range starting at B2 (that is, B2:B1048576).
Note - this appears to be broken in the current 1.8.3 branch of openpyxl, but will be fixed shortly in the 1.9 branch.
from openpyxl import Workbook
from openpyxl.style import Color, Fill
wb = Workbook()
ws = wb.active
ws['B1'] = 1
ws['B2'] = 2
ws['B3'] = 3
ws['B4'] = 3
ws['B5'] = 7
ws['B6'] = 4
ws['B7'] = 7
# Create fill
redFill = Fill()
redFill.start_color.index = 'FFEE1111'
redFill.end_color.index = 'FFEE1111'
redFill.fill_type = Fill.FILL_SOLID
dxfId = ws.conditional_formatting.addDxfStyle(wb, None, None, redFill)
ws.conditional_formatting.addCustomRule('B2:B1048576',
   {'type': 'expression', 'dxfId': dxfId, 'formula': ['AND(B1<>"",B2=B1)']})
wb.save('test.xlsx')
As a further reference:
If you want to highlight all duplicates:
COUNTIF(B:B,B1)>1
If you want to highlight all duplicates except for the first occurrence:
COUNTIF($B$2:$B2,B2)>1
If you want to highlight sequential duplicates, except for the last one:
COUNTIF(B1:B2,B2)>1
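If you want to sanity-check which cells a rule should flag before opening the file in Excel, here is a rough pure-Python sketch of the logic behind three of the formulas above (all duplicates, all but the first occurrence, and the sequential AND(B1<>"",B2=B1) rule). The function name highlight_masks is made up for illustration, and it assumes plain equality with no empty cells; Excel's COUNTIF and = are case-insensitive on text, which this does not reproduce.

```python
def highlight_masks(values):
    """Return, per rule, one boolean per cell (True = would be highlighted).

    Hypothetical helper: it only approximates the Excel formulas and
    assumes exact equality and no empty cells in the column.
    """
    # COUNTIF(B:B,B1)>1 : the value occurs more than once in the column
    all_dupes = [values.count(v) > 1 for v in values]
    # COUNTIF($B$2:$B2,B2)>1 : the value already appeared in an earlier row
    repeats = [v in values[:i] for i, v in enumerate(values)]
    # AND(B1<>"",B2=B1) : the value equals the non-empty cell directly above
    sequential = [
        i > 0 and values[i - 1] != "" and v == values[i - 1]
        for i, v in enumerate(values)
    ]
    return {"all": all_dupes, "repeats": repeats, "sequential": sequential}


# Same values as B1:B7 in the example workbook above
masks = highlight_masks([1, 2, 3, 3, 7, 4, 7])
```

With those sample values, only the second 3 is flagged by the sequential rule, while the second 7 is also flagged by the "all but first" rule.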
Regarding RC notation: openpyxl doesn't interpret Excel's R1C1 notation, so conditional formatting writes the formula to the file exactly as you provide it. Unfortunately, Excel treats R1C1 only as a display option and converts every formula back to its A1 equivalent when saving, so the file itself has to contain A1 references. For this to work you'd need a function that converts your R1C1 formulas to their A1 equivalents before handing them to openpyxl.
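A minimal sketch of such a conversion, using only the standard library. This is not part of openpyxl; r1c1_to_a1 and its row/col parameters are names invented here. It assumes the formula only contains full cell references (RC, R[1]C, R5C3 and so on), not whole-row or whole-column references like R[1] or C2, and it does not skip text inside string literals.

```python
import re

# Matches RC, R[1]C, RC[-2], R5C3, R[1]C3, ... but not names such as SEARCH
_R1C1_REF = re.compile(
    r"(?<![A-Za-z0-9_])R(\[-?\d+\]|\d+)?C(\[-?\d+\]|\d+)?(?![A-Za-z0-9_\[])"
)


def _col_letters(n):
    """Convert a 1-based column number to letters (1 -> A, 27 -> AA)."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def _resolve(part, base):
    """Return (index, is_absolute) for one R or C component."""
    if part is None:                 # bare "R" or "C": same row/column
        return base, False
    if part.startswith("["):         # "[n]": offset relative to the anchor
        return base + int(part[1:-1]), False
    return int(part), True           # plain number: absolute reference


def r1c1_to_a1(formula, row, col):
    """Rewrite R1C1 references as A1, relative to the anchor cell (row, col)."""
    def repl(match):
        r, r_abs = _resolve(match.group(1), row)
        c, c_abs = _resolve(match.group(2), col)
        if r < 1 or c < 1:
            raise ValueError("reference falls outside the sheet: " + match.group(0))
        return ("$" if c_abs else "") + _col_letters(c) + ("$" if r_abs else "") + str(r)

    return _R1C1_REF.sub(repl, formula)


# The formula from the question, anchored at B1 (row 1, column 2)
converted = r1c1_to_a1("=R[1]C", row=1, col=2)
```

Anchored at B1, the question's =R[1]C becomes =B2. For conditional formatting, the anchor should be the top-left cell of the range the rule is applied to, since Excel shifts relative references from there.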
Openpyxl doesn't support Excel RC notation.
You could use A1 notation instead, in which case the equivalent formula would be =B2 (I think).
However, you should verify that it actually works in Excel first.
My feeling is that it won't. In general, conditional formatting uses absolute cell references ($B$2) instead of relative cell references (B1).
If it does work, then convert your formula to A1 notation and that should work in openpyxl.